
Nelson Griffiths

01/27/2023, 2:01 PM
I am running into an error running a deployment on docker infrastructure. When triggering a deployment to run, it immediately turns up with a submission error (shared in comments). I have checked that docker is running on the machine and works. Not sure where else to look to debug? Anyone here in #prefect-docker seen this before?
Submission failed.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.11/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.11/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.11/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.11/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.11/http/client.py", line 975, in send
    self.connect()
  File "/usr/local/lib/python3.11/site-packages/docker/transport/unixconn.py", line 30, in connect
    sock.connect(self.unix_socket)
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/usr/local/lib/python3.11/site-packages/urllib3/util/retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.11/site-packages/urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.11/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.11/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.11/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.11/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.11/http/client.py", line 975, in send
    self.connect()
  File "/usr/local/lib/python3.11/site-packages/docker/transport/unixconn.py", line 30, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/docker/api/client.py", line 214, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/usr/local/lib/python3.11/site-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/usr/local/lib/python3.11/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/docker/api/client.py", line 237, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 600, in get
    return self.request("GET", url, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 547, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/prefect/infrastructure/docker.py", line 541, in _get_client
    docker_client = docker.from_env()
  File "/usr/local/lib/python3.11/site-packages/docker/client.py", line 96, in from_env
    return cls(
  File "/usr/local/lib/python3.11/site-packages/docker/client.py", line 45, in __init__
    self.api = APIClient(*args, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/docker/api/client.py", line 197, in __init__
    self._version = self._retrieve_server_version()
  File "/usr/local/lib/python3.11/site-packages/docker/api/client.py", line 221, in _retrieve_server_version
    raise DockerException(
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

The above exception was the direct cause of the following exception:

RuntimeError: Could not connect to Docker.
It seems to not actually make it to the agent (at least it doesn't generate any logs).

Christopher Boyd

01/27/2023, 2:18 PM
That looks like an odd traceback, but are you sure that the docker daemon is running, and in the same namespace (e.g. are you in a conda / poetry environment or something)?
    self.connect()
  File "/usr/local/lib/python3.11/site-packages/docker/transport/unixconn.py", line 30, in connect
    sock.connect(self.unix_socket)
FileNotFoundError: [Errno 2] No such file or directory
The error is saying it can’t find the unix socket
Usually, it’s either that the service isn’t started (`systemctl start docker`, or open Docker Desktop), or that your user doesn’t have permission to access the socket (check `/var/run/docker.sock`)
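The two causes named above (socket missing, or socket present but not accessible) can be told apart with a quick check. This is only a rough sketch using the Python standard library; the helper name `diagnose_docker_socket` is made up for illustration, and the default path is the one mentioned in the message above.

```python
import os
import stat


def diagnose_docker_socket(path="/var/run/docker.sock"):
    """Return a short diagnosis for a Docker unix socket path."""
    # Nothing at that path: the daemon is probably not running on this machine.
    if not os.path.exists(path):
        return "missing"
    # Something is there, but it is not a unix socket.
    if not stat.S_ISSOCK(os.stat(path).st_mode):
        return "not-a-socket"
    # The socket exists, but the current user cannot read and write it.
    if not os.access(path, os.R_OK | os.W_OK):
        return "permission-denied"
    return "ok"


if __name__ == "__main__":
    print(diagnose_docker_socket())
```

A `FileNotFoundError` like the one in the traceback would correspond to the "missing" case rather than a permissions problem. Run the check as the same user and on the same machine that starts the agent.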

Nelson Griffiths

01/27/2023, 2:36 PM
I am on a VM running the base python (3.10.6). So no conda or poetry happening. From the exact same place I start the prefect agent:
So it seems like I should have permissions to run docker just fine?
And I just start the prefect agent with
prefect agent start --work-queue dev

Christopher Boyd

01/27/2023, 2:43 PM
You can try running `prefect config set PREFECT_LOGGING_LEVEL=DEBUG` and rerunning, but the error looks like it is entirely in the docker part of the traceback, while trying to connect to the socket

Nelson Griffiths

01/27/2023, 2:43 PM
No logs are generated at all even with DEBUG on
The error only shows up on the UI

Christopher Boyd

01/27/2023, 2:48 PM
I’m curious - do you have python3.11 installed, anywhere? It’s curious you are using `3.10.6` but all the tracebacks are `/usr/local/lib/python3.11`
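A quick way to make this kind of mismatch visible is to pull the Python versions out of the traceback paths and compare them with the interpreter you are actually running. The sketch below is illustrative only; the function names are made up, and the sample line is copied from the traceback in this thread.

```python
import re
import sys


def python_versions_in_traceback(text):
    """Collect the distinct 'pythonX.Y' versions that appear in file paths."""
    return sorted(set(re.findall(r"python(\d+\.\d+)", text)))


def running_version():
    """Major.minor version of the interpreter executing this script."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


sample = (
    'File "/usr/local/lib/python3.11/site-packages/docker/transport/unixconn.py", '
    "line 30, in connect"
)

if __name__ == "__main__":
    found = python_versions_in_traceback(sample)
    print("traceback versions:", found, "local version:", running_version())
    if running_version() not in found:
        print("The traceback was likely produced by a different Python install.")
```

If the versions disagree, it is worth considering that the traceback came from a different environment or machine than the one being inspected.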

Nelson Griffiths

01/27/2023, 2:51 PM
Hmmm... I hadn't noticed that. It does not seem that I do.
That is weird. I may blow up my VM and start from scratch to see if I can reproduce the issue.

Christopher Boyd

01/27/2023, 2:57 PM
Sorry to hear that, but keep us informed - I’ll keep an eye out

Nelson Griffiths

01/27/2023, 4:00 PM
@Christopher Boyd After resetting my VM I ran into the same error. Interestingly enough, I tried twice. The first time, I had forgotten to configure artifact registry access and I got a permission denied error from python 3.10, which I am running. The second time, I ran into the same error as above from python 3.11, and I am sure that python 3.11 is not on my VM.

Christopher Boyd

01/27/2023, 4:01 PM
very odd - I’m checking with the backend team to see if we can determine anything further here, or ask further questions

Nelson Griffiths

01/27/2023, 4:06 PM
Thank you!

Christopher Boyd

01/27/2023, 4:19 PM
artifact registry?

Nelson Griffiths

01/27/2023, 4:19 PM
Yes
Debian 11 VM. Installed docker through apt.
Any luck finding out where the python3.11 is coming from here?

Christopher Boyd

01/27/2023, 7:30 PM
Nothing on the prefect side, it’s not used at all
do those paths actually exist on your system?
also, what’s $DOCKER_HOST set to
and which version of prefect are you using
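For context on the `$DOCKER_HOST` question: Docker clients generally read that variable to decide where the daemon lives, and fall back to a local unix socket when it is unset. The sketch below only illustrates that lookup with the standard library. The fallback value is an assumption based on the `/var/run/docker.sock` path mentioned earlier in the thread, and `effective_docker_host` is a made-up helper, not part of the docker or prefect packages.

```python
import os

# Assumed fallback, based on the socket path mentioned earlier in this thread.
ASSUMED_DEFAULT = "unix:///var/run/docker.sock"


def effective_docker_host(environ=None):
    """Return the Docker endpoint a client would likely use in this environment."""
    environ = os.environ if environ is None else environ
    # An unset or empty DOCKER_HOST means the local unix socket is used.
    return environ.get("DOCKER_HOST") or ASSUMED_DEFAULT


if __name__ == "__main__":
    print(effective_docker_host())
```

With `DOCKER_HOST` unset, a failure inside `unixconn.py` (as in the traceback) is consistent with the client looking for a local socket that does not exist on the machine running the code.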

Nelson Griffiths

01/27/2023, 7:34 PM
DOCKER_HOST is not set
Starting v2.7.10 agent
And those paths don't seem to exist... Here is what does exist in `/usr/local/lib`:
nelsongriffiths_doubleriver_com@prefect-agent:~$ ls /usr/local/lib
python2.7  python3.9

Christopher Boyd

01/27/2023, 7:38 PM
do you have steps I can take to reproduce this
you said you were able to reproduce by deleting your VM and starting over, I’d like to try

Nelson Griffiths

01/27/2023, 7:39 PM
Sure. Do you want just how I set up the vm?

Christopher Boyd

01/27/2023, 7:40 PM
Ideally whatever you did in whole to reach this stage - if you have a flow, how you installed prefect (version), setup the agent, anything you can share steps for minus personal information

Nelson Griffiths

01/27/2023, 7:40 PM
Okay, let me put something together and I'll send it your way later today

Christopher Boyd

01/27/2023, 7:41 PM
I’m still baffled by the python version - is there a specific image you are trying to use, or a docker image you built / pulled? We aren’t using python 3.11, so the traceback doesn’t seem to be coming from cloud side
so I’m kind of at a loss

Nelson Griffiths

01/27/2023, 7:42 PM
Same. The docker image is built with `python:3.10.7-slim-buster`
I'll put together a Dockerfile that represents what we are doing as well
Just double checked and the Docker Image for sure does not have 3.11 on it

Christopher Boyd

01/27/2023, 7:47 PM
Let's try one more thing
if you can grab an strace of running the prefect deployment apply command
strace -o output.txt <whatever command you used to apply the deployment>
and attach
or you can message me with it

Nelson Griffiths

01/27/2023, 7:49 PM
Does it matter that I create the deployment from a different place than I run the agent from?

Christopher Boyd

01/27/2023, 7:49 PM
the deployment creation and agent are irrelevant to each other
you don’t need an agent running to create and apply a deployment
the agent running is to poll the API to receive flow runs
Create deployment -> apply deployment -> create flow run -> agent polls and submits flow_run for execution
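One consequence of this flow is that a run on a work queue can be claimed by any agent polling that queue. The toy model below is only a sketch of that idea in plain Python. It is not how Prefect implements agents; the class names, agent labels and the `has_docker` flag are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkQueue:
    """A named queue of pending flow runs (toy stand-in, not Prefect's model)."""
    name: str
    runs: deque = field(default_factory=deque)


@dataclass
class Agent:
    """A poller that claims runs and tries to submit them to its local Docker."""
    label: str
    has_docker: bool

    def poll(self, queue):
        """Take the next flow run from the queue, if any, and try to submit it."""
        if not queue.runs:
            return None
        run = queue.runs.popleft()
        status = "submitted" if self.has_docker else "submission failed"
        return f"{run}: {status} by {self.label}"


queue = WorkQueue("dev")
queue.runs.append("flow-run-1")

# Two agents watch the same queue; only one of them can reach Docker.
other_agent = Agent("other-agent", has_docker=False)
vm_agent = Agent("vm-agent", has_docker=True)

first = other_agent.poll(queue)   # claims the run and fails to submit it
second = vm_agent.poll(queue)     # nothing left to pick up

if __name__ == "__main__":
    print(first)
    print(second)
```

In this model the agent you are watching never sees the run, so it logs nothing, while the failure is reported by whichever agent claimed it.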

Nelson Griffiths

01/27/2023, 7:52 PM
Okay so I am locally on a macbook. Apparently strace is for Linux only. Any suggestions for alternatives?

Christopher Boyd

01/27/2023, 7:57 PM
what version python do you have on your mac

Nelson Griffiths

01/27/2023, 7:57 PM
I have multiple. The environment making the deployments uses 3.10.5
But I do have 3.11.1 on my macbook

Christopher Boyd

01/27/2023, 7:58 PM
are you in any sort of conda / venv environment when you create and apply the deployment

Nelson Griffiths

01/27/2023, 7:58 PM
Yes I use poetry locally.

Christopher Boyd

01/27/2023, 7:59 PM
can you share the deployment object that is being applied
and how you are building it
Prefect automatically sets a Docker image matching the Python and Prefect version you're using at deployment time. You can see all available images at Docker Hub.

Nelson Griffiths

01/27/2023, 8:02 PM
"""Deploy our flow."""
from prefect.deployments import Deployment
from prefect.filesystems import GCS
from prefect.orion.schemas.schedules import CronSchedule
from prefect_gcp.cloud_run import CloudRunJob
from prefect.infrastructure.docker import DockerContainer

from data_pipelines.falcon_cap_iq.falcon_cap_iq_flow import falcon_cap_iq_dbt_flow

if __name__ == "__main__":
    docker = DockerContainer.load("data-pipelines-image")
    storage: GCS = GCS.load("prefect-flow-storage")

    deployment = Deployment.build_from_flow(
        name="falcon_cap_iq_dbt_deployment",
        description=(
            "Deployment for triggering DBT for transforming CapIQ data "
            "for use with falcon."
        ),
        version="1",
        tags=["dev", "falcon", "dbt", "snowflake"],
        schedule=CronSchedule(
            cron="0 6 * * 1,2,3,4,5", timezone="America/Denver"
        ),  # Cron schedule to run weekdays at 6:00 AM MST
        flow=falcon_cap_iq_dbt_flow,
        work_queue_name="dev",  
        infrastructure=docker, 
        storage=storage,
        skip_upload=False,
    )
    deployment.apply()

Christopher Boyd

01/27/2023, 8:03 PM
You have an API key and API URL set on the VM to communicate with the Cloud API, and the docker infrastructure block as well?

Nelson Griffiths

01/27/2023, 8:04 PM
By that do you mean I logged in via the cli with the correct key?

Christopher Boyd

01/27/2023, 8:05 PM
correct
also, I was looking for the object itself that was being applied, something along the lines of
###
### A complete description of a Prefect Deployment for flow 'my-flow'
###
name: my-flow-deployment
description: null
tags:
- test
schedule: null
parameters: {}
infrastructure:
  type: docker-container
  env: {}
  labels: {}
  name: null
  command:
  - python
  - -m
  - prefect.engine
  image: prefecthq/prefect:dev-python3.9
  image_pull_policy: null
  networks: []
  network_mode: null
  auto_remove: false
  volumes: []
  stream_output: true
###
### DO NOT EDIT BELOW THIS LINE
###
flow_name: my-flow
manifest_path: my_flow-manifest.json
storage:
  bucket_path: bucket-full-of-sunshine
  aws_access_key_id: '**********'
  aws_secret_access_key: '**********'
  _is_anonymous: true
  _block_document_name: anonymous-xxxxxxxx-f1ff-4265-b55c-6353a6d65333
  _block_document_id: xxxxxxxx-06c2-4c3c-a505-4a8db0147011
  _block_type_slug: s3
parameter_openapi_schema:
  title: Parameters
  type: object
  properties: {}
  required: null
  definitions: null

Nelson Griffiths

01/27/2023, 8:07 PM
Where can I get that from if I am not using the cli to build deployments?

Christopher Boyd

01/27/2023, 8:09 PM
You should at least be able to pull the infrastructure block from the deployment page in the UI

Nelson Griffiths

01/27/2023, 8:12 PM
Screen Shot 2023-01-27 at 1.12.31 PM.png

Christopher Boyd

01/27/2023, 8:18 PM
I haven’t seen this syntax before, is this accurate?
storage: GCS = GCS.load("prefect-flow-storage")
otherwise, I don’t really see anything stand out

Nelson Griffiths

01/27/2023, 8:24 PM
That is just the prefect-gcp block for cloud storage. And it does work. I have used it successfully with other infrastructures
I am going to start fresh and work through it again. Maybe I did something silly somewhere. I'll let you know what I find

Christopher Boyd

01/27/2023, 8:31 PM
if you can doc the steps you take
I can try to reproduce it

Nelson Griffiths

02/01/2023, 3:57 PM
@Christopher Boyd I am pretty baffled by this still. There is something about the way the agent is running that is breaking the docker infrastructure. I can deploy the flow to a local process and it runs. I can also manually load the `DockerContainer` infrastructure and run `_create_and_start_container` and it pulls and runs the image without throwing any errors. I am running all of this in poetry. So it is the exact same environment that I use to run the agent. And it is still throwing 3.11 errors which I do not have on my machine. Is there a better way to step through what the agent is doing and debug it?

Christopher Boyd

02/01/2023, 4:01 PM
The agent pulls and deploys infrastructure based on the flow run here:
https://github.com/PrefectHQ/prefect/blob/main/src/prefect/agent.py#L411
https://github.com/PrefectHQ/prefect/blob/main/src/prefect/agent.py#L425
You can set `prefect config set PREFECT_LOGGING_LEVEL=DEBUG` and try to see if we can get any additional details, but I don’t recall if we already did that

Nelson Griffiths

02/01/2023, 4:02 PM
Oh my. It appears someone else at my company was messing around with dev and started an agent that didn't have access to Docker and left it running somewhere that was picking up my runs. Moving to a new work queue fixed it all. Well thank you for taking the time to help me with a silly error!
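One way to make this kind of collision less likely is to avoid sharing a generic queue name like `dev` while experimenting. The helper below is a hypothetical sketch, not a Prefect feature: it just derives a per-user queue name that could be passed as `work_queue_name` in `Deployment.build_from_flow` and to `prefect agent start --work-queue`.

```python
import getpass
import re


def personal_work_queue(base="dev", user=None):
    """Build a work queue name like 'dev-<user>' so test runs are not shared."""
    # Fall back to the current OS user when no name is given.
    user = user or getpass.getuser()
    # Lowercase and replace anything that is not a letter or digit with '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", user.lower()).strip("-")
    return f"{base}-{slug}"


if __name__ == "__main__":
    print(personal_work_queue())
```

For example, `personal_work_queue("dev", "Nelson Griffiths")` gives `dev-nelson-griffiths`. The naming scheme itself is an assumption; any convention that keeps agents from different people off the same queue would serve the same purpose.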