
Timo Sugliani

03/13/2023, 6:35 PM
Hi everyone 👋 I've been testing and evaluating Prefect recently, and I love it! 😍 That said, I'm facing an issue that I can't figure out how to troubleshoot or fix. I'm using a simple deployment/flow to launch the same flow around 40 times, with just a different value each time as the input parameter (a for loop that runs the same code 40 times with a different parameter value). Each flow takes quite a while to run, as it basically downloads files, verifies various things, and then does a few file system operations. (That isn't the issue, though, as the failing runs never even get to that stage.) I attached a screenshot of my VS Code window with the project's Docker Compose logs; the issue occurs at the beginning, after many "Submitting flow run $id" lines, such as:
tsugliani-zpodengineagent   | Starting v2.8.5 agent connected to http://zpodengineserver:4200/api...
tsugliani-zpodengineagent   |
tsugliani-zpodengineagent   |   ___ ___ ___ ___ ___ ___ _____     _   ___ ___ _  _ _____
tsugliani-zpodengineagent   |  | _ \ _ \ __| __| __/ __|_   _|   /_\ / __| __| \| |_   _|
tsugliani-zpodengineagent   |  |  _/   / _|| _|| _| (__  | |    / _ \ (_ | _|| .` | | |
tsugliani-zpodengineagent   |  |_| |_|_\___|_| |___\___| |_|   /_/ \_\___|___|_|\_| |_|
tsugliani-zpodengineagent   |
tsugliani-zpodengineagent   |
tsugliani-zpodengineagent   | Agent started! Looking for work from queue(s): default...
tsugliani-zpodengineagent   | 17:38:43.676 | INFO    | prefect.agent - Submitting flow run '8c51875b-69b7-4847-bc44-cbbabb1b9e58'
tsugliani-zpodengineagent   | 17:38:43.678 | INFO    | prefect.agent - Submitting flow run 'f4cbd07c-d8ef-4e4f-b501-9f62c1ef5cb9'
tsugliani-zpodengineagent   | 17:38:43.679 | INFO    | prefect.agent - Submitting flow run '4eafa4b5-024c-46ac-958d-7b5e4abbeaea'
tsugliani-zpodengineagent   | 17:38:43.679 | INFO    | prefect.agent - Submitting flow run '15c6c370-e807-436f-8c8b-38a2d1b85291'
tsugliani-zpodengineagent   | 17:38:43.680 | INFO    | prefect.agent - Submitting flow run '1cd6de3e-2404-42f4-a9d6-d9335ef42ab9'
tsugliani-zpodengineagent   | 17:38:43.680 | INFO    | prefect.agent - Submitting flow run '465e5392-1b1d-412d-830a-8eea270e8f4c'
tsugliani-zpodengineagent   | 17:38:43.681 | INFO    | prefect.agent - Submitting flow run 'd19d5a9f-5361-480d-9d41-c1b570c09152'
tsugliani-zpodengineagent   | 17:38:43.681 | INFO    | prefect.agent - Submitting flow run '322dae96-9349-4eea-940e-5d10cd5dedec'
tsugliani-zpodengineagent   | 17:38:43.682 | INFO    | prefect.agent - Flow run limit reached; 8 flow runs in progress.
tsugliani-zpodengineagent   | 17:38:44.217 | ERROR   | prefect.agent - Failed to submit flow run '465e5392-1b1d-412d-830a-8eea270e8f4c' to infrastructure.
tsugliani-zpodengineagent   | Traceback (most recent call last):
tsugliani-zpodengineagent   |   File "/usr/local/lib/python3.11/site-packages/prefect/agent.py", line 484, in _submit_run_and_capture_errors
tsugliani-zpodengineagent   |     result = await infrastructure.run(task_status=task_status)
tsugliani-zpodengineagent   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tsugliani-zpodengineagent   |   File "/usr/local/lib/python3.11/site-packages/prefect/infrastructure/docker.py", line 322, in run
tsugliani-zpodengineagent   |     container = await run_sync_in_worker_thread(self._create_and_start_container)
tsugliani-zpodengineagent   |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tsugliani-zpodengineagent   |   File "/usr/local/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
tsugliani-zpodengineagent   |     return await anyio.to_thread.run_sync(
tsugliani-zpodengineagent   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tsugliani-zpodengineagent   |   File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 31, in run_sync
tsugliani-zpodengineagent   |     return await get_asynclib().run_sync_in_worker_thread(
tsugliani-zpodengineagent   |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tsugliani-zpodengineagent   |   File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
tsugliani-zpodengineagent   |     return await future
tsugliani-zpodengineagent   |            ^^^^^^^^^^^^
tsugliani-zpodengineagent   |   File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 867, in run
tsugliani-zpodengineagent   |     result = context.run(func, *args)
tsugliani-zpodengineagent   |              ^^^^^^^^^^^^^^^^^^^^^^^^
tsugliani-zpodengineagent   |   File "/usr/local/lib/python3.11/site-packages/prefect/infrastructure/docker.py", line 425, in _create_and_start_container
tsugliani-zpodengineagent   |     docker_client = self._get_client()
tsugliani-zpodengineagent   |                     ^^^^^^^^^^^^^^^^^^
tsugliani-zpodengineagent   |   File "/usr/local/lib/python3.11/site-packages/prefect/infrastructure/docker.py", line 650, in _get_client
tsugliani-zpodengineagent   |     docker_client = docker.from_env()
tsugliani-zpodengineagent   |                     ^^^^^^^^^^^^^^^
tsugliani-zpodengineagent   | AttributeError: module 'docker' has no attribute 'from_env'
tsugliani-zpodengineagent   | 17:38:44.281 | INFO    | prefect.agent - Completed submission of flow run '465e5392-1b1d-412d-830a-8eea270e8f4c'
Every flow run is executed using a Docker Container block. I tried limiting task concurrency, work pool concurrency, and also prefect agent --limit, but I haven't been able to understand why this is still happening. It fails at the beginning for a few flow runs, with the following error:
State Message
Submission failed. AttributeError: module 'docker' has no attribute 'from_env'
What I don't get is why this happens on a few flow runs and not others at the initial launch. I can easily relaunch those failed flow runs later with the same custom parameter that failed the first time, and they succeed without any issues. (I attached a screenshot of one that succeeded after failing previously.) If anyone has any suggestions, pointers, or insights to help troubleshoot this issue, it would help greatly 🙂
Note: I'm using the official prefecthq/prefect:2.8.5-python3.11 Docker images for this. The server and agent are launched as Docker containers, and the agent container mounts the volume /var/run/docker.sock:/var/run/docker.sock so that it can launch the Docker Container blocks on the host's Docker engine and avoid "Docker in Docker" execution (https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/).
PS: I tried searching here and in GitHub Issues, which sometimes mention this error, but I found nothing really conclusive. • https://github.com/PrefectHQ/prefect/issues/6519
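One way to narrow this down from inside the agent container is to check which `docker` module Python actually resolved and whether it exposes `from_env`. The following is only a diagnostic sketch: `describe_module` and `describe_docker` are hypothetical helper names, not part of Prefect or the Docker SDK, and the sketch does not assume any particular root cause.

```python
import importlib
import sys
import types


def describe_module(mod: types.ModuleType) -> dict:
    """Summarize where a module was loaded from and whether it has from_env."""
    return {
        "name": mod.__name__,
        # A path outside site-packages/docker/ would hint at name shadowing.
        "file": getattr(mod, "__file__", None),
        "has_from_env": hasattr(mod, "from_env"),
        # A short sample of public names helps spot a half-populated module.
        "public_names": sorted(n for n in dir(mod) if not n.startswith("_"))[:10],
    }


def describe_docker() -> dict:
    """Import the top-level `docker` package and describe it."""
    return describe_module(importlib.import_module("docker"))


if __name__ == "__main__":
    try:
        print(describe_docker())
    except ImportError as exc:
        print(f"docker SDK not importable: {exc}", file=sys.stderr)
```

Running this a few times in the same image as the agent shows whether the module is ever resolved to an unexpected file. If it always looks complete when imported on its own, that would point toward the concurrent submissions rather than the installation.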

Zanie

03/13/2023, 7:12 PM
Hi! I see you linked the issue already 😄
https://github.com/PrefectHQ/prefect/issues/6519 does appear to be what you’re encountering
We need to rename our docker module
Here’s a draft https://github.com/PrefectHQ/prefect/pull/8788 — probably needs a bit more work though

Timo Sugliani

03/13/2023, 7:38 PM
@Zanie Thanks for the quick answer! It is really weird indeed. I don't see why renaming the docker module would help, but the failure is so random that I don't understand exactly how it happens. I will try limiting concurrency to 1 to see if it helps "for now", as described in one of the issues, but it sure feels bad when you have such a fancy flow engine 😢
Hmm, it seems that with the prefect agent launched with --limit 1 I haven't seen the issue so far (I see Completed, Running & Late states, which is fine, but nothing failing). I tried earlier, when writing this post, with --limit 2 and it failed nearly instantly. I will confirm with more tests over time.
First time I got 100% Completed state with --limit 1
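The fact that --limit 1 avoids the error while --limit 2 triggers it suggests, though it does not prove, that something goes wrong when two submissions create a Docker client at the same time. As a minimal sketch of that idea, assuming the cause really is concurrent first use, the client creation can be serialized with a lock instead of serializing whole flow runs. `make_serialized_factory`, `docker_client_factory`, and `get_docker_client` are hypothetical names for illustration; only `docker.from_env()` comes from the traceback above.

```python
import threading
from typing import Any, Callable

# One process-wide lock shared by every wrapped factory.
_client_lock = threading.Lock()


def make_serialized_factory(factory: Callable[[], Any]) -> Callable[[], Any]:
    """Wrap `factory` so that only one thread can run it at a time."""

    def guarded() -> Any:
        with _client_lock:
            return factory()

    return guarded


def docker_client_factory() -> Any:
    # Imported inside the function so the import itself also runs under the lock.
    import docker

    return docker.from_env()


# Call get_docker_client() from worker threads instead of docker.from_env().
get_docker_client = make_serialized_factory(docker_client_factory)
```

This only changes code you control. The agent's own call to docker.from_env() in the traceback is inside Prefect, so this sketch illustrates the mechanism rather than a drop-in fix for the agent.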
Hi @Zanie, just to let you know that I saw the docker PR (https://github.com/PrefectHQ/prefect/pull/8788) was merged into 2.10.11. I tried it, and unfortunately I do not see any change in behavior 😢 We are submitting only 2 flows, and it happens instantly most of the time at this step:
tsugliani-zpodengineagent   | Starting v2.10.11 agent connected to http://zpodengineserver:4200/api...
tsugliani-zpodengineagent   |
tsugliani-zpodengineagent   |   ___ ___ ___ ___ ___ ___ _____     _   ___ ___ _  _ _____
tsugliani-zpodengineagent   |  | _ \ _ \ __| __| __/ __|_   _|   /_\ / __| __| \| |_   _|
tsugliani-zpodengineagent   |  |  _/   / _|| _|| _| (__  | |    / _ \ (_ | _|| .` | | |
tsugliani-zpodengineagent   |  |_| |_|_\___|_| |___\___| |_|   /_/ \_\___|___|_|\_| |_|
tsugliani-zpodengineagent   |
tsugliani-zpodengineagent   |
tsugliani-zpodengineagent   | Agent started! Looking for work from queue(s): default...
tsugliani-zpodengineagent   | 09:32:58.245 | INFO    | prefect.agent - Submitting flow run '2c78c02c-ff69-4f44-891a-cbd4bdf4a651'
tsugliani-zpodengineagent   | 09:32:58.248 | INFO    | prefect.agent - Submitting flow run '96491f2d-0b48-416e-9534-64cb3888a4ae'
tsugliani-zpodengineagent   | 09:32:58.404 | ERROR   | prefect.agent - Failed to submit flow run '96491f2d-0b48-416e-9534-64cb3888a4ae' to infrastructure.
tsugliani-zpodengineagent   | Traceback (most recent call last):
tsugliani-zpodengineagent   |   File "/usr/local/lib/python3.11/site-packages/prefect/infrastructure/container.py", line 650, in _get_client
tsugliani-zpodengineagent   |     docker_client = docker.from_env()
tsugliani-zpodengineagent   |                     ^^^^^^^^^^^^^^^
tsugliani-zpodengineagent   | AttributeError: module 'docker' has no attribute 'from_env'
The only way to avoid the docker error so far has been to restrict the prefect agent to --limit 1, but that's a bit counterproductive when using such a nice workflow engine. We will very likely try to adapt our code to use https://prefecthq.github.io/prefect-docker/worker/, but that will require a bit of testing and fiddling around.
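Since failed runs succeeded when relaunched with the same parameter, a stopgap until the move to a worker could be to retry the operation a few times when it raises this AttributeError. The helper below is a generic sketch under that assumption. `retry_call` is a hypothetical name, and `func` stands in for whatever call re-triggers the run in your own code; it is not a Prefect API.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_call(
    func: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (AttributeError,),
) -> T:
    """Call `func`, retrying on the given exceptions with a growing delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the last error propagate unchanged.
                raise
            # Linear backoff: wait a little longer after each failure.
            time.sleep(delay_seconds * attempt)
    raise AssertionError("unreachable")
```

Retrying hides the symptom rather than fixing the cause, so it is worth logging each retry to keep track of how often the error still occurs.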

Zanie

05/26/2023, 3:15 PM
Oh that’s weird / unfortunate