Oz Shaked
10/16/2024, 8:33 PM
topic_poll_flow.serve(webserver=True)
We enabled the webserver in order to expose the /health endpoint for EKS.
When we send jobs to the worker, we get the following error in the runner:
20:28:33.879 | INFO | prefect.webserver - Created flow run 'knowing-chamois' from deployment 'topic-poll-flow'
20:28:33.891 | INFO | prefect.flow_runs.runner - Opening process...
20:28:33.894 | ERROR | prefect.flow_runs.runner - Failed to start process for flow run '61719a08-c4d9-49bd-92bb-0124efeb2830'.
Traceback (most recent call last):
File "/workspaces/gigaverse-ai/.venv/lib/python3.11/site-packages/prefect/runner/runner.py", line 1051, in _submit_run_and_capture_errors
status_code = await self._run_process(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/gigaverse-ai/.venv/lib/python3.11/site-packages/prefect/runner/runner.py", line 600, in _run_process
process = await run_process(
^^^^^^^^^^^^^^^^^^
File "/workspaces/gigaverse-ai/.venv/lib/python3.11/site-packages/prefect/utilities/processutils.py", line 258, in run_process
async with open_process(
File "/usr/local/lib/python3.11/contextlib.py", line 204, in __aenter__
return await anext(self.gen)
^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/gigaverse-ai/.venv/lib/python3.11/site-packages/prefect/utilities/processutils.py", line 202, in open_process
process = await anyio.open_process(command, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/gigaverse-ai/.venv/lib/python3.11/site-packages/anyio/_core/_subprocesses.py", line 197, in open_process
return await get_async_backend().open_process(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspaces/gigaverse-ai/.venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2455, in open_process
process = await asyncio.create_subprocess_exec(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/subprocess.py", line 223, in create_subprocess_exec
transport, protocol = await loop.subprocess_exec(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1694, in subprocess_exec
transport = await self._make_subprocess_transport(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/unix_events.py", line 198, in _make_subprocess_transport
with events.get_child_watcher() as watcher:
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/events.py", line 811, in get_child_watcher
return get_event_loop_policy().get_child_watcher()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/asyncio/events.py", line 637, in get_child_watcher
raise NotImplementedError
NotImplementedError
20:28:33.935 | INFO | prefect.flow_runs.runner - Reported flow run '61719a08-c4d9-49bd-92bb-0124efeb2830' as crashed: Flow run process could not be started
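The last frames of the traceback appear to end in the abstract base policy's get_child_watcher, which raises NotImplementedError when the active event loop policy does not override it. A minimal diagnostic sketch, assuming it is run inside the same process as the runner, could confirm which policy is active; describe_policy is a hypothetical helper name and not part of Prefect or asyncio. Whether enabling the webserver is what swaps the policy is an assumption that this thread does not confirm.

```python
import asyncio
import warnings


def describe_policy() -> dict:
    """Report the active asyncio event loop policy and whether its class
    overrides get_child_watcher (hypothetical helper, for diagnosis only)."""
    with warnings.catch_warnings():
        # Policy APIs are deprecated on newer Python versions; silence that here.
        warnings.simplefilter("ignore", DeprecationWarning)
        policy = asyncio.get_event_loop_policy()
        base = getattr(asyncio, "AbstractEventLoopPolicy", None)
    cls = type(policy)
    base_impl = getattr(base, "get_child_watcher", None)
    impl = getattr(cls, "get_child_watcher", None)
    return {
        "policy": f"{cls.__module__}.{cls.__qualname__}",
        # False means a call would fall through to the abstract base
        # (or the method does not exist on this Python version).
        "overrides_get_child_watcher": impl is not None and impl is not base_impl,
    }


if __name__ == "__main__":
    print(describe_policy())
```

Comparing the output with webserver=True and webserver=False would show whether the policy differs between the two configurations.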
We get the same error when trying to invoke the deployment or flow directly through the pod's entrypoint.
With webserver=False everything is fine, but k8s kills our pods (we don't want to disable the liveness and readiness probes).
Any ideas or directions?
Jagi Natarajan
11/12/2024, 8:03 PM
Oz Shaked
11/12/2024, 8:10 PM
Oz Shaked
11/12/2024, 8:12 PM