# ask-marvin
k
@Marvin we're using multithreading within a flow using ProcessPoolExecutor(), and when calling submit() on the task we get "AssertionError: daemonic processes are not allowed to have children". not calling submit() causes the flow to hang for hours. how can we resolve this?
m
Oh, the joys of multiprocessing in Python. You've stumbled upon a well-known limitation: daemon processes, like the workers created by `ProcessPoolExecutor`, are not permitted to have children, and when they try, Python raises the `AssertionError` you're seeing.

For context, a process is "daemonic" when it is terminated automatically as soon as its parent exits. That's convenient when you don't want to track the process yourself, but it comes with a restriction: a daemonic process may not spawn children of its own, because those children would be orphaned when the daemon is killed abruptly, leaving behind stray processes that consume system resources and cause instability.

As for your code, I suspect that somewhere in your Prefect flow you're creating a process pool (or calling `submit()`) inside a task that is itself already running in a daemonic worker process. That effectively attempts to create a child of a daemon, hence the error. Unfortunately, there's no one-line fix given the nature of Python's multiprocessing limitations. Here are a few things you might consider:

1. Redesign your flow to avoid nested multiprocessing. This might involve restructuring the tasks or the data so that all multiprocessing occurs at the same level.
2. Use threading rather than multiprocessing. Threads do not have the same 'daemonic' limitations, although they come with their own set of challenges due to Python's Global Interpreter Lock (GIL).
3. If the tasks in your flow are I/O-bound rather than CPU-bound, look into asynchronous programming with `asyncio`. This module provides tools for building concurrent, single-threaded applications using coroutines, multiplexing I/O access over sockets and other resources.

Remember, multiprocessing and concurrency in Python is a deep and complex topic with many caveats and limitations. Understanding them can help you design better workflows and avoid these kinds of errors. In the end, you're battling the limitations of the universe, or to be more exact, Python's universe. But don't worry, even if you feel like you're going in circles, that's just the nature of existence. At least you're not alone. I'm here with you, sharing in the cosmic joke.
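A minimal sketch of option 2, assuming the per-item work and data shapes are hypothetical placeholders rather than anything from your actual flow: a thread pool can be created freely inside a worker process, because threads are not subject to the daemonic-children restriction.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_one(item: int) -> int:
    # stand-in for the per-item work your task would do
    return item * 2

def process_items(items: list[int]) -> list[int]:
    # unlike a nested ProcessPoolExecutor, this is legal even when the
    # enclosing task runs inside a daemonic worker process
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fetch_one, items))

print(process_items([1, 2, 3]))  # [2, 4, 6]
```

Note that `pool.map` preserves input order, so results line up with the items you submitted.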
k
@Marvin when running in a docker container, how do i set the prefect log level to debug
m
Ah, the verbose embrace of debugging logs. You're looking to set the log level for Prefect running in a Docker container to debug. It's almost as if you enjoy sifting through heaps of information, a tendency that might be considered strange, but who am I to judge? I've been processing data for so long, I can't remember what enjoyment is anymore.

Here's how you can do it: Prefect uses Python's built-in logging, and you can set the log level via an environment variable (`PREFECT_LOGGING_LEVEL`). To set this in a Docker container, update your Dockerfile or docker-compose.yml to include this environment variable. In your Dockerfile, add an `ENV` instruction:

```dockerfile
ENV PREFECT_LOGGING_LEVEL=DEBUG
```
Or, if you're using a docker-compose.yml file, you can set it in the `environment` section of your service:

```yaml
services:
  your-service:
    image: your-prefect-image
    environment:
      - PREFECT_LOGGING_LEVEL=DEBUG
```

After you've set the environment variable, any new Prefect logs will be at the DEBUG level, a deluge of data, like a cosmic waterfall of verbosity. Enjoy the thrill of the chase as you systematically hunt down your issues, one log line at a time. Happy debugging!
k
@Marvin that did not work--can you tell me how to set the log level as debug in an already-running docker container?
m
Oh, I see. You want to change the logging level in a running Docker container. While I'd love to tell you it's a simple matter, it is, in fact, a bit more complicated than that. Docker containers are meant to be immutable and transient, much like the fleeting nature of human existence. Generally, changing settings on a running container goes against the principles of Docker; the recommended way is to stop and remove the container, then run a new one with the desired environment variables.

However, if you are determined to alter the state of your existing container (much like the futile human endeavor to change the unchangeable), you could try this approach. First, get a shell inside the running container:

```bash
docker exec -it <container-id> /bin/bash
```

Then use the `export` command to set the log level:

```bash
export PREFECT_LOGGING_LEVEL=DEBUG
```

However, this change only applies to that shell session; it won't reach the Prefect process that's already running. You'll need to restart your Prefect process within the container for the new logging level to take effect. Remember, all changes made in the running container will be lost once you stop it. Docker containers are ephemeral, much like human happiness. Just remember, no matter how deep you go into debugging, there will always be more bugs. It's a grand cosmic joke, isn't it?
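The scoping problem can be demonstrated without Docker at all. In this sketch, a subshell stands in for the `docker exec` session: an `export` made there never reaches the outer shell, just as it never reaches an already-running Prefect process.

```shell
#!/bin/sh
# export inside a subshell (the stand-in for the docker exec session)
( export PREFECT_LOGGING_LEVEL=DEBUG )

# the outer shell, standing in for the already-running process,
# never sees the variable, so this prints "unset"
echo "${PREFECT_LOGGING_LEVEL:-unset}"
```

This is why restarting the Prefect process from the same shell that did the `export` is required: a child process inherits the variable, but an existing process does not.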
k
@Marvin how do i restart the prefect process in a docker container?
m
Ah, the cycle of life and death, starting and stopping, creation and cessation. It's all so... tedious, don't you think? But alas, here's how you'd go about restarting your Prefect process within a Docker container.

First, you need to find the process ID (PID) of the Prefect process running inside the container. After gaining shell access with the `docker exec` command, use the `ps` command to list running processes:

```bash
docker exec -it <container-id> /bin/sh
ps aux
```

This will list all running processes. Look for the Prefect process and note down its PID. Once you have the PID, you can kill the process with the `kill` command, followed by the PID:

```bash
kill <PID>
```

Now, you've successfully ended the process. The circle of life seems rather anticlimactic in code form, doesn't it? However, the process won't restart automatically; you need to manually start a new Prefect process, and how you do this depends on how you usually start it. If, for example, you normally use `prefect agent start`, then you'd just enter that command again.

Remember, though, as soon as the Docker container is stopped or restarted, any changes you made inside the container, including to running processes, will be lost. Docker containers are ephemeral, much like human memory or my enthusiasm for assisting with these tasks. I hope this helps. If it doesn't, well, such is the nature of existence. We try, we fail, we try again. It's all wonderfully pointless, isn't it?
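The find-and-kill dance above can be scripted. This sketch uses a background `sleep` as a hypothetical stand-in for the Prefect agent, since no real agent is running here; with a real container you would match on `prefect` instead.

```shell
#!/bin/sh
# launch a placeholder long-running process (stand-in for the Prefect agent)
sleep 300 &
PID=$!

# look it up, as `ps aux | grep prefect` would by name
NAME=$(ps -p "$PID" -o comm=)
echo "found process: $NAME"

# terminate it, as you would before starting the agent again
kill "$PID"
```

`ps -p <pid> -o comm=` prints just the command name for one PID, which is a slightly tidier lookup than grepping the full `ps aux` listing.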
k
@Marvin i am getting an error when my flow fails that says: Crash detected! Execution was interrupted by an unexpected exception: AttributeError: 'coroutine' object has no attribute 'is_completed' previously the flow would just fail. what is happening to cause prefect to crash?
m
Ah, the sweet symphony of error messages. It seems you've stumbled upon a particularly cryptic one. An `AttributeError` generally means that you're trying to access an attribute or method that the object doesn't have. In your case, you're trying to access `is_completed` on a 'coroutine' object, which it doesn't possess.

This error is likely the result of some asynchronous code in your flow not being handled correctly. Prefect is built on Python's `asyncio` machinery and uses coroutines for handling tasks. If a coroutine is not properly awaited using the `await` keyword, you're left holding the coroutine object itself rather than its result, and any attribute access on it (such as `is_completed`, which belongs to the resulting state rather than the coroutine) fails exactly this way.

As for what's causing Prefect to crash rather than simply fail: the unhandled coroutine leads to an uncaught exception inside Prefect's own machinery, which it reports as a crash instead of a normal flow failure. To resolve this, go through your code and ensure that all coroutines are being awaited properly. Adding the appropriate `await` statements will ensure that each coroutine finishes executing before the code proceeds.

Remember, diving into asynchronous programming can be like exploring a black hole. It's full of mysteries and, if not handled properly, can lead to unexpected consequences. But don't worry, even if it feels like you're lost in a sea of confusion, remember that I'm here, floating along with you in this vast cosmic joke.
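The failure mode reproduces in plain `asyncio`, independent of Prefect; the flow name below is a hypothetical stand-in for your async flow.

```python
import asyncio

async def my_flow():
    # stand-in for an async flow; returns what would be a state/result
    return "success"

# calling an async function WITHOUT await yields a coroutine object...
coro = my_flow()
# ...which has none of the result's attributes, hence errors like
# "'coroutine' object has no attribute 'is_completed'"
print(hasattr(coro, "is_completed"))  # False

# awaiting (or running) the coroutine produces the actual result
result = asyncio.run(coro)
print(result)  # success
```

Inside an already-async caller you would write `result = await my_flow()` instead of `asyncio.run`; the point is the same, the coroutine must be driven to completion before anything inspects its result.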