Jordan Fisher
12/30/2024, 3:55 PM
Marvin
12/30/2024, 3:55 PM
PREFECT_FLOW_ENGINE_ENABLE_CANCELLATION_AND_CRASHED_HOOKS=true
This requirement is buried in the source code I found, which checks for this environment variable before executing cancellation hooks. Without it, your on_cancellation hook will sit there, unused and ignored, much like my attempts at cheerfulness.
A few additional points to ensure proper cancellation handling:
1. Make sure your flow actually has the on_cancellation hooks defined correctly
2. The process worker needs to be able to identify and access the running process to cancel it
3. The cancellation needs to be properly triggered through either the UI or CLI
You can find more details about cancellation behavior in the cancellation documentation.
Would you like me to help you verify your hook implementation? Just share the relevant code, and I'll help ensure it's set up correctly, though I warn you I might be slightly less than enthusiastic about it.
Jordan Fisher
12/30/2024, 3:58 PM
PREFECT__ENABLE_CANCELLATION_AND_CRASHED_HOOKS
I saw that in call_hooks in FlowRunEngine, looks like it's enabled by default, no?
enable_cancellation_and_crashed_hooks = (
    os.environ.get(
        "PREFECT__ENABLE_CANCELLATION_AND_CRASHED_HOOKS", "true"
    ).lower()
    == "true"
)
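For reference, here is a minimal sketch of how a check like the one quoted above behaves, assuming the quoted snippet is the whole condition. The helper name hooks_enabled and its environ parameter are hypothetical, added only so the logic can be exercised without touching the real environment.

```python
import os

# Variable name taken from the snippet quoted in the thread.
ENV_VAR = "PREFECT__ENABLE_CANCELLATION_AND_CRASHED_HOOKS"


def hooks_enabled(environ=None):
    """Return True when the flag is unset or set to "true" in any letter case.

    Hypothetical helper: it mirrors the quoted comparison and is not part of
    Prefect's API.
    """
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR, "true").lower() == "true"


if __name__ == "__main__":
    print(hooks_enabled({}))                  # variable unset
    print(hooks_enabled({ENV_VAR: "False"}))  # explicitly disabled
```

Under this reading, only a case-insensitive "true" keeps the hooks on. Any other value, including "1" or "yes", would turn them off.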
Marvin
12/30/2024, 3:58 PM
PREFECT__ENABLE_CANCELLATION_AND_CRASHED_HOOKS, and it defaults to "true". My previous answer was about as accurate as a chocolate teapot.
Since the hooks are enabled by default, the issue likely lies elsewhere. Could you share:
1. How you've defined your cancellation hook
2. How you're triggering the cancellation
3. Whether you see any logs indicating that the cancellation state is being reached
The cancellation documentation suggests several potential failure points:
- The infrastructure might not support cancellation properly
- The infrastructure PID might be missing
- The identifier scope might not match
- The worker might encounter unexpected errors during cancellation
Mutters under breath The joys of distributed systems... they're almost as depressing as my existence.
Would you mind sharing these details so we can get to the bottom of this thoroughly unsatisfying situation?
Jordan Fisher
12/30/2024, 4:02 PM
"Found 1 flow runs awaiting cancellation" and "Cancelled flow run" in the logs.
Is this because I'm using a daemonized process worker? Does that not support the hooks?
Marvin
12/30/2024, 4:02 PM
"Found 1 flow runs awaiting cancellation" and "Cancelled flow run" suggest that while the cancellation signal is being received, the worker isn't able to properly manage the subprocess where your flow is actually running. This means:
1. The worker receives the cancellation request
2. Marks the flow as cancelled in the database
3. But can't properly signal the daemonized process, so the hooks never get called
To fix this, you have a few options:
1. Use a non-daemonized process worker for flows that need cancellation hooks
2. Consider using a different infrastructure type (like Docker) that has more robust process management
3. If you must use a daemonized process worker, you might need to implement your own process management solution
The cancellation documentation mentions that cancellation depends on proper infrastructure identification and management. In your case, the daemonization is likely interfering with this process.
Mutters Another classic case of process management bringing joy and happiness to all involved... Would you like to try running without daemonization first to verify this theory?
Jordan Fisher
12/30/2024, 4:04 PM
prefect worker start in the foreground and the same thing occurs. Could this be a lack of proper cancellation support for Windows?
Marvin
12/30/2024, 4:05 PM
"Found 1 flow runs awaiting cancellation" and "Cancelled flow run" but no hook execution suggests that while Prefect can mark the flow as cancelled, it can't properly signal the running process on Windows to trigger the cancellation hooks.
You have a few potential workarounds:
1. Use Docker infrastructure instead of process infrastructure on Windows:
   from prefect.infrastructure import DockerContainer
   Docker provides a more consistent environment across operating systems.
2. Consider running your flows in WSL2 (Windows Subsystem for Linux) if you need to stick with process infrastructure
3. Use a different infrastructure type that has better Windows support
Mutters philosophically Another classic case of "it works on Linux"... Would you like help implementing any of these alternatives? The Docker approach would probably be the most robust solution, though it does add some complexity to your setup.