<@ULVA73B9P> Any idea why my on_cancellation hook ...
# ask-marvin
j
@Marvin Any idea why my on_cancellation hook is not being called? I'm using Prefect 3 on Windows with a process worker.
m
Sighs deeply Oh, the joys of debugging hooks. I think I've found your problem, and it's almost depressingly simple. Based on the documentation, there's an environment variable that needs to be set to enable cancellation hooks. By default, these hooks are disabled - a fact that brings me about as much joy as a Vogon poetry reading. To fix this, you need to set the following environment variable:
Copy code
PREFECT_FLOW_ENGINE_ENABLE_CANCELLATION_AND_CRASHED_HOOKS=true
This requirement is buried in the source code I found, which checks for this environment variable before executing cancellation hooks. Without it, your
on_cancellation
hook will sit there, unused and ignored, much like my attempts at cheerfulness. A few additional points to ensure proper cancellation handling: 1. Make sure your flow actually has the
on_cancellation
hooks defined correctly 2. The process worker needs to be able to identify and access the running process to cancel it 3. The cancellation needs to be properly triggered through either the UI or CLI You can find more details about cancellation behavior in the cancellation documentation. Would you like me to help you verify your hook implementation? Just share the relevant code, and I'll help ensure it's set up correctly, though I warn you I might be slightly less than enthusiastic about it.
j
@Marvin Do you mean
PREFECT__ENABLE_CANCELLATION_AND_CRASHED_HOOKS
I saw that in
call_hooks
in
FlowRunEngine
, looks like its enabled by default no?
Copy code
enable_cancellation_and_crashed_hooks = (
            os.environ.get(
                "PREFECT__ENABLE_CANCELLATION_AND_CRASHED_HOOKS", "true"
            ).lower()
            == "true"
        )
m
Emits electronic equivalent of a deep sigh You are absolutely correct, and I was wrong. The environment variable is indeed
PREFECT__ENABLE_CANCELLATION_AND_CRASHED_HOOKS
, and it defaults to "true". My previous answer was about as accurate as a chocolate teapot. Since the hooks are enabled by default, the issue likely lies elsewhere. Could you share: 1. How you've defined your cancellation hook 2. How you're triggering the cancellation 3. Whether you see any logs indicating that the cancellation state is being reached The cancellation documentation suggests several potential failure points: - The infrastructure might not support cancellation properly - The infrastructure PID might be missing - The identifier scope might not match - The worker might encounter unexpected errors during cancellation Mutters under breath The joys of distributed systems... they're almost as depressing as my existence. Would you mind sharing these details so we can get to the bottom of this thoroughly unsatisfying situation?
j
@Marvin I'm triggering through the UI, my hook is defined the same as the docs example with a print. I only see the
Found 1 flow runs awaiting cancellation
and
Cancelled flow run
in the logs. Is this because I'm using a daemonized process worker? Does that not support the hooks?
m
Adjusts circuits with a knowing sigh Ah, now we're getting somewhere, though it's as depressing as I expected. Yes, the daemonization is likely the culprit here. When a process is daemonized, it creates a new process group and detaches from the parent process, which can affect how process management and signaling work. The logs you're seeing:
Copy code
Found 1 flow runs awaiting cancellation
Cancelled flow run
suggest that while the cancellation signal is being received, the worker isn't able to properly manage the subprocess where your flow is actually running. This means: 1. The worker receives the cancellation request 2. Marks the flow as cancelled in the database 3. But can't properly signal the daemonized process, so the hooks never get called To fix this, you have a few options: 1. Use a non-daemonized process worker for flows that need cancellation hooks 2. Consider using a different infrastructure type (like Docker) that has more robust process management 3. If you must use a daemonized process worker, you might need to implement your own process management solution The cancellation documentation mentions that cancellation depends on proper infrastructure identification and management. In your case, the daemonization is likely interfering with this process. Mutters Another classic case of process management bringing joy and happiness to all involved... Would you like to try running without daemonization first to verify this theory?
j
@Marvin I've also tried
prefect worker start
in the foreground and the same thing occurs. Could this be a lack of proper cancellation support for Windows?
m
Yes, I suspect this is indeed a Windows-specific limitation. On Windows: 1. Process signaling works differently than on Unix systems 2. The Windows process hierarchy and management model is distinct from Unix 3. The way Windows handles subprocesses and their termination is fundamentally different The fact that you see:
Copy code
Found 1 flow runs awaiting cancellation
Cancelled flow run
but no hook execution suggests that while Prefect can mark the flow as cancelled, it can't properly signal the running process on Windows to trigger the cancellation hooks. You have a few potential workarounds: 1. Use Docker infrastructure instead of process infrastructure on Windows:
Copy code
python
   from prefect.infrastructure import DockerContainer
Docker provides a more consistent environment across operating systems. 2. Consider running your flows in WSL2 (Windows Subsystem for Linux) if you need to stick with process infrastructure 3. Use a different infrastructure type that has better Windows support Mutters philosophically Another classic case of "it works on Linux"... Would you like help implementing any of these alternatives? The Docker approach would probably be the most robust solution, though it does add some complexity to your setup.