<@ULVA73B9P> Why am I getting this error? ```15:49...
# ask-marvin
j
@Marvin Why am I getting this error?
Copy code
15:49:14.892 | ERROR   | asyncio - Exception in callback FlowRunCancellingObserver._start_polling_task.<locals>.<lambda>(<Task cancell...rvices.py:21>>) at /usr/local/lib/python3.11/site-packages/prefect/runner/_observers.py:99
handle: <Handle FlowRunCancellingObserver._start_polling_task.<locals>.<lambda>(<Task cancell...rvices.py:21>>) at /usr/local/lib/python3.11/site-packages/prefect/runner/_observers.py:99>
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/local/lib/python3.11/site-packages/prefect/runner/_observers.py", line 103, in <lambda>
    if task.exception()
       ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/prefect/utilities/services.py", line 64, in critical_service_loop
    await workload()
  File "/usr/local/lib/python3.11/site-packages/prefect/runner/_observers.py", line 131, in _check_for_cancelled_flow_runs
    typed_cancelling_flow_runs = await self._client.read_flow_runs(
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/prefect/client/orchestration/_flow_runs/client.py", line 813, in read_flow_runs
    response = await self.request("POST", "/flow_runs/filter", json=body)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/prefect/client/orchestration/base.py", line 53, in request
    return await self._client.send(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/prefect/client/base.py", line 348, in send
    response = await self._send_with_retry(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/prefect/client/base.py", line 268, in _send_with_retry
    response = await send(request, *send_args, **send_kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1629, in send
    response = await self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1657, in _send_handling_auth
    response = await self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1694, in _send_handling_redirects
    response = await self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpx/_client.py", line 1730, in _send_single_request
    response = await transport.handle_async_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpx/_transports/default.py", line 394, in handle_async_request
    resp = await self._pool.handle_async_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 256, in handle_async_request
    raise exc from None
  File "/usr/local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 236, in handle_async_request
    response = await connection.handle_async_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpcore/_async/connection.py", line 103, in handle_async_request
    return await self._connection.handle_async_request(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 136, in handle_async_request
    raise exc
  File "/usr/local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 106, in handle_async_request
    ) = await self._receive_response_headers(**kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 177, in _receive_response_headers
    event = await self._receive_event(timeout=timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpcore/_async/http11.py", line 217, in _receive_event
    data = await self._network_stream.read(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/httpcore/_backends/anyio.py", line 35, in read
    return await self._stream.receive(max_bytes=max_bytes)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 1263, in receive
    await self._protocol.read_event.wait()
  File "/usr/local/lib/python3.11/asyncio/locks.py", line 213, in wait
    await fut
asyncio.exceptions.CancelledError
m
thought for 95.4 seconds
Short answer: it’s benign. That stack trace is an asyncio CancelledError raised when Prefect’s background “cancellation observer” task is intentionally stopped during shutdown or when a run finishes/cancels. The HTTP request it was waiting on gets cancelled, and asyncio logs it as an error.

What’s happening
- Prefect runs a background task (FlowRunCancellingObserver) that listens for “Cancelling” signals from the API. It polls/websockets and, on shutdown or when the flow run ends, that task is cancelled on purpose.
- In asyncio/AnyIO, cancelling a task raises CancelledError inside whatever it was awaiting (here: an HTTPX request). That bubbles up to the event loop and you see the “Exception in callback … CancelledError” log.
- Outcome-wise, nothing is wrong with your run; this is just noisy logging during teardown.

When to worry
- If you also see flows getting stuck in “Cancelling” or cancellation not taking effect, that’s a different issue. There have been recent improvements around this observer and its fallback behavior:
  - Add polling fallback for FlowRunCancellingObserver
- If you’re experiencing stuck cancellations, or this error is flooding logs during normal operation (not just shutdown), let me know your Prefect version, worker type, and where this is running (Cloud/Server), and we can dig in.

What you can do
- If everything else works, you can safely ignore this log.
- Make sure you’re on the latest Prefect 3.x release; we’ve continued to refine cancellation and observer behavior.
- If you’d like to reduce noise, you can lower the log level for the “asyncio” logger in your logging config so these CancelledError callbacks don’t show as ERROR (see the sketch at the end of this message).
- Background on Prefect’s async/cancellation model: How AnyIO powers Prefect’s async architecture

If you can share:
- Prefect version (prefect version)
- Where you see this (during Ctrl+C, container shutdown, end of run, or randomly mid-run)
- Whether cancellations are working as expected
…I can confirm whether it’s purely cosmetic in your case or help troubleshoot further.
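For the log-noise option, here’s a minimal sketch (assumptions: you can run a couple of lines of Python in your entrypoint before Prefect starts doing work; this only touches the stdlib “asyncio” logger, not Prefect’s own logging settings):
Copy code
import logging

# Quiet the stdlib "asyncio" logger so the CancelledError callback tracebacks
# logged at ERROR during teardown no longer appear; flow/worker logs are unaffected.
logging.getLogger("asyncio").setLevel(logging.CRITICAL)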
j
Copy code
Version:              3.4.20
API version:          0.8.4
Python version:       3.11.13
Git commit:           5d7d5eb6
Built:                Thu, Sep 25, 2025 09:04 PM
OS/Arch:              linux/x86_64
Profile:              ephemeral
Server type:          server
Pydantic version:     2.11.9
Server:
  Database:           sqlite
  SQLite version:     3.46.1
@Marvin see my message above. A flow run is stuck in a Cancelling state forever and as a result the next scheduled one is late
m
thought for 246.6 seconds
Got it — this isn’t just noisy logging; you’ve got a real cancellation that didn’t take effect and it’s blocking the next run.

What likely happened
- The runner set the state to Cancelling, but the background cancellation observer didn’t complete the kill/cleanup on the executing process. There have been fixes to make this observer more robust (e.g., adding a polling fallback when websockets fail).
  - PR: Add polling fallback for FlowRunCancellingObserver
  - Related report: Flows stuck in Cancelling state when run via work pools

Unblock right now
- If you can, restart the worker (or kill the underlying flow process). This will usually free the slot so the next run can start; the stuck run will move to Crashed/Cancelled on reconnect.
- Force-set the flow run to Cancelled with force=True via the client:
Copy code
import asyncio, uuid
from prefect.client.orchestration import get_client
from prefect.states import Cancelled

async def force_cancel(flow_run_id: str):
    async with get_client() as client:
        await client.set_flow_run_state(uuid.UUID(flow_run_id), Cancelled(name="Force-cancelled"), force=True)

asyncio.run(force_cancel("<FLOW_RUN_ID>"))
- To quickly find the run IDs stuck in Cancelling:
Copy code
prefect flow-run ls --state CANCELLING --limit 50
Stabilize going forward
- Upgrade to the latest Prefect 3.x — there have been improvements to cancellation handling and the FlowRunCancellingObserver (see PR above).
- Check the worker logs around the time you clicked Cancel. If you see websocket errors or disconnects, the polling fallback in newer versions helps.
- If your worker concurrency is 1, a stuck Cancelling run will block the next start; after upgrading, also consider temporarily restarting workers when you see cancellations not taking effect.

Help me reproduce
Could you share:
- prefect version
- Worker type and where it’s running (process/Docker/K8s/ECS; Prefect Cloud or Server)
- How you issued the cancel (UI vs CLI vs code)
- A snippet of worker logs around the cancellation time
With that, I can pinpoint whether you’re hitting the known observer issue or something environment-specific and suggest a tighter fix.
j
@Marvin As I messaged earlier, my prefect version is 3.4.20; the flow was cancelled through an automation that I set in the UI; the automation tries to cancel a flow if it's been running for over 5 hours
m
thought for 81.7 seconds
Thanks for the extra details — that helps. Given you’re on 3.4.20 and the cancel was triggered by an Automation, you’re likely hitting the known robustness issues around the FlowRunCancellingObserver. There have been fixes since 3.4.20 that make cancellation much more reliable, including a polling fallback when the cancel websocket can’t be established or is lost:
- PR: Add polling fallback for FlowRunCancellingObserver
- Recent release notes (includes related cancellation fixes): 3.4.23 release (full changelog: diff)

What to do right now
- Unstick the run:
  - Restart the worker that’s executing the stuck flow (or kill the underlying process/container). This typically frees the slot and lets the next run start.
  - Or force-cancel the flow run:
Copy code
import asyncio, uuid
from prefect.client.orchestration import get_client
from prefect.states import Cancelled

async def force_cancel(flow_run_id: str):
    async with get_client() as client:
        await client.set_flow_run_state(
            uuid.UUID(flow_run_id),
            Cancelled(name="Force-cancelled"),
            force=True,
        )

asyncio.run(force_cancel("<FLOW_RUN_ID>"))
To list stuck runs:
Copy code
prefect flow-run ls --state CANCELLING --limit 50
Hardening going forward
- Upgrade Prefect to the latest 3.4.x. The observer/polling fallback and other cancellation fixes landed after 3.4.20, and they specifically target cases where cancellations don’t complete.
- Add a flow-level timeout so the engine cancels from inside the run (often more reliable than an external cancel alone):
Copy code
from prefect import flow

@flow(timeout_seconds=5*60*60)
def my_flow():
    ...
- Consider a safety net for runs stuck in Cancelling too long: if possible in your setup, add a follow-up Automation that detects “state=Cancelling for > N minutes” and escalates (e.g., ChangeFlowRunState to Cancelled). If you need a true force, a small service/script using the client with force=True can be triggered by a webhook action from Automations (see the sketch after this list).
- If your work pool concurrency is 1, a stuck Cancelling run will block the next. As a mitigation, you can temporarily allow concurrency >1 so one stuck run doesn’t delay the schedule (while still fixing the underlying issue).

A couple of questions to pinpoint the root cause
- Which worker/infrastructure are you using (process, Docker, K8s, ECS)? This affects how the cancel signal ultimately terminates the run.
- Do the worker logs around the automation’s cancel timestamp show websocket errors or disconnects? The polling fallback in newer versions addresses that case.

If you can share the worker type and we confirm upgrading is feasible, I can suggest any infra-specific tweaks to make cancellation immediate in your environment.
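A rough sketch of that “force-cancel anything stuck in Cancelling” script (assumptions: the Prefect 3.x Python client, a hypothetical 15-minute threshold, and that you run it yourself on a schedule or behind the webhook action; it is not something Prefect ships):
Copy code
import asyncio
from datetime import datetime, timedelta, timezone

from prefect.client.orchestration import get_client
from prefect.client.schemas.filters import (
    FlowRunFilter,
    FlowRunFilterState,
    FlowRunFilterStateName,
)
from prefect.states import Cancelled

STUCK_AFTER = timedelta(minutes=15)  # illustrative threshold, not a Prefect setting

async def sweep_stuck_cancelling():
    async with get_client() as client:
        # Find runs currently in the Cancelling state
        runs = await client.read_flow_runs(
            flow_run_filter=FlowRunFilter(
                state=FlowRunFilterState(
                    name=FlowRunFilterStateName(any_=["Cancelling"])
                )
            )
        )
        now = datetime.now(timezone.utc)
        for run in runs:
            # state.timestamp records when the run entered its current state
            if run.state and now - run.state.timestamp > STUCK_AFTER:
                await client.set_flow_run_state(
                    run.id,
                    Cancelled(name="Force-cancelled by sweeper"),
                    force=True,
                )

if __name__ == "__main__":
    asyncio.run(sweep_stuck_cancelling())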
j
@Marvin I updated Prefect to the latest version, and now I am getting
Copy code
/usr/local/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'default' attribute with value 'UTC' was provided to the `Field()` function, which has no effect in the context it was used. 'default' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
  warnings.warn(
m
thought for 97.1 seconds
Thanks for the update. That Pydantic warning is benign. It’s emitted when a model uses Field(default=...) in a context where Pydantic v2 expects the default to be set on the model field itself (or via Annotated metadata), not inside Field(). The “UTC” part suggests it’s a timezone default in one of Prefect’s models. Functionally, your flows should run fine; it’s just noisy.

Quick options
- Keep running: it does not affect behavior.
- Suppress the warning (fastest way):
  - Via env var (works for workers/containers too):
Copy code
# the message field is matched as a literal prefix of the warning text (the class name is not part of the message)
export PYTHONWARNINGS="ignore:The 'default' attribute with value"
PowerShell:
Copy code
$env:PYTHONWARNINGS="ignore:The 'default' attribute with value"
- Or at the very top of your entrypoint (before importing Prefect or your flow code):
Copy code
import warnings

# match on the start of the warning text; the class name
# (UnsupportedFieldAttributeWarning) does not appear in the message itself
warnings.filterwarnings(
    "ignore",
    message=r"The 'default' attribute with value",
)
- If you prefer not to suppress: you can safely ignore it until we release a patch that removes the offending Field(default=...) usage.

A couple of follow-ups so I can track this and get it fixed upstream:
- What exact Prefect version are you on now (e.g., 3.4.23)?
- What Pydantic version is installed (pip show pydantic)?
- If you can, run once with default warning formatting so we get the full stack (helps pinpoint the exact field):
Copy code
PYTHONWARNINGS=default <your command>
and paste the stack that points into Prefect. Separately, did upgrading resolve your Cancelling-stuck issue from the automation? If not, I can help dig deeper.