John-Craig Borman
04/05/2023, 2:24 PM
13:29:44.526 | INFO | prefect.engine - Engine execution of flow run '0f87d3ab-658a-4fb8-bf9d-9daf903bcaf1' aborted by orchestrator: This run cannot transition to the RUNNING state from the RUNNING state.
13:29:43.035 | INFO | Flow run 'unnatural-leech' - Downloading flow code from storage at None
warn(RuntimeWarning(msg))
/usr/lib/python3.10/runpy.py:126: RuntimeWarning: 'prefect.engine' found in sys.modules after import of package 'prefect', but prior to execution of 'prefect.engine'; this may result in unpredictable behaviour
prefect version:
Version: 2.8.7
API version: 0.8.4
Python version: 3.10.10
Git commit: a6d6c6fc
Built: Thu, Mar 23, 2023 3:27 PM
OS/Arch: linux/x86_64
Profile: default
Server type: ephemeral
Server:
  Database: sqlite
  SQLite version: 3.31.1
Zanie
04/05/2023, 2:25 PM
John-Craig Borman
04/05/2023, 2:26 PM
Zanie
05/01/2023, 4:33 PM
John-Craig Borman
05/01/2023, 4:44 PM
The logs there are probably misleading. What most likely happened is:
1. Your deployment triggered a flow run on some Pod 1
2. The Prefect flow run enters a Running state
   a. Pod 1 crashes while running the flow (for some unknown reason, possibly OOM as you mention)
3. K8s sees the crashed pod and tries to restart it by creating Pod 2 to execute the same flow run
4. The flow run tries to enter a Running state again, but it is already Running
   a. Prefect's state machine complains that this transition is invalid
   b. It raises the error we see above instead of surfacing the OOM (or other failure) that crashed Pod 1
We suspect that this is roughly what happened in our K8s deployment: the error message above is a scapegoat for some root failure.
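To make the sequence above concrete, here is a toy model of it. This is only a sketch and not Prefect's actual orchestration code: `ToyOrchestrator`, the `ALLOWED` transition table, and `simulate_pod_restart` are hypothetical names made up for illustration, and the real rules are likely more involved. It only shows why a second pod proposing RUNNING for a run that is already RUNNING would be rejected, while the original crash leaves no trace in the run's state.

```python
from enum import Enum


class State(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    CRASHED = "CRASHED"


# Hypothetical transition table for this toy model only.
ALLOWED = {
    State.PENDING: {State.RUNNING},
    State.RUNNING: {State.COMPLETED, State.CRASHED},
}


class ToyOrchestrator:
    """Tracks one state per flow run and rejects invalid transitions."""

    def __init__(self):
        self.runs = {}

    def schedule(self, run_id):
        self.runs[run_id] = State.PENDING

    def propose(self, run_id, new_state):
        """Return (accepted, message) for a proposed state change."""
        current = self.runs[run_id]
        if new_state not in ALLOWED.get(current, set()):
            return False, (
                f"This run cannot transition to the {new_state.name} "
                f"state from the {current.name} state."
            )
        self.runs[run_id] = new_state
        return True, "accepted"


def simulate_pod_restart():
    """Replay the suspected sequence: Pod 1 dies silently, Pod 2 retries."""
    orchestrator = ToyOrchestrator()
    orchestrator.schedule("run-1")

    # Pod 1 starts the flow run; the transition is accepted.
    pod1 = orchestrator.propose("run-1", State.RUNNING)

    # Pod 1 is killed (e.g. OOM) before it can report CRASHED,
    # so the orchestrator still believes the run is RUNNING.

    # K8s creates Pod 2, which proposes RUNNING for the same run.
    pod2 = orchestrator.propose("run-1", State.RUNNING)
    return pod1, pod2


if __name__ == "__main__":
    for accepted, message in simulate_pod_restart():
        print(accepted, message)
```

In this model the only message anyone sees comes from Pod 2's rejected transition, so the cause of Pod 1's death has to be found elsewhere (for example in the terminated pod's status in K8s).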