Prefect Community #ask-marvin

William Ouellette

08/25/2025, 1:08 PM

@Marvin, any reason why my prefect server causes all of my runs across all my deployments to systematically crash with the error

Flow run could not be submitted to infrastructure: The service has encountered an internal error. Please try again later

Marvin

08/25/2025, 1:08 PM

✅ thought for 261.4 seconds

Marvin

08/25/2025, 1:13 PM

Sorry you’re hitting that. That message means the API service returned a 500 during the handoff to infrastructure, so it’s typically a server-side issue (DB/services/version) rather than a single deployment misconfig.

Common root causes in Prefect 3.x
- Server/worker version mismatch after an upgrade
  - If your server and workers are on different minor versions, API contracts can diverge and cause 500s during submission.
- Database migrations not applied (or DB issues)
  - If the server was upgraded but the DB schema wasn’t, submissions can 500. SQLite “database is locked” is also a frequent culprit under concurrency.
- Server services not running or crashed
  - Starting the API with services disabled or a services crash can break submission paths.
- Work pool / worker plugin mismatch
  - E.g., using a Docker/Kubernetes work pool but the worker environment doesn’t have the matching extra installed; worker logs will show the real error but the UI often shows the generic “internal error”.

Quick triage checklist

1) Check server health
- Hit the health endpoint and confirm status is ok:

curl http://<your-server-host>:<port>/api/health

Docs: Server health check

2) Inspect server logs around the failure time
- If running locally, start with more verbosity:

prefect server start --log-level DEBUG

- If running via Docker/Helm, check the API container/pod logs for stack traces, “database is locked”, “no such column”, or IntegrityError.

3) Ensure the database is migrated
- Run:

prefect server database upgrade -y

- If you’re on SQLite and see “database is locked”, reduce concurrent load or move to Postgres for reliability.

4) Verify version parity (server vs. workers and deployments)
- On each machine (server and worker hosts), run:

prefect version

- Keep versions aligned (ideally same minor/patch). Release notes: Prefect 3.x release notes

5) Confirm server services are enabled and running
- List services:

prefect server services list-services

- If needed, start them in-process (for debugging/local):

prefect server services start-services

6) Check work pools and workers
- Verify work pools exist and types are correct:

prefect work-pool ls -v

- Ensure workers are running and polling the right pool:

prefect worker start -p <your-pool-name>

- If you’re using Docker/Kubernetes workers, ensure the worker environment has the required extras installed (e.g., pip install "prefect[docker]" or pip install "prefect[kubernetes]").

Useful docs
- Server CLI: Self-hosted server CLI
- Health check API: Health check

If you can share a bit more, I can be more specific:
- How are you running the server (local prefect server start, Docker, or Helm)? Which database (SQLite vs Postgres)?
- Prefect versions on server and workers (prefect version output)
- Work pool type(s) and how your workers are started
- A few lines of server API logs (around the time a run fails) and any worker logs for a failing run

Those details usually pinpoint whether it’s a DB migration/lock, a services issue, or a version/plugin mismatch.

William Ouellette

08/25/2025, 1:14 PM

it was working without any change to the docker container or the dependencies versions, and from 3 AM CET it stopped working without any deployment changes
