@Marvin Is there an equivalent to .delay for ...
# ask-marvin
p
@Marvin Is there an equivalent to .delay for flow? I have very long tasks but I would like to keep them in flow and not call them one by one
m
thought for 613.1 seconds
Short answer: yes — use submit/map and a task runner. In Prefect 3.x, the rough equivalent of Celery’s .delay is:
- task.submit(...) for tasks
- flow.submit(...) for subflows
Both return PrefectFuture objects you can wait on later. If you want to fan out many calls, use mapping: my_task.map(iterable).

Common patterns (Prefect 3.x)

1) Submit tasks concurrently inside one flow
- Set a task runner so tasks actually run in parallel (e.g., threads).
- Collect PrefectFutures and resolve when you want.
Example:

from prefect import flow, task
from prefect.task_runners import ThreadPoolTaskRunner
from prefect.futures import as_completed

@task
def do_work(i: int) -> int:
    # pretend it’s long-running
    import time; time.sleep(2)
    return i * i

@flow(task_runner=ThreadPoolTaskRunner(max_workers=8))
def parent():
    futures = [do_work.submit(i) for i in range(20)]
    for fut in as_completed(futures):
        print("done:", fut.result())

if __name__ == "__main__":
    parent()

2) Map a task over many inputs (fan-out in one line)

from prefect import flow, task, unmapped
from prefect.task_runners import ThreadPoolTaskRunner

@task
def process(item, static_cfg):
    return item + static_cfg["offset"]

@flow(task_runner=ThreadPoolTaskRunner(max_workers=8))
def parent(items):
    futures = process.map(items, static_cfg=unmapped({"offset": 10}))
    results = [f.result() for f in futures]
    return results

3) Submit subflows concurrently

from prefect import flow
from prefect.task_runners import ThreadPoolTaskRunner

@flow
def child(x: int) -> int:
    return x * 2

@flow(task_runner=ThreadPoolTaskRunner(max_workers=4))
def parent():
    futs = [child.submit(x=i) for i in range(10)]
    return [f.result() for f in futs]

4) Kick off entirely separate flow runs (like queueing work to workers)
If you want to enqueue long work to run on a work pool (not inside the current process), trigger deployments from a flow:

from prefect import flow
from prefect.deployments import run_deployment

@flow
def orchestrate():
    # Fire-and-forget multiple remote runs
    for i in range(5):
        run_deployment(
            name="package_name/child-flow-deployment",
            parameters={"x": i},
        )
    # You can also wait/poll on them later if you like

When to use which
- Need fast fan-out within one Python process (I/O bound work)? Use task.submit or .map with ThreadPoolTaskRunner.
- Want to scale out on workers (CPU-bound/long jobs, isolation, retries, elasticity)? Use deployments + run_deployment to queue separate flow runs.

Gotchas
- submit/map return PrefectFutures — they don’t auto-resolve; call .result() or iterate with as_completed.
- Mixing blocking calls (my_task()) with submitted calls (my_task.submit()) will serialize parts of your work. Stick to submit/map for concurrency.
- Configure concurrency explicitly (e.g., ThreadPoolTaskRunner(max_workers=N)) so you aren’t surprised by defaults.

Docs
- Run work concurrently: https://docs-3.prefect.io/v3/how-to-guides/workflows/run-work-concurrently
- Task runners: https://docs-3.prefect.io/v3/concepts/task-runners
- Futures and as_completed: https://docs-3.prefect.io/v3/api-ref/python/prefect-futures
- Trigger deployments from a flow (run_deployment): https://docs-3.prefect.io/v3/api-ref/python/prefect-deployments#run_deployment

If you can share whether you want concurrency within a single flow run vs. queuing separate runs on a work pool, I can tailor the pattern and example to your setup. In either case, you don’t have to call them “one by one.” That would be far too cheerful.
p
@Marvin AttributeError: 'Flow' object has no attribute 'submit'
m
thought for 120.7 seconds
You’re right — Flow objects in Prefect 3.x do not have a .submit method. That one’s on me. What to use instead depends on what you’re trying to “delay”:

Option A: Keep the long work as tasks and run them concurrently
- Use task.submit(...) or task.map(...) inside a flow.
- Set a task runner (e.g., ThreadPoolTaskRunner) to get actual parallelism.
- This is the closest to Celery’s .delay for in-process concurrency.
Example:

from prefect import flow, task
from prefect.task_runners import ThreadPoolTaskRunner
from prefect.futures import as_completed

@task
def long_task(i: int) -> int:
    import time; time.sleep(2)
    return i * i

@flow(task_runner=ThreadPoolTaskRunner(max_workers=8))
def main():
    futures = [long_task.submit(i) for i in range(20)]
    for fut in as_completed(futures):
        print("done:", fut.result())

Docs:
- Run work concurrently: https://docs-3.prefect.io/v3/how-to-guides/workflows/run-work-concurrently
- Task runners: https://docs-3.prefect.io/v3/concepts/task-runners
- Futures: https://docs-3.prefect.io/v3/api-ref/python/prefect-futures

Option B: If you want to fan out multiple flows (not tasks) concurrently
Calling a flow like child_flow(...) inside a parent flow runs synchronously (blocks). To “queue” multiple flows to run concurrently on your workers, create a deployment for the child flow and trigger it multiple times from the parent with run_deployment. Each call creates a separate flow run.
Example:

from prefect import flow
from prefect.deployments import run_deployment
from prefect.flow_runs import wait_for_flow_run

@flow
def orchestrate():
    runs = [
        run_deployment(
            name="my-package/child-flow-deployment",
            parameters={"x": i},
            flow_run_name=f"child-{i}",
        )
        for i in range(10)
    ]
    # Optionally wait for completion
    for fr in runs:
        state = wait_for_flow_run(fr.id, poll_interval=5)
        print(fr.name, "->", state.type)

Docs:
- run_deployment API: https://docs-3.prefect.io/v3/api-ref/python/prefect-deployments#run_deployment
- Flow run helpers (wait_for_flow_run): https://docs-3.prefect.io/v3/api-ref/python/prefect-flow_runs

Option C: From outside a flow, submit flows to a runner
If you’re not inside a flow and want to programmatically kick off flow runs, use submit_to_runner:

from prefect.runner import submit_to_runner, wait_for_submitted_runs

# flows is a Flow object (e.g., returned by @flow)
flow_runs = submit_to_runner(prefect_callable=my_flow, parameters=[{"x": i} for i in range(10)])
wait_for_submitted_runs(flow_runs)

API: submit_to_runner signature shown here: prefect.runner: submit_to_runner

Which should you choose?
- Need in-process concurrency for long tasks? Use tasks + submit/map with a ThreadPoolTaskRunner.
- Need to scale out and let workers handle long work independently? Use deployments + run_deployment to fan out separate flow runs.

If you share whether your “very long tasks” are CPU-bound, I/O-bound, and whether you want to keep everything in one flow run vs. queue separate runs on a work pool, I’ll point you to the cleanest pattern for your case. I promise not to invent any more methods that don’t exist.
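One more in-process option, if your subflows can be async: awaiting them with asyncio.gather runs them concurrently inside the parent flow run, no deployments required. A minimal sketch (names are placeholders, adjust to your flows):

import asyncio
from prefect import flow

@flow
async def child(x: int) -> int:
    # stand-in for a long-running subflow
    await asyncio.sleep(2)
    return x * 2

@flow
async def parent() -> list[int]:
    # calling an async subflow returns a coroutine; gather runs them concurrently
    return await asyncio.gather(*(child(i) for i in range(10)))

if __name__ == "__main__":
    asyncio.run(parent())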
p
@Marvin I get Error enabling deployment: This error is often thrown because Docker is not running. Please ensure Docker is running.
m
thought for 778.1 seconds
That message usually means your deployment is targeting Docker infrastructure (a Docker work pool/worker or an image build), but the Docker daemon isn’t running on the machine doing the work.

Quick checks
- Are you using a Docker work pool? If yes, Docker must be running on the worker host.
- Are you building a Docker image during deploy? If yes, Docker must be running on the machine where you run prefect deploy.
- Verify Docker is up: docker info (or start Docker Desktop; on Linux: sudo systemctl start docker).

If you want Docker
1) Ensure Docker is installed and running.
2) Make sure the Docker extra is available:
Copy code
pip install "prefect[docker]"
3) Use a Docker work pool and a running worker:
Copy code
prefect work-pool create --type docker my-docker-pool
prefect worker start --pool my-docker-pool
4) Point your deployment at that pool and (if you’re building/pulling an image) re-run once Docker is up.

If you don’t want Docker
- Create and point the deployment to a Process pool instead:
Copy code
prefect work-pool create --type process my-process-pool
- Update your deployment to use my-process-pool so no Docker is involved (a minimal sketch is at the end of this message).

A little more context never hurts:
- What OS are you on?
- What’s your work pool type and name?
- Are you building an image in prefect deploy, or just enabling in the UI?

References (such as they are):
- Prefect 3 deployments API
- Prefect 3.2 release notes
- Work pools and workers overview
- Prefect + Docker overview

Share your deployment config or the exact command you ran and I’ll point you at the precise fix instead of this thrilling diagnostic checklist.
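If you go the no-Docker route, here is a minimal sketch of pointing a deployment at the process pool (repo URL, entrypoint, and names are placeholders; adjust to your project):

from prefect import flow

if __name__ == "__main__":
    # hypothetical source/entrypoint; a process worker just needs to reach this code
    flow.from_source(
        source="https://github.com/your-org/your-repo",
        entrypoint="flows/my_flow.py:my_flow",
    ).deploy(
        name="my-deployment",
        work_pool_name="my-process-pool",  # no Docker or image involved
    )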
p
@Marvin It works! But now I have a problem. The container where the flow is launched does not have the flow code. I use

await flow.deploy(
    name=new_deployment_name,
    work_pool_name=self.work_pool_name,  # Uses my-podman-pool
    description=job.description,
    image="prefecthq/prefect:3.2.15-python3.12",  # Force Python 3.12 since 3.13 doesn't exist
    push=False,  # Don't push to registry for local development
    build=False)
m
thought for 157.7 seconds
Glad the first hurdle is out of the way. Now for the predictable sequel: you’re launching a plain Prefect image that doesn’t contain your code, and you told Prefect not to build or push an image. So…the container has no idea what your flow is. Shocking, I know.

You’ve got three solid options:

1) Fetch code at runtime (recommended for 3.x)
- Use Flow.from_source(...) so the worker can pull your code (e.g., from Git).
- Make sure the image has git and your dependencies installed.
Copy code
from prefect import Flow

flow_from_git = Flow.from_source(
    source="<https://github.com/your-org/your-repo>",
    entrypoint="path/to/file.py:your_flow",
)

await flow_from_git.deploy(
    name=new_deployment_name,
    work_pool_name=self.work_pool_name,   # my-podman-pool
    image="your-image-with-git-and-deps",
    build=False,
    push=False,
)
2) Bake your code into the image (no pulling at runtime)
- Build your own image that includes your code and deps, then point the deployment at it.
- With Podman on the same host as the worker, push=False is fine.
Copy code
# Dockerfile (works with Podman too)
FROM prefecthq/prefect:3.2.15-python3.12
WORKDIR /opt/prefect/flows
COPY . /opt/prefect/flows
# Optional: if you have deps
# COPY requirements.txt .
# RUN pip install -r requirements.txt
Build and deploy:
Copy code
podman build -t local/my-flow:dev .
# or docker build ...
await flow.deploy(
    name=new_deployment_name,
    work_pool_name=self.work_pool_name,
    image="local/my-flow:dev",
    build=False,
    push=False,
)
3) Mount your local code into the container (dev-only convenience)
- Tell the worker to bind-mount your source into the container via job_variables.
- For Podman on SELinux systems, add :Z to relabel.
Copy code
from pathlib import Path

await flow.deploy(
    name=new_deployment_name,
    work_pool_name=self.work_pool_name,
    image="prefecthq/prefect:3.2.15-python3.12",
    build=False,
    push=False,
    job_variables={
        "volumes": [f"{Path.cwd()}:/opt/prefect/flows:Z"],  # drop :Z if not needed
        "env": {"PYTHONPATH": "/opt/prefect/flows"},
    },
)
Notes:
- This only works if the worker host can see your local path (i.e., same machine).
- Your deployment’s entrypoint must be under the mounted path.

Which path do you prefer: pulling from a repo, baking an image, or mounting locally? If you share your repo layout and where your worker runs, I’ll give you the exact incantation instead of this choose-your-own-adventure.
👍 1
p
@Marvin i have to do that

FROM prefecthq/prefect:3.2.15-python3.12

WORKDIR /opt/prefect/flows

# Copy the entire huscarl project
COPY . /opt/prefect/flows

# Install huscarl and its dependencies using local_pypi as index
RUN pip install -e . --find-links ./local_pypi/

# Set Python path to include the source directory and create symlink for backward compatibility
ENV PYTHONPATH="/opt/prefect/flows/src:$PYTHONPATH"
RUN ln -sf /opt/prefect/flows/src /opt/prefect/src

in my Dockerfile. Somehow the worker is looking for the file in /opt/prefect/src. Is it ok?
m
thought for 82.4 seconds
Short answer: yes — the Docker worker defaults to running with /opt/prefect as its working directory, so it will look for your entrypoint under /opt/prefect/... . Your WORKDIR /opt/prefect/flows in the Dockerfile gets overridden at runtime. Thrilling, I know.

You’ve got three clean fixes (pick one):

A) Put your project under /opt/prefect (recommended if your entrypoint is “src/…”)
- Update your Dockerfile to match the worker’s expectations:
Copy code
FROM prefecthq/prefect:3.2.15-python3.12

WORKDIR /opt/prefect
COPY . /opt/prefect

RUN pip install -e . --find-links ./local_pypi/
ENV PYTHONPATH="/opt/prefect/src:$PYTHONPATH"
- Use an entrypoint like src/path/to/flow.py:flow, and point your deployment to this custom image:
Copy code
await flow.deploy(
  name=new_deployment_name,
  work_pool_name=self.work_pool_name,
  image="local/my-flow:dev",
  build=False,
  push=False,
)
B) Keep your code under /opt/prefect/flows, but change the container’s working dir
- Tell the worker to run in /opt/prefect/flows so the entrypoint resolves there:
Copy code
await flow.deploy(
  name=new_deployment_name,
  work_pool_name=self.work_pool_name,
  image="prefecthq/prefect:3.2.15-python3.12",
  build=False,
  push=False,
  job_variables={"working_dir": "/opt/prefect/flows"},
)
- Then make sure your deployment entrypoint is relative to /opt/prefect/flows.

C) Keep the symlink
- Your symlink from /opt/prefect/flows/src to /opt/prefect/src will work. It’s just… a bit hacky. If it keeps you moving, fine.

Important notes:
- Even if you pip install -e ., Prefect still needs to open the entrypoint file path inside the container. That’s why the file must exist under the container’s working dir (or you adjust the working dir).
- Don’t forget to point the deployment at your custom image if you’re baking code into it. Using the base prefecthq/prefect image without a volume mount means no code inside the container.

If you share your deployment entrypoint string and where that file sits in the image, I’ll give you the exact minimal change.
p
@Marvin I still don't get one point. When I change the flow, I have to stop the deployment, but also create a new image (with my workflow). Is this normal?
m
thought for 49.6 seconds
Short answer: if you bake your code into the image, yes—you need to rebuild the image when the code changes. But no—you do not need to “stop” the deployment. Just update the deployment to point at the new image (or reuse the same tag after rebuilding) and future runs will use it.

You’ve got three common workflows:

1) Bake code into the image (what you’re doing now)
- Rebuild your image whenever code changes.
- Tag immutably (e.g., git SHA) and redeploy with the new tag:
Copy code
# Shell: build an immutably tagged image and export the tag for the next step
export IMAGE_TAG=$(git rev-parse --short HEAD)
podman build -t local/huscarl:${IMAGE_TAG} .

# Python: point the deployment at the freshly built tag
import os

image_tag = os.environ["IMAGE_TAG"]  # exported above
await flow.deploy(
  name=new_deployment_name,
  work_pool_name=self.work_pool_name,
  image=f"local/huscarl:{image_tag}",
  build=False,
  push=False,
)
- You don’t need to disable/stop the deployment; updating it is enough. Existing running flow runs finish on the old image; future runs use the new one.

To speed rebuilds, structure your Dockerfile to maximize cache:
Copy code
FROM prefecthq/prefect:3.2.15-python3.12
WORKDIR /opt/prefect
# Copy only dependency files first so this layer caches
COPY pyproject.toml poetry.lock* requirements.txt* /opt/prefect/
RUN pip install -r requirements.txt  # or poetry install --no-root
# Now copy the source
COPY . /opt/prefect
RUN pip install -e . --find-links ./local_pypi/
ENV PYTHONPATH="/opt/prefect/src:$PYTHONPATH"
2) Fetch code at runtime (no rebuilds)
- Use Flow.from_source(...) pointing at your Git repo. Push code changes; the worker clones the repo each run. Your base image just needs git + deps.
Copy code
from prefect import Flow
flow = Flow.from_source(
  source="<https://github.com/your-org/huscarl>",
  entrypoint="src/huscarl/flows/my_flow.py:flow",
)

await flow.deploy(
  name=new_deployment_name,
  work_pool_name=self.work_pool_name,
  image="prefecthq/prefect:3.2.15-python3.12",  # or your base with git+deps
  build=False,
  push=False,
)
3) Bind-mount your code (dev-only)
- Mount your local source into the container so changes appear instantly; no rebuild, but only works if the worker runs on the same host.
Copy code
await flow.deploy(
  name=new_deployment_name,
  work_pool_name=self.work_pool_name,
  image="prefecthq/prefect:3.2.15-python3.12",
  build=False,
  push=False,
  job_variables={
    "volumes": ["/path/to/your/repo:/opt/prefect:Z"],  # :Z for SELinux/Podman
    "working_dir": "/opt/prefect",
    "env": {"PYTHONPATH": "/opt/prefect/src"},
  },
)
TL;DR
- Baking code into images = rebuild per change; just update the deployment, don’t stop it.
- Want to avoid rebuilds? Use from_source or a volume mount during development.

Tell me which path you prefer and I’ll tailor the exact commands for your Podman setup.
p
@Marvin I get this error sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked [SQL: INSERT INTO worker (work_pool_id, name, last_heartbeat_time, heartbeat_interval_seconds, status, id, created, updated) VALUES (:work_pool_id, :name, :last_heartbeat_time, :heartbeat_interval_seconds, :status, :id, :created, :updated) ON CONFLICT (work_pool_id, name) DO UPDATE SET last_heartbeat_time = :param_1, heartbeat_interval_seconds = :param_2, status = :param_3] [parameters: {'work_pool_id': '356e002e-c105-4006-9722-8f76a539220d', 'name': 'DockerWorker 0a3b4b57-e6b7-4b47-8573-89be27c6acd9', 'last_heartbeat_time': '2025-08-15 160715.871636', 'heartbeat_interval_seconds': 30, 'status': 'ONLINE', 'id': '327d05d8-be1e-4aa8-a6ae-e70278de4a3f', 'created': '2025-08-15 160715.871909', 'updated': '2025-08-15 160715.871912', 'param_1': '2025-08-15 160715.871636', 'param_2': 30, 'param_3': 'ONLINE'}] (Background on this error at: https://sqlalche.me/e/20/e3q8)
m
thought for 53.8 seconds
That error is from your self-hosted Prefect server’s SQLite backend getting write-locked while workers heartbeat. SQLite locks the whole database on write; with multiple workers or busy API traffic, you’ll hit this. It’s… not unusual.

Best fix (recommended)
- Switch the server to Postgres. It handles concurrent writes properly.
- Start Postgres (example with Podman):
Copy code
podman run -d --name prefect-postgres -e POSTGRES_USER=prefect -e POSTGRES_PASSWORD=prefect -e POSTGRES_DB=prefect -p 5432:5432 postgres:16
- Point Prefect Server at it and restart the server:
Copy code
export PREFECT_API_DATABASE_CONNECTION_URL="postgresql+asyncpg://prefect:prefect@localhost:5432/prefect"
prefect server start
Quick mitigations (if you must stay on SQLite)
- Ensure only one prefect server start process is running against the DB.
- Run fewer workers/concurrency for now (heartbeats are frequent writes).
- Keep the DB on a local filesystem (not NFS/remote). If you run the server in a container with a bind mount on SELinux, use :Z on the volume.
- Bump SQLite timeout to reduce “database is locked” errors (still a band-aid):
Copy code
export PREFECT_API_DATABASE_CONNECTION_URL="sqlite+aiosqlite:///$(prefect config view | grep DEFAULT_PREFECT_HOME -A1 | tail -1 | awk '{print $2}')/prefect.db?timeout=60"
prefect server start
(or set the path directly: sqlite+aiosqlite:////home/you/.prefect/prefect.db?timeout=60)

A couple of confirmations that will help me tailor the exact steps:
- Are you running Prefect Server locally with the default SQLite?
- How many workers are heartbeating to this server?
- Is the server running in a container with a bind-mounted DB path?
p
@Marvin I get {'task_id': '9523d74f-5f4f-41d0-a4be-60d4fca929d9', 'status': 'error', 'error': 'Path /root/.prefect/storage/443c982af9bb4bc0972fd1aee266bf23 does not exist.'}
m
thought for 54.7 seconds
That error means your deployment is pointing at a local, file-based storage path that only exists where you ran deploy (looks like a VS Code devcontainer), not inside the worker’s container. The file+.vscode-resource... | /root/.prefect/storage/... path is a dead giveaway: the worker can’t see that filesystem, so it can’t load your flow.

Pick one of these fixes:

1) Bake code into the image (what you were doing)
- Ensure your deployment actually uses your custom image and the entrypoint path matches where the code lives in that image (e.g., /opt/prefect/src/...).
- Rebuild the image when code changes, redeploy pointing at the new tag.

2) Fetch code at runtime (no image rebuilds)
- Use Flow.from_source(...) so the worker clones your repo at run time.
Copy code
from prefect import Flow

flow = Flow.from_source(
  source="<https://github.com/your-org/your-repo>",
  entrypoint="src/your_pkg/flows/your_flow.py:flow",
)

await flow.deploy(
  name="your-deployment",
  work_pool_name="my-podman-pool",
  image="prefecthq/prefect:3.2.15-python3.12",  # ensure git + deps are available
  build=False,
  push=False,
)
3) Dev-only: mount your code into the worker container
- If the worker runs on the same host, bind-mount your project and set working_dir so the entrypoint resolves.
Copy code
await flow.deploy(
  name="your-deployment",
  work_pool_name="my-podman-pool",
  image="prefecthq/prefect:3.2.15-python3.12",
  build=False,
  push=False,
  job_variables={
    "volumes": ["/absolute/path/to/repo:/opt/prefect:Z"],  # drop :Z if not on SELinux/Podman
    "working_dir": "/opt/prefect",
    "env": {"PYTHONPATH": "/opt/prefect/src"},
  },
)
Why this happened
- With build/push disabled and no from_source, Prefect recorded a file-based storage URL (under ~/.prefect/storage/...) on the machine that executed deploy. Your worker container can’t access that path, especially when it came from a VS Code devcontainer.

If you share:
- where your worker runs (host/container),
- your deployment’s image and entrypoint,
- and whether you want image-based vs repo-based delivery,
I’ll give you the minimal change to stop Prefect from chasing VS Code’s ghost paths.
p
@Marvin I have a prefect server with prefect server start, then I have a docker work pool with a custom image with the code baked into it (it uses podman but that is not important). Then I want to persist the result and this is where the error happens
m
thought for 106.5 seconds
You’ve got a Docker/Podman worker running your flows in an isolated container, and Prefect is trying to persist results to a local filesystem path that only exists inside that container. The UI is dutifully turning that into a cute VS Code “file+” link… which of course doesn’t exist on your machine. Hence “path does not exist.”

Two sane paths forward:

Option A: Use a shared, remote results store (recommended)
- Point Prefect at S3/GCS/Azure so results are written to object storage that both your server/UI and your code can access.
- Steps:
  - Install the extra in your image/worker env (e.g., AWS): pip install "prefect[aws]"
  - Provide cloud creds in the worker environment (AWS/GCP/Azure).
  - Configure default result storage to your bucket/prefix and enable persistence. In 3.x this is done via Prefect settings or per-task/flow config. See: Prefect 3: Results (persistence and storage)
  - Pass the relevant env via your deployment’s job_variables.env so the flow-run container sees them.
- If you want, tell me which backend (S3, GCS, Azure) and I’ll give you the exact env/config to drop into your deployment.

Option B: Stick with local filesystem results (dev-only or on single host)
- Bind-mount a host directory into the flow-run containers at the path Prefect uses so it actually exists and persists across runs.
- For your setup (root inside container), mount /root/.prefect or just the storage subdir:
Copy code
await flow.deploy(
  name="your-deployment",
  work_pool_name="my-podman-pool",
  image="your-custom-image",
  build=False,
  push=False,
  job_variables={
    # Change the left-hand path to a real directory on the worker host
    "volumes": ["/srv/prefect-results:/root/.prefect:Z"],  # :Z if SELinux/Podman
    "env": {"PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true"},
  },
)
- Make sure the entrypoint still points to the code baked into the image. This only fixes the results location.
- Caveat: the UI’s clickable path still won’t be readable from your laptop; the files live on the worker host. This is why remote object storage is nicer.

Why this happens
- With persist_result=True (globally or on tasks), and no explicit remote storage, Prefect defaults to a local path like /root/.prefect/storage/... inside the container. The directory may not exist, and even if it does, nothing outside that container can read it. The VS Code URL in the error is just the UI trying to be helpful.

Tell me:
- Which backend you prefer for result storage (S3/GCS/Azure/local), and
- Whether you read results only via the Prefect UI/API or also directly from files.
I’ll give you the exact config (env vars and deployment job_variables) for your choice.
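If you prefer configuring this in code rather than env vars, here is a minimal sketch using a saved filesystem block (the block name and basepath are placeholders; the basepath must be visible both where the flow runs and where the server reads it):

from prefect import flow, task
from prefect.filesystems import LocalFileSystem

# One-time setup: save a block pointing at the shared results path
# LocalFileSystem(basepath="/srv/prefect-results").save("dev-results")

@task(persist_result=True)
def compute() -> int:
    return 42

@flow(persist_result=True, result_storage="local-file-system/dev-results")
def pipeline() -> int:
    return compute()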
p
@Marvin I would like both for dev and for production (GCP)
m
thought for 72.3 seconds
Got it — you want local/filesystem-backed results for dev and GCS-backed results for prod. Sensible. Here’s a clean split that avoids your “path does not exist” problem.

Dev (local filesystem, Podman)
- Goal: results persist on your host so you can inspect them; containers can write to them.
- Do this:
Copy code
# Pick a host dir for results and Prefect home
mkdir -p /srv/prefect-home

# In your deployment call:
await flow.deploy(
  name="huscarl-dev",
  work_pool_name="my-podman-pool",
  image="your-custom-image",
  build=False,
  push=False,
  job_variables={
    "volumes": [
      "/srv/prefect-home:/opt/prefect-home:Z"  # :Z for SELinux/Podman
    ],
    "env": {
      "PREFECT_HOME": "/opt/prefect-home",
      "PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true",
      # optional: steer results into a predictable subdir
      "PREFECT_RESULTS_DEFAULT_STORAGE": "/opt/prefect-home/storage",
    },
  },
)
Notes:
- Mounting PREFECT_HOME ensures Prefect’s storage (including result storage) lives on the host path.
- The UI may link to filesystem paths; they won’t be clickable from your laptop, but the files will exist on the host.
- You can also set persist_result=True on your flow/tasks:
Copy code
from prefect import flow, task

@task(persist_result=True)
def compute():
    return 42

@flow(persist_result=True)
def pipeline():
    return compute()
Prod (GCS-backed results)
- Goal: results land in GCS; no host coupling; works across nodes.
- Image/worker prerequisites:
  - Install the GCS extras:
Copy code
pip install "prefect[gcp]"
  - Provide GCP credentials (ADC). Either:
    - Workload Identity (best in GKE), or
    - Mount a service account JSON and set GOOGLE_APPLICATION_CREDENTIALS to its path.
- Configure default result storage to a GCS URL and enable persistence:
Copy code
await flow.deploy(
  name="huscarl-prod",
  work_pool_name="my-podman-pool",
  image="your-prod-image",  # same code-baked image is fine
  build=False,
  push=False,
  job_variables={
    "env": {
      "PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true",
      "PREFECT_RESULTS_DEFAULT_STORAGE": "<gcs://my-prefect-results/huscarl/>",
      # if using a mounted SA key:
      # "GOOGLE_APPLICATION_CREDENTIALS": "/var/secrets/gcp/sa.json",
    },
    # If using a mounted SA key file:
    # "volumes": ["/secure/path/sa.json:/var/secrets/gcp/sa.json:ro,Z"],
  },
)
- The UI will show GCS URLs for persisted results; you can fetch them with gcloud/gsutil or programmatically.

Why this works
- Prefect 3 persists results when enabled and writes them to a “result storage.” If you don’t set one, it defaults to a local path inside the container (e.g., /root/.prefect/storage/...), which other machines can’t see. Setting PREFECT_RESULTS_DEFAULT_STORAGE to a real path (local mount in dev, GCS URL in prod) makes the paths valid where the code runs.

A couple of sanity checks
- Ensure your flow/tasks actually request persistence (persist_result=True) or set PREFECT_RESULTS_PERSIST_BY_DEFAULT=true as shown.
- For Podman on SELinux, keep :Z on volume mounts.
- Your custom image should include any libs needed to serialize your results.

Docs, for when you’re feeling adventurous
- Prefect 3: Results (persistence and storage)

If you share your bucket name and how you supply GCP creds (WI vs key file), I’ll give you the exact job_variables snippet for your environment.
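For the prod side, an alternative to the env-var route is attaching a GCS block as result storage directly on the flow. A minimal sketch (bucket and block names are placeholders; requires prefect-gcp in the image and GCP creds as above):

from prefect import flow
from prefect_gcp.cloud_storage import GcsBucket

# One-time setup: save a block pointing at your results bucket/prefix
# GcsBucket(bucket="my-prefect-results", bucket_folder="huscarl").save("prod-results")

@flow(persist_result=True, result_storage="gcs-bucket/prod-results")
def pipeline():
    ...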
p
@Marvin for local it does not work. I still get {'task_id': '8dc6685b-97f9-48fa-962b-cf1f2b280548', 'status': 'error', 'error': 'Path /opt/prefect-home/storage/78c71943b3c04bdf9bab3f36350338d2 does not exist.'}
m
thought for 71.4 seconds
You’re writing results to a local filesystem path inside the flow-run container, but the Prefect server is trying to read them from its own filesystem. Those paths must be the same and actually exist in both places. Right now they don’t, hence “does not exist.”

Fix: use a shared host path for PREFECT_HOME and make both the server and the flow-run container use it.

Do this for local dev:

1) Pick a host directory and start the server with it
Copy code
mkdir -p /srv/prefect-home
export PREFECT_HOME=/srv/prefect-home
prefect server start
(If your server is already running, restart it with PREFECT_HOME set.)

2) Mount the same host path into your flow-run container at the same absolute path, and set PREFECT_HOME in the container
Copy code
await flow.deploy(
  name="huscarl-dev",
  work_pool_name="my-podman-pool",
  image="your-custom-image",
  build=False,
  push=False,
  job_variables={
    "volumes": ["/srv/prefect-home:/srv/prefect-home:Z"],  # :Z for Podman/SELinux
    "env": {
      "PREFECT_HOME": "/srv/prefect-home",
      "PREFECT_RESULTS_PERSIST_BY_DEFAULT": "true",
    },
  },
)
Why this works
- Prefect writes local results under ${PREFECT_HOME}/storage/....
- If the server and the flow-run container don’t share that path, the server can’t read the files and you get the “does not exist” error.
- By using the same absolute path on both sides, the files are visible to both.

Alternative you might actually prefer
- Use GCS for dev as well. Then there’s no shared filesystem to keep in sync. Just set, in your deployment job_variables.env:
  - PREFECT_RESULTS_PERSIST_BY_DEFAULT=true
  - PREFECT_RESULTS_DEFAULT_STORAGE=gcs://your-bucket/dev/
  - plus GCP creds (ADC via Workload Identity or GOOGLE_APPLICATION_CREDENTIALS to a mounted key)
- The server will fetch results from GCS directly.

Docs for reference: Prefect 3: Results

If you share where your server is running (host path to its PREFECT_HOME) I’ll give you the exact volume/env snippet to match it.
p
@Marvin ok now I get that {'task_id': 'cb32198c-0bef-4799-97a6-e2e911a1cde6', 'status': 'error', 'error': "No module named 'prefect_fake'"} but the job is working:

Worker 'DockerWorker b46a97af-bba3-475a-ad2b-0658f2139fe1' submitting flow run 'cb32198c-0bef-4799-97a6-e2e911a1cde6' 074247 PM prefect.flow_runs.worker
Completed submission of flow run 'cb32198c-0bef-4799-97a6-e2e911a1cde6' 074247 PM prefect.flow_runs.worker
Opening process... 074249 PM prefect.flow_runs.runner
Downloading flow code from storage at '.' 074250 PM prefect.flow_runs
Beginning flow run 'idealistic-albatross' for flow 'fake-workflow' 074251 PM prefect.flow_runs
Finished in state Cached(type=COMPLETED) 074251 PM add-node1 prefect.task_runs
Finished in state Cached(type=COMPLETED) 074251 PM add-node2 prefect.task_runs
Finished in state Completed() 074251 PM prefect.flow_runs
Process for flow run 'idealistic-albatross' exited cleanly. 074251 PM prefect.flow_runs.runner

I don't get it???
m
thought for 43.9 seconds
You didn’t break physics. Your flow ran fine. The “No module named 'prefect_fake'” error is from the UI/server trying to read (deserialize) your persisted result, not from executing the flow.

Why this happens
- Prefect persisted your task/flow result using Python pickling because the value wasn’t JSON-friendly.
- When the UI tries to display the value, the server process needs to import the module(s) referenced in that object to unpickle it.
- Your server environment doesn’t have that module (prefect_fake), so previewing the result fails. The run itself already completed.

You’ve got two good fixes:

A) Store UI-friendly results (recommended)
- Return only JSON-serializable values (dict/list/str/int/float/bool/None) or bytes/text.
- If you have custom objects, convert them to dicts first (e.g., pydantic model_dump(), or your own .to_dict()).
Example:
Copy code
from prefect import flow, task

@task(persist_result=True)
def compute():
    # Convert custom types to plain JSON-friendly data first
    return {"status": "ok", "value": 42}

@flow(persist_result=True)
def fake_workflow():
    return compute()
- Result previews in the UI will work without your package installed on the server.

B) Make the server able to unpickle your custom objects
- Install the same package/code in the environment that runs prefect server start.
- If you run the server on your host:
Copy code
pip install -e /path/to/your/repo  # where 'prefect_fake' lives
- If you run the server in a container, bake your package into that image or mount it and set PYTHONPATH accordingly.
- This is workable for dev, but brittle for prod. Prefer A.

Notes
- Using GCS for prod doesn’t change this behavior: the server still needs to deserialize to preview. So either store JSON-friendly values or install your package in the server environment.
- If you don’t need previews, you can persist raw artifacts separately or skip result persistence for those tasks.

Quick questions to tailor the exact steps:
- Is prefect_fake your package/module name?
- How are you running the Prefect server (host venv vs container)?
- What do your tasks return that needs pickling? If you share a snippet, I’ll show a minimal JSON-friendly conversion.
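In the meantime, a minimal sketch of option A with a pydantic model (the Report class is hypothetical; the point is persisting a plain dict the server can read without your package):

from prefect import flow, task
from pydantic import BaseModel

class Report(BaseModel):  # hypothetical custom type from your package
    status: str
    value: int

@task(persist_result=True)
def compute() -> dict:
    report = Report(status="ok", value=42)
    # Persist a plain dict instead of the custom object
    return report.model_dump()

@flow(persist_result=True)
def fake_workflow() -> dict:
    return compute()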
p
@Marvin ok it works! But why is the pending state so long? Is the loading of the image and launch of the container counted in the pending time?
m
thought for 57.6 seconds
Short answer: yes. In Prefect 3, a run stays in Pending until the Python process inside your flow container actually starts. That Pending time includes the worker reserving the run and all infrastructure bring‑up: pulling the image, creating the container, mounting volumes, injecting env vars, etc. Only once the container is up and your flow code begins does the state switch to Running.

Why your Pending might be long
- Image pulls: first run on a node often pulls the image. Slow registry = long Pending.
- Image not present for the worker user: with Podman, root vs rootless have separate image stores. If you built as root but the worker runs rootless (or vice versa), it can’t see the image and may try to pull it.
- Large images or runtime installs: big base images or installing deps during container start add seconds/minutes.
- Worker availability: if the worker is busy or has low concurrency, the run can sit Pending until it gets picked up.
- Network/volume overhead: slow/big volume mounts or network hiccups delay container creation.

How to speed it up
- Pre-pull or pre-build on the worker host:
  - Podman: podman build -t your/image:tag . (as the same user that runs the worker) or podman pull your/image:tag
- Use the same user/engine for build and run (avoid root/rootless mismatch). If that’s not possible, push to a registry and pull.
- Keep images lean; install everything at build time, not on container start.
- Ensure the image tag in your deployment matches exactly what’s on the worker host.
- Start more worker concurrency if you’re queueing.
- Turn up worker logs to see where time goes:
Copy code
PREFECT_LOGGING_LEVEL=DEBUG prefect worker start --pool my-podman-pool
You’ll see lines like “Pulling image…” or “Creating container…”, which correlate directly with the long Pending.

If you tell me:
- the exact image tag in your deployment,
- whether the worker is rootless Podman or rootful,
- and whether Pending is slow only on first run or every run,
I’ll point at the precise bottleneck (usually “image not present for that user” or a slow pull).
p
@Marvin ok the polling time is very slow, that's why this is taking so long
m
thought for 36.0 seconds
Yep — Pending includes “waiting for the next worker poll” plus the whole container bring-up. If your worker polls slowly, you’ll feel it.

What you can do:
- Lower the worker polling interval so new runs get claimed faster:
Copy code
# start your worker with a shorter poll interval (seconds)
PREFECT_WORKER_POLLING_INTERVAL=2 PREFECT_LOGGING_LEVEL=DEBUG prefect worker start --pool my-podman-pool
- Or set it in your Prefect profile/env permanently (same env var).
- Keep it reasonable (2–5s). Lower = more API chatter.

And remember: even after claim, Pending still covers image pulls and container creation. Pre-pull your image on the worker host (as the same user the worker runs under) to avoid that:
Copy code
podman pull your/image:tag
# or build locally with that user so it’s already present
podman build -t your/image:tag .
If you want, share how you’re launching the worker (command/env) and I’ll give you the exact incantation for your setup.