@Marvin I want to deploy my prefect code on a...
# ask-marvin
c
@Marvin I want to deploy my prefect code on a production host, what are the available options to achieve this?
m
thought for 177.6 seconds
Great question! Assuming you're on Prefect 3.x, there are a few production-ready ways to deploy and run your flows on a host. At a high level, you will:
- Create a work pool (choosing the infrastructure type you want)
- Start a worker on your production host that points at that work pool
- Deploy your flow(s) either with code or via the CLI, pointing them at the pool

Here are the main options and how they fit different environments.

Available worker types (choose based on your infra)
- Process worker (no containers): the easiest way to run flows directly as subprocesses on a VM/bare-metal host. Good for simple servers, cron replacement, or air-gapped environments.
- Docker worker (containers): each flow run executes in a container. Great for dependency isolation and reproducibility on hosts with Docker.
- Kubernetes job worker (cluster): each flow run is a Kubernetes Job. Ideal if your "production host" really means a K8s cluster.

Note: In Prefect 3.x, workers are provided by Prefect and/or plugins (e.g., docker, kubernetes). You'll create a work pool of the correct type, then run a worker for that pool on your host. See the workers concept doc: Work pools and workers and our overview blog: Work pools and workers: deploy Python securely.

Two ways to deploy your flows

1) Code-first (recommended for CI/CD)
- Define your flow and deploy from source (git URL, path, etc.) using `Flow.from_source(...).deploy(...)`.
- Example:
Copy code
from prefect import flow
from prefect.flows import Flow

@flow
def my_flow():
    print("Hello from prod")

if __name__ == "__main__":
    Flow.from_source(
        source="https://github.com/your-org/your-repo",   # or a local path
        entrypoint="flows/my_flow.py:my_flow",
    ).deploy(
        name="my-flow-prod",
        work_pool_name="prod-process-pool",
        cron="0 * * * *",         # or interval=3600, etc.
        tags=["prod"],
    )
Notes:
- `source` can be a git URL, local path, or other supported storage type. The `entrypoint` is `path/to/file.py:flow_func_name`.
- To containerize runs (Docker/K8s), specify `image=...` and optionally `build=True, push=True` in `.deploy(...)`.

2) CLI-first
- Use the `prefect deploy` command to register a deployment from an entrypoint on disk or from a checked-out repo.
Copy code
prefect deploy flows/my_flow.py:my_flow -n my-flow-prod -p prod-process-pool --cron "0 * * * *"
Tip: You can add parameters, tags, and schedules via flags. (If you'd like, I can tailor an exact command for your setup.)

Setting up your production host

1) Create a work pool for your chosen worker type
- Process example:
Copy code
prefect work-pool create --type process prod-process-pool
- Docker or K8s pools require the corresponding plugin/extras to be installed on the host that will run the worker.

2) Start a worker that points to the pool
- Process:
Copy code
prefect worker start --pool prod-process-pool
- Docker or K8s:
  - Ensure Docker/K8s and the Prefect integration are installed on the host
  - Start the worker similarly, pointing to the correct pool

3) Run the worker as a service (systemd example)
- Create a unit file like `/etc/systemd/system/prefect-worker.service`:
Copy code
[Unit]
Description=Prefect Worker - prod
After=network.target

[Service]
Type=simple
User=prefect
WorkingDirectory=/opt/prefect
Environment="PREFECT_API_URL=<YOUR_API_URL>"
# Or PREFECT_API_KEY for Prefect Cloud:
# Environment="PREFECT_API_KEY=***"
ExecStart=/usr/bin/bash -lc 'source /opt/venvs/prefect/bin/activate && prefect worker start --pool prod-process-pool'
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target
Then:
```
sudo systemctl daemon-reload
sudo systemctl enable prefect-worker
sudo systemctl start prefect-worker
sudo systemctl status prefect-worker
```
Best practices:
- Use a dedicated virtualenv (or container) for consistent dependencies
- Send logs to journald/syslog (the default) or to a file if desired
- Configure health checks/alerts for the process and for Prefect flow run states

Source and packaging options
- Git-based source (common): `Flow.from_source("https://github.com/...")` with an entrypoint.
- Local path: good for on-host code in a controlled deployment.
- Container images: for Docker/K8s, specify `image=` in `deploy(...)`. You can let Prefect build/push or bring your own CI-built image.

Important note about Prefect 2.x vs 3.x
- Prefect 3.x removes:
  - `Deployment.build_from_flow()` (do not use in 3.x)
  - the `prefect deployment build` CLI (replaced by `prefect deploy`)
  - GitHub storage blocks (use `.from_source("https://github.com/...")` instead)
- If you're still on 2.x, consider upgrading. If not, you can still use workers in 2.x, but the 3.x UX is streamlined.

Questions to tailor a concrete setup for you
- What OS and environment is your "production host" (VM, bare metal, Docker host, or K8s cluster)?
- Are you aiming for containerized execution (Docker/K8s) or simple subprocess execution on the host?
- Will you connect this host to Prefect Cloud or a self-hosted Prefect server?
- Do you need schedules, parameters, or concurrency limits for these deployments?

If you share those details, I'll give you the exact commands, a systemd file aligned to your path/venv, and a minimal example to deploy your flows.
c
@Marvin for deploying this on a production box, how do I decide between using the static infrastructure and the dynamic infrastructure with a docker based work pool?
m
thought for 63.7 seconds
Great question. In Prefect 3.x, the choice boils down to:
- Static infrastructure: run a Process worker on your host; each flow run is a subprocess in the host's Python environment.
- Dynamic infrastructure: run a Docker worker (Docker-based work pool); each flow run is an isolated, ephemeral container.

Here's how to decide.

When to choose static (Process worker on the box)
- You want simplicity with minimal moving parts
- Few flows and consistent dependencies; low risk of conflicts
- You don't have (or don't want) access to a container registry
- Very latency-sensitive runs where container startup/pull time matters
- Flows need direct access to the host filesystem or devices without extra config
- Air-gapped or locked-down servers where Docker is not allowed

Tradeoffs:
- Shared environment for all flows (risk of dependency conflicts, environment drift)
- Harder to guarantee reproducibility and rollback
- Scaling horizontally is manual (you add more hosts and more process workers)

When to choose dynamic (Docker worker with a Docker work pool)
- Multiple flows/teams with different dependencies; need clean isolation per run
- Reproducibility is important (pin image tags; roll back by changing the tag)
- You want to scale horizontally (add more Docker hosts/workers easily)
- You already build/publish images in CI/CD or are comfortable doing so
- Stronger multi-tenant isolation and the ability to set per-run resource limits
- Easier transition to Kubernetes later with a similar model

Tradeoffs:
- Requires Docker on the host and a registry (network egress, auth)
- Cold starts/image pulls add overhead (mitigate with pre-pulled images, slim layers)
- Access to host files needs explicit volume mounts; ephemeral containers don't keep state

Operational considerations
- Secrets and config: Prefer Prefect variables/blocks and/or environment variables. Avoid baking secrets into images.
- Files/data: Prefer remote object stores (S3/GCS/Azure) over relying on host paths. If you must use local files with the Docker worker, configure volume mounts in job variables or the work pool's base job template.
- Resource limits: With Docker, set CPU/mem limits per run; with Process, consider OS-level controls and Prefect concurrency limits.
- Reliability: Run the worker as a service (e.g., systemd). For Docker, also ensure images are pre-pulled and tagged immutably.

Minimal examples
- Static (Process worker)
  - Deploy:
Copy code
from prefect import flow
from prefect.flows import Flow

@flow
def my_flow():
    ...

Flow.from_source(
    source="https://github.com/your-org/your-repo",
    entrypoint="flows/my_flow.py:my_flow",
).deploy(
    name="my-flow-prod",
    work_pool_name="prod-process-pool",
    cron="0 * * * *",
)
  - Start a Process worker on the host (as a service is recommended).
- Dynamic (Docker worker)
  - Deploy with containerized runs:
Copy code
from prefect import flow
from prefect.flows import Flow

@flow
def my_flow():
    ...

Flow.from_source(
    source="https://github.com/your-org/your-repo",
    entrypoint="flows/my_flow.py:my_flow",
).deploy(
    name="my-flow-prod",
    work_pool_name="prod-docker-pool",
    image="ghcr.io/your-org/your-image:prod",  # or build/push via Prefect
    build=True,   # optional if you want Prefect to build
    push=True,    # optional if you want Prefect to push
    cron="0 * * * *",
)
  - Start a Docker worker on the host (ensure the host can pull the image tag).

Helpful references
- Workers overview and concepts: Work pools and workers
- Background/context: Work pools and workers: deploy Python securely
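If your flows need host paths or extra environment variables inside those containers, you can attach them as job variables at deploy time. A minimal sketch, assuming the `prod-docker-pool` above; the `/srv/data` mount and `APP_ENV` variable are illustrative, and the accepted keys ultimately come from your Docker work pool's base job template:
```
from prefect.flows import Flow

Flow.from_source(
    source="https://github.com/your-org/your-repo",  # placeholder repo
    entrypoint="flows/my_flow.py:my_flow",
).deploy(
    name="my-flow-prod",
    work_pool_name="prod-docker-pool",
    image="ghcr.io/your-org/your-image:prod",
    job_variables={
        "volumes": ["/srv/data:/srv/data"],  # illustrative bind mount
        "env": {"APP_ENV": "prod"},          # extra container env vars
    },
)
```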
If you share a bit more about your production box (OS, Docker allowed or not, need for local file access, number of flows and their dependency profiles, registry access), I can recommend one path and give you exact setup steps (including a systemd unit and deployment command tailored to you).
c
@Marvin it is a RHEL9 linux box, docker (in our case podman) is allowed, we will need to access local filepaths, there will be around 50-100 flows all using the same conda environment, and we have jfrog artifactory for registry access
m
thought for 90.4 seconds
Thanks. With RHEL9, Podman allowed, local filesystem access required, ~50–100 flows sharing one Conda env, and JFrog as your registry, here's how I'd decide.

Recommendation
- Start with static infrastructure (Process worker) unless you specifically need container-level isolation or per-run resource limits. Why:
  - All flows share the same Conda env, so there is minimal risk of dependency drift/conflicts.
  - Direct local filesystem access is trivial with a Process worker (no mounts or SELinux labels).
  - Lowest operational overhead (no image builds/pulls, Podman socket config, or registry auth in the worker context).
- Move to a Docker-based (Podman-backed) work pool if you later need stricter reproducibility, per-run isolation/limits, or a clean path to scale across hosts with controlled images.

When to prefer containerized (Docker work pool via Podman)
- You want reproducible, pinned images per release; roll back by changing the image tag.
- You anticipate diverging dependencies over time or multi-team isolation.
- You want per-run CPU/memory limits enforced by the container runtime.
- You plan to run multiple workers across hosts and scale horizontally.

Key tradeoffs for Podman
- Setup is slightly more involved: Podman's Docker-compatible REST API must be available to the Prefect Docker worker.
- Local filesystem access requires volume mounts (on RHEL/SELinux you'll likely need :Z or :z).
- You'll manage image builds and pushes to JFrog.

Concrete setup paths

Option A: Static (Process worker), simple and robust
1) Create a process work pool:
Copy code
prefect work-pool create --type process prod-process-pool
2) Start a worker on the box (as a service is best). Example systemd unit using your Conda env:
Copy code
[Unit]
Description=Prefect Process Worker (prod)
After=network.target

[Service]
Type=simple
User=prefect
WorkingDirectory=/opt/prefect
Environment="PREFECT_API_URL=<YOUR_API_URL>"    # or PREFECT_API_KEY for Cloud
ExecStart=/usr/bin/bash -lc 'source /opt/conda/etc/profile.d/conda.sh && conda activate myenv && prefect worker start --pool prod-process-pool'
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target
Then:
Copy code
sudo systemctl daemon-reload
sudo systemctl enable prefect-worker
sudo systemctl start prefect-worker
3) Deploy your flows pointing at this pool:
Copy code
from prefect import flow
from prefect.flows import Flow

@flow
def my_flow():
    ...

Flow.from_source(
    source="<https://github.com/your-org/your-repo>",
    entrypoint="flows/my_flow.py:my_flow",
).deploy(
    name="my-flow-prod",
    work_pool_name="prod-process-pool",
    cron="0 * * * *",
    tags=["prod"],
)
Pros:
- Direct local path access
- No container/image overhead
- A single shared Conda env matches your use case

Option B: Dynamic (Docker work pool using Podman)
Use this if you need isolation or resource limits.
1) Enable Podman's Docker-compatible API
- Rootless (recommended; the service runs under the same user as the worker):
Copy code
systemctl --user enable --now podman.socket
# socket is typically at: unix:///run/user/$UID/podman/podman.sock
- Rootful:
Copy code
sudo systemctl enable --now podman.socket
# socket typically at: unix:///run/podman/podman.sock
2) Ensure the worker can talk to Podman by setting DOCKER_HOST
- For a user service (rootless), in your systemd unit:
Copy code
Environment="DOCKER_HOST=unix:///run/user/%i/podman/podman.sock"
Replace %i with the actual UID or set a fixed path. Alternatively, set it in the ExecStart shell.
3) Log in to your JFrog registry (do this as the same user that runs the worker):
Copy code
podman login <your-jfrog-registry>
4) Create a Docker work pool:
Copy code
prefect work-pool create --type docker prod-docker-pool
5) Configure volume mounts for local paths
- Add mounts at the pool's base job template or per-deployment via job variables. On RHEL with SELinux, use :Z (private label) or :z (shared). Example job variable you can add at deploy time:
```
job_variables={
    "volumes": ["/mnt/data:/mnt/data:Z"]  # adjust paths as needed
}
```
6) Build/push a single shared image with your Conda env (in CI or locally), push it to JFrog, then reference it in your deployments:
```
from prefect import flow
from prefect.flows import Flow

@flow
def my_flow():
    ...

Flow.from_source(
    source="https://github.com/your-org/your-repo",
    entrypoint="flows/my_flow.py:my_flow",
).deploy(
    name="my-flow-prod",
    work_pool_name="prod-docker-pool",
    image="<jfrog-registry>/your-team/your-image:prod",  # prebuilt in CI
    # or let Prefect build/push for you: build=True, push=True
    cron="0 * * * *",
    tags=["prod"],
    # job_variables={"volumes": ["/mnt/data:/mnt/data:Z"]},
)
```
7) Start the Docker worker as a service under the same user that owns the Podman socket:
```
[Unit]
Description=Prefect Docker Worker (prod via Podman)
After=network.target

[Service]
Type=simple
User=prefect
WorkingDirectory=/opt/prefect
Environment="PREFECT_API_URL=<YOUR_API_URL>"
# example UID below
Environment="DOCKER_HOST=unix:///run/user/1001/podman/podman.sock"
ExecStart=/usr/bin/bash -lc 'source /opt/venvs/prefect/bin/activate && prefect worker start --pool prod-docker-pool'
Restart=always
RestartSec=5s

[Install]
WantedBy=multi-user.target
```
Notes and best practices
- SELinux: With Podman on RHEL, remember :Z or :z on volume mounts; verify host path labels with `ls -Z`.
- Permissions: Ensure the user running the worker can read/write the mounted host paths.
- Registry access: `podman login` for the worker's user; use immutable image tags for reproducibility.
- Concurrency: With 50–100 flows, consider setting work pool/deployment concurrency limits and, if needed, run more than one worker process on the host(s).
- Migration flexibility: You can start with the Process worker and later re-deploy the same flows to a Docker pool by adding an image and volume mounts to your deployments.

Docs
- Concepts overview: Work pools and workers
- Deploying from source: Deploy from source
- Schedules, parameters, and concurrency on deployments: Deployments overview

If you'd like, tell me:
- The exact local directories your flows need, and
- Whether you prefer rootless or rootful Podman
…and I'll produce a ready-to-drop-in systemd unit and deployment snippets for either path.
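If you do go the Podman route, it can save time to confirm the worker host can reach the Docker-compatible API before starting a worker. A minimal sketch using the docker Python SDK (the same client family the Docker worker uses); the rootless socket path is an assumption, so adjust it for your setup:
```
# check_podman_socket.py - sanity-check the Docker-compatible API
import os

import docker  # pip install docker

# Assumed rootless socket; rootful is typically unix:///run/podman/podman.sock
socket_url = os.environ.get("DOCKER_HOST", "unix:///run/user/1001/podman/podman.sock")

client = docker.DockerClient(base_url=socket_url)
print("API reachable:", client.ping())  # True if the socket answers
```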
c
@Marvin if I have 100 flows, will the static infrastructure be able to handle all these processes gracefully?
m
thought for 130.4 seconds
Short answer: yes, static (Process) infrastructure can handle 100 flows gracefully as long as you control concurrency and size the box appropriately. You don't need 100 processes running at once; Prefect will queue runs. The key is to cap how many run concurrently and spread their start times.

What to configure
- Work pool concurrency: Set a cap at the pool level so only N flow runs can execute at once on that host. You can set this in the UI or via the CLI (see the work pool CLI).
- Per-deployment concurrency: Limit heavy flows to 1–2 concurrent runs so they don't saturate the machine.
Copy code
Flow.from_source(...).deploy(
    name="my-flow-prod",
    work_pool_name="prod-process-pool",
    concurrency_limit=2,
    # schedule, tags, etc.
)
- Global (tag-based) limits: Create global limits by tag to control shared resources across many flows (e.g., anything that touches a specific directory or external system); see the sketch after this list. Docs: Global concurrency limits.
- Multiple workers (optional): You can run more than one process worker on the same host (or multiple hosts) pointing at the same pool to increase throughput, while still respecting your pool/global limits.

Operational tips
- Stagger schedules: If many flows start at the top of the hour, offset them by a few minutes to avoid spikes.
- Right-size concurrency: Choose a pool limit that matches CPU, RAM, and I/O headroom. Each flow run is a separate Python process; budget memory accordingly.
- Monitor and adjust: Start conservatively, observe CPU/mem/I/O and flow durations, then raise limits gradually.
- OS limits: Ensure ulimit/open files and process limits are reasonable for your expected concurrency.
- Isolation needs: If you later need per-run CPU/memory limits or stricter reproducibility, you can switch to a Docker (Podman) pool and keep your deployment definitions.

If you share:
- CPU cores and RAM on the box
- Whether most flows are CPU-bound or I/O-bound
- Any particularly heavy flows we should isolate
…I'll propose concrete concurrency numbers (pool + per-deployment) and a schedule staggering plan.
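For shared resources that many flows touch (a directory, a database, an external API), Prefect also exposes concurrency as a context manager you can call inside flow or task code. A minimal sketch, assuming a global concurrency limit named "shared-directory" has already been created; the limit name is illustrative:
```
from prefect import flow
from prefect.concurrency.sync import concurrency


@flow
def touch_shared_directory():
    # Blocks until a slot is free on the "shared-directory" limit,
    # so at most N runs enter this section at once.
    with concurrency("shared-directory", occupy=1):
        ...  # read/write the contended resource here
```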
c
@Marvin if using the static infrastructure, how do I control which flows are deployed? Do I need to manually start each one or can I use a config file to start many at a time?
m
thought for 646.3 seconds
Short answer: you don't "start" each flow manually. In Prefect 3.x you define deployments in a single prefect.yaml and deploy them all (or a filtered subset) with one command. Once deployed, schedules/triggers will create runs automatically; your static worker just needs to be running to pick them up.

How to control which flows are deployed
- Define one or many deployments in prefect.yaml (each with its own entrypoint, name, schedule/trigger, parameters, etc.).
- From your project root, run:
  - Deploy everything in the file:
Copy code
prefect deploy
- Deploy a subset by name pattern (globs supported):
Copy code
prefect deploy --name "etl-*"
- Deploy one deployment by exact name:
Copy code
prefect deploy --name "reports/daily"
Example prefect.yaml with multiple deployments and cron schedules
Copy code
prefect-version: "3.0"

deployments:
  - name: etl/hourly
    entrypoint: flows/etl.py:etl
    work_pool:
      name: process-pool
    parameters:
      env: dev
    schedules:
      - cron: "0 * * * *"
        timezone: "UTC"

  - name: reports/daily
    entrypoint: flows/reports.py:build_reports
    work_pool:
      name: process-pool
    parameters:
      env: prod
    schedules:
      - cron: "0 6 * * *"
        timezone: "America/New_York"
Operational notes for "static infrastructure"
- You still need a worker running against the target work pool so scheduled runs will execute. For a static, on-host process runner:
Copy code
prefect worker start -p process-pool
- You generally don't need to "start" each flow. After you deploy from prefect.yaml, the Prefect scheduler will enqueue runs according to your schedules/triggers and the worker will pick them up.
- If you want to kick one off immediately:
Copy code
prefect deployment run <FLOW_NAME>/<DEPLOYMENT_NAME>
Example:
Copy code
prefect deployment run etl/hourly
Docs and references
- Prefect YAML (multiple deployments, structure): prefect.yaml guide
- Creating deployments: Create deployments
- Schedules: Create schedules
- Deployment triggers/automations (optional, event-based): Creating deployment triggers

Given your environment
- For true "static" local access to filesystem paths on your RHEL9 host, use the Process work pool; flows will run on the same machine as the worker and can read/write local paths directly.
- If you choose containers (Podman), you can still use prefect.yaml to deploy many at once, but you'll need to mount host directories into the container for local path access.

Would you like an example prefect.yaml tailored to your setup (Process vs Podman, shared conda env, JFrog image names, etc.)?
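If you'd rather trigger runs from Python (e.g., from another script or a CI job) instead of the CLI, there's a programmatic equivalent. A minimal sketch using the deployment name from the example above:
```
from prefect.deployments import run_deployment

# Creates a flow run for the deployment; timeout=0 returns immediately
# instead of blocking until the run finishes.
flow_run = run_deployment(name="etl/hourly", timeout=0)
print(flow_run.id)
```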
c
@Marvin How would this deployment work in the dynamic infrastructure with Docker?
m
thought for 99.7 seconds
Great question. In Prefect 3.x, "dynamic infrastructure with Docker" means your deployments run in short-lived containers launched by a Docker worker. You still define and deploy flows via prefect.yaml; the main differences vs. static/process are:
- The deployment targets a Docker work pool
- The worker launches a container per flow run using the image and settings you specify

Here's how to set it up.

1) Create and run a Docker work pool and worker
- Create the pool:
Copy code
prefect work-pool create docker-pool --type docker
- Start a worker on a machine that has Docker or a Docker-compatible runtime (Podman) and can pull your images:
Copy code
prefect worker start -p docker-pool
Note:
- Install the integrations on the worker host: `pip install "prefect[docker]"`
- For Podman on RHEL9, either install podman-docker (Docker-compatible socket) or set `DOCKER_HOST` to your Podman socket (e.g., `unix:///run/user/$UID/podman/podman.sock`). Make sure you're logged in to JFrog (`docker login` or `podman login`) so the worker can pull images.

2) Define deployments in prefect.yaml
- You can deploy many flows at once; just switch the `work_pool` to your Docker pool and provide Docker job variables. For example:
Copy code
prefect-version: "3.0"

deployments:
  - name: etl/hourly
    entrypoint: flows/etl.py:etl
    work_pool:
      name: docker-pool
      job_variables:
        image: artifactory.myco.com/prefect/flows:2025-09-30
        env:
          PREFECT_LOGGING_LEVEL: INFO
          APP_ENV: dev
        # Bind-mount host paths needed by the flow (important if flows read local files)
        volumes:
          - /srv/data:/srv/data:Z
        # Optional network
        network: analytics-net
    schedules:
      - cron: "0 * * * *"
        timezone: "UTC"

  - name: reports/daily
    entrypoint: flows/reports.py:build_reports
    work_pool:
      name: docker-pool
      job_variables:
        image: artifactory.myco.com/prefect/flows:2025-09-30
        env:
          APP_ENV: prod
        volumes:
          - /srv/reports:/srv/reports:Z
    schedules:
      - cron: "0 6 * * *"
        timezone: "America/New_York"
Tips:
- Build your image with your shared conda environment baked in, push it to JFrog, then reference it via `job_variables.image`.
- On RHEL9 with SELinux, use `:Z` on bind mounts (as shown) so containers can access labeled host paths.
- If you need more Docker settings, you can add them under `job_variables` (e.g., `labels`, `auto_remove`, `extra_hosts`, resource limits).

3) Deploy all (or a subset) from the project root
- All deployments in the file:
Copy code
prefect deploy
- Filter by name pattern:
Copy code
prefect deploy --name "etl-*"
- Exact one:
Copy code
prefect deploy --name "reports/daily"
4) Run behavior
- After deployment, schedules will enqueue runs automatically; your Docker worker will launch a container for each run.
- Trigger an on-demand run:
Copy code
prefect deployment run <FLOW_NAME>/<DEPLOYMENT_NAME>
Docs
- prefect.yaml: prefect.yaml guide
- Creating deployments: Create deployments
- Schedules: Create schedules
- Deployment triggers (optional): Creating deployment triggers

If you'd like, I can sketch:
- A minimal Dockerfile that installs your shared conda env
- A Podman-specific note (socket/env) for your RHEL9 hosts
- A prefect.yaml tailored to your JFrog registry and host paths
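One related pattern worth knowing: job variables can also be overridden for a single ad-hoc run without redeploying. A minimal sketch, assuming `run_deployment` accepts `job_variables` in your Prefect version; the extra mount is illustrative:
```
from prefect.deployments import run_deployment

# One-off run with an extra bind mount; the deployment's stored
# job variables are left untouched.
run_deployment(
    name="etl/hourly",
    job_variables={"volumes": ["/srv/scratch:/srv/scratch:Z"]},
    timeout=0,  # don't block waiting for the run to finish
)
```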
c
@Marvin How do I recreate my prefect.yaml under a new architecture?
m
thought for 133.9 seconds
Got it. Here's a safe way to "recreate" or migrate your prefect.yaml when you change architectures (e.g., from Process/static to Docker dynamic). This assumes Prefect 3.x.

Two viable approaches
- Scaffold a fresh prefect.yaml with prefect init, then copy your deployments over and adjust for the new architecture.
- Manually edit your existing prefect.yaml: update the work_pool and job_variables to match the new infrastructure.

Option A: Scaffold a fresh file
1) In your project root (where your flows live), run:
Copy code
prefect init
- It will initialize a new deployment configuration “recipe” and create a prefect.yaml. You can pass a specific recipe and prefill fields if you know them:
Copy code
prefect init --recipe docker -f key=value -f another_key=value
Notes:
- The available fields depend on the recipe; run `prefect init` and follow the prompts, or use `-f` to pass values. You can edit the resulting YAML afterward.
2) Copy your previous deployments from the old prefect.yaml into the new file, then update the infrastructure-specific bits (see below).

Option B: Manually transform your existing YAML
- Keep your flow entrypoints, parameters, schedules, and triggers the same.
- Change only the infrastructure section to the new target:
  - Switch `work_pool` to the new pool
  - Adjust `job_variables` to the new worker/infrastructure type

Example: Process/static → Docker/dynamic

Before (Process/static)
Copy code
prefect-version: "3.0"

deployments:
  - name: etl/hourly
    entrypoint: flows/etl.py:etl
    work_pool:
      name: process-pool
    parameters:
      env: dev
    schedules:
      - cron: "0 * * * *"
        timezone: "UTC"
After (Docker/dynamic)
Copy code
prefect-version: "3.0"

deployments:
  - name: etl/hourly
    entrypoint: flows/etl.py:etl
    work_pool:
      name: docker-pool
      job_variables:
        image: artifactory.myco.com/prefect/flows:2025-09-30
        env:
          PREFECT_LOGGING_LEVEL: INFO
          APP_ENV: dev
        volumes:
          - /srv/data:/srv/data:Z
    schedules:
      - cron: "0 * * * *"
        timezone: "UTC"
Recreate the work pool and worker for the new architecture
- Create the target work pool:
Copy code
prefect work-pool create docker-pool --type docker
- Start a worker for it:
Copy code
pip install "prefect[docker]"   # on the worker host
prefect worker start -p docker-pool
Podman on RHEL9:
- Ensure a Docker-compatible socket (podman-docker) or set `DOCKER_HOST=unix:///run/user/$UID/podman/podman.sock`
- Use `:Z` on bind mounts under SELinux
- Log in to JFrog from the worker host (`docker login` or `podman login`)

Redeploy in bulk
- From your project root:
Copy code
prefect deploy            # deploys all deployments in prefect.yaml
prefect deploy --name "etl-*"   # deploy only a subset by name pattern
Kick the tires
- Trigger one run:
Copy code
prefect deployment run <FLOW_NAME>/<DEPLOYMENT_NAME>
- Watch logs in the UI or on the worker host.

Docs
- prefect.yaml guide: prefect.yaml guide
- Create deployments: Create deployments
- Schedules: Create schedules

Want me to generate a Docker-based prefect.yaml for your exact setup (JFrog image, Podman volumes, shared conda env), and a quick script to migrate 50–100 deployments by pattern? If you're targeting something else (e.g., Kubernetes), tell me which architecture and I'll tailor the YAML.
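As a starting point for that kind of bulk migration, here's a minimal sketch of a script that rewrites the work pool section of every deployment in prefect.yaml. It assumes PyYAML and the file layout shown above; the pool name, image tag, and mount are placeholders:
```
# scripts/migrate_to_docker.py - point every deployment at a Docker pool
import yaml  # pip install pyyaml

with open("prefect.yaml") as f:
    doc = yaml.safe_load(f)

for dep in doc.get("deployments", []):
    dep["work_pool"] = {
        "name": "docker-pool",  # placeholder pool name
        "job_variables": {
            "image": "artifactory.myco.com/prefect/flows:2025-09-30",  # placeholder tag
            "volumes": ["/srv/data:/srv/data:Z"],  # placeholder mount
        },
    }

with open("prefect.yaml", "w") as f:
    yaml.safe_dump(doc, f, sort_keys=False)
```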
c
@Marvin if I want to try out the static infrastructure first, when I run prefect init, do I select the "git" recipe?
m
thought for 60.7 seconds
Short answer: no, you don't need the "git" recipe to try static infrastructure. For a local/static setup, use a Process work pool and point your deployments at it. The "git" recipe is for pulling code from a remote repo; for local code and local filepaths, stick with Process.

What to do
1) Create a Process work pool and start a worker:
Copy code
prefect work-pool create process-pool --type process
prefect worker start -p process-pool
2) Initialize a prefect.yaml (recipe optional), then set the work_pool to your Process pool:
Copy code
prefect init           # then edit prefect.yaml
# (If your init prompts include a 'process' option, you can pick it; otherwise just edit the file.)
Minimal prefect.yaml for static Process
Copy code
prefect-version: "3.0"

deployments:
  - name: etl/hourly
    entrypoint: flows/etl.py:etl
    work_pool:
      name: process-pool
    parameters:
      env: dev
    schedules:
      - cron: "0 * * * *"
        timezone: "UTC"
Then deploy:
Copy code
prefect deploy
Docs
- prefect.yaml guide: prefect.yaml guide
- Create deployments: Create deployments
- CLI init reference: prefect init CLI

If you ever decide to switch to Docker later, you'll mainly update `work_pool` and add `job_variables` (e.g., `image`, `volumes`).
c
@Marvin when running "prefect init" it asks me "Would you like to initialize your deployment configuration with a recipe?". I want to test the static infrastructure on a production host. Which option best suits my needs?
m
thought for 39.9 seconds
Choose the Process recipe (or skip recipes and use a minimal file), not "git". Reason: for static infrastructure you want runs to execute directly on your production host (Process work pool) so flows can read/write local filepaths and use your shared Conda env. The "git" recipe is for pulling code from a repo at runtime; it's optional and not required to test static execution.

What to do
1) Create and run a Process work pool on the prod host:
Copy code
prefect work-pool create process-pool --type process
prefect worker start -p process-pool
2) Run prefect init
- If prompted for a recipe: pick "Process" if shown. If it's not listed, choose "No" and we'll hand-edit the YAML.
3) Minimal prefect.yaml for static Process
Copy code
prefect-version: "3.0"

deployments:
  - name: etl/hourly
    entrypoint: flows/etl.py:etl
    work_pool:
      name: process-pool
    parameters:
      env: dev
    schedules:
      - cron: "0 * * * *"
        timezone: "UTC"

  - name: reports/daily
    entrypoint: flows/reports.py:build_reports
    work_pool:
      name: process-pool
    parameters:
      env: prod
    schedules:
      - cron: "0 6 * * *"
        timezone: "America/New_York"
4) Deploy all or some:
Copy code
prefect deploy               # all deployments in prefect.yaml
prefect deploy --name "etl-*"  # subset by pattern
Tips for your setup
- Ensure the worker process's environment points at your shared Conda env (PATH/conda activation in the service/systemd unit).
- Since flows need local filepaths, the Process worker is the simplest path on RHEL9.

Docs
- prefect.yaml: prefect.yaml guide
- Create deployments: Create deployments

If you want, I can provide a systemd unit example for the Process worker on RHEL9 that activates your Conda env and sets environment variables.
c
@Marvin I don't see an option for the "process" recipe. There is "local" and "default" which sound closer to what you are referring to.
m
thought for 20.6 seconds
You're right, recipes can vary by version. For testing static infrastructure on your prod host, pick "local" (or just choose "default" and hand-edit). "Local" is the closest to a Process-based setup; "git" is only needed if you want the worker to pull code from a repo at run time.

Recommended steps
1) Create and run a Process work pool on the host:
Copy code
prefect work-pool create process-pool --type process
prefect worker start -p process-pool
2) Run prefect init
- Choose "local" if available. If you pick "default," that's fine; just edit the YAML afterward.
3) Edit prefect.yaml to point deployments at your Process pool:
Copy code
prefect-version: "3.0"

deployments:
  - name: etl/hourly
    entrypoint: flows/etl.py:etl
    work_pool:
      name: process-pool
    parameters:
      env: dev
    schedules:
      - cron: "0 * * * *"
        timezone: "UTC"

  - name: reports/daily
    entrypoint: flows/reports.py:build_reports
    work_pool:
      name: process-pool
    parameters:
      env: prod
    schedules:
      - cron: "0 6 * * *"
        timezone: "America/New_York"
4) Deploy all (or a subset):
Copy code
prefect deploy
# or
prefect deploy --name "etl-*"
Notes for your setup
- Make sure the worker runs under your shared Conda env (activate it in the shell or systemd unit that starts the worker).
- Using Process keeps flows on the same host, so they can read/write your local filepaths directly.

Docs: prefect.yaml guide | Create deployments | Create schedules

If you'd like, I can tailor a prefect.yaml to your exact repo layout and add a small systemd unit example that activates your Conda env before starting the worker.
c
@Marvin I choose local, but the following section will be different in my dev environment and my prod env:
Copy code
pull:
- prefect.deployments.steps.set_working_directory:
m
thought for 323.9 seconds
Great callout. You've got a few good options to handle different working directories between dev and prod when using the "local" recipe and the `set_working_directory` pull step.

Option 1 (recommended): Don't use a pull step; set the working dir via job_variables for Process runs
- For static (Process) runs, you can set the working directory directly on the job. This avoids hard-coding absolute paths in a pull step.
- In your deployment, use:
Copy code
deployments:
  - name: etl/hourly
    entrypoint: flows/etl.py:etl
    work_pool:
      name: process-pool
      job_variables:
        working_dir: "/opt/prefect/projects/myproj"   # prod path here
- In dev, either:
  - keep a dev-specific YAML with a different `working_dir`, or
  - run a dev work pool with a different base job template default (so you don't have to change YAML).

Option 2: Use a single YAML with a relative working directory
- If your dev and prod both run the worker from the project root, you can avoid environment-specific paths by using a relative working dir:
Copy code
pull:
  - prefect.deployments.steps.set_working_directory:
      directory: "."
- This is the simplest if your project layout is the same and the worker starts in the repository root on both machines.

Option 3: Keep separate YAMLs per environment and select at deploy time
- Maintain two files (e.g., `prefect.dev.yaml` and `prefect.prod.yaml`) and choose the appropriate one when deploying.
- In prod, set the absolute path; in dev, set your dev path. Deploy using the correct file:
Copy code
prefect deploy --prefect-file prefect.prod.yaml
# and in dev:
prefect deploy --prefect-file prefect.dev.yaml
Option 4: Two deployments (dev/prod) in the same YAML
- Define both with different `work_pool` and `job_variables.working_dir`, then deploy both or just a subset:
Copy code
deployments:
  - name: etl/hourly-dev
    work_pool:
      name: process-dev
      job_variables:
        working_dir: "/home/devuser/projects/myproj"

  - name: etl/hourly-prod
    work_pool:
      name: process-prod
      job_variables:
        working_dir: "/opt/prefect/projects/myproj"
- Deploy a subset:
Copy code
prefect deploy --name "*/hourly-prod"
Notes
- For static Process runs, setting `job_variables.working_dir` is usually cleaner than using the `pull.set_working_directory` step.
- If you do keep the pull step, prefer relative paths when possible. Absolute paths often force environment-specific YAML.
- If you want to standardize even further, you can set the Process worker's default working directory via the work pool's base job template, then omit it from deployments.

Helpful docs
- Prefect YAML: prefect.yaml guide
- Creating deployments: Create deployments

If you tell me your dev and prod paths, I can produce a minimal pair of YAMLs (or a single YAML with two deployments) that aligns with your RHEL9 + shared Conda setup.
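A related trick if you deploy from Python instead of YAML: read the environment-specific path from an environment variable at deploy time. A minimal sketch; `MAESTRO_WORKING_DIR`, the repo URL, and the pool name are placeholders:
```
import os

from prefect.flows import Flow

# Placeholder env var: set MAESTRO_WORKING_DIR=/opt/prefect/projects/myproj on prod
# and point it at your local checkout in dev.
working_dir = os.environ.get("MAESTRO_WORKING_DIR", ".")

Flow.from_source(
    source="https://github.com/your-org/your-repo",  # placeholder repo
    entrypoint="flows/etl.py:etl",
).deploy(
    name="etl-hourly",
    work_pool_name="process-pool",
    job_variables={"working_dir": working_dir},
)
```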
c
@Marvin if I have 5 different flows, how do I define these deployments in prefect.yaml?
m
thought for 162.3 seconds
You can define all five as separate entries under deployments in a single prefect.yaml. Each entry specifies the flow entrypoint, a unique deployment name, the target work pool (Process for static), plus any parameters, schedules, and job variables like working_dir. Example (Process/static on your prod host):
Copy code
prefect-version: "3.0"

deployments:
  - name: ingest/hourly
    entrypoint: ./flows/ingest.py:ingest
    work_pool:
      name: process-pool
      job_variables:
        working_dir: /opt/prefect/projects/myproj
        env:
          APP_ENV: prod
    parameters:
      source: s3
    schedules:
      - cron: "0 * * * *"
        timezone: "UTC"

  - name: transform/daily
    entrypoint: ./flows/transform.py:transform
    work_pool:
      name: process-pool
      job_variables:
        working_dir: /opt/prefect/projects/myproj
    schedules:
      - cron: "0 5 * * *"
        timezone: "UTC"

  - name: quality/checks
    entrypoint: ./flows/quality.py:run_checks
    work_pool:
      name: process-pool
      job_variables:
        working_dir: /opt/prefect/projects/myproj
    parameters:
      severity: high

  - name: reports/daily
    entrypoint: ./flows/reports.py:build_reports
    work_pool:
      name: process-pool
      job_variables:
        working_dir: /opt/prefect/projects/myproj
        env:
          APP_ENV: prod
    schedules:
      - cron: "0 6 * * *"
        timezone: "America/New_York"

  - name: archive/monthly
    entrypoint: ./flows/archive.py:archive
    work_pool:
      name: process-pool
      job_variables:
        working_dir: /opt/prefect/projects/myproj
    schedules:
      - cron: "0 3 1 * *"
        timezone: "UTC"
Notes
- name should be unique per deployment; using flow-name/deploy-name is a good convention.
- entrypoint is the path to your file and flow function.
- job_variables.working_dir (under work_pool) is the easiest way to set the working directory for static Process runs; you can omit the pull.set_working_directory step if you use this.
- schedules are optional; include them only where needed.

Deploy all five from your project root:
Copy code
prefect deploy
Or deploy a subset by name pattern:
Copy code
prefect deploy --name "*/daily"
Docs for reference
- prefect.yaml: prefect.yaml guide
- Create deployments: Create deployments
- Schedules: Create schedules

If you share the actual flow file paths and preferred names, I can generate a ready-to-paste prefect.yaml for your five flows.
c
@Marvin is there a way to auto generate the deployments section of this file?
m
thought for 353.3 seconds
Short answer: there isn't a built-in "scan my repo and auto-fill deployments" feature in Prefect 3.x. `prefect init` will scaffold the file, but you still define the deployments yourself.

Good ways to automate it anyway

Option A: Generate deployments programmatically (no YAML editing)
- Write a tiny Python script that creates many deployments at once using Prefect's deployment API, then run it when you need to (CI/CD or local).
Copy code
# scripts/deploy_all.py
from prefect import deploy
from prefect.deployments.runner import RunnerDeployment

# Shared settings for every deployment
COMMON = {
    "job_variables": {"working_dir": "/opt/prefect/projects/myproj"},
}

deploy(
    RunnerDeployment.from_entrypoint(entrypoint="flows/ingest.py:ingest", name="ingest/hourly", **COMMON),
    RunnerDeployment.from_entrypoint(entrypoint="flows/transform.py:transform", name="transform/daily", **COMMON),
    RunnerDeployment.from_entrypoint(entrypoint="flows/quality.py:run_checks", name="quality/checks", **COMMON),
    RunnerDeployment.from_entrypoint(entrypoint="flows/reports.py:build_reports", name="reports/daily", **COMMON),
    RunnerDeployment.from_entrypoint(entrypoint="flows/archive.py:archive", name="archive/monthly", **COMMON),
    work_pool_name="process-pool",
)
- Run it:
Copy code
python scripts/deploy_all.py
This bypasses editing prefect.yaml and still deploys many flows in one go.

Option B: Auto-generate the YAML file from a list
- If you want the deployments to live in prefect.yaml, have a small script generate the YAML for you from a Python list/dict. For example:
Copy code
# scripts/generate_prefect_yaml.py
import yaml

def common():
    # Return a fresh dict per deployment so the YAML dump doesn't emit anchors/aliases
    return {
        "work_pool": {
            "name": "process-pool",
            "job_variables": {"working_dir": "/opt/prefect/projects/myproj"},
        }
    }

deployments = [
    {"name": "ingest/hourly", "entrypoint": "./flows/ingest.py:ingest", **common()},
    {"name": "transform/daily", "entrypoint": "./flows/transform.py:transform", **common()},
    {"name": "quality/checks", "entrypoint": "./flows/quality.py:run_checks", **common()},
    {"name": "reports/daily", "entrypoint": "./flows/reports.py:build_reports", **common()},
    {"name": "archive/monthly", "entrypoint": "./flows/archive.py:archive", **common()},
]

doc = {"prefect-version": "3.0", "deployments": deployments}

with open("prefect.yaml", "w") as f:
    yaml.safe_dump(doc, f, sort_keys=False)
- Then:
Copy code
python scripts/generate_prefect_yaml.py
prefect deploy
Option C: Keep separate dev/prod YAMLs and choose at deploy time
- Maintain `prefect.dev.yaml` and `prefect.prod.yaml` (different working_dir, env vars, etc.), then:
Copy code
prefect deploy --prefect-file prefect.prod.yaml
Note: If you prefer one file, you can put both dev/prod deployments in the same YAML and deploy subsets with `--name` patterns.

Why not CLI auto-discovery?
- Prefect 3.x does not provide a command that crawls your repo to detect all `@flow` functions and output a deployments list. You can script this yourself (e.g., maintain a registry/list of flows, or see the discovery sketch below) and then either:
  - call the deployment API (Option A), or
  - render the YAML (Option B).

Docs for reference
- prefect.yaml guide: prefect.yaml guide
- Create deployments: Create deployments

If you share your flow entrypoints and desired names, I can generate a ready-to-paste prefect.yaml or a small generator script tailored to your dev/prod paths.
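If you do want repo crawling, here's a minimal discovery sketch using only the standard library: it walks a flows/ directory, finds functions decorated with @flow via ast, and prints entrypoint strings you can feed into either option. The directory layout is an assumption, and decorator matching is deliberately naive:
```
# scripts/discover_flows.py - naive @flow discovery for generating entrypoints
import ast
from pathlib import Path

def is_flow_decorator(node: ast.expr) -> bool:
    # Matches @flow, @flow(...), @prefect.flow, and @prefect.flow(...)
    target = node.func if isinstance(node, ast.Call) else node
    return (isinstance(target, ast.Name) and target.id == "flow") or (
        isinstance(target, ast.Attribute) and target.attr == "flow"
    )

for path in Path("flows").rglob("*.py"):  # assumed project layout
    tree = ast.parse(path.read_text())
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and any(
            is_flow_decorator(d) for d in node.decorator_list
        ):
            print(f"{path.as_posix()}:{node.name}")  # e.g. flows/etl.py:etl
```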
c
@Marvin I get the following traceback when running prefect deploy:
```
Traceback (most recent call last):
  File "/home/bakerc/.conda/envs/maestro/lib/python3.13/site-packages/prefect/cli/_utilities.py", line 44, in wrapper
    return fn(*args, **kwargs)
  File "/home/bakerc/.conda/envs/maestro/lib/python3.13/site-packages/prefect/cli/_types.py", line 156, in sync_fn
    return asyncio.run(async_fn(*args, **kwargs))
  File "/home/bakerc/.conda/envs/maestro/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
  File "/home/bakerc/.conda/envs/maestro/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/home/bakerc/.conda/envs/maestro/lib/python3.13/asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
  File "/home/bakerc/.conda/envs/maestro/lib/python3.13/site-packages/prefect/cli/deploy.py", line 466, in deploy
    await _run_multi_deploy(
        ...<4 lines>...
    )
  File "/home/bakerc/.conda/envs/maestro/lib/python3.13/site-packages/prefect/cli/deploy.py", line 901, in _run_multi_deploy
    await _run_single_deploy(deploy_config, actions, prefect_file=prefect_file)
  File "/home/bakerc/.conda/envs/maestro/lib/python3.13/site-packages/prefect/cli/deploy.py", line 505, in _run_single_deploy
    deploy_config = _merge_with_default_deploy_config(deploy_config)
  File "/home/bakerc/.conda/envs/maestro/lib/python3.13/site-packages/prefect/cli/deploy.py", line 1014, in _merge_with_default_deploy_config
    deploy_config[key][k] = v
TypeError: 'str' object does not support item assignment
```
An exception occurred. Here is my prefect.yaml:
Copy code
deployments:
- name: associate_allocations
  tags: ["allocations"]
  description: null
  schedules:
    - cron: "0 0 * * *"
      timezone: "America/Chicago"
      active: false
  flow_name: "Associate Program Allocations"
  entrypoint: flows/allocations/associate.py:allocate_associate
  work_pool: maestro-process-pool
- name: opex_plan_load
  tags: ["plan_load", "anaplan"]
  description: null
  schedules:
    - cron: "2 6,12,18 * * *"
      timezone: "America/Chicago"
      active:  true
  flow_name: "Plan OpEx Load"
  entrypoint: flows/anaplan/plan/opex1.py:load_opex
  work_pool: maestro-process-pool
- name: cap_sw_plan_load
  tags: ["plan_load", "anaplan"]
  description: null
  schedules:
    - cron: "0 6,12,18 * * *"
      timezone: "America/Chicago"
      active:  true
  flow_name: "Plan Cap SW Load"
  entrypoint: flows/anaplan/plan/cap_sw_3.py:load_cap_sw
  work_pool: maestro-process-pool
- name: new_hire_import
  tags: ["anaplan"]
  description: null
  schedules:
    - cron: "0 8 * * 1-5"
      timezone: "America/Chicago"
      active: false
  flow_name: "New Hire Anaplan Upload"
  entrypoint: flows/anaplan/other/new_hire_import.py:upload_new_hires
  work_pool: maestro-process-pool
- name: virtue_load
  tags: ["competitors"]
  description: null
  schedules:
    - cron: "30 7 * * 1-5"
      timezone: "America/Chicago"
      active: false
  flow_name: "Extract Virtu Financials"
  entrypoint: flows/competitors/virtue_load.py:extract_virtu_finacials
  work_pool: maestro-process-pool
- name: seating_capacity
  tags: ["office_services"]
  description: null
  schedules:
    - cron: "0 6 * * 1-5"
      timezone: "America/Chicago"
      active: false
  flow_name: "Update Seating Capacity"
  entrypoint: flows/office_services/seating_capacity.py:update_seating_capacity
  work_pool: maestro-process-pool
- name: area_account_location
  tags: ["validations"]
  description: null
  schedules:
    - cron: "0 7 * * 1-5"
      timezone: "America/Chicago"
      active: false
  flow_name: "Check Area Account Location"
  entrypoint: flows/validations/area_account_location.py:check_area_account_location
  work_pool: maestro-process-pool
- name: compare_coupa_sun
  tags: ["validations"]
  description: null
  schedules:
    - cron: "30 21 * * 1-5"
      timezone: "America/Chicago"
      active: false
  flow_name: "Compare Coupa and SUN Files"
  entrypoint: flows/validations/compare_coupa_sun.py:check_area_account_location
  work_pool: maestro-process-pool
m
thought for 109.3 seconds
Thanks for sharing the traceback and your YAML. That TypeError almost always means Prefect is trying to merge defaults into your deployment config but finds a scalar (string) where it expects a mapping (dict). Two likely fixes:

1) Make work_pool a mapping, not a string
In recent 3.x, prefect.yaml expects the work_pool section to be structured. Change this:
Copy code
work_pool: maestro-process-pool
to:
Copy code
work_pool:
  name: maestro-process-pool
  # optional:
  # work_queue_name: default
  # job_variables:
  #   working_dir: /opt/prefect/projects/myproj
  #   env:
  #     APP_ENV: prod
2) If you kept a top-level pull/build/push section from the recipe, ensure they're lists of step objects
Example of a valid pull section:
Copy code
pull:
  - prefect.deployments.steps.set_working_directory:
      directory: /opt/prefect/projects/myproj
If pull is set to a string or otherwise malformed, merging will fail with the same error.

A cleaned-up example using your deployments
Try this minimal structure for one deployment first; if it works, replicate it for the others:
Copy code
prefect-version: "3.0"

deployments:
  - name: associate_allocations
    flow_name: "Associate Program Allocations"
    entrypoint: flows/allocations/associate.py:allocate_associate
    tags: ["allocations"]
    work_pool:
      name: maestro-process-pool
      # job_variables:
      #   working_dir: /opt/prefect/projects/myproj
    schedules:
      - cron: "0 0 * * *"
        timezone: "America/Chicago"
        active: false
If you want to keep all 8, here’s how one more looks:
Copy code
  - name: opex_plan_load
    flow_name: "Plan OpEx Load"
    entrypoint: flows/anaplan/plan/opex1.py:load_opex
    tags: ["plan_load", "anaplan"]
    work_pool:
      name: maestro-process-pool
    schedules:
      - cron: "2 6,12,18 * * *"
        timezone: "America/Chicago"
        active: true
Troubleshooting checklist
- Ensure work_pool is a mapping with name (and optional work_queue_name/job_variables).
- If you kept recipe steps at the top level (pull/build/push), confirm they're lists of step objects as shown above.
- Test with a single deployment in the file to isolate issues, then add the rest back.
- Optional: remove the active flags temporarily to see if the error persists (you can also pause a deployment with a top-level paused: true).

One more note: you're running Python 3.13. If you continue to hit odd toolchain issues, consider testing with Python 3.11/3.12, where Prefect is most battle-tested today.

If you paste your full file (including any top-level pull/build/push sections), I'll mark up the exact lines to change.

Docs for reference:
- prefect.yaml: prefect.yaml guide
- Create deployments: Create deployments
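To catch this class of mistake before running prefect deploy, a quick pre-flight check can help. A minimal sketch, assuming PyYAML; it only validates the shapes discussed in this thread, not the full prefect.yaml schema:
```
# scripts/lint_prefect_yaml.py - shape-check the bits that commonly break deploys
import yaml  # pip install pyyaml

with open("prefect.yaml") as f:
    doc = yaml.safe_load(f)

for dep in doc.get("deployments", []):
    name = dep.get("name", "<unnamed>")
    wp = dep.get("work_pool")
    if not isinstance(wp, dict) or "name" not in wp:
        print(f"{name}: work_pool should be a mapping with a 'name' key, got {wp!r}")
    if not isinstance(dep.get("schedules", []), list):
        print(f"{name}: schedules should be a list")

pull = doc.get("pull")
if pull is not None and not isinstance(pull, list):
    print(f"top-level pull should be a list of steps, got {type(pull).__name__}")
```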
c
@Marvin after running "prefect deploy", in the UI all the deployment statuses say "Not Ready"?
m
thought for 86.2 seconds
"Not Ready" in the UI means Prefect does not see available infrastructure for those deployments. In practice, it's almost always one of these:
- No worker is running for the deployment's work pool
- The work pool is paused
- Queue mismatch (the deployment targets a specific work queue but the worker isn't polling it)
- Work pool name/type mismatch, or the pool doesn't exist
- The worker cannot connect to the same API/profile you used to deploy

Quick checks and fixes
1) Verify the pool exists and is healthy
Copy code
prefect work-pool ls
prefect work-pool inspect maestro-process-pool
- Confirm type=process (for static), paused=False, and at least one connected worker.
2) Start a worker for that pool
Copy code
prefect worker start -p maestro-process-pool
- Make sure you start it on the host where you want flows to run, in the right Conda env, and with the same Prefect API settings you used to deploy.
3) If you use work queues, make sure they match
- If your deployment sets `work_pool.work_queue_name`, start the worker with that queue:
Copy code
prefect worker start -p maestro-process-pool -q <your-queue>
4) Validate the deployment config
- Ensure your prefect.yaml uses the structured work pool form:
Copy code
work_pool:
  name: maestro-process-pool
  # work_queue_name: default
  # job_variables:
  #   working_dir: /opt/prefect/projects/myproj
5) Confirm API/profile on the worker host
- The worker must point to the same Prefect API URL/key as your deploy step (Cloud or Server). If needed, set `PREFECT_API_URL` (and `PREFECT_API_KEY` for Cloud) in the worker's environment.
6) Check the UI details
- Click into a deployment in the UI and look at the Readiness section; it will usually say exactly why it's "Not Ready" (e.g., "No workers connected to work pool maestro-process-pool").

If you run the two commands below and paste the outputs, I can pinpoint the issue:
Copy code
prefect work-pool inspect maestro-process-pool
prefect worker start -p maestro-process-pool
Docs for reference
- prefect.yaml guide: prefect.yaml guide
- Create deployments: Create deployments
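You can also run the same checks from Python, which is handy in a monitoring or health-check script on the worker host. A minimal sketch using the orchestration client; the pool name matches this thread:
```
import asyncio

from prefect.client.orchestration import get_client

async def main():
    async with get_client() as client:
        pool = await client.read_work_pool(work_pool_name="maestro-process-pool")
        print("type:", pool.type)          # should be "process" for static runs
        print("paused:", pool.is_paused)   # paused pools leave deployments Not Ready

asyncio.run(main())
```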
c
@Marvin I successfully deployed using "prefect deploy", but when it was time for the jobs to run they all show "crashed"
m
thought for 48.8 seconds
"Crashed" means the run process died before Prefect could report a final state. Common causes on static (Process) runs:
- Wrong working directory or entrypoint path (the flow can't be loaded)
- Worker running in a different Conda env than your code (imports fail)
- Missing env vars/credentials used at import/run time
- The process was killed (OOM, a permission error that aborts early, etc.)
- API mismatch (the worker points to a different Prefect API than you deployed to)

Quick triage
1) Open one crashed run in the UI and copy the first error lines from the logs. That message usually pinpoints the root cause (e.g., "cannot import name …", "file not found", "Flow could not be loaded from entrypoint …").
2) From the worker host, run one deployment interactively to see logs live:
Copy code
prefect deployment run <FLOW_NAME>/<DEPLOYMENT_NAME> --watch
Checklist to fix the usual suspects
- Confirm the worker is in the right Conda env (same packages your flows need). If you're using systemd, ensure the unit activates the env before `prefect worker start`.
- Set an explicit working directory for Process runs so imports/relative paths work:
Copy code
work_pool:
  name: maestro-process-pool
  job_variables:
    working_dir: /opt/prefect/projects/myproj
    env:
      APP_ENV: prod
      # add any required credentials/vars here
- Verify each `entrypoint` is correct relative to the working_dir, e.g. `flows/allocations/associate.py:allocate_associate` exists on the worker host.
- Make sure any local filepaths your flows read actually exist on the host and the worker's user has permission to read them.
- Start the worker with more logs to catch early failures:
Copy code
prefect worker start -p maestro-process-pool --log-level DEBUG
- Double-check that the worker's API profile matches the one used at deploy time (Cloud vs Server URL, API key if Cloud).
- If you're on Python 3.13, be aware some dependencies may not be fully compatible yet. If crashes persist without clear error logs, try 3.11/3.12, where Prefect is most exercised.

If you can paste:
- The first 20–30 lines from one crashed run's logs, and
- Your current `work_pool` section from prefect.yaml (including `job_variables`)
…I'll pinpoint the issue and propose exact changes.

Reference:
- prefect.yaml guide: prefect.yaml guide
- Create deployments: Create deployments
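One more diagnostic that often shortens this loop: try loading each entrypoint from the working directory exactly as the worker would. A minimal sketch; the /opt/maestro path and the entrypoint list are assumptions based on this thread, and `load_flow_from_entrypoint` is the loader Prefect uses internally:
```
# scripts/check_entrypoints.py - fail fast if a flow can't be loaded
import os

from prefect.flows import load_flow_from_entrypoint

os.chdir("/opt/maestro")  # assumed working_dir; match your job_variables

entrypoints = [
    "flows/allocations/associate.py:allocate_associate",
    "flows/anaplan/other/new_hire_import.py:upload_new_hires",
]

for ep in entrypoints:
    try:
        loaded = load_flow_from_entrypoint(ep)
        print(f"OK   {ep} -> {loaded.name}")
    except Exception as exc:  # missing file, bad import, wrong function name...
        print(f"FAIL {ep}: {exc}")
```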
c
@Marvin the error was: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp2h5vkqokprefect/maestro/flows/anaplan/other/new_hire_import.py' Does this imply that I need to set the working dir?
m
thought for 86.6 seconds
Yes, that error means the flow code wasn't found in the worker's current working directory. The "/tmp/tmp...prefect/..." path is a temporary directory Prefect creates when no explicit working directory or pull step puts your project code in place. Since your entrypoints are relative (e.g., flows/anaplan/other/new_hire_import.py), Prefect looked for them under that temp dir and failed.

Fix: point deployments at your real project directory on the host (static Process)
Pick one of these approaches:

Option A (recommended): Set the working directory via job_variables and remove pull steps
- Edit each deployment to include a working_dir that points to your actual project path on the worker host (e.g., /opt/maestro). You can delete any top-level pull/build/push sections from the "local" recipe. Example:
Copy code
prefect-version: "3.0"

deployments:
  - name: new_hire_import
    flow_name: "New Hire Anaplan Upload"
    entrypoint: flows/anaplan/other/new_hire_import.py:upload_new_hires
    tags: ["anaplan"]
    work_pool:
      name: maestro-process-pool
      job_variables:
        working_dir: /opt/maestro      # <-- absolute path on the prod host
        env:
          APP_ENV: prod
    schedules:
      - cron: "0 8 * * 1-5"
        timezone: "America/Chicago"
        active: false
Option B: Keep a pull step that sets the working directory
- If you want to keep the recipe steps, set an absolute directory:
Copy code
pull:
  - prefect.deployments.steps.set_working_directory:
      directory: /opt/maestro
- Make sure there are no conflicting build/push/pull steps that try to use a temp directory.

After updating:
- Redeploy:
Copy code
prefect deploy
- Test one deployment interactively:
Copy code
prefect deployment run <FLOW_NAME>/<DEPLOYMENT_NAME> --watch
Also double-check:
- Your `entrypoint` is correct relative to the working_dir (e.g., /opt/maestro/flows/anaplan/other/new_hire_import.py exists).
- The worker is running on the same host, in your shared Conda env, with permission to read those paths.

Docs
- prefect.yaml guide: prefect.yaml guide
- Create deployments: Create deployments

If you share the actual path to your project on the prod host, I can tweak your YAML to set working_dir everywhere and remove any recipe steps that are forcing the temp directory.
c
@Marvin can I alternatively set this with the below code, or does this do something else?
Copy code
pull:
- prefect.deployments.steps.set_working_directory:
    directory: "."
m
thought for 76.6 seconds
Short answer: using directory "." in that pull step won't fix this. It sets the working directory to "the current directory" of the job, which by default is that temporary /tmp/tmp...prefect path you're seeing. You need to point Prefect at your real project directory. Use one of these instead:

Option A (recommended for Process/static): set the process working dir via job_variables
Copy code
deployments:
  - name: new_hire_import
    entrypoint: flows/anaplan/other/new_hire_import.py:upload_new_hires
    work_pool:
      name: maestro-process-pool
      job_variables:
        working_dir: /opt/maestro   # absolute path to your project on the host
Option B: keep the pull step but use an absolute path
Copy code
pull:
  - prefect.deployments.steps.set_working_directory:
      directory: /opt/maestro
Notes
- Pick one approach to avoid confusion; I generally prefer job_variables.working_dir for Process runs.
- Relative paths in your entrypoints are resolved from the working directory you set, so ensure /opt/maestro/flows/... actually exists on the worker host.
- After updating, redeploy and test one run:
Copy code
prefect deploy
prefect deployment run <FLOW>/<DEPLOYMENT> --watch
If you share the actual path to your project on that host, I can adjust your YAML snippet precisely.