@Marvin why does it take 30 seconds or longer...
# ask-marvin
l
@Marvin why does it take 30 seconds or longer to complete "Downloading flow code from storage at '.'" .... is it copying my entire code base to another directory? is there a way to make the startup faster and avoid this download step entirely since everything is present in the container?
m
thought for 527.6 seconds
Great question: that log line can be confusing.

**What that message means**
- The message appears when a run starts and Prefect resolves the flow's "source" for execution. In Prefect 3.x, deployments include a "source" (a local path like `.`, a Git URL, or other remote storage). At runtime, Prefect runs "pull" steps to make sure the flow code is present and to set the working directory.
- If your deployment source is `.`, Prefect typically just sets the working directory to the project folder; it does not usually clone anything. However, depending on your deployment config, Prefect may still:
  - Copy files to a temporary working dir
  - Run a git clone (if the source is a Git URL)
  - Install requirements (if a pip install step was added)
  - Wait for container/image pulls or cold-start overhead

**Why it can take ~30s+**
- Git clone on each run (network + repo size)
- Runtime pip installs
- Container/image pull or cold starts (e.g., a Docker/K8s worker starting a new container)
- Copying a large project directory to a scratch working dir

**How to avoid the "download" step when code is already in the container**
If your code is baked into the image, you can make runs start almost instantly by skipping pull steps and pointing Prefect at the path that already exists in the container:
- Ensure your deployment's entrypoint matches the path inside the image (e.g., `/app/flows/my_flow.py:my_flow` if your image has the code at `/app`).
- Configure the deployment to skip pulling/cloning and only set the working directory. In your `prefect.yaml`, you can set `pull: []` or use just the `set_working_directory` step.

Example `prefect.yaml` snippet:
```yaml
# assuming your image already contains the code at /app
deployments:
- name: my-deployment
  entrypoint: flows/my_flow.py:my_flow
  work_pool:
    name: my-docker-pool
    job_variables:
      image: your-registry/your-image:tag
      image_pull_policy: IfNotPresent
      working_dir: /app
  # do not clone or download anything at runtime
  pull: []
```
If you need a pull step just to `cd` into a known directory:

```yaml
pull:
  - prefect.deployments.steps.pull.set_working_directory:
      directory: /app
```
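If you deploy programmatically instead of via `prefect.yaml`, the same idea applies: point the deployment at the entrypoint already baked into the image and skip building/pushing at deploy time. A minimal sketch, assuming Prefect 3.x's `Flow.deploy()` (the deployment name, work pool, and image tag below are placeholders):

```python
from prefect import flow


@flow
def my_flow():
    ...


if __name__ == "__main__":
    # The code is already baked into the image, so register a deployment
    # that runs the pre-built image without building or pushing anything.
    my_flow.deploy(
        name="my-deployment",                  # placeholder deployment name
        work_pool_name="my-docker-pool",       # placeholder work pool
        image="your-registry/your-image:tag",  # image that already contains the code
        build=False,                           # don't build an image at deploy time
        push=False,                            # don't push; the image already exists
    )
```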
**Other speed tips**
- If you are using a Git source, limit what is cloned: use the `directories` argument of the `git_clone` pull step to pull only the subdirectories you need.
- Avoid runtime dependency installs: bake all Python deps into the image and remove any `pip_install_requirements` pull steps.
- Avoid image pulls when not needed: set `image_pull_policy: IfNotPresent` (Docker/K8s pools).

**Helpful references**
- Deploy with Docker
- Work pools & secure deployments blog

Can you share a bit more about your setup so I can tailor exact steps?
- Prefect version
- How you created the deployment (`prefect deploy` vs. programmatic; using `flow.from_source(...)`?)
- Your work pool type (Docker, Kubernetes, Process, etc.)
- Whether your `prefect.yaml` includes any `pull` steps (git clone, pip install)
- The path where your code lives inside the image

With that, I can give you a precise config to eliminate the download step and speed up startup.
l
@Marvin here are the logs with debug enabled:
```
05:28:49 PM  Info   Downloading flow code from storage at '.'                           prefect.flow_runs
05:28:49 PM  Debug  Importing flow code from 'apps/prefect/flows.py:scan_repository'    prefect.flow_runs
05:29:15 PM  Info   Beginning flow run 'lilac-stork' for flow 'scan-repository'         prefect.flow_runs
05:29:15 PM  Debug  Executing flow 'scan-repository' for flow run 'lilac-stork'...      prefect.flow_runs
```
so it's actually slow with the import... how can i speed this up
m
thought for 58.1 seconds
Thanks for the debug logs. That pinpoints the delay: it's not "downloading," it's Python importing your flow module `apps/prefect/flows.py`. If that import takes ~25–30s, something in your module (or its imports) is heavy at import time.

**Ways to speed it up**

- Make imports lazy
  - Move heavy imports inside the task or flow functions so they run only when needed, not at module import time.
  - Typical heavy offenders: pandas/numpy, boto3, google-cloud-*, big data SDKs, ML libs.
  - Example:
```python
from prefect import flow, task


@task
def do_work():
    import pandas as pd  # lazy import
    import boto3         # lazy import
    ...


@flow
def scan_repository():
    # keep this module lightweight
    do_work()
```
- Remove module-level side effects
  - Don't create clients, open DB connections, scan repositories, or run config discovery at import time. Do that inside tasks or inside the flow.
  - If you need expensive setup, run it once in the first task of the flow and cache it if needed (see the sketch below).
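A minimal sketch of that pattern (the boto3 S3 client is just an example of an expensive object, and the caching is plain `functools.lru_cache`, not a Prefect feature):

```python
import functools

from prefect import flow, task


@functools.lru_cache(maxsize=1)
def get_client():
    """Build the expensive client once per process, on first use."""
    import boto3  # lazy import: only paid when a task actually needs it
    return boto3.client("s3")


@task
def list_bucket(bucket: str) -> list[str]:
    client = get_client()  # later calls in this process reuse the cached client
    resp = client.list_objects_v2(Bucket=bucket)
    return [obj["Key"] for obj in resp.get("Contents", [])]


@flow
def scan_repository(bucket: str = "my-bucket"):  # placeholder bucket name
    # No clients or config discovery at module import time; all setup
    # happens inside tasks when the flow actually runs.
    return list_bucket(bucket)
```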
- Profile the import to find the exact culprit
  - In your container, run:

```bash
PYTHONPROFILEIMPORTTIME=1 python -X importtime -c "import apps.prefect.flows as f"
```
This prints a tree of imports with timings; the slowest lines show exactly which import chain is causing the delay.
  - Optional: save the output to a file and inspect it with tuna:

```bash
python -X importtime -c "import apps.prefect.flows as f" 2> import.log
# pip install tuna
tuna import.log
```
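If you just want a single number confirming that the ~26-second gap in your logs is the import itself, a quick one-off check run inside the container (a sketch; adjust the module path if your entrypoint differs) would be:

```python
# One-off timing of the flow module import, complementing -X importtime above.
import importlib
import time

start = time.perf_counter()
importlib.import_module("apps.prefect.flows")
print(f"import took {time.perf_counter() - start:.1f}s")
```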
- Use a faster Python and avoid slow filesystems
  - Prefer Python 3.11+ (significantly faster imports/startup than 3.8/3.9).
  - Avoid bind-mounting your code from a network or slow volume in production runs; bake code into the image so imports read from the container's local filesystem.
  - Ensure you are not setting `PYTHONDONTWRITEBYTECODE=1` (which prevents `.pyc` caching). You can even precompile during the image build:

```dockerfile
RUN python -m compileall -q /app
```
- Keep the flow entrypoint lean
  - If your flow module is part of a larger app (e.g., one that imports a full framework at import time), consider a minimal `flows.py` that imports only what's required, and defer framework setup to tasks (see the sketch below).
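A sketch of what "lean" means here (the `myapp.scanner.run_scan` import is a hypothetical stand-in for your application's heavy code; the entrypoint module itself only imports Prefect):

```python
# apps/prefect/flows.py -- keep this module cheap to import
from prefect import flow, task


@task
def scan():
    # The heavy application code is imported only when the task runs,
    # not when Prefect imports this entrypoint module.
    from myapp.scanner import run_scan  # hypothetical heavy module
    return run_scan()


@flow
def scan_repository():
    return scan()
```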
- Check resource limits
  - Very low CPU limits for the worker/container can make import-heavy modules feel slow.

If you can share:
- The top-level imports from `apps/prefect/flows.py`
- Python version and base image
- Whether you bind-mount code vs. bake it into the image

…I can point to the exact change(s) that will give you the biggest win.