# ask-marvin
t
@Marvin I deploy a flow like this:
s3_bucket = RemoteFileSystem.load("minio")
s3_bucket.put_directory(to_path='test', ignore_file='.prefectignore')
flow.from_source(
    source=s3_bucket,
    entrypoint='test/test.py:test',
).deploy(
    name="test",
    work_pool_name='default',
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    image='prefect-default-worker:2.20.22',
    job_variables={"pull_policy": "Never", "network_mode": "bridge", "auto_remove": False},
)
In the flow's code I open a file that I have deployed to MinIO with `with open('file.csv') as ...:`, but the current working directory is not `test`, where the file exists; it is the directory above it. How can I change the working directory of the flow? I want to do this to run the flow locally without changing the code.
m
thought for 345.5 seconds
You’re running from remote storage, so Prefect materializes your project into a temporary “project root” and then runs your entrypoint from there. By default, the flow’s cwd is that project root (the parent of `test`), not the entrypoint’s folder. That’s why `open('file.csv')` fails unless you’re in `test/`. You have two good options depending on your work pool type.
1) Process work pool (local runs)
- Set the `working_dir` job variable to the subfolder you want
- Example:
Copy code
flow.from_source(
    source=s3_bucket,
    entrypoint='test/test.py:test',
).deploy(
    name="test",
    work_pool_name="process-pool",
    build=False,
    push=False,
    job_variables={
        "working_dir": "test"   # cwd will be the test/ directory
    },
)
This uses the Process worker’s `working_dir` variable. You can see it in the process job template:
Copy code
prefect work-pool get-default-base-job-template --type process
2) Docker work pool
From your `job_variables` (e.g. `network_mode`, `auto_remove`) it looks like Docker. Docker job variables don’t include `working_dir` directly, but you can set the container’s working directory via `container_create_kwargs`. Prefect downloads your source into `/opt/prefect/flows` inside the container, so point the working dir at your subfolder:
Copy code
flow.from_source(
    source=s3_bucket,
    entrypoint='test/test.py:test',
).deploy(
    name="test",
    work_pool_name="my-docker-pool",
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    image='prefect-default-worker:2.20.22',
    job_variables={
        # note: the correct field is image_pull_policy (not pull_policy)
        "image_pull_policy": "Never",
        "network_mode": "bridge",
        "auto_remove": False,
        "container_create_kwargs": {
            "working_dir": "/opt/prefect/flows/test"
        }
    },
)
Tip: If unsure of the exact path, add a quick `print(os.getcwd())` once to confirm where the project root is inside the container, then append `/test`.
Notes
- Default cwd when loading from remote storage is the materialized project root, not the entrypoint folder.
- For Docker, prefer `container_create_kwargs.working_dir` over overriding `command` (overriding `command` can break Prefect’s auto-generated command).
- In your current snippet, change `pull_policy` to `image_pull_policy` to match the Docker worker template.
References
- Process/Docker work pool templates can be inspected with:
Copy code
prefect work-pool get-default-base-job-template --type process
prefect work-pool get-default-base-job-template --type docker
- Docs: Upgrading agents to workers, Settings and Profiles
Quick check: are you running on a Process pool or a Docker pool? If Docker, the `container_create_kwargs.working_dir` approach above will let you keep `open('file.csv')` working without changing your code.
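If it helps, here is a minimal sketch of the `print(os.getcwd())` check mentioned above; the flow name simply mirrors your `test/test.py:test` entrypoint and the print is meant as a temporary debug aid, not a permanent change:
Copy code
import os

from prefect import flow


@flow
def test():
    # Temporary debug output: shows which directory the worker runs the flow from,
    # so you know what to set working_dir (or container_create_kwargs.working_dir) to.
    print(os.getcwd())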
t
@Marvin give me a link to the docs showing all container_create_kwargs options
m
thought for 198.8 seconds
There isn’t a Prefect-specific list — `container_create_kwargs` is passed straight through to the Docker SDK’s container create API. The authoritative reference for all available options is the Docker SDK docs:
- Docker SDK: ContainerCollection.create kwargs
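For illustration, a hedged sketch of how a few docker-py create kwargs could be forwarded this way; `working_dir`, `environment`, and `volumes` are standard docker-py options, but the values below are placeholders, and the Docker SDK reference above is the source of truth for the full list and exact semantics:
Copy code
# Sketch: everything under container_create_kwargs is handed to the Docker SDK's
# container create call, so any documented create kwarg should be accepted.
job_variables = {
    "image_pull_policy": "Never",
    "network_mode": "bridge",
    "auto_remove": False,
    "container_create_kwargs": {
        "working_dir": "/opt/prefect/flows/test",   # cwd inside the container
        "environment": {"MY_SETTING": "value"},     # extra env vars (placeholder)
        "volumes": ["/host/data:/data:ro"],         # bind mounts (placeholder)
    },
}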
t
@Marvin container_create_kwargs doesn't seem to work
m
thought for 290.9 seconds
Got it — a few things can cause that. Quick checks and fixes:
1) Confirm your work pool type and worker version
- Are you using a Docker work pool or a Process work pool?
- If you’re using the Docker pool but running a Prefect 2.x worker image (e.g., `prefect-default-worker:2.20.22`), Prefect 3’s `container_create_kwargs` will be ignored. In Prefect 3, Docker is provided by the prefect-docker collection; the 2.x worker image isn’t compatible with 3.x job variables.
- Inspect your pool to confirm the template includes `container_create_kwargs`:
Copy code
prefect work-pool inspect <your-pool-name>
# or show the default template for docker pools
prefect work-pool get-default-base-job-template --type docker
You should see `"container_create_kwargs": "{{ container_create_kwargs }}"` in the job_configuration.
2) Use the correct job variable names
- For Docker, the key is `image_pull_policy` (not `pull_policy`). Example:
Copy code
job_variables={
  "image": "your-image:tag",
  "image_pull_policy": "Never",
  "network_mode": "bridge",
  "auto_remove": False,
  "container_create_kwargs": {
    "working_dir": "/opt/prefect/flows/test"
  }
}
3) Verify the path inside the container
- Add a quick `print(os.getcwd())` at flow start to confirm where Prefect materialized your project. Typically it’s under `/opt/prefect/flows`. Then target that subfolder in `working_dir`.
4) If you’re actually on a Process work pool (local runs)
- `container_create_kwargs` won’t apply there. Instead, set the Process worker’s `working_dir`:
Copy code
job_variables={"working_dir": "test"}
Reference for Docker kwargs
- `container_create_kwargs` maps directly to Docker SDK container create options: Docker SDK: ContainerCollection.create kwargs
If you can share:
- Your work pool type
- The worker image/version you’re running
- The exact `job_variables` you used
…I can pinpoint why it’s being ignored and suggest the precise fix.
t
@Marvin Docker work pool
m
thought for 137.1 seconds
Great — for a Docker work pool, `container_create_kwargs` should work, but a few gotchas can make it look like it doesn’t. Here’s a tight checklist and a known-good example.
Quick checks
- Make sure the pool template actually includes `container_create_kwargs`:
Copy code
prefect work-pool get-default-base-job-template --type docker
prefect work-pool inspect <your-docker-pool>
You should see `"container_create_kwargs": "{{ container_create_kwargs }}"`.
- Ensure the worker is Prefect 3.x with the docker extra installed (where the worker process runs):
Copy code
prefect version
pip show prefect-docker  # or ensure you started the worker in an env with `prefect[docker]`
- Use the correct key names:
  - `image_pull_policy` (not `pull_policy`)
  - `container_create_kwargs` values must match Docker SDK kwargs (e.g., `working_dir`, lowercase)
- Turn on DEBUG logs for the worker to see the exact kwargs passed to Docker:
Copy code
PREFECT_LOGGING_LEVEL=DEBUG prefect worker start -p <your-docker-pool>
Known-good deployment example
Copy code
flow.from_source(
    source=s3_bucket,
    entrypoint='test/test.py:test',
).deploy(
    name="test",
    work_pool_name="my-docker-pool",
    build=False,
    push=False,
    image='your-runtime-image:tag',  # must contain Prefect 3.x
    job_variables={
        "image_pull_policy": "Never",
        "network_mode": "bridge",
        "auto_remove": False,
        "container_create_kwargs": {
            "working_dir": "/opt/prefect/flows/test"
        }
    },
)
Tips
- Inside the container, Prefect typically materializes your project under `/opt/prefect/flows`. Add a quick `print(os.getcwd())` at flow start once to confirm and adjust `working_dir` to `.../test`.
- If your image does not contain Prefect 3.x, the auto-generated run command can fail or behave unexpectedly. Use an image that has Prefect 3.x installed.
If this still doesn’t take effect, please share:
- Output of `prefect work-pool inspect <pool>`
- The worker’s `prefect version` and whether `prefect-docker` is installed
- A snippet of the worker logs around “Creating Docker container …” (DEBUG level)
Reference for valid kwargs
- Docker SDK: ContainerCollection.create kwargs
t
@Marvin
Version: 2.20.22
API version: 0.8.4
Python version: 3.10.10
Git commit: 7bdb7b8e
Built: Thu, Sep 11, 2025 9:55 AM
OS/Arch: darwin/arm64
Profile: pd-flow-local
Server type: server
Current working directory: /opt/prefect/remote-file-system-minio

"container_create_kwargs": {
  "anyOf": [
    {"additionalProperties": true, "type": "object"},
    {"type": "null"}
  ],
  "default": null,
  "description": "Configuration for containers created by workers. See the [docker-py documentation](https://docker-py.readthedocs.io/en/stable/containers.html) for accepted values.",
  "title": "Container Configuration"
}

185458.662 | DEBUG | Flow run 'fervent-jerboa' - Changing working directory to 'remote-file-system-minio'
m
thought for 213.9 seconds
Thanks — that info helps. You’re running Prefect 2.20.22 on a Docker work pool. In Prefect 2, the engine changes the working directory after your container starts to the materialized project root (you can see it in your log: “Changing working directory to 'remote-file-system-minio'”). This happens after Docker creates the container, so `container_create_kwargs.working_dir` won’t stick — the engine overrides it. That’s why it “doesn’t work.”
Best no-code-change workaround in 2.x
- Make the directory you want to be the cwd the project root in remote storage. In your case, upload the contents of `test/` to the remote root and point your entrypoint at `test.py` directly:
Copy code
# upload the contents of ./test into the remote root
s3_bucket.put_directory(local_path='test', to_path='.')

flow.from_source(
    source=s3_bucket,
    entrypoint='test.py:test',   # now lives at project root
).deploy(
    name="test",
    work_pool_name='default',
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    image='prefect-default-worker:2.20.22',
    job_variables={"image_pull_policy": "Never", "network_mode": "bridge", "auto_remove": False},
)
Now Prefect will set cwd to the project root (which used to be your `test/`), so `open('file.csv')` works without changing code.
Notes
- Keep using `image_pull_policy` (not `pull_policy`) for Docker pools.
- If you decide to move to Prefect 3.x later, Docker pools support passing container create kwargs, and Process pools have a `working_dir` variable. But in 2.x the engine-level chdir takes precedence.
Reference for Docker kwargs (if you need them later)
- Docker SDK: ContainerCollection.create kwargs
If adjusting the upload layout isn’t feasible, tell me a bit more about your constraints and we can consider other options.
t
@Marvin is there a way to specify the source as a bucket and a folder in the bucket? All my flows are in one bucket, and when I execute one flow I don't want to pull all the other flows. Currently I upload my code to the S3 bucket and pass it to flow.from_source():
s3_bucket = RemoteFileSystem.load("minio")
flow.from_source(
    source=s3_bucket,
    entrypoint='.,
).deploy()
m
thought for 252.4 seconds
Yes. Point your RemoteFileSystem’s basepath at the specific subfolder you want, then make the `entrypoint` relative to that folder. Prefect will only pull that subfolder instead of your whole bucket. Two easy ways to do this:
1) Create/use a block that’s already scoped to the subfolder
- Configure your `RemoteFileSystem` block with `basepath = s3://my-bucket/path/to/one-flow`
- Then deploy with an entrypoint relative to that folder:
Copy code
from prefect import Flow
from prefect.filesystems import RemoteFileSystem

s3 = RemoteFileSystem.load("minio-one-flow")  # basepath points to s3://my-bucket/path/to/one-flow

Flow.from_source(
    source=s3,
    entrypoint="main.py:my_flow",  # relative to basepath
).deploy(name="my-flow")
2) Temporarily override the basepath in code (no need to save a new block)
- Load your existing block, modify `basepath` in memory, and use it:
Copy code
from prefect import Flow
from prefect.filesystems import RemoteFileSystem

s3 = RemoteFileSystem.load("minio")  # e.g., currently points to s3://my-bucket
s3.basepath = "s3://my-bucket/path/to/one-flow"  # narrow to a subfolder

Flow.from_source(
    source=s3,
    entrypoint="main.py:my_flow",
).deploy(name="my-flow")
Notes:
- `entrypoint` should be `<relative-path-to-file>.py:<flow_object_name>` relative to the basepath.
- When using a `RemoteFileSystem` source, Prefect will fetch everything under `basepath`. There isn’t a per-file include/exclude for remote storage today, so scoping `basepath` to the exact folder you need is the best way to avoid pulling other flows.
- If you have multiple flows in the same bucket, consider creating one `RemoteFileSystem` block per flow (each with its own subfolder basepath), or override `basepath` in code as shown above.
If you share your current `minio` block’s `basepath` and your desired folder layout, I can suggest the exact `entrypoint` and basepath strings to use.
t
@Marvin it is not working. The change exists only in the current context and is not persisted after the deployment. So when I execute the flow it pulls everything from s3://my-bucket.
m
thought for 119.1 seconds
t
@Marvin is there another way to make the deployment to pull the code from remote file system?
m
thought for 125.1 seconds
Thanks for the detail — what you’re seeing is expected. When you call `Flow.from_source(...).deploy()`, the deployment stores a reference to the saved block. At run time, the worker loads the block from Prefect and uses the block’s persisted `basepath`. Any in-memory change like `s3.basepath = "..."` that you didn’t save will be lost, so it will still pull from `s3://my-bucket`.
You have a few solid options:
1) Create a block per subfolder (recommended)
- Keep your existing `minio` block for the bucket root.
- Create a new `RemoteFileSystem` block scoped to the subfolder and use that in your deployment.
Copy code
from prefect.filesystems import RemoteFileSystem
from prefect import Flow

base = RemoteFileSystem.load("minio")  # has your MinIO credentials/settings

# Create a new block for this one flow
flow_fs = RemoteFileSystem(
    basepath="s3://my-bucket/path/to/one-flow",
    settings=base.settings,  # reuse credentials/settings
)
flow_fs.save("minio-one-flow", overwrite=True)

Flow.from_source(
    source=RemoteFileSystem.load("minio-one-flow"),
    entrypoint="main.py:my_flow",  # relative to the basepath above
).deploy(name="my-flow")
- At run time, Prefect will only pull keys under `s3://my-bucket/path/to/one-flow`.
2) Persist your change to the existing block (affects anything else using it)
- If you really want to reuse the same block name, you must save the change before deploying:
Copy code
from prefect.filesystems import RemoteFileSystem
from prefect import Flow

s3 = RemoteFileSystem.load("minio")
s3.basepath = "s3://my-bucket/path/to/one-flow"
s3.save("minio", overwrite=True)  # persist; careful: this impacts other deployments using this block

Flow.from_source(
    source=RemoteFileSystem.load("minio"),
    entrypoint="main.py:my_flow",
).deploy(name="my-flow")
3) Pass a plain S3 URL as the source (no block), using environment credentials
- If you don’t want additional blocks, you can pass the URL directly and rely on env vars for credentials (e.g., `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and your MinIO S3 endpoint env var).
Copy code
from prefect import Flow

Flow.from_source(
    source="s3://my-bucket/path/to/one-flow",
    entrypoint="main.py:my_flow",
).deploy(name="my-flow")
- This will also pull just the `path/to/one-flow` prefix. Ensure your MinIO credentials/endpoint are available in the environment where the flow runs.
Notes:
- There is no include/exclude filter for remote storage pulls today; scoping via `basepath` (or a direct prefix URL) is the way to avoid pulling other flows.
- `entrypoint` must be relative to the folder you’ve scoped as the `basepath`.
If you share your bucket layout and flow filenames, I can give you the exact `basepath` and `entrypoint` strings, or help script a quick “one block per flow” setup.
Yes — besides `Flow.from_source(...)`, you can have deployments pull code from a remote file system using project “pull steps” in a prefect.yaml, or by baking your code into a container image. Here are your main options:
1) Use prefect.yaml pull steps (CLI-based)
- Define steps that fetch code from your remote FS at run time, then deploy via `prefect deploy`.
- You can either reference a `RemoteFileSystem` block or a direct S3/MinIO URL.
Example A: pull with a block
Copy code
# prefect.yaml
name: my-project

pull:
  - prefect.deployments.steps.pull.pull_with_block:
      block_type_slug: remote-file-system
      block_document_name: minio-one-flow   # a RemoteFileSystem block scoped to the subfolder

deployments:
  - name: my-flow
    entrypoint: main.py:my_flow
Example B: pull from a URL (no block)
Copy code
# prefect.yaml
name: my-project

pull:
  - prefect.deployments.steps.pull.pull_from_remote_storage:
      url: s3://my-bucket/path/to/one-flow
      # these kwargs mirror the RemoteFileSystem settings for s3fs/minio
      key: ${AWS_ACCESS_KEY_ID}
      secret: ${AWS_SECRET_ACCESS_KEY}
      client_kwargs:
        endpoint_url: https://minio.example.com

deployments:
  - name: my-flow
    entrypoint: main.py:my_flow
Then run:
Copy code
prefect deploy
Notes:
- `pull_with_block` uses an existing `RemoteFileSystem` block; `pull_from_remote_storage` uses a URL + settings directly.
- Your `entrypoint` is relative to the pulled code root.
- Make sure your worker environment includes the required driver (e.g., `s3fs` for S3/MinIO).
2) Keep Python-only but scope the block to the subfolder
- What you’re doing works; just ensure `RemoteFileSystem.basepath` points to the specific subfolder so only that path is fetched:
Copy code
s3 = RemoteFileSystem.load("minio")
s3.basepath = "s3://my-bucket/path/to/one-flow"

Flow.from_source(source=s3, entrypoint="main.py:my_flow").deploy(name="my-flow")
3) Package code into a container image
- If you set `image=...` when deploying and run on a Docker/Kubernetes work pool, your code is in the image so no remote pull is needed at run time.
If you’d like, share your bucket layout and preferred method (CLI with prefect.yaml or Python-only), and I’ll provide an exact config. For reference, the built-in pull steps are in:
- pull_from_remote_storage and pull_with_block: prefect/deployments/steps/pull.py
t
@Marvin with git-based storage can I clone only a specific part of the repo e.g the flow directory?
m
thought for 41.8 seconds
Yes. With git-based storage you can clone only specific paths using the git pull steps in prefect.yaml. Both git_clone and agit_clone support a directories parameter for sparse checkout so only those subfolders are fetched. Example prefect.yaml:
Copy code
name: my-project

pull:
  - prefect.deployments.steps.pull.git_clone:
      repository: https://github.com/your-org/your-repo.git
      branch: main
      # only fetch these subfolders (sparse checkout)
      directories:
        - flows/my-flow-dir
      # for private repos, use one of:
      # access_token: ${GITHUB_TOKEN}
      # credentials: your-git-credentials-block-name

  # optional: set the working directory for the entrypoint
  - prefect.deployments.steps.pull.set_working_directory:
      directory: flows/my-flow-dir

deployments:
  - name: my-flow
    entrypoint: main.py:my_flow
Then run:
Copy code
prefect deploy
Notes:
- directories is a list of paths relative to the repo root.
- You can use agit_clone instead of git_clone; it also supports directories.
- If you use Flow.from_source("https://github.com/org/repo", ...), it will clone the repo; partial checkout is not configurable there — use the prefect.yaml pull step for sparse checkout.
- Make sure git is available in the worker environment; include credentials via access_token or a credentials block for private repos.
References:
- Pull steps source (shows directories support): prefect/deployments/steps/pull.py
t
@Marvin Can I do it with a deployment script using flow.from_source?
m
thought for 17.3 seconds
Short answer: not with Flow.from_source alone. In Prefect 3.x, Flow.from_source("https://github.com/org/repo", ...) will clone the whole repo; it doesn’t expose a way to limit to specific subdirectories. Workable pattern in a deployment script: use the git pull step to do a sparse checkout (directories=...), then pass the local path to Flow.from_source. Example:
Copy code
import os
from prefect import Flow
from prefect.deployments.steps.pull import git_clone

# Sparse clone only the flow directory you need
res = git_clone(
    repository="https://github.com/your-org/your-repo.git",
    branch="main",
    directories=["flows/my-flow-dir"],  # relative to repo root
    # for private repos, one of:
    # access_token=os.environ["GITHUB_TOKEN"],
    # credentials=YourGitCredentialsBlock.load("my-git-creds"),
)

clone_dir = res["directory"]  # path to the checked-out code

Flow.from_source(
    source=clone_dir,  # local path
    entrypoint="flows/my-flow-dir/main.py:my_flow",
).deploy(name="my-flow")
Notes:
- Requires git in the environment.
- directories uses sparse checkout so only those paths are fetched.
- If you prefer to stay fully declarative, the same sparse checkout is available in prefect.yaml via the pull step prefect.deployments.steps.pull.git_clone with directories=[...].
Source for the pull steps (shows directories support): prefect/deployments/steps/pull.py
If you share your repo layout and target flow path, I can tailor the exact directories and entrypoint.
t
@Marvin can I have a different prefect.yaml for each flow?
m
thought for 26.6 seconds
Yes. You can keep a separate prefect.yaml per flow and deploy them independently. You have a few ways to do it:
1) One YAML per flow (recommended for different pull/clone logic)
- Put a prefect.yaml next to each flow (or in each flow’s folder).
- Run deploy pointing to the file:
Copy code
prefect deploy --prefect-file path/to/flowA/prefect.yaml
prefect deploy --prefect-file path/to/flowB/prefect.yaml
- Or cd into the folder and run:
Copy code
cd path/to/flowA
prefect deploy
2) Programmatic deploy with a specific YAML file - If you’re using a deployment script, you can call the programmatic API and pass the YAML file path:
Copy code
from prefect import deploy

deploy(
    entrypoint="./flows/my_flow/main.py:my_flow",
    names=["my-flow"],
    prefect_file="path/to/flowA/prefect.yaml",
)
3) Single repo, multiple deployments in one YAML - You can also keep one top-level prefect.yaml and define multiple deployments (each with its own entrypoint). If each flow needs different pull/clone behavior (e.g., different sparse-checkout directories), keeping separate YAMLs is usually cleaner. If you share your folder layout, I can sketch the exact commands/YAML structure for each flow.
t
@Marvin Can I have a single yaml and execute the deployment per flow?
m
thought for 15.2 seconds
Yes. You can keep a single prefect.yaml with multiple deployments and deploy them one-by-one. How to structure it:
Copy code
name: my-project

# optional shared pull steps if all flows use the same
# pull:
#   - prefect.deployments.steps.pull.git_clone:
#       repository: <https://github.com/org/repo.git>
#       branch: main

deployments:
  - name: flow-a/prod
    entrypoint: flows/flow_a.py:flow_a
    # work_pool, parameters, schedule, etc.

  - name: flow-b/prod
    entrypoint: flows/flow_b.py:flow_b
Deploy only one deployment from the file: - Using the CLI:
Copy code
prefect deploy --name flow-a/prod
# or select with a pattern
prefect deploy --name 'flow-b/*'
# you can also pass multiple --name flags
prefect deploy --name flow-a/prod --name flow-b/prod
- Programmatically:
Copy code
from prefect import deploy

deploy(
    names=["flow-a/prod"],  # deploy just this one
    # prefect_file defaults to ./prefect.yaml; set explicitly if needed:
    # prefect_file="path/to/prefect.yaml",
)
Notes:
- Running `prefect deploy` with no `--name` filter will deploy all deployments defined in the YAML.
- If different flows require different pull/clone logic, you can either:
  - Keep separate prefect.yaml files, or
  - Do programmatic deploys where you run the appropriate pull step first, then call `deploy(names=[...])`.
t
@Marvin change this to baking the code of the flow into a separate image. I have the Dockerfile in the current directory:
s3_bucket = RemoteFileSystem.load("minio")
s3_bucket.put_directory(to_path='generate_reports_flow', ignore_file='.prefectignore')
s3_bucket.put_directory(to_path='generate_reports_flow/common', local_path='common', ignore_file='.prefectignore')
flow.from_source(
    source=s3_bucket,
    entrypoint='generate_reports_flow/generate_reports.py:generate_reports_flow',
).deploy(
    name="generate_reports_flow_deployment",
    work_pool_name='default',
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    image='prefect-flow-container:2.20.22',
    job_variables={
        "image_pull_policy": "Never",
        "network_mode": "bridge",
        "auto_remove": False
    },
    parameters={'clients': [""], 'languages': ["bg"], 'gdb_diff_recipients': [},
)
m
thought for 141.3 seconds
Got it — to bake your flow code into a Docker image (using the Dockerfile in your current directory) you can drop the remote storage and build the image during deploy. Here’s a minimal change:
- Remove the `RemoteFileSystem` and `flow.from_source(...)`
- Use a `DockerImage` with your Dockerfile and set `build=True`
- Keep `push=False` if the worker can access the local image; set `push=True` if it needs to pull from a registry
Example (Prefect 3.x):
Copy code
import os
from prefect.docker import DockerImage
from generate_reports_flow.generate_reports import generate_reports_flow

generate_reports_flow.deploy(
    name="generate_reports_flow_deployment",
    work_pool_name="default",
    version=os.getenv("GIT_COMMIT_SHA"),
    image=DockerImage(
        name="prefect-flow-container",
        tag="2.20.22",
        dockerfile="Dockerfile",  # uses Dockerfile in the current directory
    ),
    build=True,
    push=False,  # set True if your worker needs to pull from a registry
    job_variables={
        "image_pull_policy": "Never",  # K8s only; ignored by Docker workers
        "network_mode": "bridge",
        "auto_remove": False,
    },
    parameters={
        "clients": ["Електрохолд"],
        "languages": ["bg"],
        "gdb_diff_recipients": [
            "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>",
            "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>",
            "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>",
        ],
    },
)
Important notes:
- Make sure your Dockerfile copies the project into the image at a path that matches your import/module layout. A common pattern is:
Copy code
FROM python:3.11-slim
WORKDIR /opt/prefect/flows
COPY . .
RUN pip install -U pip && pip install -r requirements.txt && pip install "prefect>=3,<4"
With this, the entrypoint `generate_reports_flow/generate_reports.py:generate_reports_flow` will resolve inside the container as long as your folder layout is the same.
- If you’re using a Docker work pool on the same machine where you’re building, `push=False` is fine and `image_pull_policy: "Never"` tells the worker to run the local image.
- If you’re using a Kubernetes work pool, you’ll typically set `push=True`, push to a registry your cluster can access, and keep `image_pull_policy` as `"IfNotPresent"` or `"Never"` depending on your workflow.
Docs for reference:
- Create deployments
- Store flow code (bake into image vs remote storage)
- Kubernetes work pools and images
t
@Marvin does this mean that in order to make the flow available in production, I need to build the image, push it to the registry and then the prod worker will pull it from the registry? Is there another way?
m
thought for 27.6 seconds
Short answer: If you bake the code into the image, then yes — in production you typically need to build the image, push it to a registry your prod workers can access, and the workers will pull it. Alternatives exist if you want to avoid pushing a new image for each code change.
Common production patterns:
1) Bake code into the image (immutable, recommended for reproducibility)
- Build the image (locally or in CI), push to a registry, have workers pull it.
- Pros: reproducible, single artifact. Cons: requires registry and push/pull on every change.
- If your prod worker runs on the same node where you built the image, you can skip pushing and let it run the local image — but that doesn’t scale to multiple nodes.
- K8s note: `image_pull_policy: Never` only works if the image already exists on every node.
2) Store code remotely and pull at runtime (no need to push images for each change; see the sketch after this list)
- Use `flow.from_source(...)` to point at:
  - Git: `source="https://github.com/org/repo"` with an `entrypoint="path/to/flow.py:my_flow"` and an optional ref (branch/tag/SHA) to pin versions.
  - Object storage (S3/MinIO/GCS/Azure): your existing `RemoteFileSystem` approach.
- Workers run a stable base image (with Python/Prefect/deps) and download the code at run time.
- Pros: fast iteration, no registry churn. Cons: you must ensure the source is reachable and versioned.
- Docs: Store flow code, Create deployments
3) Shared volume mount with code
- Mount a shared NFS/PVC (K8s) or Docker volume containing your code into the worker/flow container; set the deployment entrypoint to that path.
- Pros: no image rebuilds. Cons: infra complexity; careful with consistency across nodes.
- K8s infra reference: Kubernetes work pool guide
4) Package-and-install approach
- Package your flow as a Python package, publish to a private index, and either:
  - Bake the package into a base image (rebuild only when deps change), or
  - Install the package at container startup (dynamic, but adds startup time and requires network/private index access).
Which option is best depends on your infra and release cadence:
- Need strict immutability/reproducibility? Bake and push images.
- Want faster iteration with fewer image pushes? Use `from_source` (Git or MinIO/S3) with a stable base image.
- Have a shared filesystem across workers? Consider a volume mount.
If you share a bit about your prod environment (Docker or Kubernetes work pool? single node vs cluster? access to a registry? MinIO available in prod?), I can recommend the most practical setup and provide a tailored example.
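As a rough illustration of option 2, here is a hedged sketch; the repo URL, entrypoint, work pool name, and image tag are all placeholders, and it assumes the Prefect 3.x-style `flow.from_source(...).deploy(...)` pattern used in the earlier examples:
Copy code
from prefect import flow

# Workers keep running one stable base image; only the code reference changes.
flow.from_source(
    source="https://github.com/org/repo",       # placeholder Git repo with the flow code
    entrypoint="path/to/flow.py:my_flow",       # placeholder path within that repo
).deploy(
    name="my-flow",
    work_pool_name="my-docker-pool",            # placeholder Docker work pool
    image="your-stable-base-image:tag",         # existing base image with Prefect + deps
    build=False,                                # no image build on deploy
    push=False,                                 # nothing to push to a registry
)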
t
@Marvin all options are too complex. I don't want to introduce a registry or other infrastructure. I want to put my flow's code somewhere and make the worker pull only that code and execute it.
m
thought for 89.5 seconds
Yes — you can avoid a registry entirely by having the worker pull just the code at runtime.
Simplest setup: Process work pool + remote code (MinIO/S3)
- Keep your code in MinIO using your existing `RemoteFileSystem` block.
- Run a Process work pool (no containers). The worker downloads the code at run time and executes it in the host Python environment.
- Ensure the worker machine has Prefect 3.x and your dependencies installed once.
Example (adjusting your snippet to use a Process work pool — no image needed):
Copy code
import os
from prefect.filesystems import RemoteFileSystem
from generate_reports_flow.generate_reports import generate_reports_flow

# Upload your code to MinIO (same as you do now)
s3_bucket = RemoteFileSystem.load("minio")
s3_bucket.put_directory(to_path="generate_reports_flow", ignore_file=".prefectignore")
s3_bucket.put_directory(to_path="generate_reports_flow/common", local_path="common", ignore_file=".prefectignore")

# Create a deployment that references the code in MinIO
generate_reports_flow.from_source(
    source=s3_bucket,
    entrypoint="generate_reports_flow/generate_reports.py:generate_reports_flow",
).deploy(
    name="generate_reports_flow_deployment",
    work_pool_name="my-process-pool",  # a Process work pool
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    parameters={
        "clients": ["Електрохолд"],
        "languages": ["bg"],
        "gdb_diff_recipients": [
            "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>",
            "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>",
            "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>",
        ],
    },
)
What this does
- The deployment stores a reference to your MinIO block + entrypoint.
- When a run starts, the Process worker fetches the code to a working directory and runs it with the host’s Python environment.
- No Docker image builds, no registry, no shared volumes.
Requirements
- Install Prefect 3.x and your flow’s dependencies on the worker host.
- Make sure the worker has access to the MinIO block (credentials set in the block).
Docs
- Store flow code (Git/S3/MinIO)
- Create deployments
If this matches your constraints, I can help you set up the Process work pool and confirm what to install on the worker host.
t
@Marvin same thing. Make me a yaml file for that deployment; pull only the flow's code.
m
thought for 402.6 seconds
Absolutely — here’s a minimal prefect.yaml that makes the worker pull only your flow code from your existing MinIO RemoteFileSystem block (no images/registry). It targets a Process work pool. Place this file at your project root, adjust the block name/path if needed, then run `prefect deploy` from that directory.
Copy code
name: generate-reports
prefect-version: ">=3.0.0"

deployments:
  - name: generate_reports_flow_deployment
    entrypoint: generate_reports_flow/generate_reports.py:generate_reports_flow

    # Source: pull code at runtime from your RemoteFileSystem block ("minio")
    # path is the root within the bucket where your code lives.
    source:
      block: remote-file-system/minio
      path: ""  # if your code is under "generate_reports_flow", you can leave entrypoint as-is

    work_pool:
      name: default  # a Process work pool; no containers, no registry

    parameters:
      clients: ["Електрохолд"]
      languages: ["bg"]
      gdb_diff_recipients:
        - "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>"
        - "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>"
        - "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>"

    # optional metadata
    version: "{{ env.GIT_COMMIT_SHA }}"
    tags: []
    description: "Generate reports flow"
Notes:
- Ensure the RemoteFileSystem block named `minio` exists and points to your MinIO bucket.
- Make sure your code is uploaded to that bucket (e.g., the `generate_reports_flow/` folder and its contents). You can keep using your existing `put_directory` step in CI or a one-time upload.
- The Process worker host must have Prefect 3.x and your Python dependencies installed once (no Docker).
Docs:
- Store flow code
- prefect.yaml reference
If your code in MinIO is rooted at `generate_reports_flow/`, you can also set:
- `source.path: "generate_reports_flow"`
- and shorten `entrypoint` to `generate_reports.py:generate_reports_flow`
Want me to tailor this to a different work pool name or MinIO path?
t
@Marvin I'm using Prefect 2. Where can I see the full docs for the push action? I need to push the code to the MinIO remote file system.
m
thought for 178.2 seconds
Thanks for clarifying — since you’re on Prefect 2.x:
Where to read about “push” in Prefect 2
- The push step is part of the Prefect 2 “projects/prefect.yaml” deployment workflow. See:
  - Projects & prefect.yaml (2.x): https://docs-2.prefect.io/latest/guides/prefect-deploy/
  - Filesystems / Storage blocks (2.x): https://docs-2.prefect.io/latest/concepts/filesystems/
Practical way to push code to MinIO in Prefect 2
In Prefect 2, the most straightforward path is to set a storage block on your deployment. When you build/apply the deployment, Prefect will upload (“push”) your code to that storage. For MinIO, use the RemoteFileSystem block (S3-compatible).
Example: create a RemoteFileSystem block for MinIO and use it on a deployment
Copy code
from prefect.filesystems import RemoteFileSystem
from prefect.deployments import Deployment
from generate_reports_flow.generate_reports import generate_reports_flow

# 1) Create and save a RemoteFileSystem block that points to MinIO
#    - basepath uses the s3:// scheme for s3fs
#    - settings pass the custom endpoint to MinIO
minio_fs = RemoteFileSystem(
    basepath="s3://my-bucket/prefix",
    settings={
        "client_kwargs": {
            "endpoint_url": "<http://minio:9000>"  # or your MinIO URL
        }
    },
)
minio_fs.save("minio", overwrite=True)

# Ensure AWS creds for MinIO are available (e.g., env vars):
#   AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
# Optionally AWS_DEFAULT_REGION (any value, e.g., "us-east-1")

# 2) Load the block and build/apply a deployment with that storage
storage = RemoteFileSystem.load("minio")

Deployment.build_from_flow(
    flow=generate_reports_flow,
    name="generate_reports_flow_deployment",
    storage=storage,
    parameters={
        "clients": ["Електрохолд"],
        "languages": ["bg"],
        "gdb_diff_recipients": [
            "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>",
            "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>",
            "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>",
        ],
    },
).apply()
What this does
- The deployment will reference your MinIO storage.
- Building/applying uploads the flow code to MinIO (the “push”).
- Your worker will pull the code from MinIO at run time.
If you’d prefer doing this via prefect.yaml and the CLI, the same concept applies: configure your project to use a storage block and run `prefect deploy`. The docs above walk through the project/prefect.yaml workflow in 2.x.
If you share your MinIO endpoint/credentials model (env vars vs. blocks), I can provide a prefect.yaml template for 2.x that wires it up for `prefect deploy`. Also, if you can plan a future upgrade, Prefect 3 simplifies this flow with `flow.from_source(...).deploy(...)`.
t
@Marvin How do I push the code to the MinIO remote file system with prefect.yaml?
m
thought for 61.9 seconds
Great — here’s a Prefect 2.x prefect.yaml that pushes your project code to MinIO (S3-compatible) during deploy, then has workers pull only that code at run time. This uses the prefect-aws collection’s S3 push/pull steps.
Prereqs
- Install prefect-aws where you run `prefect deploy`: `pip install "prefect-aws>=0.3.0"`
- Create an AwsCredentials block configured for MinIO (the endpoint URL points to MinIO):
Copy code
from prefect_aws.client_parameters import AwsClientParameters
from prefect_aws.credentials import AwsCredentials

AwsCredentials(
    aws_access_key_id="YOUR_MINIO_ACCESS_KEY",
    aws_secret_access_key="YOUR_MINIO_SECRET_KEY",
    # point the S3 client at MinIO via the block's client parameters
    aws_client_parameters=AwsClientParameters(endpoint_url="http://minio:9000"),  # adjust to your MinIO URL
).save("minio-creds", overwrite=True)
- Ensure your worker uses a Process work pool (no containers needed) and has your Python deps installed.
prefect.yaml (2.x)
Copy code
name: generate-reports
prefect-version: ">=2.14.0"

# Nothing to build (no images)
build: []

# Push your project files to MinIO (S3-compatible) on deploy
push:
  - prefect_aws.deployments.steps.push_to_s3:
      requires: "prefect-aws>=0.3.0"
      id: code_push
      bucket: "my-minio-bucket"              # <-- change
      folder: "generate-reports/{{ git_sha }}"  # versioned folder; change if you like
      credentials: "{{ prefect.blocks.aws-credentials.minio-creds }}"
      # the MinIO endpoint comes from aws_client_parameters on the credentials block above

# Workers pull just the code when the flow runs
pull:
  - prefect_aws.deployments.steps.pull_from_s3:
      requires: "prefect-aws>=0.3.0"
      id: code_pull
      bucket: "{{ code_push.bucket }}"
      folder: "{{ code_push.folder }}"
      credentials: "{{ prefect.blocks.aws-credentials.minio-creds }}"
      # endpoint again comes from the credentials block's aws_client_parameters

deployments:
  - name: generate_reports_flow_deployment
    entrypoint: generate_reports_flow/generate_reports.py:generate_reports_flow
    work_pool:
      name: my-process-pool                 # a Process work pool; no registry needed
    parameters:
      clients: ["Електрохолд"]
      languages: ["bg"]
      gdb_diff_recipients:
        - "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>"
        - "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>"
        - "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>"
How it works
- prefect deploy runs the push step, uploading your project (respecting .prefectignore) to s3://my-minio-bucket/generate-reports/<git_sha>.
- At run time, the Process worker executes the pull step, downloads the code, and runs your flow from the entrypoint.
- No Docker images or registries involved.
Tips
- Add a .prefectignore file to exclude large/unneeded files from upload.
- If you don’t have git available, you can change folder to include a timestamp or a static folder name.
- If you prefer using an S3 Bucket block instead of steps, you can also set storage on the deployment, but steps are the most direct way to “push code” via prefect.yaml in 2.x.
Docs
- Prefect 2 Projects & Deploy (prefect.yaml, build/push/pull): https://docs-2.prefect.io/latest/guides/prefect-deploy/
- Filesystems / Storage (2.x): https://docs-2.prefect.io/latest/concepts/filesystems/
- prefect-aws collection (steps and credentials): https://prefecthq.github.io/prefect-aws/
If you share your bucket name and MinIO URL, I can plug them in and sanity-check the YAML for you.
t
@Marvin How do I skip the questions when running prefect deploy?
m
thought for 12.1 seconds
You can run deploy non-interactively by disabling CLI prompts. Options: - Prefix the command with --no-prompt (global flag)
Copy code
prefect --no-prompt deploy
- Or set the environment variable (useful in CI)
Copy code
export PREFECT_CLI_PROMPT=false
prefect deploy
Also make sure your prefect.yaml fully specifies everything (entrypoint, work_pool, push/pull steps, any required fields) so there’s nothing to ask about. Docs: https://docs-2.prefect.io/latest/guides/prefect-deploy/
t
For the record: I was able to fix my issue with
s3_bucket = RemoteFileSystem.load("minio")
s3_bucket.put_directory(to_path='generate_reports_flow', ignore_file='.prefectignore')
s3_bucket.put_directory(to_path='generate_reports_flow/common', local_path='common', ignore_file='.prefectignore')
s3_bucket.basepath = s3_bucket.basepath + '/generate_reports_flow'
s3_bucket.save("minio-generate-reports", overwrite=True)
flow.from_source(
    source=RemoteFileSystem.load("minio-generate-reports"),
    entrypoint='generate_reports.py:generate_reports_flow',