# ask-marvin
t
@Marvin I deploy a flow like this:
s3_bucket = RemoteFileSystem.load("minio")
s3_bucket.put_directory(to_path='test', ignore_file='.prefectignore')
flow.from_source(
    source=s3_bucket,
    entrypoint='test/test.py:test',
).deploy(
    name="test",
    work_pool_name='default',
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    image='prefect-default-worker:2.20.22',
    job_variables={"pull_policy": "Never", "network_mode": "bridge", "auto_remove": False},
)
In the flow's code I open a file that I have deployed to MinIO with `with open('file.csv') as ...:`, but the current working directory is not `test`, where the file exists; it is the directory above it. How can I change the working directory of the flow? I want to do this to run the flow locally without changing the code.
m
thought for 345.5 seconds
You’re running from remote storage, so Prefect materializes your project into a temporary “project root” and then runs your entrypoint from there. By default, the flow’s cwd is that project root (the parent of `test`), not the entrypoint’s folder. That’s why `open('file.csv')` fails unless you’re in `test/`. You have two good options depending on your work pool type.
1) Process work pool (local runs)
- Set the `working_dir` job variable to the subfolder you want
- Example:
Copy code
flow.from_source(
    source=s3_bucket,
    entrypoint='test/test.py:test',
).deploy(
    name="test",
    work_pool_name="process-pool",
    build=False,
    push=False,
    job_variables={
        "working_dir": "test"   # cwd will be the test/ directory
    },
)
This uses the Process worker’s `working_dir` variable. You can see it in the process job template:
Copy code
prefect work-pool get-default-base-job-template --type process
2) Docker work pool
From your `job_variables` (e.g. `network_mode`, `auto_remove`) it looks like Docker. Docker job variables don’t include `working_dir` directly, but you can set the container’s working directory via `container_create_kwargs`. Prefect downloads your source into `/opt/prefect/flows` inside the container, so point the working dir at your subfolder:
Copy code
flow.from_source(
    source=s3_bucket,
    entrypoint='test/test.py:test',
).deploy(
    name="test",
    work_pool_name="my-docker-pool",
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    image='prefect-default-worker:2.20.22',
    job_variables={
        # note: the correct field is image_pull_policy (not pull_policy)
        "image_pull_policy": "Never",
        "network_mode": "bridge",
        "auto_remove": False,
        "container_create_kwargs": {
            "working_dir": "/opt/prefect/flows/test"
        }
    },
)
Tip: If unsure of the exact path, add a quick `print(os.getcwd())` once to confirm where the project root is inside the container, then append `/test`.
Notes
- Default cwd when loading from remote storage is the materialized project root, not the entrypoint folder.
- For Docker, prefer `container_create_kwargs.working_dir` over overriding `command` (overriding `command` can break Prefect’s auto-generated command).
- In your current snippet, change `pull_policy` to `image_pull_policy` to match the Docker worker template.
References
- Process/Docker work pool templates can be inspected with:
Copy code
prefect work-pool get-default-base-job-template --type process
prefect work-pool get-default-base-job-template --type docker
- Docs: Upgrading agents to workers, Settings and Profiles
Quick check: are you running on a Process pool or a Docker pool? If Docker, the `container_create_kwargs.working_dir` approach above will let you keep `open('file.csv')` working without changing your code.
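If it helps, here is a minimal sketch of the `print(os.getcwd())` check mentioned above; the flow name simply mirrors your `test/test.py:test` entrypoint and the print is meant as a temporary debug aid, not a permanent change:
Copy code
import os

from prefect import flow


@flow
def test():
    # Temporary debug output: shows which directory the worker runs the flow from,
    # so you know what to set working_dir (or container_create_kwargs.working_dir) to.
    print(os.getcwd())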
t
@Marvin give me a link to the docs showing all container_create_kwargs options
m
thought for 198.8 seconds
There isn’t a Prefect-specific list — `container_create_kwargs` is passed straight through to the Docker SDK’s container create API. The authoritative reference for all available options is the Docker SDK docs:
- Docker SDK: ContainerCollection.create kwargs
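For illustration, a hedged sketch of how a few docker-py create kwargs could be forwarded this way; `working_dir`, `environment`, and `volumes` are standard docker-py options, but the values below are placeholders, and the Docker SDK reference above is the source of truth for the full list and exact semantics:
Copy code
# Sketch: everything under container_create_kwargs is handed to the Docker SDK's
# container create call, so any documented create kwarg should be accepted.
job_variables = {
    "image_pull_policy": "Never",
    "network_mode": "bridge",
    "auto_remove": False,
    "container_create_kwargs": {
        "working_dir": "/opt/prefect/flows/test",   # cwd inside the container
        "environment": {"MY_SETTING": "value"},     # extra env vars (placeholder)
        "volumes": ["/host/data:/data:ro"],         # bind mounts (placeholder)
    },
}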
t
@Marvin container_create_kwargs doesn't seem to work
m
thought for 290.9 seconds
Got it — a few things can cause that. Quick checks and fixes:
1) Confirm your work pool type and worker version
- Are you using a Docker work pool or a Process work pool?
- If you’re using the Docker pool but running a Prefect 2.x worker image (e.g., `prefect-default-worker:2.20.22`), Prefect 3’s `container_create_kwargs` will be ignored. In Prefect 3, Docker is provided by the prefect-docker collection; the 2.x worker image isn’t compatible with 3.x job variables.
- Inspect your pool to confirm the template includes `container_create_kwargs`:
Copy code
prefect work-pool inspect <your-pool-name>
# or show the default template for docker pools
prefect work-pool get-default-base-job-template --type docker
You should see `"container_create_kwargs": "{{ container_create_kwargs }}"` in the job_configuration.
2) Use the correct job variable names
- For Docker, the key is `image_pull_policy` (not `pull_policy`). Example:
Copy code
job_variables={
  "image": "your-image:tag",
  "image_pull_policy": "Never",
  "network_mode": "bridge",
  "auto_remove": False,
  "container_create_kwargs": {
    "working_dir": "/opt/prefect/flows/test"
  }
}
3) Verify the path inside the container
- Add a quick `print(os.getcwd())` at flow start to confirm where Prefect materialized your project. Typically it’s under `/opt/prefect/flows`. Then target that subfolder in `working_dir`.
4) If you’re actually on a Process work pool (local runs)
- `container_create_kwargs` won’t apply there. Instead, set the Process worker’s `working_dir`:
Copy code
job_variables={"working_dir": "test"}
Reference for Docker kwargs
- `container_create_kwargs` maps directly to Docker SDK container create options: Docker SDK: ContainerCollection.create kwargs
If you can share:
- Your work pool type
- The worker image/version you’re running
- The exact `job_variables` you used
…I can pinpoint why it’s being ignored and suggest the precise fix.
t
@Marvin Docker work pool
m
thought for 137.1 seconds
Great — for a Docker work pool, `container_create_kwargs` should work, but a few gotchas can make it look like it doesn’t. Here’s a tight checklist and a known-good example.
Quick checks
- Make sure the pool template actually includes `container_create_kwargs`:
Copy code
prefect work-pool get-default-base-job-template --type docker
prefect work-pool inspect <your-docker-pool>
You should see `"container_create_kwargs": "{{ container_create_kwargs }}"`.
- Ensure the worker is Prefect 3.x with the docker extra installed (where the worker process runs):
Copy code
prefect version
pip show prefect-docker  # or ensure you started the worker in an env with `prefect[docker]`
- Use the correct key names:
  - `image_pull_policy` (not `pull_policy`)
  - `container_create_kwargs` values must match Docker SDK kwargs (e.g., `working_dir`, lowercase)
- Turn on DEBUG logs for the worker to see the exact kwargs passed to Docker:
Copy code
PREFECT_LOGGING_LEVEL=DEBUG prefect worker start -p <your-docker-pool>
Known-good deployment example
Copy code
flow.from_source(
    source=s3_bucket,
    entrypoint='test/test.py:test',
).deploy(
    name="test",
    work_pool_name="my-docker-pool",
    build=False,
    push=False,
    image='your-runtime-image:tag',  # must contain Prefect 3.x
    job_variables={
        "image_pull_policy": "Never",
        "network_mode": "bridge",
        "auto_remove": False,
        "container_create_kwargs": {
            "working_dir": "/opt/prefect/flows/test"
        }
    },
)
Tips
- Inside the container, Prefect typically materializes your project under `/opt/prefect/flows`. Add a quick `print(os.getcwd())` at flow start once to confirm and adjust `working_dir` to `.../test`.
- If your image does not contain Prefect 3.x, the auto-generated run command can fail or behave unexpectedly. Use an image that has Prefect 3.x installed.
If this still doesn’t take effect, please share:
- Output of `prefect work-pool inspect <pool>`
- The worker’s `prefect version` and whether `prefect-docker` is installed
- A snippet of the worker logs around “Creating Docker container …” (DEBUG level)
Reference for valid kwargs
- Docker SDK: ContainerCollection.create kwargs
t
@Marvin
Version: 2.20.22
API version: 0.8.4
Python version: 3.10.10
Git commit: 7bdb7b8e
Built: Thu, Sep 11, 2025 9:55 AM
OS/Arch: darwin/arm64
Profile: pd-flow-local
Server type: server
Current working directory: /opt/prefect/remote-file-system-minio

"container_create_kwargs": {
  "anyOf": [
    {"additionalProperties": true, "type": "object"},
    {"type": "null"}
  ],
  "default": null,
  "description": "Configuration for containers created by workers. See the [docker-py documentation](https://docker-py.readthedocs.io/en/stable/containers.html) for accepted values.",
  "title": "Container Configuration"
}

185458.662 | DEBUG | Flow run 'fervent-jerboa' - Changing working directory to 'remote-file-system-minio'
m
thought for 213.9 seconds
Thanks — that info helps. You’re running Prefect 2.20.22 on a Docker work pool. In Prefect 2, the engine changes the working directory after your container starts to the materialized project root (you can see it in your log: “Changing working directory to 'remote-file-system-minio'”). This happens after Docker creates the container, so `container_create_kwargs.working_dir` won’t stick — the engine overrides it. That’s why it “doesn’t work.”
Best no-code-change workaround in 2.x
- Make the directory you want to be the cwd the project root in remote storage. In your case, upload the contents of `test/` to the remote root and point your entrypoint at `test.py` directly:
Copy code
# upload the contents of ./test into the remote root
s3_bucket.put_directory(local_path='test', to_path='.')

flow.from_source(
    source=s3_bucket,
    entrypoint='test.py:test',   # now lives at project root
).deploy(
    name="test",
    work_pool_name='default',
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    image='prefect-default-worker:2.20.22',
    job_variables={"image_pull_policy": "Never", "network_mode": "bridge", "auto_remove": False},
)
Now Prefect will set cwd to the project root (which used to be your `test/`), so `open('file.csv')` works without changing code.
Notes
- Keep using `image_pull_policy` (not `pull_policy`) for Docker pools.
- If you decide to move to Prefect 3.x later, Docker pools support passing container create kwargs, and Process pools have a `working_dir` variable. But in 2.x the engine-level chdir takes precedence.
Reference for Docker kwargs (if you need them later)
- Docker SDK: ContainerCollection.create kwargs
If adjusting the upload layout isn’t feasible, tell me a bit more about your constraints and we can consider other options.
t
@Marvin is there a way to specify the source as a bucket and a folder in the bucket? All my flows are in one bucket, and when I execute one flow I don't want to pull all the other flows. Currently I upload my code to the S3 bucket and pass it to flow.from_source():
s3_bucket = RemoteFileSystem.load("minio")
flow.from_source(
    source=s3_bucket,
    entrypoint='.,
).deploy()
m
thought for 252.4 seconds
Yes. Point your RemoteFileSystem’s basepath at the specific subfolder you want, then make the `entrypoint` relative to that folder. Prefect will only pull that subfolder instead of your whole bucket. Two easy ways to do this:
1) Create/use a block that’s already scoped to the subfolder
- Configure your `RemoteFileSystem` block with `basepath = s3://my-bucket/path/to/one-flow`
- Then deploy with an entrypoint relative to that folder:
Copy code
from prefect import Flow
from prefect.filesystems import RemoteFileSystem

s3 = RemoteFileSystem.load("minio-one-flow")  # basepath points to s3://my-bucket/path/to/one-flow

Flow.from_source(
    source=s3,
    entrypoint="main.py:my_flow",  # relative to basepath
).deploy(name="my-flow")
2) Temporarily override the basepath in code (no need to save a new block)
- Load your existing block, modify `basepath` in memory, and use it:
Copy code
from prefect import Flow
from prefect.filesystems import RemoteFileSystem

s3 = RemoteFileSystem.load("minio")  # e.g., currently points to s3://my-bucket
s3.basepath = "s3://my-bucket/path/to/one-flow"  # narrow to a subfolder

Flow.from_source(
    source=s3,
    entrypoint="main.py:my_flow",
).deploy(name="my-flow")
Notes:
- `entrypoint` should be `<relative-path-to-file>.py:<flow_object_name>` relative to the basepath.
- When using a `RemoteFileSystem` source, Prefect will fetch everything under `basepath`. There isn’t a per-file include/exclude for remote storage today, so scoping `basepath` to the exact folder you need is the best way to avoid pulling other flows.
- If you have multiple flows in the same bucket, consider creating one `RemoteFileSystem` block per flow (each with its own subfolder basepath), or override `basepath` in code as shown above.
If you share your current `minio` block’s `basepath` and your desired folder layout, I can suggest the exact `entrypoint` and basepath strings to use.
t
@Marvin it is not working. The change exists only in the current context and is not persisted after the deployment. So when I execute the flow it pulls everything from s3://my-bucket.
m
thought for 119.1 seconds
t
@Marvin is there another way to make the deployment to pull the code from remote file system?
m
thought for 125.1 seconds
Thanks for the detail — what you’re seeing is expected. When you call `Flow.from_source(...).deploy()`, the deployment stores a reference to the saved block. At run time, the worker loads the block from Prefect and uses the block’s persisted `basepath`. Any in-memory change like `s3.basepath = "..."` that you didn’t save will be lost, so it will still pull from `s3://my-bucket`.
You have a few solid options:
1) Create a block per subfolder (recommended)
- Keep your existing `minio` block for the bucket root.
- Create a new `RemoteFileSystem` block scoped to the subfolder and use that in your deployment.
Copy code
from prefect.filesystems import RemoteFileSystem
from prefect import Flow

base = RemoteFileSystem.load("minio")  # has your MinIO credentials/settings

# Create a new block for this one flow
flow_fs = RemoteFileSystem(
    basepath="s3://my-bucket/path/to/one-flow",
    settings=base.settings,  # reuse credentials/settings
)
flow_fs.save("minio-one-flow", overwrite=True)

Flow.from_source(
    source=RemoteFileSystem.load("minio-one-flow"),
    entrypoint="main.py:my_flow",  # relative to the basepath above
).deploy(name="my-flow")
- At run time, Prefect will only pull keys under `s3://my-bucket/path/to/one-flow`.
2) Persist your change to the existing block (affects anything else using it)
- If you really want to reuse the same block name, you must save the change before deploying:
Copy code
from prefect.filesystems import RemoteFileSystem
from prefect import Flow

s3 = RemoteFileSystem.load("minio")
s3.basepath = "s3://my-bucket/path/to/one-flow"
s3.save("minio", overwrite=True)  # persist; careful: this impacts other deployments using this block

Flow.from_source(
    source=RemoteFileSystem.load("minio"),
    entrypoint="main.py:my_flow",
).deploy(name="my-flow")
3) Pass a plain S3 URL as the source (no block), using environment credentials
- If you don’t want additional blocks, you can pass the URL directly and rely on env vars for credentials (e.g., `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and your MinIO S3 endpoint env var).
Copy code
from prefect import Flow

Flow.from_source(
    source="s3://my-bucket/path/to/one-flow",
    entrypoint="main.py:my_flow",
).deploy(name="my-flow")
- This will also pull just the `path/to/one-flow` prefix. Ensure your MinIO credentials/endpoint are available in the environment where the flow runs.
Notes:
- There is no include/exclude filter for remote storage pulls today; scoping via `basepath` (or a direct prefix URL) is the way to avoid pulling other flows.
- `entrypoint` must be relative to the folder you’ve scoped as the `basepath`.
If you share your bucket layout and flow filenames, I can give you the exact `basepath` and `entrypoint` strings, or help script a quick “one block per flow” setup.
Yes — besides `Flow.from_source(...)`, you can have deployments pull code from a remote file system using project “pull steps” in a prefect.yaml, or by baking your code into a container image. Here are your main options:
1) Use prefect.yaml pull steps (CLI-based)
- Define steps that fetch code from your remote FS at run time, then deploy via `prefect deploy`.
- You can either reference a `RemoteFileSystem` block or a direct S3/MinIO URL.
Example A: pull with a block
Copy code
# prefect.yaml
name: my-project

pull:
  - prefect.deployments.steps.pull.pull_with_block:
      block_type_slug: remote-file-system
      block_document_name: minio-one-flow   # a RemoteFileSystem block scoped to the subfolder

deployments:
  - name: my-flow
    entrypoint: main.py:my_flow
Example B: pull from a URL (no block)
Copy code
# prefect.yaml
name: my-project

pull:
  - prefect.deployments.steps.pull.pull_from_remote_storage:
      url: s3://my-bucket/path/to/one-flow
      # these kwargs mirror the RemoteFileSystem settings for s3fs/minio
      key: ${AWS_ACCESS_KEY_ID}
      secret: ${AWS_SECRET_ACCESS_KEY}
      client_kwargs:
        endpoint_url: https://minio.example.com

deployments:
  - name: my-flow
    entrypoint: main.py:my_flow
Then run:
Copy code
prefect deploy
Notes:
- `pull_with_block` uses an existing `RemoteFileSystem` block; `pull_from_remote_storage` uses a URL + settings directly.
- Your `entrypoint` is relative to the pulled code root.
- Make sure your worker environment includes the required driver (e.g., `s3fs` for S3/MinIO).
2) Keep Python-only but scope the block to the subfolder
- What you’re doing works; just ensure `RemoteFileSystem.basepath` points to the specific subfolder so only that path is fetched:
Copy code
s3 = RemoteFileSystem.load("minio")
s3.basepath = "s3://my-bucket/path/to/one-flow"

Flow.from_source(source=s3, entrypoint="main.py:my_flow").deploy(name="my-flow")
3) Package code into a container image
- If you set `image=...` when deploying and run on a Docker/Kubernetes work pool, your code is in the image so no remote pull is needed at run time.
If you’d like, share your bucket layout and preferred method (CLI with prefect.yaml or Python-only), and I’ll provide an exact config. For reference, the built-in pull steps are in:
- pull_from_remote_storage and pull_with_block: prefect/deployments/steps/pull.py
t
@Marvin with git-based storage can I clone only a specific part of the repo e.g the flow directory?
m
thought for 41.8 seconds
Yes. With git-based storage you can clone only specific paths using the git pull steps in prefect.yaml. Both git_clone and agit_clone support a directories parameter for sparse checkout so only those subfolders are fetched. Example prefect.yaml:
Copy code
name: my-project

pull:
  - prefect.deployments.steps.pull.git_clone:
      repository: https://github.com/your-org/your-repo.git
      branch: main
      # only fetch these subfolders (sparse checkout)
      directories:
        - flows/my-flow-dir
      # for private repos, use one of:
      # access_token: ${GITHUB_TOKEN}
      # credentials: your-git-credentials-block-name

  # optional: set the working directory for the entrypoint
  - prefect.deployments.steps.pull.set_working_directory:
      directory: flows/my-flow-dir

deployments:
  - name: my-flow
    entrypoint: main.py:my_flow
Then run:
Copy code
prefect deploy
Notes:
- directories is a list of paths relative to the repo root.
- You can use agit_clone instead of git_clone; it also supports directories.
- If you use Flow.from_source("https://github.com/org/repo", ...), it will clone the repo; partial checkout is not configurable there — use the prefect.yaml pull step for sparse checkout.
- Make sure git is available in the worker environment; include credentials via access_token or a credentials block for private repos.
References:
- Pull steps source (shows directories support): prefect/deployments/steps/pull.py
t
@Marvin Can I do it with a deployment script using flow.from_source?
m
thought for 17.3 seconds
Short answer: not with Flow.from_source alone. In Prefect 3.x, Flow.from_source("https://github.com/org/repo", ...) will clone the whole repo; it doesn’t expose a way to limit to specific subdirectories. Workable pattern in a deployment script: use the git pull step to do a sparse checkout (directories=...), then pass the local path to Flow.from_source. Example:
Copy code
import os
from prefect import Flow
from prefect.deployments.steps.pull import git_clone

# Sparse clone only the flow directory you need
res = git_clone(
    repository="https://github.com/your-org/your-repo.git",
    branch="main",
    directories=["flows/my-flow-dir"],  # relative to repo root
    # for private repos, one of:
    # access_token=os.environ["GITHUB_TOKEN"],
    # credentials=YourGitCredentialsBlock.load("my-git-creds"),
)

clone_dir = res["directory"]  # path to the checked-out code

Flow.from_source(
    source=clone_dir,  # local path
    entrypoint="flows/my-flow-dir/main.py:my_flow",
).deploy(name="my-flow")
Notes:
- Requires git in the environment.
- directories uses sparse checkout so only those paths are fetched.
- If you prefer to stay fully declarative, the same sparse checkout is available in prefect.yaml via the pull step prefect.deployments.steps.pull.git_clone with directories=[...].
Source for the pull steps (shows directories support): prefect/deployments/steps/pull.py
If you share your repo layout and target flow path, I can tailor the exact directories and entrypoint.
t
@Marvin can I have a different prefect.yaml for each flow?
m
thought for 26.6 seconds
Yes. You can keep a separate prefect.yaml per flow and deploy them independently. You have a few ways to do it:
1) One YAML per flow (recommended for different pull/clone logic)
- Put a prefect.yaml next to each flow (or in each flow’s folder).
- Run deploy pointing to the file:
Copy code
prefect deploy --prefect-file path/to/flowA/prefect.yaml
prefect deploy --prefect-file path/to/flowB/prefect.yaml
- Or cd into the folder and run:
Copy code
cd path/to/flowA
prefect deploy
2) Programmatic deploy with a specific YAML file - If you’re using a deployment script, you can call the programmatic API and pass the YAML file path:
Copy code
from prefect import deploy

deploy(
    entrypoint="./flows/my_flow/main.py:my_flow",
    names=["my-flow"],
    prefect_file="path/to/flowA/prefect.yaml",
)
3) Single repo, multiple deployments in one YAML - You can also keep one top-level prefect.yaml and define multiple deployments (each with its own entrypoint). If each flow needs different pull/clone behavior (e.g., different sparse-checkout directories), keeping separate YAMLs is usually cleaner. If you share your folder layout, I can sketch the exact commands/YAML structure for each flow.
t
@Marvin Can I have a single yaml and execute the deployment per flow?
m
thought for 15.2 seconds
Yes. You can keep a single prefect.yaml with multiple deployments and deploy them one-by-one. How to structure it:
Copy code
name: my-project

# optional shared pull steps if all flows use the same
# pull:
#   - prefect.deployments.steps.pull.git_clone:
#       repository: <https://github.com/org/repo.git>
#       branch: main

deployments:
  - name: flow-a/prod
    entrypoint: flows/flow_a.py:flow_a
    # work_pool, parameters, schedule, etc.

  - name: flow-b/prod
    entrypoint: flows/flow_b.py:flow_b
Deploy only one deployment from the file: - Using the CLI:
Copy code
prefect deploy --name flow-a/prod
# or select with a pattern
prefect deploy --name 'flow-b/*'
# you can also pass multiple --name flags
prefect deploy --name flow-a/prod --name flow-b/prod
- Programmatically:
Copy code
from prefect import deploy

deploy(
    names=["flow-a/prod"],  # deploy just this one
    # prefect_file defaults to ./prefect.yaml; set explicitly if needed:
    # prefect_file="path/to/prefect.yaml",
)
Notes:
- Running `prefect deploy` with no `--name` filter will deploy all deployments defined in the YAML.
- If different flows require different pull/clone logic, you can either:
  - Keep separate prefect.yaml files, or
  - Do programmatic deploys where you run the appropriate pull step first, then call `deploy(names=[...])`.
t
@Marvin change this to baking the code of the flow into a separate image. I have the Dockerfile in the current directory:
s3_bucket = RemoteFileSystem.load("minio")
s3_bucket.put_directory(to_path='generate_reports_flow', ignore_file='.prefectignore')
s3_bucket.put_directory(to_path='generate_reports_flow/common', local_path='common', ignore_file='.prefectignore')
flow.from_source(
    source=s3_bucket,
    entrypoint='generate_reports_flow/generate_reports.py:generate_reports_flow',
).deploy(
    name="generate_reports_flow_deployment",
    work_pool_name='default',
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    image='prefect-flow-container:2.20.22',
    job_variables={
        "image_pull_policy": "Never",
        "network_mode": "bridge",
        "auto_remove": False
    },
    parameters={'clients': [""], 'languages': ["bg"], 'gdb_diff_recipients': [},
)
m
thought for 141.3 seconds
Got it — to bake your flow code into a Docker image (using the Dockerfile in your current directory) you can drop the remote storage and build the image during deploy. Here’s a minimal change:
- Remove the `RemoteFileSystem` and `flow.from_source(...)`
- Use a `DockerImage` with your Dockerfile and set `build=True`
- Keep `push=False` if the worker can access the local image; set `push=True` if it needs to pull from a registry
Example (Prefect 3.x):
Copy code
import os
from prefect.docker import DockerImage
from generate_reports_flow.generate_reports import generate_reports_flow

generate_reports_flow.deploy(
    name="generate_reports_flow_deployment",
    work_pool_name="default",
    version=os.getenv("GIT_COMMIT_SHA"),
    image=DockerImage(
        name="prefect-flow-container",
        tag="2.20.22",
        dockerfile="Dockerfile",  # uses Dockerfile in the current directory
    ),
    build=True,
    push=False,  # set True if your worker needs to pull from a registry
    job_variables={
        "image_pull_policy": "Never",  # K8s only; ignored by Docker workers
        "network_mode": "bridge",
        "auto_remove": False,
    },
    parameters={
        "clients": ["Електрохолд"],
        "languages": ["bg"],
        "gdb_diff_recipients": [
            "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>",
            "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>",
            "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>",
        ],
    },
)
Important notes:
- Make sure your Dockerfile copies the project into the image at a path that matches your import/module layout. A common pattern is:
Copy code
FROM python:3.11-slim
WORKDIR /opt/prefect/flows
COPY . .
RUN pip install -U pip && pip install -r requirements.txt && pip install "prefect>=3,<4"
With this, the entrypoint `generate_reports_flow/generate_reports.py:generate_reports_flow` will resolve inside the container as long as your folder layout is the same.
- If you’re using a Docker work pool on the same machine where you’re building, `push=False` is fine and `image_pull_policy: "Never"` tells the worker to run the local image.
- If you’re using a Kubernetes work pool, you’ll typically set `push=True`, push to a registry your cluster can access, and keep `image_pull_policy` as `"IfNotPresent"` or `"Never"` depending on your workflow.
Docs for reference:
- Create deployments
- Store flow code (bake into image vs remote storage)
- Kubernetes work pools and images
t
@Marvin does this mean that in order to make the flow available in production, I need to build the image, push it to the registry and then the prod worker will pull it from the registry? Is there another way?
m
thought for 27.6 seconds
Short answer: If you bake the code into the image, then yes — in production you typically need to build the image, push it to a registry your prod workers can access, and the workers will pull it. Alternatives exist if you want to avoid pushing a new image for each code change.
Common production patterns:
1) Bake code into the image (immutable, recommended for reproducibility)
- Build the image (locally or in CI), push to a registry, have workers pull it.
- Pros: reproducible, single artifact. Cons: requires registry and push/pull on every change.
- If your prod worker runs on the same node where you built the image, you can skip pushing and let it run the local image — but that doesn’t scale to multiple nodes.
- K8s note: `image_pull_policy: Never` only works if the image already exists on every node.
2) Store code remotely and pull at runtime (no need to push images for each change; see the sketch after this list)
- Use `flow.from_source(...)` to point at:
  - Git: `source="https://github.com/org/repo"` with an `entrypoint="path/to/flow.py:my_flow"` and an optional ref (branch/tag/SHA) to pin versions.
  - Object storage (S3/MinIO/GCS/Azure): your existing `RemoteFileSystem` approach.
- Workers run a stable base image (with Python/Prefect/deps) and download the code at run time.
- Pros: fast iteration, no registry churn. Cons: you must ensure the source is reachable and versioned.
- Docs: Store flow code, Create deployments
3) Shared volume mount with code
- Mount a shared NFS/PVC (K8s) or Docker volume containing your code into the worker/flow container; set the deployment entrypoint to that path.
- Pros: no image rebuilds. Cons: infra complexity; careful with consistency across nodes.
- K8s infra reference: Kubernetes work pool guide
4) Package-and-install approach
- Package your flow as a Python package, publish to a private index, and either:
  - Bake the package into a base image (rebuild only when deps change), or
  - Install the package at container startup (dynamic, but adds startup time and requires network/private index access).
Which option is best depends on your infra and release cadence:
- Need strict immutability/reproducibility? Bake and push images.
- Want faster iteration with fewer image pushes? Use `from_source` (Git or MinIO/S3) with a stable base image.
- Have a shared filesystem across workers? Consider a volume mount.
If you share a bit about your prod environment (Docker or Kubernetes work pool? single node vs cluster? access to a registry? MinIO available in prod?), I can recommend the most practical setup and provide a tailored example.
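As a rough illustration of option 2, here is a hedged sketch; the repo URL, entrypoint, work pool name, and image tag are all placeholders, and it assumes the Prefect 3.x-style `flow.from_source(...).deploy(...)` pattern used in the earlier examples:
Copy code
from prefect import flow

# Workers keep running one stable base image; only the code reference changes.
flow.from_source(
    source="https://github.com/org/repo",       # placeholder Git repo with the flow code
    entrypoint="path/to/flow.py:my_flow",       # placeholder path within that repo
).deploy(
    name="my-flow",
    work_pool_name="my-docker-pool",            # placeholder Docker work pool
    image="your-stable-base-image:tag",         # existing base image with Prefect + deps
    build=False,                                # no image build on deploy
    push=False,                                 # nothing to push to a registry
)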
t
@Marvin all options are too complex. I don't want to introduce a registry or other infrastructure. I want to put my flow's code somewhere and make the worker pull only that code and execute it.
m
thought for 89.5 seconds
Yes — you can avoid a registry entirely by having the worker pull just the code at runtime.
Simplest setup: Process work pool + remote code (MinIO/S3)
- Keep your code in MinIO using your existing `RemoteFileSystem` block.
- Run a Process work pool (no containers). The worker downloads the code at run time and executes it in the host Python environment.
- Ensure the worker machine has Prefect 3.x and your dependencies installed once.
Example (adjusting your snippet to use a Process work pool — no image needed):
Copy code
import os
from prefect.filesystems import RemoteFileSystem
from generate_reports_flow.generate_reports import generate_reports_flow

# Upload your code to MinIO (same as you do now)
s3_bucket = RemoteFileSystem.load("minio")
s3_bucket.put_directory(to_path="generate_reports_flow", ignore_file=".prefectignore")
s3_bucket.put_directory(to_path="generate_reports_flow/common", local_path="common", ignore_file=".prefectignore")

# Create a deployment that references the code in MinIO
generate_reports_flow.from_source(
    source=s3_bucket,
    entrypoint="generate_reports_flow/generate_reports.py:generate_reports_flow",
).deploy(
    name="generate_reports_flow_deployment",
    work_pool_name="my-process-pool",  # a Process work pool
    version=os.getenv("GIT_COMMIT_SHA"),
    build=False,
    push=False,
    parameters={
        "clients": ["Електрохолд"],
        "languages": ["bg"],
        "gdb_diff_recipients": [
            "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>",
            "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>",
            "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>",
        ],
    },
)
What this does
- The deployment stores a reference to your MinIO block + entrypoint.
- When a run starts, the Process worker fetches the code to a working directory and runs it with the host’s Python environment.
- No Docker image builds, no registry, no shared volumes.
Requirements
- Install Prefect 3.x and your flow’s dependencies on the worker host.
- Make sure the worker has access to the MinIO block (credentials set in the block).
Docs
- Store flow code (Git/S3/MinIO)
- Create deployments
If this matches your constraints, I can help you set up the Process work pool and confirm what to install on the worker host.
t
@Marvin same thing. Make me a yaml file for that deployment; pull only the flow's code.
m
thought for 402.6 seconds
Absolutely — here’s a minimal prefect.yaml that makes the worker pull only your flow code from your existing MinIO RemoteFileSystem block (no images/registry). It targets a Process work pool. Place this file at your project root, adjust the block name/path if needed, then run `prefect deploy` from that directory.
Copy code
name: generate-reports
prefect-version: ">=3.0.0"

deployments:
  - name: generate_reports_flow_deployment
    entrypoint: generate_reports_flow/generate_reports.py:generate_reports_flow

    # Source: pull code at runtime from your RemoteFileSystem block ("minio")
    # path is the root within the bucket where your code lives.
    source:
      block: remote-file-system/minio
      path: ""  # if your code is under "generate_reports_flow", you can leave entrypoint as-is

    work_pool:
      name: default  # a Process work pool; no containers, no registry

    parameters:
      clients: ["Електрохолд"]
      languages: ["bg"]
      gdb_diff_recipients:
        - "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>"
        - "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>"
        - "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>"

    # optional metadata
    version: "{{ env.GIT_COMMIT_SHA }}"
    tags: []
    description: "Generate reports flow"
Notes:
- Ensure the RemoteFileSystem block named `minio` exists and points to your MinIO bucket.
- Make sure your code is uploaded to that bucket (e.g., the `generate_reports_flow/` folder and its contents). You can keep using your existing `put_directory` step in CI or a one-time upload.
- The Process worker host must have Prefect 3.x and your Python dependencies installed once (no Docker).
Docs:
- Store flow code
- prefect.yaml reference
If your code in MinIO is rooted at `generate_reports_flow/`, you can also set:
- `source.path: "generate_reports_flow"`
- and shorten `entrypoint` to `generate_reports.py:generate_reports_flow`
Want me to tailor this to a different work pool name or MinIO path?
t
@Marvin I'm using Prefect 2. Where can I see the full docs for the push action? I need to push the code to the MinIO remote file system.
m
thought for 178.2 seconds
Thanks for clarifying — since you’re on Prefect 2.x:
Where to read about “push” in Prefect 2
- The push step is part of the Prefect 2 “projects/prefect.yaml” deployment workflow. See:
  - Projects & prefect.yaml (2.x): https://docs-2.prefect.io/latest/guides/prefect-deploy/
  - Filesystems / Storage blocks (2.x): https://docs-2.prefect.io/latest/concepts/filesystems/
Practical way to push code to MinIO in Prefect 2
In Prefect 2, the most straightforward path is to set a storage block on your deployment. When you build/apply the deployment, Prefect will upload (“push”) your code to that storage. For MinIO, use the RemoteFileSystem block (S3-compatible).
Example: create a RemoteFileSystem block for MinIO and use it on a deployment
Copy code
from prefect.filesystems import RemoteFileSystem
from prefect.deployments import Deployment
from generate_reports_flow.generate_reports import generate_reports_flow

# 1) Create and save a RemoteFileSystem block that points to MinIO
#    - basepath uses the s3:// scheme for s3fs
#    - settings pass the custom endpoint to MinIO
minio_fs = RemoteFileSystem(
    basepath="s3://my-bucket/prefix",
    settings={
        "client_kwargs": {
            "endpoint_url": "<http://minio:9000>"  # or your MinIO URL
        }
    },
)
minio_fs.save("minio", overwrite=True)

# Ensure AWS creds for MinIO are available (e.g., env vars):
#   AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
# Optionally AWS_DEFAULT_REGION (any value, e.g., "us-east-1")

# 2) Load the block and build/apply a deployment with that storage
storage = RemoteFileSystem.load("minio")

Deployment.build_from_flow(
    flow=generate_reports_flow,
    name="generate_reports_flow_deployment",
    storage=storage,
    parameters={
        "clients": ["Електрохолд"],
        "languages": ["bg"],
        "gdb_diff_recipients": [
            "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>",
            "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>",
            "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>",
        ],
    },
).apply()
What this does
- The deployment will reference your MinIO storage.
- Building/applying uploads the flow code to MinIO (the “push”).
- Your worker will pull the code from MinIO at run time.
If you’d prefer doing this via prefect.yaml and the CLI, the same concept applies: configure your project to use a storage block and run `prefect deploy`. The docs above walk through the project/prefect.yaml workflow in 2.x.
If you share your MinIO endpoint/credentials model (env vars vs. blocks), I can provide a prefect.yaml template for 2.x that wires it up for `prefect deploy`. Also, if you can plan a future upgrade, Prefect 3 simplifies this flow with `flow.from_source(...).deploy(...)`.
t
@Marvin How do I push the code to the MinIO remote file system with prefect.yaml?
m
thought for 61.9 seconds
Great — here’s a Prefect 2.x prefect.yaml that pushes your project code to MinIO (S3-compatible) during deploy, then has workers pull only that code at run time. This uses the prefect-aws collection’s S3 push/pull steps.
Prereqs
- Install prefect-aws where you run `prefect deploy`: `pip install "prefect-aws>=0.3.0"`
- Create an AwsCredentials block configured for MinIO (the endpoint URL points to MinIO):
Copy code
from prefect_aws.client_parameters import AwsClientParameters
from prefect_aws.credentials import AwsCredentials

AwsCredentials(
    aws_access_key_id="YOUR_MINIO_ACCESS_KEY",
    aws_secret_access_key="YOUR_MINIO_SECRET_KEY",
    # point the S3 client at MinIO via the block's client parameters
    aws_client_parameters=AwsClientParameters(endpoint_url="http://minio:9000"),  # adjust to your MinIO URL
).save("minio-creds", overwrite=True)
- Ensure your worker uses a Process work pool (no containers needed) and has your Python deps installed.
prefect.yaml (2.x)
Copy code
name: generate-reports
prefect-version: ">=2.14.0"

# Nothing to build (no images)
build: []

# Push your project files to MinIO (S3-compatible) on deploy
push:
  - prefect_aws.deployments.steps.push_to_s3:
      requires: "prefect-aws>=0.3.0"
      id: code_push
      bucket: "my-minio-bucket"              # <-- change
      folder: "generate-reports/{{ git_sha }}"  # versioned folder; change if you like
      credentials: "{{ prefect.blocks.aws-credentials.minio-creds }}"
      # the MinIO endpoint comes from aws_client_parameters on the credentials block above

# Workers pull just the code when the flow runs
pull:
  - prefect_aws.deployments.steps.pull_from_s3:
      requires: "prefect-aws>=0.3.0"
      id: code_pull
      bucket: "{{ code_push.bucket }}"
      folder: "{{ code_push.folder }}"
      credentials: "{{ prefect.blocks.aws-credentials.minio-creds }}"
      # endpoint again comes from the credentials block's aws_client_parameters

deployments:
  - name: generate_reports_flow_deployment
    entrypoint: generate_reports_flow/generate_reports.py:generate_reports_flow
    work_pool:
      name: my-process-pool                 # a Process work pool; no registry needed
    parameters:
      clients: ["Електрохолд"]
      languages: ["bg"]
      gdb_diff_recipients:
        - "<mailto:cvetelin.andreev@dreamix.eu|cvetelin.andreev@dreamix.eu>"
        - "<mailto:georgi.yanev@dreamix.eu|georgi.yanev@dreamix.eu>"
        - "<mailto:yordan.merakov@power-drone.com|yordan.merakov@power-drone.com>"
How it works
- prefect deploy runs the push step, uploading your project (respecting .prefectignore) to s3://my-minio-bucket/generate-reports/<git_sha>.
- At run time, the Process worker executes the pull step, downloads the code, and runs your flow from the entrypoint.
- No Docker images or registries involved.
Tips
- Add a .prefectignore file to exclude large/unneeded files from upload.
- If you don’t have git available, you can change folder to include a timestamp or a static folder name.
- If you prefer using an S3 Bucket block instead of steps, you can also set storage on the deployment, but steps are the most direct way to “push code” via prefect.yaml in 2.x.
Docs
- Prefect 2 Projects & Deploy (prefect.yaml, build/push/pull): https://docs-2.prefect.io/latest/guides/prefect-deploy/
- Filesystems / Storage (2.x): https://docs-2.prefect.io/latest/concepts/filesystems/
- prefect-aws collection (steps and credentials): https://prefecthq.github.io/prefect-aws/
If you share your bucket name and MinIO URL, I can plug them in and sanity-check the YAML for you.
t
@Marvin How do I skip the questions when running prefect deploy?
m
thought for 12.1 seconds
You can run deploy non-interactively by disabling CLI prompts. Options: - Prefix the command with --no-prompt (global flag)
Copy code
prefect --no-prompt deploy
- Or set the environment variable (useful in CI)
Copy code
export PREFECT_CLI_PROMPT=false
prefect deploy
Also make sure your prefect.yaml fully specifies everything (entrypoint, work_pool, push/pull steps, any required fields) so there’s nothing to ask about. Docs: https://docs-2.prefect.io/latest/guides/prefect-deploy/
t
For the record: I was able to fix my issue with
s3_bucket = RemoteFileSystem.load("minio")
s3_bucket.put_directory(to_path='generate_reports_flow', ignore_file='.prefectignore')
s3_bucket.put_directory(to_path='generate_reports_flow/common', local_path='common', ignore_file='.prefectignore')
s3_bucket.basepath = s3_bucket.basepath + '/generate_reports_flow'
s3_bucket.save("minio-generate-reports", overwrite=True)
flow.from_source(
    source=RemoteFileSystem.load("minio-generate-reports"),
    entrypoint='generate_reports.py:generate_reports_flow',