# ask-community
s
Hi all! I noticed that there's a preference for workers over agents now, but workers don't support storage blocks. Does this mean storage blocks are no longer recommended? If I've stored my flows in an S3 bucket, does this mean there's no way to use them with a worker? Should I be doing something completely different? We've got two use cases for flows: one set that runs tasks in a Dask task runner doing simple data ingestion and transformation, and another where we use `prefect-aws` to launch ECS tasks for longer-running compute. I'm unsure how a "worker" would change or improve these two use cases
n
hi @Samuel Hinton - the idea of a storage block for code storage has been replaced with a `pull` step that gets attached to a deployment. the worker picks up a deployment's flow run from a work pool, then checks the deployment's `pull` step to figure out where to grab the code from (sometimes it's just a matter of setting a working directory, like if you have code baked into your flow run's docker image). commonly you'll see something like:
• a `git_clone` pull step that applies to all deployments in my `prefect.yaml` which do not have their own `pull` step defined explicitly. when a worker picks up the flow run, it clones the flow code down to the runtime machine
• a `pull_from_gcs` (same for s3) pull step that applies to one particular deployment, so at runtime the worker pulls from that gcs bucket/folder. this assumes your code is actually in that bucket, so you might want a `push` step that pushes up everything not ignored by your `.prefectignore` at `prefect deploy` time (unless you already have some CI to push your code up to blob storage)
you're free to add other actions in your pull/push/build steps, like `pip_install_requirements` or `run_shell_script` - if, for example, you have a 3rd-party secret service that you want to grab some secrets from just before the flow run starts (the pull step) and inject those into that machine (if you don't want to keep `Secret` blocks on the server or something). see the sketch below
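to make that concrete, here's a rough `prefect.yaml` sketch (the bucket, repo, work pool, and entrypoint names are all made up; the s3 steps come from `prefect-aws`, swap in the gcs equivalents if that's where your code lives):

```yaml
# prefect.yaml (illustrative sketch; all names/paths are placeholders)

# default pull step for any deployment that doesn't define its own
pull:
- prefect.deployments.steps.git_clone:
    repository: https://github.com/my-org/my-flows.git
    branch: main

# push step that runs at `prefect deploy` time
# (skip this if CI already uploads your code to the bucket)
push:
- prefect_aws.deployments.steps.push_to_s3:
    bucket: my-flow-code
    folder: ingestion-flows

deployments:
- name: s3-backed-ingestion
  entrypoint: flows/ingest.py:ingest
  work_pool:
    name: my-work-pool
  # deployment-specific pull steps override the default git_clone above
  pull:
  - prefect_aws.deployments.steps.pull_from_s3:
      id: pull_code
      bucket: my-flow-code
      folder: ingestion-flows
  # optional extra actions at runtime, e.g. install requirements
  # or run a script that fetches secrets from a third-party service
  - prefect.deployments.steps.pip_install_requirements:
      directory: "{{ pull_code.directory }}"
      requirements_file: requirements.txt
  - prefect.deployments.steps.run_shell_script:
      script: echo "fetch and export third-party secrets here"
      stream_output: true
```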
s
Interesting. I guess right now the confusing part is how to integrate that into what we currently have, which is no yaml files at all, as we build deployments from the flow. I'm guessing there might be some new kwargs for me to check out 🙂
n
a similar (to `build_from_flow`) pythonic interface for worker/work pool deployments is in the works, but for deployments that need dynamically allocated infra (contrasted with `Flow.serve`, which is an all-python deployment pattern), you will need to either use `prefect.yaml` / `prefect deploy` or stick with agents / block-based deployments at this time. imo there's a lot of nice things about the work pool/worker paradigm, like customizing the base job template on a work pool to enforce that deployment creators (as opposed to devops / platform folks) can only specify certain infra overrides for a given deployment (rough example below), or the `prefect deploy` wizard that walks you through what you actually need in your yaml file, etc. but i understand the impulse to keep everything python, we've heard a bunch of similar sentiment. feel free to reach out with snags you hit!
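for example, something like this in `prefect.yaml` (a sketch assuming an ECS work pool named `ecs-pool` whose base job template exposes `cpu` and `memory` variables; all names here are illustrative):

```yaml
deployments:
- name: long-running-compute
  entrypoint: flows/heavy_job.py:heavy_job
  work_pool:
    name: ecs-pool
    job_variables:
      # only variables exposed by the pool's base job template
      # can be overridden by the deployment creator
      cpu: 1024
      memory: 4096
```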
s
Sounds good, thanks Nate! We got 2.13.4 to upgrade in non-prod after wiping the db, and the change in work pool creation was the only other snag, so I'll wait for the dust to settle after upgrading prod in the morning to look into swapping over from agents to workers
n
sounds good 👍