m

Matt Alhonte

05/24/2023, 1:42 AM
@Marvin Is there a simple way to register every single flow in a directory? The equivalent of `prefect register <filepath>` in Prefect 1? Assume it's a bunch of `py` files with Deployments defined in Python code.
m

Marvin

05/24/2023, 1:43 AM
In Prefect 2, you can register all flows in a directory by iterating through the Python files and registering each flow. Here's a simple Python script to achieve this:
```python
import os
import glob
from prefect import Client

client = Client()

# Replace 'your_directory' with the path to your directory containing the flows
for flow_file in glob.glob("your_directory/*.py"):
    flow_module = os.path.splitext(os.path.basename(flow_file))[0]
    flow = __import__(flow_module)
    client.register(flow=flow.your_flow_name, project_name="your_project_name")
```
Replace `your_directory` with the path to the directory containing your Python files with flows, and replace `your_flow_name` and `your_project_name` with the appropriate names. This script will register each flow in the specified directory. Remember, the universe is a chaotic place, but at least we can bring some order to our flows.
m

Matt Alhonte

05/24/2023, 1:44 AM
oof, I was hoping that wasn't it, k
n

Nate

05/24/2023, 1:44 AM
nope! marvin got that wrong
that looks like prefect 1 stuff, id check out projects for bulk deployment options
m

Matt Alhonte

05/24/2023, 1:45 AM
@Nate Projects don't look like they're quite ready for primetime, though?
I liked my old workflow because it was pretty convenient for our CI/CD pipelines. Define everything in Python -> drop in the Flows folder -> have CI/CD just run `prefect register <Flows folder>`
n

Nate

05/24/2023, 1:46 AM
yeah they’re still beta, but shouldnt undergo major interface changes. i like it personally, but yeah i totally get that
id say projects are designed to be convenient for CLI-based CICD, but depending on your needs you could make the api calls in python if you wanted. if you want a pythonic interface i think infra blocks are still the way to go, though personally im not as much of a fan of monorepo setups with the python deployment sdk
m

Matt Alhonte

05/24/2023, 1:51 AM
`prefect register <Flows folder>` was run from the CLI
n

Nate

05/24/2023, 1:54 AM
ah i misunderstood what you meant by “define everything in python” ill leave this here in case you wanna check it out https://docs.prefect.io/latest/concepts/projects/#working-with-multiple-deployments
m

Matt Alhonte

05/24/2023, 1:58 AM
@Nate So with that, 1 flow would have to be 1 `.py` file and 1 `yml` file?
That sounds tolerable I guess but if there's any way to avoid the Context Switch, would definitely be preferable.
n

Nate

05/24/2023, 2:13 PM
> 1 flow would have to be 1 `.py` file and 1 `yml` file?
not necessarily, you could have 1 `deployment.yaml` that specifies N deployments (in their own python files), as shown in the example I linked above
```yaml
deployments:
  - name: deployment-1
    entrypoint: flows/hello.py:my_flow
    parameters:
        number: 42
        message: Don't panic!
    work_pool:
        name: my-process-work-pool
        work_queue_name: primary-queue

  - name: deployment-2
    entrypoint: flows/goodbye.py:my_other_flow
    work_pool:
        name: my-process-work-pool
        work_queue_name: secondary-queue

  - name: deployment-3
    entrypoint: flows/hello.py:yet_another_flow
    work_pool:
        name: my-docker-work-pool
        work_queue_name: tertiary-queue
```
😬 1
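One way to soften the extra YAML step is to generate the `deployments:` entries from the flow files themselves. The following is a minimal sketch, not a Prefect feature: it assumes flows are top-level functions decorated with `@flow`, and the helper names (`find_entrypoints`, `render_deployments`) and the single shared work pool are hypothetical.

```python
import ast
from pathlib import Path


def is_flow_decorator(node: ast.expr) -> bool:
    """True for `@flow`, `@flow(...)`, `@prefect.flow`, or `@prefect.flow(...)`."""
    if isinstance(node, ast.Call):
        node = node.func
    if isinstance(node, ast.Name):
        return node.id == "flow"
    if isinstance(node, ast.Attribute):
        return node.attr == "flow"
    return False


def find_entrypoints(flows_dir: str) -> list[str]:
    """Return 'path/to/file.py:function_name' for each flow-decorated function."""
    entrypoints = []
    for path in sorted(Path(flows_dir).glob("*.py")):
        # Parse only; the flow files are never imported or executed here.
        tree = ast.parse(path.read_text(), filename=str(path))
        for node in tree.body:  # top-level definitions only
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and any(
                is_flow_decorator(d) for d in node.decorator_list
            ):
                entrypoints.append(f"{path.as_posix()}:{node.name}")
    return entrypoints


def render_deployments(entrypoints: list[str], work_pool: str) -> str:
    """Render a `deployments:` section shaped like the example above."""
    lines = ["deployments:"]
    for entrypoint in entrypoints:
        flow_name = entrypoint.rsplit(":", 1)[1]
        lines += [
            f"  - name: {flow_name.replace('_', '-')}",
            f"    entrypoint: {entrypoint}",
            "    work_pool:",
            f"        name: {work_pool}",
            "",
        ]
    return "\n".join(lines)
```

Calling `render_deployments(find_entrypoints("flows"), "my-process-work-pool")` from CI would return text shaped like the entries above. Per-deployment parameters and work queues are not covered by this sketch and would still have to come from somewhere else.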
m

Matt Alhonte

05/24/2023, 8:04 PM
ack, so there's no way to just define everything in a `py` file? Minimally, it's Python file + a new entry in a `yaml` file?
Or alternatively running every `py` file as a script one-by-one?
a

alex

05/24/2023, 8:30 PM
Yeah, those are currently your two options for deploying lots of flows. Would you mind sharing what you enjoy about defining your deployments in Python or what causes you hesitation about defining deployments in YAML? As Nate mentioned, projects are still in beta, so any feedback you can provide would be super appreciated!
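The second option, running every file as a script, could be sketched as below. This is a hedged illustration only: `run_script` and `run_all` are hypothetical helper names, and it assumes each file builds and applies its own deployment when executed. Running the scripts on a small thread pool is one possible way to reduce wall-clock time; whether it helps in practice is not measured here.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_script(path: Path) -> tuple[Path, int]:
    """Run one file with the current interpreter and return its exit code."""
    result = subprocess.run(
        [sys.executable, str(path)], capture_output=True, text=True
    )
    return path, result.returncode


def run_all(flows_dir: str, max_workers: int = 4) -> dict[str, int]:
    """Run every top-level .py file in `flows_dir`; map file name to exit code."""
    scripts = sorted(Path(flows_dir).glob("*.py"))
    # Threads are enough here: each worker just waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(run_script, scripts)
        return {path.name: code for path, code in results}
```

A CI step could call `run_all("flows")` and fail the build if any exit code is non-zero.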
m

Matt Alhonte

05/24/2023, 8:32 PM
@alex So, deploying a new flow being a 1-step process ("just put new file in Flow folder") was wonderfully handy, and very much in keeping with the "Reduce/Eliminate Negative Engineering" philosophy that I like about Prefect
Introducing a Context Switch ("put flow in folder -> write some boilerplate in a `yaml` or `py` file telling the system about it") just seems like a lot more extra Friction & Negative Engineering.
And running the `py` files one-by-one in another script will be an okay approximation, but I'm pretty sure it'll be a lot slower. We used to do it that way in our CI/CD pipeline, but when the `prefect register <folder>` CLI command came out it shaved like 10 minutes off every run (which of course trickles down into much tighter feedback loops)
tbh I also don't really like having to turn the Flows into little scripts at the end either. like tacking something to the effect of
```python
if __name__ == "__main__":
    memory = <number>
    base_args = make_deployment_args(memory)
    storage = S3Bucket.load(<name>)
    deployment = Deployment.build_from_flow(
        flow=log_flow,
        parameters={"name": "Marvin"},
        storage=storage,
        **base_args
    )
```
to the end of every Flow isn't THAT bad, but it's more boilerplate and less Declarative than it was in 1.0
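A possible alternative to the per-file `__main__` block is a single discovery script that imports each file and collects the flow objects, so the shared deployment arguments live in one place. The sketch below is an assumption-laden outline, not Prefect's API: `load_module` and `discover` are hypothetical helpers, and the predicate is left generic. With Prefect 2 the predicate would presumably be something like `lambda obj: isinstance(obj, Flow)`, with each result passed to `Deployment.build_from_flow`.

```python
import importlib.util
from pathlib import Path
from types import ModuleType
from typing import Any, Callable, Iterator


def load_module(path: Path) -> ModuleType:
    """Import a file by path without requiring it to be on sys.path."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    # The module runs under its own name, so `if __name__ == "__main__":`
    # blocks inside the flow files are skipped.
    spec.loader.exec_module(module)
    return module


def discover(
    flows_dir: str, predicate: Callable[[Any], bool]
) -> Iterator[tuple[str, Any]]:
    """Yield (module_name, obj) for module-level objects matching `predicate`."""
    for path in sorted(Path(flows_dir).glob("*.py")):
        module = load_module(path)
        for obj in vars(module).values():
            if predicate(obj):
                yield module.__name__, obj
```

Because everything is imported in one process, the interpreter and library start-up cost is paid once rather than per file. One caveat: an object imported into several files would be yielded once per file, so a real script would need to deduplicate.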
a

alex

05/24/2023, 8:46 PM
Thanks for sharing that context! Creating a similar experience in Prefect 2.0 will be tricky since deployments and flow are separate but related concepts, whereas, in 1.0, everything was bundled into the flow. How often do things like schedules and storage differ between your deployments?
m

Matt Alhonte

05/24/2023, 8:47 PM
Storage: Essentially never
Schedules: Sometimes! Like we have flows that deal with different clients whose data is different sizes, where it can make sense to deploy with different-sized containers.
"We run Feature Engineering every 2 weeks for Client X, who needs containers of size Y" is a conceivable thing that'd be handy to be able to express.
a

alex

05/24/2023, 8:55 PM
Gotcha, that’s doable, but as you said, you’ll need to write some additional Python or YAML to express that. Thanks for raising these friction points! We’ll have to think a little to come up with good ways to address them.
👍 1
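As a rough illustration of what that "additional Python" might look like while staying declarative, the statement "Feature Engineering every 2 weeks for Client X with containers of size Y" could be held as a small table of specs. Everything here is hypothetical: the `DeploymentSpec` class, the `to_kwargs` helper, the keyword names it emits, and the placeholder values are illustrative and are not Prefect arguments.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DeploymentSpec:
    """One row: flow F runs every N days for client C with M MiB of memory."""

    flow_name: str
    client: str
    every: timedelta
    memory_mib: int

    @property
    def deployment_name(self) -> str:
        return f"{self.flow_name}-{self.client}"


# Hypothetical values; "client-x" and 8192 stand in for Client X and size Y.
SPECS = [
    DeploymentSpec("feature_engineering", "client-x", timedelta(weeks=2), 8192),
]


def to_kwargs(spec: DeploymentSpec) -> dict:
    """Flatten a spec into plain keyword arguments for whatever builds the deployment."""
    return {
        "name": spec.deployment_name,
        "parameters": {"client": spec.client},
        "interval_seconds": int(spec.every.total_seconds()),
        "memory_mib": spec.memory_mib,
    }
```

Keeping the table in the same file as the flow would avoid the context switch discussed above; mapping these keys onto real schedule and infrastructure settings is left open.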
m

Matt Alhonte

05/24/2023, 8:56 PM
As long as it's within the same file, it's not a big deal. the Expensive friction points are Context Switches