Prefect Community

I have 3 flows for a data workflow (flow 1 = extracts the data, flow 2 = prepares the data, flow 3 = creates the training set).
For the next step of our work, we want to run this workflow (all 3 flows) in parallel, 1000+ times, each time with different parameters.
Our concerns are:
1- memory
2- how to set up the pipeline so that 1000+ workflows run in parallel while the 3 flows inside each workflow run sequentially
Any thoughts?

What are your concerns with memory?
Your infrastructure will need enough memory to carry out whatever steps your tasks perform. What does your infrastructure look like to support this? How large are the datasets that need to be processed?

You want to run 1000 flows in parallel - is there a particular requirement that they all have to be started at the same time? Regardless, you can create flow runs at whatever frequency you require - each flow / flow deployment can spawn a flow run. If you want to spawn more than one, you can create any number of flow runs from a given deployment - <https://docs.prefect.io/api-ref/prefect/deployments/#prefect.deployments.run_deployment>
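
To make the shape of this concrete, here is a minimal sketch of "parallel across parameter sets, sequential within each workflow", written with plain `asyncio` rather than Prefect so it runs on its own. The stage names, `fake_trigger`, and the `max_concurrent` cap are hypothetical placeholders. In a real setup the trigger would probably wrap a call such as `run_deployment` from the link above, but that wiring is an assumption and is not shown here.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical stage names standing in for flow 1, flow 2 and flow 3.
STAGES = ["extract", "prepare", "build_training_set"]

# A trigger starts one stage for one parameter set and waits for it to finish.
Trigger = Callable[[str, Dict[str, Any]], Awaitable[Any]]


async def run_workflow(
    trigger: Trigger, params: Dict[str, Any], limit: asyncio.Semaphore
) -> List[Any]:
    """Run the three stages one after another for a single parameter set."""
    async with limit:  # at most `max_concurrent` workflows are active at once
        results = []
        for stage in STAGES:
            # Awaiting here keeps the stages sequential within one workflow.
            results.append(await trigger(stage, params))
        return results


async def run_all(
    trigger: Trigger, param_sets: List[Dict[str, Any]], max_concurrent: int = 50
) -> List[List[Any]]:
    """Fan out over all parameter sets, with a cap to protect memory."""
    limit = asyncio.Semaphore(max_concurrent)
    return await asyncio.gather(
        *(run_workflow(trigger, p, limit) for p in param_sets)
    )


async def fake_trigger(stage: str, params: Dict[str, Any]) -> str:
    """Stand-in for real work; a real trigger would start a flow run."""
    await asyncio.sleep(0)
    return f"{stage}:{params['run_id']}"


def demo(n: int = 1000) -> List[List[str]]:
    param_sets = [{"run_id": i} for i in range(n)]
    return asyncio.run(run_all(fake_trigger, param_sets, max_concurrent=50))
```

The semaphore speaks to the memory concern: all 1000+ workflows are scheduled up front, but only `max_concurrent` of them hold resources at any moment, so that cap can be tuned to whatever the infrastructure can handle.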

<@U03FKQX1TH9> We haven't decided on the infrastructure yet; that's what I am evaluating now.