Maryam Veisi

02/27/2023, 4:51 PM
I have three flows in a data workflow (flow 1 extracts the data, flow 2 prepares the data, flow 3 creates the training set). For the next step, we want to run this three-flow workflow in parallel more than 1,000 times, each time with different parameters. Our concerns are: 1. memory, and 2. how to set up the pipeline so that 1,000+ workflows run in parallel while the subflows inside each workflow run sequentially. Any thoughts?

Christopher Boyd

02/27/2023, 8:02 PM
What are your concerns with memory? Your infrastructure will need enough memory to carry out whatever steps your tasks perform. What does your infrastructure look like to support this? How large are the datasets that need to be processed? You want to run 1,000 flows in parallel: is there a particular requirement that they all be started at the same time? Regardless, you can create flow runs at whatever frequency you require, since each flow deployment can spawn a flow run. If you want to spawn more than one, you can create any number of flow runs from a given deployment with run_deployment - https://docs.prefect.io/api-ref/prefect/deployments/#prefect.deployments.run_deployment
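As a rough illustration of that last point, here is a minimal sketch of fanning out one flow run per parameter set. It assumes Prefect 2.x and a deployment, hypothetically named `data-workflow/default`, whose parent flow calls the three flows one after another. The deployment name, the parameter names in the usage note, and the helper functions are all made up for illustration; only `run_deployment` and its `name`, `parameters`, and `timeout` arguments come from Prefect.

```python
from itertools import product
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical deployment name ("<flow name>/<deployment name>"). Assumed to
# wrap a parent flow that runs the three flows sequentially as subflows.
DEPLOYMENT_NAME = "data-workflow/default"


def build_parameter_sets(grid: Dict[str, Iterable[Any]]) -> List[Dict[str, Any]]:
    """Expand a dict of parameter lists into one dict per combination."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


def submit_with_prefect(parameters: Dict[str, Any]) -> Any:
    """Create one flow run from the deployment without waiting for it to finish."""
    # Imported lazily so the helpers above work without Prefect installed.
    from prefect.deployments import run_deployment

    # timeout=0 returns as soon as the flow run is created instead of polling
    # until the run completes.
    return run_deployment(name=DEPLOYMENT_NAME, parameters=parameters, timeout=0)


def launch_all(
    parameter_sets: List[Dict[str, Any]],
    submit: Callable[[Dict[str, Any]], Any] = submit_with_prefect,
) -> List[Any]:
    """Submit one flow run per parameter set and collect what submit returns."""
    return [submit(params) for params in parameter_sets]
```

Calling `launch_all(build_parameter_sets({"region": ["eu", "us"], "year": [2021, 2022]}))` would request four flow runs, one per combination. How many of them actually execute at the same time, and how much memory each one gets, then depends on the agents and infrastructure that pick the runs up.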

Maryam Veisi

02/27/2023, 11:19 PM
@Christopher Boyd We haven't decided on the infrastructure yet; that's what I'm evaluating now.