This may be outside of the design criteria for Prefect.
The use-case consists of two tasks. Task A is CPU-bound. Task B is network-bound. Task A generates data that Task B uploads to a slow server.
I want a pool of Task A workers to feed a pool of Task B workers. For now, they are all subprocesses of a single parent process.
The number of Task A processes should be the same as the number of physical cores (not counting "hyperthreads"). I'm more flexible about the number of Task B processes.
Kevin Kho
04/09/2021, 3:08 PM
Hi @Paul Prescod, the Prefect way to do this would be to split out Task A and Task B and have them live on separate infrastructure. You can then use a parent Flow to orchestrate Task A and Task B (setting Task A as an upstream dependency). Is this a job that you want to run on a schedule? Is this a sequential process, or do you want Task A workers and Task B workers to be running simultaneously?
Paul Prescod
04/09/2021, 3:09 PM
Simultaneously running. Sort of "streaming" but the units of work take minutes to generate, not microseconds.
Paul Prescod
04/09/2021, 3:10 PM
The job will be user-initiated or initiated by code outside of Prefect.
Kevin Kho
04/09/2021, 3:11 PM
I think a queue-type system might serve you better, because Task A results can be placed into the queue and Task B workers can just listen to the queue.
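(Editor's sketch, not part of the original thread.) The queue-based layout suggested above could look roughly like this using only Python's standard `multiprocessing` module, with no Prefect involved. The names `task_a`, `task_b`, `a_worker`, `b_worker`, and `run_pipeline` are hypothetical placeholders, and the bodies of `task_a` and `task_b` stand in for the real CPU-bound and upload steps.

```python
import multiprocessing as mp
import os

SENTINEL = None  # marks "no more work"; real work items must never be None


def task_a(item):
    """Placeholder for the CPU-bound step (assumption: any pure computation)."""
    return sum(i * i for i in range(item))


def task_b(payload):
    """Placeholder for the slow upload (assumption: returns a receipt)."""
    return ("uploaded", payload)


def a_worker(inbox, outbox):
    # Pull work until a sentinel arrives, pushing each result downstream.
    for item in iter(inbox.get, SENTINEL):
        outbox.put(task_a(item))


def b_worker(inbox, results):
    # Consume Task A output as soon as it is available.
    for payload in iter(inbox.get, SENTINEL):
        results.put(task_b(payload))


def run_pipeline(items, n_a, n_b, start_method=None):
    items = list(items)
    ctx = mp.get_context(start_method)  # None selects the platform default
    work, produced, done = ctx.Queue(), ctx.Queue(), ctx.Queue()
    a_procs = [ctx.Process(target=a_worker, args=(work, produced)) for _ in range(n_a)]
    b_procs = [ctx.Process(target=b_worker, args=(produced, done)) for _ in range(n_b)]
    for proc in a_procs + b_procs:
        proc.start()
    for item in items:
        work.put(item)
    for _ in a_procs:
        work.put(SENTINEL)  # one sentinel per Task A worker
    for proc in a_procs:
        proc.join()
    for _ in b_procs:
        produced.put(SENTINEL)  # sent only after every Task A worker has exited
    # Drain the results before joining so no worker blocks on a full pipe.
    results = [done.get() for _ in items]
    for proc in b_procs:
        proc.join()
    return results  # completion order, not input order


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs; counting physical cores needs a
    # third-party library such as psutil (psutil.cpu_count(logical=False)).
    n_a = os.cpu_count() or 1
    print(run_pipeline([10_000, 20_000, 30_000], n_a=n_a, n_b=8))
```

Both pools run at the same time, so uploads start as soon as the first Task A result lands in the intermediate queue. The two pool sizes are independent, which matches the requirement of pinning Task A to the core count while staying flexible on Task B.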
Paul Prescod
04/09/2021, 3:12 PM
Yes, it does seem a good match for queuing software, but I plan to use Prefect elsewhere in the system so I wanted to double-check whether it had a technique for this use-case.
Kevin Kho
04/09/2021, 3:16 PM
It can be hacked together, but we don’t have any native support for “streaming”. You would either need to split out the Tasks and give up the stream (Task A finishes and then Task B starts), or you would need to lump them together on the same infrastructure and have one worker produce data with Task A and then upload it with Task B sequentially. In that second setup we’re not handling the CPU-bound and network-bound differences well, and we might be hit by both bottlenecks 😅
Paul Prescod
04/09/2021, 3:18 PM
Okay thanks! Can't be all things to all people. That matched my own research. Thanks for confirming.
Kevin Kho
04/09/2021, 3:19 PM
Thanks for asking! Feel free to ask about Prefect applications in other parts of the system.