Hi, I am trying to set up task orchestration, where all the tasks will be AWS Batch jobs or similar (no actual processing happens within a task).
What's the recommended configuration here (e.g. what agent to use, how to configure)? The setup should accommodate some concurrency (say up to 10 flows run concurrently, and up to 40 tasks run concurrently), with minimal resource usage, as these should only be API calls.
Tadej Svetina
10/03/2021, 9:25 AM
And maybe a more specific question: I have experience with Airflow, where the local executor forked each task as a separate process, which consumed ~100MB per task just to create it. Is this also the case with local concurrency in Prefect?
Kevin Kho
10/04/2021, 1:41 PM
Hey @Tadej Svetina, it sounds like you just need a lightweight Local Agent and then you can kick off those batch jobs. Each flow will still take a process on the machine, meaning that you just need enough hardware to support 10 flow runs concurrently. I don’t imagine you need it to be a powerful computer.
Kevin Kho
10/04/2021, 1:42 PM
If you are using the local executor, I think it’s like 20KB at most per task
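To illustrate why tasks that only make API calls can run concurrently at low cost, here is a minimal standard-library sketch. It is not Prefect's implementation or API; `submit_batch_job` is a hypothetical stand-in for a real call such as an AWS Batch job submission, and the limit of 40 is taken from the question above.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_TASKS = 40  # upper bound on concurrent tasks, from the question


def submit_batch_job(job_name: str) -> str:
    """Hypothetical stand-in for a remote API call (e.g. submitting a batch job)."""
    time.sleep(0.05)  # simulate network latency; no local processing happens here
    return f"{job_name}-submitted"


def run_tasks(job_names):
    # All tasks run as threads inside one process, so no per-task process is forked.
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_TASKS) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(submit_batch_job, job_names))


if __name__ == "__main__":
    results = run_tasks([f"job-{i}" for i in range(40)])
    print(len(results), results[0])
```

Because each task spends its time waiting on the network, threads in a single process are usually enough here, and the memory cost per task stays small compared with forking a process per task.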