Quite an AWS-specific question. I have a batch job (created before I started using Prefect) that calls a Lambda function approximately 1000 times, each time with slightly different args (it is processing images on S3). The Lambdas all happen to request the same file on S3 (used as a reference image), and I was hitting S3 rate limits, so taking the lazy route I just throttled the Lambdas by running them in serial using
. I also log the responses from the Lambdas to a db, currently at the end of the batch job. My question is whether I am better off sticking with running the batch job from Prefect, or using Prefect to call the 1000 Lambdas in parallel using lambda
mode, then relying on Prefect's ability to handle retries to manage the show? Also, a related question: when is the best time to log to the db? Thanks in advance! P.S. Loving Prefect way more than Airflow.
1 year ago
Hi @Robin Cole! I think you can use Prefect to run the AWS Lambdas with a task concurrency limit (if you still need to limit the rate). I think logging at the end of the batch job is better if possible, so that you only have one write operation.
I honestly feel either setup works, but I will add some thoughts. With the batch job, Prefect just calls one task, but if Prefect calls the Lambdas directly, you get monitoring of those tasks in the Prefect UI.
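To make the suggestion above concrete, here is a rough sketch of the pattern in plain Python: a capped number of calls in flight, a retry per call, and a single db write at the end. It is not Prefect code; in Prefect the concurrency cap and retries would come from task concurrency limits and task retry settings. The names `invoke_with_retries`, `run_batch`, `log_responses`, and `fake_invoke`, the `responses` table, and the use of SQLite are all assumptions made for illustration.

```python
import sqlite3
import time
from concurrent.futures import ThreadPoolExecutor


def invoke_with_retries(invoke, args, max_retries=3, delay_s=0.0):
    """Call invoke(args), retrying on any exception up to max_retries extra times."""
    for attempt in range(max_retries + 1):
        try:
            return invoke(args)
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            time.sleep(delay_s)


def run_batch(invoke, all_args, max_concurrency=10, max_retries=3):
    """Run invoke over all_args with at most max_concurrency calls in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map preserves input order and re-raises the first failure.
        return list(
            pool.map(lambda a: invoke_with_retries(invoke, a, max_retries), all_args)
        )


def log_responses(conn, rows):
    """Write all (image_key, status) rows in a single transaction."""
    conn.execute("CREATE TABLE IF NOT EXISTS responses (image_key TEXT, status TEXT)")
    with conn:  # commits once on success, rolls back on error
        conn.executemany("INSERT INTO responses VALUES (?, ?)", rows)


def fake_invoke(image_key):
    # Hypothetical stand-in for the real Lambda call. With boto3 this would be
    # something like boto3.client("lambda").invoke(FunctionName=..., Payload=...).
    return (image_key, "ok")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    rows = run_batch(fake_invoke, [f"images/{i}.png" for i in range(1000)])
    log_responses(conn, rows)
    print(conn.execute("SELECT COUNT(*) FROM responses").fetchone()[0])
```

Lowering `max_concurrency` is the knob that would ease the S3 rate limiting on the shared reference image without going all the way back to fully serial execution.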