Aaron Gonzalez

02/20/2023, 4:07 PM
QQ for those with experience using GCP's Cloud Run parallelism. I'm going to run a bunch of commands in parallel, but I'm not sure whether I should keep everything in Prefect-land, using the ``ConcurrentTaskRunner`` and maybe chunking my commands into groups of about 10 (10 would be more than enough), or use the native parallel options available to Cloud Run.
You can structure a job as a single task or as multiple, independent tasks (up to 10,000 tasks) that can be executed in parallel. Each task runs one container instance and can be configured to retry in case of failure. Each task is aware of its index, which is stored in the ``CLOUD_RUN_TASK_INDEX`` environment variable. The overall count of tasks is stored in the ``CLOUD_RUN_TASK_COUNT`` environment variable. If you are processing data in parallel, your code is responsible for determining which task handles which subset of the data.
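For the "your code is responsible for determining which task handles which subset" part, a minimal sketch of one way to do it might look like this. The helper name ``commands_for_this_task`` and the fallback values (index 0, count 1, so it also runs outside Cloud Run) are assumptions for illustration; only the two environment variable names come from the quoted text.

```python
import os


def commands_for_this_task(commands, task_index=None, task_count=None):
    """Return the subset of `commands` that this task should handle.

    Hypothetical helper. Falls back to index 0 / count 1 when the
    Cloud Run variables are not set (an assumption for local runs).
    """
    if task_index is None:
        task_index = int(os.environ.get("CLOUD_RUN_TASK_INDEX", "0"))
    if task_count is None:
        task_count = int(os.environ.get("CLOUD_RUN_TASK_COUNT", "1"))
    # Strided split: task i takes items i, i + count, i + 2 * count, ...
    # Every command lands in exactly one task's subset.
    return commands[task_index::task_count]
```

A strided slice is used here because it needs no knowledge of the list length up front; contiguous ranges would work just as well.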
FWIW both options seem easy enough to implement. I just wasn't sure if anyone had first-hand experience. I guess it would also depend on what exactly my commands are (e.g. just a bunch of small ``gsutil rsync``s).
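The "chunk into groups of 10" idea could be sketched roughly as below. This is not Prefect code: a plain thread pool stands in for whatever the ``ConcurrentTaskRunner`` would do, and the names ``chunked``, ``run_command``, and ``run_in_chunks`` are made up for illustration. Each command is assumed to be an argument list such as a ``gsutil rsync`` invocation.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split `items` into consecutive groups of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_command(command):
    """Run one command (a list of arguments) and return its exit code."""
    return subprocess.run(command, check=False).returncode


def run_in_chunks(commands, size=10, runner=run_command):
    """Run commands one group at a time, each group in parallel.

    Results come back in the same order as `commands`.
    """
    results = []
    for group in chunked(commands, size):
        # One thread per command in the group; the `with` block waits
        # for the whole group to finish before the next one starts.
        with ThreadPoolExecutor(max_workers=len(group)) as pool:
            results.extend(pool.map(runner, group))
    return results
```

Threads are a reasonable fit here because the work is spent waiting on subprocesses rather than on Python computation.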

Thet Naing

02/20/2023, 5:42 PM
Commenting to hear what others say. Have never tried Cloud Run's parallelism feature myself