# prefect-community
Hi. I’m running on ECS with the local dask executor. If I understand my config correctly, I’m running with 512 cpu and 1024 memory. I have a process that does a fairly big mapping (could be 1,000s or 100,000s of API calls), and I’d like to run with as many workers as possible. I am using 16 workers with the default scheduler. This gets me about 16 tasks per second on my laptop, but ECS performance seems to be more like 4-5 tasks per second. It seems like the DaskExecutor might be an option, but I don’t quite understand all the setup I would need to do.
My current config
from prefect.client import Secret
from prefect.run_configs import ECSRun, RunConfig

def set_run_config(flow_name: str) -> RunConfig:
    aws_account_id = Secret('AWS_ACCOUNT_ID-' + RUN_ENV).get()
    # aws_account_id = AWS_ACCOUNT_ID['dev']
    return ECSRun(
        cpu=512,
        memory=1024,
        run_task_kwargs=dict(cluster=prefect_cluster[RUN_ENV], launchType='FARGATE'),
    )
I saw an example snippet for the DaskExecutor here, but I’m unsure whether it will spin up a temporary cluster in Fargate without any additional prep from me.
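For reference, a temporary Fargate cluster is not automatic: you would point `DaskExecutor` at dask-cloudprovider's `FargateCluster` and supply the AWS-side settings yourself. A minimal sketch for Prefect 1.x — the `cluster_kwargs` values here are placeholder assumptions, not working settings:

```python
from prefect.executors import DaskExecutor

# Sketch only: image and worker count are hypothetical examples.
# FargateCluster also needs IAM roles, VPC/subnet access, etc. configured
# on the AWS side before this will actually start workers.
executor = DaskExecutor(
    cluster_class="dask_cloudprovider.aws.FargateCluster",
    cluster_kwargs={
        "image": "prefecthq/prefect:latest",  # image the Dask workers run
        "n_workers": 4,                       # initial number of Fargate workers
    },
)
```

You would then set `flow.executor = executor` when registering the flow.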
Going back to LocalDaskExecutor: I honestly don’t know enough about whether running processes vs. threads will help me, and my script blew up with errors when I switched the scheduler to “processes”. Obviously there’s config I’m missing.
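For what it's worth, mapped API calls are I/O-bound, so the "threads" scheduler usually parallelizes them fine; "processes" additionally requires tasks and their arguments to be picklable, which is a common source of those errors. A toy illustration (plain `concurrent.futures`, not Prefect) of threads overlapping I/O waits:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_api_call(i):
    # Simulate an I/O-bound request: sleeping releases the GIL,
    # so other threads make progress while this one waits.
    time.sleep(0.05)
    return i * 2

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(fake_api_call, range(32)))
elapsed = time.perf_counter() - start

# Serially this would take 32 * 0.05 = 1.6s; with 16 threads the
# calls overlap and finish in roughly two 0.05s batches.
assert results == [i * 2 for i in range(32)]
assert elapsed < 1.6
```

The same reasoning applies to `LocalDaskExecutor(scheduler="threads")`: threads are cheap and sufficient when tasks spend their time waiting on the network rather than computing.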
I’m running with 512 cpu and 1024 memory
In this setup you don’t have much capacity to parallelize over — it’s half a CPU core and 1 GB of memory. Check this: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html
The easiest first step would be to increase this size on your task definition and use LocalDaskExecutor in the flow running there, and see if that helps. Using DaskExecutor with the dask-cloudprovider FargateCluster brings a whole lot of complexity, so I would only use it as a last resort if increasing the capacity on your task definition didn’t help.
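Concretely, that first step might look like the sketch below (Prefect 1.x). The 4096/8192 sizing is just an example I'm assuming here — pick any valid Fargate CPU/memory combination for your workload:

```python
from prefect.executors import LocalDaskExecutor
from prefect.run_configs import ECSRun

# Example sizing: 4 vCPUs / 8 GB. Fargate only accepts certain
# cpu/memory pairs (e.g. 4096 CPU requires 8192-30720 MB).
run_config = ECSRun(
    cpu=4096,
    memory=8192,
)

# Threads are usually enough for I/O-bound mapped API calls.
executor = LocalDaskExecutor(scheduler="threads", num_workers=16)
```

With more cores behind the task, the same LocalDaskExecutor settings have actual capacity to spread the mapped tasks over.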
brings a whole lot of complexity
It seemed that way. Thanks!