# prefect-community
s
Hi. I’m running on ECS with the local dask executor. If I understand my config correctly, I’m running with 512 cpu and 1024 memory. I have a process that does a fairly big mapping (could be 1000s or 100ks of API calls). I’d like to run with as many workers as possible. I am using 16 workers with the default scheduler. This gets me about 16 tasks per second on my laptop, but ECS performance seems to be more like 4-5 tasks per second. It seems like
DaskExecutor
might be an option, but I don’t quite understand all the setup I would need to do.
My current config
from prefect.client import Secret
from prefect.run_configs import ECSRun, RunConfig

def set_run_config(flow_name: str) -> RunConfig:
    aws_account_id = Secret('AWS_ACCOUNT_ID-' + RUN_ENV).get()
    # aws_account_id = AWS_ACCOUNT_ID['dev']
    return ECSRun(
        cpu=512,
        memory=1024,
        run_task_kwargs=dict(cluster=prefect_cluster[RUN_ENV], launchType='FARGATE'),
        execution_role_arn=f'arn:aws:iam::{aws_account_id}:role/prefectECSAgentTaskExecutionRole',
        task_role_arn=f'arn:aws:iam::{aws_account_id}:role/prefectTaskRole',
        image=f'{aws_account_id}.dkr.ecr.us-east-1.amazonaws.com/{flow_name}:latest',
        labels=[RUN_ENV],
    )
I saw an example snippet for the DaskExecutor here, but am unsure whether it will spin up a temporary cluster in Fargate without any additional prep from me.
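For reference, a minimal sketch of what that temporary-cluster setup might look like in Prefect 1.x, assuming the dask-cloudprovider AWS extra is installed in the flow's image (the image name and worker count below are placeholders, and `flow` stands for your Flow object):

```python
from prefect.executors import DaskExecutor

# Sketch only: this spins up a temporary Dask cluster on Fargate at
# flow-run time. It requires `pip install "dask-cloudprovider[aws]"`
# baked into the image, plus IAM permissions for the task role to
# launch ECS tasks -- extra setup beyond the ECSRun config itself.
flow.executor = DaskExecutor(
    cluster_class="dask_cloudprovider.aws.FargateCluster",
    cluster_kwargs={
        "image": "prefecthq/prefect:latest",  # placeholder image
        "n_workers": 4,                       # placeholder worker count
    },
)
```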
Going back to LocalDaskExecutor, I honestly don’t know whether running processes vs. threads will help me, and my script blew up with errors when I switched the scheduler to “processes”. Obviously there’s config I’m missing.
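As a general rule (not Prefect-specific): API calls are I/O-bound, so threads usually parallelize them well, while a processes scheduler requires the task function and its arguments to survive a pickle round-trip, which is a common source of the errors described above. A small stdlib sketch of the difference:

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

def call_api(i: int) -> int:
    # Stand-in for an I/O-bound API call; threads overlap the waiting time.
    return i * 2

# Threads: no pickling needed, cheap to start, good for I/O-bound work.
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(call_api, range(10)))
print(results)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

# Processes: everything sent to a worker must be picklable.
pickle.dumps(call_api)  # module-level function: picklable
# A lambda or locally defined closure would fail to pickle here,
# which is the kind of blow-up switching to "processes" can cause.
```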
a
I’m running with 512 cpu and 1024 memory
in this setup you don't have much capacity to parallelize over: 512 cpu is half a CPU core, and 1024 memory is 1 GB. Check this https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html
the easiest first step would be to increase this size in your task definition and use
LocalDaskExecutor
in the flow running there, and see if that helps. Using DaskExecutor with the dask-cloudprovider FargateCluster brings a whole lot of complexity, so I would only use it as a last resort if increasing the capacity on your task definition didn't help
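Concretely, that first step might look like the following sketch (2048/4096 is one valid Fargate cpu/memory combination; the `num_workers` value is a starting point to tune, and `flow` stands for your Flow object):

```python
from prefect.executors import LocalDaskExecutor
from prefect.run_configs import ECSRun

# Give the flow-run container more headroom: on Fargate, cpu=2048
# (2 vCPU) must be paired with memory between 4096 and 16384 MB.
flow.run_config = ECSRun(cpu=2048, memory=4096)

# Threads are usually the right scheduler for I/O-bound API calls;
# num_workers can exceed the vCPU count since threads mostly wait on I/O.
flow.executor = LocalDaskExecutor(scheduler="threads", num_workers=32)
```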
s
brings a whole lot of complexity
It seemed that way. Thanks!