# prefect-community
s
Hi. I’m running on ECS with the local dask executor. If I understand my config correctly, I’m running with 512 cpu and 1024 memory. I have a process that does a fairly big mapping (could be 1000s or 100ks of API calls). I’d like to run with as many workers as possible. I am using 16 workers with the default scheduler. This gets me about 16 tasks per second on my laptop, but ECS performance seems to be more like 4-5 tasks per second. It seems like
DaskExecutor
might be an option, but I don’t quite understand all the setup I would need to do.
My current config
from prefect.client import Secret
from prefect.run_configs import ECSRun, RunConfig

def set_run_config(flow_name: str) -> RunConfig:
    aws_account_id = Secret('AWS_ACCOUNT_ID-' + RUN_ENV).get()
    # aws_account_id = AWS_ACCOUNT_ID['dev']
    return ECSRun(
        cpu=512,
        memory=1024,
        run_task_kwargs=dict(cluster=prefect_cluster[RUN_ENV], launchType='FARGATE'),
        execution_role_arn=f'arn:aws:iam::{aws_account_id}:role/prefectECSAgentTaskExecutionRole',
        task_role_arn=f'arn:aws:iam::{aws_account_id}:role/prefectTaskRole',
        image=f'{aws_account_id}.dkr.ecr.us-east-1.amazonaws.com/{flow_name}:latest',
        labels=[RUN_ENV],
    )
I saw an example snippet for the DaskExecutor here, but am unsure whether it will spin up a temporary cluster in Fargate without any additional prep from me.
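For reference, a minimal sketch of what that temporary-cluster setup might look like in Prefect 1.x, assuming the dask-cloudprovider AWS extra is installed in the flow's image (the image name and worker count below are placeholders, and `flow` stands for your Flow object):

```python
from prefect.executors import DaskExecutor

# Sketch only: this spins up a temporary Dask cluster on Fargate at
# flow-run time. It requires `pip install "dask-cloudprovider[aws]"`
# baked into the image, plus IAM permissions for the task role to
# launch ECS tasks -- extra setup beyond the ECSRun config itself.
flow.executor = DaskExecutor(
    cluster_class="dask_cloudprovider.aws.FargateCluster",
    cluster_kwargs={
        "image": "prefecthq/prefect:latest",  # placeholder image
        "n_workers": 4,                       # placeholder worker count
    },
)
```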
Going back to LocalDaskExecutor, I honestly don’t know whether running processes vs. threads will help me, and my script blew up with errors when I switched the scheduler to “processes”. Obviously there’s config I’m missing.
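As a general rule (not Prefect-specific): API calls are I/O-bound, so threads usually parallelize them well, while a processes scheduler requires the task function and its arguments to survive a pickle round-trip, which is a common source of the errors described above. A small stdlib sketch of the difference:

```python
import pickle
from concurrent.futures import ThreadPoolExecutor

def call_api(i: int) -> int:
    # Stand-in for an I/O-bound API call; threads overlap the waiting time.
    return i * 2

# Threads: no pickling needed, cheap to start, good for I/O-bound work.
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(call_api, range(10)))
print(results)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

# Processes: everything sent to a worker must be picklable.
pickle.dumps(call_api)  # module-level function: picklable
# A lambda or locally defined closure would fail to pickle here,
# which is the kind of blow-up switching to "processes" can cause.
```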
a
I’m running with 512 cpu and 1024 memory
in this setup you don't have much capacity to parallelize over: 512 cpu is half a CPU core, and 1024 memory is 1 GB. Check this https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html
the easiest first step would be to increase this size in your task definition and use
LocalDaskExecutor
in the flow running there, and see if that helps. Using DaskExecutor with the dask-cloudprovider FargateCluster brings a whole lot of complexity, so I would only use it as a last resort if increasing the capacity on your task definition didn't help
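Concretely, that first step might look like the following sketch (2048/4096 is one valid Fargate cpu/memory combination; the `num_workers` value is a starting point to tune, and `flow` stands for your Flow object):

```python
from prefect.executors import LocalDaskExecutor
from prefect.run_configs import ECSRun

# Give the flow-run container more headroom: on Fargate, cpu=2048
# (2 vCPU) must be paired with memory between 4096 and 16384 MB.
flow.run_config = ECSRun(cpu=2048, memory=4096)

# Threads are usually the right scheduler for I/O-bound API calls;
# num_workers can exceed the vCPU count since threads mostly wait on I/O.
flow.executor = LocalDaskExecutor(scheduler="threads", num_workers=32)
```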
s
brings a whole lot of complexity
It seemed that way. Thanks!