Thread
#prefect-community

    Stephen Lloyd

    3 months ago
    Hi. I’m running on ECS with the local Dask executor. If I understand my config correctly, I’m running with 512 cpu and 1024 memory. I have a process that does a fairly big mapping (could be 1000s or 100ks of API calls). I’d like to run with as many workers as possible. I am using 16 workers with the default scheduler. This gets me about 16 tasks per second on my laptop, but ECS performance seems to be more like 4-5 tasks per second. It seems like DaskExecutor might be an option, but I don’t quite understand all the setup I would need to do.
    My current config
    def set_run_config(flow_name: str) -> RunConfig:
        aws_account_id = Secret('AWS_ACCOUNT_ID-' + RUN_ENV).get()
        # aws_account_id = AWS_ACCOUNT_ID['dev']
        return ECSRun(
            cpu=512,
            memory=1024,
            run_task_kwargs=dict(cluster=prefect_cluster[RUN_ENV], launchType='FARGATE'),
            execution_role_arn=f'arn:aws:iam::{aws_account_id}:role/prefectECSAgentTaskExecutionRole',
            task_role_arn=f'arn:aws:iam::{aws_account_id}:role/prefectTaskRole',
            image=f'{aws_account_id}.dkr.ecr.us-east-1.amazonaws.com/{flow_name}:latest',
            labels=[RUN_ENV],
        )
    I saw an example snippet for the DaskExecutor here, but am unsure if that will spin up a temp cluster in Fargate without any additional prep from me or not.
    Going back to LocalDaskExecutor, I honestly don’t know enough about whether running processes vs. threads will help me, and my script blew up with errors when I switched the scheduler to “processes”. Obviously there’s config I’m missing.
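    On the threads vs. processes question, here is a minimal sketch using only the Python standard library rather than Prefect itself. It assumes the mapped work is I/O-bound (waiting on API responses), which is the case where a thread pool usually helps, since a thread waiting on the network does not hold the interpreter. `fake_api_call` and `run_mapped` are hypothetical names made up for illustration, and the sleep stands in for a real request. A process-based scheduler generally has to pickle task inputs and outputs to move them between processes, which is one common source of errors, though nothing in this thread confirms that was the cause here.

    ```python
    import time
    from concurrent.futures import ThreadPoolExecutor


    def fake_api_call(item: int) -> int:
        """Hypothetical stand-in for one API request: wait briefly, then return."""
        time.sleep(0.05)  # simulates network latency; no CPU work happens here
        return item * 2


    def run_mapped(items, num_workers: int = 16):
        """Map fake_api_call over items with a pool of threads, keeping input order."""
        with ThreadPoolExecutor(max_workers=num_workers) as pool:
            return list(pool.map(fake_api_call, items))


    if __name__ == "__main__":
        start = time.perf_counter()
        results = run_mapped(range(64))
        elapsed = time.perf_counter() - start
        print(f"{len(results)} calls finished in {elapsed:.2f}s")
    ```

    With real API calls the achievable rate also depends on per-task overhead and on how much CPU the container has, so the numbers from a sketch like this will not transfer directly to ECS.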

    Anna Geller

    3 months ago
    I’m running with 512 cpu and 1024 memory
    In this setup you don't have much capacity to parallelize over: it's half a CPU core and 1 GB of memory. Check this: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html
    The easiest first step would be to increase the size on your task definition, use LocalDaskExecutor in your flow running there, and see if that helps. Using DaskExecutor with the dask cloud provider FargateCluster brings a whole lot of complexity, so I would only use it as a last resort if increasing the capacity on your task definition didn't help.
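    To make the sizing arithmetic above concrete, here is a small sketch in plain Python. It assumes the usual ECS convention that 1024 CPU units equal 1 vCPU, and a table of Fargate CPU/memory pairings as I understand them from the AWS page linked above; treat the table as an assumption and check it against the current AWS documentation before picking a size. `vcpus` and `is_valid_fargate_size` are hypothetical helper names, not part of Prefect or any AWS SDK.

    ```python
    # Assumed Fargate pairings: CPU units -> allowed memory values in MiB.
    # Verify against the AWS task size documentation before relying on this.
    FARGATE_MEMORY_MIB = {
        256: [512, 1024, 2048],
        512: list(range(1024, 4096 + 1, 1024)),
        1024: list(range(2048, 8192 + 1, 1024)),
        2048: list(range(4096, 16384 + 1, 1024)),
        4096: list(range(8192, 30720 + 1, 1024)),
    }


    def vcpus(cpu_units: int) -> float:
        """Convert ECS CPU units to vCPUs (1024 units = 1 vCPU)."""
        return cpu_units / 1024


    def is_valid_fargate_size(cpu_units: int, memory_mib: int) -> bool:
        """Return True if the pair appears in the assumed table above."""
        return memory_mib in FARGATE_MEMORY_MIB.get(cpu_units, [])


    if __name__ == "__main__":
        # The config from this thread: cpu=512, memory=1024
        print(vcpus(512), "vCPU; valid pairing:", is_valid_fargate_size(512, 1024))
        # One possible larger size to try first: 2 vCPU with 4 GB
        print(vcpus(2048), "vCPU; valid pairing:", is_valid_fargate_size(2048, 4096))
    ```

    Under these assumptions, a next step in the spirit of the advice above would be passing something like cpu=2048 and memory=4096 to ECSRun while keeping LocalDaskExecutor.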

    Stephen Lloyd

    3 months ago
    brings a whole lot of complexity
    It seemed that way. Thanks!