Nikola Lusic
06/01/2021, 10:25 AM
Does an ECSRun configuration combined with LocalDaskExecutor(scheduler="processes", num_workers=4) support parallel execution of mapped tasks?
Currently I'm unable to get the ECS task to spawn any additional processes - all are run in sequence (first image).
When running the same flow on the local Prefect environment, the tasks are all done in parallel (second image).
If I use LocalDaskExecutor(scheduler="threads", num_workers=4), the flow tasks are executed in parallel, but a threaded flow only covers part of our use cases.
ciaran
06/01/2021, 10:35 AM
ciaran
06/01/2021, 10:36 AM
LocalDaskExecutor will run on the one container, so it won't spin off extra ECS tasks
Nikola Lusic
06/01/2021, 10:36 AM
Task memory (MiB): 8192
Task CPU (unit): 4096
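For reference, ECS expresses CPU in units where 1024 units equal one vCPU, so 4096 units correspond to 4 vCPUs. A minimal sketch for checking how many CPUs Python actually sees from inside the container follows. The helper name visible_cpus is hypothetical, and the sketch assumes a short script can be run in the flow image. Where it is available (for example on Linux), os.sched_getaffinity(0) reports the CPUs the current process may run on, which can differ from os.cpu_count().

```python
import os


def visible_cpus():
    """Return the CPU counts Python reports for the current process."""
    info = {"cpu_count": os.cpu_count()}  # may be None if undetermined
    # sched_getaffinity only exists on some platforms (e.g. Linux).
    if hasattr(os, "sched_getaffinity"):
        info["affinity"] = len(os.sched_getaffinity(0))
    return info


if __name__ == "__main__":
    print(visible_cpus())
```

If the reported numbers are lower than the num_workers value passed to the executor, the worker processes would have to share cores.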
Nikola Lusic
06/01/2021, 10:37 AM
ciaran
06/01/2021, 10:37 AM
Nikola Lusic
06/01/2021, 10:39 AM
ciaran
06/01/2021, 10:40 AM
Nikola Lusic
06/01/2021, 12:45 PM
from prefect import Flow, task
from prefect.engine.results import S3Result
from prefect.executors import LocalDaskExecutor
from prefect.run_configs import ECSRun
from prefect.storage import S3

@task
def process(key):
    import time
    time.sleep(30)

with Flow(
    name="opal_copy_test",
) as flow:
    keys = [1, 2, 3, 4]
    process.map(keys)

flow.executor = LocalDaskExecutor(scheduler="processes", num_workers=4)
run_config = ECSRun(
    labels=["python3.7"],
    image="<DOCKER_IMAGE>",
)
storage = S3(bucket="<S3_BUCKET>")
result = S3Result(bucket="<S3_BUCKET>")
flow.run_config = run_config
flow.storage = storage

if __name__ == '__main__':
    flow.register(project_name='test')
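One way to confirm whether the mapped runs in a flow like the one above really overlap is to have each task return its process ID and start/end timestamps, then compare the intervals. The sketch below uses only the standard library. timed_work and ran_in_parallel are hypothetical helper names, not part of Prefect; inside the flow, the body of timed_work would go into the process task.

```python
import os
import time


def timed_work(key, seconds=0.1):
    """Do the work for one key and report where and when it ran."""
    start = time.time()
    time.sleep(seconds)  # stand-in for the real task body
    return {"key": key, "pid": os.getpid(), "start": start, "end": time.time()}


def ran_in_parallel(records):
    """Return True if any two recorded runs overlap in time."""
    ordered = sorted(records, key=lambda r: r["start"])
    latest_end = None
    for rec in ordered:
        # A run that starts before an earlier run has ended overlaps it.
        if latest_end is not None and rec["start"] < latest_end:
            return True
        latest_end = rec["end"] if latest_end is None else max(latest_end, rec["end"])
    return False


if __name__ == "__main__":
    # Called back to back in one process, the runs are sequential.
    runs = [timed_work(k, seconds=0.05) for k in [1, 2, 3, 4]]
    print(ran_in_parallel(runs))
```

In the ECS run, several distinct pid values with overlapping intervals would suggest that the processes scheduler is fanning out; a single pid with back-to-back intervals would match the sequential behaviour described in this thread. With the threads scheduler, the pid stays the same while the intervals overlap.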
When registering it for the local environment, the storage, result and run_config parts are removed.
Also, I change ~/.prefect/backend.toml to point to the ECS Prefect Server when registering flows to ECS.
ciaran
06/01/2021, 1:23 PM
Nikola Lusic
06/01/2021, 1:29 PM
Nikola Lusic
06/01/2021, 1:35 PM
prefect server start
agent deployment (options.yaml contains network configuration):
prefect agent ecs start \
    --task-role-arn=arn:aws:iam::<ACCOUNT_ID>:role/ECSTaskS3ECRRole \
    --execution-role-arn arn:aws:iam::<ACCOUNT_ID>:role/ECSTaskS3ECRRole \
    --log-level INFO \
    --label ecs-dev \
    --label python3.7 \
    --name ecs-test \
    --cluster prefect-cluster \
    --run-task-kwargs /home/ubuntu/prefect-agent/options.yaml \
    --api http://<PREFECT_SERVER_IP>:4200
Kevin Kho
processes works but threads doesn't?
Nikola Lusic
06/01/2021, 2:38 PM
threads - works as expected
processes - runs tasks in sequence instead of in parallel
Kevin Kho
Kevin Kho
Nikola Lusic
06/01/2021, 2:52 PM
Kevin Kho
Kevin Kho
Nikola Lusic
06/01/2021, 3:28 PM
Nikola Lusic
06/01/2021, 3:28 PM