<@ULVA73B9P> For the following assume a Docker wor...
# ask-marvin
b
@Marvin For the following assume a Docker worker. In a Prefect 3.x flow I can trigger tasks and other (sub-)flows. I have a workflow that requires some steps to use GPU infrastructure, and ideally I don't want to run everything on one single worker. 1. Is my understanding correct that any subflow will be run on the same worker? 2. Is it possible to distribute tasks to other workers (with different infrastructure)? 3. If need be, can a Prefect flow also trigger a run for another deployment and wait for it to be finished?
m
1. Subflow Execution on the Same Worker: Yes. A subflow called directly from a parent flow runs in the same process, and therefore on the same infrastructure, as the parent flow run. The worker provisions that infrastructure once, for the parent run, and the subflow inherits it.
2. Distributing Work to Different Workers: Yes, but at the level of deployments rather than individual tasks. Prefect 3.x supports multiple worker types, such as Docker, Kubernetes, and ECS, and each deployment is assigned to a work pool that matches the desired execution environment. More details can be found in the Prefect Workers Documentation.
3. Triggering a Deployment and Waiting for Completion: Yes. A flow can call `run_deployment` to start a run of another deployment; by default the call waits until that run has finished. Deployment triggers and automations can also start a flow run in response to specific events, but the flow that caused the event does not wait for the triggered run. More information on this can be found in the Prefect Deployment Triggers Documentation.
For additional insights into common gotchas in Prefect 3.x, you can review the Prefect 3.x Gotchas.
b
@Marvin: Regarding point 2: I get that the worker creates the infrastructure, but for my flow example I'd need 2 different infrastructures on different physical machines, with different worker instances. How would I configure this within one deployment? (prefect 3; docker work pools)
m
To configure a Prefect 3.x deployment that uses multiple infrastructures on different physical machines with different worker instances, you can follow these steps:
1. Create Multiple Work Pools: Define separate work pools for each type of infrastructure you want to use. Each work pool will correspond to a different physical machine or infrastructure setup you have.
2. Assign Deployments to Work Pools: You can create multiple deployments from the same flow, each assigned to a different work pool. This allows you to run parts of your workflow on different infrastructures by configuring the deployments accordingly.
3. Use the `deploy` Function: You can use the `deploy` function to create multiple deployments from one or more Python files. This function lets you manage these deployments independently, allowing you to deploy the same flow with different configurations in the same codebase. For instance, you can have one deployment that uses a Docker work pool and another that uses a different type of infrastructure.
4. Override Environment Variables: If needed, you can override environment variables for specific deployments using the `job_variables` argument. This can be useful for customizing certain aspects of the deployment environment.
For more detailed guidance, you can refer to the Prefect Docker Infrastructure Examples and Prefect Work Pools Documentation.
b
@Marvin: That doesn't sound like it would be possible to run one part of the flow on infra1 and another part on infra2, does it? Assuming the flow started on infra1 and has now reached a task that requires infra2, how would the flow tell this task to run on the other work pool?
m
In Prefect 3.x, assigning individual tasks within a flow to run on different infrastructures (or work pools) isn't natively supported. A flow run executes on the single infrastructure defined by its deployment's work pool. However, you can work around this limitation by splitting your workflow into separate flows, each with its own deployment and associated work pool. Here's a general approach:
1. Split the Workflow: Divide your workflow into distinct flows, where each flow corresponds to a part of the workflow that requires a specific infrastructure.
2. Create Separate Deployments: For each of these flows, create a separate deployment associated with the work pool that matches the required infrastructure.
3. Trigger the Deployments: Within your main flow, start a run of each deployment as needed, for example with `run_deployment`, rather than calling the flow function directly. A direct call would run it as a subflow on the main flow's infrastructure, whereas a triggered deployment run executes on its designated infrastructure as per its deployment configuration.
This approach lets you run parts of a workflow on different infrastructures by leveraging Prefect's deployment and work pool configurations. For more details on how to manage deployments and work pools, you can refer to the Prefect Work Pools Documentation.