Aj Floersch
10/18/2024, 8:25 PM
Marvin
10/18/2024, 8:25 PM
The behavior of task.submit() in a loop can be influenced by several factors, including task runner settings and concurrency limits.
1. Immediate Execution:
- When you use .submit() in Prefect, tasks are typically handed to a task runner such as the ThreadPoolTaskRunner or DaskTaskRunner. Each .submit() call returns a PrefectFuture that gives access to the task's computation status. Tasks are scheduled for execution, but depending on available resources or configuration, they might not execute immediately.
2. Concurrency Limits:
- Concurrency limits play a crucial role in task execution. Prefect allows defining global concurrency limits or task-based concurrency limits to ensure resource constraints are respected. This means that even if tasks are submitted, they might wait for execution slots to become available if current slots are at capacity. This could explain why you notice the first task starting only after all others are submitted. For more details, you can refer to the documentation about concurrency limits here: https://docs-3.prefect.io/3.0/develop/global-concurrency-limits
3. Task Scheduling:
- The task runner might be busy managing the scheduling of the submitted tasks, especially if there are many tasks being queued at once. While the task runner is designed to handle concurrent executions, configuration settings or resource availability might affect this process.
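To make points 1 and 2 concrete, here is a minimal sketch that uses the standard library's ThreadPoolExecutor as a stand-in for a thread-based task runner. It is not Prefect code, and fake_task, MAX_WORKERS, NUM_TASKS and the sleep durations are all assumptions chosen for illustration. It shows that submit() hands back a future right away, that at most MAX_WORKERS tasks run at once while the rest wait for a slot, and that when nothing blocks the submitting thread the first task starts well before the loop finishes.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4   # hypothetical slot count, standing in for a concurrency limit
NUM_TASKS = 20    # hypothetical number of submissions

start_times = {}   # task index -> time the task actually began running
running = 0        # tasks currently executing
peak_running = 0   # highest number of tasks seen executing at once
lock = threading.Lock()


def fake_task(i):
    """Stand-in for a task body; sleeps instead of doing real work."""
    global running, peak_running
    with lock:
        start_times[i] = time.monotonic()
        running += 1
        peak_running = max(peak_running, running)
    time.sleep(0.05)
    with lock:
        running -= 1
    return i


with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    futures = []
    for i in range(NUM_TASKS):
        futures.append(pool.submit(fake_task, i))  # returns a future immediately
        time.sleep(0.01)  # simulated per-submission overhead
    loop_finished = time.monotonic()
    results = [f.result() for f in futures]  # block until every task is done

first_started_before_loop_end = start_times[0] < loop_finished
print("first task started before the loop finished:", first_started_before_loop_end)
print("peak concurrent tasks:", peak_running)
```

If a real workflow shows the opposite (nothing starts until the loop ends), that points away from slot limits alone and toward something holding up the submitting code.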
To better manage and diagnose the behavior of tasks in your workflow, consider examining configurations related to concurrency, task runners, and possibly task dependencies. You might want to ensure that your environment is optimized for the scale you are working at, particularly when dealing with hundreds of tasks.
If you're seeing consistent timing or performance issues, you might need to adjust your task runner settings or concurrency limits as per your resources and requirements. You can find more information on task runners in the Prefect documentation here: https://docs-3.prefect.io/3.0/develop/task-runners
Nate
10/18/2024, 8:50 PM
> it appears the first task does not actually begin running until the 500th task has been submitted. Is this expected?
no. i suspect something might be blocking the code doing the submission?
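One way submission code can hold everything up, sketched with plain asyncio rather than Prefect: work scheduled on an event loop cannot start until the code doing the scheduling yields control. This is only an illustration of the general mechanism Nate is hinting at, not a diagnosis of this flow; fake_task, submit_all and the yield_between flag are hypothetical names.

```python
import asyncio

events = []  # ordered log of what happened


async def fake_task(i):
    """Stand-in for a task body; logs when it actually starts."""
    events.append(f"start {i}")
    await asyncio.sleep(0)  # pretend to do some work
    return i


async def submit_all(n, yield_between):
    pending = []
    for i in range(n):
        # create_task only schedules the coroutine; it does not run it yet
        pending.append(asyncio.create_task(fake_task(i)))
        events.append(f"submitted {i}")
        if yield_between:
            await asyncio.sleep(0)  # hand control back so scheduled work can start
    return await asyncio.gather(*pending)


# Run 1: the loop never yields, so nothing starts until every submission is done.
events.clear()
asyncio.run(submit_all(3, yield_between=False))
blocked_order = list(events)

# Run 2: the loop yields after each submission, so tasks start as they are submitted.
events.clear()
asyncio.run(submit_all(3, yield_between=True))
yielding_order = list(events)

print(blocked_order)
print(yielding_order)
```

In the first run every "submitted" entry precedes every "start" entry, which is the same shape as the symptom described above. In the second run task 0 starts before the last submission.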
Aj Floersch
10/18/2024, 8:53 PM
Could the requests library be that blocker? Each task is making an HTTP request.
Aj Floersch
10/18/2024, 8:54 PM
Nate
10/18/2024, 8:55 PM
Aj Floersch
10/18/2024, 9:34 PM
for modified_customer in res.items:
    netsuite_customer_to_salesforce_netsuite_customer_c.submit(modified_customer.get('id'))
For context, res.items is a list of dictionaries containing ids. The task makes a GET call to NetSuite using the id and a PUT call to Salesforce using the retrieved NetSuite data.
Nate
10/18/2024, 11:29 PM
netsuite_customer_to_salesforce_netsuite_customer_c im not quite sure yet
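For reference, the shape of the loop Aj describes (one job per id, each doing a GET and then a PUT) can be sketched with the standard library alone. This is not the actual task and not Prefect code: fetch_customer, push_customer, sync_customer, sync_all and the sample items are hypothetical stand-ins, and the two "HTTP" functions just return made-up data. The point of the sketch is that it keeps every future returned by submit(), so a blocking call inside one job only occupies its own worker thread and any failure is reported per id instead of being dropped.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_customer(customer_id):
    """Hypothetical stand-in for the NetSuite GET; returns fake data."""
    if customer_id is None:
        raise ValueError("missing customer id")
    return {"id": customer_id, "name": f"customer-{customer_id}"}


def push_customer(record):
    """Hypothetical stand-in for the Salesforce PUT; returns a fake status."""
    return {"id": record["id"], "status": "updated"}


def sync_customer(customer_id):
    """One unit of work: GET from the source, then PUT to the target."""
    return push_customer(fetch_customer(customer_id))


def sync_all(items, max_workers=8):
    """Submit one job per item and keep every future so failures surface."""
    succeeded, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(sync_customer, item.get("id")): item for item in items}
        for future in as_completed(futures):
            item_id = futures[future].get("id")
            try:
                succeeded[item_id] = future.result()
            except Exception as exc:  # record the error instead of losing it
                failed[item_id] = repr(exc)
    return succeeded, failed


items = [{"id": 1}, {"id": 2}, {"name": "no id"}]  # made-up sample data
succeeded, failed = sync_all(items)
print(succeeded)
print(failed)
```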