Hello <@ULVA73B9P> I attached little more csv file...
# ask-marvin
k
Hello @Marvin, I attached a few more CSV files, this time with different numbers of workers. Can you please give it a try and help me?

The settings are:
PREFECT_RUNNER_POLL_FREQUENCY='1'
PREFECT_WORKER_QUERY_SECONDS='0.5'

The three files are named data_{no_of_worker}__{events__per_second}__{number_of_minutes}.csv.
First column = event number
Second column = flow run created - event occurred
Third column = flow run started - flow run created

The values in the third column are not linear but lie within some interval, as the attachments show. The second column, however, is an almost linear function of the number of events emitted, even with different numbers of workers. Can we parallelize something on the server side so that the maximum value of the second column stays within some known interval, as is the case for the third column?
m
To address your question about parallelizing tasks on the server side to ensure the maximum value of the second column lies within a known interval, we need to consider how Prefect handles task execution and how you can optimize it. Given your settings and the description of your CSV files, you're dealing with metrics related to event processing and flow execution. Here's a concise plan to approach this:

1. Understand Prefect's Execution Model: Prefect allows for parallel execution of tasks across multiple workers. By configuring the flow and task settings, you can control how tasks are distributed and executed.
2. Adjust Worker Configuration: Since you're already using different numbers of workers, ensure that each worker is optimally utilized. You might need to adjust the concurrency settings for each worker to handle more tasks simultaneously.
3. Flow and Task Parallelization:
   - Flow Configuration: Ensure that your flow is set to run with concurrency in mind. Prefect's `flow` decorators allow you to specify parameters that can enhance parallel execution.
   - Task Parallelization: Use Prefect's task decorators to specify tasks that can be run in parallel. You can adjust the `task_runner` settings to optimize how tasks are executed.
4. Analyze and Adjust Polling Intervals: The settings `PREFECT_RUNNER_POLL_FREQUENCY` and `PREFECT_WORKER_QUERY_SECONDS` control how often the runner and workers check for new tasks. Fine-tuning these can help balance load and responsiveness.
5. Monitor and Iterate: Use Prefect's monitoring tools to observe the behavior of your flows and tasks. Adjust settings based on observed performance to ensure that the maximum value of the second column remains within a desired range.

To further assist you, I can look up specific Prefect documentation that might provide more detailed guidance on configuring parallel execution and worker settings. Let me know if you'd like me to do that.
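For step 5 of the plan, here is a minimal sketch of how the two latency columns could be compared after each settings change. It uses only the Python standard library and assumes the CSV files have no header row, are comma-delimited, and follow the three-column layout you described. The column labels, helper names, and sample values below are hypothetical and made up for illustration; they are not taken from your attachments.

```python
import csv
import io


def load_rows(text):
    """Parse CSV text into rows (assumes no header row, comma-delimited)."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


def summarize(rows):
    """Return min, max, and least-squares slope vs. event number per latency column."""
    events = [float(r[0]) for r in rows]
    n = len(events)
    mean_x = sum(events) / n
    var_x = sum((x - mean_x) ** 2 for x in events)
    summary = {}
    # Hypothetical labels for column 2 and column 3 of the described layout.
    for name, idx in (("created_minus_occurred", 1), ("started_minus_created", 2)):
        values = [float(r[idx]) for r in rows]
        mean_y = sum(values) / n
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(events, values))
        # A slope near zero suggests the column is bounded rather than growing.
        slope = cov_xy / var_x if var_x else 0.0
        summary[name] = {"min": min(values), "max": max(values), "slope": slope}
    return summary


# Made-up sample data, for illustration only.
SAMPLE = "1,0.5,1.2\n2,1.0,0.9\n3,1.5,1.4\n4,2.0,1.0\n"
result = summarize(load_rows(SAMPLE))
for column, stats in result.items():
    print(column, stats)
```

A clearly positive slope for the second column means the delay keeps growing with the event number, while a slope near zero together with a stable min and max would indicate it is staying within an interval, as the third column does. To run this on a real file, read its contents and pass the text to `load_rows`.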
k
Yes, please do that.