
Alexandru Anghel

09/13/2022, 4:15 PM
Hey guys, we are running a few pipelines on Prefect 1.2 (on-prem Kubernetes), and I've recently started working on migrating to Orion. We have two use cases that query Prometheus metrics: streaming flows that run at intervals of seconds, and batch flows that run over the previous day, weeks, etc. In Prefect 1.2 I had two projects, one for streaming and one for batch. This way, I was able to register a flow with the same name in both projects (with different tasks under the hood, of course). For example, I have a flow called 'Alerts' in both projects: one runs daily and fetches the previous day's data, and the other runs at a 30-second interval. In Orion, it seems it's no longer possible to isolate the two use cases. Is there another way? I've read about Workspaces, but they are available only in Prefect Cloud.
Another question is about starting streaming flows. My flow consists of one task that creates a table and another that runs a while loop, calling the Prometheus API every 30 seconds and inserting the data into the table. So I need to run just one task (the API call) in streaming mode, not the entire flow. In Prefect 1.2 I schedule the flow to run only once, 10 seconds after registration (using from prefect.schedules import Schedule). This way, it enters the task's while loop and the UI shows only one flow run in the "Running" state (until you decide to kill it). Is there a way to replicate this scenario in Orion? If I use the Interval schedule, it creates a flow run every 30 seconds, and the UI becomes difficult for users to read once more flows are running. Thank you!
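For readers following along, the shape of the streaming flow described in this message can be sketched in plain Python. This is only an illustration under assumptions: the table layout, the function names, and the fetch_metric callable (a stand-in for the Prometheus API call) are all hypothetical, and sqlite3 is used just to keep the sketch self-contained. In Prefect 2 the two steps would typically be wrapped with @task and the wrapper with @flow (from prefect import flow, task); the decorators are left out here so the sketch runs without Prefect installed.

```python
import sqlite3
import time


def create_table(conn):
    """First step: create the target table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS alerts (ts REAL NOT NULL, value REAL NOT NULL)"
    )
    conn.commit()


def stream_metrics(conn, fetch_metric, interval_seconds=30.0, max_iterations=None):
    """Second step: poll in a loop and insert one row per poll.

    fetch_metric is a hypothetical callable standing in for the Prometheus
    API call. With max_iterations=None the loop runs until the process is
    stopped, which is what keeps a single run in the "Running" state.
    """
    polls = 0
    while max_iterations is None or polls < max_iterations:
        value = fetch_metric()
        conn.execute(
            "INSERT INTO alerts (ts, value) VALUES (?, ?)", (time.time(), value)
        )
        conn.commit()
        polls += 1
        # Sleep only if another poll will follow.
        if max_iterations is None or polls < max_iterations:
            time.sleep(interval_seconds)
    return polls


def alerts_streaming(conn, fetch_metric, interval_seconds=30.0, max_iterations=None):
    """Whole pipeline: create the table once, then enter the polling loop."""
    create_table(conn)
    return stream_metrics(conn, fetch_metric, interval_seconds, max_iterations)
```

Because the loop lives inside one step, a single unscheduled (or run-once) run keeps polling, instead of a new flow run being created every 30 seconds. The max_iterations argument exists only so the sketch can be exercised in a bounded way.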

Rob Freedy

09/13/2022, 8:36 PM
For your first question, I would recommend looking at Deployments: https://docs.prefect.io/concepts/deployments/#deployments https://docs.prefect.io/concepts/deployments/#deployments-and-flows This will allow you to set different schedules for the same flow in the same workspace and should give you the same functionality as your two 1.2 flows. For the second question, I would recommend taking a look at this article: https://www.prefect.io/guide/blog/you-no-longer-need-two-separate-systems-for-batch-processing-and-streaming/ It has a good overview of a basic data streaming pipeline that I believe could be helpful for your use case.

Alexandru Anghel

09/14/2022, 10:52 AM
Thanks, @Rob Freedy! Are you saying that I could set up just two flows (batch and streaming) and treat them as templates on top of which I can apply deployments? If I do that, I cannot track them easily in the Flow Runs section of the UI: I will see something like batch/amazing-elephant instead of batch/Alerts. Is there a way to add the deployment name to the flow run?

Rob Freedy

09/14/2022, 1:13 PM
You can specify the name when you build the deployment. So it could be the same code with a different schedule and name, which is configured in each of the deployments! https://docs.prefect.io/concepts/deployments/#build-the-deployment
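A rough sketch of what "same code, different name and schedule" could look like with the Python API. It assumes Prefect 2.x as of this thread (late 2022), where Deployment.build_from_flow accepts name, schedule, and tags, and where CronSchedule is importable from prefect.orion.schemas.schedules; later releases may have moved these. The deployment names, cron string, and tags are made up for illustration, and the streaming deployment is left unscheduled on the assumption that it is started once by hand.

```python
# Hypothetical deployment settings; names, cron string and tags are illustrative.
DEPLOYMENT_SPECS = {
    "batch": {"cron": "0 1 * * *", "tags": ["batch"]},
    "streaming": {"cron": None, "tags": ["streaming"]},
}


def deployment_label(flow_name, deployment_name):
    """Deployments are addressed as '<flow name>/<deployment name>'."""
    return f"{flow_name}/{deployment_name}"


def build_deployments(flow):
    """Build and register one deployment per entry in DEPLOYMENT_SPECS."""
    # Imported lazily so the table above can be inspected without Prefect.
    # Assumption: Prefect 2.x module layout as of late 2022.
    from prefect.deployments import Deployment
    from prefect.orion.schemas.schedules import CronSchedule

    for name, spec in DEPLOYMENT_SPECS.items():
        # No cron string means the deployment is only triggered manually.
        schedule = CronSchedule(cron=spec["cron"]) if spec["cron"] else None
        deployment = Deployment.build_from_flow(
            flow=flow,
            name=name,
            schedule=schedule,
            tags=spec["tags"],
        )
        deployment.apply()
```

Calling build_deployments(alerts), where alerts is an @flow-decorated function named "Alerts", would be expected to register Alerts/batch and Alerts/streaming. The tags give a way to filter runs in the UI even though the generated flow run names stay random.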

Alexandru Anghel

09/14/2022, 2:11 PM
@Rob Freedy I was wondering whether the flow run name can be changed somehow, not the deployment name. Thanks!

Rob Freedy

09/14/2022, 2:31 PM
I see. I believe you can do this for task runs, but I am not sure about flow runs: https://discourse.prefect.io/t/can-i-set-a-custom-name-of-a-task-or-task-run-at-run-time-in-prefect-2-0/535 You could also use with_options on the flow to change its name at run time, and add tags in the deployment spec to help narrow down which flow is which: https://docs.prefect.io/api-ref/prefect/flows/?h=with_options#prefect.flows.Flow.with_options