Say I have a flow that processes data and publishes a manual computer vision annotation job to an annotation server - how might I schedule a flow to poll the annotation job to publish some information once the job has been completed? It may take hours, days or maybe a week for this to finish, so polling every 4 hours or something may make sense. Would this be an application for `pause_flow_run` using the `reschedule` argument? Or is there another way I can create a `cron`-like schedule for a parameterized flow to run until it successfully completes?
Geoffrey Keating
08/14/2023, 8:57 PM
Alternatively, I could have a flow that tracks down the jobs/datasets that need to be published, but unfortunately that is a little bit harder than scheduling a polling operation.
Jake Kaplan
08/14/2023, 9:02 PM
If I'm understanding right, would it work for your use case to kick off another flow run to publish the information once your original flow run completes? This way you wouldn't need to poll at all.
Jake Kaplan
08/14/2023, 9:03 PM
There are a few ways to do this:
• You can call `run_deployment()` inside of your flow run, maybe inside of the `on_completion` hook. Here is a general example of how to use `run_deployment`.
• Additionally, if you're using Prefect Cloud, you can use automations, where you can trigger a flow to kick off after the success of another flow.
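A minimal sketch of the first option, assuming Prefect 2.x: a completion hook that starts a follow-up deployment. The deployment name `publish-annotations/default`, the `job_id` parameter, and the helper `build_publish_parameters` are hypothetical and only there for illustration.

```python
from typing import Any, Dict


def build_publish_parameters(flow_run_parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Select the parameters the (hypothetical) publish deployment needs."""
    return {"job_id": flow_run_parameters["job_id"]}


def trigger_publish(flow, flow_run, state) -> None:
    """Completion hook; Prefect 2 calls state hooks with (flow, flow_run, state)."""
    # Imported here so this module can be loaded without Prefect installed.
    from prefect.deployments import run_deployment

    run_deployment(
        name="publish-annotations/default",  # hypothetical "<flow name>/<deployment name>"
        parameters=build_publish_parameters(flow_run.parameters),
        timeout=0,  # do not block waiting for the triggered run to finish
    )


# Attaching the hook (requires Prefect 2.x):
#
#     from prefect import flow
#
#     @flow(on_completion=[trigger_publish])
#     def submit_annotation_job(job_id: str) -> None:
#         ...  # process data and create the annotation job
```

This only helps when the follow-up work can start as soon as the first flow finishes; it does not wait for anything that happens outside Prefect.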
Geoffrey Keating
08/14/2023, 9:12 PM
That's correct, but I don't really know when the annotation job will be ready to pull down. So something like `run_deployment`, maybe I could set retries to a large number and poll the status until it is completed that way?
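One way the retry idea could look, as a rough sketch assuming Prefect 2.x. As far as I know, retries are configured on a task or flow (`retries`, `retry_delay_seconds`) rather than on `run_deployment`. The status string `"completed"`, the `fetch_status` callable, and the helper names are assumptions, not part of any annotation server's API.

```python
import math


class AnnotationJobPending(Exception):
    """Raised while the annotation job has not finished yet."""


def retries_needed(max_wait_hours: float, poll_every_hours: float) -> int:
    """Number of retries needed to cover the whole waiting window."""
    return math.ceil(max_wait_hours / poll_every_hours)


def require_completed(status: str) -> str:
    """Fail on any status other than 'completed' so the caller retries later."""
    if status != "completed":
        raise AnnotationJobPending(f"annotation job is still {status!r}")
    return status


def make_wait_task(fetch_status, max_wait_hours: float = 168, poll_every_hours: float = 4):
    """Build a Prefect task that retries until the job reports completion."""
    # Imported here so this module can be loaded without Prefect installed.
    from prefect import task

    @task(
        retries=retries_needed(max_wait_hours, poll_every_hours),
        retry_delay_seconds=int(poll_every_hours * 3600),
    )
    def wait_for_annotation(job_id: str) -> str:
        # fetch_status is a hypothetical callable that queries the annotation server.
        return require_completed(fetch_status(job_id))

    return wait_for_annotation
```

With a one-week window and a 4-hour interval this works out to 42 retries. One caveat to check for your setup: the flow run, and whatever infrastructure hosts it, most likely stays alive while the task waits between attempts.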
Jake Kaplan
08/14/2023, 9:21 PM
Ah, got it, so there's kind of a hidden middle step?
• flow1 completes
• annotation job (not a flow) happens in some indeterminate amount of time
• flow2 needs to run
Geoffrey Keating
08/14/2023, 10:38 PM
Right, think of some middle blob of time that is on the scale of human work (days/week), and wrangling a webhook out of the annotation engine would be quite a bit of effort.
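For a wait on that scale without a webhook, one possible pattern is a short polling flow that checks the job once and, if it is not done, schedules its own next run a few hours later. The sketch below assumes Prefect 2.x and that `run_deployment` accepts a `scheduled_time` argument; the deployment name `poll-annotation-job/default`, the status string, and the `fetch_status` and `publish` callables are hypothetical. As I understand it, `pause_flow_run` with `reschedule` waits for the run to be resumed rather than waking up on a timer, so it would not poll by itself.

```python
from datetime import datetime, timedelta, timezone

POLL_INTERVAL = timedelta(hours=4)


def next_poll_time(now: datetime) -> datetime:
    """When the next status check should happen."""
    return now + POLL_INTERVAL


def poll_once(job_id, fetch_status, publish, reschedule, now=None) -> str:
    """Check the job once: publish if it is done, otherwise schedule another check."""
    now = now or datetime.now(timezone.utc)
    if fetch_status(job_id) == "completed":
        publish(job_id)
        return "published"
    reschedule(job_id, next_poll_time(now))
    return "rescheduled"


def reschedule_with_prefect(job_id: str, when: datetime) -> None:
    """Schedule another run of the (hypothetical) polling deployment."""
    # Imported here so this module can be loaded without Prefect installed.
    from prefect.deployments import run_deployment

    run_deployment(
        name="poll-annotation-job/default",  # hypothetical deployment name
        parameters={"job_id": job_id},
        scheduled_time=when,  # assumed parameter: start the run at this time
        timeout=0,  # return immediately; the run happens later
    )
```

Inside a flow this might be called as `poll_once(job_id, fetch_status, publish, reschedule_with_prefect)`. Each run is short, so nothing has to stay alive for days; a maximum attempt count passed along as a parameter would keep the chain from rescheduling forever.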