Hi I try to implement custom retrying policy in Prefect Orio Prefect Community #ask-community

Vladimir Bolshakov

03/18/2022, 8:23 AM

Hi! I am trying to implement a custom retry policy in Prefect Orion (2.0b2). As far as I can see, policies in Orion are currently hardcoded (CoreFlowPolicy/CoreTaskPolicy, MinimalFlowPolicy/MinimalTaskPolicy), and the server API handlers for setting states have no request parameters for custom policies (the task policy and flow policy are not part of the request, just dependency-injection parameters initialised with `provide_task_policy` and `provide_flow_policy`). So my question is about the orchestration settings that will be released in the future. Will custom orchestration policies be parameters of the API requests that set the state of a task/flow? Or will the orchestration concepts and APIs change significantly in the near future? How will orchestration policies be serialized/deserialized between the server and the agents' engines?

Anna Geller

03/18/2022, 11:40 AM

Wow, first off, thank you so much for diving so deep into Orion already! Afaik, we originally intended for the orchestration policies to be abstracted away from the user in the backend. From the technical standpoint, I'll ask someone from the team to respond, but can you explain what you are trying to do at a high level? What custom retry policy are you trying to build?

Anna Geller

03/18/2022, 12:11 PM

So far I got a response from the team that orchestration policies can be (for now) considered internal, since we have not yet put machinery in for user-customizable policies

Vladimir Bolshakov

03/18/2022, 12:17 PM

About the retry policy: imagine a task that makes an HTTP request and fails because of rate limits, but the response from the server says when the next request will be accepted (e.g. `retry_after`). In this case I would return that time from the response as the `data` of the failed State object, which a custom retry policy would then check. The policy's behaviour is simple here: check the state data and return AwaitingRetry with a scheduled time instead of a Failed state.
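The rule described above can be sketched in plain Python. This is only an illustration of the decision logic, not Orion's policy API: `ProposedState`, `apply_retry_after_rule`, and the assumption that `retry_after` is a number of seconds are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ProposedState:
    """Hypothetical stand-in for an Orion State; not Prefect's own class."""
    name: str                                   # e.g. "Failed" or "AwaitingRetry"
    data: Optional[dict] = None                 # payload the task attached to the state
    scheduled_time: Optional[datetime] = None   # when a retry should start


def apply_retry_after_rule(
    proposed: ProposedState, now: Optional[datetime] = None
) -> ProposedState:
    """Turn a rate-limited Failed state into AwaitingRetry; pass others through."""
    now = now or datetime.now(timezone.utc)
    if proposed.name != "Failed" or not proposed.data:
        return proposed
    # Assumption: the task stored the server's delay in seconds under "retry_after".
    retry_after = proposed.data.get("retry_after")
    if retry_after is None:
        return proposed
    return ProposedState(
        name="AwaitingRetry",
        data=proposed.data,
        scheduled_time=now + timedelta(seconds=retry_after),
    )
```

A Failed state without `retry_after` in its data is returned unchanged, so ordinary failures still fail.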

Vladimir Bolshakov

03/18/2022, 12:18 PM

Without a custom retry policy, I am trying to implement this behaviour in flow code: await the task result, check the state, pause until the given time, and retry the task in a loop.

Vladimir Bolshakov

03/18/2022, 12:22 PM

> So far I got a response from the team that orchestration policies can be (for now) considered internal, since we have not yet put machinery in for user-customizable policies

Thank you for the fast response! That means I understood the orchestration concepts correctly. Orchestration policies are about the task/flow lifecycle, not about the business logic of a workflow. So retrying in cases like the one above is more business logic than task behaviour; it belongs in the flow/task source code rather than in orchestration. Correct?

Vladimir Bolshakov

03/18/2022, 12:32 PM

In the stable version of Prefect (not Orion) there was a terminal state, Pause, that could be used in this case. But in Orion we have only three terminal states: Completed, Cancelled and Failed.

Vladimir Bolshakov

03/18/2022, 12:35 PM

So in my case I see only two options:
1. Run the task and get the result. If the task failed, pause (sleep) until the given time and run the task again.
2. Run the task and get the result. If the task failed, reschedule the entire flow (create a new scheduled flow run) and fail the flow.

Anna Geller

03/18/2022, 1:10 PM

> Without a custom retry policy, I am trying to implement this behaviour in flow code: await the task result, check the state, pause until the given time, and retry the task in a loop.

Orchestration policies are indeed customizable but we haven't exposed an endpoint to configure that yet. But if you are willing to configure that in your flow instead, this is something I could help with more.

Anna Geller

03/18/2022, 1:11 PM

> Orchestration policies are about the task/flow lifecycle, not about the business logic of a workflow.

You're spot on, your business logic should live within your tasks and flows, not within orchestration policies.

Anna Geller

03/18/2022, 1:13 PM

> Run the task and get the result. If the task failed, pause (sleep) until the given time and run the task again.

This sounds like the best and most natural way of approaching it
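A minimal sketch of that approach in plain Python, independent of any Prefect API. The names `run_with_retry_after` and `Attempt`, and the result shape `(succeeded, value, retry_after_seconds)`, are assumptions made for illustration; in a real flow the callable would wrap the task call and read the delay from the failed state's data.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical result shape: (succeeded, value, retry_after_seconds).
Attempt = Tuple[bool, object, Optional[float]]


def run_with_retry_after(
    call: Callable[[], Attempt],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Call `call` until it succeeds, sleeping for the server-provided delay."""
    for attempt in range(1, max_attempts + 1):
        ok, value, retry_after = call()
        if ok:
            return value
        # Give up on failures that carry no delay, or when attempts run out.
        if retry_after is None or attempt == max_attempts:
            raise RuntimeError(f"gave up after {attempt} attempt(s)")
        sleep(retry_after)  # wait until the server will accept a request again
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting, and `max_attempts` bounds the loop so a permanently rate-limited endpoint cannot stall the flow forever.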

Dustin Ngo

03/18/2022, 1:45 PM

Hi Vladimir, thank you for taking all this time to engage with Orion! We fully intend to one day allow some degree of customization of Orchestration Policies, so that it will be possible to customize the logic that controls the code execution of your tasks and flows on every state transition. While the specific situation you've outlined is well handled in the flow code itself, implementing special retries for rate-limited HTTP requests is definitely also possible in Orchestration. One way you might imagine this being implemented is like the current retry rule: while entering a `failed` state, tasks tagged with an `http-request` tag will sleep based on some exponential backoff you've configured before being asked to re-enter a `running` state. This capability hasn't been wired up because we wanted to understand how our users want to interact with policies before deciding on a design! However, in all likelihood, the way you'd add a custom policy is to configure it when you're setting up your Orion server. And because all Orchestration rule code runs on the server behind the API and not on the agent machines, once you've done this all your tasks and flows will be governed by the additional logic. If you find more reasons to want to customize your Orchestration policies, please don't hesitate to let us know!
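To make the backoff idea concrete, here is a small sketch of how such a rule might compute its delay. It is not Orion code: the function name `backoff_delay`, its parameters, and the default values are all hypothetical, and only the `http-request` tag comes from the message above.

```python
from typing import Iterable, Optional


def backoff_delay(
    tags: Iterable[str],
    run_count: int,
    base_seconds: float = 1.0,
    factor: float = 2.0,
    max_seconds: float = 60.0,
    required_tag: str = "http-request",
) -> Optional[float]:
    """Seconds to wait after failed run number `run_count`, or None if the rule does not apply."""
    # The rule only governs tasks that carry the tag, and needs at least one run.
    if required_tag not in set(tags) or run_count < 1:
        return None
    # Delay grows geometrically with each failed run and is capped at max_seconds.
    return min(base_seconds * factor ** (run_count - 1), max_seconds)
```

With these defaults, consecutive failures of a tagged task would wait 1, 2, 4, 8 seconds and so on, up to the 60 second cap, while untagged tasks are left to the normal retry behaviour.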
