Prefect Community #ask-community

I’m running a flow where both the flow and the tas...

Alexis Chicoine

04/04/2024, 2:30 PM

I’m running a flow where both the flow and the tasks have retries. On the first flow run, the task with retries behaved as expected, waiting retry_delay_seconds between each attempt. Then, after the task retries were exhausted, the flow retry kicked in and waited retry_delay_seconds before trying the flow again. So far so good. The issue is that on the second flow attempt the task did retry three times, but it didn’t respect retry_delay_seconds and just ran the three tries one after the other, really fast. Is this expected behaviour or did I land on a bug?

Mason Menges

04/04/2024, 2:36 PM

Hmm, my first reaction is that this sounds like a bug. What version of Prefect are you running?

Alexis Chicoine

04/04/2024, 2:38 PM

This is what should be running in the image:

prefect==2.16.5

Mason Menges

04/04/2024, 2:40 PM

Are you able to reproduce the behavior locally? Or is it just when running on the image? If you have an example of the code you're running, that would be super helpful as well.

Alexis Chicoine

04/04/2024, 2:41 PM

I had tried locally, but without the retry_delay_seconds just to make sure it would total the right amount of retries and that was fine. I’ll see if I can reproduce something equivalent locally and if that does it I’ll share it with you.

🙏 1

Alexis Chicoine

04/04/2024, 4:24 PM

I wrote this code and it works correctly. I’m not sure what the difference would be when it ran in our flow.

from datetime import datetime

from prefect import flow, task

# Start time of every task attempt, across task retries and flow retries.
times = []


@task(retries=2, retry_delay_seconds=10)
def raise_a_task_exception():
    # Record when this attempt started, then fail to force a retry.
    times.append(datetime.now())
    print(times)
    raise ValueError("This is a task exception")


@flow(retries=1, retry_delay_seconds=10)
def raise_an_exception():
    raise_a_task_exception()


if __name__ == "__main__":
    try:
        raise_an_exception()
    except Exception:
        # Expected once all retries are exhausted; show every attempt's start time.
        print(times)

[datetime.datetime(2024, 4, 4, 12, 21, 55, 841183), datetime.datetime(2024, 4, 4, 12, 22, 6, 387484), datetime.datetime(2024, 4, 4, 12, 22, 17, 68533), datetime.datetime(2024, 4, 4, 12, 22, 28, 510515), datetime.datetime(2024, 4, 4, 12, 22, 39, 357650), datetime.datetime(2024, 4, 4, 12, 22, 49, 976638)]
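To check that output against the 10-second retry_delay_seconds, the gaps between consecutive attempts can be computed directly. This is a minimal sketch that reuses the six timestamps printed above; the names RETRY_DELAY_SECONDS and gaps are made up for illustration and are not part of Prefect.

```python
from datetime import datetime

# Timestamps copied from the printed output above (one per task attempt).
times = [
    datetime(2024, 4, 4, 12, 21, 55, 841183),
    datetime(2024, 4, 4, 12, 22, 6, 387484),
    datetime(2024, 4, 4, 12, 22, 17, 68533),
    datetime(2024, 4, 4, 12, 22, 28, 510515),
    datetime(2024, 4, 4, 12, 22, 39, 357650),
    datetime(2024, 4, 4, 12, 22, 49, 976638),
]

# Hypothetical constant: the delay passed to both @task and @flow in the repro.
RETRY_DELAY_SECONDS = 10

# Seconds elapsed between each attempt and the one after it.
gaps = [
    (later - earlier).total_seconds()
    for earlier, later in zip(times, times[1:])
]

for attempt, gap in enumerate(gaps, start=2):
    print(f"attempt {attempt} started {gap:.1f}s after the previous one")

# True when every attempt waited at least the configured delay.
print(all(gap >= RETRY_DELAY_SECONDS for gap in gaps))
```

Every gap here falls between roughly 10.5 and 11.4 seconds, so in this local run the delay was respected on both the first and the second flow attempt.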

Alexis Chicoine

04/04/2024, 4:32 PM

I have this error in my logs, so maybe something else went wrong: "Task run ‘xyz’ received abort during orchestration: The enclosing flow must be running to begin task execution. Task run is in SCHEDULED state."

Alexis Chicoine

04/04/2024, 4:32 PM

It showed as a warning.

Mason Menges

04/04/2024, 4:35 PM

That error is basically saying that the task tried to run outside of a flow run context, which seems kinda odd if the flow was retried. What worker is the flow running from when you see this happen, i.e. Docker, ECS, Kubernetes?

Alexis Chicoine

04/04/2024, 4:35 PM

It’s in Kubernetes. We’re still on the old agents system, though we’re doing work to move to workers eventually.
