
Brian Osserman

10/26/2022, 4:39 PM
Hi, I've been moving my workflow orchestration into Prefect, and recently ran across some strange hanging behavior when moving from one task to the next, which I'm hoping someone can shed some light on. This behavior was observed running Prefect 1.4.0 from Windows cmd. I've seen it on two different flows so far, and it's a bit inconsistent. What I'm seeing is that one or more of the tasks will have completed (according, e.g., to the logs generated by the actual task), but then the run hangs right at the end of the task. When I press Ctrl-C (the first time was to try to abort the run, since that's the usual behavior), instead of aborting, it seems to prod Prefect into acknowledging that the task completed successfully, and it then moves on to the next task. I haven't been able to rerun much to test, since these are large flows that take more than a day to run, but in each case I've seen it happen in a few of the tasks but not others. In the one case where I was able to rerun for test purposes, it happened again, but on a different step from the first time. I'm going to post some more details in a thread.
The actual output to console is kind of odd as well. When it hangs, the only output for that task is about the task starting. But when I press Ctrl-C, it will output a completion line generated at the very end of our @task Python code with elapsed time, which is 'backdated' to when the logs indicate the task actually finished. At the same time, it will generate the usual 'Finished task run' message, and 'Starting task run' for the next task, with timestamps that are not backdated. So if the task started at time A, and the logs (which aren't sent to console) indicate it finished at time B, and I hit Ctrl-C at time C, then before I hit Ctrl-C, I just see the INFO lines with timestamp A indicating the task started. But when I hit Ctrl-C, I immediately get the INFO line with timestamp B showing the elapsed time to complete is B - A, and then INFO lines with timestamp C saying that the task completed with state success, and that the next task has started.
Our best guess is that for some reason the timestamp B output to console is being buffered and not actually printed, and the hang is happening somewhere in the prefect code itself.
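One way to test the buffering guess, as a minimal sketch using only the Python standard library: log the completion line and then explicitly flush the handlers and the console streams. The names `make_flushing_logger` and `timed_step` are hypothetical helpers for illustration, not part of Prefect, and the log format here is only an approximation of what we see on the console. Note that `logging.StreamHandler` already flushes after each record, so if the timestamp B line still shows up only after Ctrl-C with this in place, buffering inside our Python code is probably not the cause.

```python
import logging
import sys
import time


def make_flushing_logger(name="task_timing"):
    """Build a console logger for timing lines (hypothetical helper, not a Prefect API)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("[%(asctime)s] %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger


def timed_step(work, logger):
    """Run work(), log the elapsed time, then force everything out to the console."""
    start = time.monotonic()
    result = work()
    elapsed = time.monotonic() - start
    logger.info("step finished, elapsed %.3f s", elapsed)
    # Explicit flushes: belt and braces, in case something between the
    # handler and the console is holding the line back.
    for handler in logger.handlers:
        handler.flush()
    sys.stdout.flush()
    sys.stderr.flush()
    return result, elapsed
```

The body of the task function would call something like `timed_step(do_the_work, make_flushing_logger())` in place of its current final logging line, where `do_the_work` stands in for the real work.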

Mason Menges

10/27/2022, 3:50 PM
Hey @Brian Osserman, would you be able to put together a minimal reproducible example of the issue you're seeing? This isn't an issue I've seen come up before, so it's hard to say exactly what could be causing it. On a side note, if you're just getting started moving over to Prefect, I'd suggest checking out Prefect 2, as it is our LTS product: https://docs.prefect.io/

Brian Osserman

10/27/2022, 5:44 PM
Hi, unfortunately, this is hard to reproduce when I tinker with it. I tried rerunning with a smaller input file, and it only happened one out of six times. And when we tried isolating the first task and rerunning it, it didn't hang at all.
As for Prefect 2, although I've only just started moving workflows into Prefect, my company has put substantial effort into creating resources for running Prefect 1, and I don't know how hard it would be to port all of that over.
I was hoping that at least the fact that Ctrl-C behaves differently from usual would help narrow down what is happening.
If you think it would be helpful I'd be happy to try rerunning in some sort of debug mode as I saw mentioned in a different thread.
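Separately from whatever debug mode that other thread meant, one generic way to see where the process is sitting while it hangs is Python's standard `faulthandler` module, which can dump every thread's stack at a fixed interval. This is only a sketch under assumptions: `enable_hang_dumps` and `disable_hang_dumps` are hypothetical wrapper names, the 10-minute interval is arbitrary, and this is a standard-library feature rather than anything Prefect provides.

```python
import faulthandler
import sys


def enable_hang_dumps(interval_seconds=600, stream=None):
    """Dump all thread tracebacks every interval_seconds until disabled.

    stream must be a real file object with a file descriptor;
    it defaults to sys.stderr at call time.
    """
    if stream is None:
        stream = sys.stderr
    faulthandler.dump_traceback_later(interval_seconds, repeat=True, file=stream)


def disable_hang_dumps():
    """Stop the periodic traceback dumps."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `enable_hang_dumps()` at the top of the script that runs the flow should then show, during a hang, which function each thread is blocked in, without needing Ctrl-C.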