# ask-community
s
Hi all, in our Prefect deployment we see task logs like "Task finished with final status Pending". Does the Lazarus process handle this? I have seen tasks stuck like this for hours.
k
Are you using LocalDaskExecutor or DaskExecutor? Lazarus handles events like the Flow not being able to start because the hardware to run it can't be found.
s
We are using LocalDaskExecutor on Kubernetes Agent with Prefect Cloud
k
Does this always happen or just sometimes?
s
It happens sometimes... mostly when an upstream task is stuck for a long time (which also happens sometimes), the downstream task ends up in the Pending state and the logs show "Task finished with final status Pending".
Even after the upstream task succeeds or fails, the downstream task remains Pending.
k
Do you have a similar setup to this?
s
Yes we do... Is there a way to resolve/improve this behaviour?
k
It should resolve eventually though. Or does your Flow process just stop?
Does the ClientError or manual_only trigger fit what you are doing? Or just the set_upstream?
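For reference, setting a trigger on a task looks roughly like this (a minimal sketch; the task names are just placeholders):

from prefect import task
from prefect.triggers import all_finished, manual_only

# Run this task once all upstream tasks have finished, whatever their final states are
@task(trigger=all_finished)
def summarize(results):
    ...

# Require a manual resume from the UI before this task runs
@task(trigger=manual_only)
def gated_step():
    ...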
s
I saw the Flow process just stuck for hours. I had to manually restart the flow from the Prefect UI.
We just use set_upstream.. we use the functional API instead of the imperative one.
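For context, our dependency wiring is roughly like this (a minimal sketch with made-up task names, not the real flow):

from prefect import Flow, task

@task
def upstream_task():
    return "data"

@task
def downstream_task():
    print("runs only after upstream_task reaches a finished state")

with Flow("example-flow") as flow:
    up = upstream_task()
    down = downstream_task()
    # State-only dependency: downstream waits on upstream but does not receive its result
    down.set_upstream(up)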
k
With retries?
s
Yes, all our tasks have a default of 3 retries with a 5-second delay
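(Roughly like this on every task; the task name is just an example:)

from datetime import timedelta
from prefect import task

# Retry up to 3 times, waiting 5 seconds between attempts
@task(max_retries=3, retry_delay=timedelta(seconds=5))
def call_api():
    ...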
k
Have you tried using processes instead of threads for LocalDaskExecutor?
s
No, can you please point me to some documentation for that?
Also, for the tasks that I debugged: when the upstream tasks were stuck, I marked them (the upstream ones) as Failed from the Prefect UI and expected the downstream to run. Does setting the task state from the Prefect UI affect the flow in any way?
k
Just try LocalDaskExecutor(scheduler="processes") and it might be more stable than threads. Yes, setting the state from the UI will affect the Flow since that state is already resolved. So when a downstream task checks its upstream dependencies, it will affect whether it starts or not.
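A minimal sketch of what that switch looks like (assuming a Prefect 1.x flow; the flow name is made up):

from prefect import Flow
# On older releases this import lived at prefect.engine.executors
from prefect.executors import LocalDaskExecutor

with Flow("example-flow") as flow:
    ...  # task wiring goes here

# Process-based parallelism instead of the default threads
flow.executor = LocalDaskExecutor(scheduler="processes")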
s
Sorry to bother you with this, but while we are on the topic: the upstream tasks that get stuck are simple HTTP POST requests. Our API endpoint times out after 1 hour, so I assumed the task would fail (or something) after 1 hour, but I sometimes see it going on for hours. We use this task within .map, so there are multiple parallel task runs. Some of these task runs get stuck, with no useful logs other than the task starting to run... Does this sound like a LocalDaskExecutor issue?
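The pattern is roughly this (the endpoint URL and payloads are placeholders, and the explicit requests timeout is just an illustration of one guard we could add):

import requests
from prefect import Flow, task, unmapped

@task
def post_record(record, endpoint):
    # A client-side timeout stops the HTTP call itself from hanging indefinitely
    response = requests.post(endpoint, json=record, timeout=3600)
    response.raise_for_status()
    return response.json()

with Flow("mapped-posts") as flow:
    records = [{"id": 1}, {"id": 2}]  # produced by an upstream task in the real flow
    post_record.map(records, endpoint=unmapped("https://example.com/api"))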
k
No worries. I think this is an issue you would find with any executor then. It seems like nothing is being returned. You can try a timeout at the task level and see if it helps?
@task(timeout=timedelta(hours=1))
s
Yes, I tried setting that up for all our tasks, but unfortunately that started messing up one of our other tasks, which is a simple one-line Pandas operation. Anyway, I'll set it again and see if I can fetch more details on what issue it causes. Thanks for your help @Kevin Kho!
k
Of course!