Jacques (06/23/2020, 3:51 PM)
I'm trying to get to the bottom of a weird error, hoping this rings a bell for someone. I'm making a boto call in one of my tasks, and if the task fails I set it to retry with a `max_retries` parameter. I'm catching the error from boto and logging it immediately before doing a `raise signals.FAIL()` to trigger the retry mechanism. When the boto call fails (it does this once or twice a day, unpredictably) the error is caught, the logs show the task is set to `Retrying`, and downstream tasks are set to `Pending`. All looks good until the flow is scheduled to run again; then I get a Python stack overflow while some object is being pickled (I think - I'm seeing looped calls to frames like `File "/var/lang/lib/python3.7/pickle.py", line 662 in save_reduce` in the stack trace) directly after the `Beginning Flow run` message. I'm using `DaskExecutor`, if that matters.
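(Editor's note: a minimal sketch of the setup described above, for Prefect <2.0; the function names and retry settings are assumptions, not Jacques's actual code:)

from datetime import timedelta

import prefect
from botocore.exceptions import ClientError
from prefect import task
from prefect.engine import signals


def some_boto_call():
    # stand-in for the real boto call; pretend it fails intermittently
    raise ClientError({"Error": {"Code": "Throttling"}}, "GetObject")


@task(max_retries=3, retry_delay=timedelta(minutes=5))
def fetch_data():
    logger = prefect.context.get("logger")
    try:
        some_boto_call()
    except ClientError as exc:
        # log the boto error, then signal FAIL to hand the failure over
        # to Prefect's retry machinery
        logger.error("boto call failed: %s", exc)
        raise signals.FAIL(str(exc))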
As a side note, the boto docs have this comment for the exception that is being thrown:

# Subclasses of ClientError's are dynamically generated and
# cannot be pickled unless they are attributes of a
# module. So at the very least return a ClientError back.
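(Editor's note: that comment is easy to reproduce; the modeled error classes are built at runtime by botocore's error factory, so pickle can't look them up by module path. A small illustration, assuming boto3 is installed:)

import pickle

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3", region_name="us-east-1")

# NoSuchKey is generated at runtime and never assigned to a module
# attribute, so pickle's class lookup fails
err = s3.exceptions.NoSuchKey({"Error": {"Code": "NoSuchKey"}}, "GetObject")

assert isinstance(err, ClientError)
pickle.dumps(err)  # raises pickle.PicklingError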

Jim Crist-Harif (06/23/2020, 3:54 PM)
Hi Jacques, hmmmm, that is odd. There's an open issue about Prefect having trouble with unserializable exceptions; let me find it for you.

Jacques (06/23/2020, 3:54 PM)
Does Prefect do something like pickle the last exception before a retry?
Thanks!

Jim Crist-Harif (06/23/2020, 3:55 PM)
[shares a link to the GitHub issue]

Jacques (06/23/2020, 4:03 PM)
Ok, yes, that is exactly the issue I'm having.
Thanks so much! Is there something I can do to make Prefect not save the exception when I know it's not serializable? Or should I follow up in that issue instead?

Jim Crist-Harif (06/23/2020, 4:06 PM)
I'm sorry I don't have an immediate fix for you. The issue is on our radar, but probably won't be handled for a bit. One option for now: you could catch the error locally in your task and re-raise a new error that is serializable.
Something like:
from botocore.exceptions import ClientError


class MyBotoError(Exception):
    # a plain exception defined at module level, so it pickles cleanly
    pass


def mytask():
    try:
        some_boto_thing()
    except ClientError as exc:
        # keep only the message; drop the unpicklable boto exception class
        raise MyBotoError(str(exc))
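(Editor's note: `str(exc)` preserves the error message while discarding the dynamically generated class, which is exactly what the boto comment above says can't be pickled; if the error code matters for deciding whether to retry, it would need to be captured separately.)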

Jacques (06/23/2020, 6:13 PM)
Sorry for the slow reply - we use sentry.io to catch exceptions, so I don't want to leave an unhandled exception. But I think I can re-raise, catch the second exception, and then signal FAIL - I'll test it and post the results on that GitHub issue. Thanks so much for the help, you guys are amazing with support!
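(Editor's note: a sketch of the pattern Jacques describes, reusing `MyBotoError` from the snippet above; the structure is an assumption based on his wording:)

from botocore.exceptions import ClientError
from prefect.engine import signals

try:
    try:
        some_boto_thing()
    except ClientError as exc:
        # swap the unpicklable boto error for a plain, picklable one
        raise MyBotoError(str(exc))
except MyBotoError as exc:
    # the exception is handled explicitly, so Sentry never sees it as
    # unhandled; FAIL still drives Prefect's retry machinery
    raise signals.FAIL(str(exc))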

Jim Crist-Harif (06/23/2020, 6:15 PM)
Generally you want to use Prefect to handle exceptional cases, which is why my example leaves the exception unhandled. Prefect will catch the exception and mark the task as failed, but the error wrapped in `MyBotoError` will then be serializable.
Anyway, glad you figured things out!

Jacques (06/24/2020, 1:09 PM)
Tried the re-raising pattern as suggested (`str()` into a `RuntimeError`), but it still fails in the same way. Is it possible Prefect is collecting all the exceptions and not just the most recent? This is fairly problematic, as it causes a Python stack overflow, not just a failed flow run on retry. Is there anything else I could try?
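(Editor's note: one thing worth ruling out, offered as an assumption rather than anything confirmed in the thread: an exception raised inside an `except` block keeps an implicit reference to the original via `__context__`, so the new `RuntimeError` may still drag the unpicklable `ClientError` along with it. Raising outside the `except` block avoids the implicit chain:)

from botocore.exceptions import ClientError

error_message = None
try:
    some_boto_thing()
except ClientError as exc:
    error_message = str(exc)

if error_message is not None:
    # raised outside the except block, so no __context__ chain ties this
    # RuntimeError back to the original ClientError
    raise RuntimeError(error_message)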