William Wolfe-McGuire

10/02/2022, 7:26 PM
In Prefect 0.15.13, is there a good way to check whether a task and its output are serializable? I know you can use is_serializable to check whether an entire flow can be serialized, but that doesn't let you narrow down which tasks are responsible for your serialization issues. What is the best way to debug in this situation?

Anna Geller

10/02/2022, 8:48 PM
Interesting question. There are 3 reasons I can immediately think of why some of your flows may not be serializable:
1. You return a database connection
2. You return an HTTP client (e.g. a boto3 client)
3. You use pickle storage, e.g. when leveraging the Docker storage build process, and some dependencies are not packaged alongside your flow properly
#1 and #2 can be fixed by moving that logic into a task: instead of returning such an object, use it inside your task and close it when it is no longer needed (this avoids passing non-serializable objects between tasks).
#3 can be fixed by adding the missing dependencies to the list of packages to install. This docs page explains it in more detail.
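A minimal sketch of the fix for #1 and #2, using sqlite3 from the standard library as a stand-in for any connection or client. The function names and the table are hypothetical, and in a Prefect 0.15 flow each function would be decorated with @task. The connection is opened and closed inside one function, so only plain data crosses the task boundary.

```python
import pickle
import sqlite3
from contextlib import closing


def fetch_rows():
    """Open the connection, use it, close it, and return only plain data."""
    with closing(sqlite3.connect(":memory:")) as conn:
        conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
        conn.execute("INSERT INTO users VALUES (1, 'ada'), (2, 'grace')")
        # fetchall() runs before the connection is closed on exit
        return conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()


def count_rows(rows):
    """Downstream step that receives a list of tuples, not a connection."""
    return len(rows)


rows = fetch_rows()
payload = pickle.dumps(rows)  # a list of tuples serializes without issue
total = count_rows(pickle.loads(payload))
```

Returning conn itself from fetch_rows would fail at the pickle.dumps step, which is the situation described in #1.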
I see you ask specifically about task output: for that, you can try to serialize the object with cloudpickle:
cloudpickle.dumps(obj)
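One way to narrow this down is to try serializing each candidate object separately with cloudpickle.dumps and record which ones fail. The sketch below is an assumption about how that could look, not a Prefect API: find_unserializable and the candidates dictionary are hypothetical names, and the fallback to the standard pickle module is only there so the snippet runs where cloudpickle is not installed.

```python
import sqlite3

try:
    import cloudpickle as pickler  # the serializer discussed in this thread
except ImportError:  # assumption: fall back to the stdlib if cloudpickle is absent
    import pickle as pickler


def find_unserializable(named_objects):
    """Return {name: error message} for every object that fails to serialize."""
    failures = {}
    for name, obj in named_objects.items():
        try:
            pickler.dumps(obj)
        except Exception as exc:  # the error type varies (TypeError, PicklingError, ...)
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Hypothetical candidates: a plain task output and a live database connection.
candidates = {
    "rows": [(1, "ada"), (2, "grace")],
    "connection": sqlite3.connect(":memory:"),
}
failures = find_unserializable(candidates)
for name, error in failures.items():
    print(f"{name} is not serializable -> {error}")
```

The same helper can be given task objects instead of task outputs, for example a dictionary built from the tasks in a flow, to see which task fails on its own.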

William Wolfe-McGuire

10/03/2022, 6:44 PM
@Anna Geller thanks for your response. Using cloudpickle.dumps to serialize individual tasks let me at least track down which task is causing my issue.

Anna Geller

10/03/2022, 7:28 PM
Nice work! We are close to finalizing the Results feature, so I would expect to see more guidance on that within the next two weeks.