Guillaume Latour

05/19/2022, 8:20 AM
Hello everyone, I am using a dask cluster and I've run into this issue:
distributed.protocol.pickle - INFO - Failed to serialize <Success: "Task run succeeded.">. Exception: cannot pickle 'lxml.etree.XMLSchema' object
which leads to
distributed.worker - ERROR - failed during get data with <ip> -> <ip>
which at some point closes the connection
distributed.comm.core.CommClosedError: in <TCP (closed)  local=tcp://<ip> remote=tcp://<ip>>: Stream is closed
Prefect handles this with some retries (depending on configuration), and finally the agent raises an error and the flow is marked as failed
distributed.scheduler.KilledWorker: ('concatenate_df_based_on_time-b91c06dc30f54c5084e9f5fe8b6b32a5', <WorkerState 'tcp://<ip>', status: closed, memory: 0, processing: 1>)
Do you have an idea on how to prevent this kind of error?

Anna Geller

05/19/2022, 9:52 AM
Do you have your XMLSchema object defined outside of the task decorator, or do you happen to return it from a task and pass it as a data dependency to another task? If so, Dask can't serialize it with cloudpickle.

Guillaume Latour

05/19/2022, 11:40 AM
From the error message, I understood that it was the Prefect Success object that was generating the exception. Is that possible?

Anna Geller

05/19/2022, 11:42 AM
Can you by any chance share the flow code? It would be easier to assess that way - you can share via DM for privacy and we can still continue the discussion here.

Kevin Kho

05/19/2022, 1:33 PM
I doubt it’s the Prefect success object. SUCCESS is just an exception and it’s serializable. Something is being returned that cloudpickle can’t serialize. I think you can try calling cloudpickle.dumps(x) on the value a task returns to test this.
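A minimal sketch of that check, assuming nothing about the actual flow: the helper name `can_serialize` is hypothetical, and a `threading.Lock` stands in for the `lxml.etree.XMLSchema` object because it also cannot be pickled. The sketch falls back to the standard `pickle` module if `cloudpickle` is not installed.

```python
import pickle
import threading

try:
    import cloudpickle as pickler  # the serializer mentioned in this thread
except ImportError:
    pickler = pickle  # stdlib fallback so the sketch still runs


def can_serialize(obj):
    """Hypothetical helper: return (True, None) if obj serializes,
    otherwise (False, "<ErrorType>: <message>")."""
    try:
        pickler.dumps(obj)
        return True, None
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


# A plain dict of lists serializes without trouble.
ok_plain, _ = can_serialize({"rows": [1, 2, 3]})

# A lock is a stand-in for the XMLSchema object: it cannot be pickled.
ok_lock, err = can_serialize(threading.Lock())

print(ok_plain, ok_lock, err)
```

Running the same helper on each task's return value should point to the one that Dask cannot move between workers.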

Guillaume Latour

05/24/2022, 6:48 AM
Ok, you were right: a task was returning an XMLSchema object, which is indeed not serializable via pickle. I merged the tasks that use this object, so the XMLSchema is now used internally by the task and everything is working. ty for your time and guidance :)
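A rough sketch of that fix, not the actual flow code: `UnpicklableSchema` is a hypothetical stand-in for `lxml.etree.XMLSchema`, and the two functions play the role of tasks. In the real flow they would carry the Prefect task decorator and the schema would be built with lxml.

```python
import pickle


class UnpicklableSchema:
    """Hypothetical stand-in for lxml.etree.XMLSchema: refuses to be pickled."""

    def __init__(self, required_tag):
        self.required_tag = required_tag

    def __reduce__(self):
        raise TypeError("cannot pickle 'UnpicklableSchema' object")

    def validate(self, document):
        # Toy validation rule: the document must contain the required tag.
        return f"<{self.required_tag}>" in document


def build_schema(required_tag):
    # Before: a task returned the schema, so its result had to be
    # serialized to reach the next task, which fails.
    return UnpicklableSchema(required_tag)


def validate_documents(required_tag, documents):
    # After: the merged task creates and uses the schema locally;
    # only a list of bools leaves the task.
    schema = UnpicklableSchema(required_tag)
    return [schema.validate(doc) for doc in documents]


results = validate_documents("item", ["<item>1</item>", "<other/>"])
payload = pickle.dumps(results)  # plain data serializes fine
```

The design choice is simply to keep the unpicklable object inside a single task and pass only plain data (strings, numbers, lists, DataFrames) across task boundaries.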

Anna Geller

05/24/2022, 12:39 PM
nice work and thanks for the update!