Jeremy Phelps
09/08/2021, 9:14 AM
Unexpected error: ValueError('Cell is empty')
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 49, in dumps
    result = pickle.dumps(x, **dump_kwargs)
_pickle.PicklingError: Can't pickle <function create_job at 0x7fabe956fc20>: it's not the same object as tasks.common.create_job
None of my Prefect tasks have been working since it first appeared.
I tried this Stack Overflow answer but it didn't help: https://stackoverflow.com/a/1413299
What causes this error? The create_job function is always referred to by its full Python import path (tasks.common.create_job) due to the Stack Overflow answer.
Kevin Kho
cloudpickle and this library doesn’t seem to throw the same error. When I do:
from prefect import task, Flow
import cloudpickle

@task
def abc(x):
    return x

abc.__name__ = "test"
print(cloudpickle.dumps(abc))
whereas the equivalent for pickle throws the error. Would you be able to give me a minimum example?
Jeremy Phelps
09/08/2021, 2:56 PM
stored_as_script option on the Flow to avoid attempting to pickle functions, as that only works under specific circumstances. Is there more I need to do to avoid pickling code?
Kevin Kho
cloudpickle
Kevin Kho
PickleSerializer by default (for task outputs). You can turn that off with @task(checkpoint=False), but the PickleSerializer also uses cloudpickle, so I don’t expect that to be your issue.
Jeremy Phelps
09/08/2021, 3:16 PM
cloudpickle uses pickle internally. The error happens within cloudpickle. I wonder if a recent release of cloudpickle introduced this problem. It would explain why it popped up with no relevant code changes on my part. I'm going to try pinning cloudpickle to a version released during a time when things still worked.
Jeremy Phelps
09/08/2021, 3:28 PM
Kevin Kho
Jeremy Phelps
09/08/2021, 3:30 PM
Jeremy Phelps
09/08/2021, 3:39 PM
pickle first, and then falls back to cloudpickle when that fails. This results in a secondary exception that I've been ignoring because I assumed it was irrelevant. Turns out the secondary exception is the real error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 49, in dumps
    result = pickle.dumps(x, **dump_kwargs)
_pickle.PicklingError: Can't pickle <function create_job at 0x7f14afc1a5f0>: it's not the same object as tasks.common.create_job

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/prefect/engine/flow_runner.py", line 628, in get_flow_run_state
    extra_context=extra_context(task),
  File "/usr/local/lib/python3.7/site-packages/prefect/executors/dask.py", line 396, in submit
    _maybe_run, self._should_run_event.name, fn, *args, **kwargs
  File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 1603, in submit
    actors=actor,
  File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 2580, in _graph_to_futures
    dsk = dsk.__dask_distributed_pack__(self, keyset, annotations)
  File "/usr/local/lib/python3.7/site-packages/dask/highlevelgraph.py", line 995, in __dask_distributed_pack__
    client_keys,
  File "/usr/local/lib/python3.7/site-packages/dask/highlevelgraph.py", line 423, in __dask_distributed_pack__
    dsk = toolz.valmap(dumps_task, dsk)
  File "/usr/local/lib/python3.7/site-packages/toolz/dicttoolz.py", line 83, in valmap
    rv.update(zip(d.keys(), map(func, d.values())))
  File "/usr/local/lib/python3.7/site-packages/distributed/worker.py", line 3666, in dumps_task
    d["kwargs"] = warn_dumps(task[3])
  File "/usr/local/lib/python3.7/site-packages/distributed/worker.py", line 3678, in warn_dumps
    b = dumps(obj, protocol=4)
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 60, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/usr/local/lib/python3.7/site-packages/cloudpickle/cloudpickle_fast.py", line 102, in dumps
    cp.dump(obj)
  File "/usr/local/lib/python3.7/site-packages/cloudpickle/cloudpickle_fast.py", line 563, in dump
    return Pickler.dump(self, obj)
  File "/usr/local/lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/site-packages/dill/dill.py", line 871, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/local/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/local/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/site-packages/dill/dill.py", line 871, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/local/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/site-packages/cloudpickle/cloudpickle_fast.py", line 745, in save_function
    *self._dynamic_function_reduce(obj), obj=obj
  File "/usr/local/lib/python3.7/site-packages/cloudpickle/cloudpickle_fast.py", line 682, in _save_reduce_pickle5
    dictitems=dictitems, obj=obj
  File "/usr/local/lib/python3.7/pickle.py", line 638, in save_reduce
    save(args)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/pickle.py", line 789, in save_tuple
    save(element)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/pickle.py", line 774, in save_tuple
    save(element)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/site-packages/dill/dill.py", line 1097, in save_cell
    f = obj.cell_contents
ValueError: Cell is empty
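For readers following along, the chained traceback above contains two separate failures. Here is a rough, stdlib-only sketch of what each one looks like in isolation. The module name demo_tasks_common and the helper functions are made up for illustration, and whether Jeremy's flow hits these conditions in exactly this way is an assumption, not something the thread confirms.

```python
import pickle
import sys
import types

# --- First exception: pickle stores functions by reference (module + name) ---
# Hypothetical stand-in module; it only exists so pickle has somewhere to look.
mod = types.ModuleType("demo_tasks_common")
sys.modules[mod.__name__] = mod


def create_job():
    return "original"


# Make the function look as if it were defined at the top level of that module.
create_job.__module__ = mod.__name__
create_job.__qualname__ = "create_job"
mod.create_job = create_job

# Works: the name in the module still resolves to this exact function object.
by_reference = pickle.dumps(create_job)


def wrapper():
    return "wrapped"


# Rebind the module attribute, roughly as a decorator returning a new object would.
mod.create_job = wrapper

first_error = None
try:
    pickle.dumps(create_job)
except pickle.PicklingError as exc:
    first_error = exc
    print(exc)


# --- Second exception: reading a closure cell before it has a value ---
def make_closure():
    def inner():
        return later  # free variable, only bound further down

    cell = inner.__closure__[0]  # the cell for `later`, still empty here
    try:
        cell.cell_contents
    except ValueError as exc:
        error = exc
    later = 1
    return error


second_error = make_closure()
print(second_error)
```

The first part fails because the module-level name no longer points at the object being pickled. The second part shows that an unfilled closure cell raises ValueError as soon as cell_contents is read, which is the same attribute access as the last frame of the traceback.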
Kevin Kho
ValueError you mean?
Jeremy Phelps
09/08/2021, 3:43 PM
Jeremy Phelps
09/08/2021, 3:52 PM
pickle or dill in my flow, and I haven't allowed new versions of Prefect or Dask to get pulled in, so this doesn't explain why the error suddenly popped up. Unless the problem is in a transitive dependency. My project has 99 direct dependencies, so that would be extremely difficult to find.
Kevin Kho
petastorm and it affected the behavior of pyspark
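One way an import can change how an unrelated library behaves is through shared, class-level state. The traceback earlier in the thread shows dill/dill.py frames being called from pickle.py's save, which would fit a handler registered in the pure-Python pickler's dispatch table. The sketch below only illustrates that mechanism with the standard library: LibraryPickler and intrusive_save_dict are hypothetical names, pickle._Pickler is a private class, and nothing here is taken from dill's or cloudpickle's actual code.

```python
import io
import pickle


class LibraryPickler(pickle._Pickler):
    """Hypothetical stand-in for a library pickler built on the pure-Python Pickler."""


def intrusive_save_dict(pickler, obj):
    # Hypothetical handler, standing in for one registered by some other package.
    raise ValueError("handler from an unrelated import")


def dump_dict():
    # Serialize a small dict with the subclass and report success.
    LibraryPickler(io.BytesIO()).dump({"a": 1})
    return "ok"


before = dump_dict()

# The dispatch table lives on the base class, so the subclass sees this change too.
saved = pickle._Pickler.dispatch[dict]
pickle._Pickler.dispatch[dict] = intrusive_save_dict
try:
    dump_dict()
    after = "ok"
except ValueError as exc:
    after = f"failed: {exc}"
finally:
    # Always restore the stock handler so later pickling is unaffected.
    pickle._Pickler.dispatch[dict] = saved

restored = dump_dict()
print(before, "|", after, "|", restored)
```

LibraryPickler never touches the table itself, yet its behavior changes while the other handler is installed. That is the sort of action at a distance that makes a transitive dependency hard to track down.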
Jeremy Phelps
09/08/2021, 3:54 PM
nately installed.
Jeremy Phelps
09/08/2021, 4:01 PM
site.py so nothing gets a chance to import dill ahead of it.
Jeremy Phelps
09/08/2021, 9:51 PM
Kevin Kho