I'm having a strange error that just popped up one...
# ask-community
j
I'm having a strange error that just popped up one day after a completely unrelated code change:
Unexpected error: ValueError('Cell is empty')
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 49, in dumps
    result = pickle.dumps(x, **dump_kwargs)
_pickle.PicklingError: Can't pickle <function create_job at 0x7fabe956fc20>: it's not the same object as tasks.common.create_job
None of my Prefect tasks have been working since it first appeared. I tried this Stack Overflow answer (https://stackoverflow.com/a/1413299), but it didn't help. What causes this error? The `create_job` function is always referred to by its full Python import path (`tasks.common.create_job`), as that Stack Overflow answer suggests.
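For context on the `PicklingError` above: plain `pickle` serializes a function by reference, storing only its module and qualified name, and it refuses when looking that name up returns a different object. Below is a minimal sketch of that failure mode. The module name `fake_tasks` is hypothetical and only stands in for something like `tasks.common`; a wrapping decorator or a module reload that rebinds the name would be one way to get here, which is an assumption and not a diagnosis of this thread's flow.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a real module such as tasks.common.
module = types.ModuleType("fake_tasks")
sys.modules["fake_tasks"] = module
exec("def create_job():\n    return 'job'\n", module.__dict__)

original = module.create_job

# While the module attribute still points at the function, pickling works:
# only the reference "fake_tasks.create_job" is stored.
assert pickle.loads(pickle.dumps(original)) is original

# Rebind the name, as a wrapping decorator or a reload might do.
module.create_job = lambda: original()

try:
    pickle.dumps(original)
    error_message = None
except pickle.PicklingError as exc:
    # pickle found a different object under the same import path.
    error_message = str(exc)

print(error_message)
```

Referring to the function by its full import path does not change this check, because `pickle` compares object identity, not names.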
k
Hey, I made a task in another file and tried to import it, and it seems to retain the name. Also, Prefect mostly uses `cloudpickle`, and that library doesn’t seem to throw the same error. For example, this works:
from prefect import task, Flow
import cloudpickle

@task
def abc(x):
    return x

abc.__name__ = "test"

print(cloudpickle.dumps(abc))
whereas the equivalent with `pickle` throws the error. Would you be able to give me a minimum example?
j
I tried using the `stored_as_script` option on the Flow to avoid attempting to pickle functions, since pickling only works under specific circumstances. Is there more I need to do to avoid pickling code?
k
No, not on the Prefect side, but if you’re using Dask as an executor, work is sent to the workers using `cloudpickle`. Results also use the `PickleSerializer` by default (for task outputs). You can turn that off with `@task(checkpoint=False)`, but the `PickleSerializer` also uses `cloudpickle`, so I don’t expect that to be your issue.
j
It seems that `cloudpickle` uses `pickle` internally. The error happens within `cloudpickle`. I wonder if a recent release of `cloudpickle` introduced this problem. It would explain why it popped up with no relevant code changes on my part. I'm going to try pinning `cloudpickle` to a version released during a time when things still worked.
Actually, I'm wrong. The releases happened in 2020.
k
I was just on the page myself. Yeah, last one was August 2020.
j
I misread the year.
Looking at the Dask source code reveals that my error is a red herring. Dask tries to use `pickle` first, and then falls back to `cloudpickle` when that fails. This results in a secondary exception that I've been ignoring because I assumed it was irrelevant. Turns out the secondary exception is the real error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 49, in dumps
    result = pickle.dumps(x, **dump_kwargs)
_pickle.PicklingError: Can't pickle <function create_job at 0x7f14afc1a5f0>: it's not the same object as tasks.common.create_job

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/prefect/engine/flow_runner.py", line 628, in get_flow_run_state
    extra_context=extra_context(task),
  File "/usr/local/lib/python3.7/site-packages/prefect/executors/dask.py", line 396, in submit
    _maybe_run, self._should_run_event.name, fn, *args, **kwargs
  File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 1603, in submit
    actors=actor,
  File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 2580, in _graph_to_futures
    dsk = dsk.__dask_distributed_pack__(self, keyset, annotations)
  File "/usr/local/lib/python3.7/site-packages/dask/highlevelgraph.py", line 995, in __dask_distributed_pack__
    client_keys,
  File "/usr/local/lib/python3.7/site-packages/dask/highlevelgraph.py", line 423, in __dask_distributed_pack__
    dsk = toolz.valmap(dumps_task, dsk)
  File "/usr/local/lib/python3.7/site-packages/toolz/dicttoolz.py", line 83, in valmap
    rv.update(zip(d.keys(), map(func, d.values())))
  File "/usr/local/lib/python3.7/site-packages/distributed/worker.py", line 3666, in dumps_task
    d["kwargs"] = warn_dumps(task[3])
  File "/usr/local/lib/python3.7/site-packages/distributed/worker.py", line 3678, in warn_dumps
    b = dumps(obj, protocol=4)
  File "/usr/local/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 60, in dumps
    result = cloudpickle.dumps(x, **dump_kwargs)
  File "/usr/local/lib/python3.7/site-packages/cloudpickle/cloudpickle_fast.py", line 102, in dumps
    cp.dump(obj)
  File "/usr/local/lib/python3.7/site-packages/cloudpickle/cloudpickle_fast.py", line 563, in dump
    return Pickler.dump(self, obj)
  File "/usr/local/lib/python3.7/pickle.py", line 437, in dump
    self.save(obj)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/site-packages/dill/dill.py", line 871, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/local/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/local/lib/python3.7/pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/local/lib/python3.7/pickle.py", line 662, in save_reduce
    save(state)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/site-packages/dill/dill.py", line 871, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/local/lib/python3.7/pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/local/lib/python3.7/pickle.py", line 885, in _batch_setitems
    save(v)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/site-packages/cloudpickle/cloudpickle_fast.py", line 745, in save_function
    *self._dynamic_function_reduce(obj), obj=obj
  File "/usr/local/lib/python3.7/site-packages/cloudpickle/cloudpickle_fast.py", line 682, in _save_reduce_pickle5
    dictitems=dictitems, obj=obj
  File "/usr/local/lib/python3.7/pickle.py", line 638, in save_reduce
    save(args)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/pickle.py", line 789, in save_tuple
    save(element)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/pickle.py", line 774, in save_tuple
    save(element)
  File "/usr/local/lib/python3.7/pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python3.7/site-packages/dill/dill.py", line 1097, in save_cell
    f = obj.cell_contents
ValueError: Cell is empty
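The shape of this traceback (one error, then "During handling of the above exception, another exception occurred") is what a try-then-fall-back serializer produces when the fallback also fails. Here is a rough sketch of that pattern using only the standard library. `fallback_dumps` is a hypothetical stand-in for the second serializer and is not Dask's or cloudpickle's actual code.

```python
import pickle

def fallback_dumps(obj):
    # Hypothetical stand-in for the second serializer failing for its own reason.
    raise ValueError("Cell is empty")

def dumps(obj):
    try:
        return pickle.dumps(obj)
    except Exception:
        # Raising inside this handler chains the new error onto the first one.
        return fallback_dumps(obj)

try:
    dumps(lambda: None)  # plain pickle cannot serialize a lambda
except Exception as exc:
    final_error = exc                # the error that actually propagated
    first_error = exc.__context__    # the earlier error it was chained onto

print("raised:", repr(final_error))
print("while handling:", type(first_error).__name__)
```

In a chained traceback the last exception printed is the one that propagated, so it is usually the better place to start reading.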
k
Ohh. The `ValueError` you mean?
j
Yes. What would cause that?
k
I don’t know enough about `dill`, but looking at this, they literally got around the error by changing the order of imports. Also this, where it says `importing dill` prior to Dask causes this error.
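As background on the message itself: `ValueError: Cell is empty` is what CPython raises when something reads `cell_contents` on a closure cell that has not been assigned yet, which is what the last frame of the traceback (`f = obj.cell_contents`) is doing. A small sketch that reproduces the message in isolation; the function and variable names are made up for illustration.

```python
def make_empty_cell():
    def inner():
        return value  # free variable, so Python creates a closure cell for it

    cell = inner.__closure__[0]
    try:
        cell.cell_contents
        before = "has a value"
    except ValueError as exc:
        before = str(exc)

    value = 42  # the cell is filled only at this point
    return before, cell.cell_contents

before, after = make_empty_cell()
print(before)  # Cell is empty
print(after)   # 42
```

So the error suggests that some function reachable from the pickled object closes over a variable that was never bound at pickling time. Which function that is cannot be read off the traceback alone.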
j
I don't use `pickle` or `dill` in my flow, and I haven't allowed new versions of Prefect or Dask to get pulled in, so this doesn't explain why the error suddenly popped up. Unless the problem is in a transitive dependency. My project has 99 direct dependencies, so that would be extremely difficult to find.
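One way to narrow down a transitive dependency without reading 99 dependency trees by hand is to ask the installed package metadata which distributions declare `dill` as a requirement. This is a sketch under assumptions: it uses `importlib.metadata`, which is in the standard library from Python 3.8 (on 3.7 the `importlib_metadata` backport provides the same functions), and the helper names `requirement_name` and `dependents_of` are made up here.

```python
import re
from importlib import metadata  # Python 3.8+; importlib_metadata on 3.7

def requirement_name(requirement):
    """Extract the normalized distribution name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if not match:
        return ""
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()

def dependents_of(target):
    """Names of installed distributions that declare `target` as a requirement."""
    target = requirement_name(target)
    found = set()
    for dist in metadata.distributions():
        for requirement in dist.requires or []:
            if requirement_name(requirement) == target:
                name = dist.metadata.get("Name") if dist.metadata is not None else None
                found.add(name or "<unknown>")
    return sorted(found)

print(dependents_of("dill"))
```

The result includes requirements that only apply to optional extras, so it is a list of candidates. Comparing it, or a `pip freeze`, between a working image and a broken one would then show which packages changed.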
k
It might be nately because in the first issue, they imported `petastorm` and it affected the behavior of `pyspark`.
I don't have `nately` installed.
I'm going to import Dask from my Docker image's `site.py` so nothing gets a chance to import `dill` ahead of it.
Importing Dask first does not help. I got exactly the same error.
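Since import order made no difference, it may be more useful to check directly whether `dill` is loaded at the point of failure and whether its handlers sit in the dispatch table of the pure-Python pickler. The traceback hints at this, because frames in `pickle.py` call straight into `dill.py`. The sketch below relies on `pickle._Pickler.dispatch`, a private CPython detail that could change between versions, and `foreign_pickle_handlers` is a made-up helper name.

```python
import pickle
import sys

def foreign_pickle_handlers():
    """Map type -> handler for dispatch entries not defined in the pickle module."""
    return {
        typ: handler
        for typ, handler in pickle._Pickler.dispatch.items()
        if getattr(handler, "__module__", None) != "pickle"
    }

print("dill imported:", "dill" in sys.modules)
for typ, handler in foreign_pickle_handlers().items():
    # Each entry is a type whose pickling was overridden by another library.
    print(typ, "->", getattr(handler, "__module__", "?"), getattr(handler, "__qualname__", "?"))
```

Running this just before the flow is submitted would show whether `dill` is involved no matter which package imported it. An empty result would point away from `dill` as the cause.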
k
Sorry, I don’t think I’d have any more ideas on this one.