Brian McFeeley

07/29/2019, 7:02 PM
I'm running into what looks like a PYTHONPATH/import error when attempting to run on a local dask cluster:
Unexpected error occured in FlowRunner: ModuleNotFoundError("No module named 'utils'")
The flow references shared static functions in a package called utils. Any ideas how I might go about debugging this to make sure that code is available at the time of execution? When I look at the barf from the dask workers themselves, it looks like it's crapping out trying to deserialize a task:
distributed.worker - WARNING - Could not deserialize task
Traceback (most recent call last):
  File "/Users/bmcfeeley/.virtualenvs/spark3.7/lib/python3.7/site-packages/distributed/", line 1272, in add_task
    self.tasks[key] = _deserialize(function, args, kwargs, task)
  File "/Users/bmcfeeley/.virtualenvs/spark3.7/lib/python3.7/site-packages/distributed/", line 3060, in _deserialize
    function = pickle.loads(function)
  File "/Users/bmcfeeley/.virtualenvs/spark3.7/lib/python3.7/site-packages/distributed/protocol/", line 61, in loads
    return pickle.loads(x)
ModuleNotFoundError: No module named 'utils'
Does all the code have to live in the same file as the flow/task definitions somehow?
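The traceback above is pickle re-importing the task's function by name during deserialization. A minimal stdlib sketch of that mechanism, using json as a stand-in for the shared utils package:

```python
import json
import pickle

# pickle serializes module-level functions by reference: the stream
# records only the module path and attribute name, not the code itself.
payload = pickle.dumps(json.dumps)
assert b"json" in payload and b"dumps" in payload

# Deserializing re-imports the module by that recorded name. On a dask
# worker where 'utils' is not importable, this import step is what
# raises ModuleNotFoundError.
restored = pickle.loads(payload)
assert restored is json.dumps
```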

Chris White

07/29/2019, 7:03 PM
do the workers also have access to this shared module?
no it doesn’t need to live in the same file, it just needs to be importable from the same path
on the workers
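One way to make a module importable on the workers is to ship the file with distributed's Client.upload_file. A minimal sketch, where the module name shared_utils and the temp-dir setup are illustrative, not from the thread:

```python
import pathlib
import tempfile

from dask.distributed import Client, LocalCluster

# Stand-in for the shared package; 'shared_utils' is a hypothetical name.
tmpdir = tempfile.mkdtemp()
module_path = pathlib.Path(tmpdir, "shared_utils.py")
module_path.write_text("def greet():\n    return 'hello from the worker'\n")

# In-process workers keep the sketch lightweight; with real multi-process
# workers, upload_file is what makes the import below succeed.
cluster = LocalCluster(n_workers=1, processes=False)
client = Client(cluster)

# Ship the file to every worker; it lands on the worker's sys.path,
# so unpickling a task that references it can import it.
client.upload_file(str(module_path))

result = client.submit(lambda: __import__("shared_utils").greet()).result()
client.close()
cluster.close()
```

Installing the package into the workers' virtualenv (or pointing their PYTHONPATH at it) achieves the same thing without re-uploading on every run.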

Brian McFeeley

07/29/2019, 7:09 PM
ahh ok