Marwan Sarieddine
09/03/2020, 7:42 PM
Dylan
09/03/2020, 8:28 PM
Marwan Sarieddine
09/03/2020, 8:29 PM
Chris White
09/03/2020, 8:51 PM
Marwan Sarieddine
09/03/2020, 8:54 PM
from prefect import task, Flow
import cloudpickle
import pickle


@task
def func(x):
    return 2 * x

with Flow("test") as flow:
    y = func(2)

cloudpickle.dumps(flow)
pickle.dumps(flow)
and the last line, pickle.dumps(flow), throws this error:
Traceback (most recent call last):
  File "simple_pickle.py", line 14, in <module>
    pickle.dumps(flow)
_pickle.PicklingError: Can't pickle <function func at 0x106f58160>: it's not the same object as __main__.func
I am using prefect version 0.13.4
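The error above can be reproduced without Prefect, which may help show where it comes from. pickle serializes a function by reference (its module plus its qualified name) and checks that looking the name up returns the same object. The sketch below uses only the standard library; `TaskLike`, `task_like`, and `try_pickle` are hypothetical names invented for illustration and are not Prefect's implementation. It assumes only that the decorator rebinds the name `func` to a wrapper object that holds the original function.

```python
import pickle


class TaskLike:
    """Hypothetical stand-in for a task object; not Prefect's Task class."""

    def __init__(self, fn):
        # Keep the original function on the instance.
        self.run = fn


def task_like(fn):
    """Hypothetical decorator: the decorated name ends up bound to a TaskLike."""
    return TaskLike(fn)


@task_like
def func(x):
    return 2 * x


def try_pickle(obj):
    """Return None if pickle.dumps succeeds, else the PicklingError it raised."""
    try:
        pickle.dumps(obj)
    except pickle.PicklingError as exc:
        return exc
    return None


# The module-level name `func` now refers to the TaskLike instance, while the
# original function still reports its own name as "func". When pickle looks
# that name up in the module, it finds a different object and refuses.
error_for_function = try_pickle(func.run)

# Pickling the wrapper fails as well, because its state contains the function.
error_for_wrapper = try_pickle(func)

print(type(error_for_function).__name__, error_for_function)
print(type(error_for_wrapper).__name__, error_for_wrapper)
```

The exact wording of the error message depends on the Python version and on how the script is run, so no expected output is shown.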
Chris White
09/03/2020, 9:01 PM
pickle is not a viable serializer for Prefect workflows: it is very sensitive to import paths and where things are defined. Specifically, in this instance, func is both the name of the underlying function (defined in __main__) and the name of the task instance. To be precise:
@task
def func(x):
    return 2 * x

func.__dict__['run']  # <function __main__.func(x)>
func  # <Task: func>
pickle.dumps(func)  # pickling error
pickle.dumps(func.__dict__['run'])  # pickling error
and pickle gets confused about which func is which. In theory, a class-based implementation resolves this:
from prefect import Task

class Func(Task):
    def run(self, x):
        return 2 * x

my_func = Func()
pickle.dumps(my_func)  # succeeds!
but I suspect you will run into more issues like this the deeper you dig here.
Marwan Sarieddine
09/03/2020, 9:11 PM
Chris White
09/03/2020, 9:27 PM
Marvin
09/03/2020, 9:28 PM