Adam Roderick

03/15/2022, 5:36 PM
Hey all! I am working on more complex flow tasks that require external libraries like pandas and scikit-learn. I've had trouble deploying before due to pickling not supporting all libraries. Do you have any general guidance, or a list of "known pickleable" libraries?

Kevin Kho

03/15/2022, 5:38 PM
Those should be pickleable. The things that tend not to be pickleable are deep learning libraries, connections like SQL clients and cloud clients, and distributed compute stuff like Dask and Spark.
You can test by doing
on those objects
👍 1
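
The inline snippet in the message above did not survive in this log. A minimal sketch of one way to run that kind of check, assuming a plain serialization attempt is what was meant: the helper name `is_pickleable` is hypothetical, and the standard-library `pickle` is used here so the example runs anywhere. If your deployment serializes with a different library such as `cloudpickle`, swap it in, since it can handle some objects that `pickle` cannot.

```python
import pickle
import threading


def is_pickleable(obj):
    """Hypothetical helper: True if pickle.dumps(obj) succeeds, else False."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        # pickle raises different exception types (TypeError, PicklingError,
        # AttributeError, ...) depending on why serialization failed.
        return False


# Plain data structures serialize without trouble.
print(is_pickleable({"a": [1, 2, 3]}))

# Lock objects stand in here for the "connection-like" objects mentioned
# above, which hold OS-level state and cannot be pickled.
print(is_pickleable(threading.Lock()))
```

Running the same check on each task input, output, or client object before deploying should show which ones need to be created inside the task rather than passed around.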
Also, the pickleability requirement only applies if you use Dask. You can store the Flow as a script so there is no serialization, and if you don't use Dask, no pickling needs to happen.
More info here
👀 1