# ask-community
s
I’m experiencing an issue with unpickling when a flow is registered from inside a conda environment. Registering the flow directly from a virtualenv that contains the necessary dependencies works, and the flow runs as expected. But when I register the flow from a conda environment, I receive this unpickling error at run time.
Failed to load and execute Flow's environment: StorageError('An error occurred while unpickling the flow:\n AttributeError("Can\'t get attribute \'PandasIndex\' on <module \'xarray.core.indexes\' from \'/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/indexes.py\'>")')
The Dask worker uses the same base image as the one used to register the flow.
And storage is on S3.
k
Hey @Sean Harkins! This seems tricky. Are you using the conda environment to register but then running with Docker?
s
I hope this isn’t anything exceedingly tricky. We have been registering flows using virtualenvs, and the flows are executed using an agent and a Dask cluster running in ECS (the worker image uses a conda environment that is activated as part of the base image). This works fine. But our CI needs to support a wide variety of dependencies that our flows might require, so I tried using the same worker image, with the conda environment active, to register flows, and now I am seeing this error.
k
Are the versions pinned? Did you output the versions of all packages and compare?
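One way to do that comparison, as a rough sketch only: dump the installed versions in each environment and diff the two dumps. The helper names `installed_versions` and `diff_versions` are made up for illustration; only `importlib.metadata` from the standard library (Python 3.8+) is used.

```python
from importlib import metadata


def installed_versions():
    """Return {distribution name: version} for the current environment."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            versions[name.lower()] = dist.version
    return versions


def diff_versions(left, right):
    """Return {name: (left_version, right_version)} wherever the two differ.

    A package missing from one side shows up with None on that side.
    """
    names = sorted(set(left) | set(right))
    return {
        name: (left.get(name), right.get(name))
        for name in names
        if left.get(name) != right.get(name)
    }


if __name__ == "__main__":
    import json

    # Run this in both the registering environment and the worker image,
    # save each output, then load both and pass them to diff_versions().
    print(json.dumps(installed_versions(), indent=2, sort_keys=True))
```

Because this reads the metadata of what is actually installed, it should also show packages that pip changed after conda resolved the environment.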
s
Identical
Actually, let me verify before I say so for sure.
k
Do you need it pickled? You can try storing as a script too
s
We have been consistently using the pickled approach successfully so I would like to at least understand why it is failing in this case before making a switch.
k
That makes sense. This error indicates some version mismatch right? You could also try running with debug logs?
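For context on why a mismatch produces this exact error: pickle stores classes by reference (module name plus attribute name), not by value, so the loading side has to be able to import the same name. Here is a minimal, self-contained sketch of that behavior. The module `fake_indexes` is a hypothetical stand-in built in memory, not xarray itself, and deleting the attribute simulates loading in an environment where the class does not exist.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a library module; not the real xarray.core.indexes.
mod = types.ModuleType("fake_indexes")


class PandasIndex:
    pass


# Make the class look as if it were defined inside the stand-in module.
PandasIndex.__module__ = "fake_indexes"
PandasIndex.__qualname__ = "PandasIndex"
mod.PandasIndex = PandasIndex
sys.modules["fake_indexes"] = mod

# "Register" side: the payload only records fake_indexes.PandasIndex by name.
payload = pickle.dumps(PandasIndex())

# "Worker" side: simulate a library version that does not define the class.
del mod.PandasIndex

try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

The lookup fails at load time with an AttributeError naming the missing attribute, which is the same shape as the StorageError above.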
s
So interestingly, this module in xarray was just recently refactored https://github.com/pydata/xarray/commit/6e14df62f0b01d8ca5b04bd0ed2b5ee45444265d#diff-b73383e3adc1135bda2af03eaa6[…]5a64e23c193e4f351f6bcdc17fc0d30f but we have everything pinned to 0.18.0, prior to this change, so I’m not sure how this updated version might be creeping in.
k
Just wondering how you're sure the conda environment is being activated? Some people have issues with activating conda in Docker. Would your setup just fail entirely if the environment weren't activated?
s
@Kevin Kho The issue was a package that is not published in conda and therefore needed to be pip installed into the conda environment. This package was upgrading the xarray version on the image registering our flow. Thanks for rubber ducking me through it 😄
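For anyone hitting something similar, a rough sketch of how one might spot which installed packages declare a dependency on xarray, and so could pull in a different version when pip installed. The function names and the `some-pkg` example in the comment are hypothetical; the thread does not name the actual package. Only `importlib.metadata` and `re` from the standard library are used.

```python
import re
from importlib import metadata


def _normalize(name):
    """Normalize a distribution name so 'Foo_Bar' and 'foo-bar' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_requirements():
    """Return {distribution name: [requirement strings]} for this environment."""
    reqs = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            # dist.requires is None when a distribution declares no dependencies.
            reqs[_normalize(name)] = list(dist.requires or [])
    return reqs


def find_dependents(package, requirements):
    """Return {name: [matching requirement strings]} for dists requiring `package`."""
    target = _normalize(package)
    hits = {}
    for name, reqs in requirements.items():
        matching = []
        for req in reqs:
            # The requirement string starts with the project name.
            m = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
            if m and _normalize(m.group(1)) == target:
                matching.append(req)
        if matching:
            hits[name] = matching
    return hits


if __name__ == "__main__":
    # Hypothetical output shape: {"some-pkg": ["xarray>=0.19"]}
    print(find_dependents("xarray", installed_requirements()))
```

Any entry whose version specifier excludes the pinned version is a candidate for the kind of silent upgrade described here.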
k
Oh jeez. How long did this take to find? A full day?
s
No. Thankfully I was able to track it down pretty quickly after your suggestion 🙇
k
Oh that’s good to hear. Nice job!