# prefect-community
k
I'm running a Prefect 2.0 Hello World workflow on a GKE cluster using the image `prefecthq/prefect:2.0.2-python3.10`. When the flow tries to access GCS, it appears that `gcsfs` is not installed, and I'm getting the following error from my flow. Should I build a custom image that installs `gcsfs`?
Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/fsspec/registry.py", line 232, in get_filesystem_class
    register_implementation(protocol, _import_class(bit["class"]))
  File "/usr/local/lib/python3.10/site-packages/fsspec/registry.py", line 255, in _import_class
    mod = importlib.import_module(mod)
  File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'gcsfs'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/filesystems.py", line 312, in filesystem
    self._filesystem = fsspec.filesystem(scheme, **self.settings)
  File "/usr/local/lib/python3.10/site-packages/fsspec/registry.py", line 265, in filesystem
    cls = get_filesystem_class(protocol)
  File "/usr/local/lib/python3.10/site-packages/fsspec/registry.py", line 234, in get_filesystem_class
    raise ImportError(bit["err"]) from e
ImportError: Please install gcsfs to access Google Storage

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 247, in retrieve_flow_then_begin_flow_run
    flow = await load_flow_from_flow_run(flow_run, client=client)
  File "/usr/local/lib/python3.10/site-packages/prefect/client.py", line 104, in with_injected_client
    return await fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments.py", line 47, in load_flow_from_flow_run
    await storage_block.get_directory(from_path=None, local_path=".")
  File "/usr/local/lib/python3.10/site-packages/prefect/filesystems.py", line 454, in get_directory
    return await self.filesystem.get_directory(
  File "/usr/local/lib/python3.10/site-packages/prefect/filesystems.py", line 251, in get_directory
    return self.filesystem.get(from_path, local_path, recursive=True)
  File "/usr/local/lib/python3.10/site-packages/prefect/filesystems.py", line 315, in filesystem
    raise RuntimeError(
RuntimeError: File system created with scheme 'gcs' from base path 'gcs://<bucket>/deployments' could not be created. You are likely missing a Python module required to use the given storage protocol.
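The root cause in the traceback above is the first `ModuleNotFoundError`: `gcsfs` cannot be imported inside the container. A minimal sketch for confirming that from within the running image, using only the standard library (the helper name `missing_modules` is hypothetical, not part of Prefect or fsspec):

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # gcsfs is the module fsspec tries to import for the 'gcs' scheme,
    # as shown by the ModuleNotFoundError in the traceback.
    missing = missing_modules(["gcsfs"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All storage modules are importable.")
```

Running this in the same image the flow run uses should show whether the package made it into the environment before the flow is retrieved.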
j
Libraries that interact with storage have been removed from our base image due to dependency issues. You can set it up really quickly with the `EXTRA_PIP_PACKAGES` setting.
k
Does this look correct?
j
`env` is a dict that holds `EXTRA_PIP_PACKAGES`.
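A rough sketch of what that dict could look like, assuming the value is a space-separated string of package names that the official image installs with pip when the container starts. The helper `build_job_env` is hypothetical, and where the resulting dict gets passed (for example, the `env` field of the infrastructure used by the deployment) is an assumption that depends on your setup:

```python
def build_job_env(extra_packages, base_env=None):
    """Return an env dict with EXTRA_PIP_PACKAGES set to a space-separated package list."""
    env = dict(base_env or {})  # copy so the caller's dict is not mutated
    env["EXTRA_PIP_PACKAGES"] = " ".join(extra_packages)
    return env


# Assumption: this dict is handed to whatever runs the flow's container.
env = build_job_env(["gcsfs"])
print(env)  # {'EXTRA_PIP_PACKAGES': 'gcsfs'}
```

If the variable is set somewhere the container never sees, such as on the wrong object or only in the local shell, the package will not be installed at startup.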
k
Okay, I tried putting it in the dict and the run is still missing `gcsfs`. I must be defining it in the wrong place...
Building the custom Dockerfile and pushing the image to Artifact Registry worked just fine.
n
@Keith, you mean you had to build the image yourself with gcsfs, instead of `EXTRA_PIP_PACKAGES` being installed in the image by Prefect?
k
Yes @Nate, I couldn't get `EXTRA_PIP_PACKAGES` to install the package. I'm not sure I was putting the config in the correct location.
I built the simplest possible Dockerfile:
FROM prefecthq/prefect:2.0.2-python3.10
RUN /usr/local/bin/python -m pip install --upgrade pip
RUN pip install gcsfs