# prefect-server
m
Hi, I'm trying to use GCSResult as a result handler for my flow (we're using a docker agent). Whenever I set the result handler, I get the below error (and whenever I do not specify a result handler, the flow runs fine). I've installed the google python library in the docker image and set my GCP credentials as a Prefect Secret. Do you have any ideas about the issue here?
k
Is this on Coiled? Do you have the extra installed in your coiled software environment?
m
this is on coiled (hello again 🙂 ) - I have the google python client installed both in the prefect docker image and the coiled software environment
k
How did you do it in the Coiled software env?
m
coiled.create_software_environment(
    name="prefect-executor-v2",
    container="njgreg/greg-smith-prefect-executor",
    pip=['numpy','google-api-python-client']
)
I also tried installing google-cloud with pip too - I get the same error
I also tried installing these three libraries
'google-auth','google-auth-httplib2', 'google-api-python-client'
same error (installed both in the coiled environment and the prefect Docker image)
k
What is your RunConfig? Docker?
m
Yes, docker
k
Does this error happen with LocalExecutor?
m
No, just the Coiled Dask Executor
It ran fine with the LocalExecutor
k
Ok I think it’s clear something is wrong with the software env right? I would see if Coiled can give you insight on that
m
Sounds good - I'll try over there. Thanks!
When I do flow.run() locally (with the DaskExecutor on Coiled) it executes fine. When I try to launch the flow from Prefect Cloud, that's when I get the error. Do you think this is still a Coiled issue? Or is it worth troubleshooting more on the Prefect side?
k
No, that sounds like it's not on the Coiled side. Can you exec into your container then and try the import?
Sorry, when you tried Local Executor, was that with DockerRun and an Agent? I think that would be the thing to try
m
It was with DockerRun and an Agent
Let me try on the container - I've got to run to a meeting but I'll let you know how it goes and keep you posted
thanks again for the help
So this runs fine
import prefect
from prefect import task, Flow
from prefect.executors import DaskExecutor
import coiled
from prefect.storage import Docker

coiled.create_software_environment(
    name="prefect-executor-v3",
    container="njgreg/greg-smith-prefect-executor",
    pip=['numpy','google-auth','google-auth-httplib2', 'google-api-python-client']
)

@task
def hello_task():
    from google.oauth2.service_account import Credentials
    logger = prefect.context.get("logger")
    logger.info("Hello world!")

flow = Flow("test-coiled-execution", tasks=[hello_task])

executor = DaskExecutor(
    cluster_class=coiled.Cluster,
    cluster_kwargs={
        "software": "prefect-executor-v3",
        "shutdown_on_close": False,
        "name": "prefect-executor",
    },
)

flow.executor = executor
flow.storage = Docker(
    registry_url="",
    python_dependencies=['coiled','pandas','s3fs','blosc','lz4','google-auth','google-auth-httplib2', 'google-api-python-client'],
)
# flow.result = gcs_result

flow.register('tester')
and I'm able to trigger this flow from the Prefect Cloud UI
(I added the import Credentials piece that was throwing the error)
It's when I uncomment this
# flow.result = gcs_result
I get the import error when trying to trigger it from the Web UI
k
Can you try adding google-cloud-storage?
m
Same error 😞
k
I am puzzled because one of the following should NOT work, right?
1. Local execution but DaskExecutor with the software env
2. Agent execution with LocalExecutor
Because there are 3 parts: the agent, the flow container, and the executor. I don’t think the agent needs it since it spins up a container. The flow container needs it to talk to Dask, and the Dask cluster needs it to persist work. Number 1 above tests the cluster installation. Number 2 tests the container installation. Could you exec into your container and try importing it?
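If it helps, here is a minimal sketch of a check you could run in each of those places (inside the flow container, and inside a task so it runs on a Dask worker). `probe_import` is a made-up helper name, not part of Prefect or Coiled. It only tells you whether the module imports and which interpreter and environment ran the check:

```python
import importlib
import sys


def probe_import(module_name):
    """Try to import module_name and report which interpreter ran the check."""
    report = {
        "module": module_name,
        "executable": sys.executable,  # path of the running Python binary
        "prefix": sys.prefix,          # root of the active environment
        "importable": True,
        "error": None,
    }
    try:
        importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        report["importable"] = False
        report["error"] = repr(exc)
    return report


if __name__ == "__main__":
    # The import that the flow is failing on.
    print(probe_import("google.oauth2.service_account"))
```

If the `prefix` differs between the place where the import works and the place where it fails, that would point at two different environments rather than a missing pip install.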
m
I exec into the container and I was able to run the following in python
from google.oauth2.service_account import Credentials
I'll try the coiled container now
that also worked in the coiled container
k
Sometimes there are two installations of Python in a container. It would help if you could check that there is only one in the Docker container
where python
But I can’t believe everything else would be working and just the google library is not
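In case the shell tools are missing from the image, a rough sketch of the same check in plain Python is below. `find_interpreters` is a hypothetical helper, and it only looks at PATH, so it will not find interpreters that are installed elsewhere (for example an inactive conda env):

```python
import os
import sys


def find_interpreters(names=("python", "python3")):
    """Return every executable on PATH whose file name is in names."""
    found = []
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for name in names:
            candidate = os.path.join(directory, name)
            is_executable = os.path.isfile(candidate) and os.access(candidate, os.X_OK)
            if is_executable and candidate not in found:
                found.append(candidate)
    return found


if __name__ == "__main__":
    print("running interpreter:", sys.executable)
    for path in find_interpreters():
        # realpath resolves symlinks, so aliases of one binary show the same target
        print(path, "->", os.path.realpath(path))
```

More than one distinct target in the output would suggest that packages might be getting installed into a different Python than the one running the flow.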
m
I don't have where in the container
this is what I get for which
I guess this may be a coiled error?
I just tried the local dask executor and it worked fine
It's odd that it can execute on coiled when run locally though
k
I’m as confused as you are…
Wait, are you using the adaptive cluster? Should we try without that?
m
Yeah I was - I can try without it
It didn't work 😞
but I think you're right that it's on the coiled side
looking at the error message again it seems to refer to a coiled conda env
Unexpected error: ModuleNotFoundError("No module named 'google'")
Traceback (most recent call last):
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/task_runner.py", line 926, in get_task_run_state
    result = self.result.write(value, **formatting_kwargs)
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/results/gcs_result.py", line 77, in write
    self.gcs_bucket.blob(new.location).upload_from_string(binary_data)
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/results/gcs_result.py", line 39, in gcs_bucket
    from prefect.utilities.gcp import get_storage_client
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/utilities/gcp.py", line 6, in <module>
    from google.oauth2.service_account import Credentials
ModuleNotFoundError: No module named 'google'
So I'm going to try to fix that in the image and see if it works
thanks for the help on this
k
of course!
Can you try "prefect[google]" under the pip section of the Coiled software env?
m
same error
I even tried putting this in the first task in the script
import pip
pip.main(['install', 'google-auth-oauthlib'])
pip.main(['install', 'google-api-python-client'])
pip.main(['install', 'google-auth-httplib2'])
same error
k
Just confirming it works with LocalDaskExecutor but not with the Coiled DaskExecutor?
m
yeah
Maybe we need to set the PYTHONPATH or PYTHONHOME environment variables?
I noticed those are not set in the container, so maybe it's looking for the Google auth library in the wrong place
k
that’s certainly worth exploring. Pip install should put it in the PYTHONPATH though
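One note on installing from inside a task: pip does not support being called as a library, so `pip.main` is not a reliable entry point. Below is a sketch that goes through the running interpreter instead, so the package lands in whichever environment is actually executing the task. The helper names are made up for illustration, and this is a debugging aid rather than a fix, since I'd assume it only affects the one process or worker that runs it:

```python
import subprocess
import sys


def pip_install_command(packages):
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", *packages]


def install(packages):
    """Run the install; raises CalledProcessError if pip exits non-zero."""
    subprocess.check_call(pip_install_command(packages))


# Example call (not executed here):
# install(["google-auth", "google-cloud-storage"])
```

If this succeeds on a worker and the import still fails afterwards, that would again suggest the failing code is running under a different environment.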
m
this didn't work either 😞
I outputted all the installed packages to the logs and the google ones are not appearing
import pkg_resources
import prefect

installed_packages = pkg_resources.working_set
installed_packages_list = sorted(
    "%s==%s" % (i.key, i.version) for i in installed_packages
)
prefect.context.logger.info(installed_packages_list)
and the log output
['aiobotocore==2.0.1', 'aiohttp==3.8.1', 'aioitertools==0.8.0', 'aiosignal==1.2.0', 'appdirs==1.4.4', 'asn1crypto==1.4.0', 'async-timeout==4.0.1', 'attrs==21.2.0', 'backcall==0.2.0', 'backoff==1.11.1', 'backports.functools-lru-cache==1.6.4', 'blosc==1.10.2', 'bokeh==2.4.2', 'boto3==1.19.8', 'botocore==1.22.8', 'brotlipy==0.7.0', 'certifi==2021.10.8', 'cffi==1.15.0', 'charset-normalizer==2.0.9', 'click==8.0.3', 'cloudpickle==2.0.0', 'coiled==0.0.58', 'colorama==0.4.4', 'commonmark==0.9.1', 'croniter==0.3.36', 'cryptography==36.0.0', 'cytoolz==0.11.2', 'dask==2021.11.2', 'dataclasses==0.8', 'decorator==5.1.0', 'distributed==2021.11.2', 'docker-pycreds==0.4.0', 'docker==5.0.3', 'frozenlist==1.2.0', 'fsspec==2021.11.1', 'future==0.18.2', 'heapdict==1.0.1', 'idna==3.1', 'importlib-resources==5.4.0', 'ipython==7.30.1', 'jedi==0.18.1', 'jinja2==3.0.3', 'jmespath==0.10.0', 'locket==0.2.0', 'lz4==3.1.10', 'markupsafe==2.0.1', 'marshmallow-oneofschema==3.0.1', 'marshmallow==3.14.1', 'matplotlib-inline==0.1.3', 'msgpack==1.0.3', 'multidict==5.2.0', 'mypy-extensions==0.4.3', 'natsort==8.0.0', 'numpy==1.21.4', 'olefile==0.46', 'packaging==21.3', 'pandas==1.3.4', 'parso==0.8.3', 'partd==1.2.0', 'pendulum==2.1.2', 'pexpect==4.8.0', 'pickleshare==0.7.5', 'pillow==8.4.0', 'pip==21.3.1', 'prefect==0.15.10', 'prompt-toolkit==3.0.23', 'psutil==5.8.0', 'ptyprocess==0.7.0', 'pycparser==2.21', 'pygments==2.10.0', 'pyopenssl==21.0.0', 'pyparsing==3.0.6', 'pysocks==1.7.1', 'python-box==5.4.1', 'python-dateutil==2.8.2', 'python-slugify==5.0.2', 'pytz==2021.3', 'pytzdata==2020.1', 'pyyaml==6.0', 'requests==2.26.0', 'rich==10.15.2', 'ruamel.yaml.clib==0.2.6', 'ruamel.yaml==0.17.17', 's3fs==2021.11.1', 's3transfer==0.5.0', 'setuptools==59.4.0', 'six==1.16.0', 'sortedcontainers==2.4.0', 'tabulate==0.8.9', 'tblib==1.7.0', 'text-unidecode==1.3', 'toml==0.10.2', 'toolz==0.11.2', 'tornado==6.1', 'traitlets==5.1.1', 'typing-extensions==4.0.1', 'unidecode==1.3.2', 'urllib3==1.26.7', 'wcwidth==0.2.5', 
'websocket-client==1.2.3', 'wheel==0.37.0', 'wrapt==1.13.3', 'yarl==1.7.2', 'zict==2.0.0', 'zipp==3.6.0']
k
I think this is getting to be so much trouble that maybe you should just consider making your own image?
m
I did make my own image
it's the same result - one of prefect or coiled can't locate the google libraries that I need to authenticate to our GCP setup
k
Are you still using the software environment? Maybe you can pass the image to that instead of building it?
m
Yeah - I've tried that as well
k
When you logged the packages, was it just with the normal executor?
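For comparing environments, it may be easier to log only what is missing instead of the full list. A small sketch, assuming Python 3.8+ (which matches the traceback paths); `missing_distributions` is a hypothetical helper, and the names in `wanted` are just the packages mentioned in this thread:

```python
from importlib import metadata  # in the standard library since Python 3.8


def missing_distributions(required):
    """Return the names in required that have no installed distribution."""
    missing = []
    for name in required:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    wanted = [
        "google-auth",
        "google-auth-httplib2",
        "google-api-python-client",
        "google-cloud-storage",
    ]
    print(missing_distributions(wanted))
```

Running it once with each executor, and once in the container directly, should show which environment is the one without the Google packages.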
a
Can you share your Dockerfile so that we can have a look? Regarding authenticating with your GCP resources - is this only a package dependency issue, or is it an issue with a missing service account?
@Michael Ulin I think the most helpful thing you can do right now is to separate the dependencies required by your Prefect flow from the dependencies needed by Dask (and thus Coiled). If you move the imports of custom GCP modules etc. into the tasks that will be executed on Dask, and include only Prefect-relevant dependencies in your Prefect flow imports, it will be easier to separate what goes into the Coiled software environment from what goes into the Prefect Docker image built when registering the flow with Docker storage. Additionally, you could then potentially get rid of the custom image here:
coiled.create_software_environment(
    name="prefect-executor-v3",
    container="njgreg/greg-smith-prefect-executor",  # comment this line out
    # put all dependencies into the pip list below:
    pip=['numpy','google-auth','google-auth-httplib2', 'google-api-python-client']
)
I think taking it slowly and separating dependencies, then potentially building your own Prefect image with a custom Dockerfile, will solve the issue.
m
Thanks @Anna Geller and @Kevin Kho. I'm talking to folks at Coiled tomorrow about this. Appreciate the help and support here.