Michael Ulin

12/14/2021, 9:45 PM
Hi, I'm trying to use GCSResult as a result handler for my flow (we're using a Docker agent). Whenever I set the result handler, I get the error below (and whenever I do not specify a result handler, the flow runs fine). I've installed the Google Python library in the Docker image and set my GCP credentials as a Prefect Secret. Do you have any ideas about the issue here?

Kevin Kho

12/14/2021, 9:47 PM
Is this on Coiled? Do you have the extra installed in your coiled software environment?

Michael Ulin

12/14/2021, 9:56 PM
this is on coiled (hello again 🙂 ) - I have the google python client installed both in the prefect docker image and the coiled software environment

Kevin Kho

12/14/2021, 10:02 PM
How did you do it in the Coiled software env?

Michael Ulin

12/14/2021, 10:03 PM
coiled.create_software_environment(
    name="prefect-executor-v2",
    container="njgreg/greg-smith-prefect-executor",
    pip=['numpy','google-api-python-client']
)
I also tried installing google-cloud with pip too - I get the same error
I also tried installing these three libraries
'google-auth','google-auth-httplib2', 'google-api-python-client'
same error (installed both in the coiled environment and the prefect Docker image)

Kevin Kho

12/14/2021, 10:17 PM
What is your RunConfig? Docker?

Michael Ulin

12/14/2021, 10:17 PM
Yes, docker

Kevin Kho

12/14/2021, 10:20 PM
Does this error happen with LocalExecutor?

Michael Ulin

12/14/2021, 10:23 PM
No, just the Coiled Dask Executor
It ran fine with the LocalExecutor

Kevin Kho

12/14/2021, 10:24 PM
Ok I think it’s clear something is wrong with the software env right? I would see if Coiled can give you insight on that

Michael Ulin

12/14/2021, 10:25 PM
Sounds good - I'll try over there. Thanks!
When I do flow.run() locally (with the DaskExecutor on Coiled) it executes fine. When I try to launch the flow from Prefect Cloud, that's when I get the error. Do you think this is still a Coiled issue? Or is it worth troubleshooting more on the Prefect side?

Kevin Kho

12/14/2021, 10:37 PM
No, that sounds like it's not on the Coiled side. Can you exec into your container then and try the import?
Sorry, when you tried Local Executor, was that with DockerRun and an Agent? I think that would be the thing to try

Michael Ulin

12/14/2021, 10:38 PM
It was with DockerRun and an Agent
Let me try on the container - I've got to run to a meeting but I'll let you know how it goes and keep you posted
thanks again for the help
So this runs fine
import prefect
from prefect import task, Flow
from prefect.executors import DaskExecutor
import coiled
from prefect.storage import Docker

coiled.create_software_environment(
    name="prefect-executor-v3",
    container="njgreg/greg-smith-prefect-executor",
    pip=['numpy','google-auth','google-auth-httplib2', 'google-api-python-client']
)

@task
def hello_task():
    from google.oauth2.service_account import Credentials
    logger = prefect.context.get("logger")
    logger.info("Hello world!")

flow = Flow("test-coiled-execution", tasks=[hello_task])

executor = DaskExecutor(
    cluster_class=coiled.Cluster,
    cluster_kwargs={
        "software": "prefect-executor-v3",
        "shutdown_on_close": False,
        "name": "prefect-executor",
    },
)

flow.executor = executor
flow.storage = Docker(
    registry_url="",
    python_dependencies=['coiled','pandas','s3fs','blosc','lz4','google-auth','google-auth-httplib2', 'google-api-python-client'],
)
# flow.result = gcs_result

flow.register('tester')
and I'm able to trigger this flow from the Prefect Cloud UI
(I added the import Credentials piece that was throwing the error)
It's when I uncomment this
# flow.result = gcs_result
I get the import error when trying to trigger it from the Web UI

Kevin Kho

12/14/2021, 10:56 PM
Can you try adding google-cloud-storage?

Michael Ulin

12/15/2021, 12:19 AM
Same error 😞

Kevin Kho

12/15/2021, 1:40 AM
I am puzzled because one of the following should NOT work right?
1. Local execution but DaskExecutor with the software env
2. Agent execution with LocalExecutor
Because there are 3 parts. The agent, the flow container, and the executor. I don’t think the agent needs it since it spins up a container. The flow container needs it to talk to Dask, and the Dask cluster needs it to persist work. Number 1 above tests cluster installation. Number 2 tests the container installation. Could you exec into your container and try importing it?
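The import check suggested above can be run identically in each environment (the flow container and the Dask workers) so the results are comparable. The following is a minimal sketch, not part of the original thread; `probe_module` is a hypothetical helper name, and `google.oauth2` is used because it is the import reported as failing.

```python
import importlib.util
import sys


def probe_module(name):
    """Report whether `name` is importable by the interpreter running this code."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # find_spec raises when a parent package is missing
        spec = None
    return {
        "module": name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    # the import that the thread reports as failing
    print(probe_module("google.oauth2"))
```

Logging the `executable` field alongside the result shows which interpreter answered, which matters if an image contains more than one Python.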

Michael Ulin

12/15/2021, 1:51 AM
I exec'd into the container and was able to run the following in Python
from google.oauth2.service_account import Credentials
I'll try the coiled container now
that also worked in the coiled container

Kevin Kho

12/15/2021, 1:54 AM
Sometimes there are two installations of Python in some containers. It would help if you could check there is only one in the Docker container
where python
But I can’t believe everything else would be working and just the google library is not
😢 1
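`where` is a Windows command and may be missing from a Linux container. As a rough alternative, the same check can be sketched in Python by scanning `PATH` for interpreter names. `find_interpreters` is a hypothetical helper, and the default names searched are an assumption.

```python
import os
import sys


def find_interpreters(names=("python", "python3")):
    """Return every executable on PATH whose file name is in `names`, in PATH order."""
    found = []
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        if not directory:
            continue
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                # resolve symlinks so two names for one binary count once
                real = os.path.realpath(candidate)
                if real not in found:
                    found.append(real)
    return found


if __name__ == "__main__":
    print("running:", os.path.realpath(sys.executable))
    for path in find_interpreters():
        print("on PATH:", path)
```

More than one entry, or a running interpreter that is not the first entry, would suggest packages are being installed for a different Python than the one executing the flow.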

Michael Ulin

12/15/2021, 1:56 AM
I don't have where in the container
this is what I get for which
I guess this may be a coiled error?
I just tried the local dask executor and it worked fine
It's odd that it can execute on coiled when run locally though

Kevin Kho

12/15/2021, 2:02 AM
I’m as confused as you are…
Wait, are you using the adaptive cluster? Should we try without that?

Michael Ulin

12/15/2021, 2:03 AM
Yeah I was - I can try without it
It didn't work 😞
but I think you're right that it's on the coiled side
looking at the error message again it seems to refer to a coiled conda env
Unexpected error: ModuleNotFoundError("No module named 'google'")
Traceback (most recent call last):
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/task_runner.py", line 926, in get_task_run_state
    result = self.result.write(value, **formatting_kwargs)
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/results/gcs_result.py", line 77, in write
    self.gcs_bucket.blob(new.location).upload_from_string(binary_data)
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/results/gcs_result.py", line 39, in gcs_bucket
    from prefect.utilities.gcp import get_storage_client
  File "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/utilities/gcp.py", line 6, in <module>
    from google.oauth2.service_account import Credentials
ModuleNotFoundError: No module named 'google'
So I'm going to try to fix that in the image and see if it works
thanks for the help on this
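The traceback's file paths show which environment Prefect was loaded from. A small sketch, assuming the usual Linux layout of `<prefix>/lib/pythonX.Y/site-packages/`, extracts that prefix so it can be compared with `sys.prefix` in the container where the import succeeded. `env_root` is a hypothetical helper name.

```python
import sys
from pathlib import PurePosixPath


def env_root(module_file):
    """Return the environment prefix for a file under <prefix>/lib/pythonX.Y/site-packages/."""
    parts = PurePosixPath(module_file).parts
    if "site-packages" not in parts:
        return None
    index = parts.index("site-packages")
    if index < 2:
        return None
    # drop "lib/pythonX.Y/site-packages" to reach the environment prefix
    return str(PurePosixPath(*parts[: index - 2]))


# path copied from the first frame of the traceback
traceback_file = "/opt/conda/envs/coiled/lib/python3.8/site-packages/prefect/engine/runner.py"
print(env_root(traceback_file))  # /opt/conda/envs/coiled
print(sys.prefix)                # prefix of the interpreter running this check
```

If the two printed values differ, the shell used for the manual import test was not using the environment that the worker runs in.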

Kevin Kho

12/15/2021, 2:31 AM
of course!
🙌 1
Can you try "prefect[google]" under the pip section of the Coiled software env?

Michael Ulin

12/15/2021, 5:25 PM
same error
I even tried putting this in the first task in the script
import pip
pip.main(['install', 'google-auth-oauthlib'])
pip.main(['install', 'google-api-python-client'])
pip.main(['install', 'google-auth-httplib2'])
same error
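For installs triggered from inside Python, pip's documentation recommends running pip as a subprocess of the current interpreter rather than calling into it as a library. A minimal sketch of that form follows; the helper names are hypothetical, and nothing here guarantees it resolves the error in this thread, since a package installed in one process does not reach other workers or environments.

```python
import subprocess
import sys


def pip_install_command(packages):
    """Build a pip command bound to the interpreter that is running this code."""
    return [sys.executable, "-m", "pip", "install", *packages]


def pip_install(packages):
    """Run the install; raises CalledProcessError if pip exits non-zero."""
    subprocess.check_call(pip_install_command(packages))


if __name__ == "__main__":
    # print the command only; call pip_install(...) to actually run it
    print(pip_install_command(["google-auth", "google-cloud-storage"]))
```

Using `sys.executable` ties the install to the interpreter executing the task, which avoids installing into a second Python that happens to be first on `PATH`.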

Kevin Kho

12/15/2021, 5:32 PM
Just confirming it works with LocalDaskExecutor but not with the DaskExecutor on Coiled?

Michael Ulin

12/15/2021, 5:57 PM
yeah
Maybe we need to set the PYTHONPATH or PYTHONHOME environment variables?
I noticed those are not set in the container, so maybe it's looking for the Google auth library in the wrong place

Kevin Kho

12/15/2021, 6:03 PM
that’s certainly worth exploring. pip install should put it in site-packages, which is already on sys.path, though
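To see where an interpreter actually looks for packages, the relevant settings can be collected and logged from inside a task. This is a sketch under the assumption that comparing the output between the container and the worker is the goal; `path_report` is a hypothetical helper name.

```python
import os
import sys


def path_report():
    """Collect the settings that decide where this interpreter looks for packages."""
    return {
        "PYTHONPATH": os.environ.get("PYTHONPATH"),  # None when unset
        "PYTHONHOME": os.environ.get("PYTHONHOME"),  # None when unset
        "prefix": sys.prefix,
        "site_packages": [p for p in sys.path if p.endswith("site-packages")],
    }


if __name__ == "__main__":
    for key, value in path_report().items():
        print(f"{key}: {value}")
```

An unset `PYTHONPATH` is normal; the `site_packages` entries are what show whether the directory pip installed into is visible to the running process.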

Michael Ulin

12/16/2021, 7:05 PM
this didn't work either 😞
I outputted all the installed packages to the logs and the google ones are not appearing
import pkg_resources
# list every distribution visible to this interpreter
installed_packages = pkg_resources.working_set
installed_packages_list = sorted(["%s==%s" % (i.key, i.version)
    for i in installed_packages])
prefect.context.logger.info(installed_packages_list)
and the log output
['aiobotocore==2.0.1', 'aiohttp==3.8.1', 'aioitertools==0.8.0', 'aiosignal==1.2.0', 'appdirs==1.4.4', 'asn1crypto==1.4.0', 'async-timeout==4.0.1', 'attrs==21.2.0', 'backcall==0.2.0', 'backoff==1.11.1', 'backports.functools-lru-cache==1.6.4', 'blosc==1.10.2', 'bokeh==2.4.2', 'boto3==1.19.8', 'botocore==1.22.8', 'brotlipy==0.7.0', 'certifi==2021.10.8', 'cffi==1.15.0', 'charset-normalizer==2.0.9', 'click==8.0.3', 'cloudpickle==2.0.0', 'coiled==0.0.58', 'colorama==0.4.4', 'commonmark==0.9.1', 'croniter==0.3.36', 'cryptography==36.0.0', 'cytoolz==0.11.2', 'dask==2021.11.2', 'dataclasses==0.8', 'decorator==5.1.0', 'distributed==2021.11.2', 'docker-pycreds==0.4.0', 'docker==5.0.3', 'frozenlist==1.2.0', 'fsspec==2021.11.1', 'future==0.18.2', 'heapdict==1.0.1', 'idna==3.1', 'importlib-resources==5.4.0', 'ipython==7.30.1', 'jedi==0.18.1', 'jinja2==3.0.3', 'jmespath==0.10.0', 'locket==0.2.0', 'lz4==3.1.10', 'markupsafe==2.0.1', 'marshmallow-oneofschema==3.0.1', 'marshmallow==3.14.1', 'matplotlib-inline==0.1.3', 'msgpack==1.0.3', 'multidict==5.2.0', 'mypy-extensions==0.4.3', 'natsort==8.0.0', 'numpy==1.21.4', 'olefile==0.46', 'packaging==21.3', 'pandas==1.3.4', 'parso==0.8.3', 'partd==1.2.0', 'pendulum==2.1.2', 'pexpect==4.8.0', 'pickleshare==0.7.5', 'pillow==8.4.0', 'pip==21.3.1', 'prefect==0.15.10', 'prompt-toolkit==3.0.23', 'psutil==5.8.0', 'ptyprocess==0.7.0', 'pycparser==2.21', 'pygments==2.10.0', 'pyopenssl==21.0.0', 'pyparsing==3.0.6', 'pysocks==1.7.1', 'python-box==5.4.1', 'python-dateutil==2.8.2', 'python-slugify==5.0.2', 'pytz==2021.3', 'pytzdata==2020.1', 'pyyaml==6.0', 'requests==2.26.0', 'rich==10.15.2', 'ruamel.yaml.clib==0.2.6', 'ruamel.yaml==0.17.17', 's3fs==2021.11.1', 's3transfer==0.5.0', 'setuptools==59.4.0', 'six==1.16.0', 'sortedcontainers==2.4.0', 'tabulate==0.8.9', 'tblib==1.7.0', 'text-unidecode==1.3', 'toml==0.10.2', 'toolz==0.11.2', 'tornado==6.1', 'traitlets==5.1.1', 'typing-extensions==4.0.1', 'unidecode==1.3.2', 'urllib3==1.26.7', 'wcwidth==0.2.5', 
'websocket-client==1.2.3', 'wheel==0.37.0', 'wrapt==1.13.3', 'yarl==1.7.2', 'zict==2.0.0', 'zipp==3.6.0']
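The same listing can be produced with `importlib.metadata`, which is in the standard library from Python 3.8, and filtered so that only the packages in question are logged. A minimal sketch follows; `installed` is a hypothetical helper name.

```python
from importlib import metadata


def installed(prefix=""):
    """Return sorted 'name==version' strings for distributions whose name starts with prefix."""
    rows = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name and name.lower().startswith(prefix.lower()):
            rows.add(f"{name.lower()}=={dist.version}")
    return sorted(rows)


if __name__ == "__main__":
    # an empty list means no google-* distributions are visible to this interpreter
    print(installed("google"))
```

Filtering keeps the log line short enough to compare by eye across the flow container and the workers.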

Kevin Kho

12/16/2021, 7:09 PM
I think this is getting to be so much trouble that maybe you should just consider making your own image?

Michael Ulin

12/16/2021, 7:22 PM
I did make my own image
it's the same result - one of prefect or coiled can't locate the google libraries that I need to authenticate to our GCP setup

Kevin Kho

12/16/2021, 7:28 PM
Are you still using the software environment? Maybe you can pass the image to that instead of building it?

Michael Ulin

12/16/2021, 7:29 PM
Yeah - I've tried that as well

Kevin Kho

12/16/2021, 7:31 PM
When you logged the packages, was it just with the normal executor?

Anna Geller

12/16/2021, 7:37 PM
Can you share your Dockerfile so that we can have a look? Regarding authenticating with your GCP resources - is this only a package dependency issue or an issue with missing service account?
@Michael Ulin I think the most helpful thing you can do right now is to separate the dependencies required by your Prefect flow from the dependencies needed by Dask (and thus Coiled). If you move the imports of custom GCP modules etc. into the tasks that will be executed on Dask, and include only Prefect-relevant dependencies in your Prefect flow imports, it will be easier to separate what goes into the Coiled software environment from what goes into the Prefect Docker image built when registering the flow with Docker storage. Additionally, you could then potentially get rid of the custom image here:
coiled.create_software_environment(
    name="prefect-executor-v3",
    container="njgreg/greg-smith-prefect-executor",  # comment this line out
    # put all dependencies into pip list below:
    pip=['numpy','google-auth','google-auth-httplib2', 'google-api-python-client']
)
I think taking it slowly and separating dependencies, then potentially building your own Prefect image with a custom Dockerfile, will solve the issue.
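The idea of moving imports into the tasks can be sketched with a small helper that imports at call time and, on failure, names the interpreter that could not find the module. This is an illustration only: `require` and `serialize` are hypothetical names, and `json` stands in for a GCP module so the example runs anywhere.

```python
import importlib
import sys


def require(module_name):
    """Import a module at call time and fail with a message that names the interpreter."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is not importable from {sys.executable}: {exc}"
        ) from exc


def serialize(payload):
    # Stand-in task body. In a flow this would be a task function, and it
    # would require a GCP module instead of json.
    json = require("json")
    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    print(serialize({"b": 1, "a": 2}))
```

Because the import runs where the function runs, only that environment needs the package, and the error message identifies which environment is missing it.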

Michael Ulin

12/17/2021, 12:55 AM
Thanks @Anna Geller and @Kevin Kho. I'm talking to folks at Coiled tomorrow about this. Appreciate the help and support here.
🙌 1