# ask-community
l
Hi, I always get this error with the package, even though I have installed it several times. Thanks for helping me!
[2021-03-01 18:24:40+0100] ERROR - prefect.FlowRunner | Unexpected error: ModuleNotFoundError("No module named 'oauthlib'")
Traceback (most recent call last):
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\engine\runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\engine\flow_runner.py", line 619, in get_flow_run_state
    final_states = executor.wait(
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\executors\dask.py", line 397, in wait
    return self.client.gather(futures)
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 1993, in gather
    return self.sync(
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 839, in sync
    return sync(
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\utils.py", line 340, in sync
    raise exc.with_traceback(tb)
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\utils.py", line 324, in f
    result[0] = yield future
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\tornado\gen.py", line 762, in run
    value = future.result()
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 1858, in _gather
    raise exception.with_traceback(traceback)
  File "/opt/conda/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 75, in loads
ModuleNotFoundError: No module named 'oauthlib'
[2021-03-01 18:24:40+0100] ERROR - prefect.billwerk-contracts | Unexpected error occured in FlowRunner: ModuleNotFoundError("No module named 'oauthlib'")
this is my flow run:
flow.storage = Docker(registry_url="gcr.io/keller-steering-enabling/flows", image_name="billwerk-contracts", python_dependencies=["pandas"])
flow.run_config = KubernetesRun()
flow.run(executor=executor)
flow.register('Billwerk')
z
Hi @Laura Vaida, this package looks to be missing from the python_dependencies list in your flow storage configuration.
It needs to be installed in the Docker image that your flow is run in.
l
flow.storage = Docker(registry_url="gcr.io/keller-steering-enabling/flows", image_name="billwerk-contracts", python_dependencies=["pandas", "oauthlib"])
flow.run_config = KubernetesRun()
flow.run(executor=executor)
flow.register('Billwerk')
Like that? It gives the same error.
from oauthlib.oauth2 import LegacyApplicationClient
from requests_oauthlib import OAuth2Session
I need these two packages.
z
Then you'll need requests_oauthlib in that list as well.
l
perfect, thanks!
Does it work like that in general? Not importing the packages, but declaring them as dependencies?
[2021-03-01 19:25:55+0100] ERROR - prefect.FlowRunner | Unexpected error: ModuleNotFoundError("No module named 'oauthlib'")
Traceback (most recent call last):
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\engine\runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\engine\flow_runner.py", line 619, in get_flow_run_state
    final_states = executor.wait(
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\executors\dask.py", line 397, in wait
    return self.client.gather(futures)
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 1993, in gather
    return self.sync(
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 839, in sync
    return sync(
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\utils.py", line 340, in sync
    raise exc.with_traceback(tb)
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\utils.py", line 324, in f
    result[0] = yield future
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\tornado\gen.py", line 762, in run
    value = future.result()
  File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 1858, in _gather
    raise exception.with_traceback(traceback)
  File "/opt/conda/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 75, in loads
ModuleNotFoundError: No module named 'oauthlib'
[2021-03-01 19:25:55+0100] ERROR - prefect.billwerk-contracts | Unexpected error occured in FlowRunner: ModuleNotFoundError("No module named 'oauthlib'")
Step 1/9 : FROM prefecthq/prefect:0.14.6-python3.8
---> 67415bcf7bcb
Step 2/9 : ENV PREFECT__USER_CONFIG_PATH='/opt/prefect/config.toml'
---> Using cache
---> 4ffc0b7b4b6b
Step 3/9 : RUN pip install pip --upgrade
---> Using cache
---> ee56fc384512
Step 4/9 : RUN pip show prefect || pip install git+https://github.com/PrefectHQ/prefect.git@0.14.6#egg=prefect[all_orchestration_extras]
---> Using cache
---> 28b18e331c74
Step 5/9 : RUN pip install pandas oauthlib requests_oauthlib wheel
---> Running in dd84ec5c9dbf
Collecting pandas
Downloading pandas-1.2.2-cp38-cp38-manylinux1_x86_64.whl (9.7 MB)
Requirement already satisfied: oauthlib in /usr/local/lib/python3.8/site-packages (3.1.0)
Requirement already satisfied: requests_oauthlib in /usr/local/lib/python3.8/site-packages (1.3.0)
Requirement already satisfied: wheel in /usr/local/lib/python3.8/site-packages (0.36.2)
Collecting numpy>=1.16.5
Downloading numpy-1.20.1-cp38-cp38-manylinux2010_x86_64.whl (15.4 MB)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.8/site-packages (from pandas) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.8/site-packages (from pandas) (2021.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
Requirement already satisfied: requests>=2.0.0 in /usr/local/lib/python3.8/site-packages (from requests_oauthlib) (2.25.1)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/site-packages (from requests>=2.0.0->requests_oauthlib) (1.26.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/site-packages (from requests>=2.0.0->requests_oauthlib) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/site-packages (from requests>=2.0.0->requests_oauthlib) (2020.12.5)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.8/site-packages (from requests>=2.0.0->requests_oauthlib) (4.0.0)
Installing collected packages: numpy, pandas
Successfully installed numpy-1.20.1 pandas-1.2.2
Removing intermediate container dd84ec5c9dbf
---> af3778938310
Step 6/9 : RUN mkdir -p /opt/prefect/
---> Running in df00c4a4b75b
Removing intermediate container df00c4a4b75b
---> 27c9c554a00f
Step 7/9 : COPY billwerk-contracts.flow /opt/prefect/flows/billwerk-contracts.prefect
---> 01d61c6debe5
Step 8/9 : COPY healthcheck.py /opt/prefect/healthcheck.py
---> 4cb89cafe90a
Step 9/9 : RUN python /opt/prefect/healthcheck.py '["/opt/prefect/flows/billwerk-contracts.prefect"]' '(3, 8)'
---> Running in 6e9b7cdccdb6
Beginning health checks...
System Version check: OK
Cloudpickle serialization check: OK
Result check: OK
Environment dependency check: OK
All health checks passed.
Removing intermediate container 6e9b7cdccdb6
---> 522099f92070
Successfully built 522099f92070
z
So, here you're calling both flow.run() and flow.register().
When you register a flow, you are setting it up for later execution as a KubernetesRun, which will use your Docker image with the proper dependencies installed.
When you run a flow with flow.run(), it will execute on your local system, where you must pip install ... your dependencies yourself.
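A minimal sketch of how one might confirm which packages the current interpreter can actually see before calling flow.run(). It uses only the standard library; the helper name missing_modules is hypothetical, and the module names are the two imports mentioned earlier in this thread. It only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot find."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names taken from the imports in this thread
    required = ["oauthlib", "requests_oauthlib"]
    print("Missing in this environment:", missing_modules(required))
```

Running this inside the same environment that runs the flow shows whether the ModuleNotFoundError could come from that environment at all.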
l
But I run it with executor = my load balancer, on Google Cloud Platform?
z
You can flow.run() with a Dask executor that may be running somewhere other than your local machine. In that case, you'd need your dependencies installed on the Dask workers as well.
Generally, you're better off using flow.register() and the run_config to manage those dependencies though.
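As a rough illustration of the point about Dask workers, the sketch below asks each worker which modules it cannot find. It assumes `client` is a connected distributed.Client, whose run method calls a function on every worker and returns a dict keyed by worker address; the helper names are hypothetical, and the scheduler address in the usage comment is a placeholder.

```python
def _missing_here(names):
    """Runs on a single worker: list the modules that worker cannot find."""
    import importlib.util  # imported inside so the function is self-contained when shipped

    return [n for n in names if importlib.util.find_spec(n) is None]


def missing_on_workers(client, names):
    """Map worker address -> missing modules, omitting workers that have everything.

    `client` is assumed to behave like distributed.Client, i.e.
    client.run(func, *args) returns {worker_address: func(*args)}.
    """
    report = client.run(_missing_here, list(names))
    return {addr: missing for addr, missing in report.items() if missing}


# Usage (assumption: a reachable Dask scheduler; the address is a placeholder):
# from distributed import Client
# client = Client("tcp://<scheduler-host>:8786")
# print(missing_on_workers(client, ["oauthlib", "requests_oauthlib"]))
```

An empty result would suggest the workers have the packages, so the missing dependency is elsewhere.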
l
Okay, cool
thanks!
[2021-03-01 19:25:55+0100] ERROR - prefect.FlowRunner | Unexpected error: ModuleNotFoundError("No module named 'oauthlib'")
So this refers to the local run then?
z
Yep
l
Hi @Zanie! Thanks for helping me so far! I still get that error when I kick it off from the UI.
#define executor as dask executor with tcp from load balancer
flow.executor = DaskExecutor(address="35.198.104.230:8786") # tcp:port address of dask load balancer
flow.storage = Docker(registry_url="gcr.io/keller-steering-enabling/flows", image_name="billwerk-contracts", python_dependencies=["pandas", "oauthlib ", "requests", "requests_oauthlib", "datetime", "oauth2client", "snowflake"])
flow.run_config = KubernetesRun()
flow.register('Billwerk')
this is the code
z
Since you're using a custom Dask executor, the flow is not being executed in the Docker image / k8s container. It's being executed wherever you have the Dask workers set up. You should not need to use a Dask cluster/load balancer here if you're using the KubernetesRun. Generally a DaskExecutor local to the job is sufficient.
l
Hi @Zanie, I'm sorry about that! My problem is that I'm located in Germany, so I was trying to get an answer as soon as possible because of working hours 🙈
Can you help me a bit with the configuration, to set it up in the Docker image / k8s container directly?
z
That example shows two options: using a Dask executor with flow.run(), i.e. you run your flow locally but the tasks are executed on the Dask workers on GCP, OR using a KubernetesRun() with flow.register() and a KubernetesAgent to run your flow.
Your issue is that you have combined the examples. This will work better if you decide on a single approach.
Following https://coda.io/@laura-lorenz/setting-up-your-execution-layer-a-companion-guide/static-dask-cluster-in-gcp-with-helm-3 you'd want to add your requirements to EXTRA_PIP_PACKAGES and ignore .register() and the KubernetesRun.
Using https://coda.io/@laura-lorenz/setting-up-your-execution-layer-a-companion-guide/using-just-the-kubernetes-agent-5 you'll want to unset the flow.executor and use .register and an agent.
l
OK, but the latter is the better solution if I want to kick it off from the UI?
z
Yep. You could use a LocalRun/agent with your Helm Dask cluster instead, but if you have Kubernetes that is probably the easiest way to do it.
l
Ah great, sorry for all the questions, I'm just beginning to understand the differences.
So in the Kubernetes cluster, Dask is running automatically then?
z
Yep, Dask will be spun up in the container that your flow runs in.
l
flow.storage = Docker(registry_url="gcr.io/keller-steering-enabling/flows", image_name="billwerk-contracts", python_dependencies=["pandas", "oauthlib ", "requests", "requests_oauthlib", "datetime", "oauth2client", "snowflake"])
flow.run_config = KubernetesRun()
flow.register('Billwerk')
This should work then?
z
Yep!
Presuming you've deployed a kubernetes agent to your cluster
l
Yes 🙂 thanks, it worked now!