Laura Vaida
03/01/2021, 5:27 PM
[2021-03-01 18:24:40+0100] ERROR - prefect.FlowRunner | Unexpected error: ModuleNotFoundError("No module named 'oauthlib'")
Traceback (most recent call last):
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\engine\runner.py", line 48, in inner
new_state = method(self, state, *args, **kwargs)
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\engine\flow_runner.py", line 619, in get_flow_run_state
final_states = executor.wait(
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\executors\dask.py", line 397, in wait
return self.client.gather(futures)
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 1993, in gather
return self.sync(
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 839, in sync
return sync(
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\utils.py", line 340, in sync
raise exc.with_traceback(tb)
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\utils.py", line 324, in f
result[0] = yield future
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\tornado\gen.py", line 762, in run
value = future.result()
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 1858, in _gather
raise exception.with_traceback(traceback)
File "/opt/conda/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 75, in loads
ModuleNotFoundError: No module named 'oauthlib'
[2021-03-01 18:24:40+0100] ERROR - prefect.billwerk-contracts | Unexpected error occured in FlowRunner: ModuleNotFoundError("No module named 'oauthlib'")
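The last frame of the traceback sits under /opt/conda/lib/python3.8, a Linux path, while every other frame is under the local Windows environment. That suggests the import failed on the remote side while unpickling, not on the local machine. Below is a minimal sketch, using only the standard library, for checking which modules a given environment lacks. The helper name missing_modules is hypothetical and is not part of Prefect or Dask.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # For dotted names find_spec imports the parent package first and
            # raises if that parent is absent; count that as missing as well.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # The two packages imported by the flow in this thread.
    print(missing_modules(["oauthlib", "requests_oauthlib"]))
```

Run locally, this reports only on the local environment. To inspect remote Dask workers, the same function could presumably be sent to them with the distributed client's run method, which is an assumption about the setup here and has not been checked against this cluster.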
Laura Vaida
03/01/2021, 5:28 PM
flow.storage = Docker(registry_url="gcr.io/keller-steering-enabling/flows", image_name="billwerk-contracts", python_dependencies=["pandas"])
flow.run_config = KubernetesRun()
flow.run(executor=executor)
flow.register('Billwerk')
Zanie
python_dependencies list in your flow storage configuration
Zanie
Laura Vaida
03/01/2021, 5:35 PM
flow.storage = Docker(registry_url="gcr.io/keller-steering-enabling/flows", image_name="billwerk-contracts", python_dependencies=["pandas", "oauthlib"])
flow.run_config = KubernetesRun()
flow.run(executor=executor)
flow.register('Billwerk')
Like that? It gives the same error.
Laura Vaida
03/01/2021, 5:44 PM
from oauthlib.oauth2 import LegacyApplicationClient
from requests_oauthlib import OAuth2Session
I need these two packages.
Zanie
requests_oauthlib in that list as well
Laura Vaida
03/01/2021, 6:21 PM
Laura Vaida
03/01/2021, 6:22 PM
Laura Vaida
03/01/2021, 6:27 PM
[2021-03-01 19:25:55+0100] ERROR - prefect.FlowRunner | Unexpected error: ModuleNotFoundError("No module named 'oauthlib'")
Traceback (most recent call last):
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\engine\runner.py", line 48, in inner
new_state = method(self, state, *args, **kwargs)
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\engine\flow_runner.py", line 619, in get_flow_run_state
final_states = executor.wait(
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\prefect\executors\dask.py", line 397, in wait
return self.client.gather(futures)
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 1993, in gather
return self.sync(
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 839, in sync
return sync(
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\utils.py", line 340, in sync
raise exc.with_traceback(tb)
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\utils.py", line 324, in f
result[0] = yield future
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\tornado\gen.py", line 762, in run
value = future.result()
File "C:\Users\laura.vaida.000\anaconda3\envs\prefect\lib\site-packages\distributed\client.py", line 1858, in _gather
raise exception.with_traceback(traceback)
File "/opt/conda/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 75, in loads
ModuleNotFoundError: No module named 'oauthlib'
[2021-03-01 19:25:55+0100] ERROR - prefect.billwerk-contracts | Unexpected error occured in FlowRunner: ModuleNotFoundError("No module named 'oauthlib'")
Step 1/9 : FROM prefecthq/prefect:0.14.6-python3.8
---> 67415bcf7bcb
Step 2/9 : ENV PREFECT__USER_CONFIG_PATH='/opt/prefect/config.toml'
---> Using cache
---> 4ffc0b7b4b6b
Step 3/9 : RUN pip install pip --upgrade
---> Using cache
---> ee56fc384512
Step 4/9 : RUN pip show prefect || pip install git+https://github.com/PrefectHQ/prefect.git@0.14.6#egg=prefect[all_orchestration_extras]
---> Using cache
---> 28b18e331c74
Step 5/9 : RUN pip install pandas oauthlib requests_oauthlib wheel
---> Running in dd84ec5c9dbf
Collecting pandas
Downloading pandas-1.2.2-cp38-cp38-manylinux1_x86_64.whl (9.7 MB)
Requirement already satisfied: oauthlib in /usr/local/lib/python3.8/site-packages (3.1.0)
Requirement already satisfied: requests_oauthlib in /usr/local/lib/python3.8/site-packages (1.3.0)
Requirement already satisfied: wheel in /usr/local/lib/python3.8/site-packages (0.36.2)
Collecting numpy>=1.16.5
Downloading numpy-1.20.1-cp38-cp38-manylinux2010_x86_64.whl (15.4 MB)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.8/site-packages (from pandas) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.8/site-packages (from pandas) (2021.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas) (1.15.0)
Requirement already satisfied: requests>=2.0.0 in /usr/local/lib/python3.8/site-packages (from requests_oauthlib) (2.25.1)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/site-packages (from requests>=2.0.0->requests_oauthlib) (1.26.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/site-packages (from requests>=2.0.0->requests_oauthlib) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/site-packages (from requests>=2.0.0->requests_oauthlib) (2020.12.5)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.8/site-packages (from requests>=2.0.0->requests_oauthlib) (4.0.0)
Installing collected packages: numpy, pandas
Successfully installed numpy-1.20.1 pandas-1.2.2
Removing intermediate container dd84ec5c9dbf
---> af3778938310
Step 6/9 : RUN mkdir -p /opt/prefect/
---> Running in df00c4a4b75b
Removing intermediate container df00c4a4b75b
---> 27c9c554a00f
Step 7/9 : COPY billwerk-contracts.flow /opt/prefect/flows/billwerk-contracts.prefect
---> 01d61c6debe5
Step 8/9 : COPY healthcheck.py /opt/prefect/healthcheck.py
---> 4cb89cafe90a
Step 9/9 : RUN python /opt/prefect/healthcheck.py '["/opt/prefect/flows/billwerk-contracts.prefect"]' '(3, 8)'
---> Running in 6e9b7cdccdb6
Beginning health checks...
System Version check: OK
Cloudpickle serialization check: OK
Result check: OK
Environment dependency check: OK
All health checks passed.
Removing intermediate container 6e9b7cdccdb6
---> 522099f92070
Successfully built 522099f92070
Zanie
flow.run() and flow.register()
Zanie
KubernetesRun which will use your Docker image with the proper dependencies installed.
Zanie
flow.run() it will execute in your local system where you must have pip install ... your dependencies yourself
Laura Vaida
03/01/2021, 6:30 PM
Laura Vaida
03/01/2021, 6:30 PM
Zanie
flow.run() with a dask executor that may be running somewhere other than your local machine. In that case, you'd need your dependencies installed on the dask workers as well.
Zanie
flow.register() and the run_config to manage those dependencies though
Laura Vaida
03/01/2021, 6:32 PM
Laura Vaida
03/01/2021, 6:33 PM
Zanie
Laura Vaida
03/01/2021, 6:35 PM
Laura Vaida
03/01/2021, 6:36 PM
Zanie
Laura Vaida
03/02/2021, 4:46 PM
Laura Vaida
03/02/2021, 4:47 PM
# define executor as dask executor with tcp from load balancer
flow.executor = DaskExecutor(address="35.198.104.230:8786") # tcp:port address of dask load balancer
flow.storage = Docker(registry_url="gcr.io/keller-steering-enabling/flows", image_name="billwerk-contracts", python_dependencies=["pandas", "oauthlib ", "requests", "requests_oauthlib", "datetime", "oauth2client", "snowflake"])
flow.run_config = KubernetesRun()
flow.register('Billwerk')
This is the code.
Zanie
DaskExecutor local to the job is sufficient
Laura Vaida
03/02/2021, 8:13 PM
Laura Vaida
03/02/2021, 8:14 PM
Laura Vaida
03/02/2021, 8:15 PM
Zanie
flow.run() i.e. you run your flow locally but the tasks are executed on the dask workers on GCP OR using a KubernetesRun() with flow.register() and a KubernetesAgent to run your flow
Zanie
EXTRA_PIP_PACKAGES and ignore .register() and the KubernetesRun. Using https://coda.io/@laura-lorenz/setting-up-your-execution-layer-a-companion-guide/using-just-the-kubernetes-agent-5 you'll want to unset the flow.executor and use .register and an agent.
Laura Vaida
03/02/2021, 8:21 PM
Zanie
Laura Vaida
03/02/2021, 8:27 PM
Laura Vaida
03/02/2021, 8:28 PM
Zanie
Laura Vaida
03/02/2021, 8:42 PM
flow.storage = Docker(registry_url="gcr.io/keller-steering-enabling/flows", image_name="billwerk-contracts", python_dependencies=["pandas", "oauthlib ", "requests", "requests_oauthlib", "datetime", "oauth2client", "snowflake"])
flow.run_config = KubernetesRun()
flow.register('Billwerk')
This should work then?
Zanie
Zanie
Laura Vaida
03/02/2021, 9:09 PM
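The two setups discussed in this thread can be sketched side by side. This is only an illustration, assuming Prefect 0.14.x (the version shown in the Docker build log). The function names are hypothetical, and the imports sit inside the functions solely so that the snippet loads on a machine without Prefect installed.

```python
from typing import Any, List


def configure_for_kubernetes(flow: Any, dependencies: List[str]) -> Any:
    """Setup A: Docker storage plus KubernetesRun, picked up by a Kubernetes agent.

    flow.executor is deliberately left unset, so the dependencies only need
    to exist inside the Docker image that the Kubernetes job runs.
    """
    from prefect.run_configs import KubernetesRun
    from prefect.storage import Docker

    flow.storage = Docker(
        registry_url="gcr.io/keller-steering-enabling/flows",
        image_name="billwerk-contracts",
        python_dependencies=dependencies,
    )
    flow.run_config = KubernetesRun()
    return flow


def run_on_remote_dask(flow: Any, address: str) -> Any:
    """Setup B: run from the local machine, with tasks sent to remote Dask workers.

    Storage and run_config play no part here; the dependencies must be
    installed both locally and on every Dask worker.
    """
    from prefect.executors import DaskExecutor

    return flow.run(executor=DaskExecutor(address=address))


# Setup A, followed by registration so that an agent can pick the flow up:
#   configure_for_kubernetes(flow, ["pandas", "oauthlib", "requests_oauthlib"])
#   flow.register("Billwerk")
#
# Setup B, with no registration involved:
#   run_on_remote_dask(flow, "<scheduler address>")
```

Mixing the two, as in the earlier snippets that call flow.run(executor=executor) right before flow.register('Billwerk'), means the local run still goes through the Dask workers, which would be consistent with the error persisting after the image was rebuilt.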