https://prefect.io logo
Title
d

Daniel Davee

05/19/2021, 9:14 PM
I am running a prefect server executing with a Dask Kubernetes cluster, and I'm runinng the hello world test script and keep getting this error. python test.py  [2021-05-19 21:06:50+0000] INFO - prefect.FlowRunner | Beginning Flow run for 'Run Me' [2021-05-19 21:06:50+0000] INFO - prefect.DaskExecutor | Connecting to an existing Dask cluster at tcp://x.x.x.x:8786 /home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/distributed/client.py:1140: VersionMismatchWarning: Mismatched versions found +---------+---------------+---------------+---------------+ | Package | client    | scheduler   | workers    | +---------+---------------+---------------+---------------+ | python | 3.8.8.final.0 | 3.8.0.final.0 | 3.8.0.final.0 | +---------+---------------+---------------+---------------+  warnings.warn(version_module.VersionMismatchWarning(msg[0]["warning"])) [2021-05-19 21:06:50+0000] ERROR - prefect.FlowRunner | Unexpected error: ModuleNotFoundError("No module named 'prefect'") Traceback (most recent call last):  File "/home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/prefect/engine/runner.py", line 48, in inner   new_state = method(self, state, *args, **kwargs)  File "/home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/prefect/engine/flow_runner.py", line 643, in get_flow_run_state   final_states = executor.wait(  File "/home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/prefect/executors/dask.py", line 414, in wait   return self.client.gather(futures)  File "/home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/distributed/client.py", line 1975, in gather   return self.sync(  File "/home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/distributed/client.py", line 843, in sync   return sync(  File "/home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/distributed/utils.py", line 353, in sync   raise exc.with_traceback(tb)  File "/home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/distributed/utils.py", line 336, in f   result[0] = yield future  File "/home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/tornado/gen.py", line 762, in run   value = future.result()  File "/home/dani1057/anaconda3/envs/faraday3.8/lib/python3.8/site-packages/distributed/client.py", line 1840, in _gather   raise exception.with_traceback(traceback)  File "/opt/conda/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 75, in loads ModuleNotFoundError: No module named 'prefect' [2021-05-19 21:06:50+0000] ERROR - prefect.Run Me | Unexpected error occured in FlowRunner: ModuleNotFoundError("No module named 'prefect'") [2:15 PM] I also have a kubernetes agent running on the server [2:16 PM] Does the agent have to be running on the cluster?
z

Zanie

05/19/2021, 9:16 PM
Hey @Daniel Davee -- could you move your traceback message into the thread to keep the top-level channel easy to browse?
d

Daniel Davee

05/19/2021, 9:17 PM
ok
z

Zanie

05/19/2021, 9:18 PM
Did you use
prefect register -m prefect.hello_world
? That will use
Module
storage which means that your flow should be accessible as an importable object in the environment where it is to be executed.
It looks like the error here is that
prefect
is missing from your dask worker (so it can't find the tasks of your flow)
d

Daniel Davee

05/19/2021, 9:20 PM
No did not, is this documented?
z

Zanie

05/19/2021, 9:20 PM
Ah sorry maybe I misunderstood which hello world script you're using.
Want to share your flow definition and how you registered it?
d

Daniel Davee

05/19/2021, 9:21 PM
from prefect import task, Flow import pandas as pd from prefect.run_configs import KubernetesRun from time import sleep @task def say_hello():   pd.DataFrame()   print("Hello, world!")   sleep() with Flow("Run Me") as flow:   h = say_hello() from prefect.executors import DaskExecutor flow.executor = DaskExecutor(address="tcp://xxxx:8786") flow.run_config = KubernetesRun() # flow.register(project_name="test") flow.run()
Also the agent is running on the VM with prefect and not on the cluster.
z

Zanie

05/19/2021, 9:23 PM
So there are some docs on using an existing cluster at https://docs.prefect.io/orchestration/flow_config/executors.html#connecting-to-an-existing-cluster -- you'll need
prefect
installed there as a minimum requirement
I wouldn't recommend using an existing cluster to start though, it'll be easier to get used to Prefect just using a
LocalDaskExecutor()
d

Daniel Davee

05/19/2021, 9:25 PM
Well this all new infra, I'm just trying to prefect to run jobs using a kubernetes cluster
I'm trying to move my POC to production
We setup the Dask cluster up for prefect.
z

Zanie

05/19/2021, 9:38 PM
I see. You'll want to have prefect installed on your dask workers as that doc describes
d

Daniel Davee

05/19/2021, 9:43 PM
Ok quick question on that. If my flow is using special library or some custom code, does prefect send the whole environment?
z

Zanie

05/19/2021, 9:55 PM
Not with an external dask cluster like this
d

Daniel Davee

05/19/2021, 9:59 PM
Thank you so much. If I may ask one more question, Is the difference between the docker sidecar and static cluster, is the later has to be scaled by updating the manifest while the former can potentially auto scale or am I completely off base?
z

Zanie

05/19/2021, 10:36 PM
Ahh can you try rephrasing? There is no side-car pattern that we recommend as far as I know (definitely lmk if I'm wrong)