Pedro Martins

12/22/2020, 7:24 PM
Hey! Me again 🙂 I'm trying to run the aircraft example in a Kubernetes Agent. I created an image in which `aircraftlib` comes already installed, but when the Prefect jobs start running they can't find the package.
[2020-12-22 18:10:45+0000] INFO - prefect.S3 | Downloading aircraft-etl/2020-12-22t18-10-34-141798-00-00 from dr-prefect
No module named 'aircraftlib'
Traceback (most recent call last):
  File "/usr/local/bin/prefect", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/prefect/cli/execute.py", line 90, in flow_run
    raise exc
  File "/usr/local/lib/python3.6/site-packages/prefect/cli/execute.py", line 67, in flow_run
    flow = storage.get_flow(storage.flows[flow_data.name])
  File "/usr/local/lib/python3.6/site-packages/prefect/storage/s3.py", line 115, in get_flow
    return cloudpickle.loads(output)
  File "/usr/local/lib/python3.6/site-packages/cloudpickle/cloudpickle.py", line 562, in subimport
    __import__(name)
ModuleNotFoundError: No module named 'aircraftlib'
Any clue how I can make it work?

Jim Crist-Harif

12/22/2020, 7:32 PM
How did you install the module? Did you write a `setup.py` and `pip install` it into the image?

Pedro Martins

12/22/2020, 7:32 PM
Exactly @Jim Crist-Harif

Jim Crist-Harif

12/22/2020, 7:34 PM
It sounds like `aircraftlib` isn't found in your image's Python environment. You might try running the image locally and inspecting it to see what's on your Python path. Something like:
docker run -it --rm your-image /usr/bin/env bash
then start a python shell and see if you can import your module.
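Inside that Python shell, a check along these lines shows where a module would be loaded from and what the interpreter's search path looks like. It is a minimal sketch: `locate_module` is a hypothetical helper name, and only standard-library calls are used.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Replace "aircraftlib" with the module under investigation.
    print("found at:", locate_module("aircraftlib"))
    print("search path:")
    for entry in sys.path:
        # An empty string entry means "the current working directory".
        print("  ", repr(entry))
```

A location outside `site-packages`, or one that only resolves through the empty-string entry, suggests the module is being picked up from the working directory rather than from an installed package.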

Pedro Martins

12/22/2020, 7:37 PM
$ docker run -it --rm drtools/prefect:aircraft-etl-package /usr/bin/env bash
root@058fb224350a:/# python -c "import aircraftlib; print(aircraftlib)"
<module 'aircraftlib' from '/aircraftlib/__init__.py'>
root@058fb224350a:/#
It's there. I had tried that before.

Jim Crist-Harif

12/22/2020, 7:39 PM
Hmmmm. That looks like that's in your current working directory, what if you change directories? Is it fully installed, or does it just happen to be on path?
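One likely reason the `python -c` check passes while the flow run fails: with `-c`, Python puts the current working directory at the front of `sys.path`, whereas a console script such as `/usr/local/bin/prefect` puts the script's own directory there instead. A rough way to test whether a module is only visible through the working directory is to try the import from two different directories. The helper name `importable_from` and the `/tmp` comparison directory are assumptions made for illustration.

```python
import subprocess
import sys


def importable_from(module_name, cwd):
    """Return True if `import module_name` succeeds in a fresh interpreter started in `cwd`."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # If the two answers differ, the module is only reachable via the working directory.
    for directory in ("/", "/tmp"):
        print(directory, importable_from("aircraftlib", directory))
```

A properly installed package should report the same result from every directory.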

Pedro Martins

12/22/2020, 7:40 PM
FROM prefecthq/prefect:0.14.0-python3.6
COPY aircraftlib ./aircraftlib
RUN conda install -e ./aircraftlib
This is my Dockerfile. I copy to `/` and pip install from there.
In which directory does Prefect download and execute the task?

Jim Crist-Harif

12/22/2020, 7:41 PM
Should be the default working directory (as is done here), just trying to gather information.
Should the `conda install` above be `pip install`?
Next question - are you sure you're using the proper image in your flow run? Are you setting it properly on the `KubernetesRun`?

Pedro Martins

12/22/2020, 7:44 PM
Oh yeah! The image is actually built with `pip install`; I tried with conda but it failed.
And yes, the Kubernetes run points to the right image:
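from prefect.run_configs import KubernetesRun
from prefect.storage import S3

# Flow settings: run on Kubernetes with the custom image, store the flow in S3.
custom_confs = {
    "run_config": KubernetesRun(
        image="drtools/prefect:aircraft-etl-package",
        image_pull_secrets=["regcred"],
    ),
    "storage": S3(bucket="dr-prefect"),
}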

Jim Crist-Harif

12/22/2020, 7:45 PM
I'm not sure then. All we do is take that image and run `prefect execute flow-run` inside it. Nothing fancy. If you could make a reproducible example, I'd be happy to take a look.

Pedro Martins

12/22/2020, 7:48 PM
The image `drtools/prefect:aircraft-etl-package` is actually public. I'll share the notebook with the tasks.
👍 1

Jim Crist-Harif

12/22/2020, 7:50 PM
That doesn't seem to be true - `docker pull` errors with an auth issue.

Pedro Martins

12/22/2020, 7:51 PM
oops! hold on.
It's public now
👍 1

Jim Crist-Harif

12/22/2020, 8:19 PM
The issue is that `aircraftlib` isn't properly installed; it's only importable from the `/` directory (where the folder is). If you move `setup.py` and `requirements.txt` out a level in your module structure, so that you have an `aircraftlib` folder with all the code and an `__init__.py` file, and `setup.py` and `requirements.txt` sit outside that folder, then things work. This is the general structure of most Python packages:
aircraftlib/
  setup.py
  requirements.txt
  aircraftlib/
    __init__.py
    analysis.py
    data
    database.py
    ...
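The layout above can be checked mechanically before building the image. The following is a small sketch rather than part of any Prefect or setuptools tooling; `layout_problems` is a hypothetical helper, and it only inspects the three files that matter for this particular issue.

```python
from pathlib import Path


def layout_problems(project_root, package_name):
    """List ways `project_root` deviates from the layout above (an empty list means it matches)."""
    root = Path(project_root)
    problems = []
    if not (root / "setup.py").is_file():
        problems.append("setup.py is missing from the project root")
    if not (root / package_name / "__init__.py").is_file():
        problems.append(f"{package_name}/__init__.py is missing below the project root")
    if (root / package_name / "setup.py").is_file():
        problems.append("setup.py sits inside the package folder; move it up one level")
    return problems
```

For example, `layout_problems("aircraftlib", "aircraftlib")` run from the directory that holds the outer `aircraftlib` folder should return an empty list once the files have been moved.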

Pedro Martins

12/22/2020, 8:20 PM
Ahh, nice! I wouldn't have caught that so easily.
Will try.
Cool! The tasks now execute.
Question! Can I execute each task as a separate k8s job?

Jim Crist-Harif

12/22/2020, 8:43 PM
You'd have to write that yourself. You might make use of `RunNamespacedJob` (https://docs.prefect.io/api/latest/tasks/kubernetes.html#runnamespacedjob), or use the raw k8s API instead.
👍 1

Pedro Martins

12/22/2020, 8:44 PM
Does DaskExecutor execute tasks in separate workers?

Jim Crist-Harif

12/22/2020, 8:46 PM
To be precise, it can execute tasks in separate threads, processes, or machines (distributed across a cluster).

Pedro Martins

12/22/2020, 8:46 PM
Ok! Thanks again @Jim Crist-Harif