
Raghuram M

09/08/2022, 2:11 PM
Hello Prefect Community, great to be here! I am running a flow on Prefect 2 with a few tasks that depend on local modules. We are running the tasks on a Dask cluster and are using S3 block storage. The issue I run into is recreating the local dependencies on the Dask workers. Any suggestions on how to ensure that the script environment is available to the Dask workers?
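For reference, a minimal sketch of this kind of setup, assuming prefect-dask is installed; the scheduler address and function names are placeholders, not the actual project code:

# Sketch only: a Prefect 2 flow whose tasks are submitted to an existing Dask cluster.
from prefect import flow, task
from prefect_dask import DaskTaskRunner

@task
def add_one(x):
    return x + 1

# The address below is a placeholder for the real Dask scheduler.
@flow(task_runner=DaskTaskRunner(address="tcp://dask-scheduler:8786"))
def my_flow():
    return add_one.submit(1)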

Mike Grabbe

09/08/2022, 2:17 PM
Are your dependencies in the same folder as your flow, or are they Python package dependencies?

Raghuram M

09/08/2022, 2:24 PM
They are in the same project folder as the flows and get uploaded to the S3 bucket when building the deployment. These dependencies are local scripts, not Python package dependencies.
my_flow.py has tasks which use functions/classes from my_script1.py, my_script2.py, and so on.
project_name/
  internal_dependencies/
    my_script1.py
    my_script2.py
    ...
  my_flow.py
@Mike Grabbe can you please opine?
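For illustration, my_flow.py presumably imports the local scripts along these lines; helper_one and helper_two are hypothetical names standing in for whatever my_script1.py and my_script2.py actually expose:

# Inside my_flow.py (sketch); the imported names are placeholders.
from prefect import task
from internal_dependencies.my_script1 import helper_one
from internal_dependencies.my_script2 import helper_two

@task
def do_work(x):
    return helper_two(helper_one(x))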

Mike Grabbe

09/08/2022, 3:23 PM
Interesting. I am doing something similar, and it's been working for me
though I am using in-process flows
I assume you're getting an error that the referenced script cannot be found?

Raghuram M

09/08/2022, 3:29 PM
Yes, I'm getting a similar error. How did you resolve this, @Mike Grabbe? What's an in-process flow? Can you share any documentation/links?

Mike Grabbe

09/08/2022, 3:34 PM
When I deploy the flow, it's using the --infra process parameter, so the agent itself is running the flow.
This shouldn't matter though. Wherever you run the flow, it should load the full set of files from S3.
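A deployment built that way might look roughly like this; the entrypoint, deployment name, and storage block name are placeholders, and the generated YAML filename can differ:

# Sketch: build and apply a Prefect 2 deployment with process infrastructure
# and an S3 storage block (all names are placeholders).
prefect deployment build ./my_flow.py:my_flow \
    --name my-deployment \
    --storage-block s3/my-block \
    --infra process
prefect deployment apply my_flow-deployment.yaml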

Barada Sahu

09/08/2022, 4:35 PM
Isn't the agent decoupled from the Dask runner? Ideally the Dask workers should be able to load these dependencies, and it shouldn't matter much how the agents are running. @Mike Grabbe
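One common workaround for exactly that decoupling (plain Dask, not Prefect-specific) is to push the local modules to the running workers yourself, e.g. with the Dask client's upload_file; a rough sketch, assuming direct access to the scheduler:

# Sketch: make local modules importable on existing Dask workers.
# The scheduler address and file paths are placeholders.
from dask.distributed import Client

client = Client("tcp://dask-scheduler:8786")
client.upload_file("internal_dependencies/my_script1.py")
client.upload_file("internal_dependencies/my_script2.py")
# Note: uploaded .py files become importable as top-level modules; to preserve
# the internal_dependencies package layout, upload a zipped copy of the package instead.

upload_file also accepts .zip/.egg archives, which is one way to keep the package structure intact on the workers.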