https://prefect.io logo
#prefect-community
Title
# prefect-community
l

Luis Muniz

06/28/2020, 10:40 PM
Hi I thought I was being smart when modularizing the types of tasks, and having one module where I construct my flow:
Copy code
from tasks.collect.games import *
from tasks.collect.streamers import *
from tasks.collect.streams import *
from tasks.enrich.game import *
from tasks.enrich.streamer import *
from tasks.enrich.stream import *
from tasks.store.game import *
from tasks.store.streamer import *
from tasks.store.stream import *
from tasks.util.common import *

with Flow("STRDATA POC") as strdata:
    collected_games = collect_games()
    enriched_games = enrich_game.map(collected_games)

    collected_streamers = collect_streamers()
    enriched_streamers = enrich_streamer.map(collected_streamers)

    collected_streams = collect_streams_per_game.map(enriched_games, unmapped(enriched_streamers))
    enriched_streams = enrich_stream.map(flatten(collected_streams))

    store_game.map(enriched_games)
    store_stream.map(enriched_streams)
    store_streamer.map(enriched_streamers)
The Flow runs OK when I run it standalone, but when I register it in my local prefect server, I can see the following error in the dashboard:
Failed to load and execute Flow's environment: ModuleNotFoundError("No module named 'tasks'")
it seems to be similar to an issue I found about not being able to submit flows to prefect cloud because of some peculiarity with pickle? https://github.com/PrefectHQ/prefect/issues/1742 But this was related to packaging the flow in a docker image, so I can't apply the solution to my case The layout of my project is the following:
Copy code
deploy
|_
  prod
  |_
    register.py (contains flow.register)
flows
|_
  strdata_poc.py (contains flow definition - see above)
tasks
|_
  collect
  |_
    games.py
    streamers.py
    streams.py
  enrich
  |_
    ...
c

Chris White

06/28/2020, 11:00 PM
Hi Luis, whether the Flow is stored as a pickle or as a script, the imports you run to build your flow need to also be accessible at runtime. The most common way of achieving this in Docker is to write your own Dockerfile that adds your entire project to the container + adds this project location to your importable
PATH
.” We’re working on trying to make this a little easier to manage within Prefect’s API but ultimately we have to be able to recreate your Flow object at runtime (which requires your imports be accessible)
l

Luis Muniz

06/28/2020, 11:01 PM
Ah, that's something that I seem to have missed, then. When I run my flow with the local executor, the flow is containerized too?
i guess if it uses dask... hm yes
c

Chris White

06/28/2020, 11:02 PM
Ah I’m sorry, are you running this via a local agent + dask?
l

Luis Muniz

06/28/2020, 11:02 PM
yes
all the default execution environment
c

Chris White

06/28/2020, 11:03 PM
OK gotcha; still a very similar story but in this case you additionally need to ensure that each dask worker can import your flow dependencies
the dask workers essentially need to be able to recreate the python code that is submitted to them, which requires your dependencies are importable
l

Luis Muniz

06/28/2020, 11:04 PM
If I'm not mistaken, the default environment spawns en ephemeral dask cluster when a flow is submitted, right?
How do I tell these workers where my code is?
c

Chris White

06/28/2020, 11:06 PM
Yes that’s true; in this case you might be able to get away with adding the import path to your local agent, something like:
Copy code
prefect agent start -p /path/to/my/project -p /maybe/another/path/within/my/project
(you can use as many
-p
flags as you need, and each directory you pass is added to your import path for your process, and in your case taht should also cover the ephemeral dask workers)
l

Luis Muniz

06/28/2020, 11:07 PM
aha
For production, when you use a base docker image and an S3 storage object to specify this non-docker storage for container environments (don't know if I'm getting the vocabulary right) this might be fixed if the full project is deployed on S3 ?
c

Chris White

06/28/2020, 11:12 PM
Hmmm yea, but typically we recommend converting your project into a small python package and installing your package into the base image. That being said, we are currently actively working on making this simpler from the user-perspective. This issue is just the tip of the iceberg for making sure that Prefect starts inferring your import paths in an intelligent way: https://github.com/PrefectHQ/prefect/issues/2857
l

Luis Muniz

06/28/2020, 11:13 PM
Ok, thanks for the big push
you're awesome, guys
c

Chris White

06/28/2020, 11:14 PM
Thank you!! We’re always trying to make things as intuitive as possible, and we appreciate all the feedback the community gives us 😄
l

Luis Muniz

06/28/2020, 11:14 PM
I confirm that your workaround helped for local execution
c

Chris White

06/28/2020, 11:14 PM
awesome, glad to hear
a

Anish Chhaparwal

09/30/2020, 11:31 AM
hey i'm facing a similar issue. is there any update on this? or docker images is the only way..