# ask-community
m
Hi, I have a pretty simple setup and flow (works fine locally), but I cannot get it to run from the cloud when I set the storage configuration to S3 and use the ECS agent and executor. I'm not doing anything unusual: it's a simple flow with a few simple tasks. The error I'm getting is:
Failed to load and execute Flow's environment: FlowStorageError("An error occurred while unpickling the flow: ModuleNotFoundError("No module named 'transform'")")
If I comment out transform, it complains about the next module. A project setup example is in the thread.
My project setup looks like this:
prefect-project
 - extract.py
 - load.py
 - transform.py
 - flow_run.py
In my flow_run.py I import modules (they contain my tasks), e.g.
import extract as e
import load as l
import transform as t
and use them in the Flow.
I tried uploading them manually to the bucket and specified
def get_storage_config():
    return S3(bucket=S3_BUCKET, stored_as_script=True, key="etl/manual/flow_run.py")
but I get the same issue (without the unpickling part):
Failed to load and execute Flow's environment: ModuleNotFoundError("No module named 'transform'")")
How can I make it work if I want to keep my code in several modules?
k
Hey @Maria, Flow storage only stores the file that the flow is defined in, not the modules it depends on. Those modules need to be available where the flow runs, not in the storage. If you use a local agent and run it in the directory that contains these files, it would work. The ideal scenario, though, is to package the dependencies in a Docker container and use DockerStorage.
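To illustrate why the unpickling step fails, here is a minimal sketch using only the standard library `pickle` module. It is a stand-in, not Prefect's actual storage code: as far as I know Prefect pickles flows with cloudpickle, which likewise stores functions from importable modules by reference. The `clean` function and the temporary `transform.py` are hypothetical.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path


def unpickle_without_module() -> str:
    """Pickle a function from transform.py, then load it where transform.py is missing."""
    with tempfile.TemporaryDirectory() as project_dir:
        # Hypothetical stand-in for the real transform.py in the project.
        Path(project_dir, "transform.py").write_text(
            "def clean(value):\n    return value.strip()\n"
        )

        # "Local machine": the project directory is importable.
        sys.path.insert(0, project_dir)
        importlib.invalidate_caches()
        transform = importlib.import_module("transform")
        # The payload records the reference "transform.clean", not the function's code.
        payload = pickle.dumps(transform.clean)

        # "Remote container": same payload, but the module is not available.
        sys.path.remove(project_dir)
        del sys.modules["transform"]
        try:
            pickle.loads(payload)
        except ModuleNotFoundError as exc:
            return str(exc)
        return "loaded"


print(unpickle_without_module())
```

The pickled payload only names the module, so whatever environment loads it has to be able to import `transform` itself.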
m
Thanks @Kevin Kho, we are currently testing the ECS agent, so should I package the dependencies into it?
k
Yes. Ideally you should have it set up such that your custom module is a Python package. Copy it into the Docker image and then run `pip install -e .` inside it. Are you familiar with this? I have examples that can help. Then you'd need to store that image somewhere ECS can pull from (like ECR).
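As a rough sketch of the packaging step, a minimal `setup.py` placed next to `extract.py`, `load.py`, and `transform.py` could look like the following. The package name and version are made up for illustration; only the three module names come from the project layout described earlier in the thread.

```python
# setup.py (assumed to sit in the same directory as extract.py, load.py, transform.py)
from setuptools import setup

setup(
    name="prefect-project-etl",  # hypothetical package name
    version="0.1.0",             # hypothetical version
    # Top-level modules to install so `import transform` works inside the image.
    py_modules=["extract", "load", "transform"],
)
```

With this file copied into the image along with the modules, running `pip install -e .` in that directory should make the three modules importable from anywhere in the container.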
m
Thanks Kevin, I think it makes sense now. I saw a similar question here in Slack but couldn't understand where the dependencies need to be installed - in the agent, apparently.
We will give it a go today. If it is not too hard, I'm sure your examples will help (us and others who might face this in the future)
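For the ECS case, a minimal sketch of pointing the flow at an image that already has the modules installed might look like this. It assumes Prefect 1.x (`prefect.storage.S3` and `prefect.run_configs.ECSRun`); the bucket name and image URI are placeholders, not values from this thread, and the tasks are omitted.

```python
from prefect import Flow
from prefect.run_configs import ECSRun
from prefect.storage import S3

S3_BUCKET = "my-flow-bucket"  # placeholder bucket name
# Placeholder URI for an image (e.g. in ECR) with extract/load/transform installed.
IMAGE = "<account-id>.dkr.ecr.<region>.amazonaws.com/etl:latest"

# The flow itself is stored in S3; the modules it imports come from the image.
flow = Flow(
    "etl",
    storage=S3(bucket=S3_BUCKET),
    run_config=ECSRun(image=IMAGE),
)
```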
t
Hi Kevin, you mentioned above that you have examples that can help. Could you share them here? My setup is DockerStorage + KubeRun, if you have an example for that. Thanks!
k
I have this minimal example for putting the Docker container together. There I build the container manually; in your case you would use DockerStorage and provide the registry and Dockerfile like this. Then just specify the image for KubeRun and make sure it's authenticated to pull the image.
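A hedged sketch of that setup, assuming Prefect 1.x, where the classes are named `prefect.storage.Docker` and `prefect.run_configs.KubernetesRun`. The registry URL, image name, and tag are placeholders, the tasks are omitted, and this is not the linked example itself.

```python
from prefect import Flow
from prefect.run_configs import KubernetesRun
from prefect.storage import Docker

REGISTRY_URL = "registry.example.com/my-team"  # placeholder registry
IMAGE_NAME = "etl-flow"                        # placeholder image name
IMAGE_TAG = "latest"                           # placeholder tag

flow = Flow(
    "etl",
    # Build the image from a Dockerfile that copies in and installs the custom modules.
    storage=Docker(
        registry_url=REGISTRY_URL,
        image_name=IMAGE_NAME,
        image_tag=IMAGE_TAG,
        dockerfile="Dockerfile",
    ),
    # Run the flow on Kubernetes using the same image.
    run_config=KubernetesRun(image=f"{REGISTRY_URL}/{IMAGE_NAME}:{IMAGE_TAG}"),
)
```

The cluster still needs credentials to pull from the registry, as mentioned above.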
t
In the GitHub repo, I don't see a Dockerfile. I'm wondering how you refer to this image (or where it lives)?
run_config=DockerRun(image="test:latest")
k
Sorry, I gave the wrong link. This is the correct one, which I tested last night.
Yes, there should be a Dockerfile.
t
Ahh, nice! Thanks for sharing!