
Rasmus Lindqvist

07/11/2023, 7:44 AM
Hi all, I am migrating from Prefect 1 to Prefect 2.
Context:
- Using Prefect Cloud
- Using GCP (Prefect Agent running on a VM, Cloud Run for execution of flows)
- GitHub as storage
I have gotten everything running in a production-like environment. However, I am now running into problems with downloading the source code before execution. I get the error:
prefect.exceptions.ScriptError: Script at 'src/flows/flow.py' encountered an exception: FileNotFoundError(2, 'No such file or directory')
We are using a monorepo where the prefect code sits in a subdirectory called “pipeline”. For deployment I am using the python SDK as such:
deployment = Deployment.build_from_flow(
    flow=flow,
    name=target_env,
    version=3,
    path="pipeline",
    work_queue_name=target_env,
    tags=[target_env],
    infrastructure=infrastructure_block,
    storage=storage_block,
)
As mentioned, it was working in my other environment, where I did not use a subdirectory and did not provide a path. I have searched thoroughly in Discourse, Slack, the documentation and the source code itself, but I am running out of options. Does anyone see an error in the deployment, or have suggestions for resources I can follow to get it working?

Christopher Boyd

07/11/2023, 1:37 PM
What’s the actual path in your environment? Why do you have `path="pipeline"` here? Path should be a qualified path to the entrypoint, but I believe that if you are using a storage block it shouldn’t be necessary, just the entrypoint itself.
I’ve only used `path` if I’m using Docker / Kubernetes and the code is embedded in the image, not cloned in.

Rasmus Lindqvist

07/11/2023, 3:33 PM
Hi Christopher, thank you for the reply! The reason I set path="pipeline" is that I did some digging in the Prefect source code, and it seems that GitHub.get_directory uses the path from the deployment to copy a specific subdirectory of the repository.
And if I understood it correctly, I would not need to specify the entrypoint when using the Python SDK rather than the CLI. I’ve understood it as the entrypoint being constructed automagically from the import path of your flow. Maybe I need to set the entrypoint explicitly?
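To make the question about automatic entrypoint construction concrete, here is a minimal sketch of how a relative entrypoint could be derived from a flow's file location and the directory the deployment script runs in. This is an illustration only, not Prefect's actual implementation: `infer_entrypoint`, the `/repo` checkout location and the function name `my_flow` are all hypothetical. It shows why the same flow file can yield `src/flows/flow.py` or `pipeline/src/flows/flow.py` depending on the working directory.

```python
from pathlib import PurePosixPath


def infer_entrypoint(flow_file: str, working_dir: str, function_name: str) -> str:
    """Build a 'relative/file.py:function' string (hypothetical helper)."""
    # Express the flow file relative to the directory the script was run from.
    relative_file = PurePosixPath(flow_file).relative_to(PurePosixPath(working_dir))
    return f"{relative_file}:{function_name}"


# Assumed monorepo layout: the Prefect code lives under "pipeline".
repo_root = "/repo"  # hypothetical checkout location
flow_file = "/repo/pipeline/src/flows/flow.py"

# Running the deployment script from inside the subdirectory ...
from_subdir = infer_entrypoint(flow_file, "/repo/pipeline", "my_flow")
# ... versus running it from the repository root.
from_root = infer_entrypoint(flow_file, repo_root, "my_flow")

print(from_subdir)
print(from_root)
```

If the storage block clones the repository root, only the second form would point at an existing file, which is one possible reading of the FileNotFoundError above.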
What I am a bit confused about is where the code is cloned to when using remote storage.
I also tried having the code in the image and not using `storage`; however, then I get this error:
FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/work/backend/backend/pipeline'
Is there some cache or something in play that I need to reset? I have checked the Cloud Run image that I am using and the code is there. Thanks for the help 🙂!
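One possible explanation for the path in that error, offered as an assumption rather than a confirmed diagnosis: a relative `path` such as "pipeline" may be resolved against the working directory of the machine that builds the deployment, and `/home/runner/work/<repo>/<repo>` looks like a GitHub Actions runner workspace. The sketch below only demonstrates that resolution step; `resolve_at_build_time` is a hypothetical helper, not a Prefect function.

```python
from pathlib import PurePosixPath


def resolve_at_build_time(path: str, build_cwd: str) -> str:
    """Turn a relative deployment path into an absolute one (hypothetical helper)."""
    candidate = PurePosixPath(path)
    if candidate.is_absolute():
        # Absolute paths are stored unchanged.
        return str(candidate)
    # Relative paths are anchored at the directory the build ran in.
    return str(PurePosixPath(build_cwd) / candidate)


# Assumption: the deployment script ran in a CI checkout at this directory.
ci_checkout = "/home/runner/work/backend/backend"
recorded = resolve_at_build_time("pipeline", ci_checkout)
print(recorded)
```

A path recorded this way exists on the CI runner but not inside the Cloud Run container, which would produce a FileNotFoundError at run time even though the code is present in the image at a different location.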

Christopher Boyd

07/11/2023, 6:04 PM
The entrypoint is always explicitly needed; the path is not, and it is relative. The best immediate example I have uses Docker / Kubernetes, where the default workdir in a Prefect image is `/opt/prefect`. If I do something like this for the Docker image:
COPY flow.py /opt/prefect/flows/
Then my deployment is like:
deployment = Deployment.build_from_flow(
    flow=flow,
    name="Test HealthCheck Deployment",
    version=1,
    flow_name="healthcheck",
    work_queue_name="kubernetes",
    infrastructure=k8s_job,
    # Add Docker path to flow
    path="/opt/prefect/flows",
    # Add Docker entrypoint relative to path
    entrypoint="healthcheck.py:healthcheck",
)
The path is the full path to the directory that contains the entrypoint file, and the entrypoint (file:function, relative to that path) is what the container / Python process executes.
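The relationship between the two settings can be sketched as follows, using the values from the deployment above. This is an illustration of the idea, not Prefect's own loading code, and `locate_flow` is a hypothetical helper: the script part of the entrypoint is joined onto `path`, and the part after the colon names the flow function to call.

```python
from pathlib import PurePosixPath
from typing import Tuple


def locate_flow(path: str, entrypoint: str) -> Tuple[str, str]:
    """Split 'file.py:function' and anchor the file under path (hypothetical helper)."""
    # Split on the last colon so the function name is always the final part.
    script, function_name = entrypoint.rsplit(":", 1)
    return str(PurePosixPath(path) / script), function_name


script_file, function_name = locate_flow(
    "/opt/prefect/flows", "healthcheck.py:healthcheck"
)
print(script_file)
print(function_name)
```

With the `COPY flow.py /opt/prefect/flows/` line shown earlier, the resolved file would have to exist under that directory in the image for the run to start.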

Rasmus Lindqvist

07/12/2023, 6:28 AM
Thanks a lot for that explanation, that clarifies things a lot! If you don’t mind, could you explain where the '/home/runner/work' path comes from as well? And are there any resources where I can read about caching? I ask because I got it working when I updated the name of the deployment. Again, thanks a lot!!