Elliott Wilson

07/10/2023, 3:49 PM
Hey, I believe Prefect is still not traversing the path when trying to pull a flow. It clones the repo fine and I can see the repo in the Docker container, but it does not find the flow.
Log:
Discovered worker type 'docker' for work pool 'test_docker'.
Worker 'DockerWorker 196bda7f-91c8-4f2d-aaf4-28af729f8af6' started!
12:45:10.451 | INFO    | prefect.flow_runs.worker - Worker 'DockerWorker 196bda7f-91c8-4f2d-aaf4-28af729f8af6' submitting flow run '2ae9462a-c308-4b92-a6c7-244c25905f4b'
12:45:12.816 | INFO    | prefect.worker.docker.dockerworker 196bda7f-91c8-4f2d-aaf4-28af729f8af6 - Creating Docker container 'purple-dormouse'...
12:45:12.874 | INFO    | prefect.worker.docker.dockerworker 196bda7f-91c8-4f2d-aaf4-28af729f8af6 - Docker container 'purple-dormouse' has status 'created'
12:45:13.213 | INFO    | prefect.worker.docker.dockerworker 196bda7f-91c8-4f2d-aaf4-28af729f8af6 - Docker container 'purple-dormouse' has status 'running'
12:45:13.624 | INFO    | prefect.flow_runs.worker - Completed submission of flow run '2ae9462a-c308-4b92-a6c7-244c25905f4b'
/usr/local/lib/python3.9/runpy.py:127: RuntimeWarning: 'prefect.engine' found in sys.modules after import of package 'prefect', but prior to execution of 'prefect.engine'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
Cloning into 'monorepo'...
15:45:43.527 | INFO    | prefect.deployment - Cloned repository 'https://github.com/gaia-family/monorepo.git' into 'monorepo'
ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'

[notice] A new release of pip available: 22.3.1 -> 23.1.2
[notice] To update, run: pip install --upgrade pip
15:45:46.830 | ERROR   | Flow run 'purple-dormouse' - Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 846, in exec_module
  File "<frozen importlib._bootstrap_external>", line 982, in get_code
  File "<frozen importlib._bootstrap_external>", line 1039, in get_data
FileNotFoundError: [Errno 2] No such file or directory: 'mixpanel_to_s3_test.py'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 395, in retrieve_flow_then_begin_flow_run
    flow = await load_flow_from_flow_run(flow_run, client=client)
  File "/usr/local/lib/python3.9/site-packages/prefect/client/utilities.py", line 51, in with_injected_client
    return await fn(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect/deployments/deployments.py", line 218, in load_flow_from_flow_run
    flow = await run_sync_in_worker_thread(load_flow_from_entrypoint, str(import_path))
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.9/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.9/site-packages/prefect/flows.py", line 876, in load_flow_from_entrypoint
    flow = import_object(entrypoint)
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/importtools.py", line 201, in import_object
    module = load_script_as_module(script_path)
  File "/usr/local/lib/python3.9/site-packages/prefect/utilities/importtools.py", line 164, in load_script_as_module
    raise ScriptError(user_exc=exc, path=path) from exc
prefect.exceptions.ScriptError: Script at 'mixpanel_to_s3_test.py' encountered an exception: FileNotFoundError(2, 'No such file or directory')
 > Running git_clone step...
 > Running pip_install_requirements step...
12:45:49.751 | INFO    | prefect.worker.docker.dockerworker 196bda7f-91c8-4f2d-aaf4-28af729f8af6 - Docker container 'purple-dormouse' has status 'exited'
12:45:49.768 | INFO    | prefect.worker.docker.dockerworker 196bda7f-91c8-4f2d-aaf4-28af729f8af6 - Docker container 'purple-dormouse' has status 'exited'

Christopher Boyd

07/10/2023, 3:52 PM
Your path there is fully qualified, but the clone is happening into /opt/prefect
Maybe redact your access token too

Elliott Wilson

07/10/2023, 3:53 PM
Thanks
Only exposed it for testing

Christopher Boyd

07/10/2023, 4:00 PM
I don’t know the rest for sure, but when you run prefect deploy, the entrypoint should be relative to the root of the repo, so ideally you’d be in the repo root when you run the command; something like:
image.png
If I’m in the root of my repo, and prefect.yaml is in the root of my repo, the entrypoint of the flow is ./healthcheck_flow_a:flow_a
but if it’s in a sub-folder, then the entrypoint would be: ./sub-folder/health_check_flow
This is all executed in the context of the git clone, which by default happens into /opt/prefect. So if my repo is multi-flow-demo, then the fully qualified path is:
/opt/prefect/flow/multi-flow-demo/healthcheck_flow_a
or the relative path to the flow is
./healthcheck_flow_a
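To make the entrypoint convention above concrete, here is a minimal sketch in plain Python (not Prefect's own code) of how an entrypoint string of the form path:function could map to a file once it is resolved against the directory the clone lands in. The helper name resolve_entrypoint is hypothetical, and the repo root is simply the example directory quoted in the message above.

```python
from pathlib import PurePosixPath


def resolve_entrypoint(entrypoint: str, base_dir: str) -> tuple[str, str]:
    """Split 'path:function' and resolve the path part against base_dir.

    Hypothetical helper for illustration only; it is not part of Prefect.
    """
    # Split on the last colon so the function name is always the final piece.
    script, _, function_name = entrypoint.rpartition(":")
    if not script or not function_name:
        raise ValueError(f"expected 'path:function', got {entrypoint!r}")
    # Joining an absolute script path discards base_dir entirely, which is
    # why a fully qualified entrypoint ignores where the clone landed.
    resolved = PurePosixPath(base_dir) / script
    return str(resolved), function_name


# Example directory taken from the message above.
repo_root = "/opt/prefect/flow/multi-flow-demo"

relative = resolve_entrypoint("./healthcheck_flow_a:flow_a", repo_root)
absolute = resolve_entrypoint(
    "/monorepo/data/prefect/mixpanel_to_s3_test.py:mixpanel_to_s3_test", repo_root
)

print(relative)  # ('/opt/prefect/flow/multi-flow-demo/healthcheck_flow_a', 'flow_a')
print(absolute)  # ('/monorepo/data/prefect/mixpanel_to_s3_test.py', 'mixpanel_to_s3_test')
```

The second call shows why a fully qualified path is fragile: it only works if the file really exists at that absolute location inside the container, regardless of where the repository was cloned.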

Elliott Wilson

07/10/2023, 4:03 PM
Ok so
Deployment:
# Welcome to your prefect.yaml file! You can use this file for storing and managing
# configuration for deploying your flows. We recommend committing this file to source
# control along with your flow code.

# Generic metadata about this project
name: prefect
prefect-version: 2.10.18

# build section allows you to manage and build docker images
build:
    - prefect_docker.deployments.steps.build_docker_image:
        id: build_image
        requires: prefect-docker>=0.3.1
        image_name: gaia-mixpanel-to-s3-test
        tag: '1.4'

pull:
    - prefect.deployments.steps.git_clone:
        repository: https://github.com/gaia-family/monorepo.git
        branch: prefect_test
        access_token:
    - prefect.deployments.steps.pip_install_requirements:
        requirements_file: requirements.txt

# the deployments section allows you to provide configuration for deploying flows
deployments:
- name: mixpanel_to_s3_test
  path: /monorepo/data/prefect/mixpanel_to_s3_test.py
  entrypoint: mixpanel_to_s3_test.py:mixpanel_to_s3_test
  version: null
  tags: []
  description: This flow extracts mixpanel data from the api and sends it to an s3 bucket.
  schedule: {}
  parameters: {}
  schedule:
    cron: 0 2 * * *
    timezone: Europe/London
    day_or: true
  work_pool:
    name: default-agent-pool
    work_queue_name: default
    job_variables:
      image: '{{ build_image.image }}'
      tag: '{{ build_image.tag }}'
The flow is at /monorepo/data/prefect/mixpanel_to_s3_test.py
So the path is /monorepo/data/prefect/ and the entrypoint is ./mixpanel_to_s3_test.py:mixpanel_to_s3_test

Christopher Boyd

07/10/2023, 4:17 PM
You’ll want to update the path so it no longer includes the entrypoint file.
The path should be the directory leading to the entrypoint: if you were to cd to that directory, you’d be able to execute the entrypoint as another command.
I’m assuming /monorepo/data/prefect is your working directory? Or is that the path inside the repo?
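The "cd to the path, then run the entrypoint" rule of thumb can be sketched as a small helper that splits a full file location into the two values. This is only an illustration in plain Python; split_path_and_entrypoint is a hypothetical name, and whether the worker actually honours path at run time is not settled in this thread.

```python
from pathlib import PurePosixPath


def split_path_and_entrypoint(flow_file: str, flow_function: str) -> tuple[str, str]:
    """Return (path, entrypoint), where path is the directory you would cd into.

    Hypothetical helper for illustration only; it is not part of Prefect.
    """
    location = PurePosixPath(flow_file)
    # The directory part is what you would cd into ...
    path = str(location.parent)
    # ... and the entrypoint is what you would run from inside that directory.
    entrypoint = f"{location.name}:{flow_function}"
    return path, entrypoint


path, entrypoint = split_path_and_entrypoint(
    "/monorepo/data/prefect/mixpanel_to_s3_test.py", "mixpanel_to_s3_test"
)
print(path)        # /monorepo/data/prefect
print(entrypoint)  # mixpanel_to_s3_test.py:mixpanel_to_s3_test
```

Note that the file name ends up only in the entrypoint, never in the path, which is the change being suggested for the deployment above.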

Elliott Wilson

07/10/2023, 4:21 PM
Working dir is /.
Same error:
Screenshot 2023-07-10 at 13.29.32.png
Screenshot 2023-07-10 at 13.30.58.png

Christopher Boyd

07/10/2023, 4:32 PM
Do you have a Dockerfile I can use to test? I have a simple flow I can use, but I’d like to replicate your image config.

Elliott Wilson

07/10/2023, 4:32 PM
Sure
# Use a base image for the desired architecture
FROM --platform=linux/amd64 prefecthq/prefect:2.10.18-python3.9
# Copy the requirements file
COPY requirements.txt /opt/prefect/prefect/requirements.txt
# Install the Python dependencies
RUN python -m pip install -r /opt/prefect/prefect/requirements.txt
# Copy the application files to the container
COPY . /opt/prefect/prefect/
# Set the working directory
WORKDIR /.
I don't believe you need the requirements or the copy

Christopher Boyd

07/10/2023, 4:33 PM
yep, let me do some testing and get back to you

Elliott Wilson

07/10/2023, 4:33 PM
Thanks @Christopher Boyd

Christopher Boyd

07/10/2023, 5:12 PM
@Elliott Wilson - I got this working, and I can share my steps. I think there are some places where your values can deviate, but this worked for me
$ cat Dockerfile

FROM --platform=linux/amd64 prefecthq/prefect:2.10.18-python3.9
WORKDIR /.
$ cat prefect.yaml
# Welcome to your prefect.yaml file! You can use this file for storing and managing
# configuration for deploying your flows. We recommend committing this file to source
# control along with your flow code.

# Generic metadata about this project
name: path_and_entry_flow
prefect-version: 2.10.18

# build section allows you to manage and build docker images
build: null

# push section allows you to manage if and how this project is uploaded to remote locations
push: null

# pull section allows you to provide instructions for cloning this project in remote locations
pull:
- prefect.deployments.steps.git_clone:
    repository: https://github.com/chrisaboyd/Samples.git
    branch: main
    access_token: null
- prefect.deployments.steps.run_shell_script:
    id: test
    script: ls -l
    stream_output: true

# the deployments section allows you to provide configuration for deploying flows
deployments:
- name: communitypath
  description: null
  flow_name: null
  entrypoint: ./Prefect/hello_world.py:hello_world
  parameters: {}
  work_pool:
    name: kubernetes
    work_queue_name: null
    job_variables:
      image: chaboy/test:community
So the full path where my flow exists once it is cloned into the image is:
/Samples/Prefect/hello_world.py:hello_world
The workdir is the same as yours (at /). I believe the clone step itself does a cd into the cloned repo, so the paths become relative from there
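The effect described above can be reproduced without Prefect at all. The sketch below builds a throwaway directory layout mirroring the Samples repo and checks the same relative entrypoint path from two working directories; relative_entrypoint_exists is a hypothetical helper, and the assumption that the clone step changes into the cloned repo comes from the message above rather than from verified Prefect behaviour.

```python
import os
import tempfile
from pathlib import Path


def relative_entrypoint_exists(script: str, cwd: Path) -> bool:
    """Check whether a relative script path exists when the process runs in cwd."""
    previous = Path.cwd()
    os.chdir(cwd)
    try:
        return Path(script).exists()
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for the container's working directory with a cloned repo inside.
    repo = Path(workdir) / "Samples"
    (repo / "Prefect").mkdir(parents=True)
    (repo / "Prefect" / "hello_world.py").write_text("def hello_world(): ...\n")

    script = "./Prefect/hello_world.py"
    found_from_workdir = relative_entrypoint_exists(script, Path(workdir))
    found_from_repo = relative_entrypoint_exists(script, repo)

print(found_from_workdir)  # False: the path is relative to the repo, not the workdir
print(found_from_repo)     # True: after changing into Samples the same string resolves
```

The first case is the analogue of the FileNotFoundError in the original log: the entrypoint string is fine, but it is being resolved from the wrong directory.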

Elliott Wilson

07/10/2023, 5:17 PM
Ahh ok
So it would be path /data/prefect/ and then same entrypoint
c

Christopher Boyd

07/10/2023, 5:18 PM
so I admit, it’s not intuitive when / where you might use the path, and the relative paths
yea
I’d need to mess with the path some more to see what effect that has, I’m not entirely sure
but you can do it without

Elliott Wilson

07/10/2023, 5:19 PM
Can you have an entrypoint like: /data/prefect/mixpanel_to_s3_test.py:mixpanel_to_s3_test?

Christopher Boyd

07/10/2023, 5:20 PM
I think so yea - I’d need to test that, but I think that would work

Elliott Wilson

07/10/2023, 5:21 PM
I think that will complain locally on the deploy
I will try with path and let you know
Yeah, I don't think it is taking the path into account
The workaround is to use the script command to change directory to the path.

Christopher Boyd

07/10/2023, 5:40 PM
Nice, thank you - so you used the set_working_directory deployment step?

Elliott Wilson

07/10/2023, 5:40 PM
Hmm no
Screenshot 2023-07-10 at 14.40.48.png
Damn, another error on the shell script

Christopher Boyd

07/10/2023, 5:54 PM
What’s interesting is that it just runs in a subshell and just returns stdout / stderr,
so I don’t think the run_shell_script is actually doing anything here
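The subshell point can be illustrated with the standard library alone. The sketch assumes, as suggested above, that a shell-script step launches its command in a child process; it does not call Prefect, and it needs a POSIX sh on the machine running it. A cd inside the child changes only the child's directory, while os.chdir in the current process is what persists.

```python
import os
import subprocess
import tempfile

start_dir = os.getcwd()

with tempfile.TemporaryDirectory() as target:
    # A "cd" inside a child shell only affects that child process.
    result = subprocess.run(
        ["sh", "-c", 'cd "$1" && pwd', "sh", target],
        capture_output=True,
        text=True,
        check=True,
    )
    child_dir = result.stdout.strip()
    dir_after_child = os.getcwd()
    child_moved = os.path.realpath(child_dir) == os.path.realpath(target)

    # Changing directory in the current process is what actually persists.
    os.chdir(target)
    parent_moved = os.path.realpath(os.getcwd()) == os.path.realpath(target)
    os.chdir(start_dir)  # restore before the temporary directory is removed

print(child_moved)                   # the child shell did change directory
print(dir_after_child == start_dir)  # but the parent process stayed where it was
print(parent_moved)                  # os.chdir moved the parent itself
```

This is consistent with the observation above: a script step that only runs cd would report success yet leave the flow-loading process in its original directory, whereas a step that changes the process's own working directory (such as the set_working_directory step mentioned earlier) would be the in-process counterpart.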

Elliott Wilson

07/10/2023, 5:55 PM
I am out of ideas for what I can do then

Elliott Wilson

07/10/2023, 6:15 PM
This works, thank you @Christopher Boyd