# prefect-getting-started
v
hi, I have a question about entrypoints and docker containers
I have a simple flow and a docker work pool and agent running
```python
myflow.deploy(
    name="myflow-docker",
    work_pool_name="mypool-docker",
    image=DeploymentImage(
        name="myflow-image",
        tag="latest"
    ),
    push=False,
    build=False,
)
```
After running the above Python script, the deployment is created. If I inspect it with
```shell
prefect deployment inspect 'myflow-flow/myflow-docker'
```
it shows `'path': '.', 'entrypoint': 'src/path/a/b/c'`. How can I override the entrypoint and path of this flow when building this container? The only mention I found in your docs is here, and it gives no actual information about how to do this: https://docs.prefect.io/latest/concepts/deployments/?h=entrypoint#required-data
n
hi @Vlad S - if you want to override the entrypoint you should write your own Dockerfile and provide that to `.deploy()` like this, but you'll have to be careful overriding the entrypoint since Prefect leverages a specific one. what is your goal with overriding the entrypoint?
v
I do have a custom Dockerfile that uses `FROM prefecthq/prefect:2-python3.11`, but how would I override the entrypoint? Do I just give it the flow itself as the CMD parameter, or what specifically? E.g., would this work: `CMD ["python", "flows/myflow.py"]`
that's the guide I was following. I would like to be able to mount my application to something like /app/ inside the container (for example, the flow would be at /app/myflow.py). However, on my local machine my code lives at a different location, so without setting the entrypoint it looks like Prefect gives it the entrypoint 'src/path/a/b/c', which is the path on my local machine, not inside the Docker container.
hi. any updates on this?
n
sorry, i think there’s some confusion around the language here. i would not recommend using CMD or ENTRYPOINT in a Dockerfile that you plan to have a Prefect worker execute. if you want to `.serve` a deployment then sure, adding some `CMD ["python", "file.py"]` makes sense. when i was saying entrypoint i meant the Dockerfile ENTRYPOINT, but i see you might mean entrypoint as in the relative path from the repo root to the flow-decorated function. as for your original question, you should not need to worry about `path` at all; you should just set the entrypoint in your prefect.yaml relative to your repo root
example or two above
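A minimal prefect.yaml sketch of what that looks like (the deployment name, file path, and pool name below are placeholders, not the exact ones from this thread):

```yaml
# prefect.yaml, at the repo root
deployments:
- name: myflow-docker
  # entrypoint is <path/to/file.py>:<flow_function_name>, relative to the repo root
  entrypoint: flows/myflow.py:myflow
  work_pool:
    name: mypool-docker
```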
v
so that means we have to use yaml files to specify the flow's entrypoint? (what if my flow is at /folder1 on my host, but I want it mounted at /app/scripts inside the Docker container?) I don't see any other parameter in the `deploy()` method signature. Basically I'm looking for the equivalent of the `path` parameter of the `build_from_flow()` method https://docs.prefect.io/latest/api-ref/prefect/deployments/deployments/#prefect.deployments.deployments.Deployment which is being deprecated.
n
if you don’t want yaml then yeah, `.deploy` is what you want - can you clarify what your issue is with that?
and do you really mean mounted? if so, why? why not just COPY what you need into the image and preserve your repo structure as you do?
@Marvin show a from_source example to show how to set an entrypoint
m
Certainly, if you're looking to see how to use `from_source` with Prefect to set an entrypoint, you'll actually use the `serve` function of the flow. Here is an example to show how this is done:
```python
import time
from prefect import flow, task
from pydantic import BaseModel, Field

class Data(BaseModel):
    key: str
    value: dict = Field(default_factory=dict)

@task
def fetch_data_for_key(key: str) -> Data:
    """A task that simulates fetching data from some source"""
    return Data(key=key, value={"data": "some data"})

@task
def process_data(data: Data) -> Data:
    """A task that simulates processing data"""
    print(f"Processing {data!r} (I swear)")
    time.sleep(3)
    return data

@flow
def work_i_need_to_do_sometimes(keys: list[str]):
    """Work you might need to do every so often"""

    raw_data_futures = fetch_data_for_key.map(keys)

    process_data.map(raw_data_futures)

if __name__ == "__main__":
    work_i_need_to_do_sometimes.serve(
        name="sporadic-deployment",
        parameters={"keys": ["foo", "bar", "baz"]},
    )
```
By calling `serve` on the flow, you can start it as a deployment with a given name and parameters. For more in-depth details, you can check the code here.
n
🙄
fwiw this is my preferred programmatic (no yaml) method, mostly bc i don’t like building images every time my flow code changes
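For reference, a sketch of that `from_source` + `.deploy()` pattern; the repo URL, entrypoint path, and pool name are placeholder assumptions, and this is untested against a live Prefect server:

```python
from prefect import flow

if __name__ == "__main__":
    # Pull the flow code from a remote source at run time, so the image
    # does not need rebuilding every time the flow code changes.
    flow.from_source(
        source="https://github.com/org/repo.git",  # hypothetical repo
        entrypoint="flows/myflow.py:myflow",       # <file>:<flow function>, relative to repo root
    ).deploy(
        name="myflow-docker",
        work_pool_name="mypool-docker",
        build=False,  # don't build an image here; the work pool's image is used
    )
```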
v
I'm going to explore the yaml file approach you mentioned above. We don't want to serve the flows from a GitHub repo; we'd like them served from a local path inside the Docker container where the agent is running. Also, I thought the `.serve()` method spins up its own worker and runs the flow in a separate process; I'd like the flow to run on one (or more) workers deployed in our infrastructure.
n
serve does not spin up a worker, technically. it’s like a process worker that can’t do arbitrary setup (pull steps), but it’s really nice if you have static infra to serve a process on. the yaml and `.deploy` (worker-based deployments) are for when you need dynamic dispatch of infra (ECS/k8s etc) or pull steps
v
so if I have 10 agents, each running in their own Docker containers, all processing from work pool 'x', let's say, and I have one Docker container definition. my source code is at /app/flow1.py, flow2.py etc. inside the Docker container. what do I need to do to run these? `deploy()`, `serve()` or something else? i'm confused
n
first question would be why have 10 agents?
v
just an example. let's say 2
n
v
in the yaml file, is there a way to define a deployment in such a way that it doesn't automatically build a Docker container, but just pulls an existing image that is already available? so instead of doing this, just pull the image named my-image:latest:

    # - prefect_docker.deployments.steps.build_docker_image:
    #     id: build_image
    #     requires: prefect-docker>=0.3.1
    #     image_name: my-image
    #     tag: latest
n
and set your build step to just null, and what i linked will override the `image` job variable from the work pool for that deployment, which you can do for any job variable on a given work pool
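Putting that together, a hedged prefect.yaml sketch (deployment name, entrypoint, and image tag are placeholders):

```yaml
# prefect.yaml: no build/push steps, just reference an image that already exists
build: null
push: null

deployments:
- name: myflow-docker
  entrypoint: flows/myflow.py:myflow  # relative to the repo root
  work_pool:
    name: mypool-docker
    job_variables:
      image: my-image:v1  # a pre-built image available to the local Docker daemon
```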
v
I tried the above. I built a local image with tag t1:latest but I'm getting this: `docker.errors.ImageNotFound: 404 Client Error for http+docker://localhost/v1.42/images/create?tag=latest&fromImage=t1: Not Found ("pull access denied for t1, repository does not exist or may require 'docker login': denied: requested access to the resource is denied")`
n
I've gotten that error when I did not include my registry/repo name in the `image`
v
I rebuilt with a repo name, but that doesn't work either:
```yaml
job_variables:
  # image: '{{ build_image.image }}'
  image: local/t1:latest
```
n
sorry, I don't have more helpful advice without more info, but if you're getting a 404 you likely aren't correctly referring to your local docker images
v
it looks like the issue is with local docker containers or reaching the local container host (which is running in my case). I can run it with `image: redis`, which is an image from a public repo, and it builds at least. I'm able to run that container with that same tag, so this looks like an issue on your end.
I ran
```shell
docker build . -t works-fine:local -f Dockerfile.demo
```
and then
```shell
prefect --no-prompt deploy -n test
```
and
```shell
prefect worker start --pool 'docker-work'
prefect deployment run 'healthcheck/test'
```
v
yeah, I'm pretty much doing the same thing 🤔
n
i think what's happening is that if we have a `latest` tag, we assume the image is remote, based on some k8s convention I think. you could open an issue if you don't think that behavior is ideal
v
yeah, I think that's what it was. the latest tag was the issue; used a different tag and it's fine
👍 1
i think i got something working for now. thanks. i'm sure i'll have more questions for u
n
np, will help when I can. feel free to ask high-level questions that don't require precise syntax in #C04DZJC94DC since it's pretty good at concepts but sometimes confabulates syntax
v
the yaml approach worked much better. it would be nice if the `deploy()` method had parity in functionality with the prefect.yaml approach
one question I have: if I run a worker inside a Docker container, and the worker will run a Docker container when it gets a new flow run, how would that work? I thought Docker containers can't run other Docker containers. is that an anti-pattern?
I'm trying to figure out how to deploy one or more workers on our infrastructure, given that a worker will spin up one or more Docker containers for flow runs
n
```yaml
services:
  worker:
    image: prefecthq/prefect:2-python3.12
    restart: always
    command:
      [
        "prefect",
        "worker",
        "start",
        "--pool",
        "docker-work",
        "--install-policy",
        "if-not-present"
      ]
    env_file:
      - ../../.env
    ports:
      - "8081:8081" # if you want healthchecks, otherwise not necessary
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - DOCKER_HOST=unix:///var/run/docker.sock
```
this part will make the containers get created on the host machine that's running the worker as a container:
```yaml
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - DOCKER_HOST=unix:///var/run/docker.sock
```
v
thanks will try this out
for the worker service above, how would authentication settings be provided for Prefect Cloud? can auth info be passed in as env vars to that container, or is there some other way to run the `prefect cloud login ...` command?
I think I got my answer from marvin 😉
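For anyone following along, the usual answer is env vars: `PREFECT_API_URL` and `PREFECT_API_KEY` are Prefect settings the worker reads on startup, so no `prefect cloud login` is needed inside the container. A sketch of the compose fragment, with placeholder values:

```yaml
services:
  worker:
    environment:
      - PREFECT_API_URL=https://api.prefect.cloud/api/accounts/<account-id>/workspaces/<workspace-id>
      - PREFECT_API_KEY=<api-key>
```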
b
> the yaml approach worked much better. it would be nice if the deploy() method had parity in functionality with the prefect.yaml approach

Fully agree! The docs really make these sound equivalent.