Pretty new to Prefect. I'm self hosting on k8s and...
# ask-community
s
Pretty new to Prefect. I'm self-hosting on k8s and I was thinking about doing the following:
• Building and pushing my images without using Prefect
• Then declaring my deployments
Is this possible? If yes, I am a little lost on direction and would ❤️ some advice.
k
This is definitely possible! I can help if you provide a little more detail by answering these questions:
• How are you building and pushing images?
• What method are you using for creating Prefect deployments?
s
I want to use Pants2 to build and push images. (Not actually pushing anything yet, but I will use a local image registry as I am building this up locally first.) So far I don't have a method for creating Prefect deployments. Just starting to get up to speed here.
I have `prefect-server` and `prefect-worker` deployed to my local cluster.
k
When you create Prefect deployments, you can specify the image name you want to use when your deployment runs, and depending on which deployment method you use, you can skip the steps that would build and push an image.
With `.deploy` you can set `build=False` and `push=False` and then `image="my_registry/my_image:my_image_tag"`. https://docs.prefect.io/latest/guides/prefect-deploy/?#additional-configuration-with-deploy
With `prefect.yaml` and `prefect deploy` you can leave `build:` and `push:` as `null` and set the `image` under your deployment's `job_variables`. https://docs.prefect.io/latest/guides/prefect-deploy/?#deployment-actions
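For illustration, here is a minimal sketch of the `.deploy` route with the build and push steps skipped. The helper name `deploy_prebuilt` is hypothetical, and the sketch assumes `flow_obj` is a Prefect flow (anything exposing a `.deploy` method with these keyword arguments); the image is assumed to already exist in a registry the cluster can reach.

```python
from typing import Any


def deploy_prebuilt(flow_obj: Any, *, name: str, work_pool_name: str, image: str) -> Any:
    """Register a deployment that points at an image built and pushed elsewhere.

    `flow_obj` is assumed to be a Prefect flow; this helper is a hypothetical
    wrapper and not part of Prefect itself.
    """
    return flow_obj.deploy(
        name=name,
        work_pool_name=work_pool_name,
        image=image,   # reference to the prebuilt image
        build=False,   # do not build an image during deployment
        push=False,    # do not push an image during deployment
    )
```

With a flow defined at module level, the call would look something like `deploy_prebuilt(my_flow, name="example", work_pool_name="development", image="my_registry/my_image:my_image_tag")`, where `my_flow` is a placeholder for your own flow.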
s
So the pull section is also null?
k
That'll depend on where your code is. Prefect deployments need an `entrypoint`, which is a path to a Python script and a flow function name, something like `flows/my_flow.py:flow_func`. That path is relative to the current working directory, so if your code is in an image, and that image's starting dir doesn't match your deployment's entrypoint, you can use the pull step to set the working dir.
If your deployment is grabbing code from elsewhere, like a git repo or a bucket, you'll need to specify that in the pull step. Having an image for dependency management but the actual flow code in a repo or some other remote storage is a common setup, since it means you don't have to rebuild your image every time your code changes. In this case the pull step will clone the repo or otherwise copy down your remote code files before attempting to find and execute the deployment's entrypoint.
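To make the `path:function` idea concrete, here is a simplified sketch of resolving an entrypoint relative to the current working directory. It is not Prefect's actual loader; `load_entrypoint` and the throwaway project layout are hypothetical, and the `os.chdir` call only stands in for what a pull step that sets the working directory would achieve.

```python
import importlib.util
import os
import tempfile
from pathlib import Path


def load_entrypoint(entrypoint: str):
    """Resolve 'path/to/file.py:func_name' relative to the current working directory."""
    path_str, func_name = entrypoint.rsplit(":", 1)
    path = Path.cwd() / path_str
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the script so its functions are defined
    return getattr(module, func_name)


# Demo: build a throwaway project layout, then resolve the entrypoint from inside it.
project = Path(tempfile.mkdtemp())
(project / "flows").mkdir()
(project / "flows" / "my_flow.py").write_text(
    "def flow_func():\n    return 'ran flow_func'\n"
)

previous_dir = os.getcwd()
os.chdir(project)  # stand-in for a pull step that sets the working directory
try:
    result = load_entrypoint("flows/my_flow.py:flow_func")()
finally:
    os.chdir(previous_dir)

print(result)
```

If the working directory were anything other than `project`, the same entrypoint string would fail to resolve, which is why the starting directory of the image matters.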
s
Ok I think I'm still a little lost here. I thought I could package my code in an image, and then have Prefect "run" that on my cluster. I don't get why there needs to be an entrypoint, since I am just spinning up a container to do the work and the entrypoint is defined when I build the image.
I'd like to avoid injecting code into an image that might not have the right dependencies. I'm not sure that's a great approach. That's one of the reasons I am using Pants, so that I can make fast, reproducible builds in a monorepo.
k
Yeah, grabbing the code from remote at runtime is just an option. You can still package your code in an image. Entrypoint as a property of a deployment enables consistent behavior between all the different ways your flow code can be acquired. Instead of setting the entrypoint in the Dockerfile, you can use a Prefect base image like `prefecthq/prefect:2-latest`, add dependencies and code on top of that, and let Prefect handle starting your flow via the deployment's config. You're welcome to override all this behavior or use your own image entirely if you want, though.
s
Where can I find the full deployments spec?
s
Thanks for your help. Is there a spec for infra_overrides/job_variables map?
k
Which job variables you can set to start off depends on which type of work pool you're using, since each work pool provides a default template for starting flow runs on its matching infrastructure. Beyond that, you can edit the work pool template to add your own variables, which can then be overridden as `job_variables` on a deployment.
I would recommend creating a work pool of the type you need, then going to the three dots -> edit -> advanced, just to make the mental connection between what the template does and how the variable names are used in a deployment's `job_variables`.
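As a rough mental model of that connection, the sketch below fills placeholders in a template from defaults and then from a deployment's `job_variables`. The template here is a heavily trimmed, hypothetical stand-in, not the real Kubernetes work pool template, and `render_job` is not a Prefect function; the default values are assumptions made for the example.

```python
import re

# Hypothetical, heavily trimmed stand-in for a work pool's base job template.
BASE_JOB_TEMPLATE = {
    "variables": {
        "properties": {
            "image": {"type": "string", "default": "prefecthq/prefect:2-latest"},
            "namespace": {"type": "string", "default": "default"},
            "image_pull_policy": {"type": "string", "default": "IfNotPresent"},
        }
    },
    "job_configuration": {
        "namespace": "{{ namespace }}",
        "container": {
            "image": "{{ image }}",
            "imagePullPolicy": "{{ image_pull_policy }}",
        },
    },
}

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def _fill(node, values):
    """Recursively replace '{{ name }}' placeholders in strings."""
    if isinstance(node, dict):
        return {key: _fill(value, values) for key, value in node.items()}
    if isinstance(node, list):
        return [_fill(item, values) for item in node]
    if isinstance(node, str):
        return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), node)
    return node


def render_job(template: dict, job_variables: dict) -> dict:
    """Start from the template's defaults, then let job_variables override them."""
    defaults = {
        name: spec.get("default")
        for name, spec in template["variables"]["properties"].items()
    }
    values = {**defaults, **job_variables}
    return _fill(template["job_configuration"], values)


job = render_job(
    BASE_JOB_TEMPLATE,
    {
        "image": "prefect-example:latest",
        "namespace": "prefect-worker",
        "image_pull_policy": "Never",
    },
)
print(job)
```

Any variable a deployment leaves out simply falls back to the default declared in the template, which is why the available names depend on the work pool type.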
s
Ok, I think I am starting to put it together. What I actually wanted was a kubernetes job, so I used the cli to generate me something to start with.
Hey @Kevin Grismore. I got it all working with Workers, and now it all seems pretty straightforward. I found it pretty confusing in the docs to tell what refers to worker-type deployments and what refers to agent-type deployments, and to find complete specs without having to go into the UI, look through the JSON, and work out how parts of that relate to parts of the prefect file. One thing I got really confused about was actually deploying the `prefect.yaml` file. When you look at `prefect deployment apply --help` it refers to a deployment yaml file. This, I learnt, has nothing to do with the deployment I wanted to deploy. Instead it was `prefect --no-prompt deploy --name example` I was looking for, which doesn't say anything about running against the `prefect.yaml` file, so it was all a bit magic. It's also pretty counterintuitive to have to supply an `entrypoint` when that isn't even an input to the standard Kubernetes job template. It's a little bit annoying to have to put something valid there to get it to run, but mostly a little misleading IMO. I hope you like getting feedback. In general I am liking a lot about the framework, and it feels like it's moving in the right direction. This is what I ended up with, which I would have struggled to achieve without your help, so big thanks!
```yaml
name: flows
prefect-version: 2.13.8
build:
push:
pull:

deployments:
- name: example
  schedule:
  work_pool:
    name: development
    job_variables:
      image: prefect-example:latest
      namespace: prefect-worker
      image_pull_policy: Never
    work_queue_name:
  entrypoint: src/examples/flow/__main__.py:get_repo_info
  version:
  tags: []
  description:
  parameters: {}
```
Just a final thought. Pretty much all my pain points are covered in the guide, but perhaps the guides could be a little more step-by-step, in a "hello world" style.
k
Your feedback is much appreciated! We know that there's some confusing terminology and concept overlap, and everything you share helps us improve.
I know I often say "yaml-based deployments" when talking about `prefect.yaml`, but that's not entirely clear without more specificity.
s
Documentation is hard 🤣. I often find that projects forget, after being around for a while, what it's like to have no knowledge about the general flow of things when starting out; a lot of prior knowledge is sometimes assumed. I think you've done a good job, but it's important to try to think, "What if I knew nothing, would this make sense?" Anyway, I appreciate you taking the time to chat, and have a great weekend!