Are there any examples of a Dockerfile for Prefect...
# prefect-server
b
Are there any examples of a Dockerfile for Prefect agents successfully deployed via Kubernetes?
k
The prefect docker image should be able to do this already. Have you seen this part of the docs? It uses the base Prefect image
b
Thanks. I'm running something simple like this as a Dockerfile. Do I need to integrate that command into the script?
FROM prefecthq/prefect:latest

COPY dockerfile-requirements.txt /requirements.txt

RUN pip install -r /requirements.txt
k
When you run `prefect agent kubernetes install`, it generates a spec. Look for the image line in that spec and replace it with yours. Then you need to push your image somewhere Kubernetes can pull it from.
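The image swap described above can also be scripted. This is a minimal sketch, assuming the generated spec contains a plain `image:` line for the agent container. The function name, the sample manifest fragment, and the registry path are hypothetical placeholders, not output copied from Prefect.

```python
import re


def replace_image(manifest_text: str, new_image: str) -> str:
    """Return manifest_text with the value of every `image:` line replaced."""
    # Keep the indentation and optional "- " list marker; swap only the value.
    pattern = re.compile(r"^([ \t]*(?:-[ \t]+)?image:[ \t]*).*$", flags=re.MULTILINE)
    return pattern.sub(lambda m: m.group(1) + new_image, manifest_text)


# Hypothetical fragment shaped like a Deployment container entry.
example = """\
      containers:
      - name: agent
        image: prefecthq/prefect:latest
        imagePullPolicy: Always
"""

patched = replace_image(example, "registry.example.com/my-project/my-agent:latest")
print(patched)
```

Editing the file by hand works just as well; a script like this is only useful if the manifest is regenerated often.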
b
Thanks. I suspect I'm not the typical use case, but for this newbie the biggest question, after being blown away by your product, is, "How the heck do I get an agent plugged into my cloud UI so I can run some jobs for real?"
And that's where I'm flailing at the moment.
k
We don’t host anything, so the agent is just something on your infrastructure that polls the Prefect API every 10 seconds for Flows to run. You can spin one up on your local machine just by running `prefect agent local start`. You can spin it up in a VM the same way.
For the Kubernetes one, you follow the link above, where you do the install and then apply it to your cluster to spin up an agent pod.
b
Local stuff is working great with UniversalRun and the like. Gonna take a stab at blindly following your k8s docs and report back. Thank you for your help.
This booted your generic image in my Google Cloud k8s cluster. So that feels like progress.
prefect agent kubernetes install -k $API_KEY > k8s.yml
kubectl apply -f k8s.yml
k
yes exactly. just open the yaml and stick in your image
b
Now I'll try substituting in the path to my custom Docker image, which is one of those long Google Artifact Registry strings.
Okay! That's booted. Progress!
So how does it know to connect with my Prefect UI account on your site?
k
With the API key
b
And that needs to be in the YAML that goes up there too?
Yeah, I see it in there.
Logs in the k8s deployment are raising this issue.
k
Probably an authentication problem where you can’t pull the image, right?
b
raise ClientError("Malformed response received from API.") from exc
prefect.exceptions.ClientError: Malformed response received from API.
The prefect welcome message is in the Google k8s logs, so I think it's booting.
k
Seems your API key is off? Does it work on local agent?
b
Trying that now with:
pipenv run prefect agent local start -k $API_KEY;
Looks good locally:
$ pipenv run prefect agent local start -k $API_KEY;
Loading .env environment variables...
[2022-01-27 20:35:53,744] INFO - agent | Registering agent...
[2022-01-27 20:35:53,777] INFO - agent | Registration successful!
 ____            __           _        _                    _
|  _ \ _ __ ___ / _| ___  ___| |_     / \   __ _  ___ _ __ | |_
| |_) | '__/ _ \ |_ / _ \/ __| __|   / _ \ / _` |/ _ \ '_ \| __|
|  __/| | |  __/  _|  __/ (__| |_   / ___ \ (_| |  __/ | | | |_
|_|   |_|  \___|_|  \___|\___|\__| /_/   \_\__, |\___|_| |_|\__|
                                           |___/
[2022-01-27 20:35:53,786] INFO - agent | Starting LocalAgent with labels ['little-tokyo']
[2022-01-27 20:35:53,786] INFO - agent | Agent documentation can be found at https://docs.prefect.io/orchestration/
[2022-01-27 20:35:53,786] INFO - agent | Waiting for flow runs...
[2022-01-27 20:35:53,829] INFO - agent | Deploying flow run 7afc4aaf-28e2-474b-b62d-6894458341c4 to execution environment...
[2022-01-27 20:35:53,876] INFO - agent | Completed deployment of flow run 7afc4aaf-28e2-474b-b62d-6894458341c4
k
Check that the YAML has the same key and that it looks alright. Also, try giving it a bad token to see if you can replicate the error.
b
It's showing up on my cloud UI
YAML has the same key
Could it be this in the YAML?
- name: PREFECT__BACKEND
  value: server
k
Yes, that should be cloud, I think, unless you have your own Prefect Server deployment.
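For anyone who wants to script that change instead of editing the YAML by hand, here is a minimal sketch. It assumes the env entries are written as a `- name:` line immediately followed by a `value:` line, as in the snippets in this thread. The helper name and the sample fragment are hypothetical.

```python
import re


def set_env_value(manifest_text: str, name: str, value: str) -> str:
    """Replace the `value:` that directly follows `- name: <name>`."""
    pattern = re.compile(
        r"(-[ \t]+name:[ \t]*" + re.escape(name) + r"[ \t]*\n[ \t]*value:[ \t]*).*"
    )
    # Group 1 is everything up to the old value, so the layout is preserved.
    return pattern.sub(lambda m: m.group(1) + value, manifest_text)


# Hypothetical fragment shaped like the env section of the agent spec.
env_block = """\
        - name: PREFECT__BACKEND
          value: server
        - name: PREFECT__CLOUD__AGENT__LABELS
          value: '[]'
"""

fixed = set_env_value(env_block, "PREFECT__BACKEND", "cloud")
print(fixed)
```

Other entries are left untouched because the pattern only matches the exact variable name.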
b
Okay. I'm learning!
Let me try that now.
Oh boy.
Thanks for getting me this far. Now I try to run a flow
Now I'm in "label problem" land.
So I'm going to make a label specific to this flow and add it to the YAML, I think.
- name: PREFECT__CLOUD__AGENT__LABELS
  value: '["warn-prefect-flow"]'
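The value in that YAML entry is a JSON list written as a string. The sketch below shows one way to produce it and to check which of a flow's labels are missing on the agent. It assumes only that every label on the flow must also be present on the agent, and it does not try to reproduce Prefect's full matching rules. Both helper names are hypothetical.

```python
import json


def labels_env_value(labels):
    """Serialize labels as a JSON list string for the agent's env entry."""
    return json.dumps(list(labels))


def missing_on_agent(flow_labels, agent_labels):
    """Return the flow labels that the agent does not carry, sorted."""
    return sorted(set(flow_labels) - set(agent_labels))


flow_labels = ["warn-prefect-flow"]
agent_labels = json.loads(labels_env_value(["warn-prefect-flow"]))

print(labels_env_value(flow_labels))
print(missing_on_agent(flow_labels, agent_labels))
```

An empty result from `missing_on_agent` means the agent carries every label the flow asks for.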
k
Note that your flow needs to be in a place the agent can pull it from. Think GitHub storage.
b
Hmm. I thought I just needed to run the prefect "register" command? Does that not do it?
Label sync between the flow and k8s deploy got the task picked up.
The task then failed with this error:
State Message:
(403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'a4cc413f-ee09-419c-a727-5d307c0f199d', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': 'b72eaaa4-778e-43f7-9d3e-16334e3b1250', 'X-Kubernetes-Pf-Prioritylevel-Uid': '11303276-a26e-4d06-b683-8b59c0220e2f', 'Date': 'Thu, 27 Jan 2022 21:01:54 GMT', 'Content-Length': '311'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:default:default\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
Is this the RBAC stuff?
k
Yes, it is. Just use the --rbac flag. Basically, the agent needs permission to spin up new k8s jobs and doesn't have it by default.
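To make the permission problem concrete, here is a rough sketch of the kind of Role and RoleBinding that would let the `default` service account create jobs in the `default` namespace. It is an illustration built from the error message above, not a copy of what `--rbac` generates. The resource name `prefect-agent-jobs` and the verb list are assumptions, and a real agent may need more permissions than this.

```python
import json


def job_creator_rbac(namespace="default", service_account="default",
                     name="prefect-agent-jobs"):
    """Build a Role and RoleBinding granting job management in one namespace."""
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": ["batch"],  # the API group named in the 403 error
                "resources": ["jobs"],
                "verbs": ["create", "get", "list", "watch", "delete"],
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": name, "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": name,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
    }
    return [role, binding]


# kubectl also accepts JSON, so the objects can be wrapped in a List and printed.
manifest = {"apiVersion": "v1", "kind": "List", "items": job_creator_rbac()}
print(json.dumps(manifest, indent=2))
```

In practice the `--rbac` flag is the simpler route; the sketch only shows what sort of objects sit behind it.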
b
Great. Did that. Hurdled another bug! We're running now!
After restarting the k8s deploy with the --rbac option, I'm on to the next bug with a fresh flow run.
Failed to load and execute Flow's environment: ModuleNotFoundError("No module named '/home/palewire/Code/warn-prefect-flow/flow'")
So I suspect this is what you meant. I need to bundle the flow file itself into the Dockerfile?
It's just flow.py at the root of my repo, sitting next to the Dockerfile, right now.
Trying this edit to my Dockerfile followed by redeploy of the image artifact and the k8s pod.
FROM prefecthq/prefect:latest
COPY dockerfile-requirements.txt /requirements.txt
RUN pip install -r /requirements.txt
COPY flow.py /flow.py
I worry that the error above appears to contain the path to the home directory on my local computer, /home/palewire/Code/. Not sure how Prefect knows about that.
k
Check this message
b
I see. So I need to change how I register the flow?
k
Either use an accessible Storage, or use Docker storage to point to the Flow file with KubernetesRun. I have an example, one sec.
the comments at the top explain
b
So in this case, the user makes a separate Docker image strictly to hold the flow files, distinct from the agent image?
k
Yes, exactly, which is why build=False. Or you can have the Prefect interface build it for you. There are a number of ways. You can find more here, or some examples here. The important part, if you want to supply an image with the flow file, is `stored_as_script=True`.
This setup is common when you have 1 image but like 10 flows and don't want to make new images for each one.
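A rough sketch of that setup against the Prefect 1.x API, assuming an image that was built and pushed separately and already contains the flow file at `/flow.py`. The registry URL, image name, project name, and both function names are hypothetical, and this has not been run against a live cluster, so check it against the Docker storage docs before relying on it.

```python
def docker_storage_kwargs(registry_url, image_name, image_tag="latest"):
    """Settings for Docker storage that points at a script inside the image."""
    return {
        "registry_url": registry_url,
        "image_name": image_name,
        "image_tag": image_tag,
        "path": "/flow.py",        # where the Dockerfile copied the flow file
        "stored_as_script": True,  # load the flow from that file at run time
    }


def build_flow():
    """Build the flow; Prefect 1.x is imported here so the helper above has no dependencies."""
    from prefect import Flow, task
    from prefect.run_configs import KubernetesRun
    from prefect.storage import Docker

    @task
    def say_hello():
        print("hello")

    storage = Docker(
        **docker_storage_kwargs("registry.example.com/my-project", "warn-prefect-flow")
    )
    run_config = KubernetesRun(labels=["warn-prefect-flow"])
    with Flow("warn-prefect-flow", storage=storage, run_config=run_config) as flow:
        say_hello()

    # Assumption: with build=False the flow must be added to the storage by hand.
    flow.storage.add_flow(flow)
    return flow


# Usage (requires Prefect 1.x and the pre-built image):
#     build_flow().register(project_name="my-project", build=False)
```

Passing `build=False` tells Prefect not to build or push anything at registration time, which is why the image has to exist beforehand.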
b
I might be totally crazy, but uploading the flow to Google Cloud Storage as a static file seems like it might be less hassle.
k
If you have custom modules that you can’t just pip install, that’s when you need to use Docker. Otherwise, GCS Storage should be fine.
b
Ah. I see.
I do have a custom module. Here's where I'm confused. Why can't I bundle the flow.py file in the same image as the prefect agent?
k
Ah, you can! GCS Storage + KubernetesRun. This setup works if your custom module does not change often, but if it changes rapidly, it becomes cumbersome.
I think you know all the scenarios now. Really you can use any Storage + RunConfig combination
b
Gotcha. I think what I'm learning is that the Storage that holds your flows is really conceived as a separate thing from the image that runs the Prefect agent.
k
Yes…unless it’s Docker where the whole container is pulled
b
Gotcha. But a k8s deployment that uses a Docker image to form its pod is not the same thing?
k
If you use Docker Storage + KubernetesRun, they are the same thing. But with anything like GitHub, S3, or GCS, they are split.