Josh Greenhalgh

01/22/2021, 10:00 AM
Can anyone point me to some detailed notes on setting up Prefect Server and an agent on k8s? I have managed to deploy the UI and the agent, with all pods running green. However, the agent does not know where to query the webserver. How do you set the location of the webserver in the agent?
I am using the helm chart from
<https://github.com/PrefectHQ/server/tree/master/helm/prefect-server>
- started with the option
--set agent.enabled=true
- I would have assumed that requests from the agent to the server would go via cluster networking, but that doesn't seem to happen. I think I need to specify the GraphQL URL for the agent, but I have no idea how!

Zanie

01/22/2021, 3:17 PM
Hi @Josh Greenhalgh! Hmm, the agent should be communicating with the API via the cluster.
What do the logs for the agent pod say?

Josh Greenhalgh

01/22/2021, 3:20 PM
I have actually worked it out now! It was just a timing issue: the IP was not available yet, and I didn't know how to update the agent's env vars once it was running. I solved it by deploying the server first, waiting for the IP, then manually creating the agent YAML via
prefect agent kubernetes install ...
and passing it the API URL once I knew it. I am still unclear, though: why is the agent speaking to the server over the open internet rather than within the cluster?

Zanie

01/22/2021, 3:24 PM
You shouldn’t have to do that. The Helm chart configures the agent to use the same API URL (using cluster DNS) that all the other services talk to.
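For readers unfamiliar with cluster DNS, here is a minimal Python sketch of how such an in-cluster URL is composed. The service name `prefect-apollo`, the namespace `prefect`, and port 4200 are taken from the agent log later in this thread. The `cluster_graphql_url` helper is hypothetical, and the `svc.cluster.local` suffix assumes the default cluster domain, which may differ in your cluster.

```python
def cluster_graphql_url(service: str, namespace: str, port: int = 4200,
                        fqdn: bool = False) -> str:
    """Build the in-cluster GraphQL URL for a Kubernetes Service.

    Hypothetical helper for illustration only; it is not part of Prefect.
    """
    host = f"{service}.{namespace}"
    if fqdn:
        # Assumes the default cluster domain; clusters can be configured otherwise.
        host += ".svc.cluster.local"
    return f"http://{host}:{port}/graphql"


# Short form, as it appears in the agent log in this thread.
print(cluster_graphql_url("prefect-apollo", "prefect"))
# Fully qualified form.
print(cluster_graphql_url("prefect-apollo", "prefect", fqdn=True))
```

A URL of this shape only resolves from inside the cluster, so an agent using it does not need the server's external IP.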

Josh Greenhalgh

01/22/2021, 3:25 PM
hmmm
Would I be right in thinking that I shouldn't really be using the Helm chart in that repo? Is it very experimental?

Zanie

01/22/2021, 4:40 PM
It's marked as experimental because we're not committed to backwards compatibility as we develop it since it's so new. Lots of people are using it though and it likely won't remain experimental for long.

Josh Greenhalgh

01/22/2021, 8:05 PM
So when I start with
--set agent.enabled=true
the agent has the following in its logs:
[2021-01-22 20:02:51,449] ERROR - agent | 400 Client Error: Bad Request for url: http://prefect-apollo.prefect:4200/graphql

The following error messages were provided by the GraphQL server:

    INTERNAL_SERVER_ERROR: Variable "$input" got invalid value null at
        "input.tenant_id"; Expected non-nullable type UUID! not to be null.

The GraphQL query was:

    mutation($input: get_runs_in_queue_input!) {
            get_runs_in_queue(input: $input) {
                flow_run_ids
        }
    }

The passed variables were:

    {"input": {"before": "2021-01-22T20:02:51.383799+00:00", "labels": [], "tenant_id": null}}

Zanie

01/22/2021, 8:06 PM
That’s failing because the default tenant hasn’t been created yet.
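The symptom is visible in the logged variables: `tenant_id` is `null`. As a rough sketch, the check below reads the variables string exactly as it appears in the log above and reports whether the tenant ID is missing. The `missing_tenant` helper is hypothetical and only illustrates how to read the error; it is not part of Prefect.

```python
import json


def missing_tenant(variables_json: str) -> bool:
    """Return True when the logged query variables carry no tenant ID.

    Hypothetical helper for illustration only.
    """
    variables = json.loads(variables_json)
    return variables.get("input", {}).get("tenant_id") is None


# The "passed variables" line copied from the agent log in this thread.
logged = ('{"input": {"before": "2021-01-22T20:02:51.383799+00:00", '
          '"labels": [], "tenant_id": null}}')
print(missing_tenant(logged))
```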

Josh Greenhalgh

01/22/2021, 8:07 PM
Yeah, I have done that!
I can access the dashboard perfectly (which I think is not possible until that is done?)

Zanie

01/22/2021, 8:07 PM
When the agent pod restarts, it should be fine then.

Josh Greenhalgh

01/22/2021, 8:09 PM
OK, so yes, you are correct: the agent has no logs since then. However, the dashboard does not see an agent.
(prefect-k8s-NWMFqeiw-py3.7) ➜  prefect-server git:(master) ✗ kubectl get pods -n prefect                    
NAME                               READY   STATUS    RESTARTS   AGE
prefect-agent-59db54f4dc-4lwqb     1/1     Running   0          10m
prefect-apollo-b5d9c4cd8-4hqs8     1/1     Running   1          10m
prefect-graphql-5f77cd4674-ktq8g   1/1     Running   0          10m
prefect-hasura-bf4dd5d95-m7mzx     1/1     Running   2          10m
prefect-postgresql-0               1/1     Running   0          10m
prefect-towel-5dc84cb477-6h6c8     1/1     Running   0          10m
prefect-ui-9b74bc6c8-nqdb2         1/1     Running   0          10m

Zanie

01/22/2021, 8:11 PM
Can you try killing the pod to see if a new one works?
kubectl delete pod prefect-agent-59db54f4dc-4lwqb
should remove it and the deployment should spawn a new one
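If you want to script this step, the sketch below picks the agent pod out of `kubectl get pods` output and builds the matching delete command. Both helpers are hypothetical. The `-n prefect` flag is an assumption based on the namespace used in the listing above; drop it if `prefect` is already your context's default namespace.

```python
from typing import List


def agent_pods(kubectl_output: str, prefix: str = "prefect-agent") -> List[str]:
    """Return pod names from `kubectl get pods` output that start with prefix."""
    names = []
    for line in kubectl_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields and fields[0].startswith(prefix):
            names.append(fields[0])
    return names


def delete_command(pod: str, namespace: str = "prefect") -> List[str]:
    """Build (but do not run) the kubectl command that deletes one pod."""
    return ["kubectl", "delete", "pod", pod, "-n", namespace]


# Two rows copied from the pod listing earlier in this thread.
sample = """NAME                               READY   STATUS    RESTARTS   AGE
prefect-agent-59db54f4dc-4lwqb     1/1     Running   0          10m
prefect-apollo-b5d9c4cd8-4hqs8     1/1     Running   1          10m
"""

for pod in agent_pods(sample):
    print(" ".join(delete_command(pod)))
```

The command is only printed here; pass the list to `subprocess.run` if you actually want to execute it.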

Josh Greenhalgh

01/22/2021, 8:13 PM
Yep bingo! Is there a way I can turn on much more verbose logging to catch this stuff?

Zanie

01/22/2021, 8:23 PM
I don’t think I pass through any logging options right now, but in the agent env you could probably set
PREFECT__CLOUD__AGENT__LEVEL=debug
The agent should have auto-restarted though; there might be something wrong with the health check.
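As background on that variable name: Prefect reads environment variables prefixed with `PREFECT__` as configuration overrides, with double underscores separating the nested keys. The sketch below only illustrates that naming convention; `env_to_config_path` is a hypothetical helper, not a Prefect function.

```python
def env_to_config_path(name: str, prefix: str = "PREFECT__") -> str:
    """Translate a PREFECT__-style variable name into a dotted config path.

    Hypothetical helper that mirrors the naming convention only.
    """
    if not name.startswith(prefix):
        raise ValueError(f"{name!r} does not start with {prefix!r}")
    parts = name[len(prefix):].split("__")
    return ".".join(part.lower() for part in parts)


print(env_to_config_path("PREFECT__CLOUD__AGENT__LEVEL"))  # cloud.agent.level
```

So the variable suggested above would override the `cloud.agent.level` setting for the agent process.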

Josh Greenhalgh

01/22/2021, 9:52 PM
Thanks very much!