Hello, I have created a project on prefect cloud, ...
# prefect-community
v
Hello, I have created a project on Prefect Cloud, a simple hello world flow following the getting started documentation, and a pending flow run, and I installed a Kubernetes agent on a GCP Kubernetes cluster. The Kubernetes agent has started and authenticated to Prefect, but it doesn't pick up any pending flow runs. It is just stuck on "Waiting for flow runs...". Any idea what I am missing?
s
Did you create a deployment?
v
yes I have a deployment with a pod running the kubernetes agent
deployment manifest :
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prefect-agent
  name: prefect-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prefect-agent
  template:
    metadata:
      labels:
        app: prefect-agent
    spec:
      containers:
      - args:
        - prefect agent kubernetes start
        command:
        - /bin/bash
        - -c
        env:
        - name: PREFECT__CLOUD__AGENT__AUTH_TOKEN
          value: xxxxx
        - name: PREFECT__CLOUD__API
          value: https://api.prefect.io
        - name: NAMESPACE
          value: everysens
        - name: IMAGE_PULL_SECRETS
          value: ''
        - name: PREFECT__CLOUD__AGENT__LABELS
          value: '[]'
        - name: JOB_MEM_REQUEST
          value: ''
        - name: JOB_MEM_LIMIT
          value: ''
        - name: JOB_CPU_REQUEST
          value: ''
        - name: JOB_CPU_LIMIT
          value: ''
        - name: IMAGE_PULL_POLICY
          value: ''
        - name: SERVICE_ACCOUNT_NAME
          value: ''
        - name: PREFECT__BACKEND
          value: cloud
        - name: PREFECT__CLOUD__AGENT__AGENT_ADDRESS
          value: http://:8080
        - name: PREFECT__CLOUD__API_KEY
          value: xxxxx
        - name: PREFECT__CLOUD__TENANT_ID
          value: ''
        image: prefecthq/prefect:1.2.1-python3.7
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 2
          httpGet:
            path: /api/health
            port: 8080
          initialDelaySeconds: 40
          periodSeconds: 40
        name: agent
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prefect-agent-rbac
  namespace: everysens
rules:
- apiGroups:
  - batch
  - extensions
  resources:
  - jobs
  verbs:
  - '*'
- apiGroups:
  - ''
  resources:
  - events
  - pods
  verbs:
  - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prefect-agent-rbac
  namespace: everysens
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prefect-agent-rbac
subjects:
- kind: ServiceAccount
  name: default
flow file
import prefect
from prefect import task, Flow
from prefect.run_configs import KubernetesRun

@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello world!")


flow = Flow(
    "hello-flow-kubernetes",
    tasks=[hello_task],
    run_config=KubernetesRun()
)
flow.register(project_name="prefect-poc")
OK, I think I understand. When registering the flow, the Prefect CLI automatically adds a label to the flow, and the agent I deployed didn't have any labels, which means it doesn't pick up flows marked with the generated label. Not very intuitive for a beginner 😛 It would be nice if this were mentioned in the getting started guide.
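For readers following along, the mismatch described above can be sketched in plain Python. This is only an illustration of the matching rule (a Prefect 1.x agent picks up a flow run only when the run's labels are a subset of the agent's labels), not Prefect's actual implementation. The function name `agent_can_pick_up` and the hostname `my-laptop.local` are made up for the example.

```python
from typing import Iterable


def agent_can_pick_up(agent_labels: Iterable[str], flow_run_labels: Iterable[str]) -> bool:
    """Return True when every label on the flow run is also set on the agent."""
    return set(flow_run_labels) <= set(agent_labels)


# The agent in the manifest above is deployed with
# PREFECT__CLOUD__AGENT__LABELS='[]', i.e. no labels at all.
agent_labels = []

# A flow registered with the default local storage carries a hostname label.
# "my-laptop.local" is a hypothetical hostname used only for illustration.
print(agent_can_pick_up(agent_labels, ["my-laptop.local"]))  # False

# A flow run without labels matches an agent without labels.
print(agent_can_pick_up(agent_labels, []))  # True

# A labelled flow run needs an agent that carries the same label.
print(agent_can_pick_up(["kubernetes"], ["kubernetes"]))  # True
```

Under this rule, adding `labels=["kubernetes"]` to the run config does not help while the agent itself has no labels, since the run then carries even more labels the agent lacks.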
So the next question is: how do I register a flow without an automatic label?
I tried this, but it still adds a second, auto-generated label:
flow = Flow(
    "hello-flow-kubernetes",
    tasks=[hello_task],
    run_config=KubernetesRun(
        labels=["kubernetes"],
    )
)
a
the answer is: to add Storage - this is the missing piece in your setup
If you set a storage such as GCS storage, the extra hostname label won't get created. It gets added only because, when no storage is specified explicitly, Prefect uses the default local Storage. To ensure the flow code exists on the machine from which you run it, Prefect adds the hostname label of that machine, so that the flow run will only be picked up by an agent that actually has access to that local file. TL;DR: I'd recommend adding GCS storage since you already use GCP.
v
Thanks, that was indeed the missing piece. I was able to use the GitlabStorage. It then started the flow run on the Kubernetes agent, picked up the code from GitLab, and executed the hello world task.
a
nice work!