Henning Holgersen 01/26/2022, 11:52 AM
. I have added the RBAC stuff to the deployment, and opened outbound traffic to api.prefect.io on port 443, but I have not been able to get anything to run on the AKS agent. Any ideas?
'Connection aborted.', OSError(0, 'Error')
version issue. We had some users reporting similar errors because the AKS load balancer was dropping connections, and the solution recommended by Azure was to increase the timeout settings on the AKS load balancer. I can search for the GitHub issue that discussed this in more detail.
Henning Holgersen 01/26/2022, 12:26 PM
Henning Holgersen 01/26/2022, 12:45 PM
livenessProbe:
  failureThreshold: 2
  httpGet:
    path: /api/health
    port: 8080
  initialDelaySeconds: 40
  periodSeconds: 40
name: agent
Henning Holgersen 01/26/2022, 1:04 PM
Henning Holgersen 01/28/2022, 2:28 PM
Lastly, if the agent you set up yourself doesn’t work, perhaps you can try this one from the Azure marketplace?
from prefect import task, Flow

@task(log_stdout=True)
def hello_world():
    print("hello world")

with Flow("hello") as flow:
    hw = hello_world()
Henning Holgersen 01/28/2022, 3:04 PM
Henning Holgersen 01/28/2022, 3:57 PM
Henning Holgersen 02/01/2022, 1:57 PM
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prefect-agent
  name: prefect-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prefect-agent
  template:
    metadata:
      labels:
        app: prefect-agent
    spec:
      containers:
      - args:
        - prefect agent kubernetes start
        command:
        - /bin/bash
        - -c
        env:
        - name: PREFECT__CLOUD__AGENT__AUTH_TOKEN
          value: ''
        - name: PREFECT__CLOUD__API
          value: https://api.prefect.io
        - name: NAMESPACE
          value: default
        - name: IMAGE_PULL_SECRETS
          value: ''
        - name: PREFECT__CLOUD__AGENT__LABELS
          value: ''
        - name: JOB_MEM_REQUEST
          value: ''
        - name: JOB_MEM_LIMIT
          value: ''
        - name: JOB_CPU_REQUEST
          value: ''
        - name: JOB_CPU_LIMIT
          value: ''
        - name: IMAGE_PULL_POLICY
          value: ''
        - name: SERVICE_ACCOUNT_NAME
          value: ''
        - name: PREFECT__BACKEND
          value: cloud
        - name: PREFECT__CLOUD__AGENT__AGENT_ADDRESS
          value: http://:8080
        - name: PREFECT__CLOUD__API_KEY
          value: '<SuperSecretSystemUserAPIKey>'
        - name: PREFECT__CLOUD__TENANT_ID
          value: ''
        image: prefecthq/prefect:0.15.7-python3.6
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 2
          httpGet:
            path: /api/health
            port: 8080
          initialDelaySeconds: 40
          periodSeconds: 40
        name: agent
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prefect-agent-rbac
  namespace: default
rules:
- apiGroups:
  - batch
  - extensions
  resources:
  - jobs
  verbs:
  - '*'
- apiGroups:
  - ''
  resources:
  - events
  - pods
  verbs:
  - '*'
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: prefect-agent-rbac
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prefect-agent-rbac
subjects:
- kind: ServiceAccount
  name: default
Henning Holgersen 02/01/2022, 3:45 PM
I shared the issue with our team. I’ll let you know if I get any pointers on what may be wrong in your setup.
env:
- name: PREFECT__LOGGING__LEVEL
  value: 'DEBUG'
Henning Holgersen 02/25/2022, 12:47 PM
) that is listed in the portal. Prefect needs to reach this endpoint in order to create new jobs (= flow runs). The required firewall rule is mentioned in the Azure documentation, but it is hard to find, and the error message was completely undecipherable.
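A minimal sketch of how one might check, from inside the agent pod, whether an outbound TCP connection to the API server endpoint gets through the firewall. The helper name `can_connect` and the command-line usage are hypothetical and not part of Prefect or AKS. A successful TCP connect only shows that the firewall lets the traffic through, not that authentication or RBAC is set up correctly.

```python
import socket
import sys


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the cluster's API server address as the first argument.
    target = sys.argv[1]
    status = "reachable" if can_connect(target) else "NOT reachable"
    print(f"{target}:443 is {status}")
```

Running the same check against api.prefect.io would help tell a blocked Prefect Cloud rule apart from a blocked Kubernetes API server rule.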
Henning Holgersen 02/25/2022, 1:57 PM
. or something domain that is listed as the “API Server Address” of the cluster. The address can be found on the AKS overview page in the portal. Depending on the firewall rule type, you might have to find the IP behind the API server domain and use that in the rule instead of the URL, but that’s a detail. No extra rules for Prefect Cloud were needed; the trusty old api.prefect.io:443 outbound rule was enough once the AKS/K8s firewall rule above had been sorted.
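For the case where the firewall rule needs an IP instead of a domain name, here is a rough sketch of looking up the addresses behind a host name with Python's standard library. The function name `resolve_ips` is made up for illustration, and the host name to pass in is whatever the portal shows as the API server address. The result is only a snapshot of the DNS answer at the time of the lookup.

```python
import socket
from typing import List


def resolve_ips(hostname: str, port: int = 443) -> List[str]:
    """Return the sorted, de-duplicated IP addresses that `hostname` resolves to."""
    # getaddrinfo returns tuples of (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6 results.
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_ips` with the API server address from the AKS overview page would give the candidate IPs for the rule.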