ciaran
04/21/2021, 11:55 AMTyler Wanner
04/21/2021, 12:07 PMTyler Wanner
04/21/2021, 12:08 PMciaran
04/21/2021, 12:09 PMTyler Wanner
04/21/2021, 12:10 PMciaran
04/21/2021, 12:13 PMciaran
04/21/2021, 12:14 PMTyler Wanner
04/21/2021, 12:18 PMTyler Wanner
04/21/2021, 12:20 PMprefect agent kubernetes install
Tyler Wanner
04/21/2021, 12:20 PMresource "kubernetes_namespace" "ci" {
metadata {
name = "prefect"
}
}
resource "kubernetes_role" "prefect_agent" {
metadata {
name = "prefect-agent"
namespace = kubernetes_namespace.ci.metadata[0].name
}
rule {
api_groups = ["batch", "extensions"]
resources = ["jobs"]
verbs = ["*"]
}
rule {
api_groups = [""]
resources = ["events", "pods"]
verbs = ["*"]
}
}
resource "kubernetes_role_binding" "prefect_agent" {
metadata {
name = "prefect-agent"
namespace = kubernetes_namespace.ci.metadata[0].name
}
role_ref {
api_group = "rbac.authorization.k8s.io"
kind = "Role"
name = kubernetes_role.prefect_agent.metadata[0].name
}
subject {
kind = "ServiceAccount"
name = kubernetes_service_account.agent.metadata[0].name
namespace = kubernetes_namespace.ci.metadata[0].name
}
}
resource "kubernetes_service_account" "agent" {
metadata {
name = "agent"
namespace = kubernetes_namespace.ci.metadata[0].name
}
}
resource "kubernetes_deployment" "deployment" {
metadata {
name = var.app
namespace = kubernetes_namespace.ci.metadata[0].name
}
spec {
replicas = "1"
selector {
match_labels = {
app = var.app
}
}
template {
metadata {
labels = {
app = var.app
}
}
spec {
service_account_name = kubernetes_service_account.agent.metadata[0].name
automount_service_account_token = true
container {
args = ["prefect agent kubernetes start"]
command = ["/bin/bash", "-c"]
env {
name = "PREFECT__CLOUD__AGENT__AUTH_TOKEN"
value = var.auth_token
}
env {
name = "PREFECT__CLOUD__AGENT__AGENT_ADDRESS"
value = "http://:8080"
}
env {
name = "NAMESPACE"
value = kubernetes_namespace.ci.metadata[0].name
}
env {
name = "PREFECT__CLOUD__AGENT__LABELS"
value = "['foo']"
}
dynamic "env" {
for_each = var.env_vars
content {
# for a map, the iterator exposes env.key and env.value
name = env.key
value = env.value
}
}
image = "prefecthq/prefect:${var.prefect_version}"
name = var.app
image_pull_policy = "Always"
liveness_probe {
http_get {
path = "/api/health"
port = 8080
}
failure_threshold = 2
initial_delay_seconds = 40
period_seconds = 40
}
resources {
limits {
cpu = "500m"
memory = "128Mi"
}
}
}
}
}
}
}
variable "auth_token" {}
variable "app" { default = "prefect-agent" }
variable "prefect_version" { default = "latest" }
variable "env_vars" {
type = map(string)
# an empty map keeps the dynamic "env" for_each valid when nothing is passed
default = {}
}
Tyler Wanner
04/21/2021, 12:21 PMTyler Wanner
04/21/2021, 12:22 PMTyler Wanner
04/21/2021, 12:24 PMciaran
04/21/2021, 12:28 PMTyler Wanner
04/21/2021, 12:40 PMTyler Wanner
04/21/2021, 12:47 PMciaran
04/21/2021, 12:57 PMkubectl
to apply that manifest?ciaran
04/21/2021, 1:02 PMTyler Wanner
04/21/2021, 1:04 PMprefect agent kubernetes install --rbac --namespace NAMESPACE -t TOKEN | kubectl apply -n NAMESPACE -f -
Tyler Wanner
04/21/2021, 1:05 PMciaran
04/21/2021, 1:05 PMciaran
04/21/2021, 1:05 PMTyler Wanner
04/21/2021, 1:06 PMciaran
04/21/2021, 1:15 PMTyler Wanner
04/21/2021, 1:16 PMTyler Wanner
04/21/2021, 1:16 PMciaran
04/21/2021, 1:21 PMTyler Wanner
04/21/2021, 1:21 PMTyler Wanner
04/21/2021, 1:23 PMciaran
04/21/2021, 1:23 PMciaran
04/21/2021, 1:26 PMciaran
04/21/2021, 2:27 PMprefect agent kubernetes install -t "<token>" --rbac -n "pangeo-forge-azure-bakery" -l "ciaran-dev" | kubectl apply -f --namespace=pangeo-forge-azure-bakery -
Based on https://docs.prefect.io/orchestration/agents/kubernetes.html#running-in-cluster
But I'm getting:
error: Unexpected args: [-]
If I remove the -
I instead get:
error: the path "--namespace=pangeo-forge-azure-bakery" does not exist
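Both errors are consistent with `-f` taking the next token as its value: in the command above `--namespace=...` is read as the file path and the trailing `-` is left over as a stray argument. A minimal Python sketch of an argument order that keeps `-` (stdin) directly after `-f`; the helper name `kubectl_apply_stdin` is made up for illustration:

```python
import shlex
from typing import List


def kubectl_apply_stdin(namespace: str) -> List[str]:
    """Build a `kubectl apply` argv that reads the manifest from stdin."""
    # `-f` consumes the token that follows it, so `-` (stdin) must come
    # immediately after `-f`; the namespace flag goes before it.
    return ["kubectl", "apply", "--namespace", namespace, "-f", "-"]


cmd = kubectl_apply_stdin("pangeo-forge-azure-bakery")
print(shlex.join(cmd))
```

The printed command is what would sit on the right-hand side of the pipe from `prefect agent kubernetes install`.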
Tyler Wanner
04/21/2021, 2:28 PMTyler Wanner
04/21/2021, 2:28 PMTyler Wanner
04/21/2021, 2:28 PMciaran
04/21/2021, 2:29 PMWarning: rbac.authorization.k8s.io/v1beta1 RoleBinding is deprecated in v1.17+, unavailable in v1.22+; use rbac.authorization.k8s.io/v1 RoleBinding
Error from server (NotFound): error when creating "STDIN": namespaces "pangeo-forge-azure-bakery" not found
Error from server (NotFound): error when creating "STDIN": namespaces "pangeo-forge-azure-bakery" not found
Error from server (NotFound): error when creating "STDIN": namespaces "pangeo-forge-azure-bakery" not found
ciaran
04/21/2021, 2:29 PMTyler Wanner
04/21/2021, 2:31 PMciaran
04/21/2021, 2:31 PMciaran
04/21/2021, 2:31 PMTyler Wanner
04/21/2021, 2:31 PMciaran
04/21/2021, 2:32 PMTyler Wanner
04/21/2021, 2:32 PMTyler Wanner
04/21/2021, 2:32 PMciaran
04/21/2021, 2:33 PMazurerm_kubernetes_cluster
doesn't offer that option.ciaran
04/21/2021, 2:33 PMkubernetes
provider/kubectl it isTyler Wanner
04/21/2021, 2:33 PMkubectl create namespace NAMESPACE
Tyler Wanner
04/21/2021, 2:33 PMTyler Wanner
04/21/2021, 2:34 PMciaran
04/21/2021, 2:37 PMTyler Wanner
04/21/2021, 2:38 PMTyler Wanner
04/21/2021, 2:39 PMciaran
04/21/2021, 2:40 PMTyler Wanner
04/21/2021, 2:41 PMciaran
04/21/2021, 2:41 PMciaran
04/21/2021, 2:41 PMTyler Wanner
04/21/2021, 2:42 PMTyler Wanner
04/21/2021, 2:42 PMciaran
04/21/2021, 2:47 PMciaran
04/21/2021, 2:49 PMciaran
04/21/2021, 2:55 PMTyler Wanner
04/21/2021, 9:04 PMciaran
04/23/2021, 12:27 PM(403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'af45fea8-a5be-4d4a-a50c-fc8875a83144', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Fri, 23 Apr 2021 12:24:44 GMT', 'Content-Length': '329'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:default:default\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"pangeo-forge-azure-bakery\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
The yaml I'm applying to the cluster looks like:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prefect-agent
  name: prefect-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prefect-agent
  template:
    metadata:
      labels:
        app: prefect-agent
    spec:
      containers:
      - args:
        - prefect agent kubernetes start
        command:
        - /bin/bash
        - -c
        env:
        - name: PREFECT__CLOUD__AGENT__AUTH_TOKEN
          value: ${PREFECT__CLOUD__AGENT__AUTH_TOKEN}
        - name: PREFECT__CLOUD__API
          value: https://api.prefect.io
        - name: NAMESPACE
          value: ${BAKERY_NAMESPACE}
        - name: IMAGE_PULL_SECRETS
          value: ''
        - name: PREFECT__CLOUD__AGENT__LABELS
          value: '${PREFECT__CLOUD__AGENT__LABELS}'
        - name: JOB_MEM_REQUEST
          value: ''
        - name: JOB_MEM_LIMIT
          value: ''
        - name: JOB_CPU_REQUEST
          value: ''
        - name: JOB_CPU_LIMIT
          value: ''
        - name: IMAGE_PULL_POLICY
          value: ''
        - name: SERVICE_ACCOUNT_NAME
          value: ''
        - name: PREFECT__BACKEND
          value: cloud
        - name: PREFECT__CLOUD__AGENT__AGENT_ADDRESS
          value: http://:8080
        image: prefecthq/prefect:0.14.16-python3.8
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 2
          httpGet:
            path: /api/health
            port: 8080
          initialDelaySeconds: 40
          periodSeconds: 40
        name: agent
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prefect-agent-rbac
  namespace: default
rules:
- apiGroups:
  - batch
  - extensions
  resources:
  - jobs
  verbs:
  - '*'
- apiGroups:
  - ''
  resources:
  - events
  - pods
  verbs:
  - '*'
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: prefect-agent-rbac
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prefect-agent-rbac
subjects:
- kind: ServiceAccount
  name: default
And my KubernetesRun
config looks like:
run_config=KubernetesRun(
    image="prefecthq/prefect:0.14.16-python3.8",
    labels=json.loads(os.environ["PREFECT__CLOUD__AGENT__LABELS"]),
),
Any ideas? Appreciate it!Tyler Wanner
04/23/2021, 1:25 PMTyler Wanner
04/23/2021, 1:27 PM"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:default:default\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"pangeo-forge-azure-bakery\""
^^ this is saying that something in the default namespace, with the serviceaccount default, cannot create jobs in the namespace pangeo-forge-azure-bakeryTyler Wanner
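To make that reading of the 403 concrete, here is a small Python sketch that pulls the acting service account and the target namespace out of a Kubernetes "forbidden" message. The regular expression and the `parse_forbidden` helper are illustrative only and assume the message shape shown in this thread:

```python
import re
from typing import Dict

# Matches the RBAC denial text as it appears in the response body above.
FORBIDDEN_RE = re.compile(
    r'User "system:serviceaccount:(?P<sa_namespace>[^:"]+):(?P<sa_name>[^:"]+)" '
    r'cannot (?P<verb>\S+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<api_group>[^"]*)" '
    r'in the namespace "(?P<target_namespace>[^"]+)"'
)


def parse_forbidden(message: str) -> Dict[str, str]:
    """Split an RBAC 'forbidden' message into who, what, and where."""
    match = FORBIDDEN_RE.search(message)
    if match is None:
        raise ValueError("not a recognised RBAC 'forbidden' message")
    return match.groupdict()


message = (
    'jobs.batch is forbidden: User "system:serviceaccount:default:default" '
    'cannot create resource "jobs" in API group "batch" '
    'in the namespace "pangeo-forge-azure-bakery"'
)
info = parse_forbidden(message)
print(info)
```

When `sa_namespace` and `target_namespace` differ, as they do here, the Role and RoleBinding probably live in a different namespace from the one the jobs are created in.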
04/23/2021, 1:27 PMciaran
04/23/2021, 1:27 PMciaran
04/23/2021, 1:28 PMTyler Wanner
04/23/2021, 1:29 PMTyler Wanner
04/23/2021, 1:30 PMciaran
04/23/2021, 1:31 PMTyler Wanner
04/23/2021, 1:32 PMTyler Wanner
04/23/2021, 1:33 PMnamespace: pangeo-forge-azure-bakery
on line 8ciaran
04/23/2021, 1:34 PMspec
section?Tyler Wanner
04/23/2021, 1:34 PMciaran
04/23/2021, 1:34 PMTyler Wanner
04/23/2021, 1:36 PMTyler Wanner
04/23/2021, 1:36 PMciaran
04/23/2021, 1:36 PMTyler Wanner
04/23/2021, 1:37 PMciaran
04/23/2021, 1:38 PMciaran
04/23/2021, 1:38 PMFailed to load and execute Flow's environment: AttributeError("'NoneType' object has no attribute 'rstrip'")
Tyler Wanner
04/23/2021, 1:41 PMciaran
04/23/2021, 1:42 PMTyler Wanner
04/23/2021, 4:15 PMTyler Wanner
04/23/2021, 4:16 PMciaran
04/23/2021, 4:17 PMprefecthq/prefect:0.14.16-python3.8
and my local install is also 0.14.16
on Python 3.8.6
Tyler Wanner
04/23/2021, 4:17 PMciaran
04/23/2021, 4:18 PMprefect[azure, kubernetes]
locally...Kevin Kho
Kevin Kho
ciaran
04/23/2021, 4:20 PMKevin Kho
ciaran
04/23/2021, 4:29 PMKevin Kho
Kevin Kho
ciaran
04/23/2021, 4:34 PMciaran
04/23/2021, 4:34 PMciaran
04/23/2021, 4:35 PMKevin Kho
ciaran
04/23/2021, 4:36 PM>>> import json
>>> import os
>>> json.loads(os.environ["PREFECT__CLOUD__AGENT__LABELS"])
['ciarandev']
Tried it in the python interpreter, looks like a list.Kevin Kho
ciaran
04/23/2021, 4:36 PMKevin Kho
ciaran
04/23/2021, 4:38 PMciaran
04/26/2021, 9:08 AMaccess_token_secret
in https://docs.prefect.io/api/latest/storage.html#github is only necessary for private repositories?ciaran
04/26/2021, 9:40 AMaccess_token_secret
. I get:ciaran
04/26/2021, 9:40 AMciaran
04/26/2021, 1:21 PMKevin Kho
ciaran
04/26/2021, 3:12 PMKevin Kho
ciaran
04/26/2021, 3:16 PMKevin Kho
ciaran
04/26/2021, 3:21 PM>>> con_string = "<the con string>"
>>> from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
>>> blob_service_client = BlobServiceClient.from_connection_string(con_string)
>>> for container in blob_service_client.list_containers():
...     print(container)
...
{'name': 'ciarandev-bakery-flow-storage-container', 'last_modified': datetime.datetime(2021, 4, 26, 8, 14, 17, tzinfo=datetime.timezone.utc), 'etag': '"0x8D9088B4947D0D2"', 'lease': {'status': 'unlocked', 'state': 'available', 'duration': None}, 'public_access': None, 'has_immutability_policy': False, 'deleted': None, 'version': None, 'has_legal_hold': False, 'metadata': None, 'encryption_scope': <azure.storage.blob._models.ContainerEncryptionScope object at 0x106ce0f10>}
ciaran
04/26/2021, 3:21 PMKevin Kho
ciaran
04/26/2021, 3:23 PMciaran
04/26/2021, 3:24 PMprefecthq/prefect:0.14.16-python3.8
imageKevin Kho
ciaran
04/26/2021, 3:25 PMciaran
04/26/2021, 3:28 PMciaran
04/26/2021, 3:30 PMFailed to load and execute Flow's environment: AttributeError("'NoneType' object has no attribute 'rstrip'")
Kevin Kho
Kevin Kho
AzureStorage
class does not store the connection_string needed to retrieve your flow. This might be something we need to fix on our end (I raised the issue to the team). In the meantime, you need to pass an environment variable to get around this.Kevin Kho
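from prefect import Flow, storage
from prefect.run_configs import LocalRun

# connection_string and say_hello are assumed to be defined elsewhere
with Flow(
    "azure_flow",
    run_config=LocalRun(env={"AZURE_STORAGE_CONNECTION_STRING": connection_string}),
    storage=storage.Azure(
        container="test",
        connection_string=connection_string,
    ),
) as flow:
    hello_result = say_hello()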
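For a flow that uses `KubernetesRun` rather than `LocalRun`, the same workaround would presumably go through that run config's `env` argument. A minimal sketch of building the environment mapping; the helper `azure_run_env` is hypothetical, and the `KubernetesRun` usage in the trailing comment is untested here:

```python
import os
from typing import Dict, Mapping

ENV_NAME = "AZURE_STORAGE_CONNECTION_STRING"


def azure_run_env(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the env mapping to hand to a run config, failing early if unset."""
    value = environ.get(ENV_NAME)
    if not value:
        # Failing at registration time is easier to debug than a
        # "'NoneType' object has no attribute 'rstrip'" error at flow run time.
        raise RuntimeError(f"{ENV_NAME} is not set")
    return {ENV_NAME: value}


# Illustrative input only; a real run would read from os.environ.
example = azure_run_env({ENV_NAME: "<the con string>"})
print(sorted(example))

# Assumed usage (requires prefect):
#   run_config=KubernetesRun(image="prefecthq/prefect:0.14.16-python3.8", env=azure_run_env())
```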
ciaran
04/27/2021, 8:23 AMconnection_string
to storage.Azure
? If the environment variable is the one it uses?ciaran
04/27/2021, 9:39 AMKevin Kho
ciaran
04/27/2021, 1:08 PMKevin Kho
ciaran
04/27/2021, 1:11 PMciaran
04/27/2021, 1:12 PMKevin Kho
ciaran
04/27/2021, 1:29 PMKevin Kho
ciaran
04/27/2021, 1:30 PMciaran
04/27/2021, 1:31 PMciaran
04/27/2021, 1:32 PMKevin Kho
ciaran
04/27/2021, 1:36 PMciaran
04/27/2021, 1:37 PMciaran
04/27/2021, 4:12 PM
def get_cluster():
    pod_spec = make_pod_spec(
        image="prefecthq/prefect:0.14.16-python3.8",
        labels={"flow": flow_name},
        memory_limit='4G',
        memory_request='4G'
    )
    return KubeCluster(pod_spec)
...
executor=DaskExecutor(
    cluster_class=get_cluster(),
)
ciaran
04/27/2021, 4:12 PMTraceback (most recent call last):
File "flow_test/manual_flow.py", line 8, in <module>
from dask_kubernetes import KubeCluster, make_pod_spec
File "/Users/ciaran/Library/Caches/pypoetry/virtualenvs/pangeo-forge-azure-bakery-IMqFot_V-py3.8/lib/python3.8/site-packages/dask_kubernetes/__init__.py", line 3, in <module>
from .core import KubeCluster
File "/Users/ciaran/Library/Caches/pypoetry/virtualenvs/pangeo-forge-azure-bakery-IMqFot_V-py3.8/lib/python3.8/site-packages/dask_kubernetes/core.py", line 19, in <module>
from .objects import (
File "/Users/ciaran/Library/Caches/pypoetry/virtualenvs/pangeo-forge-azure-bakery-IMqFot_V-py3.8/lib/python3.8/site-packages/dask_kubernetes/objects.py", line 34, in <module>
SERIALIZATION_API_CLIENT = DummyApiClient()
File "/Users/ciaran/Library/Caches/pypoetry/virtualenvs/pangeo-forge-azure-bakery-IMqFot_V-py3.8/lib/python3.8/site-packages/dask_kubernetes/objects.py", line 28, in __init__
self.configuration = Configuration.get_default_copy()
AttributeError: type object 'Configuration' has no attribute 'get_default_copy'
Kevin Kho
cluster_class
should be callable. Can you try removing the ()
?Kevin Kho
KubeCluster
there and use the cluster_kwargs
to pass the pod spec.ciaran
04/27/2021, 4:20 PMcluster_kwargs
?
executor=DaskExecutor(
    cluster_class="dask_kubernetes.KubeCluster",
    cluster_kwargs={
        "image": "prefecthq/prefect:0.14.16-python3.8",
        "labels": {"flow": flow_name},
        "memory_limit": "4G",
        "memory_request": "4G"
    }
)
Gives the same errorKevin Kho
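Leaving the import error aside, the "callable plus `cluster_kwargs`" suggestion could look roughly like the sketch below. It assumes, as I understand Prefect 0.14's `DaskExecutor`, that the executor later calls `cluster_class(**cluster_kwargs)`. The `executor_kwargs` helper and the `IMAGE` constant are made up for illustration, and the `dask_kubernetes` import is deferred so the module loads even where that package is broken or missing:

```python
from typing import Any, Dict

IMAGE = "prefecthq/prefect:0.14.16-python3.8"


def get_cluster(flow_name: str, memory: str = "4G"):
    """Create the Dask cluster; meant to be called by the executor, not by us."""
    # Imported lazily so importing this module does not require dask_kubernetes.
    from dask_kubernetes import KubeCluster, make_pod_spec

    pod_spec = make_pod_spec(
        image=IMAGE,
        labels={"flow": flow_name},
        memory_limit=memory,
        memory_request=memory,
    )
    return KubeCluster(pod_spec)


def executor_kwargs(flow_name: str) -> Dict[str, Any]:
    """Keyword arguments for DaskExecutor: the function itself, no parentheses."""
    return {
        "cluster_class": get_cluster,
        "cluster_kwargs": {"flow_name": flow_name},
    }


kwargs = executor_kwargs("demo-flow")
print(kwargs["cluster_class"].__name__, kwargs["cluster_kwargs"])

# Assumed usage (requires prefect): executor=DaskExecutor(**executor_kwargs("demo-flow"))
```

The pod settings go through `make_pod_spec` inside the callable because `image`, `labels` and the memory settings are arguments of that helper in the original snippet, not of `KubeCluster` itself.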
Kevin Kho
ciaran
04/27/2021, 4:32 PMciaran
04/27/2021, 4:33 PMciaran
04/27/2021, 4:33 PMciaran
04/27/2021, 4:33 PMciaran
04/27/2021, 4:34 PMfrom dask_kubernetes import KubeCluster, make_pod_spec
lineKevin Kho
ciaran
04/27/2021, 4:35 PMciaran
04/27/2021, 4:36 PMdask-kubernetes 2021.3.0
ciaran
04/27/2021, 4:38 PMciaran
04/27/2021, 4:39 PMprefect[kubernetes]
is installing an older versionciaran
04/27/2021, 4:39 PMkubernetes 11.0.0b2
ciaran
04/27/2021, 4:43 PMKevin Kho
ciaran
04/27/2021, 4:50 PMKevin Kho
ciaran
04/27/2021, 4:54 PMdask-kubernetes
has that error if k8s is less than v12
, but dask-kubernetes
was installed via Prefectciaran
04/27/2021, 4:55 PMciaran
04/27/2021, 4:56 PMdask-kubernetes
might work, but it feels like we shouldn't drop versions downKevin Kho
Kevin Kho
ciaran
04/27/2021, 4:59 PMciaran
04/27/2021, 4:59 PMkubernetes
that prefect
installs.ciaran
04/28/2021, 9:52 AMkubernetes
to 12.0.1
via pip because Poetry wouldn't let me do it that way (as Prefect's range doesn't include it).
However, when I uncommented those dask_kubernetes imports, the flow registration happened without error. So I think there's definitely evidence of Prefect pulling in a Kubernetes version that is incompatible with the dask_kubernetes
version it pulls in.ciaran
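A quick way to catch this mismatch before registration is to compare the installed `kubernetes` client version against the threshold mentioned above. A minimal sketch; the "major version 12" cut-off is taken from this thread rather than from dask-kubernetes' own metadata, and `kubernetes_client_ok` is a hypothetical helper:

```python
import re

# Threshold as reported in this thread; not independently verified.
MIN_KUBERNETES_MAJOR = 12


def kubernetes_client_ok(version: str) -> bool:
    """True if the kubernetes client's major version meets the threshold."""
    match = re.match(r"(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)) >= MIN_KUBERNETES_MAJOR


for candidate in ("11.0.0b2", "12.0.1"):
    print(candidate, kubernetes_client_ok(candidate))

# In a real environment the version string could come from
# importlib.metadata.version("kubernetes") on Python 3.8+.
```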
04/28/2021, 10:17 AMKevin Kho
ciaran
04/30/2021, 4:04 PMkubernetes==12.0.1
and made an image for my agents/pods that is just the prefect==0.14.17
image, with the newer k8s version installedciaran
04/30/2021, 4:05 PMKubeCluster
in my Flow registrationciaran
04/30/2021, 4:05 PMHTTP response headers: <CIMultiDictProxy('Audit-Id': 'e60cf00e-93e6-4fb0-802e-75a298fa0867', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Fri, 30 Apr 2021 16:01:26 GMT', 'Content-Length': '386')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"dask-root-857b2f2e-0k55nf\" is forbidden: User \"system:serviceaccount:pangeo-forge-azure-bakery:default\" cannot get resource \"pods/log\" in API group \"\" in the namespace \"pangeo-forge-azure-bakery\"","reason":"Forbidden","details":{"name":"dask-root-857b2f2e-0k55nf","kind":"pods"},"code":403}
ciaran
04/30/2021, 4:05 PMciaran
04/30/2021, 4:08 PMTyler Wanner
04/30/2021, 4:10 PMTyler Wanner
04/30/2021, 4:11 PMTyler Wanner
04/30/2021, 4:11 PMTyler Wanner
04/30/2021, 4:12 PMciaran
04/30/2021, 4:15 PM
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prefect-agent-rbac
  namespace: ${BAKERY_NAMESPACE}
rules:
- apiGroups:
  - batch
  - extensions
  resources:
  - jobs
  verbs:
  - '*'
- apiGroups:
  - ''
  resources:
  - events
  - pods
  verbs:
  - '*'
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: prefect-agent-rbac
  namespace: ${BAKERY_NAMESPACE}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prefect-agent-rbac
subjects:
- kind: ServiceAccount
  name: default
Tyler Wanner
04/30/2021, 4:22 PMciaran
04/30/2021, 4:26 PMciaran
04/30/2021, 4:26 PMTyler Wanner
04/30/2021, 4:30 PMciaran
04/30/2021, 4:30 PMTyler Wanner
04/30/2021, 4:30 PMkubectl get
your rolebinding and role just to make sure they're as-setTyler Wanner
04/30/2021, 4:31 PMciaran
04/30/2021, 4:31 PMTyler Wanner
04/30/2021, 4:34 PMkubectl get rolebindings -n $BAKERY_NAMESPACE -o yaml
ciaran
04/30/2021, 4:35 PMciaran
04/30/2021, 4:37 PMapiVersion: v1
items:
- apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"rbac.authorization.k8s.io/v1beta1","kind":"RoleBinding","metadata":{"annotations":{},"name":"prefect-agent-rbac","namespace":"pangeo-forge-azure-bakery"},"roleRef":{"apiGroup":"rbac.authorization.k8s.io","kind":"Role","name":"prefect-agent-rbac"},"subjects":[{"kind":"ServiceAccount","name":"default"}]}
creationTimestamp: "2021-04-30T15:33:05Z"
managedFields:
- apiVersion: rbac.authorization.k8s.io/v1beta1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.: {}
f:kubectl.kubernetes.io/last-applied-configuration: {}
f:roleRef:
f:apiGroup: {}
f:kind: {}
f:name: {}
f:subjects: {}
manager: kubectl-client-side-apply
operation: Update
time: "2021-04-30T15:33:05Z"
name: prefect-agent-rbac
namespace: pangeo-forge-azure-bakery
resourceVersion: "3887"
selfLink: /apis/rbac.authorization.k8s.io/v1/namespaces/pangeo-forge-azure-bakery/rolebindings/prefect-agent-rbac
uid: 5e1207cb-10ab-4974-b2b6-91901c2d9f44
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: prefect-agent-rbac
subjects:
- kind: ServiceAccount
name: default
kind: List
metadata:
resourceVersion: ""
selfLink: ""
Tyler Wanner
04/30/2021, 4:39 PMkubectl get roles -n $BAKERY_NAMESPACE prefect-agent-rbac -o yaml
ciaran
04/30/2021, 4:41 PMapiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"rbac.authorization.k8s.io/v1","kind":"Role","metadata":{"annotations":{},"name":"prefect-agent-rbac","namespace":"pangeo-forge-azure-bakery"},"rules":[{"apiGroups":["batch","extensions"],"resources":["jobs"],"verbs":["*"]},{"apiGroups":[""],"resources":["events","pods"],"verbs":["*"]}]}
creationTimestamp: "2021-04-30T15:33:05Z"
managedFields:
- apiVersion: rbac.authorization.k8s.io/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.: {}
f:kubectl.kubernetes.io/last-applied-configuration: {}
f:rules: {}
manager: kubectl-client-side-apply
operation: Update
time: "2021-04-30T15:33:05Z"
name: prefect-agent-rbac
namespace: pangeo-forge-azure-bakery
resourceVersion: "3885"
selfLink: /apis/rbac.authorization.k8s.io/v1/namespaces/pangeo-forge-azure-bakery/roles/prefect-agent-rbac
uid: 5adabb97-9970-4634-9f82-dcb4bc91d801
rules:
- apiGroups:
- batch
- extensions
resources:
- jobs
verbs:
- '*'
- apiGroups:
- ""
resources:
- events
- pods
verbs:
- '*'
Tyler Wanner
04/30/2021, 4:53 PMTyler Wanner
04/30/2021, 4:53 PMpods/log
is not actually covered by pods
ciaran
04/30/2021, 4:53 PMTyler Wanner
04/30/2021, 4:54 PMresources:
- events
- pods
- pods/log
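Why `pods` alone does not cover `pods/log` can be shown with a simplified model of how a rule is matched: the requested resource and subresource are joined as `resource/subresource` and compared against the rule's `resources` entries. This Python sketch is an illustration only, not the real authorizer; it handles exact matches and the `*` wildcard and ignores other forms such as `*/subresource`:

```python
from typing import Dict, List

Rule = Dict[str, List[str]]


def rule_allows(rule: Rule, api_group: str, resource: str, verb: str,
                subresource: str = "") -> bool:
    """Simplified RBAC check: exact match or '*' on group, resource and verb."""
    requested = f"{resource}/{subresource}" if subresource else resource
    group_ok = "*" in rule["apiGroups"] or api_group in rule["apiGroups"]
    resource_ok = "*" in rule["resources"] or requested in rule["resources"]
    verb_ok = "*" in rule["verbs"] or verb in rule["verbs"]
    return group_ok and resource_ok and verb_ok


before = {"apiGroups": [""], "resources": ["events", "pods"], "verbs": ["*"]}
after = {"apiGroups": [""], "resources": ["events", "pods", "pods/log"], "verbs": ["*"]}

print(rule_allows(before, "", "pods", "get", "log"))  # False
print(rule_allows(after, "", "pods", "get", "log"))   # True
```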
Tyler Wanner
04/30/2021, 4:54 PMciaran
04/30/2021, 4:55 PMciaran
04/30/2021, 4:57 PMHTTP response headers: <CIMultiDictProxy('Audit-Id': '80b5f484-e14a-4097-94cb-f3339f4d1356', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Fri, 30 Apr 2021 16:57:23 GMT', 'Content-Length': '332')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"services is forbidden: User \"system:serviceaccount:pangeo-forge-azure-bakery:default\" cannot create resource \"services\" in API group \"\" in the namespace \"pangeo-forge-azure-bakery\"","reason":"Forbidden","details":{"kind":"services"},"code":403}
ciaran
04/30/2021, 4:58 PMTyler Wanner
04/30/2021, 5:00 PMciaran
04/30/2021, 5:00 PMciaran
04/30/2021, 5:03 PMciaran
04/30/2021, 5:03 PMTyler Wanner
04/30/2021, 5:05 PMciaran
04/30/2021, 5:06 PMciaran
04/30/2021, 5:07 PMciaran
04/30/2021, 5:08 PMciaran
04/30/2021, 5:08 PMmessage: '0/2 nodes are available: 2 Insufficient memory.'
Looks like I probably need to set adaptive scaling of some sort.Tyler Wanner
04/30/2021, 5:09 PMTyler Wanner
04/30/2021, 5:09 PMciaran
04/30/2021, 5:10 PMdefault_node_pool {
name = "default"
node_count = 2
vm_size = "Standard_D2_v2"
os_disk_size_gb = 30
}
Have 7GB and I'm asking for 4. I wonder if I've just used up the nodes I have? 🤷ciaran
04/30/2021, 5:11 PMenable_auto_scaling
, I should probably set that to true 😅Tyler Wanner
04/30/2021, 5:17 PMciaran
04/30/2021, 5:20 PMciaran
04/30/2021, 5:48 PMreason: FailedScheduling
message: '0/1 nodes are available: 1 Insufficient memory.'
So I can have a maximum of 1000 nodes (all 7GB VMs...); surely if it needs 4GB, that spins up a new node?ciaran
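A rough back-of-the-envelope check of why 7GB nodes fill up quickly with 4G requests: only part of a node's memory is allocatable to pods, so at most one 4G pod fits per node. The reserved figure below is a made-up placeholder for system and kubelet overhead, not a measured AKS value:

```python
import math


def pods_that_fit(node_memory_gb: float, reserved_gb: float, request_gb: float) -> int:
    """Number of pods with the given memory request that fit on one node."""
    allocatable = node_memory_gb - reserved_gb
    return max(0, math.floor(allocatable / request_gb))


NODE_GB = 7.0      # node size quoted in the thread
REQUEST_GB = 4.0   # memory_request used for the Dask pods
RESERVED_GB = 1.5  # hypothetical overhead, for illustration only

fit = pods_that_fit(NODE_GB, RESERVED_GB, REQUEST_GB)
print(f"4G pods per node: {fit}")
```

Under these assumptions a scheduler pod and each worker pod all need a node of their own, so any node already hosting one 4G pod reports insufficient memory for the next.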
04/30/2021, 5:51 PMHTTP response headers: <CIMultiDictProxy('Audit-Id': '84d06158-c85b-4c3b-b093-6d0dc9884c5f', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Fri, 30 Apr 2021 17:47:48 GMT', 'Content-Length': '398')>
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"poddisruptionbudgets.policy is forbidden: User \"system:serviceaccount:pangeo-forge-azure-bakery:default\" cannot create resource \"poddisruptionbudgets\" in API group \"policy\" in the namespace \"pangeo-forge-azure-bakery\"","reason":"Forbidden","details":{"group":"policy","kind":"poddisruptionbudgets"},"code":403}
To the config!ciaran
04/30/2021, 5:54 PM
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - '*'
?ciaran
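Putting the successive 403s from this thread together (`jobs`, `pods/log`, `services`, `poddisruptionbudgets`), the Role might end up looking like the manifest generated below. This is a sketch assembled from the error messages above; the thread does not show the final configuration, and the `build_agent_role` helper is hypothetical. `kubectl` accepts JSON manifests, so the printed output could be applied directly:

```python
import json
from typing import Any, Dict


def build_agent_role(namespace: str) -> Dict[str, Any]:
    """Role covering every resource that produced a 403 in this thread."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "prefect-agent-rbac", "namespace": namespace},
        "rules": [
            {"apiGroups": ["batch", "extensions"], "resources": ["jobs"], "verbs": ["*"]},
            {
                "apiGroups": [""],
                "resources": ["events", "pods", "pods/log", "services"],
                "verbs": ["*"],
            },
            {"apiGroups": ["policy"], "resources": ["poddisruptionbudgets"], "verbs": ["*"]},
        ],
    }


role = build_agent_role("pangeo-forge-azure-bakery")
print(json.dumps(role, indent=2))
```

Using `'*'` for verbs mirrors the existing rules; a tighter verb list would be preferable once the required operations are known.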
04/30/2021, 6:07 PMTyler Wanner
04/30/2021, 8:04 PMciaran
04/30/2021, 8:05 PMciaran
04/30/2021, 8:05 PMTyler Wanner
04/30/2021, 8:05 PMciaran
04/30/2021, 8:06 PMciaran
04/30/2021, 8:06 PMTyler Wanner
04/30/2021, 8:07 PMciaran
04/30/2021, 8:08 PMciaran
05/04/2021, 4:25 PMciaran
05/04/2021, 4:25 PMciaran
05/04/2021, 4:25 PMciaran
05/04/2021, 4:26 PMciaran
05/04/2021, 4:28 PMciaran
05/04/2021, 4:40 PMciaran
05/04/2021, 4:40 PMKevin Kho
ciaran
05/04/2021, 4:42 PMciaran
05/04/2021, 4:42 PMciaran
05/04/2021, 4:42 PMKevin Kho
say_hello
?ciaran
05/05/2021, 9:22 AMciaran
05/05/2021, 9:23 AMciaran
05/05/2021, 9:41 AMKevin Kho
ciaran
05/05/2021, 1:33 PM