Pedro Martins
12/17/2020, 6:31 PM
Error: ErrImagePull
custom_confs = {
"run_config": KubernetesRun(
image="drtools/prefect:aircraft-etl",
image_pull_secrets=["regcred"],
),
"storage": S3(bucket="dr-prefect"),
}
with Flow("Aircraft-ETL", **custom_confs) as flow:
    airport = Parameter("airport", default="IAD")
    radius = Parameter("radius", default=200)
reference_data = extract_reference_data()
live_data = extract_live_data(airport, radius, reference_data)
transformed_live_data = transform(live_data, reference_data)
load_reference_data(reference_data)
load_live_data(transformed_live_data)
Name: prefect-job-ded2fd39-k6kpp
Namespace: default
Priority: 0
Node: ****
Start Time: Thu, 17 Dec 2020 15:20:15 -0300
Labels: controller-uid=386ac185-8bba-47b4-85b0-358c3601179c
job-name=prefect-job-ded2fd39
prefect.io/flow_id=3228aac5-a762-40db-9858-63c536ce5b8f
prefect.io/flow_run_id=93c58ae5-1bc4-4a3c-bb70-7bb6a50ff10e
prefect.io/identifier=ded2fd39
Annotations: kubernetes.io/psp: eks.privileged
Status: Pending
IP: 10.0.1.16
IPs:
IP: 10.0.1.16
Controlled By: Job/prefect-job-ded2fd39
Containers:
flow:
Container ID:
Image: drtools/prefect:aircraft-etl
Image ID:
Port: <none>
Host Port: <none>
Args:
prefect
execute
flow-run
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Environment:
PREFECT__CLOUD__API: http://prefect-server-apollo.default.svc.cluster.local:4200
PREFECT__CLOUD__AUTH_TOKEN:
PREFECT__CLOUD__USE_LOCAL_SECRETS: false
PREFECT__CONTEXT__FLOW_RUN_ID: 93c58ae5-1bc4-4a3c-bb70-7bb6a50ff10e
PREFECT__CONTEXT__FLOW_ID: 3228aac5-a762-40db-9858-63c536ce5b8f
PREFECT__CONTEXT__IMAGE: drtools/prefect:aircraft-etl
PREFECT__LOGGING__LOG_TO_CLOUD: true
PREFECT__ENGINE__FLOW_RUNNER__DEFAULT_CLASS: prefect.engine.cloud.CloudFlowRunner
PREFECT__ENGINE__TASK_RUNNER__DEFAULT_CLASS: prefect.engine.cloud.CloudTaskRunner
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-n28d2 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-n28d2:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-n28d2
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9m29s default-scheduler Successfully assigned default/prefect-job-ded2fd39-k6kpp to ip-10-0-1-20.eu-west-1.compute.internal
Normal Pulling 7m58s (x4 over 9m28s) kubelet Pulling image "drtools/prefect:aircraft-etl"
Warning Failed 7m57s (x4 over 9m28s) kubelet Failed to pull image "drtools/prefect:aircraft-etl": rpc error: code = Unknown desc = Error response from daemon: pull access denied for drtools/prefect, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
Warning Failed 7m57s (x4 over 9m28s) kubelet Error: ErrImagePull
Normal BackOff 7m44s (x6 over 9m27s) kubelet Back-off pulling image "drtools/prefect:aircraft-etl"
Warning Failed 4m23s (x20 over 9m27s) kubelet Error: ImagePullBackOff
Dylan
Pedro Martins
12/17/2020, 6:39 PM
$ kubectl get secrets -n default
NAME TYPE DATA AGE
aws-secret Opaque 2 3d1h
default-token-n28d2 kubernetes.io/service-account-token 3 8d
prefect-server-postgresql Opaque 1 6d21h
prefect-server-serviceaccount-token-lc6n2 kubernetes.io/service-account-token 3 6d21h
regcred kubernetes.io/dockerconfigjson 1 2d20h
sh.helm.release.v1.prefect-server.v1 helm.sh/release.v1 1 6d21h
Should PREFECT__CLOUD__USE_LOCAL_SECRETS: false
be set to true?
Dylan
Jim Crist-Harif
12/17/2020, 6:50 PM
image_pull_secrets
field.
Pedro Martins
12/17/2020, 6:52 PM
Jim Crist-Harif
12/17/2020, 6:55 PM
apiVersion: batch/v1
kind: Job
metadata:
labels:
prefect.io/flow_id: new_id
prefect.io/flow_run_id: id
prefect.io/identifier: 453321ca
name: prefect-job-453321ca
spec:
template:
imagePullSecrets:
- name: regcred
metadata:
labels:
prefect.io/flow_id: new_id
prefect.io/flow_run_id: id
prefect.io/identifier: 453321ca
spec:
containers:
- args:
- prefect
- execute
- flow-run
env:
- name: PREFECT__CLOUD__API
value: https://api.prefect.io
- name: PREFECT__CLOUD__AUTH_TOKEN
value: <redacted>
- name: PREFECT__CLOUD__USE_LOCAL_SECRETS
value: 'false'
- name: PREFECT__CONTEXT__FLOW_RUN_ID
value: id
- name: PREFECT__CONTEXT__FLOW_ID
value: new_id
- name: PREFECT__CONTEXT__IMAGE
value: drtools/prefect:aircraft-etl
- name: PREFECT__LOGGING__LOG_TO_CLOUD
value: 'true'
- name: PREFECT__ENGINE__FLOW_RUNNER__DEFAULT_CLASS
value: prefect.engine.cloud.CloudFlowRunner
- name: PREFECT__ENGINE__TASK_RUNNER__DEFAULT_CLASS
value: prefect.engine.cloud.CloudTaskRunner
image: drtools/prefect:aircraft-etl
name: flow
resources:
limits: {}
requests: {}
restartPolicy: Never
kubectl get job <your-job-id> -o yaml
?
Pedro Martins
12/17/2020, 6:59 PM
apiVersion: batch/v1
kind: Job
metadata:
creationTimestamp: "2020-12-17T18:25:53Z"
labels:
prefect.io/flow_id: 1d0ff4aa-da07-4309-82c5-d96f05502a03
prefect.io/flow_run_id: 2529c19e-0e6c-428f-b777-54b04d19fb9f
prefect.io/identifier: 93e105ba
name: prefect-job-93e105ba
namespace: default
resourceVersion: "2330069"
selfLink: /apis/batch/v1/namespaces/default/jobs/prefect-job-93e105ba
uid: f0246452-fcb8-41e5-b9a8-b816a5ec9a96
spec:
backoffLimit: 6
completions: 1
parallelism: 1
selector:
matchLabels:
controller-uid: f0246452-fcb8-41e5-b9a8-b816a5ec9a96
template:
metadata:
creationTimestamp: null
labels:
controller-uid: f0246452-fcb8-41e5-b9a8-b816a5ec9a96
job-name: prefect-job-93e105ba
prefect.io/flow_id: 1d0ff4aa-da07-4309-82c5-d96f05502a03
prefect.io/flow_run_id: 2529c19e-0e6c-428f-b777-54b04d19fb9f
prefect.io/identifier: 93e105ba
spec:
containers:
- args:
- prefect
- execute
- flow-run
env:
- name: PREFECT__CLOUD__API
value: http://prefect-server-apollo.default.svc.cluster.local:4200
- name: PREFECT__CLOUD__AUTH_TOKEN
- name: PREFECT__CLOUD__USE_LOCAL_SECRETS
value: "false"
- name: PREFECT__CONTEXT__FLOW_RUN_ID
value: 2529c19e-0e6c-428f-b777-54b04d19fb9f
- name: PREFECT__CONTEXT__FLOW_ID
value: 1d0ff4aa-da07-4309-82c5-d96f05502a03
- name: PREFECT__CONTEXT__IMAGE
value: drtools/prefect:aircraft-etl
- name: PREFECT__LOGGING__LOG_TO_CLOUD
value: "true"
- name: PREFECT__ENGINE__FLOW_RUNNER__DEFAULT_CLASS
value: prefect.engine.cloud.CloudFlowRunner
- name: PREFECT__ENGINE__TASK_RUNNER__DEFAULT_CLASS
value: prefect.engine.cloud.CloudTaskRunner
image: drtools/prefect:aircraft-etl
imagePullPolicy: IfNotPresent
name: flow
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Never
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
status:
active: 1
startTime: "2020-12-17T18:25:53Z"
Jim Crist-Harif
12/17/2020, 7:02 PM
Pedro Martins
12/17/2020, 7:05 PM
IMAGE_PULL_SECRETS
variable on the agent and it doesn't pass to the pods.
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/psp: eks.privileged
creationTimestamp: "2020-12-17T22:17:46Z"
generateName: prefect-agent-545bccd6c8-
labels:
app: prefect-agent
pod-template-hash: 545bccd6c8
name: prefect-agent-545bccd6c8-rqmg8
namespace: default
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: prefect-agent-545bccd6c8
uid: 870f927b-75d5-4da1-95aa-963b936ff204
resourceVersion: "2377225"
selfLink: /api/v1/namespaces/default/pods/prefect-agent-545bccd6c8-rqmg8
uid: 69e10ea0-9ff4-434c-bc71-9ec6b085e3fa
spec:
containers:
- args:
- prefect agent kubernetes start
command:
- /bin/bash
- -c
env:
- name: PREFECT__CLOUD__AGENT__AUTH_TOKEN
- name: PREFECT__CLOUD__API
value: http://prefect-server-apollo.default.svc.cluster.local:4200
- name: NAMESPACE
value: default
- name: IMAGE_PULL_SECRETS
value: regcred
- name: PREFECT__CLOUD__AGENT__LABELS
value: '[]'
- name: JOB_MEM_REQUEST
- name: JOB_MEM_LIMIT
- name: JOB_CPU_REQUEST
- name: JOB_CPU_LIMIT
- name: IMAGE_PULL_POLICY
- name: SERVICE_ACCOUNT_NAME
- name: PREFECT__BACKEND
value: server
- name: PREFECT__CLOUD__AGENT__AGENT_ADDRESS
value: http://:8080
- name: PREFECT__CLOUD__AGENT__LEVEL
value: DEBUG
image: prefecthq/prefect:0.14.0-python3.6
imagePullPolicy: Always
livenessProbe:
failureThreshold: 2
httpGet:
path: /api/health
port: 8080
scheme: HTTP
initialDelaySeconds: 40
periodSeconds: 40
successThreshold: 1
timeoutSeconds: 1
name: agent
resources:
limits:
cpu: 100m
memory: 128Mi
requests:
cpu: 100m
memory: 128Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-n28d2
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
nodeName: ip-10-0-1-20.eu-west-1.compute.internal
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: default-token-n28d2
secret:
defaultMode: 420
secretName: default-token-n28d2
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2020-12-17T22:17:46Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2020-12-17T22:17:49Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2020-12-17T22:17:49Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2020-12-17T22:17:46Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://cf748aa6b79bcc1d1aaa0b39eda0a0c07342a6d1a39e51637d11c6f89fbdb6b2
image: prefecthq/prefect:0.14.0-python3.6
imageID: docker-pullable://prefecthq/prefect@sha256:3ebe46f840d46044b9521c9380aa13bd0755670d06e2fdfe0e23c69de5a78fc0
lastState: {}
name: agent
ready: true
restartCount: 0
started: true
state:
running:
startedAt: "2020-12-17T22:17:48Z"
hostIP: 10.0.1.20
phase: Running
podIP: 10.0.1.54
podIPs:
- ip: 10.0.1.54
qosClass: Guaranteed
startTime: "2020-12-17T22:17:46Z"
apiVersion: batch/v1
kind: Job
metadata:
creationTimestamp: "2020-12-17T22:19:52Z"
labels:
prefect.io/flow_id: 0320c90d-56c0-40b1-a259-75ef587d24e3
prefect.io/flow_run_id: 34028fb2-2cd2-4cf4-88e1-82d983c650b2
prefect.io/identifier: ba7cd008
name: prefect-job-ba7cd008
namespace: default
resourceVersion: "2377667"
selfLink: /apis/batch/v1/namespaces/default/jobs/prefect-job-ba7cd008
uid: 9aa7782e-3bfa-4282-a021-25732e1a862a
spec:
backoffLimit: 6
completions: 1
parallelism: 1
selector:
matchLabels:
controller-uid: 9aa7782e-3bfa-4282-a021-25732e1a862a
template:
metadata:
creationTimestamp: null
labels:
controller-uid: 9aa7782e-3bfa-4282-a021-25732e1a862a
job-name: prefect-job-ba7cd008
prefect.io/flow_id: 0320c90d-56c0-40b1-a259-75ef587d24e3
prefect.io/flow_run_id: 34028fb2-2cd2-4cf4-88e1-82d983c650b2
prefect.io/identifier: ba7cd008
spec:
containers:
- args:
- prefect
- execute
- flow-run
env:
- name: PREFECT__CLOUD__API
value: http://prefect-server-apollo.default.svc.cluster.local:4200
- name: PREFECT__CLOUD__AUTH_TOKEN
- name: PREFECT__CLOUD__USE_LOCAL_SECRETS
value: "false"
- name: PREFECT__CONTEXT__FLOW_RUN_ID
value: 34028fb2-2cd2-4cf4-88e1-82d983c650b2
- name: PREFECT__CONTEXT__FLOW_ID
value: 0320c90d-56c0-40b1-a259-75ef587d24e3
- name: PREFECT__CONTEXT__IMAGE
value: drtools/prefect:aircraft-etl
- name: PREFECT__LOGGING__LOG_TO_CLOUD
value: "true"
- name: PREFECT__ENGINE__FLOW_RUNNER__DEFAULT_CLASS
value: prefect.engine.cloud.CloudFlowRunner
- name: PREFECT__ENGINE__TASK_RUNNER__DEFAULT_CLASS
value: prefect.engine.cloud.CloudTaskRunner
image: drtools/prefect:aircraft-etl
imagePullPolicy: IfNotPresent
name: flow
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Never
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
status:
active: 1
startTime: "2020-12-17T22:19:52Z"
[2020-12-17 22:19:52,076] INFO - agent | Found 1 flow run(s) to submit for execution.
[2020-12-17 22:19:52,079] DEBUG - agent | Updating states for flow run 34028fb2-2cd2-4cf4-88e1-82d983c650b2
[2020-12-17 22:19:52,096] DEBUG - agent | Flow run 34028fb2-2cd2-4cf4-88e1-82d983c650b2 is in a Scheduled state, updating to Submitted
[2020-12-17 22:19:52,110] DEBUG - agent | Next query for flow runs in 0.25 seconds
[2020-12-17 22:19:52,236] INFO - agent | Deploying flow run 34028fb2-2cd2-4cf4-88e1-82d983c650b2
[2020-12-17 22:19:52,238] DEBUG - agent | Loading job template from '/usr/local/lib/python3.6/site-packages/prefect/agent/kubernetes/job_template.yaml'
[2020-12-17 22:19:52,298] DEBUG - agent | Creating namespaced job prefect-job-ba7cd008
[2020-12-17 22:19:52,317] DEBUG - agent | Job prefect-job-ba7cd008 created
[2020-12-17 22:19:52,360] DEBUG - agent | Querying for flow runs
[2020-12-17 22:19:52,476] DEBUG - agent | Completed flow run submission (id: 34028fb2-2cd2-4cf4-88e1-82d983c650b2)
[2020-12-17 22:19:52,508] DEBUG - agent | No flow runs found
[2020-12-17 22:19:52,510] DEBUG - agent | Next query for flow runs in 0.5 seconds
[2020-12-17 22:19:53,010] DEBUG - agent | Querying for flow runs
[2020-12-17 22:19:53,067] DEBUG - agent | No flow runs found
[2020-12-17 22:19:53,072] DEBUG - agent | Next query for flow runs in 1.0 seconds
[2020-12-17 22:19:54,072] DEBUG - agent | Querying for flow runs
[2020-12-17 22:19:54,105] DEBUG - agent | No flow runs found
[2020-12-17 22:19:54,106] DEBUG - agent | Next query for flow runs in 2.0 seconds
[2020-12-17 22:19:56,106] DEBUG - agent | Querying for flow runs
[2020-12-17 22:19:56,148] DEBUG - agent | No flow runs found
[2020-12-17 22:19:56,148] DEBUG - agent | Next query for flow runs in 4.0 seconds
[2020-12-17 22:19:59,582] DEBUG - agent | Running agent heartbeat...
[2020-12-17 22:19:59,582] DEBUG - agent | Retrieving information of jobs that are currently in the cluster...
[2020-12-17 22:19:59,590] DEBUG - agent | Deleting job prefect-job-37fb2fd1
[2020-12-17 22:19:59,616] DEBUG - agent | Failing flow run 34028fb2-2cd2-4cf4-88e1-82d983c650b2 due to pod ErrImagePull
[2020-12-17 22:19:59,675] ERROR - agent | Error while managing existing k8s jobs
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/prefect/agent/kubernetes/agent.py", line 357, in heartbeat
self.manage_jobs()
File "/usr/local/lib/python3.6/site-packages/prefect/agent/kubernetes/agent.py", line 215, in manage_jobs
pod_events.items, key=lambda x: x.last_timestamp
TypeError: '<' not supported between instances of 'datetime.datetime' and 'NoneType'
[2020-12-17 22:19:59,714] DEBUG - agent | Sleeping heartbeat for 60.0 seconds
[2020-12-17 22:20:00,149] DEBUG - agent | Querying for flow runs
[2020-12-17 22:20:00,197] DEBUG - agent | No flow runs found
[2020-12-17 22:20:00,198] DEBUG - agent | Next query for flow runs in 8.0 seconds
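The TypeError in the heartbeat above comes from sorting pod events by `last_timestamp` when some events (here the in-progress image pull) have no timestamp at all, so `datetime < None` is attempted. A minimal sketch of a None-safe sort, with `PodEvent` as a hypothetical stand-in for the Kubernetes client's event objects:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class PodEvent:
    # Stand-in for the Kubernetes client's event type; only the
    # field being sorted on matters here.
    reason: str
    last_timestamp: Optional[datetime]


def sort_events(events: List[PodEvent]) -> List[PodEvent]:
    # Substitute a sentinel for missing timestamps so events with
    # last_timestamp=None sort first instead of raising TypeError.
    sentinel = datetime(1, 1, 1, tzinfo=timezone.utc)
    return sorted(events, key=lambda e: e.last_timestamp or sentinel)


events = [
    PodEvent("BackOff", datetime(2020, 12, 17, 22, 19, tzinfo=timezone.utc)),
    PodEvent("Pulling", None),
    PodEvent("Scheduled", datetime(2020, 12, 17, 22, 18, tzinfo=timezone.utc)),
]
print([e.reason for e in sort_events(events)])
# → ['Pulling', 'Scheduled', 'BackOff']
```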
Jim Crist-Harif
12/17/2020, 10:36 PM
flow.diagnostics()
for your flow?
Pedro Martins
12/17/2020, 10:38 PM
(looks like you've also found another unrelated bug with that error log :))
yeah! that error might be because it cannot connect to the pod it tried to create 🤷♂️
{
"config_overrides": {},
"env_vars": [],
"flow_information": {
"environment": false,
"result": false,
"run_config": {
"cpu_limit": false,
"cpu_request": false,
"env": false,
"image": true,
"image_pull_secrets": true,
"job_template": false,
"job_template_path": false,
"labels": false,
"memory_limit": false,
"memory_request": false,
"service_account_name": false,
"type": "KubernetesRun"
},
"schedule": false,
"storage": {
"_flows": false,
"_labels": false,
"add_default_labels": true,
"bucket": true,
"client_options": false,
"flows": false,
"key": false,
"local_script_path": false,
"result": true,
"secrets": false,
"stored_as_script": false,
"type": "S3"
},
"task_count": 7
},
"system_information": {
"platform": "Linux-4.14.203-156.332.amzn2.x86_64-x86_64-with-glibc2.10",
"prefect_backend": "server",
"prefect_version": "0.14.0",
"python_version": "3.8.6"
}
}
Jim Crist-Harif
12/17/2020, 10:51 PM
Nuno Silva
12/18/2020, 9:20 AM
TypeError
and no imagePullSecrets. I'm reinstalling prefect using conda as well, trying to see if there's any dependency not automatically updated when installing prefect that is causing this discrepancy. Thanks for all the debugging
Pedro Martins
12/21/2020, 7:06 PM
imagePullSecrets
tag.
The Kubernetes Agent deploy_flow
actually creates the job specification with the secret:
{'apiVersion': 'batch/v1',
'kind': 'Job',
'spec': {'template': {'spec': {'containers': [{'name': 'flow',
'image': 'drtools/prefect:aircraft-etl',
'args': ['prefect', 'execute', 'cloud-flow'],
'env': [{'name': 'PREFECT__CLOUD__API',
'value': 'http://****:4200'},
{'name': 'PREFECT__CLOUD__AUTH_TOKEN', 'value': ''},
{'name': 'PREFECT__CLOUD__USE_LOCAL_SECRETS', 'value': 'false'},
{'name': 'PREFECT__CONTEXT__FLOW_RUN_ID',
'value': 'a7b69781-cee0-4b40-811c-1aeb47a8cf60'},
{'name': 'PREFECT__CONTEXT__FLOW_ID',
'value': 'a7b69781-cee0-4b40-811c-1aeb47a8cf60'},
{'name': 'PREFECT__CONTEXT__IMAGE',
'value': 'drtools/prefect:aircraft-etl'},
{'name': 'PREFECT__LOGGING__LOG_TO_CLOUD', 'value': 'true'},
{'name': 'PREFECT__ENGINE__FLOW_RUNNER__DEFAULT_CLASS',
'value': 'prefect.engine.cloud.CloudFlowRunner'},
{'name': 'PREFECT__ENGINE__TASK_RUNNER__DEFAULT_CLASS',
'value': 'prefect.engine.cloud.CloudTaskRunner'}],
'resources': {'requests': {}, 'limits': {}}}],
'restartPolicy': 'Never'},
'metadata': {'labels': {'prefect.io/identifier': 'fb944cb5',
'prefect.io/flow_run_id': 'a7b69781-cee0-4b40-811c-1aeb47a8cf60',
'prefect.io/flow_id': 'a7b69781-cee0-4b40-811c-1aeb47a8cf60'}},
'imagePullSecrets': [{'name': 'regcred'}]}},
'metadata': {'labels': {'prefect.io/identifier': 'fb944cb5',
'prefect.io/flow_run_id': 'a7b69781-cee0-4b40-811c-1aeb47a8cf60',
'prefect.io/flow_id': 'a7b69781-cee0-4b40-811c-1aeb47a8cf60'},
'name': 'prefect-job-fb944cb5'}}
Then it calls self.batch_client.create_namespaced_job
. There is some sanitization of the payload, but it doesn't remove the pull secret from the body. When it calls the Kubernetes API via self.api_client.call_api
the body is complete!
However, the job specification that reaches the cluster doesn't contain the secret. It gets lost along the way, or it is removed by the cluster API server.
Are you aware of some API incompatibility here?
generate_job_spec_from_run_config
is adding the secret at the wrong level. The secret should be added at the same level as the container specification. The fix should be this:
pod_template["spec"]["imagePullSecrets"] = [{"name": s} for s in image_pull_secrets]
https://github.com/PrefectHQ/prefect/blob/master/src/prefect/agent/kubernetes/agent.py#L623
Jim Crist-Harif
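To illustrate the nesting issue: in a Job manifest, imagePullSecrets belongs in the pod spec (spec.template.spec, alongside containers), not one level up on the template itself, where the kubelet never sees it. A minimal sketch of the corrected placement, with add_image_pull_secrets as a hypothetical helper rather than the agent's actual function:

```python
def add_image_pull_secrets(job_spec: dict, image_pull_secrets: list) -> dict:
    # imagePullSecrets must live in the *pod* spec, at the same level
    # as "containers" -- i.e. spec.template.spec.imagePullSecrets.
    pod_template = job_spec["spec"]["template"]
    pod_template["spec"]["imagePullSecrets"] = [
        {"name": s} for s in image_pull_secrets
    ]
    return job_spec


job_spec = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {"template": {"spec": {"containers": [{"name": "flow"}]}}},
}
add_image_pull_secrets(job_spec, ["regcred"])
print(job_spec["spec"]["template"]["spec"]["imagePullSecrets"])
# → [{'name': 'regcred'}]
```

A pull secret placed at `job_spec["spec"]["template"]["imagePullSecrets"]` instead would simply be dropped by the API server's unknown-field handling, which matches the symptom described above.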
12/21/2020, 8:11 PM
Ananthapadmanabhan P
03/27/2021, 1:30 PM
Event: 'Failed' on pod 'prefect-job-42420d16-bn54h'
Message: Error: ErrImagePull
I have a Prefect server running on Kubernetes, which I installed using the helm chart available here - https://github.com/PrefectHQ/server/tree/master/helm/prefect-server. I tried two things: pass the arg image_pull_secrets
to KubernetesRun()
and tried editing the k8s deployment of agent to have the correct secret
IMAGE_PULL_SECRETS: [vi-dockerhub-key]
Neither worked for me, and I could see that the pod does not have the pull secrets in its description. Also, the above secret is in the default
namespace.
Since the above issue seems to be fixed, am I missing something trivial/obvious?