# ask-community
m
Hey folks, I'm getting a weird error on my Kubernetes cluster (AKS) when my agent kicks off its first job. The agent does start successfully, and subsequent jobs run to completion, so it doesn't appear to stop anything functionally. Just wondered if anyone else has come across this? I'm packaging up my dependencies into Docker storage and storing my flows in blob storage. Cheers
Copy code
[2021-06-23 08:53:35,179] ERROR - agent | Error while managing existing k8s jobs
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/prefect/agent/kubernetes/agent.py", line 384, in heartbeat
    self.manage_jobs()
  File "/usr/local/lib/python3.7/site-packages/prefect/agent/kubernetes/agent.py", line 230, in manage_jobs
    pod_events.items, key=lambda x: x.last_timestamp
TypeError: '<' not supported between instances of 'datetime.datetime' and 'NoneType'
ERROR:agent:Error while managing existing k8s jobs
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/prefect/agent/kubernetes/agent.py", line 384, in heartbeat
    self.manage_jobs()
  File "/usr/local/lib/python3.7/site-packages/prefect/agent/kubernetes/agent.py", line 230, in manage_jobs
    pod_events.items, key=lambda x: x.last_timestamp
TypeError: '<' not supported between instances of 'datetime.datetime' and 'NoneType'
p
It looks like there's a comparison between two variables where one is of type datetime.datetime and the other has a None value. Please check any date variables, like the scheduled time of your flows.
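For illustration, here's a minimal sketch of the failing pattern, using fake events built with SimpleNamespace rather than Prefect's actual objects:
Copy code
from datetime import datetime
from types import SimpleNamespace

# Two fake pod events: the first has no last_timestamp yet (as can happen
# for a brand new job), the second has a real datetime.
events = [
    SimpleNamespace(last_timestamp=None),
    SimpleNamespace(last_timestamp=datetime(2021, 6, 23, 8, 53)),
]

# Mirrors the sorted(...) call in agent.py and raises:
# TypeError: '<' not supported between instances of 'datetime.datetime' and 'NoneType'
sorted(events, key=lambda x: x.last_timestamp)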
c
Hey Michael, so just to confirm, this doesn't error/stop anything, you're just noticing it in the logs?
m
Exactly @ciaran, everything is running fine, but it looks like it's doing a comparison against something unexpected on the first run.
I'm guessing x.last_timestamp potentially has no value on first run.
c
I'm just redeploying mine and I'll follow the agent logs, kick off a job and see if I get the same thing
Sounds about right though
@Michael Law what Prefect version are you on?
Copy code
____            __           _        _                    _
|  _ \ _ __ ___ / _| ___  ___| |_     / \   __ _  ___ _ __ | |_
| |_) | '__/ _ \ |_ / _ \/ __| __|   / _ \ / _` |/ _ \ '_ \| __|
|  __/| | |  __/  _|  __/ (__| |_   / ___ \ (_| |  __/ | | | |_
|_|   |_|  \___|_|  \___|\___|\__| /_/   \_\__, |\___|_| |_|\__|
                                           |___/

[2021-06-23 10:36:20,842] INFO - agent | Starting KubernetesAgent with labels ['ciarandev']
[2021-06-23 10:36:20,842] INFO - agent | Agent documentation can be found at <https://docs.prefect.io/orchestration/>
[2021-06-23 10:36:20,842] INFO - agent | Agent connecting to the Prefect API at <https://api.prefect.io>
[2021-06-23 10:36:20,975] INFO - agent | Waiting for flow runs...
INFO:agent:Found 1 flow run(s) to submit for execution.
[2021-06-23 10:42:13,356] INFO - agent | Found 1 flow run(s) to submit for execution.
[2021-06-23 10:42:13,559] INFO - agent | Deploying flow run 67799ea1-789e-4ca4-94e5-e8d16ae2c482
INFO:agent:Deploying flow run 67799ea1-789e-4ca4-94e5-e8d16ae2c482
On Prefect 0.14.19
I don't get the error
m
hmm, prefect:latest-python3.7
c
But I think the sorted call a few lines above is what's failing
Like you say, the initial job's pod events likely have no last_timestamp values yet
I've commented on the PR that fixed that issue: https://github.com/PrefectHQ/prefect/pull/4544#issuecomment-866745624. It looks like no tests went in to double-check this behaviour, and I think the problem just got bubbled up to a different place.
👍 1
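If it helps picture the kind of change involved, a None-safe sort key avoids ever comparing a datetime against None. This is just an illustrative sketch of the idea, not the actual patch:
Copy code
from datetime import datetime, timezone
from types import SimpleNamespace

def event_sort_key(event):
    # Fall back to the earliest possible timestamp when last_timestamp is
    # missing, so datetime is never compared against None.
    return event.last_timestamp or datetime.min.replace(tzinfo=timezone.utc)

events = [
    SimpleNamespace(last_timestamp=None),
    SimpleNamespace(last_timestamp=datetime(2021, 6, 23, 8, 53, tzinfo=timezone.utc)),
]

# No TypeError: the event with no timestamp simply sorts first.
print(sorted(events, key=event_sort_key))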
m
Nice one mate, thanks. All seems sensible to me, no dramas on this end either as it still works.
🙌 1
c
No worries!
@Michael Law just to keep you in the loop, this should address that error https://github.com/PrefectHQ/prefect/pull/4693/files
👍 1