
Jake

04/21/2022, 3:58 PM
Hi everyone, I’ve been running into an issue today
AttributeError: 'V1Job' object has no attribute 'name'
and I’m not sure what this means. Prefect Cloud is reporting that
No heartbeat detected from the remote task; marking the run as failed.
This is a Kubernetes Agent. I looked at the logs and here is a stack trace:
ERROR:agent:Error while managing existing k8s jobs
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/prefect/agent/kubernetes/agent.py", line 190, in manage_jobs
    flow_run_state = self.client.get_flow_run_state(flow_run_id)
  File "/usr/local/lib/python3.6/site-packages/prefect/client/client.py", line 1664, in get_flow_run_state
    raise ObjectNotFoundError(f"Flow run {flow_run_id!r} not found.")
prefect.exceptions.ObjectNotFoundError: Flow run 'af8b8a74-8ed0-4417-812a-566de859ce64' not found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/prefect/agent/kubernetes/agent.py", line 413, in heartbeat
    self.manage_jobs()
  File "/usr/local/lib/python3.6/site-packages/prefect/agent/kubernetes/agent.py", line 193, in manage_jobs
    f"Job {job.name!r} is for flow run {flow_run_id!r} "
AttributeError: 'V1Job' object has no attribute 'name'
Deleting the Agent pod did not solve the issue. Any ideas?

Kevin Kho

04/21/2022, 4:00 PM
The heartbeat message just suggests that we lost communication with your task because something happened to it, such as running out of memory or crashing. Since we don’t hear from the task, we mark it as failed; otherwise it would show as running forever. This looks like there is an issue with the job spinning up. Is this happening during a flow run start? Does the flow interact with Kubernetes in any way? Is this consistent or intermittent?
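As a rough illustration of the heartbeat check described above, here is a minimal sketch. It is not Prefect's actual implementation; the function name `is_heartbeat_stale` and the two-minute timeout are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed timeout for illustration only; the real threshold may differ.
HEARTBEAT_TIMEOUT = timedelta(minutes=2)


def is_heartbeat_stale(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT):
    """Return True when no heartbeat has arrived within the timeout window."""
    return now - last_heartbeat > timeout


# Fixed timestamps so the example is deterministic.
now = datetime(2022, 4, 21, 16, 0, tzinfo=timezone.utc)
recent = now - timedelta(seconds=30)   # task reported 30 seconds ago
silent = now - timedelta(minutes=10)   # task has been silent for 10 minutes

recent_is_stale = is_heartbeat_stale(recent, now)
silent_is_stale = is_heartbeat_stale(silent, now)
```

A run whose last heartbeat is older than the timeout would be marked as failed rather than left in a running state indefinitely.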

Jake

04/21/2022, 4:06 PM
Flows are not running anymore; they stay stuck in a Scheduled state. It happened suddenly, and all subsequent flow runs failed. I’m not sure what the root cause is. The flows should not affect any Kubernetes processes.

Kevin Kho

04/21/2022, 4:07 PM
Does a hello world flow work? I am wondering if there is something wrong with your job definition.

Jake

04/21/2022, 4:12 PM
I can’t get any flows to run anymore; I think there is something wrong with the agent.
It appears that the agent is looking for a run that does not exist?
  File "/usr/local/lib/python3.6/site-packages/prefect/client/client.py", line 1664, in get_flow_run_state
    raise ObjectNotFoundError(f"Flow run {flow_run_id!r} not found.")
prefect.exceptions.ObjectNotFoundError: Flow run 'ed00b008-7224-4386-bea3-707684420326' not found.
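The two chained exceptions in the agent log can be reproduced with a small sketch. This is an illustration under stated assumptions, not the agent's code: `ObjectNotFoundError`, `get_flow_run_state`, and `describe_orphaned_job` here are hypothetical stand-ins, the log message text is made up, and the job is a `SimpleNamespace` shaped like a Kubernetes `V1Job`, whose name lives under `metadata.name` rather than directly on the object.

```python
from types import SimpleNamespace


class ObjectNotFoundError(Exception):
    """Stand-in for prefect.exceptions.ObjectNotFoundError."""


def get_flow_run_state(flow_run_id):
    # Hypothetical stand-in for the client call: the backend has no such run.
    raise ObjectNotFoundError(f"Flow run {flow_run_id!r} not found.")


def describe_orphaned_job(job, flow_run_id, name_of):
    """Look up the flow run state; if the run is gone, build a log message."""
    try:
        return get_flow_run_state(flow_run_id)
    except ObjectNotFoundError:
        # If name_of() fails here, the new error is chained onto the first one.
        return f"Job {name_of(job)!r} refers to missing flow run {flow_run_id!r}"


# Shaped like a V1Job: there is no top-level .name, only .metadata.name.
job = SimpleNamespace(metadata=SimpleNamespace(name="prefect-job-abc123"))

chained = None
try:
    describe_orphaned_job(job, "af8b8a74", lambda j: j.name)
except AttributeError as exc:
    chained = exc  # its __context__ is the original ObjectNotFoundError

# Reading the name from the metadata works as expected.
message = describe_orphaned_job(job, "af8b8a74", lambda j: j.metadata.name)
```

In this sketch the missing flow run is the first error, and the attribute lookup inside the error handler is what turns it into the `AttributeError` seen at the bottom of the trace.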

Kevin Kho

04/21/2022, 5:35 PM
I think in Prefect 1 it is hard to find the pod for a given flow run, but maybe you can try deleting that pod or cancelling the flow run that it’s looking for?
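One possible way to locate the Kubernetes job behind a flow run is to match on job labels. The sketch below assumes the agent labels each job with the key `prefect.io/flow_run_id`; that key and the helper name `jobs_for_flow_run` are assumptions for illustration, so check the labels on your own jobs first. Stand-in objects are used so the example runs without a cluster.

```python
from types import SimpleNamespace

# Assumed label key; verify it against the labels on your own jobs.
FLOW_RUN_LABEL = "prefect.io/flow_run_id"


def jobs_for_flow_run(jobs, flow_run_id):
    """Return the names of jobs whose label matches the given flow run ID."""
    matches = []
    for job in jobs:
        labels = job.metadata.labels or {}  # labels may be None on a real job
        if labels.get(FLOW_RUN_LABEL) == flow_run_id:
            matches.append(job.metadata.name)
    return matches


def make_job(name, labels):
    # Stand-in shaped like a Kubernetes V1Job (metadata.name, metadata.labels).
    return SimpleNamespace(metadata=SimpleNamespace(name=name, labels=labels))


jobs = [
    make_job("prefect-job-1", {FLOW_RUN_LABEL: "ed00b008-7224-4386-bea3-707684420326"}),
    make_job("prefect-job-2", {FLOW_RUN_LABEL: "some-other-run"}),
    make_job("unrelated-job", None),
]

stale = jobs_for_flow_run(jobs, "ed00b008-7224-4386-bea3-707684420326")
```

Against a real cluster, the job list would come from the Kubernetes Python client (for example `BatchV1Api().list_namespaced_job(namespace).items`), and a matching job could then be removed with `delete_namespaced_job(name, namespace)`.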

Anna Geller

04/21/2022, 6:31 PM
Can you send your flow and Kubernetes job template definition? It looks like an issue with Secrets. This user had a similar issue; check whether the solution at the bottom helps: https://discourse.prefect.io/t/issues-using-gitlab-storage-with-kubernetesagent-and-pre[…]r-404-project-not-found-or-file-or-directory-not-found/644

Jake

04/21/2022, 9:43 PM
I’m going to try deleting all the old Prefect jobs. I highly doubt it has anything to do with secrets, as the flows ran just fine yesterday with no changes in between. I will update here!

Anna Geller

04/21/2022, 10:37 PM
awesome, keep us posted!

Jake

04/22/2022, 3:18 PM
Can someone explain to me what this error means?
prefect.exceptions.ObjectNotFoundError: Flow run '0758ed94-ebda-44c4-a439-e698af7cb675' not found.
It’s being emitted by the agent on our cluster. I’m not sure if there is something wrong on the Cloud side. This is the flow run ID for a job that completed about 8 days ago. Searching for this ID in Prefect Cloud returns no results.
Why is the agent looking for jobs that ran so long ago?

Anna Geller

04/22/2022, 3:33 PM
What Prefect version do you use? This error should be fixed by this PR: https://github.com/PrefectHQ/prefect/pull/5577

Jake

04/22/2022, 4:45 PM
Ah, that would explain it. We are on version 0.15.4. Thanks!