# random
o
Hi, I keep getting "The replica workerpool0-0 exited with a non-zero status of 127" in the console using the Vertex agent. Looking at the flow logs, I see the job submitted to Vertex, then after a few minutes I get "Rescheduled by a Lazarus process. This is attempt 1." and it just keeps retrying. Anyone had similar issues?
k
Exit code 127 means "command not found". Did you supply your own image or use the Prefect base image?
a
Also: do you use Prefect Cloud or Server?
o
I use Cloud. No, I'm using a custom image
a
any chance you can share your run config and storage? you can redact any private information
o
I'll PM you!
thanks!
a
Thanks for that! Your GCS and VertexRun look good. From a bit of googling, it looks like it may be a dependency/PATH issue.
1. Do you have some custom libraries that are not included in your custom Docker image?
2. Did you set a WORKDIR within your Dockerfile?
3. Are you perhaps running some shell commands with ShellTask within your flow? If so, it could be that the command you run does not exist within the container’s PATH. Quote: “Value 127 is returned by /bin/sh when the given command is not found within your PATH system variable and it is not a built-in shell command. In other words, the system doesn’t understand your command, because it doesn’t know where to find the binary you’re trying to call.”
4. How did you start the agent?
5. Can you confirm that the agent is running and healthy? Could you perhaps enable debug logs on the agent?
prefect agent vertex start --log-level DEBUG
6. Can you check the GCP job logs to see if perhaps it’s some transient issue e.g. because the machine you requested is not available? Those logs may be helpful to see what is going on within the Vertex job
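If it helps with the PATH question above, here is a rough sketch you could run inside the custom image to see which commands are missing. It only uses the Python standard library. The command names in the example list (`python`, `prefect`) are my assumption about what your container needs to call, so swap in whatever your entrypoint or ShellTask actually runs.

```python
import shutil
import subprocess


def missing_commands(commands):
    """Return the commands that cannot be resolved on the current PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


def shell_exit_code(command):
    """Run a command string through /bin/sh and return its exit status."""
    result = subprocess.run(
        ["/bin/sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # Hypothetical list: replace with the commands your image must provide.
    needed = ["python", "prefect"]
    print("missing from PATH:", missing_commands(needed))
    # A command that /bin/sh cannot find produces the same status as the replica.
    print("exit code:", shell_exit_code("surely-not-an-installed-command"))
```

If anything shows up in the missing list when you run this in the same image the Vertex job uses, that would be a likely candidate for the 127.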