# prefect-community
m
Hello everyone. We're currently running Prefect 2.0 on AKS. We’ve been working on deploying flows via Azure DevOps and have successfully deployed them to a default work queue. However, when testing a flow run, the agent (which is started for the default work queue) doesn’t pick up the flow run. Does anyone have any ideas on what the issue might be here?
r
Are you using Prefect Cloud, or a self-hosted Orion instance? In either case, check to make sure `PREFECT_API_URL` is set on the machine/container where the agent is running, either by setting an environment variable or by running `prefect config set`. If using Prefect Cloud, make sure `PREFECT_API_KEY` is set for the agent as well.
m
It’s a self-hosted Orion instance. The PREFECT_API_URL is already set on the agent and the Orion API. What do you mean by "on the machine/container"?
r
By 'machine/container', I just mean wherever you are running the agent. You mentioned AKS, so it sounds like you're running everything in containers, but you could run an agent on a VM that launches flow runs in AKS, so I was trying not to make too many assumptions about your infrastructure. It sounds like you've already set `PREFECT_API_URL` for the agent, though. Would you be able to post the output the agent shows when you start it?
m
Sure, let me get that for you. I may need to restart, since we start the agent on deployment.
Here’s the screenshot. Just had to scroll a bit…
r
Thanks! It looks okay from the agent side. And just to confirm, the API URL it says it connects to is correct? I'm sure you checked that already, but I just want to ensure we cover the easy things before looking elsewhere 🙂
m
Yep. I will say that the Prefect API URL is technically different from the one on our Orion API. We had to specify that URL differently due to an OpenSSL issue we saw.
But should be pointing to the same place.
r
Right; as long as they're pointing to the same place, it should work. How are you triggering the test flow runs? Via the deployments page in the UI?
m
Yes, via the deployments page in the UI. We were just testing running the flow from the UI.
r
Ok. And the work queue definitely shows up as `default` in the UI, similar to this? If so, can you verify that there aren't any leading or trailing spaces in the queue name? I believe those would get trimmed away automatically, but it's worth checking.
m
I’m pretty sure there are no leading or trailing spaces in the queue name.
c
If the PREFECT_API_URL is set locally in AKS, are you using the node DNS (e.g. deployments/prefect-orion), or are you using an ingress URL?
m
An ingress URL for the Orion API side. The Agent points directly to the http service within the cluster.
c
So you have an agent running in the cluster, pointing to the external ingress URL?
Wouldn’t you just want to point directly to the service? This is going to route outside the nodes to the external load balancer of the ingress.
If everything is running in the same cluster, you can just use pod DNS. If you’re running the agent outside the cluster, then I would check your network connectivity from where the agent is to where Prefect Orion is.
m
No, the agent is pointing to the service. The Orion API is pointing to the Ingress URL.
The Orion API and the agent are running in the same namespace.
@Christopher Boyd Any other ideas on what might be happening?
c
I’m not sure; I was attempting to test and reproduce this.
One thing I would consider or try to do
is to shell into the pod and curl the API URL you have set on the agent.
By default I don’t think curl would be included, so you might either need to build your own image that includes it, or test it from the node.
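Where curl is not available in the agent image, the same reachability check could be done with the Python interpreter that the image already ships. The following is a minimal sketch using only the standard library. It assumes the Orion API serves a health route at `<PREFECT_API_URL>/health`; the helper names `health_url` and `check_api` are hypothetical and not part of Prefect.

```python
import os
import urllib.error
import urllib.request
from typing import Tuple


def health_url(api_url: str) -> str:
    """Build the health-check URL (assumed route: <PREFECT_API_URL>/health)."""
    return api_url.rstrip("/") + "/health"


def check_api(api_url: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Return (reachable, detail) for the API URL the agent is configured with."""
    url = health_url(api_url)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200, f"HTTP {resp.status} from {url}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (wrong path, auth, ...).
        return False, f"HTTP {exc.code} from {url}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, connection refused, timeout, TLS problems, ...
        return False, f"cannot reach {url}: {exc}"


def main() -> None:
    api_url = os.environ.get("PREFECT_API_URL", "")
    if not api_url:
        print("PREFECT_API_URL is not set in this environment")
        return
    ok, detail = check_api(api_url)
    print(("OK: " if ok else "FAILED: ") + detail)


if __name__ == "__main__":
    main()
```

Run from the same pod or container as the agent, this reads `PREFECT_API_URL` from the environment only. A value stored with `prefect config set` lives in the Prefect profile rather than the environment, so in that case the URL would have to be passed to `check_api` directly.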
m
Due to our company’s policies on AKS namespaces, we can’t currently shell into the pod (working on that access). Will be working through today to see if we can isolate the problem. Glad the obvious errors are out of the way though.
We got it working; had to direct to the cluster IP (at the moment) until we figure out what was wrong with our specification of the local DNS.
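Since the cluster IP worked and the DNS name did not, a name-resolution check from inside the cluster may help narrow down what was wrong with the DNS specification. Below is a minimal standard-library sketch. It assumes the conventional Kubernetes service name form `<service>.<namespace>.svc.<cluster-domain>` with the default `cluster.local` domain, which a cluster can override; the helper names `service_fqdn` and `resolve` are hypothetical.

```python
import socket
from typing import List


def service_fqdn(service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """Build the fully qualified in-cluster DNS name of a Service.

    Assumes the default cluster domain; some clusters use a different one.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolve(host: str) -> List[str]:
    """Return the sorted IP addresses a host name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

For example, with a hypothetical service `prefect-orion` in a hypothetical namespace `prefect`, calling `resolve("prefect-orion")` and `resolve(service_fqdn("prefect-orion", "prefect"))` from the agent pod and comparing the results with the service's cluster IP would show whether the short name, the fully qualified name, or neither resolves.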