Mark Li

10/17/2022, 1:43 PM
Hello everyone, we're currently running Prefect 2.0 on AKS. We’ve been working on deploying flows via Azure DevOps and have successfully deployed them to a default work queue. However, when testing a flow run, the agent (which is started for the default work queue) doesn’t pick up the flow run. Does anyone have any ideas on what the issue might be here?

Ryan Peden

10/17/2022, 1:51 PM
Are you using Prefect Cloud, or a self-hosted Orion instance? In either case, check to make sure PREFECT_API_URL is set on the machine/container where the agent is running, either by setting an environment variable or by running prefect config set. If using Prefect Cloud, make sure PREFECT_API_KEY is set for the agent as well.
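
A minimal sketch of that check, for anyone who wants to run it where the agent runs. The helper name `missing_settings` is hypothetical, and it only inspects environment variables; a value stored in a profile via prefect config set would not show up here.

```python
import os


def missing_settings(env, using_cloud=False):
    """Return the names of required settings that are unset or blank in env."""
    required = ["PREFECT_API_URL"]
    if using_cloud:
        # Prefect Cloud additionally needs an API key for the agent.
        required.append("PREFECT_API_KEY")
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Self-hosted Orion: only the API URL is required.
    missing = missing_settings(os.environ, using_cloud=False)
    print("missing:", ", ".join(missing) if missing else "none")
```

Passing the environment in as an argument keeps the function easy to try against a plain dict before running it inside the agent container.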

Mark Li

10/17/2022, 3:47 PM
It’s a self-hosted orion instance. The PREFECT_API_URL is already set on the agent and orion api. What do you mean by on the machine/container?

Ryan Peden

10/17/2022, 3:53 PM
By 'machine/container', I just mean wherever you are running the agent. You mentioned AKS, so it sounds like you're running everything in containers, but you could run an agent on a VM that launches flow runs in AKS, so I was trying not to make too many assumptions about your infrastructure. It sounds like you've already set PREFECT_API_URL for the agent, though. Would you be able to post the output the agent shows when you start it?

Mark Li

10/17/2022, 4:00 PM
Sure, let me get that for you. I may need to restart since we start the agent on deployment.
Here’s the screenshot. Just had to scroll a bit…

Ryan Peden

10/17/2022, 4:12 PM
Thanks! It looks okay from the agent side. And just to confirm, the API URL it says it connects to is correct? I'm sure you checked that already, but I just want to ensure we cover the easy things before looking elsewhere 🙂

Mark Li

10/17/2022, 4:15 PM
Yep. I will say that the Prefect API URL is technically different than our Orion API. We had to specify that url differently due to an OpenSSL issue we saw.
But should be pointing to the same place.

Ryan Peden

10/17/2022, 4:27 PM
Right; as long as they're pointing to the same place, it should work. How are you triggering the test flow runs? Via the deployments page in the UI?

Mark Li

10/17/2022, 5:20 PM
Yes, via the deployments page in the UI. We were testing just running the flow from the UI.

Ryan Peden

10/17/2022, 5:34 PM
Ok. And the work queue definitely shows up as default in the UI, similar to this? If so, can you verify that there aren't any leading or trailing spaces in the queue name? I believe those would get trimmed away automatically, but it's worth checking.

Mark Li

10/17/2022, 5:42 PM
I’m pretty sure there are no leading or trailing spaces in the queue name.

Christopher Boyd

10/17/2022, 6:23 PM
If the PREFECT_API_URL is set locally in AKS, are you using the node DNS, e.g. deployments/prefect-orion, or are you using an ingress URL?

Mark Li

10/17/2022, 7:24 PM
An ingress URL for the Orion API side. The Agent points directly to the http service within the cluster.

Christopher Boyd

10/17/2022, 8:17 PM
So you have an agent running in the cluster, pointing to the external ingress URL?
Wouldn’t you just want to point directly to the service? This is going to route outside the nodes to the external load balancer of the ingress.
If everything is running in the same cluster, you can just use pod DNS. If you’re running the agent outside the cluster, then I would check your network connectivity from where the agent is to where Prefect Orion is.
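
As a rough illustration of pointing at the service from inside the cluster: Kubernetes gives each Service a DNS name of the form service.namespace.svc.cluster-domain. The sketch below builds an API URL from that pattern. The function name, the service name `prefect-orion`, and the namespace `prefect` are placeholders, and the port assumes Orion's default of 4200 has not been changed.

```python
def in_cluster_api_url(service, namespace, port=4200, cluster_domain="cluster.local"):
    """Build an in-cluster API URL from a Service name and its namespace."""
    # <service>.<namespace>.svc.<cluster-domain> is the standard Service DNS form.
    host = f"{service}.{namespace}.svc.{cluster_domain}"
    return f"http://{host}:{port}/api"


if __name__ == "__main__":
    # Placeholder names; substitute the real Service and namespace.
    print(in_cluster_api_url("prefect-orion", "prefect"))
```

The resulting value is what would go into PREFECT_API_URL for an agent running in the same cluster.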

Mark Li

10/17/2022, 8:52 PM
No, the agent is pointing to the service. The Orion API is pointing to the Ingress URL.
The Orion API and the agent are running in the same namespace.
@Christopher Boyd Any other ideas on what might be happening?

Christopher Boyd

10/18/2022, 7:13 PM
I’m not sure; I was attempting to test and reproduce this.
One thing I would consider trying is to shell into the pod and curl the API URL you have set on the agent.
By default I don’t think curl would be included, so you might need to either build your own package to include it, or test it from the node.
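
If curl is not available, a similar check can be sketched with Python's standard library, which should already be present wherever the agent runs. This assumes the API exposes a health route at `<api_url>/health`; `check_api` is a hypothetical helper name.

```python
import urllib.error
import urllib.request


def check_api(api_url, timeout=5.0):
    """Return (reachable, detail) for a GET on <api_url>/health."""
    url = api_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, so the network path works even if the route is wrong.
        return True, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, or timeout: the agent cannot reach the API.
        return False, str(exc)


if __name__ == "__main__":
    import os

    api_url = os.environ.get("PREFECT_API_URL", "")
    print(check_api(api_url) if api_url else "PREFECT_API_URL is not set")
```

A False result with a name-resolution error points at DNS, while a refused connection or timeout points at the service, port, or network policy.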

Mark Li

10/20/2022, 1:29 PM
Due to our company’s policies on AKS namespaces, we can’t currently shell into the pod (working on that access). Will be working through today to see if we can isolate the problem. Glad the obvious errors are out of the way though.
We got it working; had to direct to the cluster IP (at the moment) until we figure out what was wrong with our specification of the local DNS.
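
For anyone hitting the same symptom (cluster IP works, local DNS name does not), a small sketch for testing name resolution on its own, separate from HTTP. The helper name `resolve` and the hostname in the usage line are placeholders, not values from this thread.

```python
import socket


def resolve(hostname):
    """Return the sorted IP addresses hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # The name could not be resolved from where this code is running.
        return []
    # Each entry's sockaddr starts with the address; de-duplicate across socket types.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Placeholder Service DNS name; substitute the one used in PREFECT_API_URL.
    print(resolve("prefect-orion.prefect.svc.cluster.local"))
```

An empty list for the service name, run from the agent's environment, would suggest the DNS name is misspelled or is missing its namespace rather than that the API itself is down.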