
Muhammad Usman Ghani

06/30/2022, 4:21 PM
Hello everyone, we have been using Prefect 1.0 for quite some time in lower and higher environments. Recently, the security team ran some activity, and the server hosting the Prefect DB was moved to a private subnet. Right after that, several issues started popping up:
• Unable to run jobs from the UI
• Flows can't be scheduled because the scheduler option is not working
• The agent is not picking up scheduled jobs
The whole stack is deployed on AWS ECS Fargate. The issues persist even after moving the subnets back to public.

Kevin Kho

07/01/2022, 5:33 AM
It’s a bit hard to diagnose this because it’s really specific to your set-up. The best I can do is provide some tips to debug: check whether you can run a GraphQL query using the Interactive API. If you can’t, it means the UI is not hitting the right API. All three issues look like the API endpoint is misconfigured, because it may have changed.
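The same check can be scripted from wherever the UI or agent runs, which helps after a subnet change. Below is a minimal sketch, not an official Prefect tool. It assumes the GraphQL API is served at `http://localhost:4200/graphql` (a common local default; replace it with the endpoint your UI and agent are configured to use). `build_request` and `endpoint_answers` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the GraphQL endpoint URL; replace with the one your stack uses.
API_URL = "http://localhost:4200/graphql"


def build_request(url, query, variables=None):
    """Build a JSON POST request for a GraphQL endpoint (nothing is sent yet)."""
    body = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def endpoint_answers(url, timeout=5.0):
    """Return True if the endpoint answers a trivial GraphQL query without errors."""
    # `__typename` is valid on any GraphQL server, so this only tests reachability.
    request = build_request(url, "query { __typename }")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # Connection refused, DNS failure, timeout, HTTP error, or non-JSON body.
        return False
    return isinstance(payload, dict) and "data" in payload and "errors" not in payload
```

Running `endpoint_answers(API_URL)` from inside the agent's network is more telling than running it from a laptop, since the two may resolve the endpoint differently.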

Muhammad Usman Ghani

07/01/2022, 5:51 AM
I ran a few calls through GraphQL and it seems to be working fine. However, there is one call that is not returning the expected results.
mutation($input: get_runs_in_queue_input!) {
  get_runs_in_queue(input: $input) { flow_run_ids }
}
There are a few scheduled jobs in the DB, but the mutation returns an empty flow_run_ids list.
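To reproduce this call outside the agent, the mutation can be wrapped in a request payload. This is a rough sketch with two assumptions: the input fields `before`, `labels`, and `tenant_id` are what I recall the 1.x agent sending and are not confirmed in this thread, and `build_queue_payload` and `extract_run_ids` are hypothetical helper names.

```python
from datetime import datetime, timezone

QUEUE_MUTATION = """
mutation($input: get_runs_in_queue_input!) {
  get_runs_in_queue(input: $input) {
    flow_run_ids
  }
}
"""


def build_queue_payload(tenant_id, labels, before=None):
    """Build the JSON body for the queue mutation.

    Assumption: the input takes `before`, `labels`, and `tenant_id`.
    """
    before = before or datetime.now(timezone.utc)
    return {
        "query": QUEUE_MUTATION,
        "variables": {
            "input": {
                "before": before.isoformat(),
                "labels": sorted(labels),
                "tenant_id": tenant_id,
            }
        },
    }


def extract_run_ids(response):
    """Return flow_run_ids from a decoded GraphQL response, raising on errors."""
    if response.get("errors"):
        raise RuntimeError(f"GraphQL errors: {response['errors']}")
    return response["data"]["get_runs_in_queue"]["flow_run_ids"]
```

If the list stays empty while Scheduled runs exist, it may be worth comparing the values sent here (tenant, labels, cutoff time) with what is stored on those runs.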

Kevin Kho

07/01/2022, 5:52 AM
That sounds like the UI is working. Maybe it’s the agent that is misconfigured, and maybe the towel container is down?

Muhammad Usman Ghani

07/01/2022, 5:56 AM
My understanding is that towel is responsible for generating flow schedules. If that is correct, it seems to be working: I can see many entries in the flow_run_states table in the DB with state = "Scheduled".

Kevin Kho

07/01/2022, 2:15 PM
Yes, towel is responsible for that. I just thought from the first post that it wasn’t working.

Oliver Tosky

07/14/2022, 6:29 PM
@Muhammad Usman Ghani were you able to find a root cause here? We just started seeing this issue sporadically in one of our EKS environments after making an underlying infra change. We are still on v0.15.4, though.