# prefect-community
j
Hi :prefect: team I am seeing consistent timeout issues trying to run GraphQL queries, either programmatically or in the UI. Other parts of the UI seem fine. Is this just me?
{
  "graphQLErrors": [
    {
      "path": [
        "flow_run"
      ],
      "message": "Operation timed out",
      "extensions": {
        "code": "API_ERROR"
      }
    }
  ],
  "networkError": null,
  "message": "GraphQL error: Operation timed out"
}
j
Hey @Jeff Brainerd I am not seeing this on a basic query. 🤔 What does your query look like?
j
It’s a query we run all the time:
query {
  flow_run(where: {
    _and: {
      flow: {
        name: { _eq: "jf_data_pipeline" }
        version_group_id: { _eq: "prd" }
      }
      state: { _in: ["Running", "Scheduled", "Submitted", "Queued"] }
    }
  }) {
    id
    name
    state
    start_time
    flow {
      environment
    }
  }
}
but you are right, a query like flow_run_by_pk works fine
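for reference, the by-pk lookup I mean is roughly this (the id is just a placeholder, not a real flow run):
query {
  flow_run_by_pk(id: "<flow-run-uuid>") {
    id
    name
    state
    start_time
  }
}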
j
I don’t see anything troubling in your query off the bat, but sometimes the translation from GraphQL to SQL introduces suboptimal queries. I will do my best to replicate this morning if I can
j
thanks @Jeremiah
if I simplify the where clause it does appear to work
j
The fact that this used to work and now doesn’t suggests to me it might be related to the growth of data (perhaps adding a limit would help) - I will try to report back
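e.g. something roughly like this (the limit value is arbitrary, just to cap the result set):
query {
  flow_run(
    where: {
      _and: {
        flow: {
          name: { _eq: "jf_data_pipeline" }
          version_group_id: { _eq: "prd" }
        }
        state: { _in: ["Running", "Scheduled", "Submitted", "Queued"] }
      }
    }
    limit: 10  # arbitrary cap on the number of rows returned
  ) {
    id
    name
    state
    start_time
  }
}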
Which piece of the where clause? Perhaps we just have a missing index
j
weird — removing any one of the three and it seems to work…
FYI adding a limit does not help (and this query only pulls back a handful of flow runs anyway)
j
@Jeff Brainerd I was able to replicate and I think it will operate properly now
Obviously it’s great that we are able to expose the entire DB to users via GQL, but the downside is that as the DB grows, the probability of generating inefficient SQL grows too - in this case, a suboptimal query was planned (not because of any fault of yours). We cleaned up the table statistics, which seems to have helped the query planner choose a better route. I’ll keep an eye on this.
j
Yes, working again — thank you for jumping on that so quickly @Jeremiah!
I’ve been bitten before by the same kind of thing!
j
No problem! Please let us know if you see something like that again - our objective is to provide full DB access via the GQL API, but with great power comes great responsibility, so it’s important for us to determine whether we’ve encouraged a bad query pattern
j
👍 rgr will do
BTW we are building quite a bit of tooling and monitoring on top of this API. If you have any guidance, now or in the future, on good or bad patterns, that would be helpful.
FWIW we are careful to make our queries as specific as possible to bring back as little data as possible.
j
One thing we’ve learned ourselves is to favor multiple simple queries over complex large ones, not just as a matter of data practice but also because it helps the query planner immensely
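for the query above that might look roughly like this, run as two separate requests (the flow id in the second query is just a placeholder for whatever the first one returns):
# 1. resolve the flow first
query {
  flow(where: {
    name: { _eq: "jf_data_pipeline" }
    version_group_id: { _eq: "prd" }
  }) {
    id
  }
}

# 2. then fetch its runs directly by flow_id
query {
  flow_run(where: {
    flow_id: { _eq: "<flow-id-from-query-1>" }
    state: { _in: ["Running", "Scheduled", "Submitted", "Queued"] }
  }) {
    id
    name
    state
    start_time
    flow {
      environment
    }
  }
}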
👍 1
yup
j
ok makes sense
@josh @Jeremiah Hey guys — seeing the same timeout again today 😞
j
Sorry about that Jeff - we’ll have a closer look
Jeff, I have an idea but going to switch to DM to discuss in case it overlaps with anything private