
Jeff Brainerd

10/14/2020, 12:54 PM
Hi Prefect team, I am seeing consistent timeout issues when running GraphQL queries, either programmatically or in the UI. Other parts of the UI seem fine. Is this just me?
{
  "graphQLErrors": [
    {
      "path": [
        "flow_run"
      ],
      "message": "Operation timed out",
      "extensions": {
        "code": "API_ERROR"
      }
    }
  ],
  "networkError": null,
  "message": "GraphQL error: Operation timed out"
}

josh

10/14/2020, 12:55 PM
Hey @Jeff Brainerd I am not seeing this on a basic query. 🤔 What does your query look like?

Jeff Brainerd

10/14/2020, 12:58 PM
It’s a query we run all the time:
query {
    flow_run(where: {
        _and: {
            flow: {
                name: { _eq: "jf_data_pipeline" }
                version_group_id: { _eq: "prd" }
            }
            state: { _in: ["Running", "Scheduled", "Submitted", "Queued"] }
        }
    })
    {
        id
        name
        state
        start_time
        flow {
            environment
        }
    }
}
but you are right, a query like flow_run_by_pk works fine

Jeremiah

10/14/2020, 1:02 PM
I don’t see anything troubling in your query off the bat, but sometimes the translation from GraphQL to SQL produces suboptimal queries. I will do my best to replicate this morning if I can.

Jeff Brainerd

10/14/2020, 1:03 PM
thanks @Jeremiah
if I simplify the where clause it does appear to work

Jeremiah

10/14/2020, 1:03 PM
The fact that this used to work and now doesn’t suggests to me it might be related to the growth of data (perhaps adding a limit would help) - I will try to report back
Which piece of the where clause? Perhaps we just have a missing index

Jeff Brainerd

10/14/2020, 1:09 PM
weird, removing any one of the three seems to make it work…
FYI adding a limit does not help (and this query only pulls back a handful of flow runs anyway)

Jeremiah

10/14/2020, 2:31 PM
@Jeff Brainerd I was able to replicate and I think it will operate properly now
Obviously it’s great that we are able to expose the entire DB to users via GQL, but the downside is that as the DB grows, the probability of generating inefficient SQL grows too. In this case, a suboptimal query was planned (through no fault of yours). We cleaned up the table statistics, which seems to have helped the query planner choose a better route. I’ll keep an eye on this.

Jeff Brainerd

10/14/2020, 2:47 PM
Yes, working again — thank you for jumping on that so quickly @Jeremiah!
I’ve been bitten before by the same kind of thing!

Jeremiah

10/14/2020, 2:48 PM
No problem! Please let us know if you see something like that again. Our objective is to provide full DB access via the GQL API, but with great power comes great responsibility, so it’s important for us to determine if we’ve encouraged a bad query pattern.

Jeff Brainerd

10/14/2020, 3:07 PM
👍 rgr will do
BTW we are building quite a bit of tooling and monitoring on top of this API. If you have — now or in the future — any guidance on good or bad patterns that would be helpful.
FWIW we are careful to make our queries as specific as possible to bring back as little data as possible.

Jeremiah

10/14/2020, 3:08 PM
One thing we’ve learned ourselves is to favor multiple simple queries over large complex ones, not just as a matter of good data practice but also because it helps the query planner immensely.
yup
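A minimal sketch of that advice applied to the query from earlier in the thread: look up the flow IDs first, then fetch the active runs for those IDs, so neither query filters through a nested relationship. This is an illustration under assumptions, not Prefect's recommended code: `run_query` is a hypothetical callable that sends a query string and returns the response's data as a dict, and a `flow_id` column on `flow_run` is assumed.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical executor: takes a GraphQL query string, returns the "data" dict.
RunQuery = Callable[[str], Dict[str, Any]]

ACTIVE_STATES = ["Running", "Scheduled", "Submitted", "Queued"]


def flow_ids_query(name: str, version_group_id: str) -> str:
    """Simple query 1: find the flows by name and version group."""
    # json.dumps yields a double-quoted, escaped string literal.
    return (
        "query { flow(where: { "
        f"name: {{ _eq: {json.dumps(name)} }}, "
        f"version_group_id: {{ _eq: {json.dumps(version_group_id)} }} "
        "}) { id } }"
    )


def flow_runs_query(flow_ids: List[str], states: List[str]) -> str:
    """Simple query 2: runs of those flows in the given states.

    Assumes flow_run exposes a flow_id column (not confirmed in the thread).
    """
    return (
        "query { flow_run(where: { "
        f"flow_id: {{ _in: {json.dumps(flow_ids)} }}, "
        f"state: {{ _in: {json.dumps(states)} }} "
        "}) { id name state start_time } }"
    )


def active_flow_runs(
    run_query: RunQuery,
    name: str,
    version_group_id: str,
    states: List[str] = ACTIVE_STATES,
) -> List[Dict[str, Any]]:
    """Run the two simple queries in sequence instead of one nested query."""
    flows = run_query(flow_ids_query(name, version_group_id))["flow"]
    flow_ids = [flow["id"] for flow in flows]
    if not flow_ids:
        return []  # No matching flows, so skip the second round trip.
    return run_query(flow_runs_query(flow_ids, states))["flow_run"]
```

`run_query` can wrap whatever client already talks to the API. Flow-level fields such as `environment` could be requested in the first query rather than through the nested `flow { ... }` selection.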

Jeff Brainerd

10/14/2020, 3:09 PM
ok makes sense
@josh @Jeremiah Hey guys — seeing the same timeout again today 😞

Jeremiah

10/21/2020, 12:38 PM
Sorry about that, Jeff - we’ll have a closer look
Jeff, I have an idea but going to switch to DM to discuss in case it overlaps with anything private