b
Hi, I was wondering if it is possible to grab/filter by a flow parameter’s value via a GraphQL query. I’ve been messing around with the interactive API and can see you can filter
flow_runs
by parameter by using something like:
flow_runs(where:{parameters: {_has_key: "process_date"}})
but what I would like to do is filter the query for any flows that ran on a specific process date, such as 8/17/2021, rather than filtering for flows that have the
process_date
parameter. Thanks!
k
Hey @Ben Sack, I am not 100% sure this can be done with a query only. Am checking, otherwise you might have to do that filter on the response in Python
b
Sorry just seeing this now, but thanks for the input! I didn’t think it was possible but I wanted to check before proceeding with any workarounds
k
So I asked the team…and I don’t believe it’s possible yep
b
Thanks again Kevin! As a workaround, I am attempting to query all of the successful flow runs for a specific flow name, grab the parameters from the flow runs, and then filter the results using Python. Strangely enough, many of the recent flow_runs that come back from my query have
"parameters": {}
, yet when I check the parameters section for the flow_runs in the Prefect UI you can plainly see that there are parameters. Any ideas on why this would be? I can provide my query and results if that would help better understand the issue. Thanks
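The Python-side workaround described above could look roughly like the following. This is only a sketch: `filter_runs_by_parameter` is a hypothetical helper, the sample data is made up, and the ISO date string format for `process_date` is an assumption about how the value is stored.

```python
from typing import Any, Dict, List


def filter_runs_by_parameter(
    flow_runs: List[Dict[str, Any]], key: str, value: Any
) -> List[Dict[str, Any]]:
    """Return the flow runs whose `parameters` dict maps `key` to `value`."""
    return [
        run
        for run in flow_runs
        # Treat a missing or null `parameters` field the same as an empty dict.
        if (run.get("parameters") or {}).get(key) == value
    ]


# Hypothetical sample shaped like the `flow_runs` list in a GraphQL response.
sample_runs = [
    {"id": "run-1", "parameters": {"process_date": "2021-08-17"}},
    {"id": "run-2", "parameters": {}},
    {"id": "run-3", "parameters": {"process_date": "2021-08-18"}},
]

matching = filter_runs_by_parameter(sample_runs, "process_date", "2021-08-17")
print([run["id"] for run in matching])
```

Runs that come back with `"parameters": {}` are simply skipped, so this only helps once the response actually carries the values.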
k
Yeah! I think there is another field for
default_parameters
.
parameters
are the ones explicitly supplied, but there is a
default_parameters
in the flow or flow run where you are probably passing the parameters through. Does that make sense for you?
b
Hey Kevin, so I tried using the
default_parameters
but that didn’t seem to do the trick either since that is a filter for the
flow_group
rather than
flow_run
. Here is the query I’m running and a snippet of the Prefect UI so you can better see what I’m struggling with. Query:
{
  flow(
    where: {_and: [{name: {_eq: "snowflake_cost_usage"}}, {project: {name: {_eq: "DSS-DE-Prod"}}}]}
  ) {
    id
    name
    flow_runs(where: {state: {_eq: "Success"}}) {
      parameters
      start_time
      id
      state
      name
    }
  }
}
Part of the response:
{
            "parameters": {},
            "start_time": "2021-09-12T04:00:09.013801+00:00",
            "id": "aa1c50c3-5aed-44e2-90bb-634b605152a6",
            "state": "Success",
            "name": "zippy-malamute"
          },
Snippet of specific flow_run’s parameters from Prefect UI:
Sorry for bombarding with messages, but basically long-story-short, I am trying to query those parameters shown above in the Prefect UI. The strangest thing about this is that one of the
flow_runs
does in fact return the correct result but the rest of them show
"parameters": {}
. I can provide the entire result from the query if needed. Thanks!
k
I see your point. Are these supplied to the flow or on the default parameters on the schedule (flow group)? And no worries at all 🙂 Here to help.
b
I believe they are supplied to the flow here:
with Flow("snowflake_cost_usage") as flow:
    process_date = Parameter(name="process_date", default=date_minus_seven())
    spoke_accounts = Parameter(
        name="spoke_accounts", default=list(SPOKE_ACCT_DBS.keys())
    )
    spoke_dbs = Parameter(name="spoke_dbs", default=list(SPOKE_ACCT_DBS.values()))
    hub_account = Parameter(name="hub_account", default=HUB_ACCT)
    hub_db = Parameter(name="hub_db", default=HUB_DB)
but I do see them in the Default Parameters section in the settings in Prefect UI as well. The parameters are static except for the
process_date
k
Gotcha I’ll ask the UI team how that info is retrieved
b
Thanks a lot!
k
Did you try getting them at the flow level like this?
query {
  flow_run {
    parameters
    flow {
      parameters
    }
  }
}
And then
flow
parameters
will have the default like this:
{
        "parameters": {
          "account_identifier": "inrun"
        },
        "flow": {
          "parameters": [
            {
              "name": "account_identifier",
              "slug": "account_identifier",
              "tags": [],
              "type": "prefect.core.parameter.Parameter",
              "default": null,
              "outputs": "typing.Any",
              "required": false,
              "__version__": "0.15.5"
            }
          ]
        }
      },
n
Hi @Ben Sack - if you want to query your runs directly to see what they ran with, you can do this from the IAPI:
query($parameters: jsonb) {
  flow_run(where:{parameters: { _contains: $parameters }}) {
    id
    name
    parameters
  }
}
and in the
Query Variables
section, you would paste the parameters you’re looking for, like this:
{
  "parameters": {
    "lat": 37.2761
  }
}
But replacing
"lat": 37.2761
with whatever parameters you’re searching for
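For anyone doing this from Python instead of the Interactive API, here is a minimal sketch of how the query and its variables fit together in one request body. `build_payload` is a hypothetical helper and the `process_date` value is an assumed example; actually sending the payload to the API is left out.

```python
import json
from typing import Any, Dict

# Query text taken from the message above.
QUERY = """
query($parameters: jsonb) {
  flow_run(where: {parameters: {_contains: $parameters}}) {
    id
    name
    parameters
  }
}
"""


def build_payload(parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Build a GraphQL request body; `parameters` holds the key/value pairs to match."""
    # The top-level variables key must match the `$parameters` name in the query.
    return {"query": QUERY, "variables": {"parameters": parameters}}


payload = build_payload({"process_date": "2021-08-17"})
print(json.dumps(payload["variables"], indent=2))
```

The `variables` part of the payload is what would go in the Query Variables box: parameter names as keys, the values you are searching for as values.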
b
This is great, let me give it a shot now - thanks again for the assist!
n
For sure! Let us know if that doesn’t work
b
@nicholas for the
query variables
section I am getting the error with that syntax saying
Expected value of type "jsonb"
This is what I have in the query section:
{
  "parameters": {
    "name": "process_date"
  }
}
n
You’ll want to use key/value there, so:
"parameters": {
  "process_date": "<<the process date you're looking for>>"
}
And if you’re not seeing what you expect, I have another query for you that’ll join on the flow object as well, the syntax is a little different but not much
b
Yeah so that was what I originally had but it still yells at me regarding expecting ‘jsonb’ type
as for @Kevin Kho’s query, I would like to break it down so I only pull the values for a specific
flow_name
rather than all of the parameters for all of the flows. Any thoughts on how to filter that since it seems I cannot use the
where
clause for the
flow_run.flow
k
Do you mean like this?
query {
  flow_run (where: {flow: {name: {_eq: "test"}}}) {
    parameters
    flow  {
      name
      parameters
    }
  }
}
b
Exactly! Sorry, that was a very simple question, I’m just getting used to the GraphQL syntax.
I’m going to play around with those results and see if I can get the desired outcome and keep you guys posted - thanks again
k
No worries. I chatted with nicholas about this and there really might be a bug on our end because the parameters should be populated and it shouldn’t be that hard. The UI has some code that collects the parameters in these different places and reduces them to produce the parameters in the card. This shouldn’t have to be this hard to query. Default parameters from the flow group are passed to the flow run, BUT default parameters from the flow aren’t
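The reduction described above might be sketched like this. The precedence order (flow run values over flow group defaults over flow defaults) is an assumption, not something confirmed in this thread, and `effective_parameters` is a hypothetical helper rather than the UI's actual code.

```python
from typing import Any, Dict, List, Optional


def effective_parameters(
    flow_parameters: List[Dict[str, Any]],
    flow_group_defaults: Optional[Dict[str, Any]],
    run_parameters: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Merge the three parameter sources into one name -> value dict."""
    # Start from the defaults declared on the flow (a list of parameter specs).
    merged = {spec["name"]: spec.get("default") for spec in flow_parameters}
    # Assumed precedence: flow group defaults override flow defaults...
    merged.update(flow_group_defaults or {})
    # ...and values supplied to the run override both.
    merged.update(run_parameters or {})
    return merged


# Shapes follow the earlier example response; the flow group value is made up.
flow_specs = [{"name": "account_identifier", "default": None}]
print(effective_parameters(flow_specs, {}, {"account_identifier": "inrun"}))
print(effective_parameters(flow_specs, {"account_identifier": "from_group"}, {}))
```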
n
Sorry for the slowness over here, been in and out of meetings
Here’s the query I was talking about that might help, though @Kevin Kho is completely correct
query($parameters: jsonb) {
  flow_run(where:{flow: {parameters: { _contains: $parameters }}}) {
    id
    name
    parameters
    
    flow {
      id
      parameters
    }
  }
}
with
$parameters
variable of:
{
  "parameters": [
    {
      "name": "lat",
      "default": 37.2761
    }
  ]
}
The query variables box might get angry but it should work
that’ll let you make the same reduction the UI does on those
b
Thanks! So both are working now but unfortunately the
default_parameters
for
process_date
is always the same (since it is the
process_date
that was initially calculated at flow build) rather than the
process_date
that is generated from the flow’s docker storage (stored as script) each time the flow runs
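One common way to avoid a default that is frozen when the flow is built is to leave the parameter unset and compute the date inside the run. The sketch below shows that idea in plain Python, without any Prefect calls. `resolve_process_date` is a hypothetical helper, and it assumes `date_minus_seven()` means "seven days before today".

```python
from datetime import date, timedelta
from typing import Optional


def resolve_process_date(supplied: Optional[str], today: date) -> str:
    """Use the supplied date if given; otherwise compute seven days before `today`."""
    if supplied is not None:
        return supplied
    # Computed at call time, so each run gets a fresh value.
    return (today - timedelta(days=7)).isoformat()


# No value supplied: the date is derived from the day of the run.
print(resolve_process_date(None, date(2021, 9, 12)))
# Value supplied explicitly: it is passed through unchanged.
print(resolve_process_date("2021-08-17", date(2021, 9, 12)))
```

With this shape, an explicitly supplied value would also show up in the run's `parameters`, while an omitted one is resolved at run time rather than at registration.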
Are we able to confirm whether this issue with querying
flow_run
parameters is a bug on the Prefect side? If not, I can check with other groups to see if they are able to get the correct results with the query I have. What I find strange is that the query returns the correct parameter results for two flow runs but empty results (
"parameters": {}
) for all of the rest.
k
I think it is a bug in how
parameters
gets populated, yep. Will open a ticket for this today. Just wondering, does the
process_date
show up in the UI correctly?
b
Thanks! I’ll take a look now
So in the parameter section of a given flow run, the
process_date
parameter is always the same since that is the value that was calculated at flow build. But when you click into the
task run
section for
process_date
it shows the
Result Location
equal to the correct
process_date
that is calculated each time a flow runs (using docker storage). I have two snippets to better explain what I’m looking at:
Now that I’m looking at it: although the
Result Location
is equal to the correct
process_date
, I can see at the bottom it shows
Parameter:{}
Default Params for flow run:
Process Date task run:
What is strange is a run from the previous version of the flow calculates/pulls the correct
process_date
(as seen below). But the only difference between the versions was a change to a SQL statement to fix an issue on the database side and I don’t believe anything else was changed that would affect the flow or its parameters.
k
Gotcha so it may have been a change introduced on our end? Thanks for all of the info. I haven’t gotten a chance to write the ticket yet but I’ll include these images.
b
Awesome, thanks for all the help!
Hi @Kevin Kho, sorry for checking back so late, I’ve been engulfed in Splunk training the past two weeks. Was this issue ever looked at, and was a ticket ever created? I just wanted to check in before I resume my work on this effort tomorrow. Thanks!
k
I am so sorry, it’s been on my to-do list but I haven’t gotten to writing the issue. Will do ASAP and ping you today. On the development side, we are definitely aware of it but haven’t gotten to it yet.
b
Ok cool - no problem at all, thanks!
k
Hey @Ben Sack, I tried to make this as succinct as possible. Can I get feedback if it accurately describes your issue? https://github.com/PrefectHQ/prefect/issues/5009
There was a fix for this here
b
Thank you, I will take a look today!
k
Not deployed yet though
b
No problem, appreciate the update! My coworker was working on a separate Prefect project and discovered the KV Store in Prefect. By adding logic to the flow registration process, we can populate the Prefect KV Store from the config file within the repo. We figured this would be a way to store the last run date as a key/value pair in the KV Store for the flow and use that date to calculate the
process_date
. Any thoughts/feedback on this method?
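A rough sketch of that idea, using a plain dict as a stand-in for the KV Store so it runs on its own. The key name, the helper names, and the rule "process the day after the last recorded run" are all assumptions made for illustration; in a real flow the dict reads and writes would be replaced by KV Store calls.

```python
from datetime import date, timedelta
from typing import Dict

LAST_RUN_KEY = "snowflake_cost_usage_last_run_date"  # hypothetical key name


def next_process_date(store: Dict[str, str], fallback: date) -> date:
    """Return the day after the stored last run date, or `fallback` if none is stored."""
    last_run = store.get(LAST_RUN_KEY)
    if last_run is None:
        return fallback
    # Assumed rule: the next process date is one day after the last recorded one.
    return date.fromisoformat(last_run) + timedelta(days=1)


def record_run(store: Dict[str, str], process_date: date) -> None:
    """Save the processed date as an ISO string, since KV values are plain strings here."""
    store[LAST_RUN_KEY] = process_date.isoformat()


store: Dict[str, str] = {}  # in-memory stand-in for the KV Store
first = next_process_date(store, fallback=date(2021, 9, 1))
record_run(store, first)
second = next_process_date(store, fallback=date(2021, 9, 1))
print(first, second)
```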
k
That sounds good yep