    Ben Sack

    1 year ago
    Hi, I was wondering if it is possible to grab/filter by a flow parameter’s value via a GraphQL query. I’ve been messing around with the interactive API and can see you can filter flow_runs by parameter by using something like:
    flow_runs(where:{parameters: {_has_key: "process_date"}})
    but what I would like to do is filter the query for any flows that ran on a specific process date, such as 8/17/2021, rather than filtering for flows that have the process_date parameter. Thanks!
    Kevin Kho

    1 year ago
    Hey @Ben Sack, I am not 100% sure this can be done with a query only. I am checking; otherwise you might have to apply that filter to the response in Python.
    Ben Sack

    1 year ago
    Sorry just seeing this now, but thanks for the input! I didn’t think it was possible but I wanted to check before proceeding with any workarounds
    Kevin Kho

    1 year ago
    So I asked the team…and I don’t believe it’s possible yep
    Ben Sack

    1 year ago
    Thanks again Kevin! As a workaround, I am attempting to query all of the successful flow runs given a specific flow name, grab the parameters from the flow runs and then filter through the results using Python. Strangely enough, many of the recent flow_runs that come back from my query have "parameters": {}, yet when I check the parameters section for the flow_runs in the Prefect UI you can plainly see that there are parameters. Any ideas on why this would be? I can provide my query and results if that would help better understand the issue. Thanks
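The workaround described here (fetch the successful runs, then filter on a parameter value in Python) could look roughly like the sketch below. The sample data and the ISO date format are assumptions for illustration; only the "parameters" field name comes from the thread.

```python
from typing import Any, Dict, List


def runs_with_parameter(
    flow_runs: List[Dict[str, Any]], key: str, value: Any
) -> List[Dict[str, Any]]:
    """Keep only the flow runs whose explicit parameters contain key == value."""
    matches = []
    for run in flow_runs:
        params = run.get("parameters") or {}
        if key in params and params[key] == value:
            matches.append(run)
    return matches


# Hypothetical data, shaped like a flow_runs list in a GraphQL response.
sample_runs = [
    {"id": "run-1", "state": "Success", "parameters": {"process_date": "2021-08-17"}},
    {"id": "run-2", "state": "Success", "parameters": {}},
    {"id": "run-3", "state": "Success", "parameters": {"process_date": "2021-08-18"}},
]

matching = runs_with_parameter(sample_runs, "process_date", "2021-08-17")
print([run["id"] for run in matching])
```

Runs that come back with "parameters": {} are simply skipped, so this only helps once the run-level parameters are actually populated.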
    Kevin Kho

    1 year ago
    Yeah! I think there is another field for default_parameters. parameters are the ones explicitly supplied, but there is a default_parameters in the flow or flow run where you are probably passing the parameters through. Does that make sense for you?
    Ben Sack

    1 year ago
    Hey Kevin, so I tried using the default_parameters but that didn’t seem to do the trick either since that is a filter for the flow_group rather than flow_run. Here is the query I’m running and a snippet of the Prefect UI so you can better see what I’m struggling with. Query:
    {
      flow(
        where: {_and: [{name: {_eq: "snowflake_cost_usage"}}, {project: {name: {_eq: "DSS-DE-Prod"}}}]}
      ) {
        id
        name
        flow_runs(where: {state: {_eq: "Success"}}) {
          parameters
          start_time
          id
          state
          name
        }
      }
    }
    Part of the response:
    {
                "parameters": {},
                "start_time": "2021-09-12T04:00:09.013801+00:00",
                "id": "aa1c50c3-5aed-44e2-90bb-634b605152a6",
                "state": "Success",
                "name": "zippy-malamute"
              },
    Snippet of specific flow_run’s parameters from Prefect UI:
    Sorry for bombarding with messages, but basically long-story-short, I am trying to query those parameters shown above in the Prefect UI. The strangest thing about this is that one of the flow_runs does in fact return the correct result but the rest of them show "parameters": {}. I can provide the entire result from the query if needed. Thanks!
    Kevin Kho

    1 year ago
    I see your point. Are these supplied to the flow or on the default parameters on the schedule (flow group)? And no worries at all 🙂 Here to help.
    Ben Sack

    1 year ago
    I believe they are supplied to the flow here:
    with Flow("snowflake_cost_usage") as flow:
        process_date = Parameter(name="process_date", default=date_minus_seven())
        spoke_accounts = Parameter(
            name="spoke_accounts", default=list(SPOKE_ACCT_DBS.keys())
        )
        spoke_dbs = Parameter(name="spoke_dbs", default=list(SPOKE_ACCT_DBS.values()))
        hub_account = Parameter(name="hub_account", default=HUB_ACCT)
        hub_db = Parameter(name="hub_db", default=HUB_DB)
    but I do see them in the Default Parameters section in the settings in Prefect UI as well. The parameters are static except for the process_date.
    Kevin Kho

    1 year ago
    Gotcha I’ll ask the UI team how that info is retrieved
    Ben Sack

    1 year ago
    Thanks a lot!
    Kevin Kho

    1 year ago
    Did you try getting them at the flow level like this?
    query {
      flow_run {
        parameters
        flow {
          parameters
        }
      }
    }
    And then flow parameters will have the default like this:
    {
            "parameters": {
              "account_identifier": "inrun"
            },
            "flow": {
              "parameters": [
                {
                  "name": "account_identifier",
                  "slug": "account_identifier",
                  "tags": [],
                  "type": "prefect.core.parameter.Parameter",
                  "default": null,
                  "outputs": "typing.Any",
                  "required": false,
                  "__version__": "0.15.5"
                }
              ]
            }
          },
    nicholas

    1 year ago
    Hi @Ben Sack - if you want to query your runs directly to see what they ran with, you can do this from the IAPI:
    query($parameters: jsonb) {
      flow_run(where:{parameters: { _contains: $parameters }}) {
        id
        name
        parameters
      }
    }
    and in the Query Variables section, you would paste the parameters you’re looking for, like this:
    {
      "parameters": {
        "lat": 37.2761
      }
    }
    But replacing "lat": 37.2761 with whatever parameters you’re searching for
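Outside the Interactive API, the same query and variables are usually sent together as one JSON body with "query" and "variables" keys, which is the common GraphQL-over-HTTP convention. The sketch below only builds that body; the endpoint URL and authentication are deliberately left out, and the helper name build_request_body is made up for illustration.

```python
import json
from typing import Any, Dict

# The query from the message above, unchanged apart from whitespace.
FLOW_RUN_BY_PARAMETERS = """
query($parameters: jsonb) {
  flow_run(where: {parameters: {_contains: $parameters}}) {
    id
    name
    parameters
  }
}
"""


def build_request_body(parameters: Dict[str, Any]) -> str:
    """Serialize the query plus its variables into a single JSON request body."""
    return json.dumps(
        {
            "query": FLOW_RUN_BY_PARAMETERS,
            # The variable name must match $parameters in the query.
            "variables": {"parameters": parameters},
        }
    )


body = build_request_body({"lat": 37.2761})
print(body)
```

The value passed for $parameters is a plain key/value object, the same shape that would be pasted into the Query Variables box.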
    Ben Sack

    1 year ago
    This is great let me give it a shot now - thanks again for assist!
    nicholas

    1 year ago
    For sure! Let us know if that doesn’t work
    Ben Sack

    1 year ago
    @nicholas for the query variables section I am getting an error with that syntax saying Expected value of type "jsonb"
    This is what I have in the query section:
    {
      "parameters": {
        "name": "process_date"
      }
    }
    nicholas

    1 year ago
    You’ll want to use key/value there, so:
    "parameters": {
      "process_date": "<<the process date you're looking for>>"
    }
    And if you’re not seeing what you expect, I have another query for you that’ll join on the flow object as well, the syntax is a little different but not much
    Ben Sack

    1 year ago
    Yeah, so that was what I originally had, but it still yells at me about expecting the ‘jsonb’ type.
    As for @Kevin Kho’s query, I would like to break it down so I only pull the values for a specific flow_name rather than all of the parameters for all of the flows. Any thoughts on how to filter that, since it seems I cannot use the where clause for the flow_run.flow?
    Kevin Kho

    1 year ago
    Do you mean like this?
    query {
      flow_run (where: {flow: {name: {_eq: "test"}}}) {
        parameters
        flow  {
          name
          parameters
        }
      }
    }
    Ben Sack

    1 year ago
    Exactly! Sorry, that was a very simple question; I’m just getting used to the GraphQL syntax.
    I’m going to play around with those results and see if I can get the desired outcome and keep you guys posted - thanks again
    Kevin Kho

    1 year ago
    No worries. I chatted with nicholas about this and there really might be a bug on our end because the parameters should be populated and it shouldn’t be that hard. The UI has some code that collects the parameters in these different places and reduces them to produce the parameters in the card. This shouldn’t have to be this hard to query. Default parameters from the flow group are passed to the flow run, BUT default parameters from the flow aren’t
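A rough sketch of the kind of reduction described here: collect parameters from the flow, the flow group, and the flow run, then merge them into one view. The precedence order (run values over flow group defaults over flow defaults) is an assumption, not something confirmed in this thread, and effective_parameters is a made-up helper name. The flow-level list shape follows the flow.parameters response shown earlier.

```python
from typing import Any, Dict, List, Optional


def effective_parameters(
    flow_parameters: List[Dict[str, Any]],
    flow_group_defaults: Optional[Dict[str, Any]],
    run_parameters: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Merge parameter sources; later sources override earlier ones (assumed order)."""
    # Flow-level parameters arrive as a list of {"name": ..., "default": ...} objects.
    merged = {p["name"]: p.get("default") for p in flow_parameters}
    # Flow group defaults and run parameters arrive as plain key/value objects.
    merged.update(flow_group_defaults or {})
    merged.update(run_parameters or {})
    return merged


flow_level = [{"name": "account_identifier", "default": None, "required": False}]
print(effective_parameters(flow_level, {}, {"account_identifier": "inrun"}))
print(effective_parameters(flow_level, {"account_identifier": "group"}, {}))
```

With a merge like this, a run that returns "parameters": {} still resolves to whatever defaults the flow or flow group carries.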
    nicholas

    1 year ago
    Sorry for the slowness over here, been in and out of meetings
    Here’s the query I was talking about that might help, though @Kevin Kho is completely correct
    query($parameters: jsonb) {
      flow_run(where:{flow: {parameters: { _contains: $parameters }}}) {
        id
        name
        parameters
        
        flow {
          id
          parameters
        }
      }
    }
    with $parameters variable of:
    {
      "parameters": [
        {
          "name": "lat",
          "default": 37.2761
        }
      ]
    }
    The query variables box might get angry but it should work
    that’ll let you make the same reduction the UI does on those
    Ben Sack

    1 year ago
    Thanks! So both are working now, but unfortunately the default_parameters value for process_date is always the same (since it is the process_date that was initially calculated at flow build) rather than the process_date that is generated from the flow’s docker storage (stored as script) each time the flow runs.
    Are we able to confirm whether this issue with querying flow_run parameters is a bug on the Prefect side? If not, I can check with other groups to see if they are able to get the correct results with the query I have. What I find strange is that the query returns the correct parameter results for two flow runs but empty results ("parameters": {}) for all of the rest.
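The fixed process_date described here is what you would expect when a default is computed once, at build time, and then stored. One possible way around it, sketched below in plain Python, is to treat a missing value as "compute it now" inside the run. The date_minus_seven body is a guess at what the thread's helper does (it is not shown in the thread), and resolve_process_date is a hypothetical name.

```python
from datetime import date, timedelta
from typing import Optional


def date_minus_seven(today: Optional[date] = None) -> str:
    """Assumed behaviour of the thread's helper: ISO date seven days before today."""
    today = today or date.today()
    return (today - timedelta(days=7)).isoformat()


def resolve_process_date(supplied: Optional[str], today: Optional[date] = None) -> str:
    """Use the supplied value if present; otherwise compute it at call time."""
    return supplied if supplied is not None else date_minus_seven(today)


# A default computed once (as at flow build) stays fixed afterwards.
build_time_default = date_minus_seven(date(2021, 9, 1))
print(build_time_default)  # 2021-08-25

# Resolving inside the run follows the date of that run instead.
print(resolve_process_date(None, date(2021, 9, 12)))  # 2021-09-05
print(resolve_process_date("2021-08-17", date(2021, 9, 12)))  # 2021-08-17
```

In a flow, this would presumably mean a process_date default of None plus a task that does the resolution, but that wiring is not shown in the thread.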
    Kevin Kho

    1 year ago
    I think it is a bug for the population of parameters part yep. Will open a ticket for this today. Just wondering, does the process_date show up in the UI correctly?
    Ben Sack

    1 year ago
    Thanks! I’ll take a look now
    So in the parameter section of a given flow run, the process_date parameter is always the same since that is the value that was calculated at flow build. But when you click into the task run section for process_date it shows the Result Location equal to the correct process_date that is calculated each time a flow runs (using docker storage). I have two snippets to better explain what I’m looking at:
    Now that I’m looking at it: although the Result Location is equal to the correct process_date, I can see at the bottom it shows Parameter:{}
    Default Params for flow run:
    Process Date task run:
    What is strange is a run from the previous version of the flow calculates/pulls the correct process_date (as seen below). But the only difference between the versions was a change to a SQL statement to fix an issue on the database side and I don’t believe anything else was changed that would affect the flow or its parameters.
    Kevin Kho

    1 year ago
    Gotcha so it may have been a change introduced on our end? Thanks for all of the info. I haven’t gotten a chance to write the ticket yet but I’ll include these images.
    Ben Sack

    1 year ago
    Awesome, thanks for all the help!
    Hi @Kevin Kho, sorry for checking back so late, I’ve been engulfed in Splunk training the past two weeks. Was this issue ever looked at, or was a ticket ever created? I just wanted to check in before I resume my work on this effort tomorrow. Thanks!
    Kevin Kho

    11 months ago
    I am so sorry, it’s been on my to-do list but I haven’t gotten to writing the issue. Will do ASAP and ping you today, but on the development side, we are definitely aware of it but haven’t gotten to it yet.
    Ben Sack

    11 months ago
    Ok cool - no problem at all, thanks!
    Kevin Kho

    11 months ago
    Hey @Ben Sack, I tried to make this as succinct as possible. Can I get feedback if it accurately describes your issue? https://github.com/PrefectHQ/prefect/issues/5009
    There was a fix for this here
    Ben Sack

    11 months ago
    Thank you, I will take a look today!
    Kevin Kho

    11 months ago
    Not deployed yet though
    Ben Sack

    11 months ago
    No problem, appreciate the update! My coworker was working on a separate Prefect project and discovered the use of KV Stores in Prefect. By adding logic to the flow registration processes, we can populate the Prefect KV store from the config file within the repo. We figured this would be a way we could store the last run date as a key value pair in the kv store for the flow and use that date to calculate the process_date. Any thoughts/feedback on this method?
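A minimal sketch of that idea, using a plain dict as a stand-in for the KV store so it runs on its own. The rule used here (process_date is the stored last run date, falling back to seven days before today when nothing is stored) is an assumption, since the thread does not say how the date would be calculated. The key name and both helper names are hypothetical.

```python
from datetime import date, timedelta
from typing import MutableMapping


def get_process_date(store: MutableMapping[str, str], key: str, today: date) -> str:
    """Return the stored last run date, or today minus seven days if none is stored."""
    last_run = store.get(key)
    if last_run is None:
        return (today - timedelta(days=7)).isoformat()
    return last_run


def record_run(store: MutableMapping[str, str], key: str, today: date) -> None:
    """Save today's date as the last run date for the next run to pick up."""
    store[key] = today.isoformat()


kv = {}  # stand-in for the real key/value store
key = "snowflake_cost_usage_last_run"  # hypothetical key name

first = get_process_date(kv, key, date(2021, 9, 12))
record_run(kv, key, date(2021, 9, 12))
second = get_process_date(kv, key, date(2021, 9, 13))
print(first, second)
```

Against the real store, the dict reads and writes would be replaced by Prefect's KV store calls; the exact function names should be checked against the Prefect version in use.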
    Kevin Kho

    11 months ago
    That sounds good yep