Hi all I just spotted that we now have a first implementatio Prefect Community #ask-community

Christian

06/09/2020, 8:53 PM

Hi all. I just spotted that we now have a first implementation of GreatExpectations in GitHub HEAD! Great stuff... I'm trying to run the example and wonder how to access the JSON that GE returns with the validation results?

🎉 1

Jim Crist-Harif

06/09/2020, 8:56 PM

Glad you're interested in the GE integration, you have @Laura Lorenz (she/her) to thank for that :)

😊 1

👏 5

❤️ 1

Christian

06/09/2020, 9:00 PM

I was just looking at how to integrate GE into a Prefect flow when I spotted the PR on GitHub. Waited another 5h and it was merged 😉 - but I'm a Prefect noob and not quite sure how the task/flow objects work. I need to access the large JSON that is returned by GE after a validation.

Jim Crist-Harif

06/09/2020, 9:02 PM

I'm not very familiar with GE. It looks like our builtin Prefect task returns the output of DataContext.run_validation_operator

Jim Crist-Harif

06/09/2020, 9:03 PM

https://docs.greatexpectations.io/en/latest/how_to_guides/validation/spare_parts/how_to_configure_a_validation_operator.html#invoking-an-operator

Christian

06/09/2020, 9:12 PM

I think I'm not really getting how I can access the result value itself. This is from the example:

with Flow("great expectations example flow") as flow:
    checkpoint_name = Parameter("checkpoint_name")
    validations = ge_task.map(checkpoint_name)
    print("Result:", validations.result)

but the print shows None. Maybe that's due to the use of map()?

Jim Crist-Harif

06/09/2020, 9:18 PM

Ah, nothing has run yet at that point. Prefect has two stages of use:

• Flow build. This is everything inside the with Flow(...) block. At this point no tasks have run, you're just describing the flow you'd like to run later.

• Flow run. This happens locally when you call flow.run, but can also happen if you register a flow to run later with either cloud or server.

To access the results after a run, look at:

state = flow.run()
print(state.result[validations].result)
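The build/run split described above can be illustrated without Prefect at all. The following is a toy sketch in plain Python, not Prefect's implementation: ToyTask and ToyFlow are hypothetical names invented here, and the only point is to show why a result read at build time is still None while the same value is available from the state returned by run().

```python
from typing import Any, Callable, Dict, List, Optional


class ToyTask:
    """A node in the graph: remembers a function and its upstream tasks."""

    def __init__(self, fn: Callable[..., Any],
                 upstream: Optional[List["ToyTask"]] = None) -> None:
        self.fn = fn
        self.upstream = upstream or []
        self.result: Any = None  # nothing is computed at build time


class ToyFlow:
    """Collects tasks while building; executes them only inside run()."""

    def __init__(self) -> None:
        self.tasks: List[ToyTask] = []

    def add(self, fn: Callable[..., Any], *upstream: ToyTask) -> ToyTask:
        task = ToyTask(fn, list(upstream))
        self.tasks.append(task)
        return task

    def run(self) -> Dict[ToyTask, Any]:
        results: Dict[ToyTask, Any] = {}
        # Tasks were added in dependency order, so a single pass is enough.
        for task in self.tasks:
            inputs = [results[up] for up in task.upstream]
            results[task] = task.fn(*inputs)
        return results


# Build stage: only describes the work.
flow = ToyFlow()
load = flow.add(lambda: [1, 2, 3])
total = flow.add(sum, load)
before_run = total.result  # still None, like validations.result in the question

# Run stage: the work actually happens here.
state = flow.run()
after_run = state[total]
print(before_run, after_run)
```

In this toy, looking up state[total] plays the role that state.result[validations] plays in the snippet above: the value lives in the state produced by the run, not on the task object created at build time.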

If you want to add a task after the GE task to handle the output, you can add another task to the flow and pass the results directly:

from prefect import Flow, Parameter, task

@task
def handle_ge_output(result):
    # do your stuff here, e.g. inspect or store the validation result
    pass

with Flow("great expectations example flow") as flow:
    checkpoint_name = Parameter("checkpoint_name")
    validations = ge_task.map(checkpoint_name)
    handle_ge_output.map(validations)
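As a rough idea of what the body of such a handler could look like, here is a minimal sketch in plain Python, run outside of any flow. It assumes each validation result can be treated as a dict with a top-level "success" flag; the fake_validations data is made up for illustration, and the real object returned by GE may need converting or may carry more structure. Inside a flow, the same function would be decorated with @task and mapped as shown above.

```python
from typing import Any, Dict, List


def handle_ge_output(result: Dict[str, Any]) -> str:
    """Reduce one validation result to a short status string.

    Assumption: the result exposes a boolean under the "success" key.
    """
    return "passed" if result.get("success") else "failed"


# Made-up stand-ins for what a mapped GE task might hand downstream.
fake_validations: List[Dict[str, Any]] = [
    {"success": True},
    {"success": False},
]

# Plain map() here mimics handle_ge_output.map(validations) in a flow:
# the handler is applied once per mapped validation result.
summaries = list(map(handle_ge_output, fake_validations))
print(summaries)
```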

Jim Crist-Harif

06/09/2020, 9:21 PM

I suggest reading through this for more info: https://docs.prefect.io/core/getting_started/first-steps.html#running-the-flow

Christian

06/09/2020, 9:21 PM

Dooooh, yes I just figured that out. Sorry for the noob mistake 🙄

Jim Crist-Harif

06/09/2020, 9:22 PM

No worries! Glad you figured it out :)

Christian

06/09/2020, 9:22 PM

Thanks for the help. 👏 Back to my experiments 😎

Christian

06/09/2020, 11:49 PM

... now I'm only struggling to force result_format to be COMPLETE for all expectations ... not sure if this is possible though
