Christian

06/09/2020, 8:53 PM
Hi all. I just spotted that we now have a first implementation of GreatExpectations in GitHub HEAD! Great stuff... I tried to run the example and wonder how to access the JSON with the validation results that GE returns?
🎉 1

Jim Crist-Harif

06/09/2020, 8:56 PM
Glad you're interested in the GE integration, you have @Laura Lorenz (she/her) to thank for that :)
😊 1
šŸ‘ 5
ā¤ļø 1

Christian

06/09/2020, 9:00 PM
I was just looking at how to integrate GE into a Prefect flow when I spotted the PR on GitHub. Waited another 5h and it was merged 😉 - but I'm a Prefect noob and not quite sure how the task/flow objects work. I need to access the large JSON that is returned by GE after a validation.

Jim Crist-Harif

06/09/2020, 9:02 PM
I'm not very familiar with GE. It looks like our built-in Prefect task returns the output of DataContext.run_validation_operator.

Christian

06/09/2020, 9:12 PM
I think I'm not really getting how I can access the result value itself. This is from the example:
with Flow("great expectations example flow") as flow:
    checkpoint_name = Parameter("checkpoint_name")
    validations = ge_task.map(checkpoint_name)
    print("Result:", validations.result)
but the print outputs None. Maybe that's due to the use of map()?

Jim Crist-Harif

06/09/2020, 9:18 PM
Ah, nothing has run yet at that point. Prefect has two stages of use:
• Flow build. This is everything inside the with Flow(...) block. At this point no tasks have run, you're just describing the flow you'd like to run later.
• Flow run. This happens locally when you call flow.run, but can also happen if you register a flow to run later with either cloud or server.
To access the results after a run, look at:
state = flow.run()
print(state.result[validations].result)
If you want to add a task after the GE task to handle the output, you can add another task to the flow and pass the results directly:
from prefect import Flow, Parameter, task

@task
def handle_ge_output(result):
    # do your stuff here
    pass  # placeholder so the function body is valid Python

with Flow("great expectations example flow") as flow:
    checkpoint_name = Parameter("checkpoint_name")
    validations = ge_task.map(checkpoint_name)
    handle_ge_output.map(validations)
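
To make the build/run split above concrete, here is a minimal toy sketch in plain Python. It is not Prefect's API: ToyFlow, Node, validate and handle are hypothetical names invented for illustration, and the validation dict is a made-up stand-in for GE output. It only shows why reading a result while the flow is still being described gives None, and why the values have to be read from what the run returns.

```python
# Toy model of deferred execution. NOT Prefect's API; every name is hypothetical.

class Node:
    """Placeholder for a task call that was recorded at build time."""

    def __init__(self, fn, upstream=()):
        self.fn = fn
        self.upstream = tuple(upstream)
        # Never filled in by this toy: results are read from what run() returns.
        self.result = None


class ToyFlow:
    """Records nodes at build time and executes them only when run() is called."""

    def __init__(self):
        self.nodes = []

    def add(self, fn, *upstream):
        # Build stage: remember the call, do not execute it.
        node = Node(fn, upstream)
        self.nodes.append(node)
        return node

    def run(self):
        # Run stage: nodes were appended in dependency order, so one pass suffices.
        results = {}
        for node in self.nodes:
            args = [results[up] for up in node.upstream]
            results[node] = node.fn(*args)
        return results


def validate():
    # Made-up stand-in for a validation result.
    return {"success": True}


def handle(output):
    # Downstream step that receives the upstream output as a plain value.
    return "ok" if output["success"] else "failed"


flow = ToyFlow()
validation = flow.add(validate)
handled = flow.add(handle, validation)

build_time_value = validation.result  # nothing has run yet
results = flow.run()
run_time_value = results[validation]  # the dict returned by validate()
handled_value = results[handled]      # the string returned by handle()
```

The downstream handle step mirrors the second suggestion above: the output is passed along inside the graph, so nothing needs to be read while the flow is still being built.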

Christian

06/09/2020, 9:21 PM
Dooooh, yes I just figured that out. Sorry for the noob mistake 🙄

Jim Crist-Harif

06/09/2020, 9:22 PM
No worries! Glad you figured it out :)

Christian

06/09/2020, 9:22 PM
Thanks for the help. 👍 Back to my experiments 😎
... now I'm only struggling to force result_format to be COMPLETE for all expectations ... not sure if this is possible though