Tomek Florek

01/19/2022, 3:41 PM
Hey community! I’m trying to set up Prefect with Great Expectations, using the official task. I want to run a validation on an in-memory DataFrame (the result of a previous task) but am having trouble setting it up correctly. I’m using the v3 API of GE and have set up the expectation suite and checkpoint according to the guide (up to _context.run_checkpoint_), but I’m struggling to pass said DataFrame. Would anyone be able to offer some guidance?

Kevin Kho

01/19/2022, 3:42 PM
I don’t think you can pass an in-memory DataFrame to that task. You point to a DataFrame in storage and run the checkpoint against it.
I think you would need to modify the task to support that.

Tomek Florek

01/19/2022, 3:44 PM
Thanks for getting back Kevin! What do you mean by DataFrame in storage? A saved csv/parquet file?

Kevin Kho

01/19/2022, 3:45 PM
Yeah exactly
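A minimal sketch of that workaround, assuming pandas is available: the DataFrame from the upstream task is written to a CSV file, and the returned path is what a file-based checkpoint would then read. The helper name `persist_dataframe` and the file name `batch.csv` are hypothetical, and the Great Expectations datasource and checkpoint configuration is not shown.

```python
import tempfile
from pathlib import Path

import pandas as pd


def persist_dataframe(df: pd.DataFrame, directory: str, name: str = "batch.csv") -> Path:
    """Write df to a CSV file in `directory` and return the file path.

    Hypothetical helper: the name and the CSV format are illustrative choices.
    """
    target = Path(directory) / name
    # Create the directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # index=False keeps the pandas index out of the validated columns.
    df.to_csv(target, index=False)
    return target


if __name__ == "__main__":
    # Stand-in for the result of a previous task.
    df = pd.DataFrame({"id": [1, 2, 3], "amount": [10.5, 20.0, 7.25]})
    with tempfile.TemporaryDirectory() as tmp:
        path = persist_dataframe(df, tmp)
        # Assumption: the checkpoint's datasource is configured to read
        # files from this location before the validation task runs.
        print(path.name)
```

In a flow, a function like this could be wrapped as its own task between the one that produces the DataFrame and the validation task, so the checkpoint always sees a freshly written file.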

Tomek Florek

01/19/2022, 3:47 PM
Alright, that makes things much easier. I guess I was trying to complicate my life for no good reason. Thanks Kevin 👌, will run with that and let you know how it goes.
All went well 🙂

Kevin Kho

01/20/2022, 5:02 PM
Nice! Thanks for circling back