# ask-community
d
Happy Friday! Is there an example I can follow for using Great Expectations validation in Prefect? I have it set up, but when I run the flow I get the error "ConfigNotFoundError('Error: No great_expectations directory was found here!". I'm not sure what config I'm missing or where to add it. Thank you
a
Did you run "great_expectations init"? That generates the great_expectations directory.
a
Hey Dammy! Great Expectations needs to know which directory your `great_expectations.yml` is located in. You can configure where `great_expectations` looks for `great_expectations.yml` by setting `context_root_dir` equal to the path to the directory where `great_expectations.yml` is located.
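A minimal sketch of how that directory could be located before handing it to Great Expectations. The helper name `find_context_root_dir` and the search-root argument are hypothetical and not part of Great Expectations or Prefect; the only thing taken from the thread is that `context_root_dir` should be the directory containing `great_expectations.yml`.

```python
from pathlib import Path
from typing import Union


def find_context_root_dir(search_root: Union[str, Path]) -> Path:
    """Return the directory under search_root that contains great_expectations.yml.

    Hypothetical helper for illustration only.
    """
    search_root = Path(search_root)
    # rglob also matches a file sitting directly in search_root.
    matches = sorted(search_root.rglob("great_expectations.yml"))
    if not matches:
        raise FileNotFoundError(
            f"No great_expectations.yml found under {search_root}"
        )
    # If several copies exist, the first one in sorted path order is used.
    return matches[0].parent
```

The returned path (converted with `str(...)` if needed) is what you would then set as `context_root_dir`.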
d
Ok, thank you. Trying to set it up for s3
Yes, I used init and there's a great_expectations directory, but Prefect is not finding it
a
What type of storage are you using for your flow?
d
S3 bucket
a
To make sure that I understand, the code for your flow is stored in S3, correct? Is your Great Expectations config stored somewhere else?
d
Yes
The Great Expectations folder is on my computer
a
You’ll need to make sure that your Great Expectations folder is on the same machine that is executing the flow. One way to do that would be to commit and push your Great Expectations config to a remote git repository so that it can be pulled down and used as part of your flow.
d
Ok, thank you! Let me try this
Hi Alex! I have committed and pushed my Great Expectations config to a remote git repository. Where do I add it in the Kubernetes run config so it can be used? Thanks
a
@dammy arinde I would expect that you need to explicitly clone your GE repository in a separate task that runs before the GE task, similarly to this:
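```python
import pygit2
from prefect import task


@task
def pull_ge_repo(repo_url: str, branch: str = None):
    # Clone the GE config repo onto the machine that executes the flow.
    # The path below is a placeholder; replace it with a real directory.
    pygit2.clone_repository(url=repo_url, path="your_path_to_clone_the_repo_into", checkout_branch=branch)
```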
Then, once it’s cloned, you could point the GE task to the path specified above. But maybe @alex can confirm
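One thing to watch for if the same machine runs the flow more than once: the clone target may already exist from an earlier run, and as far as I know cloning into a non-empty directory fails. A small guard like the sketch below could decide whether the clone step is needed at all. The helper name `needs_clone` is hypothetical and uses only the standard library.

```python
from pathlib import Path
from typing import Union


def needs_clone(dest: Union[str, Path]) -> bool:
    """Return True if dest is missing or is an empty directory.

    Hypothetical helper: call it before the clone step and skip
    cloning when it returns False.
    """
    dest = Path(dest)
    if not dest.exists():
        return True
    # An existing but empty directory is still a valid clone target.
    return dest.is_dir() and not any(dest.iterdir())
```

Inside the clone task you would call the clone function only when `needs_clone(path)` is true, and otherwise reuse what is already on disk (which may be stale, so this is a trade-off rather than a recommendation).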
a
Yes, that’s exactly what I was thinking
d
Thank you! I will try using pygit2 now
Seems I have to install pygit2 on the Docker image first
a
That’s correct - it must be installed on the agent / environment you run your flow in
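To catch a missing package before the flow gets as far as the clone task, a quick check along these lines could run at the start of the flow or while building the image. This is only a sketch; the helper name `missing_modules` is hypothetical, and it only checks that top-level modules are importable, not their versions.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the thread; adjust to what your flow imports.
    missing = missing_modules(["pygit2", "great_expectations"])
    if missing:
        raise SystemExit(f"Missing packages in this environment: {missing}")
```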