Srujan Jabbireddy
04/02/2024, 5:18 PM
import os

from prefect.filesystems import S3, GCS, Azure, GitHub

# DEFAULT_BLOCK and save_block are not defined in this snippet
gh = GitHub(
    repository="https://github.com/data-engineering.git",
    reference=os.environ.get("GITHUB_DATAPLATFORM_BRANCH", "main"),
    # access_token is needed for private repositories, supported in Prefect>=2.6.2
    access_token=os.environ.get("GITHUB_PERSONAL_ACCESS_TOKEN", DEFAULT_BLOCK),
)
save_block(gh)
Marvin
04/02/2024, 5:19 PM
[...] prefect-github package. However, I can deduce a method similar to that of managing other filesystem blocks like S3, based on familiarity with Prefect's block system.
Typically, for filesystem blocks, you initialize a block with the necessary configuration parameters and then call the .save() method to save it. Presumably, GitHub block creation would follow a similar pattern. However, since I don't have information specific to saving a GitHub block via Prefect 2.x or the prefect-github package, I suggest you refer to Prefect's official documentation or another source for updated instructions.
If a specific GitHub block interface is provided in Prefect 2.x or the prefect-github package, you would use that interface, filling in appropriate parameters such as the repository, reference, and access token, before saving the block. Given the lack of precise documentation returned, you might need to review any updated Prefect 2.x and prefect-github documentation or check for issues on the GitHub repository that provide relevant examples or guidance.
Nate
04/02/2024, 5:37 PM
Srujan Jabbireddy
04/02/2024, 5:38 PM
Srujan Jabbireddy
04/02/2024, 5:38 PM
Nate
04/02/2024, 5:39 PM
Srujan Jabbireddy
04/02/2024, 5:40 PM
Srujan Jabbireddy
04/02/2024, 5:41 PM
Nate
04/02/2024, 5:41 PM
[...] prefect-github
How are you creating deployments? prefect deploy and a prefect.yaml, or with something else?
Srujan Jabbireddy
04/02/2024, 5:43 PM
Srujan Jabbireddy
04/02/2024, 5:43 PM
Nate
04/02/2024, 5:48 PM
With prefect deploy + prefect.yaml, you do not use a GitHub block; you use a git_clone pull step.
Nate
04/02/2024, 5:48 PM
Srujan Jabbireddy
04/02/2024, 5:49 PM
Nate
04/02/2024, 5:50 PM
Srujan Jabbireddy
04/02/2024, 5:50 PM
Nate
04/02/2024, 5:51 PM
Blocks like SnowflakeConnection are still perfectly valid; I just mean you do not need blocks specifically for code storage anymore.
Srujan Jabbireddy
04/02/2024, 5:52 PM
Nate
04/02/2024, 5:53 PM
Before, you would say
prefect deployment build path.py:flow --infra-block k8s/someblock --storage-block s3/some-block
and now instead you say
prefect deploy /path/to/file.py:my_flow --pool my-work-pool
where my-work-pool is now the place where you configure your infra (it supersedes infra blocks like ECSTask, KubernetesJob, etc.), and a pull step takes the role of the storage block, i.e. "where to get flow code at runtime?"
Srujan Jabbireddy
04/02/2024, 5:57 PM
from prefect import flow
from prefect.blocks.system import Secret
from prefect.runner.storage import GitRepository

if __name__ == "__main__":
    flow.from_source(
        source=GitRepository(
            url="https://github.com/dataengineering-workflow-orchestration.git",
            branch="main",
            credentials={
                "access_token": Secret.load("github-access-token-srujan-pat")  # Secret.load("github-access-token")
            },
        ),
        entrypoint="flows/extraction/main.py:records_to_database",
    ).deploy(
        name="private-git-storage-deployment",
        # build=False,
        work_pool_name="abcd-prefect-managed-pool",
        cron="0 */4 * * *",
    )
Earlier, when I ran this on EC2, I never used the if __name__ == "__main__" guard; I used prefect deploy build.
So now, with a Prefect Cloud work pool, should I use the code above?
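
For comparison with the script above, here is a rough sketch of the git_clone pull step mentioned earlier in the thread, i.e. the prefect.yaml counterpart to a GitHub storage block. It is built as a plain Python dict and printed as JSON so that it runs without Prefect installed. The helper name git_clone_pull_section is hypothetical; the step name prefect.deployments.steps.git_clone, its repository / branch / access_token keys, and the {{ prefect.blocks.secret.<name> }} template are assumed from Prefect 2.x's prefect.yaml format and should be checked against the current docs. The URL and Secret block name are the ones from the thread.

```python
import json


def git_clone_pull_section(repository, branch="main", secret_block_name=None):
    """Build the structure of a prefect.yaml `pull:` section (hypothetical helper).

    Returns a list with a single git_clone step, mirroring what would be
    written by hand in prefect.yaml.
    """
    step = {"repository": repository, "branch": branch}
    if secret_block_name is not None:
        # Assumed template syntax: reference a Secret block by name rather
        # than embedding the token itself in the file.
        step["access_token"] = "{{ prefect.blocks.secret." + secret_block_name + " }}"
    return [{"prefect.deployments.steps.git_clone": step}]


if __name__ == "__main__":
    pull = git_clone_pull_section(
        "https://github.com/dataengineering-workflow-orchestration.git",
        branch="main",
        secret_block_name="github-access-token-srujan-pat",
    )
    # The printed JSON has the same nesting as the YAML `pull:` section.
    print(json.dumps({"pull": pull}, indent=2))
```

With a section like this in prefect.yaml, the deployment would be created with prefect deploy as described above, and no deployment script is involved. The flow.from_source(...).deploy(...) script is the Python alternative, run directly with python; there, the if __name__ == "__main__" guard only keeps the deploy call from running when the file is imported.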