<@ULVA73B9P> How to create Github block and save i...
# ask-marvin
s
@Marvin How do I create a GitHub block and save it using the prefect-github package? Earlier I used the following:
import os

from prefect.filesystems import S3, GCS, Azure, GitHub

# DEFAULT_BLOCK and save_block are helpers defined elsewhere (not shown here)
gh = GitHub(
    repository="https://github.com/data-engineering.git",
    reference=os.environ.get("GITHUB_DATAPLATFORM_BRANCH", "main"),
    # access_token is needed for private repositories, supported in Prefect>=2.6.2
    access_token=os.environ.get("GITHUB_PERSONAL_ACCESS_TOKEN", DEFAULT_BLOCK),
)
save_block(gh)
m
It appears the search didn't find a direct example for creating and saving a GitHub block using the `prefect-github` package. However, I can deduce a method similar to that of managing other filesystem blocks like `S3`, based on familiarity with Prefect's block system. Typically, for filesystem blocks, you initialize a block with the necessary configuration parameters and then call the `.save()` method to save it. Presumably, GitHub block creation would follow a similar pattern. However, since I don't have information specific to saving a GitHub block via Prefect 2.x or the `prefect-github` package, I suggest you refer to Prefect's official documentation or another source for updated instructions. If a specific GitHub block interface is provided in Prefect 2.x or the `prefect-github` package, you would use that interface, filling in appropriate parameters such as the repository, reference, and access token, before saving the block. Given the lack of precise documentation returned, you might need to review any updated Prefect 2.x and `prefect-github` documentation or check for issues on the GitHub repository that provide relevant examples or guidance.
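As a rough illustration of the "initialize, then `.save()`" pattern described above, here is a minimal sketch. It assumes the `prefect-github` package exposes `GitHubCredentials` (with a `token` field) and `GitHubRepository` (with `repository_url`, `reference`, and `credentials` fields); please check those names against the installed version. The `GITHUB_REPOSITORY_URL` variable and the block name `data-engineering-repo` are hypothetical and only there for the example.

```python
import os


def github_block_settings(env=os.environ):
    """Collect the values the block needs from environment variables."""
    return {
        # GITHUB_REPOSITORY_URL is a hypothetical variable name for this sketch.
        "repository_url": env.get(
            "GITHUB_REPOSITORY_URL", "https://github.com/data-engineering.git"
        ),
        "reference": env.get("GITHUB_DATAPLATFORM_BRANCH", "main"),
        # Stays None when no token is set, which is enough for public repos.
        "token": env.get("GITHUB_PERSONAL_ACCESS_TOKEN"),
    }


def save_github_blocks(settings, name="data-engineering-repo"):
    """Create and save the blocks; class and field names are assumptions."""
    # Imported lazily so the settings helper works without prefect-github.
    from prefect_github import GitHubCredentials
    from prefect_github.repository import GitHubRepository

    credentials = None
    if settings["token"]:
        credentials = GitHubCredentials(token=settings["token"])
        credentials.save(f"{name}-credentials", overwrite=True)

    repo = GitHubRepository(
        repository_url=settings["repository_url"],
        reference=settings["reference"],
        credentials=credentials,
    )
    # overwrite=True lets the script be re-run to update an existing block.
    repo.save(name, overwrite=True)
    return repo
```

Calling `save_github_blocks(github_block_settings())` from a script kept in version control would recreate the blocks without touching the UI.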
n
hi @Srujan Jabbireddy - what are you using the block for?
s
I am trying to create blocks and save them programmatically.
I don't want to create the blocks via the UI, because they can be changed there and there is no version control on them.
n
sorry, i mean why are you creating the blocks? what do you plan to use them for?
s
With respect to GitHub, I want to access the pipeline code from the repo, and I also want to access the company's dbt repo.
Earlier I used EC2 to orchestrate the pipelines. Last week we subscribed to Prefect Cloud, and now no pipeline works. It says there is no access to the repo.
n
i would not use the block from `prefect-github`
how are you creating deployments? `prefect deploy` and a `prefect.yaml`, or with something else
s
So there is no option other than creating it in the UI with respect to GitHub?
Earlier I used prefect deploy, but that doesn't seem to work anymore, even after I changed to the cloud work pool.
n
if you're using `prefect deploy` + `prefect.yaml` you do not use a `GitHub` block, you use a `git_clone` pull step
s
Thank you very much, Nate. I will go through these resources and set up my repo again.
n
sure! feel free to reach out here if something isn't making sense
s
Would you also recommend against creating blocks for Snowflake or other integrations?
n
blocks for integrations like `SnowflakeConnection` are still perfectly valid, I just mean you do not need blocks specifically for code storage anymore
s
understood
n
basically previously you had
prefect deployment build path.py:flow --infra-block k8s/someblock --storage-block s3/some-block
and now instead you say
prefect deploy /path/to/file.py:my_flow --pool my-work-pool
where `my-work-pool` is now the place where you config your infra (supersedes infra blocks like `ECSTask`, `KubernetesJob`, etc) and a `pull` step takes the role of the storage block, ie "where to get flow code at runtime?"
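to make the `pull` step idea concrete, here is a rough sketch that builds the structure a `pull` section in `prefect.yaml` would carry and prints it. it assumes the step is named `prefect.deployments.steps.git_clone` and accepts `repository`, `branch`, and `access_token`, which is worth double-checking against the docs for your Prefect version. the helper name `git_clone_pull_step` is made up for this example, and the repo URL and secret block name are placeholders.

```python
import json


def git_clone_pull_step(repository, branch="main", token_block=None):
    """Build one `pull` entry that clones a repo when a flow run starts."""
    options = {"repository": repository, "branch": branch}
    if token_block:
        # Templated reference to a Secret block, resolved by Prefect at runtime
        # (assumed template syntax; the block name is a placeholder).
        options["access_token"] = "{{ prefect.blocks.secret." + token_block + " }}"
    return {"prefect.deployments.steps.git_clone": options}


# Mirrors the top-level `pull:` key of a prefect.yaml file.
pull_section = {
    "pull": [
        git_clone_pull_step(
            "https://github.com/dataengineering-workflow-orchestration.git",
            token_block="github-access-token",
        )
    ]
}

print(json.dumps(pull_section, indent=2))
```

the same keys, written as YAML under `pull:` in `prefect.yaml`, are what `prefect deploy` would pick up, so no storage block is involved.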
s
And does that mean I should include the following in the main code?
from prefect import flow
from prefect.blocks.system import Secret
from prefect.runner.storage import GitRepository

if __name__ == "__main__":
    flow.from_source(
        source=GitRepository(
            url="https://github.com/dataengineering-workflow-orchestration.git",
            branch="main",
            credentials={
                # alternative block: Secret.load("github-access-token")
                "access_token": Secret.load("github-access-token-srujan-pat")
            },
        ),
        entrypoint="flows/extraction/main.py:records_to_database",
    ).deploy(
        name="private-git-storage-deployment",
        # build=False,
        work_pool_name="abcd-prefect-managed-pool",
        cron="0 */4 * * *",
    )
Earlier, when I ran through EC2, I never included the `if __name__ == "__main__"` block. I used prefect deploy build. So now, with a Prefect Cloud work pool, should I include the above code?