Hey all. We’re trying to move to github storage fo...
# ask-community
b
Hey all. We’re trying to move to github storage for our flows, but don’t use Prefect secrets. Is there a way we could set the Github access token to make this work?
z
Hey @Brian Mesick --
GithubStorage
will try to pull an access token from the
GITHUB_ACCESS_TOKEN
environment variable if a secret name isn't set.
🤔 1
b
Hmm ok. It’s stored in our own Vault, but maybe we can do something in K8s land to expose it.
z
You can also set secrets locally (without interacting with Prefect) using environment variables
PREFECT__CONTEXT__SECRETS__<SECRET_NAME>=<SECRET_VALUE>
👍 1
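To make the two options above concrete, here is a minimal Python sketch of the lookup order described in this thread: use a named secret if one is set, otherwise fall back to GITHUB_ACCESS_TOKEN. The helper names are hypothetical and this is not Prefect's actual code; it only models the behaviour as described, with the environment passed in as a plain mapping.

```python
import os
from typing import Mapping, Optional

SECRET_PREFIX = "PREFECT__CONTEXT__SECRETS__"


def secret_env_var(secret_name: str) -> str:
    """Build the environment variable name for a locally set secret."""
    return f"{SECRET_PREFIX}{secret_name}"


def resolve_github_token(
    env: Mapping[str, str] = os.environ,
    secret_name: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical model of the lookup described in the thread.

    If a secret name is given, read it from the secrets env var;
    otherwise fall back to GITHUB_ACCESS_TOKEN.
    """
    if secret_name is not None:
        return env.get(secret_env_var(secret_name))
    return env.get("GITHUB_ACCESS_TOKEN")
```

In a K8s setup, either variable could be injected into the container from whatever system holds the token (Vault in this case); how that injection is done is outside what this sketch covers.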
b
Ah right, I remember now. Thanks!
t
Hi, I'm also on Brian's team. I was just curious about a possible modification to these script-based storage classes to allow them to somehow invoke python to fetch a secret. For instance, the PrefectSecret task by default doesn't support Vault, but we have been able to work around this by creating our own VaultKVSecret task (extends SecretBase) and simply using that in place of PrefectSecret: https://github.com/edx/edx-prefectutils/blob/master/edx_prefectutils/vault_secrets.py I saw that the Github storage class uses Secret() under the hood, and I was wondering if we could eventually do a similar thing and just extend the Secret class or something similar to what we've done with VaultKVSecret: https://github.com/PrefectHQ/prefect/blob/9774893720570180e6c3a458919ccbe40371aa30/src/prefect/storage/github.py#L161
we'd like to invest in as few different ways as possible to fetch secrets from vault because it is such a sensitive code path, and having a central place, like our VaultKVSecret class, to consolidate that logic is aligned with that goal.
z
I believe you could patch the
Secret
class to use vault instead and just run the agent on your modified version of Prefect. I'll open an issue for first-class support for custom secret loading at flow setup time.
@Marvin open "Allow custom secret classes to be used during flow deployment process"
t
thanks!
j
Hi, I am also from Brian's team, I want to ask two questions: 1- We are trying to use GitHub storage using the following syntax
flow.storage = GitHub(repo="my/repo", path="/flows/flow.py")
but when I add a new task to an existing flow and push it to GitHub, the flow run fails with:
File "/usr/local/lib/python3.8/site-packages/prefect/client/client.py", line 445, in _request
    response = self._send_request(
  File "/usr/local/lib/python3.8/site-packages/prefect/client/client.py", line 374, in _send_request
    raise ClientError(f"{exc}\n{graphql_msg}") from exc
prefect.utilities.exceptions.ClientError: 400 Client Error: Bad Request for url: <https://api.prefect.io/graphql>

The following error messages were provided by the GraphQL server:

    INTERNAL_SERVER_ERROR: Variable "$input" got invalid value null at
        "input.states[0].task_run_id"; Expected non-nullable type UUID! not to be null.

The GraphQL query was:

    mutation($input: set_task_run_states_input!) {
            set_task_run_states(input: $input) {
                states {
                    status
                    id
                    message
            }
        }
    }

The passed variables were:

    {"input": {"states": [{"state": {"_result": {"__version__": "0.14.5", "type": "NoResultType"}, "cached_inputs": {}, "message": "Starting task run.", "context": {"tags": []}, "__version__": "0.14.5", "type": "Running"}, "task_run_id": null, "version": null}]}}
2- We were using Docker storage before and installing some pip dependencies on top of the
prefecthq/prefect:latest-python3.
base image. Now that I am trying to use GitHub storage, should I remove the Docker storage? If so, how can the Python dependencies be managed with GitHub storage? Can we use both Docker storage and GitHub storage together?
b
I think we should only have 1 storage object, but I’m also not clear on how github storage works with k8s
z
Re that error: I will add better error handling for that case. That message is entirely unhelpful 🙂. When you add a new task to your flow you'll need to call
prefect register
again so the DAG is registered with the backend. That error indicates that there's a task that we don't have a UUID for. You cannot use both types of storage, but you can use a
Github
storage with a
DockerRun
run config. Your flow script will be pulled from GitHub and then executed on the Docker image. This is one of the better ways to set up a project. See https://github.com/jcrist/prefect-github-example for an example repo (WIP but still helpful). If you do not specify a run config, your flow will run on K8s in the default Prefect container. Storage does not determine how or where your flow is run.
👍 1
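A toy Python model of why the run fails until the flow is registered again. It is only an illustration of the explanation above, not how Prefect's backend is implemented; the task names and IDs are made up. The backend only knows IDs for tasks that existed at the last registration, so a task added afterwards resolves to no ID, which matches the null task_run_id in the error.

```python
from typing import Dict, List, Optional


def lookup_task_run_ids(
    registered: Dict[str, str], flow_tasks: List[str]
) -> Dict[str, Optional[str]]:
    """Map each task in the current flow script to the ID known from registration."""
    return {name: registered.get(name) for name in flow_tasks}


# IDs recorded when the flow was last registered (made-up values).
registered_ids = {"extract": "uuid-1", "load": "uuid-2"}

# The script pulled from GitHub now contains an extra task.
current_tasks = ["extract", "transform", "load"]

run_ids = lookup_task_run_ids(registered_ids, current_tasks)
unregistered = [name for name, run_id in run_ids.items() if run_id is None]
```

Here `unregistered` contains only the newly added task, which is the one the backend has no UUID for until the flow is registered again.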
j
Thanks @Zanie
Hey, one more question: we were installing Python dependencies in
docker storage
by passing those in
python_dependencies
. Can we do something similar with
github storage
? I can't see
python_dependencies
in github storage @Zanie
z
You'd install your python dependencies in a
Dockerfile
that you manage separately and reference with your
DockerRun
. It's a bit more work than just passing them as a kwarg, but it gives you much clearer control over your container.
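A minimal sketch of that setup, assuming Prefect 0.14.x (the version shown in the traceback earlier in the thread) is installed. The flow name, task, and image name are placeholders; the image is assumed to be built from your own Dockerfile with the pip dependencies already installed.

```python
from prefect import Flow, task
from prefect.run_configs import DockerRun
from prefect.storage import GitHub


@task
def say_hello():
    print("hello")


with Flow("github-docker-example") as flow:
    say_hello()

# The flow script is pulled from this repo and path when the flow runs.
flow.storage = GitHub(repo="my/repo", path="/flows/flow.py")

# Placeholder image name: built from a separately managed Dockerfile
# that installs the flow's Python dependencies.
flow.run_config = DockerRun(image="my-registry/my-flow-image:latest")
```

With this split, changing dependencies means rebuilding the image, while changing flow logic only needs a push to GitHub (plus re-registering when tasks are added or removed).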
j
z
Ah I forgot we added that! It's a sweet feature 😄
j
Thanks for the help @Zanie. I have one more question for you: we also have a prod.toml that we reference in our Makefile like this
PREFECT__USER_CONFIG_PATH=prod.toml
but because the make command behaves differently now that we use GitHub storage, our flows throw the following error:
Failed to load and execute Flow's environment: BoxKeyError("'Config' object has no attribute 'snowflake'")
This is because
config.snowflake
is supposed to come from
prod.toml
. Can you tell me how I can reference it in my flow?
b
Hey Jazib, we’re actually discussing this over here… https://prefect-community.slack.com/archives/CL09KU1K7/p1621441872112000
🙏 1