    Brian Mesick

    1 year ago
    Hey all. We’re trying to move to github storage for our flows, but don’t use Prefect secrets. Is there a way we could set the Github access token to make this work?
    Michael Adkins

    1 year ago
    Hey @Brian Mesick -- GithubStorage will try to pull an access token from the GITHUB_ACCESS_TOKEN environment variable if a secret name isn't set.
    Brian Mesick

    1 year ago
    Hmm ok. It’s stored in our own Vault, but maybe we can do something in K8s land to expose it.
    Michael Adkins

    1 year ago
    You can also set secrets locally (without interacting with Prefect) using environment variables
    PREFECT__CONTEXT__SECRETS__<SECRET_NAME>=<SECRET_VALUE>
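The lookup order described above can be sketched in plain Python. This is only an illustrative model of the behavior mentioned in the thread, not Prefect's implementation: `resolve_github_token` and `SECRET_ENV_PREFIX` are hypothetical names, and the function reads environment variables directly instead of calling Prefect.

```python
import os
from typing import Mapping, Optional

# Prefix from the thread: PREFECT__CONTEXT__SECRETS__<SECRET_NAME>=<SECRET_VALUE>
SECRET_ENV_PREFIX = "PREFECT__CONTEXT__SECRETS__"


def resolve_github_token(
    secret_name: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Hypothetical helper modelling the token lookup described in the thread."""
    if secret_name is not None:
        # A named secret can be supplied locally through an environment variable.
        return environ.get(SECRET_ENV_PREFIX + secret_name)
    # No secret name set: fall back to the plain GITHUB_ACCESS_TOKEN variable.
    return environ.get("GITHUB_ACCESS_TOKEN")
```

In a K8s deployment, either variable could be injected into the agent or job environment from whatever secret store is in use; how that injection is wired up is outside this sketch.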
    Brian Mesick

    1 year ago
    Ah right, I remember now. Thanks!
    Troy Sankey

    1 year ago
    Hi, I'm also on Brian's team. I was curious about a possible modification to these script-based storage classes that would allow them to invoke Python to fetch a secret. For instance, the PrefectSecret task doesn't support Vault by default, but we have been able to work around this by creating our own VaultKVSecret task (extends SecretBase) and simply using that in place of PrefectSecret: https://github.com/edx/edx-prefectutils/blob/master/edx_prefectutils/vault_secrets.py
    I saw that the Github storage class uses Secret() under the hood, and I was wondering if we could eventually do a similar thing and extend the Secret class, or something similar to what we've done with VaultKVSecret: https://github.com/PrefectHQ/prefect/blob/9774893720570180e6c3a458919ccbe40371aa30/src/prefect/storage/github.py#L161
    We'd like to invest in as few different ways as possible to fetch secrets from Vault because it is such a sensitive code path, and having a central place, like our VaultKVSecret class, to consolidate that logic is aligned with that goal.
    Michael Adkins

    1 year ago
    I believe you could patch the Secret class to use Vault instead and just run the agent on your modified version of Prefect. I'll open an issue for first-class support for custom secret loading at flow setup time.
    @Marvin open "Allow custom secret classes to be used during flow deployment process"
    Marvin

    1 year ago
    Troy Sankey

    1 year ago
    thanks!
    Jazib Humayun

    1 year ago
    Hi, I am also from Brian's team. I want to ask two questions:
    1- We are trying to use GitHub storage using the following syntax:
    flow.storage = GitHub(repo="my/repo", path="/flows/flow.py")
    but when I add a new task to an existing Prefect flow and push it to GitHub, the flow run fails with:
    File "/usr/local/lib/python3.8/site-packages/prefect/client/client.py", line 445, in _request
        response = self._send_request(
      File "/usr/local/lib/python3.8/site-packages/prefect/client/client.py", line 374, in _send_request
        raise ClientError(f"{exc}\n{graphql_msg}") from exc
    prefect.utilities.exceptions.ClientError: 400 Client Error: Bad Request for url: <https://api.prefect.io/graphql>
    
    The following error messages were provided by the GraphQL server:
    
        INTERNAL_SERVER_ERROR: Variable "$input" got invalid value null at
            "input.states[0].task_run_id"; Expected non-nullable type UUID! not to be null.
    
    The GraphQL query was:
    
        mutation($input: set_task_run_states_input!) {
                set_task_run_states(input: $input) {
                    states {
                        status
                        id
                        message
                }
            }
        }
    
    The passed variables were:
    
        {"input": {"states": [{"state": {"_result": {"__version__": "0.14.5", "type": "NoResultType"}, "cached_inputs": {}, "message": "Starting task run.", "context": {"tags": []}, "__version__": "0.14.5", "type": "Running"}, "task_run_id": null, "version": null}]}}
    2- We were using Docker storage before and installing some Python pip dependencies on top of the
    prefecthq/prefect:latest-python3.
    base image. Now that I am trying to use GitHub storage, should I remove the Docker storage? If yes, how can the Python dependencies be managed with GitHub storage? Can we use Docker storage and GitHub storage together?
    Brian Mesick

    1 year ago
    I think we should only have one storage object, but I’m also not clear on how GitHub storage works with K8s.
    Michael Adkins

    1 year ago
    Re that error: I will add better error handling for that case. That message is entirely unhelpful 🙂. When you add a new task to your flow you'll need to call prefect register again so the DAG is registered with the backend. That error indicates that there's a task that we don't have a UUID for.
    You cannot use both types of storage, but you can use a Github storage with a DockerRun run config. Your flow script will be pulled from GitHub and then executed on the Docker image. This is one of the better ways to set up a project. There's an example repo (WIP but still helpful) at https://github.com/jcrist/prefect-github-example
    If you do not specify a run config, your flow will run on K8s in the default Prefect container. Storage does not determine how/where your flow is run.
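A minimal sketch of that setup, assuming Prefect 0.14.x (the version shown in the traceback earlier in the thread). The flow name, the `say_hello` task, and the image name `my-registry/my-flow-image:latest` are placeholders invented for illustration; the repo and path values are the ones quoted in the question. The imports sit inside the function only so that the snippet can be loaded without Prefect installed.

```python
def build_flow():
    """Build a flow whose script lives on GitHub and which runs in a Docker image."""
    # Requires Prefect 0.14.x; imported here so the module loads without it.
    from prefect import Flow, task
    from prefect.run_configs import DockerRun
    from prefect.storage import GitHub

    @task
    def say_hello():
        print("hello")

    with Flow("github-docker-example") as flow:
        say_hello()

    # Storage: where the flow script is fetched from at run time.
    flow.storage = GitHub(repo="my/repo", path="/flows/flow.py")
    # Run config: the image the flow executes in (placeholder image name).
    flow.run_config = DockerRun(image="my-registry/my-flow-image:latest")
    return flow
```

With this layout, editing the body of an existing task only needs a push to GitHub, while adding or removing a task changes the DAG and so needs the flow to be registered again, as described above.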
    Jazib Humayun

    1 year ago
    Thanks @Michael Adkins
    Hey, one more question: we were installing Python dependencies in Docker storage by passing them in python_dependencies. Can we do a similar thing with GitHub storage? I can't see python_dependencies in GitHub storage @Michael Adkins
    Michael Adkins

    1 year ago
    You'd install your Python dependencies in a Dockerfile that you manage separately and reference with your DockerRun. It's a bit more work than just passing them as a kwarg, but it gives you much clearer control over your container.
    Jazib Humayun

    1 year ago
    Michael Adkins

    1 year ago
    Ah I forgot we added that! It's a sweet feature 😄
    Jazib Humayun

    1 year ago
    Thanks for the help @Michael Adkins. I've got one more question for you: when we started using GitHub storage, we also had a prod.toml that we refer to in our Makefile like this:
    PREFECT__USER_CONFIG_PATH=prod.toml
    but because the make command behaves differently now that we use GitHub storage, our flows are throwing the following error:
    Failed to load and execute Flow's environment: BoxKeyError("'Config' object has no attribute 'snowflake'")
    config.snowflake should be coming from prod.toml. Can you tell me how I can refer to it in my flow?
    Brian Mesick

    1 year ago
    Hey Jazib, we’re actually discussing this over here… https://prefect-community.slack.com/archives/CL09KU1K7/p1621441872112000