Thread
#prefect-community
    Mars

    4 months ago
    Hi all, I’m having trouble diagnosing a GitHub storage problem. I’ve created a trivial testing flow similar to the example script-based workflow for GitHub. I’ve deployed a k8s agent using
    prefect agent kubernetes install
    . I’ve uploaded my flow to a private GitHub repo and registered it with Prefect. And I’ve added a Cloud Secret called
    GITHUB_ACCESS_TOKEN
    that holds a valid GitHub personal access token. When I run my flow the agent’s GitHub storage gives me an
    UnknownObjectException(404, 'Not Found')
    error. If I change the flow to use a different Cloud Secret key for the PAT, such as
    access_token_secret='MYKEY'
    , then the agent tells me
    ValueError('Local Secret "MYKEY" was not found.')
    . How can I introspect the kubernetes agent to verify that the GitHub PAT secret is being loaded from Prefect Cloud correctly?
    Kevin Kho

    4 months ago
    It’s weird that it’s looking for local secrets. Do you have an env variable
    PREFECT__CLOUD__USE_LOCAL_SECRETS
    set to True? I think the 404 means the repo couldn’t be found. Could you show me what your GitHub storage looks like?
    Do you have a private hosted version of Github by chance?
    Mars

    4 months ago
    I have not set the
    PREFECT__CLOUD__USE_LOCAL_SECRETS
    key. It’s set to whatever the default value is. I’m looking at the agent logs in the pod but there isn’t much there: it mentions the labels but doesn’t display any startup configuration.
    Kevin Kho

    4 months ago
    You can add it explicitly and set it to False, but the setup with
    GITHUB_ACCESS_TOKEN
    seems fine
    Mars

    4 months ago
    no, we don’t use hosted GitHub. I’m using a private repo. I verified the PAT by starting a clean docker container and cloning the repo there: git asked for the username and PAT as expected, so I assume the PAT works as expected.
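A clone test exercises git-over-HTTPS, while the GitHub storage error above comes from an API call, so it can help to test the PAT against the REST API directly. GitHub answers 404 rather than 403 when a token cannot see a private repo, which matches the error reported. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the repo and path are the ones from this thread.

```python
import urllib.error
import urllib.parse
import urllib.request

GITHUB_API = "https://api.github.com"


def build_contents_request(repo, path, token, ref=""):
    """Build a GET request for one file via the GitHub 'contents' endpoint."""
    url = f"{GITHUB_API}/repos/{repo}/contents/{path}"
    if ref:
        # Optional branch, tag, or commit to look the file up on.
        url += "?" + urllib.parse.urlencode({"ref": ref})
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def check_file_access(repo, path, token, ref=""):
    """Return the HTTP status GitHub gives for this file with this token."""
    request = build_contents_request(repo, path, token, ref)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 404 here can mean a wrong path or branch, or a token without repo access.
        return exc.code
```

Calling check_file_access("myorg/sandbox", "flows/check_repo_access.py", token) from the environment where the flow runs should return 200 if the token, repo name, and path all line up.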
    Kevin Kho

    4 months ago
    Can you show me what the Github storage definition looks like? Just redact any sensitive info
    Mars

    4 months ago
    storage = GitHub(
        repo="myorg/sandbox",
        path="flows/check_repo_access.py",
    )
    
    # Taken from the 'script-based flows' example
    with Flow("check-repo-access", storage=storage) as flow:
        data = get_data()
        print_value(data)
    Kevin Kho

    4 months ago
    Just making sure, you already uploaded the file at that path right? Do you have other branches in the repo?
    Mars

    4 months ago
    This gives me the ValueError:
    storage = GitHub(
        repo="myorg/sandbox",
        path="flows/check_repo_access.py",
        access_token_secret="GITHUB_ACCESS_TOKEN",
    )
    or this:
    storage = GitHub(
        repo="myorg/sandbox",
        path="flows/check_repo_access.py",
        access_token_secret="MYKEY",
    )
    yes, I have uploaded the code to the repo
    I’ve been trying to verify that the agent a) is loading the flow I want, and b) has access to the resources (storage, secrets) that the flow requires.
    does the agent have a debug log setting that may dump more information about its state?
    Kevin Kho

    4 months ago
    So I think this is a bit hard to debug. The first thing I can suggest is to speed up the testing by removing the need for registration. You can debug by doing:
    with Flow(..) as flow:
        ...
    
    storage =...
    flow.storage = ...
    
    storage.get_flow(flow_name)
    and this
    get_flow
    is what the agent calls. For agent settings, you can add
    --show-flow-logs
    and
    --log-level=DEBUG
    upon agent start
    Mars

    4 months ago
    This is even weirder: I tried to verify the access to Prefect Cloud Secrets using a trivial flow loaded from S3 storage. That flow prints out a trivial cloud secret,
    FOO
    . When I run that flow the agent gives me a pickle error about the flow running Python 3.7.13. However I deployed the agent image
    prefecthq/prefect:1.2.0-python3.9
    and I verified that the Python version in the agent is 3.9.12. Looks like a bug with the flow execution environment?
    import prefect
    from prefect import Flow, task
    from prefect.run_configs import UniversalRun
    from prefect.storage import S3
    from prefect.tasks.secrets import PrefectSecret
    
    
    @task
    def print_value(secret):
        logger = prefect.context.get("logger")
        logger.info(f"value: {secret}")
    
    
    with Flow(
            "print-cloud-secret",
            storage=S3(bucket="my-bucket"),
            run_config=UniversalRun(labels=[])) \
            as flow:
        s = PrefectSecret("FOO")
        print_value(s)
    
    
    if __name__ == "__main__":
        flow.run()
    └── 11:06:54 | INFO    | Entered state <Failed>: Failed to load and execute flow run: FlowStorageError("An error occurred while unpickling the flow:\n  TypeError('code() takes at most 15 arguments (16 given)')\nThis may be due to one of the following version mismatches between the flow build and execution environments:\n  - python: (flow built with '3.10.4', currently running with '3.7.13')")
    └── 11:06:54 | ERROR   | Failed to load and execute flow run: FlowStorageError("An error occurred while unpickling the flow:\n  TypeError('code() takes at most 15 arguments (16 given)')\nThis may be due to one of the following version mismatches between the flow build and execution environments:\n  - python: (flow built with '3.10.4', currently running with '3.7.13')")
    Kevin Kho

    4 months ago
    The agent image is not carried over to the Flow image. The Kubernetes, Docker, and ECS agents allow you to specify a container through the RunConfiguration. So if you don’t specify, 3.7 will be the default
    Mars

    4 months ago
    and I configured the flow with UniversalRun, so it used the default 3.7 image.
    Kevin Kho

    4 months ago
    It used
    prefecthq/prefect:latest
    which is 3.7 by default yep 🙂. You’d need to choose a differently tagged image (I am not sure we support 3.10)
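The unpickling error above names the two interpreter versions involved, so a pre-flight comparison of major.minor versions can catch the mismatch before a run is submitted. This is a minimal sketch with a hypothetical helper name; it assumes only that a pickled flow must be loaded by the same Python minor version that built it, as the error message suggests.

```python
import sys


def versions_compatible(built_with, running=None):
    """Return True when the build and runtime Python share major.minor."""
    if running is None:
        # Default to the interpreter executing this check.
        running = sys.version_info
    built = tuple(int(part) for part in built_with.split(".")[:2])
    return built == tuple(running)[:2]
```

With the versions from the log, versions_compatible("3.10.4", (3, 7, 13)) is False, while a flow built on 3.9 and run in the prefecthq/prefect:1.2.0-python3.9 image would pass.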
    Mars

    4 months ago
    fixed the “print-cloud-secret” flow. It runs, however it fails with
    ValueError: Local Secret "FOO" was not found.
    That means that the call to
    PrefectSecret("FOO")
    is not pulling values from the cloud vault. Why would that be?
    FOO
    is a valid key under https://cloud.prefect.io/team/secrets
    Kevin Kho

    4 months ago
    Your backend is cloud right?
    Are you using
    flow.run()
    somehow? Because
    flow.run()
    will attempt to pull locally
    Mars

    4 months ago
    the backend is cloud yes. How do I verify that?
    Kevin Kho

    4 months ago
    That should be the default unless you have an environment variable that changes it to server. Do you have a
    config.toml
    somewhere?
    Mars

    4 months ago
    I’m not using
    flow.run()
    except in an
    if __name__ == "__main__"
    block:
    with Flow(
            "print-cloud-secret",
            storage=S3(bucket="my-bucket"),
            run_config=UniversalRun(labels=[])) \
            as flow:
        s = PrefectSecret("FOO")
        print_value(s)
    
    
    if __name__ == "__main__":
        flow.run()
    Kevin Kho

    4 months ago
    But anyway you can force the cloud pull with
    PREFECT__CLOUD__USE_LOCAL_SECRETS=false
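The exact spelling of that variable matters: the double underscore separates config sections, while a single underscore is part of the key name. Below is a simplified model of that mapping, not Prefect's actual loader; the function names and the true/false parsing are assumptions made for illustration.

```python
ENV_PREFIX = "PREFECT__"
DELIMITER = "__"


def parse_value(raw):
    """Turn 'true'/'false' strings into booleans; leave other values alone."""
    lowered = raw.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    return raw


def env_to_config(environ):
    """Map PREFECT__SECTION__KEY style variables onto a nested dict."""
    config = {}
    for name, raw in environ.items():
        if not name.startswith(ENV_PREFIX):
            continue
        keys = name[len(ENV_PREFIX):].lower().split(DELIMITER)
        node = config
        for key in keys[:-1]:
            child = node.get(key)
            if not isinstance(child, dict):
                child = node[key] = {}
            node = child
        node[keys[-1]] = parse_value(raw)
    return config
```

In this model, PREFECT__CLOUD__USE_LOCAL_SECRETS=false lands at cloud.use_local_secrets, whereas a name with extra underscores would produce different keys and leave that setting untouched.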
    Mars

    4 months ago
    no, I do not use a config.toml
    Kevin Kho

    4 months ago
    Can you try KubernetesRun and see if it persists?
    with Flow(
            "print-cloud-secret",
            storage=S3(bucket="my-bucket"),
            run_config=KubernetesRun(image="...", labels=[])) \
            as flow:
        s = PrefectSecret("FOO")
        print_value(s)
    You are on Kubernetes right?
    Mars

    4 months ago
    sorry, outdated paste. here is what I tried:
    import prefect
    from prefect import Flow, task
    from prefect.run_configs import UniversalRun, KubernetesRun
    from prefect.storage import S3
    from prefect.tasks.secrets import PrefectSecret
    
    
    @task
    def print_value(secret):
        logger = prefect.context.get("logger")
        logger.info(f"value: {secret}")
    
    
    with Flow(
            "print-cloud-secret",
            storage=S3(bucket="my-bucket"),
            run_config=KubernetesRun(
                image="prefecthq/prefect:1.2.0-python3.9",
                image_pull_policy="IfNotPresent",
                labels=[])) \
            as flow:
        s = PrefectSecret("FOO")
        print_value(s)
    
    
    if __name__ == "__main__":
        flow.run()
    Kevin Kho

    4 months ago
    That is pretty weird, you can set the env var from here too to make it use Cloud secrets:
    with Flow(
            "print-cloud-secret",
            storage=S3(bucket="my-bucket"),
            run_config=KubernetesRun(
                image="prefecthq/prefect:1.2.0-python3.9",
                image_pull_policy="IfNotPresent",
                env={"PREFECT__CLOUD__USE_LOCAL_SECRETS": False},
                labels=[])) \
            as flow:
        s = PrefectSecret("FOO")
        print_value(s)
    Mars

    4 months ago
    Yes I am on kubernetes. And I am executing the flow from the command line with:
    prefect run --name print-cloud-secret --watch
    I verified that the job was run by the agent by checking the agent logs. I can see the failed run on the Kubernetes Agent dashboard in the Cloud UI.
    I tried explicitly setting the value
    env={"PREFECT__CLOUD__USE_LOCAL_SECRETS": False}
    from the run_config. Still fails saying the local secret was not found.
    Kevin Kho

    4 months ago
    Ok let me double check some stuff
    Wait sorry I’m confused. Why is the storage S3 there? Is that just wrong?
    I also see now that Github storage will fall back to local if the Cloud pull fails here. So it looks like the Secret pull is failing in general. Wondering what the S3 usage is there
    Mars

    4 months ago
    I first asked for help with GitHub storage, because it was failing. I couldn’t verify that the agent had access to cloud secrets using the agent logs, so to narrow down the problem and verify the cloud secret pull I made a second trivial flow that pulls a cloud secret, using known-working S3 storage. That second, more trivial script with S3 storage also fails on the cloud secret pull step.
    If I can’t pull cloud secrets using a flow with S3 storage, then I certainly can’t pull cloud secrets to get GitHub storage working, either.
    Does that help?
    Kevin Kho

    4 months ago
    Yeah, can you try doing
    from prefect.client import Secret
    Secret("GITHUB_ACCESS_TOKEN").get()
    inside the pod?
    Mars

    4 months ago
    the agent pod?
    from the agent pod:
    root@prefect-agent-69688996d-qltbc:/# python
    Python 3.9.12 (main, Mar 29 2022, 14:20:48) 
    [GCC 10.2.1 20210110] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from prefect.client import Secret
    >>> Secret("GITHUB_ACCESS_TOKEN").get()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python3.9/site-packages/prefect/client/secrets.py", line 140, in get
        raise ValueError(
    ValueError: Local Secret "GITHUB_ACCESS_TOKEN" was not found.
    Kevin Kho

    4 months ago
    Ah sorry let’s add the env var to pull from Cloud. It’s injected when you run the flow:
    import os
    os.environ["PREFECT__CLOUD__USE_LOCAL_SECRETS"] = "false"
    
    from prefect.client import Secret
    Secret("GITHUB_ACCESS_TOKEN").get()
    Mars

    4 months ago
    Same error
    >>> import os
    >>> os.environ["PREFECT__CLOUD__USE_LOCAL_SECRETS"] = "false"
    >>> 
    >>> from prefect.client import Secret
    >>> Secret("GITHUB_ACCESS_TOKEN").get()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python3.9/site-packages/prefect/client/secrets.py", line 140, in get
        raise ValueError(
    ValueError: Local Secret "GITHUB_ACCESS_TOKEN" was not found.
    Kevin Kho

    4 months ago
    Testing myself one sec
    I can’t figure out why it’s still pulling locally. If the secret doesn’t exist and you are pointed to cloud, it would be:
    KeyError: 'The secret SECRET_NAME was not found.  Please ensure that it was set correctly in your tenant: https://docs.prefect.io/orchestration/concepts/secrets.html'
    It should only go there if you are not configured to hit Cloud as a backend
    How about this?
    import os
    os.environ["PREFECT__CLOUD__USE_LOCAL_SECRETS"] = "false"
    os.environ["PREFECT__BACKEND"] = "cloud"
    
    from prefect.client import Secret
    print(Secret("GCP_CREDENTIALS").get())
    Mars

    4 months ago
    That explains it:
    >>> os.environ.get("PREFECT__BACKEND")
    'server'
    >>>
    Kevin Kho

    4 months ago
    I think….we can fix all our problems now 😅. Set the backend to Cloud. How did the agent pick up the Flow? It picked it up from server?
    Mars

    4 months ago
    However, setting the value to “cloud” in os.environ doesn’t work. I think I’ll have to redeploy the agent.
    >>> os.environ.get("PREFECT__BACKEND")
    'server'
    >>> import os
    >>> os.environ["PREFECT__CLOUD__USE_LOCAL_SECRETS"] = "false"
    >>> os.environ["PREFECT__BACKEND"] = "cloud"
    >>> 
    >>> from prefect.client import Secret
    >>> print(Secret("GCP_CREDENTIALS").get())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python3.9/site-packages/prefect/client/secrets.py", line 140, in get
        raise ValueError(
    ValueError: Local Secret "GCP_CREDENTIALS" was not found.
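One plausible reason the in-session change had no effect, assuming the library builds its configuration once when it is first imported and prefect had already been imported in this interpreter: later edits to os.environ never reach the settings object that was already created. The sketch below illustrates that pattern with a hypothetical Settings class; it is not Prefect's code, and the "cloud" default is taken from the discussion above.

```python
import os


class Settings:
    """Snapshot of environment-driven settings, read once at construction."""

    def __init__(self, environ=None):
        environ = os.environ if environ is None else environ
        # The value is copied here; later changes to environ are not seen.
        self.backend = environ.get("PREFECT__BACKEND", "cloud")


# Simulate a process that started with the stale value.
env = {"PREFECT__BACKEND": "server"}
settings = Settings(env)

# Changing the environment afterwards does not touch the existing snapshot.
env["PREFECT__BACKEND"] = "cloud"
print(settings.backend)       # server
print(Settings(env).backend)  # cloud
```

Under that assumption, the new value only takes effect in a fresh process, which is consistent with needing to redeploy the agent.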
    Kevin Kho

    4 months ago
    Yeah let’s try that
    Mars

    4 months ago
    it picked up the flow from cloud as far as I’m aware. is there debug output from
    prefect run
    that I can verify with?
    Kevin Kho

    4 months ago
    I don’t think so but you could just check the Cloud UI and see the runs you’re expecting?
    Mars

    4 months ago
    yes, the runs are in the cloud UI
    Kevin Kho

    4 months ago
    That’s very weird. I am guessing you have something like:
    PREFECT__BACKEND="server"
    PREFECT__SERVER__ENDPOINT="api.prefect.io"
    which is why it works.
    But the Secret code uses the backend setting to decide whether to pull from Cloud or locally, so that specifically fails
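The behavior described here can be modeled in a few lines. This is a simplified sketch of the decision as explained in this thread, not the Prefect source; the function and parameter names are hypothetical, and the error text mirrors the message seen in the tracebacks above.

```python
from typing import Callable, Mapping


def resolve_secret(
    name: str,
    backend: str,
    use_local_secrets: bool,
    local_secrets: Mapping[str, str],
    fetch_from_cloud: Callable[[str], str],
) -> str:
    """Only a 'cloud' backend with local secrets disabled queries Cloud."""
    if backend == "cloud" and not use_local_secrets:
        return fetch_from_cloud(name)
    try:
        return local_secrets[name]
    except KeyError:
        # Any other backend falls through to the local lookup and its error.
        raise ValueError(f'Local Secret "{name}" was not found.') from None
```

In this model a "server" backend never reaches the Cloud lookup, regardless of the use_local_secrets flag, which matches the symptom of the flag appearing to be ignored.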
    Mars

    4 months ago
    so I think what happened is that I ran
    prefect agent kubernetes install
    before I had run
    prefect backend cloud
    to switch it over. That
    server
    setting ended up in the manifest that I deployed the agent with. I switched the backend on the CLI later but I didn’t regenerate the manifest from scratch.
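A stale setting like this can be spotted by inspecting the manifest before applying it. The sketch below assumes the generated manifest lists environment variables as name/value pairs in YAML; the sample text and the helper name are illustrative, not copied from a real manifest.

```python
import re

# Matches a YAML env entry of the form:
#   - name: PREFECT__BACKEND
#     value: server
BACKEND_PATTERN = re.compile(
    r"name:\s*PREFECT__BACKEND\s+value:\s*[\"']?([\w-]+)"
)


def backend_in_manifest(manifest_text):
    """Return the PREFECT__BACKEND value found in the manifest, or None."""
    match = BACKEND_PATTERN.search(manifest_text)
    return match.group(1) if match else None


SAMPLE_MANIFEST = """
        env:
        - name: PREFECT__CLOUD__AGENT__LABELS
          value: '[]'
        - name: PREFECT__BACKEND
          value: server
"""

print(backend_in_manifest(SAMPLE_MANIFEST))  # server
```

If the value printed is not the backend you expect, regenerating the manifest after running prefect backend cloud is the fix described above.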
    Kevin Kho

    4 months ago
    Ahh I see. Were you able to figure things out?
    Mars

    4 months ago
    now it’s working. thanks! may I suggest putting a check on the UI side for agent configuration parameters and warning if a
    server
    agent is in the Cloud UI? A big red warning label to say
    your agent is probably misconfigured
    would have saved a lot of time.
    or comparing the API url: if I have a
    server
    backend and
    cloud.prefect.io
    API URL then something is probably wrong
    and that could be printed in the agent startup logs or the Cloud UI
    (or both)
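The suggested check could look roughly like this. It is a hypothetical sketch of the idea, not an existing Prefect feature; the function name and the warning text are made up for illustration.

```python
from typing import Optional
from urllib.parse import urlparse


def backend_mismatch_warning(backend: str, api_url: str) -> Optional[str]:
    """Return a warning when a 'server' backend points at a prefect.io host."""
    # Accept bare hosts such as "api.prefect.io" as well as full URLs.
    parsed = urlparse(api_url if "://" in api_url else "//" + api_url)
    host = (parsed.hostname or "").lower()
    points_at_cloud = host == "prefect.io" or host.endswith(".prefect.io")
    if backend == "server" and points_at_cloud:
        return (
            f"backend is 'server' but the API URL host is {host}: "
            "your agent is probably misconfigured"
        )
    return None
```

An agent could run such a check at startup and log the returned message, or the UI could surface it next to the agent entry.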
    Kevin Kho

    4 months ago
    Nice work and thanks for the patience! That was roundabout. I can certainly open an issue for it, but to be honest this is the first time I’ve seen it happen so it’s not a high priority thing 😅. But I can open it and it’s a good first issue for someone to pick up 🙂