#prefect-community
m

Mars

05/03/2022, 2:29 PM
Hi all, I’m having trouble diagnosing a GitHub storage problem. I’ve created a trivial test flow similar to the example script-based workflow for GitHub. I’ve deployed a k8s agent using `prefect agent kubernetes install`. I’ve uploaded my flow to a private GitHub repo and registered it with Prefect, and I’ve added a Cloud Secret called `GITHUB_ACCESS_TOKEN` that holds a valid GitHub personal access token. When I run my flow, the agent’s GitHub storage gives me an `UnknownObjectException(404, 'Not Found')` error. If I change the flow to use a different Cloud Secret key for the PAT, such as `access_token_secret='MYKEY'`, then the agent tells me `ValueError('Local Secret "MYKEY" was not found.')`. How can I introspect the Kubernetes agent to verify that the GitHub PAT secret is being loaded from Prefect Cloud correctly?
k

Kevin Kho

05/03/2022, 2:34 PM
It’s weird that it’s looking for local secrets. Do you have an env variable `PREFECT__CLOUD__USE_LOCAL_SECRETS` set to True? I think the 404 means the repo couldn’t be found. Could you show me what your GitHub storage looks like?
Do you have a privately hosted version of GitHub by chance?
m

Mars

05/03/2022, 2:38 PM
I have not set the `PREFECT__CLOUD__USE_LOCAL_SECRETS` key. It’s set to whatever the default value is. I’m looking at the agent logs in the pod but there isn’t much there: it mentions the labels but doesn’t display any startup configuration.
k

Kevin Kho

05/03/2022, 2:40 PM
You can add it explicitly and set it to False, but the setup with `GITHUB_ACCESS_TOKEN` seems fine.
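For reference, Prefect 1.x maps environment variables onto its nested config by treating the `PREFECT__` prefix and each double underscore as a separator. The helper below is a minimal sketch of that naming rule only; `env_var_to_config_key` is a hypothetical name, not part of Prefect, and it does not read or modify any real configuration.

```python
from typing import Optional


def env_var_to_config_key(name: str, prefix: str = "PREFECT") -> Optional[str]:
    """Translate an env var name into a dotted config key.

    Hypothetical helper for illustration: PREFECT__CLOUD__USE_LOCAL_SECRETS
    becomes cloud.use_local_secrets. Returns None when the name does not
    follow the PREFIX__SECTION__KEY pattern.
    """
    marker = prefix + "__"
    if not name.startswith(marker):
        return None
    parts = name[len(marker):].split("__")
    # An empty part means stray underscores, e.g. a triple underscore.
    if any(part == "" for part in parts):
        return None
    return ".".join(part.lower() for part in parts)


if __name__ == "__main__":
    print(env_var_to_config_key("PREFECT__CLOUD__USE_LOCAL_SECRETS"))
    print(env_var_to_config_key("PREFECT__BACKEND"))
```

A single underscore stays inside a key name (`use_local_secrets`), while a double underscore moves one level deeper in the config.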
m

Mars

05/03/2022, 2:40 PM
no, we don’t use hosted GitHub. I’m using a private repo. I verified the PAT by starting a clean docker container and cloning the repo there: git asked for the username and PAT as expected, so I assume the PAT works as expected.
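A clone test confirms the PAT works with git, but the `UnknownObjectException(404, 'Not Found')` comes from the GitHub REST API, which also answers 404 when a private repo or file is not visible to the credentials it was given. The sketch below builds the equivalent "get repository content" request so the same PAT can be checked against the API directly. The function names are hypothetical, and the repo and path in the usage lines are the redacted values from this thread.

```python
import urllib.error
import urllib.request
from typing import Optional


def build_contents_request(
    repo: str, path: str, token: str, ref: Optional[str] = None
) -> urllib.request.Request:
    """Build a GitHub REST 'get repository content' request for one file."""
    url = f"https://api.github.com/repos/{repo}/contents/{path}"
    if ref:
        # Optional branch, tag, or commit SHA to look in.
        url += f"?ref={ref}"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def http_status(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code (needs network)."""
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code


if __name__ == "__main__":
    req = build_contents_request(
        "myorg/sandbox", "flows/check_repo_access.py", token="<your PAT>"
    )
    print(req.full_url)
    # print(http_status(req))  # uncomment to hit the API with a real token
```

A 200 here with the real token suggests the repo, path, and PAT are fine and the problem lies in how the token reaches the flow run.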
k

Kevin Kho

05/03/2022, 2:40 PM
Can you show me what the GitHub storage definition looks like? Just redact any sensitive info
m

Mars

05/03/2022, 2:42 PM
storage = GitHub(
    repo="myorg/sandbox",
    path="flows/check_repo_access.py",
)

# Taken from the 'script-based flows' example
with Flow("check-repo-access", storage=storage) as flow:
    data = get_data()
    print_value(data)
k

Kevin Kho

05/03/2022, 2:44 PM
Just making sure, you already uploaded the file at that path right? Do you have other branches in the repo?
m

Mars

05/03/2022, 2:44 PM
This gives me the `ValueError`:
storage = GitHub(
    repo="myorg/sandbox",
    path="flows/check_repo_access.py",
    access_token_secret="GITHUB_ACCESS_TOKEN",
)
or this:
storage = GitHub(
    repo="myorg/sandbox",
    path="flows/check_repo_access.py",
    access_token_secret="MYKEY",
)
yes, I have uploaded the code to the repo
I’ve been trying to verify that the agent is: a) loading the flow I want, and b) has access to the resources (storage, secrets) that the flow requires.
does the agent have a debug log setting that may dump more information about its state?
k

Kevin Kho

05/03/2022, 2:49 PM
So I think this is a bit hard to debug. The first thing I can suggest is to speed up the testing by removing the need for registration. You can debug by doing:
with Flow(..) as flow:
    ...

storage =...
flow.storage = ...

storage.get_flow(flow_name)
and this `get_flow` is what the agent calls. For agent settings, you can add `--show-flow-logs` and `--log-level=DEBUG` upon agent start
m

Mars

05/03/2022, 3:17 PM
This is even weirder: I tried to verify the access to Prefect Cloud Secrets using a trivial flow loaded from S3 storage. That flow prints out a trivial cloud secret, `FOO`. When I run that flow the agent gives me a pickle error about the flow running Python 3.7.13. However, I deployed the agent image `prefecthq/prefect:1.2.0-python3.9` and I verified that the Python version in the agent is 3.9.12. Looks like a bug with the flow execution environment?
import prefect
from prefect import Flow, task
from prefect.run_configs import UniversalRun
from prefect.storage import S3
from prefect.tasks.secrets import PrefectSecret


@task
def print_value(secret):
    logger = prefect.context.get("logger")
    logger.info(f"value: {secret}")


with Flow(
        "print-cloud-secret",
        storage=S3(bucket="my-bucket"),
        run_config=UniversalRun(labels=[])) \
        as flow:
    s = PrefectSecret("FOO")
    print_value(s)


if __name__ == "__main__":
    flow.run()
└── 11:06:54 | INFO    | Entered state <Failed>: Failed to load and execute flow run: FlowStorageError("An error occurred while unpickling the flow:\n  TypeError('code() takes at most 15 arguments (16 given)')\nThis may be due to one of the following version mismatches between the flow build and execution environments:\n  - python: (flow built with '3.10.4', currently running with '3.7.13')")
└── 11:06:54 | ERROR   | Failed to load and execute flow run: FlowStorageError("An error occurred while unpickling the flow:\n  TypeError('code() takes at most 15 arguments (16 given)')\nThis may be due to one of the following version mismatches between the flow build and execution environments:\n  - python: (flow built with '3.10.4', currently running with '3.7.13')")
k

Kevin Kho

05/03/2022, 3:19 PM
The agent image is not carried over to the Flow image. The Kubernetes, Docker, and ECS agents allow you to specify a container through the RunConfiguration. So if you don’t specify, 3.7 will be the default
🙏 1
m

Mars

05/03/2022, 3:24 PM
and I configured the flow with UniversalRun, so it used the default 3.7 image.
k

Kevin Kho

05/03/2022, 3:24 PM
It used `prefecthq/prefect:latest`, which is 3.7 by default, yep 🙂. You’d need to choose a differently tagged image (I am not sure we support 3.10)
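The `code() takes at most 15 arguments (16 given)` error is what a pickled function looks like when it crosses Python minor versions: the flow was pickled under 3.10.4, where code objects take more constructor arguments than the 3.7.13 interpreter accepts. The sketch below compares the major and minor parts of two version strings, roughly the check the `FlowStorageError` message reports. The function names are hypothetical and this is not Prefect's implementation.

```python
import sys
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '3.10.4'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def describe_mismatch(built_with: str, running: Optional[str] = None) -> Optional[str]:
    """Return a warning if the build and run interpreters differ in major.minor.

    When `running` is omitted, the current interpreter's version is used.
    """
    if running is None:
        running = ".".join(str(part) for part in sys.version_info[:3])
    if major_minor(built_with) == major_minor(running):
        return None
    return f"flow built with Python {built_with}, but running on {running}"


if __name__ == "__main__":
    # Versions taken from the error message in this thread.
    print(describe_mismatch("3.10.4", "3.7.13"))
```

Patch-level differences (3.9.12 versus 3.9.1, say) are treated as compatible here, which is an assumption of this sketch.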
m

Mars

05/03/2022, 3:30 PM
fixed the “print-cloud-secret” flow. It runs, however it fails with `ValueError: Local Secret "FOO" was not found.` That means that the call to `PrefectSecret("FOO")` is not pulling values from the cloud vault. Why would that be? `FOO` is a valid key under https://cloud.prefect.io/team/secrets
k

Kevin Kho

05/03/2022, 3:31 PM
Your backend is cloud right?
Are you using `flow.run()` somehow? Because `flow.run()` will attempt to pull locally
m

Mars

05/03/2022, 3:31 PM
the backend is cloud yes. How do I verify that?
k

Kevin Kho

05/03/2022, 3:32 PM
That should be the default unless you have an environment variable that changes it to server. Do you have a `config.toml` somewhere?
m

Mars

05/03/2022, 3:32 PM
I’m not using `flow.run()` except in an `if __name__ == "__main__"` block:
with Flow(
        "print-cloud-secret",
        storage=S3(bucket="my-bucket"),
        run_config=UniversalRun(labels=[])) \
        as flow:
    s = PrefectSecret("FOO")
    print_value(s)


if __name__ == "__main__":
    flow.run()
k

Kevin Kho

05/03/2022, 3:32 PM
But anyway you can force the cloud pull with
PREFECT__CLOUD__USE_LOCAL_SECRETS=false
m

Mars

05/03/2022, 3:33 PM
no, I do not use a config.toml
k

Kevin Kho

05/03/2022, 3:33 PM
Can you try KubernetesRun and see if it persists?
with Flow(
        "print-cloud-secret",
        storage=S3(bucket="my-bucket"),
        run_config=KubernetesRun(image="...", labels=[])) \
        as flow:
    s = PrefectSecret("FOO")
    print_value(s)
You are on Kubernetes right?
m

Mars

05/03/2022, 3:34 PM
sorry, outdated paste. here is what I tried:
import prefect
from prefect import Flow, task
from prefect.run_configs import UniversalRun, KubernetesRun
from prefect.storage import S3
from prefect.tasks.secrets import PrefectSecret


@task
def print_value(secret):
    logger = prefect.context.get("logger")
    logger.info(f"value: {secret}")


with Flow(
        "print-cloud-secret",
        storage=S3(bucket="my-bucket"),
        run_config=KubernetesRun(
            image="prefecthq/prefect:1.2.0-python3.9",
            image_pull_policy="IfNotPresent",
            labels=[])) \
        as flow:
    s = PrefectSecret("FOO")
    print_value(s)


if __name__ == "__main__":
    flow.run()
k

Kevin Kho

05/03/2022, 3:37 PM
That is pretty weird, you can set the env var from here too to make it use Cloud secrets:
with Flow(
        "print-cloud-secret",
        storage=S3(bucket="my-bucket"),
        run_config=KubernetesRun(
            image="prefecthq/prefect:1.2.0-python3.9",
            image_pull_policy="IfNotPresent",
            env={"PREFECT__CLOUD__USE_LOCAL_SECRETS": False},
            labels=[])) \
        as flow:
    s = PrefectSecret("FOO")
    print_value(s)
m

Mars

05/03/2022, 3:38 PM
Yes I am on kubernetes. And I am executing the flow from the command line with:
prefect run --name print-cloud-secret --watch
I verified that the job was run by the agent by checking the agent logs. I can see the failed run on the Kubernetes Agent dashboard in the Cloud UI.
I tried explicitly setting the value `env={"PREFECT__CLOUD__USE_LOCAL_SECRETS": False}` from the run_config. Still fails saying the local secret was not found.
k

Kevin Kho

05/03/2022, 3:42 PM
Ok let me double check some stuff
Wait sorry I’m confused. Why is the storage S3 there? Is that just wrong?
I also see now that GitHub storage will fall back to local if the Cloud pull fails here. So it looks like the Secret pull is failing in general. Wondering what the S3 usage is there
m

Mars

05/03/2022, 3:51 PM
I first asked for help with GitHub storage, because it was failing. I couldn’t verify that the agent had access to cloud secrets using the agent logs, so to narrow down the problem and verify the cloud secret pull I made a second trivial flow that pulls a cloud secret, using known-working S3 storage. That second, more trivial script with S3 storage also fails on the cloud secret pull step.
If I can’t pull cloud secrets using a flow with S3 storage, then I certainly can’t pull cloud secrets to get GitHub storage working, either.
Does that help?
k

Kevin Kho

05/03/2022, 3:56 PM
Yeah, can you try doing
from prefect.client import Secret
Secret("GITHUB_ACCESS_TOKEN").get()
inside the pod?
m

Mars

05/03/2022, 3:56 PM
the agent pod?
from the agent pod:
root@prefect-agent-69688996d-qltbc:/# python
Python 3.9.12 (main, Mar 29 2022, 14:20:48) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from prefect.client import Secret
>>> Secret("GITHUB_ACCESS_TOKEN").get()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/prefect/client/secrets.py", line 140, in get
    raise ValueError(
ValueError: Local Secret "GITHUB_ACCESS_TOKEN" was not found.
k

Kevin Kho

05/03/2022, 4:02 PM
Ah sorry let’s add the env var to pull from Cloud. It’s injected when you run the flow:
import os
os.environ["PREFECT__CLOUD__USE_LOCAL_SECRETS"] = "false"

from prefect.client import Secret
Secret("GITHUB_ACCESS_TOKEN").get()
m

Mars

05/03/2022, 4:04 PM
Same error
>>> import os
>>> os.environ["PREFECT__CLOUD__USE_LOCAL_SECRETS"] = "false"
>>> 
>>> from prefect.client import Secret
>>> Secret("GITHUB_ACCESS_TOKEN").get()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/prefect/client/secrets.py", line 140, in get
    raise ValueError(
ValueError: Local Secret "GITHUB_ACCESS_TOKEN" was not found.
k

Kevin Kho

05/03/2022, 4:04 PM
Testing myself one sec
I can’t figure out why it’s still pulling locally. If the secret doesn’t exist and you are pointed to cloud, it would be:
KeyError: 'The secret SECRET_NAME was not found.  Please ensure that it was set correctly in your tenant: https://docs.prefect.io/orchestration/concepts/secrets.html'
It should only go there if you are not configured to hit Cloud as a backend
How about this?
import os
os.environ["PREFECT__CLOUD__USE_LOCAL_SECRETS"] = "false"
os.environ["PREFECT__BACKEND"] = "cloud"

from prefect.client import Secret
print(Secret("GCP_CREDENTIALS").get())
m

Mars

05/03/2022, 4:14 PM
That explains it:
>>> os.environ.get("PREFECT__BACKEND")
'server'
>>>
k

Kevin Kho

05/03/2022, 4:16 PM
I think….we can fix all our problems now 😅. Set the backend to Cloud. How did the agent pick up the Flow? It picked it up from server?
m

Mars

05/03/2022, 4:16 PM
However, setting the value to “cloud” in os.environ doesn’t work. I think I’ll have to redeploy the agent.
>>> os.environ.get("PREFECT__BACKEND")
'server'
>>> import os
>>> os.environ["PREFECT__CLOUD__USE_LOCAL_SECRETS"] = "false"
>>> os.environ["PREFECT__BACKEND"] = "cloud"
>>> 
>>> from prefect.client import Secret
>>> print(Secret("GCP_CREDENTIALS").get())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/prefect/client/secrets.py", line 140, in get
    raise ValueError(
ValueError: Local Secret "GCP_CREDENTIALS" was not found.
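One possible reason the override had no effect, assuming `prefect` had already been imported earlier in the same interpreter session (the transcript does not confirm this): Prefect 1.x builds its configuration when the package is first imported, so later changes to `os.environ` are not picked up by the already-loaded config. The sketch below reproduces that pattern with a stand-in `Settings` class and a made-up `DEMO__BACKEND` variable; neither is part of Prefect.

```python
import os
from typing import Tuple


class Settings:
    """Stand-in for a config object that reads the environment only once."""

    def __init__(self) -> None:
        # The value is captured at construction time and never refreshed.
        self.backend = os.environ.get("DEMO__BACKEND", "cloud")


def demo() -> Tuple[str, str]:
    os.environ["DEMO__BACKEND"] = "server"
    settings = Settings()                  # snapshot taken here
    os.environ["DEMO__BACKEND"] = "cloud"  # changed after the snapshot
    stale = settings.backend               # still the old value
    fresh = Settings().backend             # a new snapshot sees the change
    del os.environ["DEMO__BACKEND"]        # leave the environment clean
    return stale, fresh


if __name__ == "__main__":
    print(demo())
```

Under that assumption, the variables would need to be set before the first `import prefect`, for example in a fresh Python process or in the pod spec itself.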
k

Kevin Kho

05/03/2022, 4:16 PM
Yeah let’s try that
m

Mars

05/03/2022, 4:17 PM
it picked up the flow from cloud as far as I’m aware. is there debug output from `prefect run` that I can verify with?
k

Kevin Kho

05/03/2022, 4:17 PM
I don’t think so but you could just check the Cloud UI and see the runs you’re expecting?
m

Mars

05/03/2022, 4:18 PM
yes, the runs are in the cloud UI
k

Kevin Kho

05/03/2022, 4:22 PM
That’s very weird. I am guessing you have something like:
PREFECT__BACKEND="server"
PREFECT__SERVER__ENDPOINT="api.prefect.io"
which is why it works.
But the code for the secret uses the backend setting to decide whether to pull from Cloud or locally, so that part specifically fails
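The behavior described in this thread can be modeled as a small decision function. This is a simplified sketch based only on the error messages and explanation above, not Prefect's source. `resolve_secret` and its parameters are hypothetical, and `fetch_from_cloud` stands in for the API call.

```python
from typing import Callable, Dict, Optional


def resolve_secret(
    name: str,
    backend: str,
    use_local_secrets: bool,
    local_secrets: Dict[str, str],
    fetch_from_cloud: Callable[[str], Optional[str]],
) -> str:
    """Simplified model of the local-versus-Cloud secret lookup."""
    # Secrets available locally (for example from context) are used first.
    if name in local_secrets:
        return local_secrets[name]
    # With a non-cloud backend, the Cloud lookup is never attempted,
    # regardless of the use_local_secrets setting.
    if backend != "cloud":
        raise ValueError(f'Local Secret "{name}" was not found.')
    if use_local_secrets:
        raise ValueError(f'Local Secret "{name}" was not found.')
    value = fetch_from_cloud(name)
    if value is None:
        raise KeyError(f"The secret {name} was not found.")
    return value


if __name__ == "__main__":
    fake_cloud = {"FOO": "bar"}.get
    print(resolve_secret("FOO", "cloud", False, {}, fake_cloud))
    try:
        resolve_secret("FOO", "server", False, {}, fake_cloud)
    except ValueError as exc:
        print(exc)
```

In this model a `server` backend produces the "Local Secret" error even when `use_local_secrets` is false, which matches what the agent pod showed.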
m

Mars

05/03/2022, 5:06 PM
so I think what happened is that I ran `prefect agent kubernetes install` before I had run `prefect backend cloud` to switch it over. That `server` setting ended up in the manifest that I deployed the agent with. I switched the backend on the CLI later but I didn’t regenerate the manifest from scratch.
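A stale setting like this can be spotted before deploying by scanning the generated manifest for its environment entries. The sketch below assumes the manifest lists container env vars as `- name: ...` followed by `value: ...` on the next line; it uses a plain regex rather than a YAML parser, and the sample text is illustrative, not real `prefect agent kubernetes install` output.

```python
import re
from typing import Dict

# Matches "- name: X" followed on the next line by "value: Y".
ENV_ENTRY = re.compile(
    r"-[ \t]*name:[ \t]*(?P<name>\S+)[ \t]*\n[ \t]*value:[ \t]*(?P<value>.*)"
)


def manifest_env(manifest_text: str) -> Dict[str, str]:
    """Collect name/value env pairs from a Kubernetes manifest's text."""
    return {
        match.group("name"): match.group("value").strip().strip("'\"")
        for match in ENV_ENTRY.finditer(manifest_text)
    }


SAMPLE = """\
        env:
        - name: PREFECT__CLOUD__API
          value: https://api.prefect.io
        - name: PREFECT__BACKEND
          value: server
"""

if __name__ == "__main__":
    env = manifest_env(SAMPLE)
    if env.get("PREFECT__BACKEND") != "cloud":
        print("manifest backend is", env.get("PREFECT__BACKEND"))
```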
k

Kevin Kho

05/03/2022, 5:08 PM
Ahh I see. Were you able to figure things out?
m

Mars

05/03/2022, 5:14 PM
now it’s working. thanks! may I suggest putting a check on the UI side for agent configuration parameters and warning if a `server` agent is in the Cloud UI? A big red warning label to say “your agent is probably misconfigured” would have saved a lot of time.
or comparing the API URL: if I have a `server` backend and a `cloud.prefect.io` API URL then something is probably wrong
and that could be printed in the agent startup logs or the Cloud UI
(or both)
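The suggested check could look something like the sketch below. It is a hypothetical illustration of the idea, not an existing Prefect feature. `looks_misconfigured` and the `CLOUD_HOSTS` set are assumptions made for the example.

```python
from urllib.parse import urlparse

# Hostnames treated as Prefect Cloud for this sketch (an assumption).
CLOUD_HOSTS = {"api.prefect.io", "cloud.prefect.io"}


def looks_misconfigured(backend: str, api_url: str) -> bool:
    """Return True when a non-cloud backend points at a Cloud hostname."""
    # urlparse only fills in hostname when the URL has a scheme or leading "//".
    normalized = api_url if "://" in api_url else "//" + api_url
    host = urlparse(normalized).hostname
    return backend != "cloud" and host in CLOUD_HOSTS


if __name__ == "__main__":
    if looks_misconfigured("server", "https://api.prefect.io"):
        print("warning: your agent is probably misconfigured")
```

An agent could run a check like this at startup and log the warning, which covers the "agent startup logs" half of the suggestion.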
k

Kevin Kho

05/03/2022, 5:16 PM
Nice work and thanks for the patience! That was roundabout. I can certainly open an issue for it, but to be honest this is the first time I’ve seen it happen so it’s not a high priority thing 😅. But I can open it and it’s a good first issue for someone to pick up 🙂