Hi all, I wonder can someone help me make sense of...
# ask-community
j
Hi all, I wonder if someone can help me make sense of some errors I am encountering with GCP auth. I am trying to store GCP credentials on the agent, as described here, so that my tasks are able to use Google Storage/BigQuery. I am setting PREFECT__CONTEXT__SECRETS__GCP_CREDENTIALS on the agent via a Helm chart to a string containing the JSON credentials. This seems to propagate GOOGLE_APPLICATION_CREDENTIALS to each Prefect job, and the creds are different to what I set on the agent, but this variable is set to the JSON contents rather than to a path in the container that holds the credentials. This causes errors in the Prefect Google utilities and the Google API in Python. I can hack a fix for this by running something like the following, but I am wondering whether this is expected behaviour or whether I am setting up the agent incorrectly.
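import os
import tempfile

import google.auth

# GOOGLE_APPLICATION_CREDENTIALS holds the JSON contents here, not a path,
# so write the contents to a file that persists and point the variable at it.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as creds:
    creds.write(os.environ["GOOGLE_APPLICATION_CREDENTIALS"])
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = creds.name
google.auth.default()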
k
Hey @John Lee, will try this myself and check with the team on this
Are you on server or Cloud?
So when Prefect gets the Client, it passes the credentials from `GCP_CREDENTIALS`. It would only default to `GOOGLE_APPLICATION_CREDENTIALS` if this is missing (lines 32-33 here). So if you provide the secret, I don’t think it should hit that, unless you use another client inside your script or maybe use your own GCP task?
If you have to provide both your `GCP_CREDENTIALS` and the `GOOGLE_APPLICATION_CREDENTIALS` as a file, that does seem weird, and I would honestly just go the `GOOGLE_APPLICATION_CREDENTIALS` route, because all the Prefect code in the task library just uses the Google `Client` anyway, which will fall back to that. No sense in using both, for sure.
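The lookup order described above can be pictured with a small stand-alone sketch. It does not import Prefect or any Google library; `pick_credentials_source` and the strings it returns are hypothetical names used only for illustration, and the real logic lives in Prefect's GCP utility code and in Google's default credential lookup.

```python
import os
from typing import Mapping


def pick_credentials_source(
    secrets: Mapping[str, object],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return which credential source would win, in the order described in the thread."""
    # Prefect's GCP tasks look at the GCP_CREDENTIALS secret first.
    if secrets.get("GCP_CREDENTIALS") is not None:
        return "GCP_CREDENTIALS secret"
    # Without the secret, the plain Google client is used, which reads
    # GOOGLE_APPLICATION_CREDENTIALS as a path to a key file.
    if environ.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "GOOGLE_APPLICATION_CREDENTIALS file"
    # Simplification: Google's default lookup has further fallbacks not modelled here.
    return "other Google default lookups"
```

Calling it with a populated secrets mapping returns the secret branch regardless of the environment, which is why setting both should not be necessary for Prefect's own GCP tasks.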
v
Hi @Kevin Kho, similar situation as above, but I wanted to check something. If the `PREFECT__CONTEXT__SECRETS__GCP_CREDENTIALS` variable is set in the Helm chart for the Kubernetes agent, does it also need to be set directly in the agent start args (e.g. `prefect agent kubernetes start -e PREFE...GCP_CREDENTIALS=$...`), or will it be recognized directly as a secret?
k
Do you mean specifying with `--env PREFECT__CONTEXT__SECRETS__GCP_CREDENTIALS=…`?
v
yes
I thought that was the case (grabbed directly from the env), but I couldn't spot the values when executing `credentials = prefect.context.get("secrets", {}).get("GCP_CREDENTIALS")` inside the flow pod
k
I think this should work without setting it on the agent. I say this because environment variables with `PREFECT__CONTEXT__…` are loaded into the context and the Secret will be held in the context. In general though, not all env variables are copied over. Also, environment variables are not copied over to Dask workers, but the Prefect context is. So if it is already in `prefect.context.secrets`, I think it will make it to the Dask worker.
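As a rough illustration of the naming convention, the sketch below collects variables named `PREFECT__CONTEXT__SECRETS__<NAME>` (double underscores) into a `{NAME: value}` mapping. The helper `secrets_from_environ` is hypothetical and is a simplification, not Prefect's actual configuration loader.

```python
import os
from typing import Dict, Mapping

PREFIX = "PREFECT__CONTEXT__SECRETS__"


def secrets_from_environ(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect PREFECT__CONTEXT__SECRETS__<NAME> variables into {NAME: value}."""
    return {
        key[len(PREFIX):]: value
        for key, value in environ.items()
        # Require a non-empty secret name after the prefix.
        if key.startswith(PREFIX) and len(key) > len(PREFIX)
    }


# Made-up environment for demonstration; the JSON value is a placeholder.
demo_env = {
    "PREFECT__CONTEXT__SECRETS__GCP_CREDENTIALS": '{"type": "service_account"}',
    "PATH": "/usr/bin",
}
print(secrets_from_environ(demo_env))
```

The lookup only sees the environment of the process it runs in, so a variable present on the agent pod but absent from the flow pod would not show up there.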
v
I see, so there is a chance that the secret might not be passed to the context?
k
Do you use Cloud secrets as well? Maybe you can try setting `"PREFECT__CLOUD__USE_LOCAL_SECRETS" = "true"` and see if that helps? Then `Secret.get()` will look locally. Also, I think the syntax would be `Secret("GCP_CREDENTIALS").get()` if you want it to pull the environment variable?
v
`"PREFECT__CLOUD__USE_LOCAL_SECRETS" = "true"` and see if that helps?
Amazing, this seems very reasonable too. I will try that out. Uhm, I will try using Secrets as well. thanks
k
Actually, you might indeed need the env flag because your agent is not the same pod that the Flow runs in.
I just read the docs again
v
oh nice!! thanks, I read through it but didn't spot that detail
thanks! I will try that out and see how it goes
j
Thanks for the help with this @Kevin Kho. We are using the Cloud UI. The GOOGLE_APPLICATION_CREDENTIALS was a red herring (I had originally tried this and then removed it, but it leaked back in during a rebase). The error I was seeing occurred because GCP_CREDENTIALS was not being set as we expected: the env variable was being set on the k8s agent and not on the flow pod. We want to set credentials as part of the deployment (via the agent instead of the web UI), so we will pursue the `--env` option.
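A minimal sketch of how the agent start command could be assembled with one `--env` flag per variable, so that the values are set on each flow run rather than only on the agent pod. The helper `agent_start_command` is hypothetical, and the credentials value is a placeholder, not a real key.

```python
import shlex
from typing import Dict, List


def agent_start_command(flow_run_env: Dict[str, str]) -> List[str]:
    """Build `prefect agent kubernetes start` with one --env flag per variable."""
    command = ["prefect", "agent", "kubernetes", "start"]
    for name, value in flow_run_env.items():
        command += ["--env", f"{name}={value}"]
    return command


command = agent_start_command(
    # Placeholder JSON; a real deployment would inject the actual key contents.
    {"PREFECT__CONTEXT__SECRETS__GCP_CREDENTIALS": '{"type": "service_account"}'}
)
# shlex.join quotes the JSON value so the printed line is safe to paste into a shell.
print(shlex.join(command))
```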
v
Hi @Kevin Kho, it's me again. Thanks for the overall discussion last time, I was able to solve that problem. Unfortunately, I'm now in a situation where I need some Google authentication in my flow pipeline. I got the impression when reading the docs that once I've submitted the GCP_CREDENTIALS as a secret to Prefect, all the necessary gcloud auth would be set automatically, but that didn't seem to be the case when I tested. Do you know, or have an idea, what the best approach would be to do such authentication for every flow? I had the idea of reaching out to the Prefect Secrets, as a mounted service, and storing the necessary credentials in it. Is that possible?
k
Are you using some Client like the BigQuery Client and it’s not working? What error do you get? Most of the tasks do use GCP_CREDENTIALS
v
yup, I've tried that approach as well. And it didn't work
Most of the tasks do use GCP_CREDENTIALS
That's one of my doubts, do we need to use the GCP task in order to access the GCP secret for the run? or just running a simple task will be enough?
As a toy example that explains my pipeline process:
• we set the GCP secret via the `-e` arg for the agent start command
• we then succeed in authenticating the flow, using prefect login (that uses a different credentials file), and send the flow pipeline to prefect.io
• but the real problem is that, for instance, executing a pd.read_csv or using the client in the flow ends in an error. We were getting anonymous caller insufficient permission to access Google Cloud Storage
I was able to solve the above error using the GCSFileSystem directly in the pipeline, but it only worked for a specific part of my pipeline. So if I could get a better understanding about how prefect handles the gcp secret and authenticates it, or maybe another way around of doing it.. it would be immensely helpful
k
You need to use the GCP Task because the underlying code has an if/else to use the GCP_CREDENTIALS. If you use the GCP Client or GCSFS directly, I think it looks for GOOGLE_APPLICATION_CREDENTIALS. GCP_CREDENTIALS is a Prefect thing. GOOGLE_APPLICATION_CREDENTIALS is the default GCP thing.
I think all the Prefect GCP Tasks use this utility function under the hood to authenticate
If you pass the env variable through the `-e` flag, it gets added to the context, and `prefect.context.get("secrets", {}).get("GCP_CREDENTIALS")` will be able to grab it and pass it to the `Client`. Otherwise, it falls back to this line, and the default GCP Client falls back to `GOOGLE_APPLICATION_CREDENTIALS`. This is my current understanding. John here might know way more
v
I see, my current understanding is the same as you explained, so it's good to know that it all makes sense as well 😄.
Thanks for providing the links for the client functions there, it gave me an idea. Besides that, do you think it's possible to add an optional set_google_application_credentials phase to get_google_client as well? I think it would help keep it in sync with other tools that rely on that variable under the hood, e.g. pandas
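To make the idea concrete, here is a minimal sketch of what such a phase might do, assuming the secret holds service-account JSON (as a string or a dict). The function below is hypothetical and is not part of Prefect; it only uses the standard library, writing the credentials to a private temporary file and exporting the path so that tools which read `GOOGLE_APPLICATION_CREDENTIALS` can find it.

```python
import json
import os
import tempfile
from typing import Mapping, Union


def set_google_application_credentials(credentials: Union[str, Mapping]) -> str:
    """Write service-account JSON to a temp file and export its path."""
    # Accept either a JSON string or an already-parsed mapping.
    info = json.loads(credentials) if isinstance(credentials, str) else dict(credentials)
    # mkstemp creates the file readable and writable only by the current user.
    fd, path = tempfile.mkstemp(prefix="gcp-credentials-", suffix=".json")
    with os.fdopen(fd, "w") as handle:
        json.dump(info, handle)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = path
    return path
```

Inside a flow, the argument could come from `prefect.context.get("secrets", {}).get("GCP_CREDENTIALS")`. The file is not cleaned up automatically, so the caller would need to remove it when the run finishes.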
k
`get_google_client` can take in `credentials` and use those instead of the secret. So you might be able to import that function and use it to create a Client by passing credentials directly?
v
yup, that's the current workaround I did to get some of the process working
k
Sorry, I’m not following. What more would a `set_google_application_credentials` phase give you?
v
Also, not related to credentials, but is there a reason why there is no enable option for the different resources in the prefect-server Helm charts?
k
That…you would need to post in community because my kubernetes is not good enough to answer. I’d need to find another team member. 😅
v
hshs, no problem I was about to create an issue for that as well. Thank you so much for the help 😄
k
oh yeah issue might be better. things are a bit hectic this week, so there may be a delay in a response but if there is none in a couple of days, you can ping me and i can follow up
v
wonderful, thanks 👍