# ask-community
b
Hello here, I am new to Prefect and running my first flow in my local K8s cluster using `KubernetesRun`. The flow tries to download the `.py` file from a local git repo in the same local K8s cluster, but I get an SSL verification error even though the URL/host for the git storage is HTTP. Has anyone seen a similar issue?
a
Have you tried using a Personal Access Token as an authentication method for your GitHub storage? This is more secure and much easier to manage.
What kind of Git server are you using - is it self-hosted? Is it accessible from your Kubernetes cluster? You could test it by running a simple container in this Kubernetes cluster doing a git clone from this repo to test the permissions. Those docs pages may be helpful:
• https://docs.prefect.io/orchestration/flow_config/storage.html#git
• https://docs.prefect.io/orchestration/flow_config/storage.html#ssh-git-storage
b
Thanks Anna for the docs. It is not GitHub - it is Gitea, a self-hosted open source git server, also in the same local K8s cluster. The repo is public, so there is no need for any token:
```python
flow.storage = Git(repo="flows", flow_path="prefect_flow_1.py", repo_host="repo.default.svc.cluster.local")
```
There is no SSL verification switch in `dulwich.porcelain`. The Prefect agent and the git repo are in the same namespace, just to keep it simple.
k
What does your clone URL look like?
b
Storage definition in the flow:
```python
flow.storage = Git(repo="flows", flow_path="prefect_flow_1.py", repo_host="repo.default.svc.cluster.local:3000")
```
Error message:
```
Failed to load and execute flow run: MaxRetryError("HTTPSConnectionPool(host='repo.default.svc.cluster.local', port=3000): Max retries exceeded with url: /flows.git/info/refs?service=git-upload-pack (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1091)')))")
```
I even tried to change the repo_host to http://repo.default.svc.cluster.local:3000, but it always tries to use HTTPS regardless of the scheme here. I think behind the scenes `dulwich.porcelain` uses `requests` without a `verify=True|False` switch, at least from what I see in the Prefect repo.
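For what it's worth, that `WRONG_VERSION_NUMBER` error is not Prefect-specific - it is what OpenSSL typically reports when a TLS client handshakes against a port that speaks plain HTTP, as the Gitea service here does. A minimal stdlib sketch, with a throwaway local server standing in for the repo host:

```python
import socket
import ssl
import threading

# A throwaway server that answers with plain HTTP, no TLS --
# standing in for a git server exposed over http:// only.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def plain_http_server():
    conn, _ = server.accept()
    conn.recv(4096)  # swallow the client's TLS ClientHello
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")  # raw HTTP where TLS records are expected
    conn.close()

threading.Thread(target=plain_http_server, daemon=True).start()

# An HTTPS client handshaking against that plain-HTTP port fails with
# an SSLError (typically "[SSL: WRONG_VERSION_NUMBER]").
err = None
ctx = ssl.create_default_context()
try:
    with socket.create_connection(("127.0.0.1", port)) as raw:
        with ctx.wrap_socket(raw, server_hostname="localhost"):
            pass
except ssl.SSLError as exc:
    err = exc
    print(type(err).__name__)
```

So the real fix is to stop the client from speaking HTTPS at all, not to relax certificate verification.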
k
I think the https comes from the clone URL here, but you can supply the whole clone_url yourself by making a Secret, and then Prefect will just use that Secret to fetch the clone_url to clone the repo.
b
As I understand it, I need to maintain local secrets as described here, is that right? https://docs.prefect.io/orchestration/concepts/secrets.html#setting-local-secrets
The documentation says this is only available for local runs. I am using the Kubernetes agent - is there any documentation for the Kubernetes agent?
```
Note that this configuration only affects the environment in which it's configured. So if you set values locally, they'll affect flows run locally or via a local agent, but not flows deployed via other agents (since those flow runs happen in a different environment). To set local secrets on flow runs deployed by an agent, you can use the --env flag to forward environment variables into the flow run environment.
```
Should I maintain environment variables on the agent instead of `~/.prefect/config.toml`?
k
Yes to local secrets, or store it in Prefect Cloud. If you have a Kubernetes agent, you can use
```shell
prefect agent kubernetes install
```
which makes the template, but you can also do
```shell
prefect agent kubernetes install --env PREFECT__CONTEXT__SECRETS__MYSECRET="mysecret"
```
and then have it populate in the template. Basically it's likely easier to use the env var to create the secret like shown here. If you need more than one secret, just repeat another `--env` flag.
And then you have a secret name, and do
```python
storage = Git(git_clone_url_secret_name="MYSECRET")
```
and then it will pull that secret to clone the repo at run time.
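The convention behind those `--env` values is that Prefect 1.x reads local Secrets from env vars named `PREFECT__CONTEXT__SECRETS__<NAME>`. A rough sketch of that lookup - the `local_secret` helper below is illustrative, not a Prefect API:

```python
import os

# Prefect 1.x convention: a local Secret named MYSECRET can be supplied
# through an env var called PREFECT__CONTEXT__SECRETS__MYSECRET.
os.environ["PREFECT__CONTEXT__SECRETS__MYSECRET"] = (
    "http://repo.default.svc.cluster.local:3000/flows.git"
)

def local_secret(name: str) -> str:
    """Illustrative helper mimicking a local-Secret lookup by env var."""
    key = f"PREFECT__CONTEXT__SECRETS__{name}"
    try:
        return os.environ[key]
    except KeyError:
        # mirrors the error seen later in this thread
        raise ValueError(f'Local Secret "{name}" was not found.') from None

clone_url = local_secret("MYSECRET")
print(clone_url)
```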
b
👍
k
Are you on Server or Cloud?
b
let me try that
on Server
k
Yeah this is right
b
I used the Helm chart to deploy the agent, so I will maintain the agent env variables in the values.yaml.
It did not work. Here are my environment vars from the Kubernetes agent:
```
IMAGE_PULL_POLICY:
IMAGE_PULL_SECRETS:
JOB_CPU_LIMIT:
JOB_CPU_REQUEST:
JOB_MEM_LIMIT:
JOB_MEM_REQUEST:
NAMESPACE: sandbox
PREFECT__BACKEND: server
PREFECT__CLOUD__AGENT__AGENT_ADDRESS: http://0.0.0.0:8080
PREFECT__CLOUD__AGENT__LABELS: ["prefect-k8s"]
PREFECT__CLOUD__API: http://prefect-server-apollo.sandbox:4200/graphql
PREFECT__CONTEXT__SECRETS__REPO_URL: http://repo.default.svc.cluster.local:3000
SERVICE_ACCOUNT_NAME: prefect-server-serviceaccount
```
but somehow I am getting this error:
```
Failed to load and execute flow run: ValueError('Local Secret "REPO_URL" was not found.')
```
k
That looks pretty right. Stupid questions: 1. Was the agent redeployed? 2. Do you only have one agent the flow can go to?
b
yes, I even deleted the whole deployment and redeployed again
k
Can you try adding it to storage?
```python
flow.storage = Git(..., secrets=["REPO_URL"])
```
This will tell the context to include that secret.
b
yes, I have 1 agent
here is my storage:
```python
flow.storage = Git(repo="flows", flow_path="prefect_flow_1.py", git_clone_url_secret_name="REPO_URL")
```
Shall I replace `git_clone_url_secret_name` with `secrets=["REPO_URL"]`?
k
No, use them together; `secrets` is another kwarg.
b
trying
k
This is a more detailed list of stuff you can do. Yours looks like that, right (minus the config.toml)?
b
```
18:14:58 WARNING Git | Git storage initialized with a `git_clone_url_secret_name`. The value of this Secret will be used to clone the repository, ignoring `repo`, `repo_host`, `git_token_secret_name`, `git_token_username`, `use_ssh`, and `format_access_token`.
18:14:58 ERROR execute flow-run | Failed to load and execute flow run: ValueError('Local Secret "REPO_URL" was not found.')
```
k
Ok, can you check the Discourse link to compare what you tried across those?
b
hey @Kevin Kho, what I figured out is that the agent has the environment variables, but the Kubernetes job doesn't. Here are the env vars from the Kubernetes job:
```
PREFECT__BACKEND: server
PREFECT__CLOUD__AGENT__LABELS: ['prefect-k8s']
PREFECT__CLOUD__API: http://prefect-server-apollo.sandbox:4200/graphql
PREFECT__CLOUD__API_KEY:
PREFECT__CLOUD__AUTH_TOKEN:
PREFECT__CLOUD__SEND_FLOW_RUN_LOGS: true
PREFECT__CLOUD__TENANT_ID:
PREFECT__CLOUD__USE_LOCAL_SECRETS: false
PREFECT__CONTEXT__FLOW_ID: 9cf35e89-c1df-4fd6-841a-10650784c311
PREFECT__CONTEXT__FLOW_RUN_ID: be8b3e7b-e985-4e6b-abcc-deadfcef39f3
PREFECT__CONTEXT__IMAGE: prefecthq/prefect:1.2.0
PREFECT__ENGINE__FLOW_RUNNER__DEFAULT_CLASS: prefect.engine.cloud.CloudFlowRunner
PREFECT__ENGINE__TASK_RUNNER__DEFAULT_CLASS: prefect.engine.cloud.CloudTaskRunner
PREFECT__LOGGING__LEVEL: INFO
PREFECT__LOGGING__LOG_TO_CLOUD: true
```
This secret is missing in the k8s job - could this be the root cause?
```
PREFECT__CONTEXT__SECRETS__REPO_URL: http://repo.default.svc.cluster.local:3000
```
k
Ah I see. Yes that is the root cause. One sec let me try something
I believe some of those are handled by the agent though, so it is appearing to do something. You're following number 2 in the Discourse link, right? What version are you on?
b
This is the agent image: `prefecthq/prefect:latest`. It is version 1.x, if you are referring to 1.x or 2.x.
The Discourse options don't fit 100%, I believe - number 2 is the local agent; it is more like number 5.
Shall I try to add it to the `run_config` of `KubernetesRun()`?
k
Adding it to the `run_config` of `KubernetesRun` will absolutely work, just the least secure option of course.
Could you check what the env variables on the agent pod are?
b
these are the env vars from the agent pod:
```
IMAGE_PULL_POLICY:
IMAGE_PULL_SECRETS:
JOB_CPU_LIMIT:
JOB_CPU_REQUEST:
JOB_MEM_LIMIT:
JOB_MEM_REQUEST:
NAMESPACE: sandbox
PREFECT__BACKEND: server
PREFECT__CLOUD__AGENT__AGENT_ADDRESS: http://0.0.0.0:8080
PREFECT__CLOUD__AGENT__LABELS: ["prefect-k8s"]
PREFECT__CLOUD__API: http://prefect-server-apollo.sandbox:4200/graphql
PREFECT__CONTEXT__SECRETS__REPO_URL: http://repo.default.svc.cluster.local:3000
SERVICE_ACCOUNT_NAME: prefect-server-serviceaccount
```
k
Just making sure - this is from when you go into the pod and list them out, right?
b
yes
there are others, but I just listed the ones which are in the `deployment.yaml` + `values.yaml`
k
And you are registering the flow and running it through the Server UI right?
b
to be more clear, these are also the same when I go into the pod in a console and run `env`
I register it from my own client; flow.py is also created in the git repo.
```python
with Flow("basic-prefect-etl-flow", run_config=KubernetesRun(labels=["prefect-k8s"]),
          # storage=GitHub(repo="cekicbaris/public", path="prefect_flow_1.py")
          # storage=Docker(python_dependencies=["pandas==1.1.0"], image_tag='latest')
          ) as flow:
    extracted_df = extract()
    transformed_df = transform(extracted_df)
    load(transformed_df)


flow.storage = Git(repo="flows", flow_path="prefect_flow_1.py",
                   git_clone_url_secret_name="REPOURL", secrets=["REPOURL"])

if __name__ == '__main__':
    # flow.run()
    flow.register(project_name='Test')
```
I can see it on the UI, run it from the Server UI, and get the logs from the Server UI.
also here are the agent logs:
```
INFO:agent:Deploying flow run be8b3e7b-e985-4e6b-abcc-deadfcef39f3 to execution environment...
WARNING:prefect.Git:Git storage initialized with a `git_clone_url_secret_name`. The value of this Secret will be used to clone the repository, ignoring `repo`, `repo_host`, `git_token_secret_name`, `git_token_username`, `use_ssh`, and `format_access_token`.
[2022-04-27 19:11:01+0000] WARNING - prefect.Git | Git storage initialized with a `git_clone_url_secret_name`. The value of this Secret will be used to clone the repository, ignoring `repo`, `repo_host`, `git_token_secret_name`, `git_token_username`, `use_ssh`, and `format_access_token`.
[2022-04-27 19:11:02,123] INFO - agent | Completed deployment of flow run be8b3e7b-e985-4e6b-abcc-deadfcef39f3
INFO:agent:Completed deployment of flow run be8b3e7b-e985-4e6b-abcc-deadfcef39f3
```
k
I think you want it to be `REPO_URL` instead of `REPOURL`?
b
no no, I have just changed it in case the underscore is the problem. It is not working with either `REPOURL` or `REPO_URL`.
UI logs - here with the new secret:
```
Failed to load and execute flow run: ValueError('Local Secret "REPOURL" was not found.')
```
and here with the old secret:
```
Failed to load and execute flow run: ValueError('Local Secret "REPO_URL" was not found.')
```
k
I believe it - I remember the previous error log had the underscore. Man, really not seeing what the issue is. You really can't use Cloud? Because Cloud makes this a lot easier by storing the secrets. If you need something more immediate, I would recommend the RunConfig.
As long as you are on something above Prefect 0.15.0, this setup should be working.
b
thanks Kevin, I will be experimenting with Prefect for orchestrating a data platform, so it is important for us to deploy it as a part of our offering.
let me explore - and thanks for your great support so far
k
If you want to add debug lines to your Prefect code on the agent, the environment update happens where the agent env_vars are merged into the env that then goes to the new pod.
The agent code is pretty readable; you can start in the `deploy_flow` method, as it creates the job_spec.
b
it is working with `run_config`
k
Yeah, that's the most sure way. But it should still be picked up with the previous attempt.
b
yes, I will explore that option and try to debug, because if the number of flows grows, it might be hard to update all flows if there is any change in the repo URL.
hey @Kevin Kho, it worked! In the Helm chart, instead of providing PREFECT__CONTEXT__SECRETS__REPO_URL as a plain env variable, I wrapped it under `PREFECT__CLOUD__AGENT__ENV_VARS` and made the change as below:
```yaml
env:
    - name: PREFECT__CLOUD__AGENT__ENV_VARS
      value: '{"PREFECT__CONTEXT__SECRETS__REPO_URL": "http://repo.default.svc.cluster.local:3000/repo1/flows"}'
```
I figured out the difference when I ran the Kubernetes agent install:
```shell
prefect agent kubernetes install --env PREFECT__CONTEXT__SECRETS__REPO_URL="http://repo.default.svc.cluster.local:3000/repo1/flows"
```
Then when it creates the Kubernetes job, it passes the secret through to the job. 🎉
Just as a reference for other users in case the same issue arises.
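To illustrate why the JSON wrapping matters: `PREFECT__CLOUD__AGENT__ENV_VARS` holds a JSON object of env vars that the agent copies into each flow-run job, whereas a bare env var on the agent pod stays on the agent pod. A stdlib sketch of that hand-off (the dict merge is illustrative of the behavior observed above, not the agent's exact code):

```python
import json

# Env vars set on the agent pod (as in the values.yaml above).
agent_pod_env = {
    "PREFECT__BACKEND": "server",
    "PREFECT__CLOUD__AGENT__ENV_VARS": json.dumps({
        "PREFECT__CONTEXT__SECRETS__REPO_URL":
            "http://repo.default.svc.cluster.local:3000/repo1/flows"
    }),
}

# Illustrative: the agent parses the JSON value and merges those entries
# into the env of every flow-run job it creates.
job_env = {"PREFECT__BACKEND": "server"}
job_env.update(json.loads(agent_pod_env["PREFECT__CLOUD__AGENT__ENV_VARS"]))

print(job_env["PREFECT__CONTEXT__SECRETS__REPO_URL"])
```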
k
Thanks for circling back, and great job!