# prefect-community

Baris Cekic

04/27/2022, 12:30 PM
Hello, I am new to Prefect. I am running my first flow in my local K8s cluster using `KubernetesRun`. The flow tries to download the `py` file from a local Git repo in the same cluster, but I get an `SSL_VERIFICATION` error even though the URL/host for the Git storage is HTTP. Has anyone had a similar issue?

Anna Geller

04/27/2022, 12:35 PM
Have you tried using a Personal Access Token as the authentication method for your GitHub storage? This is more secure and much easier to manage.
What kind of Git server are you using? Is it self-hosted? Is it accessible from your Kubernetes cluster? You could test this by running a simple container in the cluster that does a git clone from this repo to check the permissions. These docs pages may be helpful:
• https://docs.prefect.io/orchestration/flow_config/storage.html#git
• https://docs.prefect.io/orchestration/flow_config/storage.html#ssh-git-storage

Baris Cekic

04/27/2022, 12:54 PM
Thanks for the docs, Anna. It is not GitHub; it is Gitea, a self-hosted open-source Git server, and it runs in the same local K8s cluster.
The repo is public, so there is no need for a token:
flow.storage = Git(repo="flows", flow_path="prefect_flow_1.py", repo_host="repo.default.svc.cluster.local")
There is no SSL verification switch in `dulwich.porcelain`.
The Prefect agent and the Git repo are in the same namespace, just to keep it simple.

Kevin Kho

04/27/2022, 1:50 PM
What does your clone url look like?

Baris Cekic

04/27/2022, 2:53 PM
Storage definition in the flow:
flow.storage = Git(repo="flows", flow_path="prefect_flow_1.py", repo_host="repo.default.svc.cluster.local:3000")
Error message:
Failed to load and execute flow run: MaxRetryError("HTTPSConnectionPool(host='repo.default.svc.cluster.local', port=3000): Max retries exceeded with url: /flows.git/info/refs?service=git-upload-pack (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1091)')))")
I even tried changing `repo_host` to http://repo.default.svc.cluster.local:3000, but it always uses HTTPS regardless of what I set here.
I think that behind the scenes `dulwich.porcelain` uses `requests` without a `verify=True|False` switch, at least from what I see in the Prefect repo.

Kevin Kho

04/27/2022, 2:57 PM
I think the https comes from the clone URL here, but you can supply the whole clone URL yourself by creating a secret. Prefect will then read the clone URL from that secret and use it to clone the repo.
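To make the HTTPS behaviour concrete: the error above is consistent with the clone URL being assembled from `repo_host` and `repo` with a fixed `https://` scheme. The sketch below only mimics that observed behaviour. `build_clone_url` is a hypothetical helper, not Prefect's actual code, and the plain-HTTP URL is the one that appears later in this thread.

```python
from typing import Optional
from urllib.parse import urlsplit


def build_clone_url(repo_host: str, repo: str, clone_url_secret: Optional[str] = None) -> str:
    """Hypothetical helper mimicking the behaviour seen in the error above."""
    if clone_url_secret is not None:
        # A complete clone URL (scheme included) is used verbatim.
        return clone_url_secret
    # Otherwise the scheme is fixed, so repo_host cannot switch it to http.
    return f"https://{repo_host}/{repo}.git"


default_url = build_clone_url("repo.default.svc.cluster.local:3000", "flows")
secret_url = build_clone_url(
    "repo.default.svc.cluster.local:3000",
    "flows",
    clone_url_secret="http://repo.default.svc.cluster.local:3000/repo1/flows",
)

print(urlsplit(default_url).scheme)  # scheme of the assembled URL
print(urlsplit(secret_url).scheme)   # scheme of the URL taken from the secret
```

Under these assumptions, only the secret-based path lets a plain HTTP server such as an in-cluster Gitea be reached.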

Baris Cekic

04/27/2022, 3:52 PM
As I understand I need to maintain the local secrets as described here, is that right? https://docs.prefect.io/orchestration/concepts/secrets.html#setting-local-secrets
According to the documentation, this is only available for local runs. I am using a Kubernetes agent; is there any documentation for that?
Note that this configuration only affects the environment in which it's configured. So if you set values locally, they'll affect flows run locally or via a local agent, but not flows deployed via other agents (since those flow runs happen in a different environment). To set local secrets on flow runs deployed by an agent, you can use the --env flag to forward environment variables into the flow run environment.
Should I set environment variables on the agent instead of using `~/.prefect/config.toml`?

Kevin Kho

04/27/2022, 3:58 PM
Yes to a local secret, or store it in Prefect Cloud. If you have a Kubernetes agent, you can use
prefect agent kubernetes install
which generates the template, but you can also do
prefect agent kubernetes install --env PREFECT__CONTEXT__SECRETS__MYSECRET="mysecret"
and have it populated in the template. Basically, it is likely easier to use the env var to create the secret, as shown here. If you need more than one secret, just add another `--env` flag.
Then you have a secret name and can do
storage = Git(git_clone_url_secret_name="MYSECRET")
and it will pull that secret to clone the repo at run time.
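A minimal sketch of the naming convention used above, assuming each local secret is exposed as an environment variable named `PREFECT__CONTEXT__SECRETS__<NAME>` and that each one needs its own `--env` flag. The helper `secret_env_flags` and the second secret `OTHERSECRET` are made up for illustration; only the command and the `MYSECRET` example come from the thread.

```python
import shlex

SECRET_PREFIX = "PREFECT__CONTEXT__SECRETS__"


def secret_env_flags(secrets: dict) -> list:
    """Return one `--env NAME=value` pair per secret, in insertion order."""
    flags = []
    for name, value in secrets.items():
        flags += ["--env", f"{SECRET_PREFIX}{name}={value}"]
    return flags


# OTHERSECRET is a hypothetical second secret, added to show the repeated flag.
command = ["prefect", "agent", "kubernetes", "install"] + secret_env_flags(
    {"MYSECRET": "mysecret", "OTHERSECRET": "othervalue"}
)

# Print the command as a single shell-safe line.
print(shlex.join(command))
```

The secret name passed to `git_clone_url_secret_name` is the part after the prefix, so `MYSECRET` here.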

Baris Cekic

04/27/2022, 3:59 PM
👍

Kevin Kho

04/27/2022, 3:59 PM
Are you on Server or Cloud?

Baris Cekic

04/27/2022, 3:59 PM
let me try that
on Server

Kevin Kho

04/27/2022, 3:59 PM
Yeah this is right

Baris Cekic

04/27/2022, 3:59 PM
I used helm chart to deploy the agent
I will maintain the agent env variables in the values.yaml
It did not work. Here are the environment variables from the Kubernetes agent:
IMAGE_PULL_POLICY:
IMAGE_PULL_SECRETS:
JOB_CPU_LIMIT:
JOB_CPU_REQUEST:
JOB_MEM_LIMIT:
JOB_MEM_REQUEST:
NAMESPACE: sandbox
PREFECT__BACKEND: server
PREFECT__CLOUD__AGENT__AGENT_ADDRESS: http://0.0.0.0:8080
PREFECT__CLOUD__AGENT__LABELS: ["prefect-k8s"]
PREFECT__CLOUD__API: http://prefect-server-apollo.sandbox:4200/graphql
PREFECT__CONTEXT__SECRETS__REPO_URL: http://repo.default.svc.cluster.local:3000
SERVICE_ACCOUNT_NAME: prefect-server-serviceaccount
but somehow I am getting this error:
Failed to load and execute flow run: ValueError('Local Secret "REPO_URL" was not found.')

Kevin Kho

04/27/2022, 5:11 PM
That looks about right. Stupid questions: 1. Was the agent redeployed? 2. Do you only have 1 agent the flow can go to?

Baris Cekic

04/27/2022, 5:12 PM
yes, I even deleted the whole deployment and redeploy again

Kevin Kho

04/27/2022, 5:12 PM
Can you try adding it to the storage?
flow.storage = Git(..., secrets=["REPO_URL"])
This will tell context to include that secret

Baris Cekic

04/27/2022, 5:12 PM
yes I have 1 agent
Here is my storage:
flow.storage = Git(repo="flows", flow_path="prefect_flow_1.py", git_clone_url_secret_name="REPO_URL")
Shall I replace `git_clone_url_secret_name` with `secrets=["REPO_URL"]`?

Kevin Kho

04/27/2022, 5:13 PM
No, use them together. `secrets` is another kwarg.

Baris Cekic

04/27/2022, 5:14 PM
trying

Kevin Kho

04/27/2022, 5:14 PM
This is a more detailed list of stuff you can do. Yours look like that right (minus config.toml)?

Baris Cekic

04/27/2022, 5:15 PM
18:14:58 | lens | WARNING | Git | Git storage initialized with a `git_clone_url_secret_name`. The value of this Secret will be used to clone the repository, ignoring `repo`, `repo_host`, `git_token_secret_name`, `git_token_username`, `use_ssh`, and `format_access_token`.
18:14:58 | lens | ERROR | execute flow-run | Failed to load and execute flow run: ValueError('Local Secret "REPO_URL" was not found.')

Kevin Kho

04/27/2022, 5:16 PM
Ok can you check the Discourse link to compare what you tried across those?

Baris Cekic

04/27/2022, 7:13 PM
Hey @Kevin Kho, what I figured out is that the agent has the environment variable but the Kubernetes job does not. Here are the env vars from the Kubernetes job:
PREFECT__BACKEND: server
PREFECT__CLOUD__AGENT__LABELS: ['prefect-k8s']
PREFECT__CLOUD__API: http://prefect-server-apollo.sandbox:4200/graphql
PREFECT__CLOUD__API_KEY:
PREFECT__CLOUD__AUTH_TOKEN:
PREFECT__CLOUD__SEND_FLOW_RUN_LOGS: true
PREFECT__CLOUD__TENANT_ID:
PREFECT__CLOUD__USE_LOCAL_SECRETS: false
PREFECT__CONTEXT__FLOW_ID: 9cf35e89-c1df-4fd6-841a-10650784c311
PREFECT__CONTEXT__FLOW_RUN_ID: be8b3e7b-e985-4e6b-abcc-deadfcef39f3
PREFECT__CONTEXT__IMAGE: prefecthq/prefect:1.2.0
PREFECT__ENGINE__FLOW_RUNNER__DEFAULT_CLASS: prefect.engine.cloud.CloudFlowRunner
PREFECT__ENGINE__TASK_RUNNER__DEFAULT_CLASS: prefect.engine.cloud.CloudTaskRunner
PREFECT__LOGGING__LEVEL: INFO
PREFECT__LOGGING__LOG_TO_CLOUD: true
This secret is missing from the K8s job. Could this be the root cause?
PREFECT__CONTEXT__SECRETS__REPO_URL: http://repo.default.svc.cluster.local:3000
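The comparison described here can be automated. The sketch below is only an illustration: it takes two environment mappings (abridged copies of the agent and job listings in this thread) and reports which secret variables are set on the agent but absent from the job. The helper name `missing_secrets` is made up for this example.

```python
SECRET_PREFIX = "PREFECT__CONTEXT__SECRETS__"


def missing_secrets(agent_env: dict, job_env: dict) -> list:
    """Secret variables set on the agent that never reached the job."""
    return sorted(
        name
        for name in agent_env
        if name.startswith(SECRET_PREFIX) and name not in job_env
    )


# Abridged copies of the two listings from this thread.
agent_env = {
    "NAMESPACE": "sandbox",
    "PREFECT__BACKEND": "server",
    "PREFECT__CONTEXT__SECRETS__REPO_URL": "http://repo.default.svc.cluster.local:3000",
}
job_env = {
    "PREFECT__BACKEND": "server",
    "PREFECT__CLOUD__USE_LOCAL_SECRETS": "false",
    "PREFECT__LOGGING__LEVEL": "INFO",
}

print(missing_secrets(agent_env, job_env))
```

A non-empty result means the secret lives only in the agent's own environment, which matches the "Local Secret was not found" error raised inside the job.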

Kevin Kho

04/27/2022, 7:15 PM
Ah I see. Yes that is the root cause. One sec let me try something
I believe some of those are passed along by the agent, though, so it does appear to be doing something. You’re following number 2 in the Discourse link, right? What version are you on?

Baris Cekic

04/27/2022, 7:21 PM
This is the agent image: `prefecthq/prefect:latest`
It is version 1.x, if you are referring to 1.x vs 2.x.
The Discourse options do not fit 100%, I believe. Number 2 is for a local agent;
it is more like number 5.
Shall I try adding it to the `run_config` of `KubernetesRun()`?

Kevin Kho

04/27/2022, 7:26 PM
Adding it to the `run_config` of `KubernetesRun` will absolutely work; it is just the least secure option, of course.
Could you check what the env variables on the agent pod are?

Baris Cekic

04/27/2022, 7:29 PM
These are the env vars from the agent pod:
IMAGE_PULL_POLICY:
IMAGE_PULL_SECRETS:
JOB_CPU_LIMIT:
JOB_CPU_REQUEST:
JOB_MEM_LIMIT:
JOB_MEM_REQUEST:
NAMESPACE: sandbox
PREFECT__BACKEND: server
PREFECT__CLOUD__AGENT__AGENT_ADDRESS: http://0.0.0.0:8080
PREFECT__CLOUD__AGENT__LABELS: ["prefect-k8s"]
PREFECT__CLOUD__API: http://prefect-server-apollo.sandbox:4200/graphql
PREFECT__CONTEXT__SECRETS__REPO_URL: http://repo.default.svc.cluster.local:3000
SERVICE_ACCOUNT_NAME: prefect-server-serviceaccount

Kevin Kho

04/27/2022, 7:31 PM
Just making sure, this is from when you go in the pod and list them out right?

Baris Cekic

04/27/2022, 7:31 PM
yes
There are others, but I only listed the ones that come from `deployment.yaml` + `values.yaml`.

Kevin Kho

04/27/2022, 7:33 PM
And you are registering the flow and running it through the Server UI right?

Baris Cekic

04/27/2022, 7:33 PM
To be clearer, these are the same when I go into the pod in a console and run `env`.
I register it from my own client; flow.py is also created in the Git repo.
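from prefect import Flow
from prefect.run_configs import KubernetesRun
from prefect.storage import Git

# extract, transform and load are assumed to be @task functions defined elsewhere in this file (not shown)
with Flow("basic-prefect-etl-flow", run_config=KubernetesRun(labels=["prefect-k8s"]),
#storage=GitHub(repo="cekicbaris/public", path="prefect_flow_1.py")
#storage=Docker(python_dependencies=["pandas==1.1.0"],image_tag='latest')
) as flow:
    extracted_df = extract()
    transformed_df = transform(extracted_df)
    load(transformed_df)


# Stray leading comma removed so the call parses
flow.storage = Git(repo="flows", flow_path="prefect_flow_1.py",
    git_clone_url_secret_name="REPOURL", secrets=["REPOURL"])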

if __name__ == '__main__':
    # flow.run()
    flow.register(project_name='Test')
I can see it in the UI, run it from the Server UI, and get the logs there.
Here are the agent logs as well:
INFO:agent:Deploying flow run be8b3e7b-e985-4e6b-abcc-deadfcef39f3 to execution environment...
WARNING:prefect.Git:Git storage initialized with a `git_clone_url_secret_name`. The value of this Secret will be used to clone the repository, ignoring `repo`, `repo_host`, `git_token_secret_name`, `git_token_username`, `use_ssh`, and `format_access_token`.
[2022-04-27 19:11:01+0000] WARNING - prefect.Git | Git storage initialized with a `git_clone_url_secret_name`. The value of this Secret will be used to clone the repository, ignoring `repo`, `repo_host`, `git_token_secret_name`, `git_token_username`, `use_ssh`, and `format_access_token`.
[2022-04-27 19:11:02,123] INFO - agent | Completed deployment of flow run be8b3e7b-e985-4e6b-abcc-deadfcef39f3
INFO:agent:Completed deployment of flow run be8b3e7b-e985-4e6b-abcc-deadfcef39f3

Kevin Kho

04/27/2022, 7:35 PM
I think you want it to be `REPO_URL` instead of `REPOURL`?

Baris Cekic

04/27/2022, 7:36 PM
No, no, I just changed it in case the underscore was the problem.
It does not work with either `REPOURL` or `REPO_URL`.
UI log with the new secret:
Failed to load and execute flow run: ValueError('Local Secret "REPOURL" was not found.')
and with the old secret:
Failed to load and execute flow run: ValueError('Local Secret "REPO_URL" was not found.')

Kevin Kho

04/27/2022, 7:38 PM
I believe it. I remember the previous error log had the underscore. Man really, not seeing what the issue is. You really can’t use Cloud? Cuz Cloud makes this a lot easier by storing the secrets. If you need something more immediate, I would recommend the RunConfig
As long as you are above something like Prefect 0.15.0 this setup should be working

Baris Cekic

04/27/2022, 7:39 PM
Thanks Kevin. I am experimenting with Prefect for orchestrating a data platform, so it is important for us to be able to deploy it as part of our offering.
let me explore and thanks for your great support so far

Kevin Kho

04/27/2022, 7:42 PM
If you want to add debug lines to your Prefect code on the agent, the environment update happens here where agent env_vars are given to the env that then goes to the new pod
The agent code is pretty readable. You can start here in the deploy_flow as it creates the job_spec

Baris Cekic

04/27/2022, 9:26 PM
It is working with `run_config`.
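The thread does not show the exact `run_config` that worked, so the following is only a guess at its shape: it assumes the secret was passed through the `env` argument of `KubernetesRun` in Prefect 1.x, using the same `PREFECT__CONTEXT__SECRETS__` prefix as before. The helper `secret_env` is made up for illustration, and the import is guarded so the snippet still runs where Prefect 1.x is not installed.

```python
SECRET_PREFIX = "PREFECT__CONTEXT__SECRETS__"


def secret_env(name: str, value: str) -> dict:
    """Env mapping that exposes `value` as the local secret `name`."""
    return {f"{SECRET_PREFIX}{name}": value}


job_env = secret_env("REPO_URL", "http://repo.default.svc.cluster.local:3000/repo1/flows")

try:
    # Prefect 1.x only; skipped when that version is not installed.
    from prefect.run_configs import KubernetesRun

    run_config = KubernetesRun(labels=["prefect-k8s"], env=job_env)
except ImportError:
    run_config = None

print(job_env)
```

The trade-off Kevin mentions applies: the value travels with each flow's run configuration, so every flow has to be updated if the URL changes.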

Kevin Kho

04/27/2022, 9:28 PM
Yeah, that’s the surest way. But it should still have been picked up in the previous attempt.

Baris Cekic

04/27/2022, 9:44 PM
yes I will explore that option and try to debug
because if the number of flows grows, it might be hard to update all of them whenever the repo URL changes.
Hey @Kevin Kho, it worked! In the Helm chart, instead of providing PREFECT__CONTEXT__SECRETS__REPO_URL directly as an env variable, I wrapped it in `PREFECT__CLOUD__AGENT__ENV_VARS`, as below:
env:
    - name: PREFECT__CLOUD__AGENT__ENV_VARS
      value: '{"PREFECT__CONTEXT__SECRETS__REPO_URL": "http://repo.default.svc.cluster.local:3000/repo1/flows"}'
I spotted the difference when I ran the agent install command:
prefect agent kubernetes install --env PREFECT__CONTEXT__SECRETS__REPO_URL="http://repo.default.svc.cluster.local:3000/repo1/flows"
With this in place, when the agent creates the Kubernetes job, it passes the secret on to the job. 🎉
Just leaving this as a reference for other users in case the same issue arises.
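Because the value of `PREFECT__CLOUD__AGENT__ENV_VARS` is a JSON object embedded in YAML, quoting mistakes are easy to make by hand. A minimal sketch, assuming the agent expects a flat JSON mapping of variable names to string values as in the working example above, is to generate the string with `json.dumps` and check that it decodes back unchanged.

```python
import json

job_env_vars = {
    "PREFECT__CONTEXT__SECRETS__REPO_URL": "http://repo.default.svc.cluster.local:3000/repo1/flows",
}

# JSON string to paste as the value of PREFECT__CLOUD__AGENT__ENV_VARS.
env_vars_value = json.dumps(job_env_vars)

# Round-trip check: the string must decode back to a flat str -> str mapping.
decoded = json.loads(env_vars_value)
assert all(isinstance(k, str) and isinstance(v, str) for k, v in decoded.items())

print(env_vars_value)
```

Wrapping the printed string in single quotes in `values.yaml`, as in the snippet above, keeps the inner double quotes intact.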

Kevin Kho

04/28/2022, 4:32 PM
Thanks for circling back, and great job!