Aqib Fayyaz

02/21/2022, 3:28 PM
Hi, I have kind of a silly question: if I want to run the agent, flow, and server on the same GKE cluster, can I have a local agent instead of a Kubernetes agent?
I am trying the example at https://github.com/flavienbwk/prefect-docker-compose with docker-compose and it works locally, so I want to have the same behaviour on GKE.

Anna Geller

02/21/2022, 4:14 PM
I wouldn’t recommend it, as you will likely face issues when you need to scale up or redeploy some components. For a Server deployment on Kubernetes, I would recommend the Helm chart. But @Aqib Fayyaz, I remember we went through both together: • setting up a KubernetesAgent on GKE • as well as setting up Server with Helm on GKE, and I remember we managed to do both, right? Did something happen with your setup that you have to start from scratch?

Aqib Fayyaz

02/21/2022, 4:15 PM
Yeah @Anna Geller, I remember all of that worked, but now it's my job requirement to have a docker-compose setup working and then deploy it on GKE the same way it works with docker-compose.
But the main thing is: is this going to work?
A local agent deployed on GKE and Server on the same cluster, instead of a Kubernetes agent.
Our Prefect code will also be on the same cluster.

Anna Geller

02/21/2022, 4:21 PM
Again, I wouldn’t recommend that, since even if this works, you will 100% face issues with scale. I recommend using either Prefect Cloud, or, if you want to self-host, the Helm chart is the recommended setup for Kubernetes deployments. If you have very small workloads that fit on a single machine, you can deploy a single VM and self-host Server using docker-compose. docker-compose is meant for single-machine container deployments, not for something to be run on a Kubernetes cluster.

Aqib Fayyaz

02/21/2022, 4:26 PM
Ok, got it. But the thing is, half of it has already been deployed on GKE: our Prefect code that runs the pipeline is already deployed there and it works. Now I only need to deploy Server using this approach, so that I can run the deployed pipeline when we want, rather than having it run automatically when it's first deployed.

Anna Geller

02/21/2022, 4:31 PM
My recommendation is as follows: • if you want to use docker-compose, deploy your Server on a single VM, not GKE • if you want to deploy Server on a GKE Kubernetes cluster, use the Helm chart, not docker-compose.

Aqib Fayyaz

02/22/2022, 11:29 AM
Hi @Anna Geller, so now I am using Helm for the Server deployment, and I have deployed it on GKE following https://github.com/PrefectHQ/server/tree/master/helm/prefect-server and this awesome video: https://www.youtube.com/watch?v=EwsMecjSYEU&t=2792s
We are using Google Filestore as a shared volume, mounted on a VM instance, for all our GKE services. The main thing now is that our Prefect pipeline also needs to access that shared volume for storing its results, and I am confused about how we can do that. For all the other services we defined the shared volume in their manifest files, as in the attached image.
And where should I store the flow and its dependencies so that it can access the shared volume?

Anna Geller

02/22/2022, 11:42 AM
You can store both your flow and results in GCS (mounting cloud block-storage volumes is more involved, and I wouldn't do it unless you're a Kubernetes pro, especially given that you need it for object storage and GCS is made for exactly that). As long as you have service-account permissions in your cluster, your flow-run pods should be able to interact with GCS.
But I don't fully understand why you're going through the entire process again; we already did it twice: once when you were setting up a GKE KubernetesAgent with Prefect Cloud, and once when you were setting up Server on GKE with the Helm chart, and I remember you got it working.

Aqib Fayyaz

02/22/2022, 11:45 AM
Yes, even now the Server is up on GKE using the Helm chart; the only thing that has been added is the shared volumes, and I need to access them in my flow.
I ran the flow as a service on GKE, mounted the shared volume in it, and it worked. Now the Server part has been added, and we want to orchestrate the flow using Server on GKE, because without Server or Cloud we cannot trigger the flow when needed; it just runs automatically when deployed on GKE.
And I can get any permission I want.

Anna Geller

02/22/2022, 11:52 AM
You would need a persistent volume to mount a drive; check out these docs: https://cloud.google.com/kubernetes-engine/docs/concepts/persistent-volumes#persistentvolumeclaims
That's correct: you need either a Prefect Cloud or Server backend to run flows on a schedule on GKE.

Aqib Fayyaz

02/22/2022, 11:53 AM
I have them already, for all the other services, and it works.
Now the question is how I can use it for our flow: how can the flow access the volume, and where should the flow be stored so that it can access it?

Anna Geller

02/22/2022, 12:04 PM
AFAIK you can't use a persistent volume as flow storage, but you can probably use it for results if you specify the path. Again, I would recommend using GCS rather than a persistent volume, since you just need to store objects (both the flow and its results) and GCS is object storage, while a persistent volume is block storage, used more for stateful applications like a database API backend.

Aqib Fayyaz

02/22/2022, 12:10 PM
Exactly, I don't want to use a persistent volume for the flow, but for the results. What I want to know is where the flow should be stored so that it can access the persistent volume, using the path specified in the flow, to store the results in the persistent volume.
And I need to use the persistent volume for storing results because all the other services use it, and they need the result of the Prefect pipeline from the persistent volume for their further work.

Anna Geller

02/22/2022, 12:19 PM
You can decide where you store it, there are no restrictions from the Prefect side.

Aqib Fayyaz

02/22/2022, 12:29 PM
Can you please tell me how things work when the flow is stored in a Docker image on GCR and the Server and agent are on GKE? When we run the flow from the Server, how do things work; I mean, who gets the flow, and where is it run?

Anna Geller

02/22/2022, 12:32 PM
This thread provides a detailed explanation

Aqib Fayyaz

02/22/2022, 12:50 PM
Can I place my flow on a VM instance when the Server and agents are on GKE?

Anna Geller

02/22/2022, 12:54 PM
You are using a Kubernetes agent, right? If so, Prefect deploys your flow runs as Kubernetes jobs, and those jobs must be able to pull your flow, so your Kubernetes job template would be the right place to configure that. But you need to use one of the existing storage mechanisms. If you really don't want to use GCS, you may try using Local storage and store the flow on this PV, but it's at your own risk; I would really recommend using GCS for that (it's more reliable, more scalable, and even cheaper).

Aqib Fayyaz

02/22/2022, 12:58 PM
I really appreciate the great recommendation, but I only have the option of a PV, as the other services need the result of the flow and they already look in the PV for it.

Anna Geller

02/22/2022, 12:59 PM
I was speaking of Storage, not results - for results you can use whatever stateful mechanism you want (provided it's configured properly)

Aqib Fayyaz

02/22/2022, 1:14 PM
Hmm, ok, so I will store the flow on GCP. Then how can I access the PV in the flow? Got any ideas?

Anna Geller

02/22/2022, 1:24 PM
When you specify a PV claim, you specify the mount path, and in theory you could use the same path for your flow results. Check out this for more info. But again, I would strongly encourage you to use GCS instead: your custom applications can use the same storage bucket and paths to retrieve the flow results, just as you would do with a PV. There is really no difference, apart from the fact that GCS is significantly less complex, more reliable, and a better fit for your use case. Is it you or someone else I need to persuade to GCS? 🙂 And if you still need to use a PV, do you have some DevOps folks on your team who can help you with that? This is hard to support via Slack. Not sure if you saw it, but we do provide paid support for such infrastructure issues.

Aqib Fayyaz

02/22/2022, 3:32 PM
@Anna Geller, I have one last question. I deployed my flow on GKE as a service, and inside the Dockerfile I gave the command to run the flow as soon as it is deployed:
CMD ["python3", "/usr/app/feat_post_flow_local.py"]
And it works: it runs the flow, the flow does its job and sends the result to the shared storage as well. The Server is also deployed on GKE using the Helm chart, so my question is: can this Server interact with the flow stored on GKE as a service?
I know that for this I need to register the flow with the Server, but for that I also need to tell it where the storage is, and I did not find any option for Kubernetes as storage.

Anna Geller

02/23/2022, 9:41 AM
Exactly: if you want to run a flow on a Server backend, you need to register (and probably also schedule) your flow. Prefect doesn't support running flows as a long-running service.
You could schedule this flow and run it forever, but what you are doing right now is just executing a local script, which is not tracked in the backend and thus can't communicate via the API. To communicate with the backend, the flow must be registered.

Aqib Fayyaz

02/23/2022, 9:45 AM
Hmm, what if I use Docker storage for my flow? That way my flow would be able to run both as a service (so that I could mount the PVC), and I would also be able to register the flow with the Server.

Anna Geller

02/23/2022, 9:46 AM
I don't know what you're trying to do. Can you explain the problem you're trying to solve?

Aqib Fayyaz

02/23/2022, 9:47 AM
Ok, let's get this straight: I just need to mount the Filestore instance to my flow. No matter where my flow runs, it should just be able to communicate with the Filestore instance (which is used as a shared volume for all the other services we have on GKE).

Anna Geller

02/23/2022, 9:52 AM
To access a file from a pod, you don't need to run the flow as a service; you need to mount the PV to the pod, as we discussed before. There are some ways to do it. I would ask your DevOps folks to help you set this up, and we also provide professional services you can book for such infrastructure issues. From the Prefect perspective, you can set it in your Kubernetes job template; that's all I know and can help with, tbh.

Aqib Fayyaz

02/23/2022, 9:59 AM
Ok, thank you so much for the great help.
One more thing: previously I had the flow stored on GitHub and the custom modules in the Docker image. Now I want to use Docker as storage for my flow and all the custom modules, so could you please share a useful link on where to get started with that?

Anna Geller

02/23/2022, 10:05 AM
Sure, here is one example with AWS ECR, but you can apply similar logic with GCP's GCR.

Aqib Fayyaz

02/23/2022, 11:20 AM
In place of the AWS account ID, what do I need to use for GCP?

Anna Geller

02/23/2022, 11:22 AM
Your GCR registry URL.

Aqib Fayyaz

02/23/2022, 11:25 AM

Anna Geller

02/23/2022, 11:45 AM
yup, correct 🙂