# ask-community
Italo Barros:
Hey everyone, I used Prefect versions 1 and 2 some years ago, and I'm thinking of using Prefect 3 Cloud in a project where we need to orchestrate Databricks Delta Live Tables jobs and also some GCP tools (Cloud Functions, Dataflow, BigQuery, etc). However, I haven't found much info on how to orchestrate and integrate Prefect Cloud in this scenario, and I have multiple questions, such as: what does the infrastructure look like, does any resource need to be deployed on GCP (Cloud Build, Kubernetes, etc), and how would the orchestration look in practice (do we code locally, and when deployed it becomes serverless)? So, do we have any kind of doc or GitHub repo for this scenario?
Nate:
hi @Italo Barros, welcome back! here's a couple of resources you might find helpful:
• docs: quickstart
• docs: creating deployments
• youtube video: workers/work pools -> running code on remote infra (like docker, k8s, cloud run etc)
there's also prefect-databricks which might be useful for you
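for a taste of what prefect-databricks looks like, here's a minimal sketch, assuming you've already saved a `DatabricksCredentials` block (the block name `my-databricks-creds` is a placeholder) — the package exposes autogenerated tasks for the Databricks Jobs API, like `jobs_list` used here:

```python
# a minimal sketch: list jobs in the workspace to verify the
# prefect-databricks integration is wired up (not a full DLT pipeline).
# assumes `pip install prefect-databricks` and that a DatabricksCredentials
# block named "my-databricks-creds" (hypothetical name) was saved beforehand
from prefect import flow
from prefect_databricks import DatabricksCredentials
from prefect_databricks.jobs import jobs_list


@flow(log_prints=True)
def check_databricks_connection():
    creds = DatabricksCredentials.load("my-databricks-creds")
    # fetch the first few jobs in the workspace
    jobs = jobs_list(creds, limit=5)
    print(jobs)


if __name__ == "__main__":
    check_databricks_connection()
```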
Italo Barros:
@Nate thanks for the feedback! One question that's not clear to me: do I need to have workers/work pools even when using Prefect Cloud? For example, if I want to orchestrate a batch pipeline using Google Cloud Functions to ingest data and then Databricks to transform and deliver it.
Nate:
work pools and workers are only necessary if you need prefect to dynamically dispatch infrastructure for your flow runs. if you're using google cloud functions, you might not need workers or work pools at all; you might just want to set `PREFECT_API_KEY` and `PREFECT_API_URL` in your GCP function environment and call the decorated functions like normal python, like this (sketch below).
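as a sketch of that pattern (the function and flow names here are hypothetical), a Cloud Function entry point might look like:

```python
# main.py for a google cloud function -- a minimal sketch.
# assumes PREFECT_API_KEY and PREFECT_API_URL are set in the function's
# environment variables so runs report to Prefect Cloud, and that
# `prefect` and `functions-framework` are in requirements.txt
import functions_framework
from prefect import flow, task


@task
def ingest():
    # hypothetical ingestion step
    return [1, 2, 3]


@flow(log_prints=True)
def batch_pipeline():
    records = ingest()
    print(f"ingested {len(records)} records")


@functions_framework.http
def main(request):
    # call the decorated flow like normal python -- no worker involved
    batch_pipeline()
    return "ok"
```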
but to answer the question directly:
> do I need to have Workers/Work Pools even using Prefect Cloud?
the work pool and worker concept is common to prefect cloud and prefect (open source) server:
• worker: a lightweight thing that sits on your client side, asks the server if work is scheduled, and then submits scheduled work when it finds it
• work pool: represents an execution environment (k8s cluster, cloud run, or a simple process) and stores config (env vars, cpu limits etc) for that execution env
so you say
`prefect deploy my_flow.py:some_fn --pool some-work-pool`
to create a deployment, then
`prefect worker start --pool some-work-pool`
where you want the worker to listen for work. then you can schedule runs (via UI, API, CLI etc) and the worker will find them and submit them based on what type of work pool it is (k8s -> submits as a k8s job, etc)
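and if you'd rather trigger runs from python than the UI/CLI, here's a minimal sketch, assuming a deployment already exists (the `"some-fn/my-deployment"` name is a hypothetical `<flow name>/<deployment name>` created by `prefect deploy` above):

```python
# a minimal sketch of scheduling a run against an existing deployment
# via the python client; the worker listening on its work pool will
# pick the run up and submit it to that pool's infrastructure
from prefect.deployments import run_deployment

flow_run = run_deployment(
    name="some-fn/my-deployment",  # hypothetical "<flow name>/<deployment name>"
    timeout=0,  # return immediately instead of waiting for the run to finish
)
print(flow_run.id)
```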