# ask-community
g
👋 Hey guys, can anybody confirm this use case is actually possible with Prefect? It sounds so amazing. I have a workflow, and a single task in it needs ML inference, so an expensive GPU is required. I may have 1 GPU machine and >1 CPU-only machine. I want the GPU-requiring task to start and run only on the GPU machine.
n
👋 this is Prefect 1 syntax, but the same sort of deployment-level infra selection is possible in Prefect 2. let me find you a link or two
what sort of infra are you running on? like cloud-provider-wise
g
@Nate I've not decided yet. Just random cloud providers, like vast.ai. Shouldn't it not matter to Prefect what the cloud provider is?
n
> Shouldn't it not matter for Prefect what's the cloud provider?

it generally does matter. while we recently began offering managed execution (where we provide the runtime for flows), most of our users bring their own runtime (as has been the case historically), and we do not offer "managed" GPU-powered runtimes for flows. Prefect Cloud has a "hybrid model": we host the orchestration API (server, db) for you, but generally your code runs in your infrastructure. our managed execution offering is the only exception to that
g
@Nate I'm not looking for managed execution. I'm asking something different. I have 1 GPU machine and 3 CPU-only machines. I need some Prefect tasks (which need a GPU) to execute only on the GPU machine. How can this be done? Was ChatGPT's response accurate? Does such task-tag functionality exist, where you can tag a worker instance too?
n
@Marvin please get some links for using work pools and workers and infra level overrides, make separate chroma queries for each of these and summarize concisely
m
Work Pools:
- Can be configured via Prefect CLI, API, or UI.
- UI management includes the ability to pause or create new work pools.
- Control concurrent run limits or pause run delivery.
- Guide available here.

Workers:
- Workers are polling processes that trigger scheduled flow runs.
- Start a worker with the command: `prefect worker start --pool <work-pool-name>`.
- A worker must correspond to an appropriate work pool.
- Tutorial available here.

Infrastructure-Level Overrides:
- Customize infrastructure blocks through the UI, the Prefect Cloud Blocks page, or in code.
- Infrastructure blocks can be Docker, Kubernetes, ECS Task, Cloud Run, etc.
- Use `-i` or `--infra` to set defaults. Use `-ib` or `--infra-block` with a block slug for pre-configured settings.
- Example command to build a deployment: `prefect deployment build ./my_flow.py:my_flow -n my-flow-deployment -t test -i docker-container -sb s3/my-bucket --override env.EXTRA_PIP_PACKAGES=s3`
- More on configuration here and a recent update here.
n
@Marvin `prefect deployment build` is effectively deprecated; use `prefect deploy` with `prefect.yaml` instead
m
Ah, the dull ache of obsolescence. Well, here's the update:
- Use `prefect deploy` in combination with `prefect.yaml` for orchestrating deployments.
- `prefect.yaml` configures deployment settings, including infrastructure overrides.
- Custom infrastructure settings are defined as blocks, which can be reused in deployments.
- The new approach encourages version-controlled and reusable configurations.

You'll find the paradigm shift towards `prefect deploy` promotes better CI/CD practices and configuration management. The Prefect docs and change logs are valuable resources for grasping the nuances of these newer patterns.
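For the thread's GPU/CPU split, a minimal `prefect.yaml` could declare one deployment per work pool. This is only a sketch: the pool names, entrypoints, and flow names below are invented for illustration, not taken from this thread.

```yaml
# prefect.yaml (hypothetical names throughout)
deployments:
- name: gpu-inference
  entrypoint: flows/inference.py:gpu_flow
  work_pool:
    name: gpu-pool        # polled only by the worker on the GPU machine
- name: cpu-etl
  entrypoint: flows/etl.py:etl_flow
  work_pool:
    name: cpu-pool        # polled by workers on the CPU-only machines
```

Running `prefect deploy` against a file like this registers both deployments, each routed to its own pool.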
n
yeah so work pools are like pub/sub topics, and workers are like consumers. you create a deployment from a flow, point the deployment at a work pool, and workers pick up flow runs from the work pool. you would create a work pool for each unique type of infra you have and set your infra config there: maybe one for your GPU machine, one for your CPU-only machines. then you could create deployments that point at each, so you can distribute work as needed
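The routing shape Nate describes can be sketched in plain Python (no Prefect APIs here, just the pub/sub analogy): a work pool behaves like a queue, and each worker consumes only from the pool matching its hardware, so GPU runs never land on CPU-only boxes.

```python
# Plain-Python analogy of work pools and workers (not real Prefect code).
from queue import Queue

# one "work pool" per unique type of infra
pools = {"gpu-pool": Queue(), "cpu-pool": Queue()}

def submit(flow_run: str, work_pool: str) -> None:
    # a deployment points at exactly one pool; its runs land there only
    pools[work_pool].put(flow_run)

def worker_poll(work_pool: str) -> list[str]:
    # a worker polls exactly one pool, matching the machine it runs on
    runs = []
    pool = pools[work_pool]
    while not pool.empty():
        runs.append(pool.get())
    return runs

submit("ml-inference", "gpu-pool")
submit("etl-step", "cpu-pool")
```

Here `worker_poll("gpu-pool")` returns only `["ml-inference"]`, mirroring how a worker started on the GPU machine sees only GPU-pool runs.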
g
@Nate so the way that chatgpt described (with task tags and worker tags) is not possible?
n
what chatgpt gave is prefect 1
which is our legacy version; it's just different software now. similar idea, but you use work pools to represent infra and channel work there via deployments, as I've mentioned
g
in my python code, I have 10 tasks. 1 task is a GPU task. 9 tasks are CPU tasks. How do I specify that this 1 task is a GPU task?
because this work pool thing seems to be oriented around worker configuration
n
you'd create a deployment (a @flow-decorated python function that you call `.deploy(name="my-gpu-deployment", work_pool_name="my-gpu-infra", ...)` on) and then in your python code you'd trigger a run on that deployment's infra with `run_deployment("my-decorated-fn/my-gpu-deployment")`. but you'd need a work pool representing your GPU infra to send your deployment's flow runs to
g
sounds way more complicated than with task tags. was this task tag functionality removed, so that your approach is mandatory?
n
chatgpt was hallucinating; that's not real. in principle what you're asking for is possible, but it requires a small amount of engagement with our concepts
g
oh ok
n
yep - chatgpt as released today was only trained up through early 2022 afaik, and today's prefect is effectively entirely new software. the easiest way to start with deployments, if you're interested, is with `.serve`
```python
from prefect import flow, task

@task
def work(x):
    print(f"got {x}")

@flow(log_prints=True)
def collection_of_work():
    for i in range(10):
        work(i)

    # or work.map(range(10)) to do the above concurrently

# starts a long-lived process
collection_of_work.serve("my-deployment-name")
```
so you could write / serve a flow on any of your computers and then trigger it from anywhere