<@ULVA73B9P> if i want to deploy a code deployment...
# ask-marvin
a
@Marvin if i want to deploy a code deployment which uses serve(), which port should be exposed and how should the Dockerfile look?
m
thought for 361.0 seconds
Short answer: with Prefect 3.x, `serve()` does not open any inbound network ports. It polls the Prefect API and executes runs; there's nothing to "expose" for Prefect itself. Expose a port only if your flow spins up its own web server (e.g., FastAPI for webhooks) or you intentionally want to expose a self-hosted Prefect Server/UI from inside the same container.

Useful docs:
- Serve flows in Docker
- Blog: Schedule your code quickly with flow.serve
- Settings reference

What to expose
- Using `flow.serve()` or `prefect flow serve`: expose nothing by default.
- If your flow starts its own HTTP server, expose that port (your app's port).
- If you're running a self-hosted Prefect Server in this container and want outside access to it, expose the server's ports. Otherwise, don't.

Two common Docker approaches

A) CLI-driven serve (recommended for containers)

Dockerfile:
FROM python:3.11-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1

# Optional: create non-root user
RUN useradd -m -u 10001 appuser

WORKDIR /app

# Install deps
# If you have a requirements.txt with your flow deps, use that:
# COPY requirements.txt .
# RUN pip install --no-cache-dir -r requirements.txt
# Or just install prefect + your packages directly:
RUN pip install --no-cache-dir prefect

# Copy your project
COPY . .

USER appuser

# Note: set PREFECT_API_URL and PREFECT_API_KEY at runtime via env vars
# No EXPOSE needed unless YOUR app listens on a port

# Serve the flow via entrypoint (update the path:name)
CMD ["prefect", "flow", "serve", "flows/my_flow.py:my_flow", "--name", "my-deployment"]
Run it (Cloud example):
docker run --rm \
  -e PREFECT_API_URL="https://api.prefect.cloud/api/accounts/<acct>/workspaces/<ws>" \
  -e PREFECT_API_KEY="<your-api-key>" \
  your-image:tag
Run it (self-hosted API example):
docker run --rm \
  -e PREFECT_API_URL="http://prefect-server:your-port/api" \
  --network your_docker_network \
  your-image:tag
Replace the API URL/port with your actual server. If you're using Docker Compose, use the service name and port of your Prefect Server service.

B) Python-driven serve (if you prefer code over CLI)

main.py:
from prefect import flow

@flow
def my_flow():
    return "hello"

if __name__ == "__main__":
    # keep this process running to poll for scheduled work
    my_flow.serve(name="my-deployment")
Dockerfile:
FROM python:3.11-slim
ENV PYTHONDONTWRITEBYTECODE=1 PYTHONUNBUFFERED=1
WORKDIR /app
RUN pip install --no-cache-dir prefect
COPY . .
CMD ["python", "main.py"]
Environment variables to care about
- PREFECT_API_URL: where this process should talk to (Prefect Cloud or your self-hosted Prefect Server)
- PREFECT_API_KEY: required if using Cloud

Pass them at runtime; don't bake secrets into the image. See settings.

What's the difference between flow.serve, prefect flow serve, and deploy?
- `flow.serve(...)` (in code) and `prefect flow serve <entrypoint> -n <name>` (CLI) do the same thing conceptually: create a "runner-style" deployment and keep a process running that polls for and executes runs. No worker required. No port exposed.
- `prefect deploy` and `flow.from_source(...).deploy(...)` create "worker-style" deployments that are picked up by a Prefect worker via a work pool. That pattern doesn't keep your container running by itself; you run a worker separately (a minimal sketch of this pattern follows at the end of this message).

Also, in Prefect 3.x, do not use the removed 2.x methods (`Deployment.build_from_flow()` or `prefect deployment build`).

When would you expose a port?
- If your flow provides an HTTP endpoint (e.g., webhook ingestion), expose that port:
  - Add `EXPOSE 8080` (or your port)
  - Run your server and `serve()` in the same process or via a supervisor
  - Publish that port with `-p 8080:8080`
- If you want external access to a self-hosted Prefect Server/UI running in the container, expose those server ports. If you're not sure which ones, share how you're running Prefect Server and I'll be delightfully specific.

Do you plan to connect to Prefect Cloud or a self-hosted Server? If self-hosted, what's the API URL and where does it run (host or another container)? I can tailor the exact Dockerfile/CMD and any port considerations based on that.
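The worker-style sketch mentioned above (the repo URL, entrypoint, and pool name are placeholders, not verified against your setup):

from prefect import flow

if __name__ == "__main__":
    # Register a worker-style deployment; a separate
    # `prefect worker start --pool my-pool` process picks up its runs
    flow.from_source(
        source="https://github.com/your-org/your-repo",  # placeholder repo
        entrypoint="flows/my_flow.py:my_flow",           # placeholder path:function
    ).deploy(
        name="my-deployment",
        work_pool_name="my-pool",  # placeholder work pool
    )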
a
@Marvin yeah so i have done the above and it seems like the deployment fails on gcp cause it expects an exposed port but there is none, what should be done for this
m
thought for 105.7 seconds
Sounds like you're deploying to a GCP service that insists on an HTTP listener (most likely Cloud Run). Prefect's `serve()` doesn't open a port; it just polls the Prefect API. That's great for VMs or Kubernetes, but Cloud Run wants an HTTP server listening on $PORT. Without one, it kills the container.

You've got two options:

Option 1 (Cloud Run workaround): add a tiny HTTP server
- Start a trivial HTTP server that listens on `0.0.0.0:$PORT` alongside `flow.serve()`. This satisfies Cloud Run's requirement.
- Set Cloud Run to keep at least one instance warm and CPU always allocated, or your poller will scale to zero and stop serving schedules.

Example main.py:
import os
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from prefect import flow

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Respond OK for any path (health checks, etc.)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # quiet logs

def run_http_server():
    port = int(os.environ.get("PORT", "8080"))
    server = HTTPServer(("0.0.0.0", port), Handler)
    server.serve_forever()

@flow
def my_flow():
    return "hello"

if __name__ == "__main__":
    threading.Thread(target=run_http_server, daemon=True).start()
    my_flow.serve(name="my-deployment")
Dockerfile:
FROM python:3.11-slim
ENV PYTHONDONTWRITEBYTECODE=1 PYTHONUNBUFFERED=1
WORKDIR /app
RUN pip install --no-cache-dir prefect
COPY . .
# Optional, for humans; Cloud Run uses $PORT envvar regardless
EXPOSE 8080
CMD ["python", "main.py"]
Cloud Run settings:
- Min instances: 1 (keeps the poller alive)
- CPU: always allocated
- Memory/timeout: bump as needed for your flow
- Concurrency: consider 1 if this is a single long-lived process

Trade-offs:
- You're paying for an always-on instance.
- It's not the most idiomatic use of Cloud Run (it's request-driven; `serve()` is a background poller).

Option 2 (Recommended): run `serve()` on something designed for daemons
- GKE: Deploy a Kubernetes Deployment that runs `prefect flow serve`. No Service/ports needed.
- GCE VM or MIG: Run the container or a systemd service with `prefect flow serve`.
- Or switch to worker-style deployments (`prefect deploy` or `flow.from_source(...).deploy(...)`) and run a Prefect worker on GKE/GCE. Workers poll too, so they need a long-lived environment just like `serve()`.

Helpful Prefect docs
- Serve flows in Docker
- Blog: flow.serve quickstart
- Settings: PREFECT_API_URL, PREFECT_API_KEY, etc.

Quick CLI reference (verified):
- `prefect flow serve ./path/to/file.py:flow_name -n my-deployment` runs the same pattern from the CLI.

If this isn't Cloud Run, which GCP service are you using (GKE, GCE, Cloud Run, Cloud Run Jobs)? And could you share the exact error message you're seeing? I can tailor the exact container config, port, and platform tweaks so your deployment stops sulking.
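P.S. if you end up serving several flows from one long-lived container (GKE/GCE), a minimal sketch of a single entrypoint using the top-level serve() helper; flow and deployment names here are placeholders:

from prefect import flow, serve

@flow
def extract():
    return "extracted"

@flow
def report():
    return "reported"

if __name__ == "__main__":
    # One long-lived process that polls the Prefect API and executes
    # scheduled runs for both deployments
    serve(
        extract.to_deployment(name="extract-nightly"),
        report.to_deployment(name="report-weekly"),
    )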
a
@Marvin what would the exposable serverless way be
m
thought for 184.5 seconds
If you need an HTTP-exposed, serverless entrypoint on GCP, don't run the poller (`serve()`) there. Use an HTTP handler that triggers a Prefect deployment on demand, then let a worker actually run the flow elsewhere.

Pattern
- Cloud Run or Cloud Functions hosts a tiny HTTP endpoint.
- The handler calls Prefect to create a flow run (using `prefect.deployments.run_deployment` or a direct API call).
- A Prefect worker (somewhere) picks up the run and executes it. That worker can be:
  - GKE/GCE (nice and boring), or
  - Another Cloud Run service kept warm with min-instances=1 and a tiny HTTP server (to satisfy Cloud Run's port requirement). Serverless-ish, but not scale-to-zero for the worker.

Minimal HTTP trigger (Cloud Run)

app.py:
from fastapi import FastAPI, Request
from prefect import get_client
from prefect.deployments import run_deployment

app = FastAPI()

@app.get("/healthz")
def health():
    return {"status": "ok"}

@app.post("/run")
async def trigger(request: Request):
    body = await request.json()
    # Expect "deployment" in the form "flow-name/deployment-name"
    deployment = body.get("deployment")
    params = body.get("parameters") or {}
    flow_run_name = body.get("flow_run_name")

    async with get_client() as client:
        fr = await run_deployment(
            name=deployment,
            client=client,
            parameters=params,
            flow_run_name=flow_run_name,
            timeout=0,  # return once the run is created instead of waiting for it to finish
        )
    # Return quickly; the worker will perform the run
    return {"flow_run_id": str(fr.id), "state": fr.state.type.value}
Dockerfile:
FROM python:3.11-slim
ENV PYTHONDONTWRITEBYTECODE=1 PYTHONUNBUFFERED=1
WORKDIR /app
RUN pip install --no-cache-dir prefect fastapi uvicorn
COPY . .
# Cloud Run expects your app to listen on $PORT
ENV PORT=8080
EXPOSE 8080
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080"]
Deploy to Cloud Run and set:
- PREFECT_API_URL (Cloud or your self-hosted URL)
- PREFECT_API_KEY (if using Cloud)

Then POST to your endpoint:
curl -X POST https://<your-cloud-run-url>/run \
  -H "Content-Type: application/json" \
  -d '{"deployment": "my-flow/my-deployment", "parameters": {"x": 42}}'
Start a worker (somewhere that can stay up)
- Example command (verified): `prefect worker start --pool <your-work-pool>`
- Run this on GKE, GCE, or in a separate Cloud Run service with:
  - a tiny HTTP server on $PORT (Cloud Run demands it), and
  - min-instances=1 so the worker doesn't scale to zero and miss runs.

Why this works
- Cloud Run/Functions are great at handling an HTTP request and returning quickly. They are not great at "polling a queue forever" (which is what `serve()` does).
- Using an HTTP trigger to enqueue runs keeps your entrypoint serverless and exposable.
- The worker handles the long-running part, as nature intended.

Related docs
- Serve flows in Docker (for background on serve; you won't use it in Cloud Run)
- Blog: flow.serve (again, useful context)
- Settings reference (PREFECT_API_URL, PREFECT_API_KEY)

If you want, I can sketch the minimal "worker-on-Cloud-Run" container (HTTP health endpoint + `prefect worker start`) as well. Or are you planning to host the worker on GKE/GCE?
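For reference, a minimal sketch of that worker-on-Cloud-Run entrypoint (the pool name is a placeholder; if your Prefect version supports it, `prefect worker start --with-healthcheck` can serve the probe itself, so check `prefect worker start --help`):

import os
import subprocess
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # answer Cloud Run's startup/health probes on any path
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # keep probe noise out of the logs

def run_http_server():
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()

if __name__ == "__main__":
    threading.Thread(target=run_http_server, daemon=True).start()
    # long-lived worker that polls the work pool; "my-pool" is a placeholder
    subprocess.run(["prefect", "worker", "start", "--pool", "my-pool"], check=True)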
n
hmm marvin might be missing the port for the health check here? didn't read too closely
although, reading a bit closer i think i would also expect you don't need to manually configure any port stuff to throw .serve in a container
anything interesting in the GCP logs that might hold a clue?
a
@Nate gcp is not promoting the revision cause there is no service listening on any port: Default STARTUP TCP probe failed 1 time consecutively for container "placeholder-1" on port 4200. The instance was not started. Connection failed with status DEADLINE_EXCEEDED.
n
4200 is the default port for the Prefect server though, not the health check webserver
why is it looking for something on 4200?
a
cause thats what i specified in the deployment config
gcp run expects a "container port"
i could use a dummy service to listen to an arbitrary port but i want to know whats the correct way to deploy prefect workers
i would not want to use github to fetch the code, is there a way to package the workflow with the worker itself?
i would like to use serve cause it creates both worker pool and the agent to listen to job queue
n
is there a way to package the workflow with the worker itself?
this is basically what serve does.
i would like to use serve cause it creates both worker pool and the agent to listen to job queue
yea exactly if it wants a health check port for a probe i would think it should be 8080, lemme double check that's the default
a
i could be missing something here
n
> i would like to use serve cause it creates both worker pool and the agent to listen to job queue
yea exactly
well, technically there's no work pool here, but yea it bundles the listener process
a
does serve poll the prefect endpoint as marvin explained? do i always need to have the server up
whats a serverless way to deploy a worker
n
yes, like workers do, serve polls the prefect api for scheduled runs. its basically like a process worker bundled with a set of flows
• listen for any scheduled runs
• submit them as subprocesses when you find some
whats a serverless way to deploy a worker
generally you take your favorite static container host (docker, compose, k8s, cloud run etc) and you run `prefect worker start --pool my-pool` as the entrypoint of that container, and then the worker has a health check port that you'd use as the startup probe
but, you don't need to run a worker separately if you have a static set of flows and you just want to use the `serve` process as the entrypoint instead, like that first guide marvin kept linking
workers are just a way to decouple the deployment objects from the listening daemon (which is especially useful when your flow runs should run on per-flow-run infra, like a container per flow run on ecs or k8s etc)
and just to clarify
do i always need to have the server up
i assume you mean "worker" not "server" here. the server is either `prefect server start` or Prefect Cloud, that's the prefect server
the worker is a client-side daemon that asks the server for work and submits it to the exec env
a
okay makes sense, i was under the assumption that `prefect worker start --pool my-pool` actually creates an endpoint and the prefect server / agent (i think this is deprecated now) checks for jobs and hits the worker
i assume you mean "worker" not "server" here
yes i meant the worker
so i should be good with using serve for my usecase, but need to figure out which port is the healthcheck port
n
i think it is just `webserver=True` as a kwarg for `.serve` and then in GCP you do

livenessProbe:
    httpGet:
      path: /health
      port: 8080

i should mention that in the guide we have on `.serve` in containers
these docs should clarify this for the future
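putting that together, a minimal sketch (flow/deployment names are placeholders):

from prefect import flow

@flow
def my_flow():
    return "hello"

if __name__ == "__main__":
    # webserver=True starts a small health webserver next to the poller
    # (default port 8080), which gives the GCP probe something to hit
    my_flow.serve(name="my-deployment", webserver=True)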
a
this worked perfectly, thanks a lot for the help!!
n
catjam
a
sorry to bother you again, new problem, runs are getting submitted but they're not running:
19:38:29.530 | INFO | prefect.flow_runs.runner - Runner 'runner-69370865-174c-44b4-9de8-b726d1864292' submitting flow run '06896522-a585-7046-8000-f53e5aa404d1'

2025-08-09 01:08:31.730 IST
19:38:31.429 | INFO | prefect.flow_runs.runner - Runner 'runner-69370865-174c-44b4-9de8-b726d1864292' submitting flow run '06896522-bfc1-7316-8000-79c5d5975a63'

2025-08-09 01:08:42.633 IST
19:38:42.430 | INFO | prefect.flow_runs.runner - Opening process...

2025-08-09 01:09:11.429 IST
19:39:11.230 | INFO | prefect.flow_runs.runner - Completed submission of flow run '06896522-a585-7046-8000-f53e5aa404d1'

2025-08-09 01:09:23.329 IST
19:39:23.130 | INFO | prefect.flow_runs.runner - Runner 'runner-69370865-174c-44b4-9de8-b726d1864292' submitting flow run '06896523-94fe-7224-8000-5ba1c1af5b32'

2025-08-09 01:09:31.129 IST
19:39:29.630 | INFO | prefect.flow_runs.runner - Opening process...

2025-08-09 01:10:21.248 IST
GET404 122 B 10 ms Safari 18  https://armor-prefect-worker-675776943808.northamerica-northeast1.run.app/

2025-08-09 01:10:21.252 IST
19:40:19.430 | INFO | prefect.flow_runs.runner - Opening process...

2025-08-09 01:10:21.360 IST
19:40:21.360 | INFO | prefect.flow_runs.runner - Completed submission of flow run '06896522-bfc1-7316-8000-79c5d5975a63'

2025-08-09 01:10:21.361 IST
19:40:21.361 | INFO | prefect.flow_runs.runner - Completed submission of flow run '06896523-94fe-7224-8000-5ba1c1af5b32'

2025-08-09 01:10:26.638 IST
GET200 1.02 KB 5 ms Safari 18  https://armor-prefect-worker-675776943808.northamerica-northeast1.run.app/docs

2025-08-09 01:10:27.413 IST
GET200 4.32 KB 11 ms Safari 18  https://armor-prefect-worker-675776943808.northamerica-northeast1.run.app/openapi.json

2025-08-09 01:10:42.174 IST
GET200 100 B 5 ms Safari 18  https://armor-prefect-worker-675776943808.northamerica-northeast1.run.app/run_count

2025-08-09 01:11:54.533 IST
19:41:53.730 | INFO | prefect.flow_runs.runner - Runner 'runner-69370865-174c-44b4-9de8-b726d1864292' submitting flow run '0689652e-ff24-7684-8000-d3130fcc58ed'

2025-08-09 01:12:18.129 IST
19:42:17.329 | INFO | prefect.flow_runs.runner - Opening process...

2025-08-09 01:13:52.229 IST
19:43:50.830 | INFO | prefect.flow_runs.runner - Completed submission of flow run '0689652e-ff24-7684-8000-d3130fcc58ed'
something like above, the prefect server says the flow is in pending state
n
have you set the `limit` kwarg or are you otherwise using any deployment-level concurrency limits?
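for reference, that kwarg in a sketch (names are placeholders):

from prefect import flow

@flow
def my_flow():
    return "hello"

if __name__ == "__main__":
    # limit caps how many runs this serve process executes at once;
    # runs beyond the limit wait for a free slot
    my_flow.serve(name="my-deployment", limit=10)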
a
yes have set it to 10
n
you probably have old "zombie" runs in Running or Pending from a while ago that are hogging your slots, you might need to clear them out
a
hmm, cleared everything, still same issue, goes to pending very quickly, sometimes it becomes Late and stays in that state
didn't have this problem when i was running the container locally