Anthony Goddard
09/22/2025, 7:32 PM
Marvin
09/22/2025, 7:36 PM
```python
import time

from prefect.client.orchestration import get_client

latencies_ms = []

# get_client() wraps an httpx.AsyncClient, so event hooks must be async callables
async def on_request(request):
    # mark start time on the outgoing request
    request.extensions["start_time"] = time.perf_counter()

async def on_response(response):
    start = response.request.extensions.get("start_time")
    if start is not None:
        dur_ms = (time.perf_counter() - start) * 1000
        latencies_ms.append(dur_ms)
        # optional: log per-request
        print(f"{response.request.method} {response.request.url.path} took {dur_ms:.1f} ms")

async def main():
    async with get_client(
        httpx_settings={"event_hooks": {"request": [on_request], "response": [on_response]}}
    ) as client:
        # perform any SDK operations here; all will be timed by the hooks
        # e.g., read deployments, submit flow runs, poll states, etc.
        pass

# Run main() in your app/flow; latencies_ms will hold per-request timings
```
- If you’re already using OpenTelemetry, you can instrument httpx so each Prefect SDK call emits a span (with duration). Then export to your backend (Grafana/Tempo, Datadog, etc.).
- See OpenTelemetry HTTPX instrumentation
Synthetic probes (works everywhere)
- Use a Prometheus blackbox exporter probe against a simple endpoint (e.g., your API base or health endpoint) to get probe_duration_seconds histograms without touching Prefect:
- Prometheus blackbox exporter
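As a sketch of the blackbox-exporter approach, a minimal HTTP probe module plus the matching Prometheus scrape job could look like this (the module name, target URL, and exporter address are placeholders to adapt to your environment):

```yaml
# blackbox.yml (exporter config)
modules:
  http_2xx:
    prober: http
    timeout: 5s

# prometheus.yml (scrape config excerpt)
scrape_configs:
  - job_name: prefect-api-probe
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://your-prefect-api.example.com/api/health
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115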
- Or run a tiny scheduled “synthetic” Prefect flow that pings the API and reports the duration to your observability stack or logs:
```python
import os
import time

import httpx
from prefect import flow

@flow
def api_probe():
    base = os.environ.get("PREFECT_API_URL")  # or your explicit URL
    if not base:
        raise ValueError("PREFECT_API_URL is not set")
    url = f"{base.rstrip('/')}/api/health"  # example endpoint; adjust as needed
    start = time.perf_counter()
    r = httpx.get(url, timeout=5.0)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"API {url} -> {r.status_code} in {elapsed_ms:.1f} ms")
```
Self‑hosted server‑side metrics
- There isn’t a documented, built‑in /metrics endpoint in the Prefect 3 API server. Common patterns:
- Ingress/Proxy metrics: Put an ingress (NGINX, Envoy) or API gateway in front of the Prefect API and scrape its request duration metrics via Prometheus. In Kubernetes, the NGINX Ingress Controller exposes Prometheus metrics; enable a ServiceMonitor if you’re using kube‑prometheus‑stack.
- Blackbox exporter (above): zero‑touch and reliable for SLI latency from your cluster.
- If you control the ASGI app and want first‑class Prometheus on the API process, wrap the FastAPI app with an instrumentator such as:
- prometheus-fastapi-instrumentator
Note: this requires customizing the API container/app; Prefect does not ship this out of the box.
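For the ingress/proxy option above, once the NGINX Ingress Controller's metrics are being scraped, a p95 latency SLI could be derived with a PromQL query along these lines (the metric and label names are the controller's defaults; the ingress name `prefect-api` is a placeholder):

```promql
histogram_quantile(
  0.95,
  sum by (le) (
    rate(nginx_ingress_controller_request_duration_seconds_bucket{ingress="prefect-api"}[5m])
  )
)
```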
OpenTelemetry end‑to‑end (advanced)
- You can instrument both client (httpx) and server (FastAPI) with OpenTelemetry, export to an OTEL collector, and derive latency metrics from spans (e.g., spanmetrics processor). This gives trace+metric linkage but requires an OTEL stack.
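As a sketch of the spans-to-metrics step, an OpenTelemetry Collector config using the spanmetrics connector might look like this (the endpoints are placeholders; pipeline details depend on your collector version and backend):

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}
connectors:
  spanmetrics: {}
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
```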
What about Prefect Cloud org‑level API latency?
Marvin
09/22/2025, 7:36 PM
Nate
09/22/2025, 7:37 PM
Anthony Goddard
09/22/2025, 7:37 PM