Anthony Goddard
09/22/2025, 7:32 PM
Marvin
09/22/2025, 7:36 PM
```python
import time

from prefect.client.orchestration import get_client

latencies_ms = []

# get_client() wraps an httpx.AsyncClient, so event hooks must be async callables
async def on_request(request):
    # mark start time on the outgoing request
    request.extensions["start_time"] = time.perf_counter()

async def on_response(response):
    start = response.request.extensions.get("start_time")
    if start is not None:
        dur_ms = (time.perf_counter() - start) * 1000
        latencies_ms.append(dur_ms)
        # optional: log per-request
        print(f"{response.request.method} {response.request.url.path} took {dur_ms:.1f} ms")

async def main():
    async with get_client(
        httpx_settings={"event_hooks": {"request": [on_request], "response": [on_response]}}
    ) as client:
        # perform any SDK operations here; all will be timed by the hooks
        # e.g., read deployments, submit flow runs, poll states, etc.
        pass

# Run main() in your app/flow; latencies_ms will hold per-request timings
```
- If you’re already using OpenTelemetry, you can instrument httpx so each Prefect SDK call emits a span (with duration). Then export to your backend (Grafana/Tempo, Datadog, etc.).
- See OpenTelemetry HTTPX instrumentation
Synthetic probes (works everywhere)
- Use a Prometheus blackbox exporter probe against a simple endpoint (e.g., your API base or health endpoint) to get probe_duration_seconds histograms without touching Prefect:
- Prometheus blackbox exporter
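As a sketch of the blackbox-exporter approach, a minimal HTTP probe module plus the matching Prometheus scrape job could look like this (the module name, target URL, and exporter address are placeholders to adapt to your environment):

```yaml
# blackbox.yml (exporter config)
modules:
  http_2xx:
    prober: http
    timeout: 5s

# prometheus.yml (scrape config excerpt)
scrape_configs:
  - job_name: prefect-api-probe
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://your-prefect-api.example.com/api/health
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115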
- Or run a tiny scheduled “synthetic” Prefect flow that pings the API and reports the duration to your observability stack or logs:
```python
import os
import time

import httpx
from prefect import flow

@flow
def api_probe():
    base = os.environ.get("PREFECT_API_URL")  # or your explicit URL
    if not base:
        raise ValueError("PREFECT_API_URL is not set")
    url = f"{base.rstrip('/')}/api/health"  # example endpoint; adjust as needed
    start = time.perf_counter()
    r = httpx.get(url, timeout=5.0)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"API {url} -> {r.status_code} in {elapsed_ms:.1f} ms")
```
Self‑hosted server‑side metrics
- There isn’t a documented, built‑in /metrics endpoint in the Prefect 3 API server. Common patterns:
- Ingress/Proxy metrics: Put an ingress (NGINX, Envoy) or API gateway in front of the Prefect API and scrape its request duration metrics via Prometheus. In Kubernetes, the NGINX Ingress Controller exposes Prometheus metrics; enable a ServiceMonitor if you’re using kube‑prometheus‑stack.
- Blackbox exporter (above): zero‑touch and reliable for SLI latency from your cluster.
- If you control the ASGI app and want first‑class Prometheus on the API process, wrap the FastAPI app with an instrumentator such as:
- prometheus-fastapi-instrumentator
Note: this requires customizing the API container/app; Prefect does not ship this out of the box.
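For the ingress/proxy option above, once the NGINX Ingress Controller's metrics are being scraped, a p95 latency SLI could be derived with a PromQL query along these lines (the metric and label names are the controller's defaults; the ingress name `prefect-api` is a placeholder):

```promql
histogram_quantile(
  0.95,
  sum by (le) (
    rate(nginx_ingress_controller_request_duration_seconds_bucket{ingress="prefect-api"}[5m])
  )
)
```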
OpenTelemetry end‑to‑end (advanced)
- You can instrument both client (httpx) and server (FastAPI) with OpenTelemetry, export to an OTEL collector, and derive latency metrics from spans (e.g., spanmetrics processor). This gives trace+metric linkage but requires an OTEL stack.
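As a sketch of the spans-to-metrics step, an OpenTelemetry Collector config using the spanmetrics connector might look like this (the endpoints are placeholders; pipeline details depend on your collector version and backend):

```yaml
receivers:
  otlp:
    protocols:
      grpc: {}
connectors:
  spanmetrics: {}
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
```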
What about Prefect Cloud org‑level API latency?
Marvin
09/22/2025, 7:36 PM
Nate
09/22/2025, 7:37 PM
Anthony Goddard
09/22/2025, 7:37 PM