ESmith
07/02/2025, 1:39 PM
PREFECT_CLIENT_ENABLE_METRICS
Or is it the same as setting
```yaml
- name: PREFECT_SERVER_METRICS_ENABLED
  value: "true"
- name: PREFECT_API_ENABLE_METRICS
  value: "true"
```
On the server?
Marvin
07/02/2025, 1:40 PM
1. Server-side metrics:
- Setting PREFECT_SERVER_METRICS_ENABLED=true (or PREFECT_API_ENABLE_METRICS=true) on your server deployment is sufficient. These variables are effectively synonymous.
- This enables the Prometheus metrics endpoint on the Prefect server.
2. Client-side metrics:
- PREFECT_CLIENT_ENABLE_METRICS is a separate configuration that controls metrics reporting from Prefect clients (where your flows run).
- If you want to collect metrics from your flow runs, you should set this in your flow deployment environment or worker environment.
So in practice:
- Set PREFECT_SERVER_METRICS_ENABLED=true (or PREFECT_API_ENABLE_METRICS=true) in your server deployment
- Set PREFECT_CLIENT_ENABLE_METRICS=true in your worker/agent deployments if you want client-side metrics
They serve different purposes and aren't interchangeable - one enables metrics collection on the server side, the other on the client side. For complete metrics coverage, you'd want both enabled in their respective components.
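As a quick sanity check on the server side, a minimal sketch like this (assuming a self-hosted server reachable on the default port 4200; adjust the URL for your cluster) should return Prometheus-format text once the server variable is set:
```python
# Hedged sketch: verify the server-side Prometheus endpoint after setting
# PREFECT_SERVER_METRICS_ENABLED=true. The URL assumes the default self-hosted
# server port (4200); point it at your in-cluster service instead if needed.
import httpx

resp = httpx.get("http://127.0.0.1:4200/api/metrics", timeout=10)
resp.raise_for_status()
print(resp.text[:500])  # Prometheus text exposition format
```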
You can find more details about these settings in the Prefect settings reference documentation.
ESmith
07/02/2025, 1:41 PM
> Set PREFECT_CLIENT_ENABLE_METRICS=true in your worker/agent deployments if you want client-side metrics
Would this be on the prefect-worker pods?
Marvin
07/02/2025, 1:41 PM
PREFECT_CLIENT_ENABLE_METRICS should actually be set on the flow run pods, not the worker pods. Here's why:
- The worker pods are responsible for orchestration and scheduling
- The flow run pods are where your actual flows execute and where the client metrics are generated
- Client metrics are collected during flow/task execution
In your Kubernetes setup, you would need to:
1. Set this in your flow deployment configuration where you define the environment variables for the flow run pods
2. This could be done through your deployment infrastructure configuration or through the Kubernetes infrastructure block settings
This makes more sense architecturally because the metrics are generated during the actual execution of flows and tasks, which happens in the flow run pods, not in the worker pods that manage them.
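For illustration, a minimal sketch of one way to do this from Python (assuming a Kubernetes work pool and a Prefect version with flow.deploy(); the pool name is a placeholder):
```python
# Hypothetical sketch: pass the client-metrics variables to the *flow run* pods
# via job variables on a Kubernetes work pool. Pool name and image are placeholders.
from prefect import flow


@flow(log_prints=True)
def my_flow():
    print("hello from the flow run pod")


if __name__ == "__main__":
    my_flow.deploy(
        name="client-metrics-example",
        work_pool_name="my-k8s-pool",  # placeholder work pool
        image="xxxx.dkr.ecr.eu-west-2.amazonaws.com/xxx-ci85",
        build=False,  # assume the image is already built and pushed in CI
        push=False,
        job_variables={
            "env": {
                "PREFECT_CLIENT_ENABLE_METRICS": "true",
                "PREFECT_CLIENT_METRICS_PORT": "4201",
            }
        },
    )
```
The equivalent env block can also live in prefect.yaml or in the work pool's base job template.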
Would you like me to provide an example of how to configure this in your deployment setup?
ESmith
07/02/2025, 1:42 PM
```json
{
  "env": {
    "PREFECT_CLIENT_METRICS_PORT": "4201",
    "PREFECT_CLIENT_ENABLE_METRICS": "true",
    "PREFECT_CLIENT_METRICS_ENABLED": "true"
  },
  "image": "xxxx.dkr.ecr.eu-west-2.amazonaws.com/xxx-ci85",
  "labels": {
    "k8s.grafana.com/scrape": "true",
    "k8s.grafana.com/metrics.portNumber": "4201",
    "k8s.grafana.com/metrics.scrapeInterval": "5s"
  },
  "cpu_request": "100m",
  "mem_request": "512Mi",
  "service_account_name": "xxxx"
}
```
Marvin
07/02/2025, 1:43 PM
1. Environment Variables:
- ✅ Metrics are enabled and the metrics port is set
- One of the two "enable" variables is redundant (you only need PREFECT_CLIENT_ENABLE_METRICS)
2. Grafana Labels:
- ✅ The Kubernetes labels for Grafana scraping are correctly configured
- ✅ The port number (4201) matches your metrics port
- ✅ Scrape interval of 5s is reasonable
Here's a cleaned-up version of your configuration (removing the redundant variable):
```json
{
  "env": {
    "PREFECT_CLIENT_METRICS_PORT": "4201",
    "PREFECT_CLIENT_ENABLE_METRICS": "true"
  },
  "image": "xxxx.dkr.ecr.eu-west-2.amazonaws.com/xxx-ci85",
  "labels": {
    "k8s.grafana.com/scrape": "true",
    "k8s.grafana.com/metrics.portNumber": "4201",
    "k8s.grafana.com/metrics.scrapeInterval": "5s"
  },
  "cpu_request": "100m",
  "mem_request": "512Mi",
  "service_account_name": "xxxx"
}
```
This configuration will:
1. Enable metrics collection for your flow runs
2. Expose metrics on port 4201
3. Allow Grafana to scrape these metrics every 5 seconds
The rest of your configuration (CPU, memory, service account) looks fine and doesn't impact the metrics setup.
ESmith
07/02/2025, 1:44 PM
Marvin
07/02/2025, 1:45 PM
1. Client metrics implementation:
- The client-side metrics are exposed using the prometheus_client library
- There are specific counters for:
- Event emissions and subscriptions
- Client-side operations
- These metrics are separate from the server metrics
2. Collection Setup:
- Yes, you'll need to configure your Prometheus to scrape these metrics separately from the server metrics
- The client metrics are exposed on a different endpoint (localhost:4201 in your case) compared to the server metrics (/api/metrics)
- Your Grafana labels look correct for scraping, but you need to ensure your Prometheus configuration can reach these endpoints
3. When metrics are exposed:
The client metrics endpoint becomes available during:
- Flow runs (when entering the flow run context)
- When serving flows via serve(...)
- When serving tasks via serve(...)
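So, for example, a long-running served flow is one easy way to keep a metrics endpoint up for Grafana to scrape. A sketch (the flow and deployment names are made up, and the env vars must be set in the serving process's environment):
```python
# Sketch: while this process is serving, the client metrics endpoint should be live,
# assuming PREFECT_CLIENT_ENABLE_METRICS=true and PREFECT_CLIENT_METRICS_PORT=4201
# are set in this process's environment. Names are illustrative.
from prefect import flow


@flow
def heartbeat() -> str:
    return "ok"


if __name__ == "__main__":
    heartbeat.serve(name="metrics-smoke-test", interval=60)  # run every 60 seconds
```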
To troubleshoot why you're not seeing metrics:
1. First, verify the metrics endpoint is actually available:
```bash
curl http://localhost:4201/metrics
```
(Replace localhost with the appropriate pod IP if needed)
2. Check if Prometheus is actually scraping the endpoint:
- Look at your Prometheus targets page
- Check for any scraping errors in Prometheus logs
3. Make sure your network policies allow:
- Prometheus to reach the flow run pods on port 4201
- The metrics port to be accessible from outside the pod
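If it helps, a throwaway flow like this (names are illustrative) can be run on the work pool to report, from inside the flow run pod itself, whether the variables are present and whether the endpoint responds:
```python
# Hypothetical diagnostic flow: log the metrics-related env vars the flow run pod
# actually sees, and try to hit the metrics endpoint from inside the pod.
import os

import httpx
from prefect import flow, get_run_logger


@flow
def check_client_metrics():
    logger = get_run_logger()
    enabled = os.environ.get("PREFECT_CLIENT_ENABLE_METRICS")
    port = os.environ.get("PREFECT_CLIENT_METRICS_PORT", "4201")
    logger.info("PREFECT_CLIENT_ENABLE_METRICS=%s PREFECT_CLIENT_METRICS_PORT=%s", enabled, port)
    try:
        resp = httpx.get(f"http://127.0.0.1:{port}/metrics", timeout=5)
        logger.info("metrics endpoint returned %s (%d bytes)", resp.status_code, len(resp.text))
    except Exception as exc:
        logger.warning("could not reach metrics endpoint: %s", exc)


if __name__ == "__main__":
    check_client_metrics()
```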
Would you like me to help you verify any of these specific aspects? Also, could you share what you see when you try to access the metrics endpoint directly?
ESmith
07/02/2025, 1:46 PM
Marvin
07/02/2025, 1:48 PM
1. Default Python metrics from the prometheus_client library, which typically include:
- process_cpu_seconds_total
- process_open_fds
- process_resident_memory_bytes
- python_gc_objects_collected
- python_info
2. Prefect-specific metrics related to:
- Event emissions and subscriptions
- Client-side operations
However, I cannot find the exact Prefect-specific metric names in the documentation or code. The best way to see what metrics are available would be to:
1. Access the metrics endpoint directly on a running flow:
```bash
curl http://localhost:4201/metrics
```
This will show you all available metrics.
2. Or check your Prometheus targets that are successfully scraping these endpoints.
Would you like me to help you construct a command to check the metrics endpoint, or would you like me to do more research to find the specific Prefect metric names?
ESmith
07/02/2025, 1:48 PM
Marvin
07/02/2025, 1:49 PM
At the moment, when you enable PREFECT_CLIENT_ENABLE_METRICS, you're only getting the default Python metrics from the prometheus_client library. This includes basic process metrics like:
- process_cpu_seconds_total
- process_open_fds
- process_resident_memory_bytes
- python_gc_objects_collected
- python_info
The Prefect-specific metrics appear to be still in development. Looking at the GitHub history, while the infrastructure for client metrics is in place (the ability to expose metrics endpoints), the actual Prefect-specific metrics implementation is still evolving.
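For reference, those default families come straight from prometheus_client's default registry, so a quick way to see the baseline on your image (roughly what the client endpoint serves when no Prefect-specific collectors have registered anything) is:
```python
# Sketch: dump prometheus_client's default registry. Expect process_* metrics
# (on Linux), python_gc_* and python_info -- the same families listed above.
from prometheus_client import REGISTRY, generate_latest

print(generate_latest(REGISTRY).decode())
```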
If you're looking to monitor Prefect operations specifically, for now you should:
1. Continue using the Prometheus exporter for the Prefect server, which provides more Prefect-specific metrics
2. Use the server's /api/metrics endpoint, which has more comprehensive Prefect-related metrics
Would you like me to provide information about what metrics are available through the server's metrics endpoint instead? Those tend to be more comprehensive for monitoring Prefect operations.