# ask-community
Joe
Hey folks, I quite like the Prometheus Exporter, but I want to be able to emit fine-grained details from some flow & task internals (at customer request). Does anyone have any ideas on what a "good place" to put a prometheus-client would be in an agent so flows could increment counters, etc.? Or is writing that out to a db & querying similar to how the Exporter functions likely to be the cleanest solution?
Chris
Hi Joe! In Prefect 3.x, we've added `prometheus_client` as a dependency of Prefect (both client and server), so you do have that available as an option. Running a metrics server for scraping is disabled by default, but the setting `PREFECT_CLIENT_ENABLE_METRICS` turns it on and `PREFECT_CLIENT_METRICS_PORT` sets the port for it (defaulting to 4201). With that combo enabled, your flows would start a metrics server while they're running, which Prometheus can scrape. You'll get a combination of whatever metrics we emit from Prefect itself (currently just a few, but we want to grow that over time) as well as any metrics your flows emit.
You'll need a service discovery mechanism in place for Prometheus to identify where the flows are running, but if you're on Kubernetes using the Kubernetes Prefect worker, that should be relatively easy to set up with a `PodMonitor` (https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#podmonitor)
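To make that concrete, here's a minimal sketch of the flow-side half, assuming `PREFECT_CLIENT_ENABLE_METRICS=true` is set in the flow run's environment (with the default port 4201); the counter name, labels, and task are made up for illustration:
```python
# A minimal sketch: a flow that increments a custom Prometheus counter.
# With PREFECT_CLIENT_ENABLE_METRICS=true (and optionally
# PREFECT_CLIENT_METRICS_PORT, default 4201), the flow run process serves
# its metrics at /metrics while it's running, for Prometheus to scrape.
# The counter name, labels, and task below are purely illustrative.
from prefect import flow, task
from prometheus_client import Counter

ORDERS_PROCESSED = Counter(
    "orders_processed_total",
    "Orders processed by the ingest flow",
    ["status"],
)


@task
def process_order(order_id: int) -> None:
    # ... real work happens here ...
    ORDERS_PROCESSED.labels(status="ok").inc()


@flow
def ingest_orders(order_ids: list[int]) -> None:
    for order_id in order_ids:
        process_order(order_id)


if __name__ == "__main__":
    ingest_orders([1, 2, 3])
```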
Joe
Oh, that's pretty cool! I assume that when the flow ends, the endpoint and the metrics disappear too? Or do they persist somewhere?
I'm moving to Kube for most of our baseline, but the specific client for this is all-in on Nomad, so I need to be mindful of that.
Chris
The way Prometheus works is that it reaches out to scrape the configured endpoints and warehouses the metrics. So for any particular flow process, the metrics will just be available ephemerally at an endpoint `<hostname>:4201/metrics` that Prometheus will scrape if it's configured to monitor that type of Pod.
If there's no Prometheus scraping the flows, the metrics do kind of just evaporate.
If you're moving to Kubernetes or to an environment that Prometheus has service discovery for, you should be able to set it up so that each Pod running a flow gets scraped, just with the right labeling/selectors/etc. for that kind of environment. In Kubernetes it's `PodMonitor`s matching `Pod`s by their labels. Not sure about Nomad, but I expect it's similar.
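If you want to eyeball what that endpoint exposes before wiring up service discovery, one option (just a sketch, assuming the metrics setting is enabled and the server runs in the flow run process on the default port 4201) is to have a task fetch its own process's `/metrics` endpoint and print it:
```python
# Sketch of a local sanity check: fetch this flow run's own metrics endpoint
# to see the Prometheus text exposition that a scraper would collect.
# Assumes PREFECT_CLIENT_ENABLE_METRICS=true and the default port 4201.
from urllib.request import urlopen

from prefect import flow, task


@task
def dump_local_metrics(port: int = 4201) -> str:
    with urlopen(f"http://localhost:{port}/metrics") as response:
        body = response.read().decode("utf-8")
    print(body)  # plain-text metrics, one sample per line
    return body


@flow
def metrics_smoke_test() -> None:
    dump_local_metrics()


if __name__ == "__main__":
    metrics_smoke_test()
```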
Joe
Thanks a ton, this is super helpful. I'll have to do a bit of digging but this definitely helps me make the case for investing in upgrading.
Chris
Amazing, best of luck and let us know how it goes!
Nas
It's actually great that you've decided on a native Prometheus integration. I wonder, are there any upcoming plans to add metrics around flow run states? The open-source Prometheus exporter hasn't been updated for the breaking changes in Prefect 3.0, and some important metrics aren't available any more, e.g. `CRASHED` flow runs are not reported. Adding this would be extremely valuable to me, as I'm rolling out Prefect in our company at the moment on on-premises infrastructure. Our alerting system integrates natively with Prometheus metrics.
Chris
Thanks, Nas! We haven't added metrics on very many Prefect internals yet, just a counter for the events client as a proof of concept. The general premise was mostly around enabling people to add metrics in their flows and tasks, but it definitely makes sense to have counters coming out of the engines as well. Would you be interested in sending a PR that does this? Feel free to tag me (@chrisguidry) on it.
I also understand if a PR is a big ask, so if nothing else, please send us an issue outlining how you'd use this
Nas
I'll try and do both this weekend, Chris.