# ask-community
Joe
Hey folks, I quite like the Prometheus Exporter, but I want to be able to emit fine-grained details from some flow & task internals (at customer request). Does anyone have any ideas on what a "good place" to put a prometheus-client would be in an agent so flows could increment counters, etc.? Or is writing that out to a db & querying similar to how the Exporter functions likely to be the cleanest solution?
Chris
Hi Joe! In Prefect 3.x, we've added `prometheus_client` as a dependency of Prefect (both client and server), so you do have that available as an option. Running a metrics server for scraping is disabled by default, but the setting `PREFECT_CLIENT_ENABLE_METRICS` turns it on and `PREFECT_CLIENT_METRICS_PORT` sets the port for it (defaulting to 4201). With that combo enabled, your flows would start a metrics server while they're running, which Prometheus can scrape. You'll get a combination of whatever metrics we emit from Prefect itself (currently just a few, but we want to grow that over time) as well as any metrics your flows emit.
You'll need a service discovery mechanism in place for Prometheus to identify where the flows are running, but if you're on Kubernetes using the Kubernetes Prefect worker, that should be relatively easy to set up with a `PodMonitor` (https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#podmonitor)
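To make that concrete, here's a minimal sketch of the flow-side half, assuming `PREFECT_CLIENT_ENABLE_METRICS=true` is set in the flow run's environment (with the default port 4201); the counter name, labels, and task are made up for illustration:
```python
# A minimal sketch: a flow that increments a custom Prometheus counter.
# With PREFECT_CLIENT_ENABLE_METRICS=true (and optionally
# PREFECT_CLIENT_METRICS_PORT, default 4201), the flow run process serves
# its metrics at /metrics while it's running, for Prometheus to scrape.
# The counter name, labels, and task below are purely illustrative.
from prefect import flow, task
from prometheus_client import Counter

ORDERS_PROCESSED = Counter(
    "orders_processed_total",
    "Orders processed by the ingest flow",
    ["status"],
)


@task
def process_order(order_id: int) -> None:
    # ... real work happens here ...
    ORDERS_PROCESSED.labels(status="ok").inc()


@flow
def ingest_orders(order_ids: list[int]) -> None:
    for order_id in order_ids:
        process_order(order_id)


if __name__ == "__main__":
    ingest_orders([1, 2, 3])
```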
Joe
Oh, that's pretty cool! I assume that when the flow ends, the endpoint and the metrics disappear too? Or do they persist somewhere?
I'm moving to Kube for most of our baseline, but the specific client for this is all-in on Nomad, so I need to be mindful of that.
Chris
The way Prometheus works is that it reaches out to scrape the configured endpoints and warehouses the metrics. So for any particular flow process, the metrics will just be available ephemerally at an endpoint `<hostname>:4201/metrics` that Prometheus will scrape if it's configured to monitor that type of Pod.
If there's no Prometheus scraping the flows, the metrics do kind of just evaporate.
If you're moving to Kubernetes or to an environment that Prometheus has service discovery for, you should be able to set it up so that each Pod running a flow gets scraped, just with the right labeling/selectors/etc. for that kind of environment. In Kubernetes it's `PodMonitor`s matching `Pod`s by their labels. Not sure about Nomad, but I expect it's similar.
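If you want to eyeball what that endpoint exposes before wiring up service discovery, one option (just a sketch, assuming the metrics setting is enabled and the server runs in the flow run process on the default port 4201) is to have a task fetch its own process's `/metrics` endpoint and print it:
```python
# Sketch of a local sanity check: fetch this flow run's own metrics endpoint
# to see the Prometheus text exposition that a scraper would collect.
# Assumes PREFECT_CLIENT_ENABLE_METRICS=true and the default port 4201.
from urllib.request import urlopen

from prefect import flow, task


@task
def dump_local_metrics(port: int = 4201) -> str:
    with urlopen(f"http://localhost:{port}/metrics") as response:
        body = response.read().decode("utf-8")
    print(body)  # plain-text metrics, one sample per line
    return body


@flow
def metrics_smoke_test() -> None:
    dump_local_metrics()


if __name__ == "__main__":
    metrics_smoke_test()
```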
Joe
Thanks a ton, this is super helpful. I'll have to do a bit of digging but this definitely helps me make the case for investing in upgrading.
Chris
Amazing, best of luck and let us know how it goes!
Nas
It's actually great that you've decided on a native Prometheus integration. I wonder, are there any upcoming plans to add metrics around flow run states? The open-source Prometheus exporter hasn't been updated for the breaking changes in Prefect 3.0, and some important metrics aren't available any more, e.g. `CRASHED` flow runs are not reported. Adding this would be extremely valuable to me, as I'm rolling out Prefect in our company at the moment on on-premises infrastructure. Our alerting system integrates natively with Prometheus metrics.
Chris
Thanks, Nas! We haven't added metrics on very many Prefect internals yet, just a counter for the events client as a proof of concept. The general premise was mostly around enabling people to add metrics in their flows and tasks, but it definitely makes sense to have counters coming out of the engines as well. Would you be interested in sending a PR that does this? Feel free to tag me (@chrisguidry) on it.
I also understand if a PR is a big ask, so if nothing else, please send us an issue outlining how you'd use this
Nas
I'll try and do both this weekend, Chris.