Time series metrics reporting and alerting is an essential tool when it comes to monitoring production services. Graphs help you monitor trends over time, identify spikes in load / latency, identify bottlenecks with constrained resources, etc. Dropwizard Metrics is a great library for collecting metrics and has a lot of features out of the box including various JVM metrics. There are also many third party library hooks for collections metrics on HikariCP connections pools, Redis client connections, HTTP client connections, and many more.
Once metrics are being collected we need a time series datastore as well as a graphing and alerting system to get the most out of our metrics. This example will be utilizing Grafana Cloud which offers cloud hosted Grafana a graphing and alerting application that hooks into many datasources, as well as two options for time series datasources Graphite and Prometheus. StubbornJava has public facing Grafana dashboards that will continue to add new metrics as new content is added. Take a look at the StubbornJava Overview dashboard to start with.
Custom Dropwizard GraphiteSender
Note: This is not the Grafana Cloud recommended implementation
. Grafana Cloud recommends using a Carbon-Relay-NG process for pre-aggregating and batch sending metrics to Grafana Cloud. Since this site is currently only a single server we opted to implement an HTTP sender using the Grafana Cloud API to have less infrastructure overhead. If your system has multiple environments and services it is highly recommended to use the Carbon-Relay-NG process.
This implementation should be fairly straightforward. Dropwizard Metrics reporters are run on a single thread on a timer so we should not have to worry about thread safety in this class. Every time the reporter runs it will iterate all of the metrics contained in our MetricRegistry
convert them to the appropriate format and send the data to the Grafana API using OkHttp and serializing to JSON with Jackson.
DropwizardMetrics Reporter
Once we have our custom GraphiteSender
implemented all we are left to do is plug it into the existing GraphiteReporter
and start it. We have our keys partitioned by environment and host so that all metrics are easier to split up and view aggregates or host by host metrics. See it in action at StubbornJava Overview.