Hadoop-Ganglia Integration using Hadoop Metrics2 Framework

January 28, 2014

In our previous post, we detailed why Ganglia is a good tool for monitoring clusters. However, when monitoring a Hadoop cluster you often need more information about CPU, disk, memory, and per-node network statistics than the generic Ganglia configuration can provide. For those who need more finely tuned monitoring, Hadoop supports a framework that records internal statistics and posts them to an external destination, either a file or Ganglia. In fact, Hadoop now supports an implementation of the Metrics2 Framework for Ganglia. In this post we’ll discuss the design of the Hadoop Metrics2 Framework and how it enables Ganglia metrics.

Features

The Hadoop Metrics2 Framework supports multiple metrics output plugins running in parallel. It allows metrics plugins to be reconfigured dynamically, without restarting the server, and it exports metrics via Java Management Extensions (JMX).

Design Overview

The Hadoop Metrics2 Framework consists of three major components:

The metric source is used to generate metrics.
The metric sink is used to consume the metrics produced by the metric sources.
The metric system is used to periodically poll the metric sources and to pass the resulting metric records to the metric sinks.
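
To make the division of labor concrete, here is a minimal sketch of the source/sink/system relationship in plain Java. It is not the Hadoop API: ToySource, ToySink, and ToyMetricsSystem are hypothetical stand-ins invented for illustration, and the real framework polls on a timer rather than on demand.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a metric source: it generates metrics.
interface ToySource {
    Map<String, Number> snapshot();
}

// Hypothetical stand-in for a metric sink: it consumes metric records.
interface ToySink {
    void put(Map<String, Number> record);
}

// Hypothetical stand-in for the metric system: it polls sources and feeds sinks.
final class ToyMetricsSystem {
    private final List<ToySource> sources = new ArrayList<>();
    private final List<ToySink> sinks = new ArrayList<>();

    void registerSource(ToySource source) { sources.add(source); }

    void registerSink(ToySink sink) { sinks.add(sink); }

    // One polling cycle: snapshot every source and pass each record to every sink.
    void pollOnce() {
        for (ToySource source : sources) {
            Map<String, Number> record = source.snapshot();
            for (ToySink sink : sinks) {
                sink.put(record);
            }
        }
    }
}

final class ToyMetricsDemo {
    public static void main(String[] args) {
        ToyMetricsSystem system = new ToyMetricsSystem();
        system.registerSource(() -> {
            Runtime rt = Runtime.getRuntime();
            Map<String, Number> record = new LinkedHashMap<>();
            record.put("heapUsedBytes", rt.totalMemory() - rt.freeMemory());
            return record;
        });
        // Two sinks registered side by side both receive every record.
        system.registerSink(record -> System.out.println("sink A received: " + record));
        system.registerSink(record -> System.out.println("sink B received: " + record));
        system.pollOnce();
    }
}
```

Because the sinks never talk to the sources directly, adding another output only means registering another sink, which is the property that lets several output plugins run in parallel.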

Implementing and Configuring Components

A metric source class must implement the following interface:

org.apache.hadoop.metrics2.MetricsSource

A metric sink must implement this interface:

org.apache.hadoop.metrics2.MetricsSink

The basic syntax to configure metric system components is:

<prefix>.(source|sink).<instance>.<option>

Here’s a sample JobTracker configuration that sinks metrics to a file:

jobtracker.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink

jobtracker.sink.file.filename=jobtracker-metrics.out
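
To show how the four parts of such a key line up with the syntax above, here is a small illustrative parser. MetricsKey is a hypothetical helper written for this post, not a class from Hadoop, and it only handles the four-part form described here.

```java
// Hypothetical helper: splits a key of the form
// <prefix>.(source|sink).<instance>.<option> into its four parts.
final class MetricsKey {
    final String prefix;
    final String kind;
    final String instance;
    final String option;

    private MetricsKey(String prefix, String kind, String instance, String option) {
        this.prefix = prefix;
        this.kind = kind;
        this.instance = instance;
        this.option = option;
    }

    static MetricsKey parse(String key) {
        // Limit of 4 keeps any further dots inside the option part.
        String[] parts = key.split("\\.", 4);
        boolean kindOk = parts.length == 4
                && (parts[1].equals("source") || parts[1].equals("sink"));
        if (!kindOk) {
            throw new IllegalArgumentException("unexpected key format: " + key);
        }
        return new MetricsKey(parts[0], parts[1], parts[2], parts[3]);
    }
}

final class MetricsKeyDemo {
    public static void main(String[] args) {
        MetricsKey key = MetricsKey.parse("jobtracker.sink.file.filename");
        System.out.println("prefix=" + key.prefix + " kind=" + key.kind
                + " instance=" + key.instance + " option=" + key.option);
    }
}
```

Read this way, the sample configuration defines one sink instance named "file" under the "jobtracker" prefix, with two options set on it: class and filename.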

Filtering Metrics

Metrics can be filtered by source, context, record, tag, and individual metric. Here is a filtering example:

test.sink.file0.class=org.apache.hadoop.metrics2.sink.FileSink

test.sink.file0.context=foo

With this configuration, the sink instance “file0” accepts only the metrics that belong to the context “foo”.
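
The effect of a context setting on a sink can be sketched in plain Java. This sketch reads the setting as "this sink accepts only records whose context is foo", which is an assumption about the intended behavior. ContextRecord and ContextFilteringSink are hypothetical names, not Hadoop classes, and the matching here is a plain string comparison.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type: a context name plus a single named value.
final class ContextRecord {
    final String context;
    final String name;
    final long value;

    ContextRecord(String context, String name, long value) {
        this.context = context;
        this.name = name;
        this.value = value;
    }
}

// Hypothetical sink that keeps only records from one configured context.
final class ContextFilteringSink {
    private final String acceptedContext;
    private final List<ContextRecord> accepted = new ArrayList<>();

    ContextFilteringSink(String acceptedContext) {
        this.acceptedContext = acceptedContext;
    }

    // Returns true if the record matched the context and was stored.
    boolean put(ContextRecord record) {
        if (!acceptedContext.equals(record.context)) {
            return false;
        }
        accepted.add(record);
        return true;
    }

    List<ContextRecord> accepted() {
        return accepted;
    }
}

final class ContextFilterDemo {
    public static void main(String[] args) {
        ContextFilteringSink sink = new ContextFilteringSink("foo");
        sink.put(new ContextRecord("foo", "requests", 12));
        sink.put(new ContextRecord("bar", "requests", 7));
        System.out.println("records kept: " + sink.accepted().size());
    }
}
```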