Scalable Docker Monitoring with Fluentd, Elasticsearch and Kibana 4


Docker is a great set of technologies. Once you are comfortable using it, though, you face a set of challenges you didn't have before. To name some:

  • log consolidation: How to retrieve log files from dozens of containers?
  • monitoring: How much RAM and CPU is each container using?

There are a few articles on this topic out there. After reading them, none of the solutions fully convinced me, but they all had some nice features, which I chose to combine here.

1. Data collection

Before we can store our precious logs and stats, we'll have to collect them. I chose fluentd for its large number of plugins and easy extensibility. Logstash is an alternative you might want to consider.

I installed fluentd via gems. There are debs available, but they caused permission issues for me. Plus, plugins are installed via gems as well, so you had better get used to it.

apt-get install ruby-dev libcurl4-gnutls-dev
gem install fluentd

In addition, we'll use an experimental plugin provided by Kiyoto, one of the fluentd maintainers, to collect statistics.

gem install fluent-plugin-docker-metrics

With this setup we have the means to collect both logs and monitoring metrics. Next we need to tell fluentd what exactly to collect, so let's create the right folder and config file.

mkdir /etc/fluent
mkdir /etc/fluent/conf.d
vi /etc/fluent/fluent.conf

I'm mentally splitting fluent.conf into 3 parts:

  1. collection of Docker logs
  2. collection of Docker metrics
  3. output to Elasticsearch

Part 1 will live in the conf.d folder, part 2 is below and part 3 will come a bit later:

include conf.d/*.conf

<source>
  type docker_metrics
  stats_interval 1m
  tag_prefix metrics
</source>
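
To give an idea of where such numbers come from: as far as I can tell, the plugin reads the cgroup pseudo-files the kernel exposes for each container. The sketch below is not the plugin's code. It only parses text in the "key value" layout of a cgroup v1 memory.stat file. The sample values and the path layout under /sys/fs/cgroup are assumptions and will differ between hosts.

```python
# Hypothetical sample in the "key value" layout of a cgroup v1 memory.stat file.
SAMPLE_MEMORY_STAT = """\
cache 1536000
rss 20971520
swap 0
"""


def parse_cgroup_stat(text):
    """Turn 'key value' lines into a dict of integers, skipping odd lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore blank or unexpected lines
        key, value = parts
        stats[key] = int(value)
    return stats


def memory_stat_path(container_id, cgroup_root="/sys/fs/cgroup"):
    """Assumed cgroup v1 location of a Docker container's memory.stat."""
    return f"{cgroup_root}/memory/docker/{container_id}/memory.stat"


if __name__ == "__main__":
    stats = parse_cgroup_stat(SAMPLE_MEMORY_STAT)
    # Convert the resident set size from bytes to MiB for readability.
    print(f"rss: {stats['rss'] / 2**20:.1f} MiB")
```

On a real host you would read the file at memory_stat_path(<full container id>) instead of the sample string.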

Since Docker containers come and go, I generate the config from a template: each container gets its own file in conf.d. Personally I deploy my containers with Ansible, but you could easily write a Python script or use this Go program by Jason Wilder. (Also read his own blog post on Docker monitoring.)

<source>
  type tail
  format json
  time_key time
  time_format %Y-%m-%dT%H:%M:%S.%L%z
  path /var/lib/docker/containers/{{ docker_containers[0]['Id'] }}/{{ docker_containers[0]['Id'] }}-json.log
  pos_file /var/lib/docker/containers/{{ docker_containers[0]['Id'] }}/{{ docker_containers[0]['Id'] }}-json.log.pos
  tag docker.container.{{ image_name }}_{{ subdomain }}
  rotate_wait 5
</source>

As you can see, I'm using Ansible's docker_containers variable directly and tagging each input source with a readable container name (image name plus subdomain). Many solutions will only give you the container hash, which is neither readable nor permanent.
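
If you don't use Ansible, the Python route could look roughly like the sketch below. It fills in the same tail source for each container and writes one file per container. The mapping of names to container IDs, the function names and the demo values are made up for illustration. In practice you would get the IDs from Docker itself.

```python
import tempfile
from pathlib import Path
from string import Template

# Same fields as the tail source above; $container_id and $name are the
# placeholders filled in per container.
SOURCE_TEMPLATE = Template("""\
<source>
  type tail
  format json
  time_key time
  time_format %Y-%m-%dT%H:%M:%S.%L%z
  path /var/lib/docker/containers/$container_id/$container_id-json.log
  pos_file /var/lib/docker/containers/$container_id/$container_id-json.log.pos
  tag docker.container.$name
  rotate_wait 5
</source>
""")


def render_source(container_id, name):
    """Render the fluentd source block for a single container."""
    return SOURCE_TEMPLATE.substitute(container_id=container_id, name=name)


def write_configs(containers, conf_dir):
    """Write one <name>.conf per container; containers maps name -> full ID."""
    conf_dir = Path(conf_dir)
    conf_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, container_id in sorted(containers.items()):
        target = conf_dir / f"{name}.conf"
        target.write_text(render_source(container_id, name))
        written.append(target)
    return written


if __name__ == "__main__":
    # Hypothetical container; a temporary folder stands in for conf.d.
    demo = {"nginx_www": "0123abcd" * 8}
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_configs(demo, tmp):
            print(path.read_text())
```

Pointing write_configs at /etc/fluent/conf.d and restarting fluentd afterwards would pick up the new files, since they are only read at startup via the include line.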

2. Data storage

In the next step we'll need an Elasticsearch container. For persistent data storage, you can mount a host folder as the data store.

docker run -d -p 9200:9200 -p 9300:9300 --name elasticsearch dockerfile/elasticsearch

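With Elasticsearch up and our input sources in place, we can add these output configurations at the end of fluent.conf. (The elasticsearch output type comes from the fluent-plugin-elasticsearch gem, so install that as well if you haven't yet.)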

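# Docker logs; the match pattern is assumed from the docker.container.* tags set in conf.d
<match docker.**>
  type copy
  <store>
    type stdout
  </store>
  <store>
    type elasticsearch
    logstash_format true
    flush_interval 5s #debug
    type_name log
    include_tag_key true
  </store>
</match>

# Docker metrics; the match pattern is assumed from the tag_prefix set above
<match metrics.**>
  type copy
  <store>
    type stdout
  </store>
  <store>
    type elasticsearch
    logstash_format true
    flush_interval 5s #debug
    type_name metric
    include_tag_key true
  </store>
</match>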

These will take your logs and metrics and write them to Elasticsearch, as well as to stdout. Now we're ready to test our setup. Just run fluentd (you may have to add it to your PATH first) and see if there are any problems.
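
To check that documents actually arrive, it helps to know that logstash_format true makes the plugin write to one index per day, named logstash-YYYY.MM.DD. Here is a small sketch that builds the matching _count URL. The host localhost is an assumption, the port is the one published by the docker run command above, and the helper names are mine.

```python
import json
from datetime import date
from urllib.request import urlopen


def logstash_index(day):
    """Name of the daily index written when logstash_format is true."""
    return "logstash-" + day.strftime("%Y.%m.%d")


def count_url(day, host="http://localhost:9200"):
    """URL of Elasticsearch's _count endpoint for that day's index."""
    return f"{host}/{logstash_index(day)}/_count"


def count_documents(day, host="http://localhost:9200"):
    """Ask Elasticsearch how many documents the day's index holds."""
    with urlopen(count_url(day, host), timeout=5) as response:
        return json.load(response)["count"]


if __name__ == "__main__":
    # Only prints the URL; call count_documents() once Elasticsearch is up.
    print(count_url(date.today()))
```

If count_documents(date.today()) keeps returning 0 after a few flush intervals, the stdout output of fluentd is the first place to look.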

3. Analysis

With Elasticsearch and fluentd working, you can use one of many available web interfaces for data analysis. I can recommend Kibana 4, which is currently in beta. It has a very smart workflow and is easy to deploy. Just download it from here and run it.


For log analysis you are currently limited to the Discover tab.


For metrics, use the Visualize tab. Picking a single metric and visualizing it for all containers might be a good way to start. Dashboards can be shared as an iframe.

For now this is a satisfactory and scalable setup. Personally, I assume that one of the next Docker updates will bring some changes to logging and monitoring, which could make some of the above steps easier.