Addressing Hybrid Architecture Complexity With New Docker Monitoring Plugin

It is now relatively easy to build hybrid infrastructures comprising containers, virtual servers and bare metal machines, and container adoption is easier than ever. Docker has been downloaded more than 200 million times, and Google spins up more than 2 billion containers every week.

The downside to this ease in infrastructure creation is that it has encouraged us to develop increasingly complex hybrid architectures that are hard to monitor.

To address this, I created a plugin for the Rackspace Cloud Monitoring Agent that uses the recently released Docker stats endpoint to provide container-level CPU, memory and network metrics on a single system. The plugin lets users look at metrics from across their hybrid infrastructure, including cloud servers, bare metal and containers, and alerts them when things go wrong through a pre-existing email, SMS, PagerDuty or VictorOps notification.
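To make the idea concrete, here is a minimal Python sketch of how one sample from the Docker stats endpoint could be turned into plugin output. It is not the plugin's actual source. The function name `format_metrics` and the metric names are hypothetical, and the `status` / `metric <name> <type> <value> <unit>` line layout is my assumption about the agent's plugin output convention, so check it against the agent documentation before relying on it. The JSON field names (`cpu_stats`, `memory_stats`, `network` / `networks`) follow the Docker stats response.

```python
def format_metrics(stats):
    """Turn one Docker stats sample (a parsed JSON dict) into output lines.

    The line layout is an assumed plugin convention, not taken from the
    plugin's source.
    """
    cpu = stats.get("cpu_stats", {}).get("cpu_usage", {})
    mem = stats.get("memory_stats", {})

    # Older Docker API versions report a single "network" object; newer
    # ones report a "networks" map keyed by interface name. Handle both.
    if "networks" in stats:
        ifaces = list(stats["networks"].values())
    else:
        ifaces = [stats.get("network", {})]
    rx_bytes = sum(i.get("rx_bytes", 0) for i in ifaces)
    tx_bytes = sum(i.get("tx_bytes", 0) for i in ifaces)

    return [
        "status ok container metrics collected",
        "metric cpu_total_usage uint64 %d nanoseconds" % cpu.get("total_usage", 0),
        "metric memory_usage uint64 %d bytes" % mem.get("usage", 0),
        "metric memory_limit uint64 %d bytes" % mem.get("limit", 0),
        "metric network_rx_bytes uint64 %d bytes" % rx_bytes,
        "metric network_tx_bytes uint64 %d bytes" % tx_bytes,
    ]
```

In practice the dict would come from parsing one JSON object read from the stats endpoint for a given container, and the returned lines would be printed to standard output for the agent to collect.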

With the advent of Magnum, containers are becoming first-class citizens in OpenStack. Magnum uses Heat to orchestrate an OS image that contains Docker and Kubernetes, and runs that image on either virtual machines or bare metal in a cluster configuration. This makes consistent monitoring across containers, cloud servers and bare metal machines even more relevant.

There are many monitoring tools out there, including dedicated container monitoring products such as the one from Datadog (not free) and the really cool docker-mon, which gives you helpful container info right from your console. These solutions still do not offer a single pane of glass across containers, cloud servers and bare metal. The Rackspace cloud monitoring plugin was created specifically to solve this problem.

In the spirit of collaboration, I wanted to open this tool up to the community for testing and feedback. I encourage you to give it a shot and provide feedback in the comments below. I would also like to hear about any other solutions you have found useful for monitoring your hybrid infrastructure. This is an issue we will continue to work on as infrastructure complexity increases, and one that I have a keen interest in solving collaboratively.

Nachiket has more than eight years of industry experience as a software developer and software developer in test. He was a core contributor to the Yahoo! Fantasy Sports backend engineering team and has extensive experience across the entire web development stack, including feed processing, SQL and NoSQL databases, REST APIs, caching and large-scale distributed systems. He also believes that software testing is a primary quality of a good engineer and has written complicated integration tests and built test platforms running hundreds of thousands of test cases. He has worked extensively in Java and PHP and recently picked up Node.js. Nachiket now works on the Cloud Monitoring product at Rackspace, which monitors and collects metrics for thousands of corporate customers' complicated infrastructures.

