This is one of a collection of posts I’ve written recently to provide a high-level introduction to all of the products and services available within a Rackspace Cloud Account. I like to think of them as the building blocks of the Internet. Each post will give a description of the product, how to use it, the costs and the most frequently asked questions about that product.
I’ve also included several videos to help show how to configure, deploy and use these products.
As always, feedback is welcome and appreciated!
Being notified when your website becomes unavailable and having accurate data about your Cloud Server’s performance are key to making important decisions about how to maintain and optimize your application’s infrastructure. Cloud Monitoring provides you with a set of tools that monitor, analyze and report on the availability and performance of your websites, servers and other cloud resources.
To use Cloud Monitoring, you configure one or more checks. Agent Checks monitor the internal performance of your Cloud Server, while Remote Service Checks monitor the availability of your website from different points on the Internet. These checks give you the data you need to keep improving your application’s code and infrastructure and to maintain high availability for your customers.
Checks can be created by either clicking on the Action button on your Cloud Server’s detail screen or clicking the gear icon in the list of Cloud Servers.
Creating a check from the Cloud Server’s detail screen
Creating a check from the Cloud Server list.
Remote Service Checks
HTTP Check (Website)
This check monitors the availability of your website either by URL or by IP address and alerts you if the site becomes unavailable for more than 30 seconds.
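To make the idea concrete, here is a minimal Python sketch of what a single HTTP probe amounts to. It is not how Cloud Monitoring is implemented. The function name `http_check`, the 10-second timeout and the rule that any 2xx or 3xx status counts as "available" are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def http_check(url, timeout=10.0):
    """Probe a URL once and report whether it answered successfully.

    Hypothetical helper for illustration; not part of any Rackspace API.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status code.
        status = exc.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar errors.
        status = None
    elapsed = time.monotonic() - started
    available = status is not None and 200 <= status < 400
    return {"available": available, "status": status, "seconds": elapsed}
```

A real monitoring service would run a probe like this repeatedly, from several locations, and only raise an alert once the failures persist past the configured window.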
Here’s how to configure and test the HTTP Check:
TCP Check (Port)
This check will monitor the response from a specific port on your server, usually determining if the process that is bound to that port is running. In the video below you will see how I use the TCP Check to monitor the availability of MySQL on my database server.
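In essence, a port check tries to open a TCP connection and reports whether that worked. The sketch below shows the idea with Python's standard `socket` module. The helper name `tcp_check` and the 5-second timeout are assumptions for illustration, not part of Cloud Monitoring.

```python
import socket


def tcp_check(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration only.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or the name did not resolve.
        return False
```

For a MySQL server listening on its default port, the call would look like `tcp_check("db.example.com", 3306)`, where the host name is a placeholder. A successful connection only shows that something is listening on the port; it does not prove the database is answering queries.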
Here’s how to configure and test the TCP Check:
Ping Check (Server)
Ping is a network utility that checks the availability of a computer (node) on a network. If the node responds, the Ping utility will also measure how long it takes for a small packet of information to make a round trip from your computer to that remote system. This check monitors the general responsiveness of your server on the network and alerts you if it fails to respond.
Agent Checks
Agent Checks require a monitoring agent to be installed on your Cloud Server. If you have a cloud account with a managed service level, the monitoring agent is installed for you as part of the build process. If you have an infrastructure account, you will need to install the agent manually.
Once the agent is installed, you’ll be able to see current and historical performance information about a Cloud Server from its detail screen. Agent Checks allow you to set specific thresholds that will trigger notifications to you and your team.
This video shows you how to install the monitoring agent on a Cloud Server running CentOS 6.4.
Here is a brief description of the available Agent Checks:
Memory
Your server has a finite amount of memory. Running low on memory will negatively affect the performance of your entire server and possibly cause it to be unresponsive. This check will alert you if your Cloud Server’s memory utilization surpasses 80 percent, but that value can be changed to meet your needs.
CPU
As with memory, a server that runs out of CPU capacity will effectively stop responding. This check will return WARNING for 90 percent used and CRITICAL for 95 percent used. These thresholds can be configured to your needs.
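The two-level threshold logic described here can be sketched in a few lines of Python. This only illustrates the idea and is not the alarm language Cloud Monitoring itself uses. The function name `classify_usage` is hypothetical, and treating a reading exactly at a threshold as a breach is an assumption.

```python
def classify_usage(percent_used, warning=90.0, critical=95.0):
    """Map a utilization percentage to OK, WARNING, or CRITICAL.

    Hypothetical helper; the defaults mirror the CPU thresholds above.
    """
    if critical < warning:
        raise ValueError("critical threshold must not be below warning")
    if percent_used >= critical:
        return "CRITICAL"
    if percent_used >= warning:
        return "WARNING"
    return "OK"


for reading in (42.0, 92.5, 97.0):
    print(reading, classify_usage(reading))
```

Checking the critical threshold first matters: a reading of 97 percent also exceeds the warning level, and the more severe state should win.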
Load Average (Linux Only)
Unique to UNIX systems, a server’s load average represents the average amount of system work (CPU, disk, memory, etc.) that a computer has performed over a period of time.
This alarm triggers when your server becomes heavily loaded. By default it will return WARNING when load average exceeds 1x the number of vCPUs and CRITICAL when it exceeds 1.5x the number of vCPUs.
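Because the default thresholds scale with the number of vCPUs, the same load average can be fine on one server and critical on another. The Python sketch below works through that arithmetic. The names `load_status` and `current_load_status` are hypothetical, and using the 5-minute figure is an assumption, since the text above does not say which averaging window the check reads.

```python
import os


def load_status(load_average, vcpus, warning_factor=1.0, critical_factor=1.5):
    """Compare a load average against thresholds scaled by vCPU count."""
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    if load_average > critical_factor * vcpus:
        return "CRITICAL"
    if load_average > warning_factor * vcpus:
        return "WARNING"
    return "OK"


def current_load_status():
    """Evaluate this machine's 5-minute load average (Unix-like systems only)."""
    _one, five, _fifteen = os.getloadavg()
    return load_status(five, os.cpu_count() or 1)


# A load of 4.5 is a WARNING on 4 vCPUs but comfortable on 8.
print(load_status(4.5, 4), load_status(4.5, 8))
```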
Disk
Your server needs a certain amount of free disk space to operate. This check will monitor your server’s disk utilization and alert you when used space reaches a set threshold on the default mount point. By default, it will return WARNING for 80 percent of capacity and CRITICAL for 90 percent of capacity.
Network
Even if your server is operating properly, it does little good if it cannot communicate over the network. This check monitors the rate at which your server is sending and receiving data. It will send a WARNING or ALERT if either rate drops below a value which you configure.
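If you want a rough local equivalent of this measurement, Python's standard library can report disk usage for a mount point. The sketch below is an approximation, not the agent's actual method. The function name `disk_status` is hypothetical, and computing the percentage as used divided by total is an assumption (tools such as `df` may report a slightly different figure because of reserved blocks).

```python
import shutil


def disk_status(path="/", warning=80.0, critical=90.0):
    """Return (status, percent_used) for the filesystem containing path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    percent_used = usage.used / usage.total * 100.0
    if percent_used >= critical:
        status = "CRITICAL"
    elif percent_used >= warning:
        status = "WARNING"
    else:
        status = "OK"
    return status, percent_used


status, percent_used = disk_status("/")
print(f"/ is {percent_used:.1f}% full: {status}")
```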
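Unlike the other checks, this one alerts when a value falls below a threshold rather than rising above it. A transfer rate is typically derived from two readings of a byte counter taken some seconds apart. The Python sketch below shows that arithmetic. The function names and sample numbers are made up for illustration and do not come from Cloud Monitoring.

```python
def transfer_rate(bytes_before, bytes_after, seconds):
    """Average bytes per second between two readings of a byte counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if bytes_after < bytes_before:
        raise ValueError("counter went backwards (reset or wrap-around)")
    return (bytes_after - bytes_before) / seconds


def rate_too_low(rx_rate, tx_rate, minimum):
    """True if either the receive or the transmit rate is below minimum."""
    return rx_rate < minimum or tx_rate < minimum


# Two hypothetical counter samples taken 60 seconds apart.
rx = transfer_rate(1_000, 7_000, 60)   # 6,000 bytes received
tx = transfer_rate(5_000, 5_300, 60)   # 300 bytes sent
print(rx, tx, rate_too_low(rx, tx, minimum=10.0))
```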
Q: How much does Cloud Monitoring cost?
A: Cloud Monitoring is billed monthly based on three factors:
For the most accurate and up-to-date information, please reference the pricing page for Cloud Monitoring.
Q: What other information is available about Cloud Monitoring?
A: There’s quite a bit available. I recommend reading the article “Control Panel Monitoring: What do the options do?” in the Rackspace Knowledge Center. I also found a lot of useful information in the API documentation for Cloud Monitoring.
Be sure to check out previous posts: Introduction to Cloud Backup, Introduction to Cloud Files, Introduction to Cloud Servers, Introduction to Cloud Databases and Introduction to Cloud Load Balancers.