New Cloud Servers SLA…And Why It Matters

Today we are pleased to introduce a new and improved SLA for our Next Generation Cloud Servers service, powered by OpenStack.

To appreciate what has changed, it’s important to understand how cloud services are built. The vast majority of cloud services have two major components: 1) a “control plane,” which is made up of the API, provisioning system, database, etc.; and 2) a “data plane,” which is the actual resources that get provisioned via the control plane – in this case, cloud servers. (If you have a networking background, control and data planes may sound familiar.)

These components have different availability characteristics. It’s quite possible for the control plane to be down while the data plane is up (e.g. you can’t add servers because the API is down, but your hosted web site is still up), or for the data plane to be down while the control plane is up (e.g. the host running your web server crashes, but you can create a replacement cloud server via the API).

Historically, we have only guaranteed the Cloud Servers data plane; the new SLA adds a control plane guarantee as well. This is meaningful for a few reasons:

  1. Apps are increasingly integrating with infrastructure APIs to make dynamic adjustments and thus take on a large dependency on API availability. A control plane guarantee means you can rely on the Cloud Servers API to be there when you need it.
  2. OpenStack has proven itself and we are ready to guarantee it. Since its launch in the fall of 2012, Cloud Servers has handled approximately 650 million API requests with an overall uptime of 99.95%. At this time, we are guaranteeing 99.9% control plane availability but have every intention of pushing it higher over time. Note also that we don’t cheat: we count all server-side HTTP 5xx errors as unavailability, maintenance is not excluded, and we measure availability monthly.
  3. Having both control and data plane guarantees means you can build apps the way you want. If you want to build a more traditional static app that doesn’t need to work around data plane failure, you can do that. The data plane guarantee is there. If you want to build an elastic app that integrates with the API to autoscale, you can do that as well. The control plane guarantee is there.  No forced complexity. The choice is yours.
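To make the measurement described in point 2 concrete, here is a minimal sketch of how availability could be computed from API response codes. It assumes availability is the share of requests in a month that did not return a server-side HTTP 5xx error; the function names and the 30-day month are illustrative assumptions, and the actual SLA terms may define the calculation differently.

```python
def availability(status_codes):
    """Share of requests that did not fail with a server-side 5xx error.

    Assumption: every 5xx response counts as unavailability and nothing
    (including maintenance) is excluded, as described in the post.
    """
    codes = list(status_codes)
    if not codes:
        return 1.0  # no traffic, so nothing failed
    failures = sum(1 for code in codes if 500 <= code <= 599)
    return 1 - failures / len(codes)


def meets_target(status_codes, target=0.999):
    """True if the measured availability reaches the target (99.9% here)."""
    return availability(status_codes) >= target


def downtime_budget_minutes(target=0.999, days=30):
    """Minutes of total outage a time-based reading of the target allows."""
    return (1 - target) * days * 24 * 60


if __name__ == "__main__":
    # Hypothetical month of traffic: 10,000 requests, 5 of them 5xx errors.
    month = [200] * 9995 + [503] * 5
    print(f"availability: {availability(month):.4%}")
    print(f"meets 99.9%: {meets_target(month)}")
    print(f"downtime budget: {downtime_budget_minutes():.1f} minutes")
```

Read as a time budget, 99.9% over a 30-day month leaves roughly 43 minutes of unavailability, and a 99.95% figure leaves about half that.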

SLAs are important, but they are more than legalese to us. They are promises we make to our customers, and keeping them is part of how we deliver Fanatical Support. We hate downtime and we work hard every day to keep our promises and provide you with a powerful and reliable platform so you can do what you do best. Thanks for being a customer, and we hope the new SLA gives you even more confidence in Cloud Servers and OpenStack.

Erik joined Rackspace in 2008 as Chief Architect helping to launch and grow Cloud Servers as well as integrate and optimize multiple services across the Rackspace Cloud portfolio. Erik has been involved in OpenStack since its inception and helped launch the Quantum network service. Erik currently serves as Director of Product Strategy for the Cloud Infrastructure Product Line, which includes all base cloud building block services (Cloud Servers, Cloud Networks, Cloud Block Storage, Cloud Files, Cloud Load Balancers and RackConnect). Prior to joining Rackspace, Erik was Chief Infrastructure Architect for SRA International, where he helped architect solutions for large, complex enterprise and government clients. Erik is a graduate of Virginia Tech and holds a B.S. in Computer Engineering and a minor in Computer Science.


  1. What’s the SLA for server creation? I have been waiting over an hour for a “create from image” to complete and I’m losing customers.

    • Thanks for the comment, Brian. We’ve debated an SLA on server creates but don’t currently have one. Base Linux builds generally take anywhere from a couple of minutes to several minutes, and we have a couple of projects underway to drive down build times across the board. Part of the challenge with a build time SLA is that build time is extremely dependent on image size. Windows images are bigger than Linux images, and custom images tend to be bigger than base images. It sounds like you are building from a custom image. If so, the hour+ build time is likely because the image is big (we have to move the image bits from Cloud Files to the host). I’d love to hear more and help get to the bottom of it if it’s not a large custom image.

      Another important option for build time optimization is diskconfig. By default, we auto-expand the disk and filesystem. This can be a little time-consuming, especially on bigger flavors. If you’d like the server faster and either don’t need the additional space or want to expand the disk yourself later, you can set diskconfig to manual during server create (see our API docs for more info). We’ll also be adding this option to our control panel shortly, so it will be available via both API and UI.

      Feel free to email me at erik dot carlin at for more info on either of these.
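As a rough illustration of the diskconfig option mentioned in the reply above, the sketch below builds the JSON body for a server create request with disk configuration set to manual. It assumes the OpenStack Compute disk config extension attribute `OS-DCF:diskConfig` with the value `MANUAL`; the helper name and the image and flavor IDs are hypothetical placeholders, and the block only builds the request body without sending it. Check the API docs for the exact request format.

```python
import json


def build_create_request(name, image_ref, flavor_ref, manual_disk=True):
    """Build the JSON body for a server create request.

    Assumption: the disk config extension is exposed as the
    "OS-DCF:diskConfig" attribute, where "MANUAL" skips the automatic
    disk and filesystem expansion. Omitting it leaves the default.
    """
    server = {
        "name": name,
        "imageRef": image_ref,    # hypothetical placeholder ID
        "flavorRef": flavor_ref,  # hypothetical placeholder ID
    }
    if manual_disk:
        server["OS-DCF:diskConfig"] = "MANUAL"
    return {"server": server}


if __name__ == "__main__":
    body = build_create_request("web-02", "IMAGE_ID", "FLAVOR_ID")
    # This is the payload that would be POSTed to the servers endpoint.
    print(json.dumps(body, indent=2))
```

With manual disk configuration the server keeps the partition size of the source image, so any extra space on a larger flavor has to be claimed by resizing the partition and filesystem afterwards.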


