As the co-founder and standard-bearer for OpenStack, Rackspace gets a lot of questions from users, journalists, analysts and vendors about how we run OpenStack at scale, whether we use upstream code or have forked the project, and how we decide what code to contribute back.
Given that Rackspace runs the oldest and largest OpenStack public cloud in the world, was the first to offer OpenStack private cloud as a service and runs some of the largest private clouds in production, it’s important we address those questions.
In this post, I aim to do that by describing how we operate OpenStack in our public and private clouds, as well as the philosophy that guides our choices. I’ll explain how we decide which projects to include and what to contribute back to the community while running clouds that are hosting hundreds of thousands of instances.
Most importantly, I want to talk about how the approach Rackspace takes benefits both end users and the OpenStack community at large.
This post is a little longer than what I usually write, but I believe it will be valuable to readers, so sit down with your caffeinated beverage of choice and get comfortable.
Rackspace Public Cloud
Our public and private clouds have typically diverged in how much of the OpenStack project makes up the core of each offering. This is due primarily to when those offerings were created relative to the maturity of the open source project.
A quick historical review here may be helpful.
Prior to the creation of OpenStack in 2010, Rackspace had a public cloud based on technology acquired when we purchased Slicehost, which sold virtual machine “slices” running on Xen hypervisors. When the Slicehost-based cloud reached its scalability limits, Rackspace decided it was time to write our own public cloud platform. That fateful decision led to a partnership with NASA to open source the cloud platform we now know as OpenStack.
In the early days, Rackspace actually ran parallel public clouds — one running on the Slicehost technology and the other on OpenStack. It was a temporary solution, and by 2012, we’d shut down the Slicehost cloud and migrated those users to our OpenStack cloud. To ensure a smooth transition, Rackspace used Xen as the underlying hypervisor for our OpenStack cloud, to match what was running in the Slicehost cloud.
This meant Rackspace did not have the luxury of starting small in test and scaling up slowly over time to production. With tens of thousands of customers and thousands of compute nodes to manage from the beginning, we had to immediately build for scale. That has continued as our OpenStack public cloud has grown by orders of magnitude.
Running a cloud at scale and using still-maturing open source technology has meant making careful choices about how we deploy the constantly growing OpenStack code base. First and foremost, we have to serve customers who depend on us to run their mission-critical workloads. At the same time, we want to honor our commitment to grow the project we co-founded. To accomplish this, Rackspace has employed several guidelines.
To begin with, whenever possible, we deploy upstream OpenStack code. We do this not just every six months with new releases from the OpenStack Foundation, but by pulling from the trunk on a regular basis and continuously upgrading our public cloud. With hundreds of thousands of customers, we hit more bugs than a smaller cloud might, so we can’t afford to wait six months to patch those bugs.
We only include projects that have passed our extensive testing, and so are stable enough to run in production at scale. This means certain projects may never make it on to our public cloud or may not be included for years.
Even if a project is deemed production ready, there may be code or implementation changes required once it’s part of our public cloud, because of issues we find later as we hit scaling limits only visible after running in production. We then contribute those code changes back upstream.
Neutron, to which Rackspace migrated from nova-network in 2014, is a good example. We realized that while Neutron was ready for production, the existing implementation had limits, especially when running it with hundreds of thousands of virtual machines and switch ports. To make it scale, we wrote the Neutron Quark plugin, which provided additional capabilities, such as segmentation. Most of the Quark capabilities have since been adopted in upstream Neutron.
If there is a service Rackspace deems critical to implement due to customer demand, but that is not mature enough or not available in the OpenStack project, we will create the service, then contribute code back, or start a project with the goal of moving to the upstream project version once it’s ready.
We did this with the Trove DB-as-a-Service and Heat Orchestration projects. In both cases, Rackspace deployed the services, Cloud Databases and Cloud Orchestration respectively, before the project existed or had sufficiently matured. We then contributed code and lessons learned based on running both services in production at scale. We later updated our code to the upstream project code base when it was deemed ready.
These guidelines help us to stay close to the project while maintaining a large-scale cloud that serves the needs of our customers.
Rackspace Private Cloud
We’ve taken a different approach with our OpenStack private cloud.
Being first to market with private cloud-as-a-service, we had the luxury of waiting until the core projects were ready for production before releasing our offering. Without the large installed base we had with our public cloud, Rackspace has been able to follow a different set of guidelines.
First, we decided to base our offering on the upstream code. Our value is not in the OpenStack software, but in the services we provide to make OpenStack production ready and scalable.
For single-tenant solutions owned by individual customers, Rackspace upgrades our private cloud offering in cadence with the Foundation’s six-month releases, a concession to those customers not ready for continuous deployment of their cloud infrastructure. We still run the same stringent tests as with our public cloud, but focus on tracking bug fixes in upcoming releases and backporting fixes when necessary.
As with our public cloud, we only include projects that have passed our extensive testing and are stable enough to run in production at scale. This is because we architect our private clouds for scale and for service availability. We understand that running code on a few nodes in test/dev is far different than running it in large-scale production.
In our private cloud, we typically change our deployment and work with the community to improve the existing code base when a deployed project hits scalability limits. Our decision to use the ML2 plugin for Neutron with Linux bridge in place of Open vSwitch is one example, based on instability and scaling issues we saw at the time with OVS.
Rackspace continues to work with the OVS and Neutron communities, and our goal is to move back to OVS in our Neutron deployment.
How the Rackspace Approach Benefits Everyone
So how do these two approaches benefit end users and the OpenStack community?
End-Users — Public cloud customers typically don’t care what cloud platform we’re running or that it’s based on an open source project. Primarily, they care about price, scale and innovation. Our approach to OpenStack public cloud allows us to provide those values through a platform that lets us scale and develop new services.
Our OpenStack private cloud ensures our customers can run their mission-critical workloads on a stable platform that doesn’t require vendor lock-in.
All our cloud customers benefit from using a cloud, public or private, operated by a team with more experience and expertise in the industry than anyone else in the world. In particular, customers can rely on a Rackspace team that knows how to scale out OpenStack clouds while maintaining their stability.
We encourage potential customers to talk with us about a partnership that offers all the benefits of our experience and expertise.
The OpenStack Community — As the standard-bearer for OpenStack, Rackspace is committed to the success of the project and the growth of the great community that has grown up around it.
We share our experience and expertise operating the world’s largest OpenStack clouds with the community so members can avoid some of the challenges Rackspace faced while learning what it takes to operate large-scale clouds. Our ability to develop new services in our public cloud helps by creating a place where OpenStack projects can be refined at large production scale.
We invite the community to work with us on evolving the project and on initiatives like the OpenStack Innovation Center, so together we can help customers build the next great set of innovations on OpenStack clouds.