EDITOR’S NOTE: Rackspace Service Registry is no longer available.
Last week we announced the preview availability of Rackspace Service Registry. We want to give our customers some more insight into the Service Registry. This is the first article in a “Behind the Scenes” series where I will talk in depth about different aspects of this product such as:
- Why we decided to build it.
- How we built it (product architecture).
- Best practices we have adopted for developing and deploying the product (continuous integration, continuous delivery / deployment, etc.).
- How we use other Rackspace Cloud products such as Cloud Servers, Cloud Load Balancers, Cloud Files and Cloud Monitoring.
- Product use cases.
- How to integrate it into your application.
In this blog post I will explain our motivation and reasoning for building it.
Why did we build Rackspace Service Registry?
While working on Cloud Monitoring we encountered several problems. One of them was that we had many different services but no easy way to keep track of them. Most services were already tracked in one way or another in Chef, but using Chef for this purpose had several problems:
- Our Chef setup is currently deployed in only a single region, so it is not highly available in the way our other components are.
- Information in Chef isn’t real-time – Chef knows whether a service should be present on a given server, but it doesn’t know whether that service is actually up and running.
- Chef was primarily built for applying configuration and keeping a set of servers in a consistent state that is defined in a set of cookbooks and recipes. As such, Chef primarily cares about the servers on which your application and services run. An ideal service registry shouldn’t need to know anything about the servers upon which those services run.
Cloud Monitoring is a complex product composed of multiple services running on many servers in multiple regions.
If you are interested in more details about the Cloud Monitoring architecture, please refer to the great post Paul Querna wrote last year titled Technology behind Rackspace Cloud Monitoring.
Usually the first thing we do when we need to solve a problem is look for an existing open-source project that already does what we need. In this case we could not find one that fit our needs. We found Noah, which looked really promising, but because it uses Redis as its main data store it didn’t fulfill our main requirement – high availability without a single point of failure. Keep in mind that this was last year, long before Netflix announced Eureka.
Because we didn’t find a project that would fit our needs, we decided to build it ourselves on top of the excellent Apache ZooKeeper project. Choosing Apache ZooKeeper had multiple advantages. We already used it for other operations such as leader election and distributed locking. This meant we were already familiar with its APIs and operations, so using it added no complexity or mental overhead to our project.
The data model and distributed architecture of Apache ZooKeeper also make it a good foundation for a service registry.
The service registry we built in Cloud Monitoring is simple, but it makes service discovery and some other things a lot easier. Here are some examples:
- Service discovery – e.g. finding all the API servers located in ORD
- Jobs discovery – We registered all the long-running jobs (e.g. a script that calculates daily usage statistics) in the registry, which allowed us to see which jobs were currently running.
- Version check – When registering an API server in the registry we included the current version in the payload field. This allowed us to easily check which version of the software was running on which servers and to spot potential version mismatches.
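To make the service discovery and version check examples more concrete, here is a minimal Python sketch. It works on an in-memory list rather than a real registry, and the field names (`type`, `region`, `payload`) and the sample entries are assumptions made up for illustration, not the actual schema we used.

```python
from collections import defaultdict

# Hypothetical registry entries. The field names ("type", "region", "payload")
# and the values are illustrative only, not the real Cloud Monitoring schema.
SERVICES = [
    {"id": "api-ord-1", "type": "api", "region": "ORD", "payload": {"version": "1.4.2"}},
    {"id": "api-ord-2", "type": "api", "region": "ORD", "payload": {"version": "1.4.1"}},
    {"id": "api-dfw-1", "type": "api", "region": "DFW", "payload": {"version": "1.4.2"}},
    {"id": "usage-job-1", "type": "job", "region": "ORD", "payload": {"version": "0.9.0"}},
]


def find_services(entries, service_type, region):
    """Service discovery: return entries of one type located in one region."""
    return [e for e in entries
            if e["type"] == service_type and e["region"] == region]


def versions_in_use(entries):
    """Version check: group service ids by the version stored in the payload."""
    by_version = defaultdict(list)
    for entry in entries:
        by_version[entry["payload"]["version"]].append(entry["id"])
    return dict(by_version)


def has_version_mismatch(entries):
    """True when the given services report more than one distinct version."""
    return len(versions_in_use(entries)) > 1
```

For example, `has_version_mismatch(find_services(SERVICES, "api", "ORD"))` would flag that the two sample API servers in ORD report different versions.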
We quickly saw that the service registry added a lot of value and simplified operations. This motivated us to start thinking about how we could share it with other internal and possibly external users.
One approach would be to run a separate ZooKeeper cluster and service registry for each user/tenant, but this would be expensive to operate and maintain. Because of that we decided to build a new service registry using a SaaS model designed with multi-tenancy in mind from the start.
The new Rackspace Service Registry service builds on the core ideas and use cases we observed in Cloud Monitoring. While the original Cloud Monitoring service registry was simple and only offered service discovery, the new registry goes beyond that by offering useful features we think are helpful when building highly available and scalable applications. Some of these new features are listed below.
Better object model
In the Cloud Monitoring service registry we only had one object called service that could contain arbitrary key/value pairs in the payload field. This worked fine for our very simple and focused use case.
In Rackspace Service Registry we expanded this model: we introduced the concept of a session and added “tags” and “metadata” fields to the service model. This allows users to use the service registry in a variety of ways that were not possible with the previous model. You can learn about some of those ways in the next blog post, where I will talk about some common use cases.
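As a rough illustration of the expanded model, the sketch below represents a session and a service as Python dataclasses. Only the existence of sessions and of the “tags” and “metadata” fields comes from the description above. The other attribute names, the `heartbeat_timeout` value, and the idea that a service points at a session through `session_id` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Session:
    """A session kept alive by the client (attributes here are assumed)."""
    id: str
    heartbeat_timeout: int  # seconds; hypothetical field
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Service:
    """A service entry carrying the new "tags" and "metadata" fields."""
    id: str
    session_id: str  # assumed link between a service and its session
    tags: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


def services_with_tag(services: List[Service], tag: str) -> List[Service]:
    """Return the services that carry the given tag."""
    return [s for s in services if tag in s.tags]
```

Tags make simple grouping queries possible (for instance, every service tagged `www`), while metadata leaves room for free-form key/value details such as a version or a region.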
HTTP + JSON
In the Cloud Monitoring service registry, clients talked to the registry over TCP using the native binary ZooKeeper protocol.
In the Rackspace Service Registry we went with HTTP and JSON. In cases like this, a combination of HTTP + JSON has multiple advantages:
- It’s easier to start playing with it – you can use simple command line tools such as cURL.
- It’s easier to build clients and talk to the service because almost every programming language supports HTTP.
- It’s easier to debug issues because the text is human readable.
In Rackspace Service Registry, HTTP is also used to let us know that a session is alive. This is done by heartbeating the session – periodically sending an HTTP POST request. Some people would argue that sending many heartbeats over HTTP is expensive, but we believe that with HTTP/1.1 persistent connections this is not an issue.
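The sketch below shows the general idea of heartbeating over a single persistent connection, using Python’s standard `http.client`. It is not the actual client or API: the host name, the heartbeat path, and the empty JSON body are placeholders, and a real heartbeat request may need authentication headers or a token in the body.

```python
import http.client
import json
import time


def heartbeat_loop(conn, path, interval, count, sleep=time.sleep):
    """Send `count` heartbeat POSTs over one reused connection.

    `conn` is an http.client.HTTPConnection / HTTPSConnection (or any object
    with the same request()/getresponse() methods). Returns the status codes.
    """
    statuses = []
    for _ in range(count):
        conn.request(
            "POST",
            path,
            body=json.dumps({}),  # placeholder body; the real payload may differ
            headers={"Content-Type": "application/json"},
        )
        response = conn.getresponse()
        # Read the whole body so the same connection can carry the next request.
        response.read()
        statuses.append(response.status)
        sleep(interval)
    return statuses


# Usage sketch (host and path are hypothetical):
#   conn = http.client.HTTPSConnection("registry.example.com")
#   heartbeat_loop(conn, "/sessions/SESSION_ID/heartbeat", interval=10, count=3)
#   conn.close()
```

Because every request goes through the same connection object, the TCP (and TLS) handshake is paid once rather than on each heartbeat, which is where most of the per-request cost would otherwise come from.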
Events feed
We added an events feed that includes all of the events that have happened during the lifecycle of your account (e.g. a service comes online or a configuration value gets updated).
The events feed is a great information source about your infrastructure and can act as a platform for auto-scaling or be used to kick off a host of different automation processes.
Configuration storage
Configuration storage allows users to store arbitrary configuration values in our system and get notified via the events feed when a value gets updated or deleted.
Storing all the configuration values in a centralized place allows for better visibility and easier introspection of the configuration data. It also allows applications to react to changes faster and makes automation of tasks that rely on those values a lot easier.
In a typical application, configuration values are stored in a config file. To update a value, you kick off some process that rewrites the configuration file and then either restarts the service or sends it a SIGHUP signal.
With the Service Registry Configuration Storage feature, your application can poll the events feed for changes and react to an updated value directly, without a file rewrite or a restart.
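A minimal sketch of that polling pattern is shown below. The feed is abstracted behind a `fetch_events(marker)` callable so the example stays self-contained. The event shape (`id`, `type`, `key`, `value`) and the type names `config.update` and `config.remove` are hypothetical and chosen only for illustration.

```python
def poll_once(fetch_events, config, marker, on_change):
    """Fetch events newer than `marker` and apply configuration changes.

    fetch_events(marker) -> list of event dicts (hypothetical shape).
    config               -> local dict mirroring the stored configuration.
    on_change(key, val)  -> callback invoked per change (val is None on delete).
    Returns the id of the last event seen, to be passed to the next poll.
    """
    for event in fetch_events(marker):
        if event["type"] == "config.update":
            config[event["key"]] = event["value"]
            on_change(event["key"], event["value"])
        elif event["type"] == "config.remove":
            config.pop(event["key"], None)
            on_change(event["key"], None)
        # Events of other types are ignored, but still advance the marker.
        marker = event["id"]
    return marker
```

An application would call `poll_once` on a timer, keeping the returned marker between calls so each event is applied only once, and use `on_change` to resize a pool, flip a feature flag, or reload whatever depends on the changed value.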
I hope you’ve found this post helpful! If you haven’t signed up for the preview yet, I encourage you to do so by filling out this short survey. I also encourage you to come back next week when I’ll talk more in depth about why you might need Service Registry and present some typical use cases.