For those of you who don’t know, Trove is the newest integrated OpenStack project. We have been working on it for over two years at Rackspace, and it’s been a wild ride. We’ve had a ton of help from our friends at HP, who have been on this roller coaster with us for a long while as well. You’re sure to hear more about Trove at OpenStack Summit Atlanta next week, but today I’d like to take a walk down memory lane with Trove, and talk about how it went from a small project started within Rackspace to the treasure it is today.
In The Beginning…
I remember chatting with my leadership team about a database as a service project, and wanting to help lead the project from the developer side of things. We had a few amazing engineers who helped architect the big picture. We started down a path of “build it ourselves,” and began using a bunch of Java tooling to get that done. At about the same time, Rackspace was adopting OpenStack Nova (in our public cloud) and, after much discussion, we decided to pivot and build the system on top of Nova. Looking back, that was the best decision we’ve made to date on the project!
Back then Nova, Glance and Swift were the big three projects in OpenStack. We dove into Nova and learned all that we could. And it was then that we made our first mistake. The Rackspace DBaaS team and I thought that Nova should have an *aaS API, with database being the first thing that we implemented. We started building what we called RedDwarf to go along with the Nova star theme.
<sarcasm>RedDwarf was also, apparently, the name of a TV show across the pond.</sarcasm>
In steps HP. I had some Google Hangouts with a few members of the HP database team, and they wanted to help. There was one problem: they disliked the way we were built inside Nova. After more chatting, I decided to do a stealth rewrite of RedDwarf, and call it RedDwarf Lite, à la the Keystone Lite rewrite. So for the next four weeks I spent nights and weekends banging out what would become the new foundation of what we call Trove today. (You can thank me or blame me.)
Let’s fast forward to Grizzly. During the Grizzly summit, the Rackspace and HP teams got together and drafted our incubation form. It was an exciting moment. Until then, we were still paying the tax of open source, and getting none of the benefits. We were excited to become an incubated project and get traction and support from the OpenStack community. A few weeks later we went up to the OpenStack Technical Committee, pleaded our case and we were rubber-stamped. Incubation. What a cool thing.
We spent a cycle in Incubation. During this time we grew and had some setbacks. We had done as much as we could to be in line with the ecosystem. But we didn’t do enough! We spent the next six months “falling in,” so to speak. We had a giant uptick in support from other companies. Mo developers, Mo problems. But it was starting to pay off. These were good problems to have. We also took the time to change our name from RedDwarf to Trove. We decided a treasure trove was a good place to store your data. So Trove it became.
Six months later I went back to the TC. It was time to apply for Integration. After a few good conversations, we passed another milestone. We would be integrated in Icehouse. Alliteration at its finest. And at that point, I became an official PTL. I even got a cool rainbow unicorn hat for it. Well, I’ve had the hat for a while…
Ok. But What About Trove?
If you’ve gotten this far, you are probably thinking to yourself “why the heck hasn’t this guy said anything about Trove?” Well, good question. Let’s talk about what Trove is now. Trove is, and always has been, Database/Datastore as a Service. We currently have great support for MySQL, and experimental support for Redis, Cassandra, Couchbase and MongoDB. That means both Rackspace and HP have deployed their MySQL Datastore and we have companies that have deployed the others in some form or fashion. They may not have backups, or may not be as fully vetted as MySQL in Trove.
Trove is currently single instance. We will get into clustering and replication in a bit, but for now, let’s talk about what Trove currently does. Trove can spin you up a database instance, which is secured by default. As a customer, you can request backups (incremental and full), and restore those backups into a new instance. You can edit configuration files for your Datastore. For MySQL, you can also add users and schemas to your database instance.
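To make the full versus incremental distinction a bit more concrete, here is a minimal sketch of why restoring from an incremental backup means walking back to the full backup it builds on. This is a toy in-memory model, not Trove's API or implementation: the `Backup` class, the `parent_id` field, and the `restore_chain` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Backup:
    """Hypothetical backup record; not a Trove data structure."""
    backup_id: str
    # None marks a full backup; otherwise the backup this one builds on.
    parent_id: Optional[str] = None


def restore_chain(backups: Dict[str, Backup], backup_id: str) -> List[str]:
    """Return backup ids in the order they would be applied on restore:
    the full backup first, then each incremental up to the requested one."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = backup_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at backup {current!r}")
        if current not in backups:
            raise KeyError(f"unknown backup {current!r}")
        seen.add(current)
        chain.append(current)
        current = backups[current].parent_id
    chain.reverse()  # we walked newest to oldest; restore goes oldest first
    return chain


# Example catalog: one full backup followed by two incrementals.
catalog = {
    "full-1": Backup("full-1"),
    "incr-1": Backup("incr-1", parent_id="full-1"),
    "incr-2": Backup("incr-2", parent_id="incr-1"),
}

print(restore_chain(catalog, "incr-2"))
```

In this model, restoring `incr-2` into a new instance would apply `full-1`, then `incr-1`, then `incr-2`, while restoring a full backup needs only itself.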
So if you want to know more about the present of Trove, you can learn all about it on the OpenStack wiki and doc page. Of course, you can also view the docs on docs.rackspace.com, since we are running trunk Trove in production!
Future Treasures In Trove
So this is fine and dandy, we are an official OpenStack project. But we are only single instance. When are we getting clustering? Replication? Point in time recovery? Where is all the CoolStuff ™???? These are all great questions. The Trovesters are working hard to make Trove a world class Datastore infrastructure system.
Replication and Clustering are not easy. Neither is designing an API that will work for MySQL, Redis, C*, Mongo, Couch and just about any other Datastore you can think of. We are taking a step back here, and have been working on a Replication and Clustering API for a while now. We don’t want to build something half-baked; we want this to work and we want it to work well for our users. We have discussed it heavily at a few summits as well as a mid-cycle sprint. We are close. We have a few companies building out the implementations as I type this.
Scheduler is such an overloaded term in OpenStack. But Trove needs one to be able to deal with maintenance windows, for automatic restarts and upgrades. Trove also needs it to be able to do scheduled backups for point in time recovery. This is something else that’s on the horizon (no pun intended), and it will help us with multiple features in Trove. It will be exciting to see this being built.
So that leaves us with some open questions, one of which is: How far should Trove go into the management of your services? We have aspirations of allowing auto scaling with OpenStack. We also want to help heal your services when they go down or slow down. If your replica goes down, Trove needs to step in and fix it. It’s going to be real fun to solve how we do that. This is where the rubber meets the road. This is how we differentiate ourselves from other Datastore services. This is the future of Trove.
If you’re at OpenStack Summit Atlanta next week and want to learn more about Trove, I’ll be co-presenting with Tesora vice president of product development and Trove developer Doug Shelley at 11 a.m. Thursday, May 15. Join us for “Introduction to OpenStack Trove: A Multi-Database Development with MongoDB and MySQL.” And be sure to check out other Racker-led talks throughout the summit.