The Apache Cassandra team is in the process of rolling out a release candidate for Cassandra 0.7. That makes this a good time to highlight some of the new features and changes that will be in this version.
Memory Efficient Compactions (CASSANDRA-16)
Compaction is the process by which Cassandra iterates over its data, removing deleted data forever and combining multiple data files into a single data file. This results in faster read operations. One limitation of the implementation was that if a row of data couldn’t fit in memory, then the data file it belonged to could never be compacted.
Since one of the main features of Cassandra is its use of sparse columns, users expect to be able to add columns without penalty until disk space runs out. Cassandra 0.7 solves this by performing compactions incrementally.
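The idea behind incremental compaction can be sketched as a lazy k-way merge of sorted column streams: because columns flow through one at a time, a row never has to fit in memory all at once. The sketch below is a toy model with an invented tuple layout, not Cassandra’s actual implementation (real compaction, for example, retains tombstones for a grace period so deletions propagate to all replicas):

```python
import heapq

def compact_row(fragments):
    """Incrementally merge the sorted column fragments of one row.

    Each fragment is an iterator of (column_name, timestamp, value)
    tuples sorted by column_name; a value of None marks a tombstone
    (deleted column). Columns stream through one at a time, so the
    full row never has to be held in memory.
    """
    merged = heapq.merge(*fragments, key=lambda col: col[0])
    current_name, winner = None, None
    for name, ts, value in merged:
        if name != current_name:
            # emit the previous column unless its winner is a tombstone
            if winner is not None and winner[1] is not None:
                yield (current_name,) + winner
            current_name, winner = name, (ts, value)
        elif ts > winner[0]:
            winner = (ts, value)          # newest timestamp wins
    if winner is not None and winner[1] is not None:
        yield (current_name,) + winner
```

Merging two fragments where column "c" was deleted with a newer timestamp yields only the live columns, without either fragment’s row ever being materialized in full.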
Online Schema Changes (CASSANDRA-44)
Prior to Cassandra 0.7, adding and removing column families and keyspaces required you to first distribute an updated configuration file to each of your nodes and then execute a rolling restart of your cluster. That was workable, but the process was manual, so it was easy to make mistakes.
Cassandra 0.7 solves this problem by exposing the ability to create and drop column families and keyspaces through its client API. The same methods offer limited support for updating existing column families and keyspaces (e.g., increasing the replication factor for a particular keyspace).
Additionally, we’ve made it easy to migrate your keyspace definitions from your 0.6 storage-conf.xml configuration file using bin/schematool. In the future we plan to add the ability to rename column families and keyspaces. A wiki article contains more detailed information.
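Conceptually, the change amounts to mutating a live schema registry rather than editing a file and restarting nodes. The toy model below illustrates those semantics only; the class and method names are invented for this sketch, not the real client API:

```python
class SchemaRegistry:
    """Toy in-memory model of online schema changes.

    Names are illustrative only -- this is not Cassandra's API,
    just the create/drop/update semantics it exposes to clients.
    """

    def __init__(self):
        # keyspace name -> {"replication_factor": int, "cfs": set of names}
        self.keyspaces = {}

    def add_keyspace(self, name, replication_factor):
        if name in self.keyspaces:
            raise ValueError("keyspace already exists: " + name)
        self.keyspaces[name] = {"replication_factor": replication_factor,
                                "cfs": set()}

    def drop_keyspace(self, name):
        del self.keyspaces[name]

    def add_column_family(self, keyspace, cf_name):
        self.keyspaces[keyspace]["cfs"].add(cf_name)

    def drop_column_family(self, keyspace, cf_name):
        self.keyspaces[keyspace]["cfs"].discard(cf_name)

    def update_keyspace(self, keyspace, replication_factor):
        # the "limited update" case, e.g. raising a replication factor
        self.keyspaces[keyspace]["replication_factor"] = replication_factor
```

The key point is that every operation takes effect on a running system; no configuration file is rewritten and no node is restarted.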
Secondary Indexes (CASSANDRA-749)
Cassandra has always been an excellent platform for storing data. However, it has lacked an expressive, efficient way to query data by any property other than the primary key (that is, to answer the question “which keys match these particular columns and values?”). To cope with this, many application developers ended up building their own secondary indexes, essentially inverted indexes of existing column families. This solution worked well, but it was tedious to maintain and involved a lot of duplicated effort.
The need for this feature has weighed heavily on our minds for a while; work on it began even before the release of Cassandra 0.6. One design decision that required some wrangling was whether secondary indexes should use node-local or distributed storage. Each approach has distinct advantages and disadvantages and serves different use cases. We ended up using node-local storage: each secondary index is wholly contained on the node that holds the data it indexes. Distributed secondary indexes may eventually make their way into Cassandra, but not in 0.7.
Using the API for online schema changes, secondary indexes can be added to existing or brand-new column families. An index can be queried from the client side using the get_indexed_slices() API method.
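A node-local secondary index is, in essence, an inverted index from a column’s value back to the row keys stored on that node, kept in sync as rows are written. The sketch below illustrates the concept with invented names; the real index lives in a hidden index column family and is queried through get_indexed_slices():

```python
from collections import defaultdict

class NodeLocalIndex:
    """Toy model of a node-local secondary index: an inverted index
    from an indexed column's value to the row keys on this node."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                  # row key -> {column: value}
        self.index = defaultdict(set)   # column value -> {row keys}

    def insert(self, key, columns):
        # un-index the old value if this row is being overwritten
        old = self.rows.get(key, {}).get(self.indexed_column)
        if old is not None:
            self.index[old].discard(key)
        self.rows[key] = columns
        if self.indexed_column in columns:
            self.index[columns[self.indexed_column]].add(key)

    def get_indexed_slice(self, value):
        # analogous in spirit to get_indexed_slices() on a single node
        return sorted(self.index[value])
```

Because the index lives beside the data it indexes, a write updates both atomically on one node; a query for a value fans out to the nodes and each answers from its local inverted index.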
Time to Upgrade?
For these reasons, Cassandra 0.7 is a compelling upgrade from earlier versions. But wait! The Cassandra team has been busy fixing bugs and adding more features. You can get the full details from the changelog, but I’ll highlight a few of them here.
The streaming code has been simplified to offer better tracking of files as they are streamed. Additionally, the requirement for anti-compaction, the time-consuming process of splitting a data file into multiple parts, has been removed. Streaming is faster and more predictable now.
Several things have happened to make reads faster overall. First, the read path has been optimized to take further advantage of the row cache introduced in 0.6 (CASSANDRA-1267 and 1302). Second, row keys have been switched from strings to byte arrays, picking up some speed by eliminating the overhead of Java string construction. Third, usage of byte[] has been replaced with java.nio.ByteBuffer. This allowed us to reduce the amount of byte-array copying on the read path and eliminated a significant amount of garbage generation (bonus!), easing the pain still felt from stop-the-world garbage collection.
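A rough Python analogue shows why this change matters: a memoryview, like a ByteBuffer slice sharing its backing array in Java, gives a zero-copy window onto a buffer instead of materializing a new byte string. This is an analogy only, not Cassandra’s code:

```python
def copy_slice(buf, start, end):
    # old style: copies the bytes out, producing garbage per read
    return bytes(buf[start:end])

def view_slice(buf, start, end):
    # new style: a zero-copy window onto the backing buffer, much like
    # ByteBuffer.slice() sharing its backing array in Java
    return memoryview(buf)[start:end]

# a made-up serialized row for illustration
row = bytearray(b"key0001|col:name|value:cassandra")
value_view = view_slice(row, 23, 32)
```

On a read path that touches every column of every requested row, avoiding those per-slice copies adds up quickly, both in CPU time and in garbage the collector never has to chase.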
We’ve upgraded our Thrift version from 0.2 to 0.5, which brings significant performance improvements. Framed transport is now the default, in contrast to the non-framed transport that was the default in 0.6 and earlier.
As with every release of Cassandra so far, our client API has fluctuated quite a bit, something we are not necessarily proud of. A mind-shift is occurring, though. Most Cassandra experts, myself included, have begun recommending that users access Cassandra through one of the available third-party libraries (e.g., Hector, Pelops, Pycassa) instead of directly via Thrift. The main benefit of this approach is that users are insulated from the sort of API changes that happen from version to version, as long as client developers keep up.
There are several things worth noting about the updated API.
- The keyspace argument has been dropped from nearly every method and replaced by a set_keyspace() method that sets the keyspace for the current connection session.
- Expiring (TTL) columns have been introduced. This gives you the ability to set a time-to-live for columns so that they are automatically removed during compaction without any intervention on your part.
- storage-conf.xml has been replaced with cassandra.yaml. The intent is to provide a format that is more human-readable. We have provided a converter that will allow you to convert your XML configuration into YAML prior to upgrading. Also, more settings are configurable than ever before.
- We have continued the trend of exposing more operations and metrics via JMX, including compactions, sstable accesses, endpoint states (gossiper) and TCP messages.
- The authentication framework and built-in Hadoop support have both received a lot of attention.
- Introduction of native code for snapshots and to help with memory swapping.
- An enhanced CLI experience.
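The expiring-column behavior mentioned above can be sketched as a column carrying an expiration time: reads treat an expired column as deleted, and compaction purges it. This is a toy model with invented field names, not Cassandra’s on-disk format:

```python
import time

class ExpiringColumn:
    """Toy model of an expiring (TTL) column. Once `ttl` seconds pass,
    reads treat the column as deleted and compaction can purge it."""

    def __init__(self, value, ttl, now=None):
        self.value = value
        self.expires_at = (now if now is not None else time.time()) + ttl

    def is_live(self, now=None):
        return (now if now is not None else time.time()) < self.expires_at

def read_column(col, now=None):
    # an expired column reads as absent, no delete ever issued by the client
    return col.value if col.is_live(now) else None
```

The appeal is operational: data such as session tokens or presence flags simply ages out, with no cleanup job to write or schedule.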
Cassandra 0.7 offers significant advantages in features and speed over earlier releases. We’ve done our best to make the upgrade process simple: old data files are still compatible, but your cluster must be brought down entirely for the upgrade.
As always, if you would like to learn more about Cassandra, good places to start are the website, wiki and #cassandra on IRC (freenode). Apache also hosts several mailing lists that are excellent resources. Additionally, NEWS.txt is a good place to go to find out more about some of the high-level changes in Cassandra.