Should You Switch To NoSQL Too?

By Jonathan Ellis | February 25, 2010 2:00 pm

With Twitter going public with their plans[1] to switch to Cassandra[2], a lot of people are asking if they should switch too[3].

Ian Eure from Digg (also switching to Cassandra[4]) gave a great rule of thumb last week at PyCon[5]: “if you’re deploying memcache on top of your database, you’re inventing your own ad-hoc, difficult to maintain NoSQL database,” and you should seriously consider using something explicitly designed for that instead.

While there are other reasons to use Cassandra, such as the less-rigid schema or the best-in-class support for replication across multiple datacenters, most people are using it because a single machine can’t handle their query volume. This is the case for other distributed NoSQL databases as well, although there are other kinds[6] that are also useful for some applications.

The price of scaling is that Cassandra provides poor support for ad-hoc queries, emphasizing denormalization instead[7].  For analytics, the upcoming 0.6 release (in beta now) offers Hadoop map/reduce[8] integration, but for high volume, low-latency queries you will still need to design your app around denormalization.
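As a rough sketch of what designing around denormalization means, the Python below precomputes the answer to one known query (a reader's timeline) at write time, so the read is a single key lookup instead of a join. The dicts stand in for tables or column families, and the follower data and function names are hypothetical; this is not Cassandra's API or anyone's actual data model.

```python
from collections import defaultdict

# Hypothetical data: who follows each author.
followers = {"alice": ["bob", "carol"]}

# Denormalized store: one precomputed timeline per reader.
timelines = defaultdict(list)


def post(author, text):
    """Copy the new post into every follower's timeline at write time."""
    for reader in followers.get(author, []):
        timelines[reader].append((author, text))


def read_timeline(user):
    """Serve the timeline with one key lookup; no join, no scan."""
    return list(timelines.get(user, []))
```

The cost moves to the write side: each post is stored once per follower, and a question the layout was not built for (say, all posts containing a given word) has no cheap answer. That is the ad-hoc query tradeoff in miniature.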

So NoSQL systems are not drop-in replacements for a relational database. It looks like Twitter has been working on moving to Cassandra at least since Evan’s post in July last year[9]; your application may be easier or harder to port, but it’s a useful data point to keep in mind.

For more on why the difficulty of scaling relational databases is driving developers to Cassandra, see the video[10] or slides[11] from my talk last week at PyCon, and for an introduction to Cassandra itself, see Eric Evans’ talk at FOSDEM, also available with video[12] or slides[13].

Related Posts: The Cassandra Project[14], How Do You Put Database In The Cloud?[15], NoSQL Ecosystem[16]

  1. Twitter going public with their plans:
  2. Cassandra:
  3. asking if they should switch too:
  4. switching to Cassandra:
  5. PyCon:
  6. there are other kinds:
  7. emphasizing denormalization instead:
  8. Hadoop map/reduce:
  9. Evan’s post in July last year:
  10. video:
  11. slides:
  12. video:
  13. slides:
  14. The Cassandra Project:
  15. How Do You Put Database In The Cloud?:
  16. NoSQL Ecosystem:
