Rails developers who want to leverage the speed and power of Rackspace Cloud Files are in luck – there are several different options to choose from. In this blog post, I will take a quick tour through three of the most popular Rails gems/plugins for cloud asset storage: attachment_fu, paperclip, and carrierwave.
Before I begin, why would you want to store assets in the cloud? First, you get the benefit of massively expandable storage. You don’t need to worry about adding another disk drive to your RAID or LVM pools; you just keep uploading and let the cloud provider worry about the disks. Second, you can gain the speed of a Content Distribution Network (CDN), such as Rackspace’s partner Limelight. This allows your users to get data faster, as the CDN routes their requests to network nodes located closer to them. Finally, you’re decoupling dynamic data (uploaded files) from static data (your application code), which makes it easier to scale your application servers from server snapshots. That comes in handy when scaling out server images on Cloud Servers.
With that in mind, here’s a quick look at the big three plugins. For each of them, I have also provided links to the source code on GitHub, as well as a small sample Rails application demonstrating how to use it. All of these utilities rely on the cloudfiles gem (source). (As an aside, that gem is due for some refactoring, so if anyone wants to help out with that project, feel free to drop me an email.)
attachment_fu was one of the first solid file attachment plugins for Rails, with first commits happening back in December 2006. Mike Clark blogged about it in early 2007, and it was included as an example in the Advanced Rails Recipes book. It was created as the successor to the earlier acts_as_attachment plugin. Uploaded files can be stored to the filesystem, database, Rackspace Cloud Files, or Amazon S3.
In attachment_fu, the uploaded file is stored as a discrete model in and of itself. So for a User with an uploaded Photo, I’ve found it easier to put the user data in the User model, and join that with a Photo model that just handles the uploaded file. I haven’t had much luck adding extra data to the attachment model. Additionally, due to the way the metadata on the object is stored in the database, one model cannot represent multiple types of uploaded files (so the same model can’t handle both a Photo and an Avatar).
attachment_fu can work with a wide variety of image processors, including RMagick, ImageScience, MiniMagick, GD2, and even CoreImage on the Mac. The attachment_fu README also gives examples of how to use it in scripting, which is handy if you need to bulk-add data or migrate from a different file upload system. The README is also very thorough in documenting the various options to the has_attachment definition, as well as the database tables.
(Cloud Files support for Paperclip is available in the paperclip-cloudfiles gem.)
Paperclip is likely the most popular file upload plugin today, with over 2,000 watchers on GitHub and over 300 forks. Part of the great suite of Thoughtbot software (including clearance, hoptoad, factory_girl, shoulda, and more), it makes it easier to integrate uploaded files with existing models. You don’t need to dedicate an entire new model to your uploaded files; instead, you just add a few new database fields to the model (representing filename, content type, size, and update date). You can also have multiple files attached to a given model.
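As a rough illustration of the “few new database fields” idea, here is a plain-Ruby sketch that lists the columns Paperclip conventionally looks for on a model, given an attachment name. The helper paperclip_columns and the constant below are hypothetical names made up for this example (they are not part of Paperclip), and the column types shown are the usual choices rather than a hard requirement.

```ruby
# Illustrative only: not part of Paperclip itself.
# Maps each conventional column suffix to the type typically used for it.
PAPERCLIP_COLUMN_SUFFIXES = {
  file_name: :string,      # original filename of the upload
  content_type: :string,   # MIME type, e.g. "image/png"
  file_size: :integer,     # size in bytes
  updated_at: :datetime    # when the attachment last changed
}.freeze

# Returns a hash of column name => column type for one attachment.
def paperclip_columns(attachment_name)
  PAPERCLIP_COLUMN_SUFFIXES.map do |suffix, type|
    ["#{attachment_name}_#{suffix}", type]
  end.to_h
end
```

For an attachment called avatar, this yields avatar_file_name, avatar_content_type, avatar_file_size, and avatar_updated_at. A second attachment on the same model would simply get its own set of four columns.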
Paperclip offers an extensible selection of post-processors that run when a file is uploaded, so you can perform tasks such as OCR’ing files into text, transcoding video, and more. There are also pre- and post-hooks that give you more fine-grained control over handling uploads. For image processing (the most common use case), thumbnailing is built in, and Paperclip uses system-level ImageMagick binaries to handle it (avoiding memory issues with RMagick).
The base installation of paperclip supports storing files on the filesystem or in Amazon S3. Thoughtbot’s policy is that they don’t add any code to the mainline of their projects unless they actively use it in client projects, so there is no support for Cloud Files in the main branch of paperclip. I maintain a fork of paperclip with Cloud Files support, though, and that gem can be found as “paperclip-cloudfiles” on the gem servers. It’s on my list to refactor that code into a plugin for the mainline, as opposed to a fork of the entire project.
Carrierwave is the new kid on the block, with first commits coming in early 2009. Interestingly, it is not just a Rails solution, but provides options for Rails, Merb, Sinatra, or other Rack applications. It splits out the uploader code into a class located in app/uploaders for Rails applications. This gives you the benefit of isolating the uploading code (rspec testing macros are built in), while easily allowing your base model to include multiple uploaded assets. Carrierwave can store files locally on the filesystem, on Rackspace Cloud Files, or on Amazon S3 (using AWS-S3 or Right-S3).
Carrierwave provides generators to give you the basic skeleton for your uploader. Because the uploader model is just a Ruby class, it’s very easy to override defaults or add new features – just open up that class and redefine methods as you go! Carrierwave can be configured on a global level (through an initializer) and on a per-uploader basis.
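To make the “just a Ruby class” point concrete, here is a minimal plain-Ruby sketch of overriding defaults in an uploader subclass. It deliberately does not load Carrierwave: BaseUploader is a hypothetical stand-in for the library’s base class, and allowed_extensions and accepts? are names invented for this example. Only store_dir mirrors a method name you would find in a generated uploader, so treat the rest as an assumption.

```ruby
# Hypothetical stand-in for the library's base uploader; it only exists so
# this example runs on its own. It supplies defaults a subclass may override.
class BaseUploader
  def initialize(model_name, mounted_as)
    @model_name = model_name
    @mounted_as = mounted_as
  end

  # Default directory for stored files.
  def store_dir
    "uploads"
  end

  # Default: nil means any file extension is accepted.
  def allowed_extensions
    nil
  end

  # Checks a filename against the (possibly overridden) extension list.
  def accepts?(filename)
    allowed = allowed_extensions
    return true if allowed.nil?

    allowed.include?(File.extname(filename).delete(".").downcase)
  end
end

# The per-uploader class: redefine only the methods you care about.
class AvatarUploader < BaseUploader
  def store_dir
    "uploads/#{@model_name}/#{@mounted_as}"
  end

  def allowed_extensions
    %w[jpg jpeg png gif]
  end
end
```

With this in place, AvatarUploader.new("user", "avatar").store_dir gives "uploads/user/avatar", while the base class keeps its defaults for every other uploader.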
Another interesting thing about Carrierwave is its built-in support for not only ActiveRecord but also MongoDB’s GridFS. With NoSQL databases growing in popularity, having out-of-the-box support for MongoDB is a handy way to dip your toes in the water. For image processors, RMagick, ImageScience, and MiniMagick are supported. Carrierwave provides support for pulling uploads from a remote URL in addition to a direct file upload, which is quite cool. It also easily handles the common use case of removing previously uploaded files with a simple checkbox.
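The remote URL and checkbox features show up as extra attributes on the model, named after the column the uploader is mounted on. The sketch below spells out that naming convention as I understand it from the Carrierwave README; the helper carrierwave_form_fields is a hypothetical name made up for this example, not something the gem provides.

```ruby
# Hypothetical helper (not part of Carrierwave): given the column an uploader
# is mounted on, return the form field names for the features described above.
def carrierwave_form_fields(column)
  {
    upload: column.to_s,                  # regular file upload field
    remote_url: "remote_#{column}_url",   # text field holding a URL to fetch
    remove: "remove_#{column}"            # checkbox to delete the stored file
  }
end
```

So for an uploader mounted on avatar, a form would use a file field named avatar, a text field named remote_avatar_url, and a checkbox named remove_avatar.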
Carrierwave has an extensive amount of documentation in its README file, so peruse that to see what it can do!
attachment_fu is the old reliable workhorse. It’s not under heavy development at present, so it will be interesting to see if things pick up as Rails 3 gains traction. But you will be able to find a lot of online documentation supporting it, and it’s stable and works well.
Paperclip is leaner, integrates well with the rest of the Thoughtbot toolkit, and is widely popular. It’s tailored to a fairly specific technology set by design, though, so Rackspace customers will need to use a fork of the main project.
Carrierwave is the upstart, taking a different tack on the idea of asset uploads, and is touching on a wider variety of technology than the others. It’s still under active development, so you’ll want to keep an eye on how things change. I’m very impressed with it.
Play around with all three, find the one that suits your particular feature needs, and easily leverage the power of Cloud Files! And if I’ve gotten anything completely wrong, feel free to correct me in the comments!