It was February 14, and BuiltLean was about to have the biggest traffic day in its two-year existence. And we weren’t ready.
A post on posture problems that develop from office work was submitted to the LifeProTips subreddit at 10 a.m. Eastern. Within the first hour, it had garnered 500 upvotes and 35 comments; by the second, it reached over 1,000 upvotes and had hit the front page.
Previously, we had survived hitting the front page of Digg twice, largely because both times fell during off-peak hours. But this was a Thursday at noon, and there was no avoiding a lot of traffic arriving at once from Reddit.
Like the once-fabled “Digg Effect,” Reddit can send a tremendous amount of traffic to your site. What makes this type of traffic especially difficult to handle is how quickly and how densely it arrives; it often resembles a small-scale DDoS attack. It’s like a machine gun. We were prepared to handle 100,000 hits in a day, but we were nowhere near prepared for 100,000 hits in an hour.
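To put those two figures side by side, a quick back-of-the-envelope calculation helps. The sketch below assumes hits are spread evenly across each window, which real Reddit traffic is not, so the results are averages rather than peaks. The function name is ours, purely for illustration.

```python
def hits_per_second(hits: int, window_hours: float) -> float:
    """Average request rate, assuming hits are spread evenly over the window."""
    return hits / (window_hours * 3600)


daily_rate = hits_per_second(100_000, 24)  # the load we had planned for
hourly_rate = hits_per_second(100_000, 1)  # the same hits compressed into one hour

print(f"100,000 hits in a day:   {daily_rate:.2f} requests/second")
print(f"100,000 hits in an hour: {hourly_rate:.2f} requests/second")
print(f"Ratio: {hourly_rate / daily_rate:.0f}x")
```

The same number of hits compressed into one hour means the server has to sustain 24 times the average request rate, and the real peaks within that hour will be higher still.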
Luckily, through active monitoring of our referral traffic, we spotted the Reddit submission within the first hour. Knowing we weren’t ready for what was potentially about to happen, we called Rackspace support, which quickly diagnosed the situation and prescribed two changes: an upgrade of our database server, and the temporary addition of a second server with a load balancer to distribute requests between the two. Within 20 minutes, both were in place, almost simultaneously with our post hitting the front page of Reddit.
Thanks to Rackspace’s swift deployment, we were largely able to absorb the traffic sent from Reddit. All told, we received over 125,000 visitors from Reddit, 60,000 of whom arrived between noon and 2 p.m. But the on-the-fly adjustments that got us through this traffic storm weren’t a long-term fix for the underlying issues. We had to be ready in case something similar happened again.
Going forward, in collaboration with Rackspace’s support staff, we made two substantial changes:
- We added Varnish, an HTTP accelerator that greatly reduces the time it takes to serve a page. The errors we did have during the Reddit traffic surge were largely due to Apache maxing out its simultaneous connections, a problem exacerbated by how slowly each requested page was delivered.
- We permanently added a second server behind a load balancer, which we could bring into rotation on demand during our most trafficked hours or whenever a surge called for it.
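Why slow page delivery and maxed-out connections go hand in hand can be sketched with Little's law, which says the average number of requests in flight equals the arrival rate multiplied by how long each request is held open. The response times and the connection limit below are hypothetical values chosen for illustration, not measurements from our servers; only the 100,000-hits-in-an-hour figure comes from the surge described above.

```python
def connections_in_flight(requests_per_second: float, seconds_per_response: float) -> float:
    """Little's law: average concurrency = arrival rate x time each request is held."""
    return requests_per_second * seconds_per_response


ARRIVAL_RATE = 100_000 / 3600  # about 28 requests/second (100,000 hits in an hour)
CONNECTION_LIMIT = 50          # hypothetical cap on simultaneous connections
SLOW_RESPONSE = 2.0            # hypothetical seconds to build an uncached page
FAST_RESPONSE = 0.05           # hypothetical seconds to serve a cached page

for label, seconds in [("uncached", SLOW_RESPONSE), ("cached", FAST_RESPONSE)]:
    needed = connections_in_flight(ARRIVAL_RATE, seconds)
    status = "over" if needed > CONNECTION_LIMIT else "within"
    print(f"{label}: ~{needed:.1f} connections in flight ({status} the limit of {CONNECTION_LIMIT})")
```

Under these assumed numbers, the same traffic that overruns the connection limit with slow responses barely registers once pages are served quickly, which is the effect a cache such as Varnish is meant to have. A second server helps from the other direction by splitting the arrival rate each machine sees.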
We’re happy to report that with the above in place and ongoing refinements to BuiltLean’s WordPress configuration, page speed and server stability have greatly improved. Akin to the saying “dress for the job you want, not the job you have,” our takeaway was to always be prepared for the traffic we wanted.