Understanding Disaster Recovery and High Availability

Whether you use the cloud or dedicated servers, you should always make sure you have a plan for your configuration in the event that something goes wrong. This is the first in a series of posts based on a discussion I had with Aaron Scheel, a solutions engineer here at Rackspace.

People use the phrase Disaster Recovery (DR) to mean different things, and they will often ask what they can do to make sure that they have a strong DR plan. Scheel says that sometimes people will request a DR strategy when in reality they are looking for a High Availability (HA) solution. This is why it is important to understand what you are trying to accomplish.

This first post will be a discussion on these topics, the second will be focused on DR solutions in the cloud and the final post will explore ways to implement HA configurations in the cloud.

Disaster Recovery

The distinguishing feature of a DR plan is that you are preparing to recover from a disaster that actually occurred in your environment, and that you have to restore services through a series of actions. There are a variety of things that could happen, so it is important for a company to look at their business and define what constitutes a disaster.

For example, if you run a high volume, high transaction solution that is vital to a core group of people to do their job, a disaster could be defined as dropping requests for a ten minute time. Even though your configuration had issues for a very small duration, it would have a very large impact to your business.

This is contrasted by a business that has fairly static solution focused on entertainment that people use on an infrequent basis. While not desired, ten minutes of downtime might not constitute a disaster for this business. However, having a server go offline and become completely dark could be a disaster for them.

You can see that a company can define many things as a disaster, but whatever constitutes a disaster must actually occur. DR is not focused on preventing a disaster as much as it is the process, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster.

High Availability

The goal for a HA configuration is to prevent a disaster from occurring in the first place. There are steps that customers can take to have a certain amount of redundancy of their configuration so that if they drop some packets or information or lose a node in their configuration, their solution can still hum along without missing a beat.

HA configurations do this by having a degree of automatic failover to other nodes in your configuration. By having an environment that self heals, you can ensure that your configuration will most likely not go down.

HA configurations are more focused on  automatically mitigating the impact of disasters when they occur.  This can be done via automated failover options or load-balancing options in configurations. Generally speaking, having this type of configuration to help prevent disasters is going to be more costly than simply having systems in place to recover from one.

Questions to Answer

Different businesses have different needs, and our solutions engineering Rackers have the expertise to answer your questions and assist in identifying the best solution for your needs. To understand what solution works for you, it would be best to give us a call and chat with one of these Rackers in more detail.

A solutions engineer will take you through many types of questions to help determine which solution is best for you, including:

(1)      How difficult is it for you to spin up a new server?
(2)      What is your tolerance for downtime?
(3)      How much storage are you using on your server?
(4)      What kind of performance do you require from the solution?
(5)      What is the business model that will be utilized (e-commerce, SaaS, etc)?
(6)      What kind of growth are you projecting in the next 3/6/12 months?

Additionally, there will be several more questions depending on how you answered the ones listed above. You should begin asking yourself these types of questions because they will lead you to the best option to choose. Next week I will be discussing DR in more detail, and in two weeks I will provide more information on HA configurations.

Aaron Scheel is a solutions engineer Racker who has advised a number of our customers with strategies to ensure that their businesses have the ability to withstand an adverse event.

Garrett Heath develops content and supports customers on the Rackspace Social Media team. His previous experience includes technical project management in the cloud, content marketing and social media marketing. He enjoys writing about how the cloud is spurring innovation and telling stories about the people behind the tech. You can also read his work at MarketingBytes.io. In his free time, Garrett writes about food and local San Antonio culture at SA Flavor.


Please enter your comment!
Please enter your name here