For many years now, databases have been the central repository for our most critical data. They are core to running local applications as well as web-based applications and sites.
Databases come in all sizes and shapes. We use words like relational, SQL, NoSQL, columnar, warehouse and even big to describe our data and how it relates to itself and external resources. With so much depending on them, migrating a database is no small matter, and the process should be planned, purposeful and precise.
In this post, I want to talk about the process of migrating databases to AWS. As with any other migration target, there are several considerations to evaluate when migrating your database to AWS.
The first consideration to evaluate when migrating databases to AWS is compliance. Compliance is critical for two reasons: you need to be compliant, and your compliance requirements may affect every other consideration and choice.
As a solution architect at Rackspace, I work with customers every day who are regulated by compliance standards such as the Payment Card Industry Data Security Standard, the Health Insurance Portability and Accountability Act, or the General Data Protection Regulation (an EU regulation that takes effect in 2018). These standards regulate data security and access, and therefore apply heavily to databases.
When migrating a database to AWS, it is critical to understand which services are compliant with the particular regulation you are working with. While detailing each regulation and how AWS falls within it is beyond the scope of this post, a great place to start is the compliance section of the AWS website.
While it’s not specific to a particular regulation, I would be remiss not to spend at least some time on encryption. If you look at the different compliance regulations, you will see that most, if not all, emphasize privacy, both through limited access and through the removal of personally identifiable information. One of the best ways to protect data from prying eyes is encryption, both at rest and in transit.
When moving your database, it’s critical to ensure that your database storage, backups and snapshots are all encrypted. AWS offers a transparent layer of encryption at rest that uses AWS Key Management Service (AWS KMS) to maintain the keys. Some regulatory standards even require database-level encryption. It is also critical to evaluate the encryption of your data in flight. When a server requests data from the database, even a server on a private network, the connection might need to be encrypted, for example with TLS. The bottom line here is to KNOW YOUR REGULATORY REQUIREMENTS.
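To make encryption at rest a little more concrete, here is a minimal Python sketch that assembles the settings for an encrypted RDS instance. The dictionary keys follow the parameter names of the boto3 `create_db_instance` call; the helper functions, the instance identifier, the instance class, the storage size and the KMS key alias are all hypothetical placeholders of mine, not recommendations.

```python
def encrypted_rds_params(identifier, engine, kms_key_id, username, password):
    """Build keyword arguments for an encrypted RDS instance.

    Keys mirror boto3's rds.create_db_instance parameters. Sizes and
    retention below are assumed example values.
    """
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,
        "DBInstanceClass": "db.m5.large",  # assumed size, adjust to workload
        "AllocatedStorage": 100,           # GiB, assumed
        "MasterUsername": username,
        "MasterUserPassword": password,    # placeholder, never hard-code real secrets
        "StorageEncrypted": True,          # encryption at rest
        "KmsKeyId": kms_key_id,            # key maintained in AWS KMS
        "BackupRetentionPeriod": 7,        # days of automated backups, assumed
    }


def is_encrypted_at_rest(params):
    """Return True only if storage encryption is on and a key is named."""
    return bool(params.get("StorageEncrypted")) and bool(params.get("KmsKeyId"))


params = encrypted_rds_params(
    "orders-db", "mysql", "alias/orders-db-key", "admin", "change-me"
)
print(is_encrypted_at_rest(params))
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("rds").create_db_instance(**params)`. A check like `is_encrypted_at_rest` only inspects the settings you intend to send; it does not replace reviewing your actual regulatory requirements.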
Choosing a Host
Choosing a host within AWS is a critical decision and requires a little research. First of all, it’s important to understand what I mean by a host within AWS. When hosting a database in AWS there are several options to choose from:
Database as a Service
One option is to run your database in one of the managed database services that AWS offers. For example, if you want to run a relational database like MySQL, MS SQL or Oracle, you can choose to host your database on Amazon RDS.
By choosing RDS, you get to take advantage of a service that is fully managed by AWS. For starters, this means you don’t have to worry about instance maintenance and patching. Additionally, AWS handles multi-zone replication, read replicas, automated backups, snapshots, recovery and more at your request. Amazon RDS also offers additional database engines like MariaDB and PostgreSQL, and Amazon Aurora provides AWS’s own MySQL- and PostgreSQL-compatible engines built for higher performance at scale. The Amazon RDS documentation covers all of this in detail.
In addition to a relational database service, AWS also offers a NoSQL database in the form of Amazon DynamoDB and data warehousing with Amazon Redshift. In the same manner as Amazon RDS, these services provide database computing at scale without the complexity of infrastructure maintenance and care.
A second option for hosting databases in the AWS cloud is to run EC2 instances and install your database of choice. This is the equivalent of spinning up a local machine and installing your database. While I strongly recommend using a managed service when possible, this option provides some advantages that you might not have in a managed service.
First off, you are not limited to the supported offerings. With the availability of Windows and multiple flavors of Linux EC2 instances, you have a lot of flexibility in which database engine and version you want.
In addition to being able to run non-supported databases and versions, this option also allows you some advantages with supported databases. In this configuration, you have access to the OS, while in a managed service you do not. This gives you complete control over configurations and tooling that might require OS-level access.
Migration Types and Tools
A third consideration for database migration to AWS is to understand what type of migration is being done and what tools are available. Types of migration can be broken down into two categories.
The first and most common type is a homogeneous migration. In this type of migration, you move from an external database to a database in AWS while keeping the same engine, for example MySQL to MySQL or Oracle to Oracle.
The second type is a heterogeneous migration. In this type of migration, you move from an external database to a database in AWS and change the database engine along the way, for example from PostgreSQL to MariaDB or from MS SQL to MySQL on Aurora.
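The distinction between the two types comes down to one question: does the engine change? A minimal Python sketch of that check follows. The function name is my own, and the plain name comparison assumes both engines are spelled the same way (it would treat "MS SQL" and "SQL Server" as different engines).

```python
def classify_migration(source_engine, target_engine):
    """Label a migration by whether the database engine changes."""
    same_engine = source_engine.strip().lower() == target_engine.strip().lower()
    return "homogeneous" if same_engine else "heterogeneous"


examples = [("MySQL", "MySQL"), ("Oracle", "oracle"), ("PostgreSQL", "MariaDB")]
for source, target in examples:
    print(source, "->", target, ":", classify_migration(source, target))
```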
Once you have established the type of migration you are doing, you can plan how you are going to execute the move and decide which tools will be useful. At Rackspace, we generally opt for simple first. Many times, this is as basic as standing up a database in AWS and replicating data using native database replication. If data size is a factor, then we might look at dumping the data and transferring it to AWS via an AWS Snowball or even a massive AWS Snowmobile. Once the data is on location, we can then restore that data to the appropriate database and then sync through replication.
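Whether data size is "a factor" is mostly arithmetic: how long would the copy take over the network link you have? Here is a rough Python sketch of that estimate. The 80 percent link utilization and the seven-day cutoff are assumptions I picked for illustration, not Rackspace or AWS guidance.

```python
def transfer_days(size_tb, link_mbps, utilization=0.8):
    """Estimate days needed to copy size_tb terabytes over a network link.

    utilization is the assumed share of the link the copy can actually use.
    """
    bits = size_tb * 8e12  # 1 TB = 10**12 bytes = 8 * 10**12 bits
    usable_bits_per_second = link_mbps * 1e6 * utilization
    return bits / usable_bits_per_second / 86400  # 86,400 seconds per day


def suggest_transfer(size_tb, link_mbps, max_days=7):
    """Suggest a transfer path; max_days is an arbitrary example threshold."""
    if transfer_days(size_tb, link_mbps) <= max_days:
        return "network"
    return "offline device"


print(round(transfer_days(50, 1000), 1))  # 50 TB over a 1 Gbps link
print(suggest_transfer(50, 1000))
print(suggest_transfer(500, 1000))
```

Under these assumptions, 50 TB over a 1 Gbps link takes a little under six days, while 500 TB takes closer to two months, which is where shipping the data on a physical device starts to look attractive.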
However, when things get complicated and we need to manage a database migration where native replication may not work, we then look to AWS Database Migration Service (AWS DMS). AWS DMS allows for initial migration and continual replication of data with little to no downtime. This service also allows streaming into Amazon Redshift or Amazon DynamoDB. In addition, when performing a heterogeneous migration, the AWS Schema Conversion Tool helps to transform from one schema to another.
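For a sense of what a DMS task looks like, here is a minimal Python sketch that builds a table-mapping document and pairs it with a migration type. The JSON layout follows the DMS selection-rule format, and `full-load-and-cdc` is the DMS migration type that performs the initial copy and then keeps replicating changes. The helper function, the task name and the `orders` schema are hypothetical.

```python
import json


def dms_table_mappings(schema_name, table_pattern="%"):
    """Return a DMS table-mapping JSON string that includes matching tables."""
    rules = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-" + schema_name,
                "object-locator": {
                    "schema-name": schema_name,
                    "table-name": table_pattern,  # "%" matches every table
                },
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(rules, indent=2)


task_settings = {
    "ReplicationTaskIdentifier": "orders-migration",  # hypothetical task name
    "MigrationType": "full-load-and-cdc",             # initial load, then ongoing changes
    "TableMappings": dms_table_mappings("orders"),    # hypothetical schema
}
print(task_settings["TableMappings"])
```

A real task created through boto3's `create_replication_task` also needs the ARNs of a source endpoint, a target endpoint and a replication instance, which are left out here because they depend entirely on your environment.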
A final consideration for migrating databases to AWS is to understand some of the general idiosyncrasies of your chosen database engine and any additional complexities an AWS DBaaS might add.
So, you’re probably thinking, “Wow, that’s a loaded statement,” and you would be right. Different databases have different limitations, and they can sometimes be compounded in a service offering. For example, in MySQL, MyISAM tables cause table-level locking during mysqldump. Also, when moving to AWS, non-InnoDB storage engines are permitted but not supported. While listing all of these little nuggets is outside of the scope of this post, I encourage you to research these topics before you get into the migration.
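One way to research the storage engine question ahead of time is to list which tables are not on InnoDB. The sketch below holds a query against MySQL's `information_schema.tables` and a small Python filter over its results. The sample rows and the function name are made up for illustration, and running the query requires a MySQL client of your choice.

```python
# Query for a MySQL source database; system schemas are excluded.
NON_INNODB_QUERY = """
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND engine <> 'InnoDB'
  AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys');
"""


def tables_needing_attention(rows):
    """Filter (schema, table, engine) rows down to non-InnoDB tables."""
    return [
        (schema, table, engine)
        for schema, table, engine in rows
        if engine and engine.lower() != "innodb"
    ]


# Hypothetical rows, shaped like the query's result set.
sample_rows = [
    ("shop", "orders", "InnoDB"),
    ("shop", "sessions", "MyISAM"),
    ("shop", "audit_log", "ARCHIVE"),
]
for schema, table, engine in tables_needing_attention(sample_rows):
    print(schema + "." + table, "uses", engine)
```

Any table this turns up is a candidate for conversion to InnoDB, or at least for a closer look, before the migration starts.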
It bears repeating: migrating a database anywhere can be a daunting task. In this respect, AWS is no different. Fortunately, AWS provides many options and tools to successfully migrate and run your databases in the cloud. However, while this is not intended as a sales pitch, I would encourage you to seek a partner in this journey. Rackspace migration experts research and plan for complex database migrations every day. We partner with AWS and companies from the beginning to ensure success.
Visit Rackspace to find out more about our managed support for AWS and the ways we help businesses with database migration.