As of version 8.2, Sitecore supports Microsoft Azure SQL. In this post, I’ll explain in depth the two flavors of Azure SQL, plus how they can potentially help you cut costs.
Azure SQL comes in two flavors: single instance and ElasticPools, which I’ll examine in detail below.
Microsoft offers four pricing tiers for its Azure SQL offering: Basic, Standard, Premium and Premium RS. Each of these tiers includes a feature set, which typically comes in the form of a maximum database size along with a range of performance for each tier. For performance, Microsoft came up with a concept called Database Transaction Units (DTUs), which are an aggregation of IOPS, RAM and CPU usage.
For single instances, let’s look at the Standard tier, which comes in sizes S0-S3. As you can see below, there’s a range from 10 to 100 DTUs with a maximum database size of 250GB. The Premium P1 size at the bottom of the table has just 25 more DTUs at more than three times the price, but it moves the maximum database size to 500GB. There is an additional P series I’m not showing, but you can see the P series gets expensive quickly.
You can read more here on the Azure SQL Pricing Page.
Interestingly, the SLA is the same for Standard and Premium at 99.99 percent. That’s because, on the back end, we get SQL Always On running one primary and four secondary replicas. For Sitecore, we’re going to need at least four single instances, and at a minimum they should be S1s. Sitecore says as much when they talk about their new Managed Cloud Product Topologies.
Four S1s in the US East Region is only $119.99 a month!
Compared to the SQL high availability configurations I covered last month, which ran at $1,462.70 per month for the mirror configuration (without a witness server, using SQL Server Standard licensing) and $3,099.50 per month for the SQL Always On configuration (using SQL Server Enterprise licensing), this looks to be a bargain. In fact, we now get four secondary replicas in Azure SQL, not just a primary and one secondary.
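To put the gap in perspective, here’s the arithmetic as a quick sketch; the prices are the ones quoted in this post (US East, at the time of writing) and will differ by region and over time:

```python
# Rough monthly cost comparison using the figures quoted above.
four_s1s = 119.99          # 4x Azure SQL S1 single instances
mirror = 1462.70           # SQL Server Standard mirroring, no witness
always_on = 3099.50        # SQL Server Enterprise Always On

print(f"4x S1:     ${four_s1s:,.2f}/month")
print(f"Mirror:    ${mirror:,.2f}/month ({mirror / four_s1s:.1f}x the cost)")
print(f"Always On: ${always_on:,.2f}/month ({always_on / four_s1s:.1f}x the cost)")
```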
So, what’s the trade-off? In a nutshell, it’s performance: four S1s get you 80 DTUs in total. I’ve not seen anyone figure out the maximum DTU equivalent of a dedicated SQL Server vs Azure SQL, but based on my own tests, I’d say it’s more than a thousand, even for a minimally spec’d environment using Azure D v2 VMs and a P10 SSD.
How does this performance limit affect Sitecore? It depends. If you have a smaller, low-traffic site with limited content, when the CDs and CMS have an IIS reset, the pre-caching mechanism tries to cache the web, master and core DBs as much as possible; only once that completes is Sitecore operational. With Azure SQL on S1 instances this startup period will likely be slow, but how much slower depends on the amount of content you have and your cache settings.
With Sitecore now operational, as new un-cached pages are requested, Sitecore connects to Azure SQL and retrieves them. If enough un-cached content is requested, we’ll max out the S1’s 20 DTU limit, page response times will go up, and users could potentially encounter a timeout. Of course, if the most common content is cached, this won’t happen. So this makes a good case for Sitecore cache tuning, a CDN and warm-up scripts.
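As a rough illustration of the warm-up-script idea, here’s a minimal sketch; the base URL and path list are hypothetical placeholders you’d swap for your own top pages (e.g. pulled from analytics):

```python
# Minimal cache warm-up sketch: request the most common pages after an
# IIS reset so Sitecore populates its caches before real users arrive.
# BASE_URL and WARM_PATHS below are hypothetical examples.
from urllib.request import urlopen

BASE_URL = "https://www.example.com"                   # hypothetical CD host
WARM_PATHS = ["/", "/products", "/about", "/contact"]  # your top pages

def warm_urls(base, paths):
    """Build the full list of URLs to pre-fetch."""
    return [base.rstrip("/") + p for p in paths]

if __name__ == "__main__":
    for url in warm_urls(BASE_URL, WARM_PATHS):
        try:
            with urlopen(url, timeout=10) as resp:
                print(url, resp.status)
        except OSError as exc:
            print(url, "failed:", exc)
```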
Alternatively, you can just increase your sizing on a per-DB basis to an S2, or all the way to an S3.
I’ve got some practical data to share here so you can see the real-world difference between an S1 and an S3. I created a test setup for Sitecore Symposium in 2016 and I’ve used it on Sitecore IaaS, Sitecore on PaaS, and also tested a hybrid option hosting Sitecore on IaaS but using Azure SQL for the database tier.
Using Project Gutenberg and the filldb.aspx page in Sitecore (https://community.sitecore.net/technical_blogs/b/sitecore_7_development_team/posts/filldb-updates), I ended up creating a 1.2GB master DB and a 700MB web DB. I noticed that with Sitecore 8.2 Update 1 using the default cache settings, CDs would complete pre-caching and be operational after an IIS reset in under three minutes. Switch the web DB to an S1 and Sitecore would time out and never become operational, as the S1 instance maxed out its DTUs. With an S2 the same problem occurred, but after changing the SQL timeout to an hour I got it working: after 45 minutes, Sitecore would be operational. Switch to an S3 and the IIS reset time dropped to 15 minutes.
Now, you could disable pre-caching and get that time down further. But let’s be honest: 500k text-rich content items is a lot of text, and I was also light on assets/rich media.
I also performed some benchmarks using JMeter, comparing an Azure SQL S3 to a DS3 v2 VM running SQL Server Standard. Load-wise, I applied 250 concurrent connections.
| Config | Throughput (Pages Per Second) | Average Page Latency (ms) | Page Latency 99th Percentile (ms) | Page Latency Max (ms) | Errors |
|---|---|---|---|---|---|
| Azure SQL S3 Web – xDB Normal – Shared Session State InProc | 67.2 | 415 | 4186 | 10021 | 0 |
| SQL Server DS3 v2 with P20 SSD – xDB Normal – Shared Session State InProc | 70.2 | 227 | 1608 | 7676 | 0 |
As you can see, overall the Azure SQL configuration managed a similar level of pages per second but with much higher page latencies. Under a slightly lower load we could get the 99th percentile under 2,000 ms, which I consider a minimum page response time for most .com sites. Why was the latency so high? I looked into it at the time, and I could see the DTUs for the web DB were periodically maxed out, which caused the higher latencies.
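For reference, the percentile figures in the table come from aggregating the raw JMeter sample latencies. A minimal sketch of that aggregation (the sample data below is made up for illustration, not the real benchmark):

```python
# Deriving average, 99th percentile and max from raw latency samples.
def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceil(pct/100 * N), 1-indexed
    return ordered[rank - 1]

latencies_ms = [120, 180, 210, 250, 300, 420, 900, 1600, 4100, 9800]  # illustrative
print("avg:", sum(latencies_ms) / len(latencies_ms))
print("p99:", percentile(latencies_ms, 99))
print("max:", max(latencies_ms))
```

Note how a handful of slow outliers barely moves the average but dominates the 99th percentile — which is why the table above reports both.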
I started thinking about this problem, and in many cases just bumping up to an S3 is probably the easiest way to minimize the issue. However, I also noticed that the DTUs for core and master were pretty low during this time. Only analytics had any real DTU utilization, which makes sense, as xDB would have been running at the same time.
If only we could tap into those unused DTUs… Perhaps we can, with the second type of Azure SQL flavor.
Instead of each database having its own fixed DTUs, ElasticPools puts all of the databases in a shared pool of DTUs. There are also mechanisms that limit how much of the pool a database can consume, so you can reserve some DTUs for, say, core and master to operate successfully.
Here’s the pricing structure for Standard ElasticPools
The closest match to the four single-instance S1s with 80 DTUs is the 100 eDTU pool, which costs almost double at $224.75 per month and comes with a smaller 100GB maximum for each DB. However, the trade-off is that each DB can now burst up to the full 100 eDTUs. If we need custom DBs for the application, we can add them into the pool at no extra cost.
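A quick sketch of that comparison, using the prices quoted in this post (they vary by region and over time):

```python
# 4x S1 single instances vs a 100 eDTU Standard ElasticPool.
single_cost, single_per_db_burst = 119.99, 20   # each S1 capped at 20 DTUs
pool_cost, pool_per_db_burst = 224.75, 100      # any DB can burst to 100 eDTUs

premium = pool_cost - single_cost
print(f"Pool premium: ${premium:.2f}/month for "
      f"{pool_per_db_burst // single_per_db_burst}x per-DB burst headroom")
```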
So, it’s a decision between paying extra for flexibility and more performance with ElasticPools vs lower cost but performance limitations with single instances. As you know, Premium is even more expensive, but what about Premium RS with ElasticPools?
Microsoft mentions a lower SLA but doesn’t actually define it. This tech is in preview, so I’m expecting when it’s GA, an SLA will be announced. But, I’d be surprised if it’s lower than 99.9 percent, which for the most part I’d say is acceptable for many customers.
So, what does a 125 eDTU Pool cost exactly?
$55 per month more than the single instances, but we now have even more performance. The main trade-off is that we now have one primary and one secondary vs the four secondary replicas in the other tiers. For the most part, I’d say we have a winner.
I’ve performed tests using a 250 eDTU ElasticPool with Sitecore and found that with 2x DS3 v2 VMs under a heavy load, Sitecore performed similarly to using S3s for web and master, but at a lower cost and with lower latencies. I even started a full smart publish during a JMeter load on the CDs and found I maxed out at 245 eDTUs across all four DBs: web was running around 125 eDTUs at its peak, analytics around 40 eDTUs, master around 80 eDTUs, and core was mostly around 0 with peaks up to 10 eDTUs.
Note: my laptop recently died on me and I lost the benchmarks I had in my desktop folder (I know, I know, I should have copied them to cloud storage), so I’m explaining this last part from memory.
As you can probably guess, I quite like Premium RS, so let’s discuss how you can estimate the number of DTUs you’ll need. Thankfully Justin Henriksen, an engineer at Microsoft, has figured a lot of the hard work out for you. Just follow the process laid out at http://dtucalculator.azurewebsites.net and it will tell you what your average and peak DTUs are from your own Sitecore environments.
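Conceptually, the calculator’s output boils down to average vs peak utilization of your current SQL Server. A toy sketch of that aggregation (the samples are made-up numbers, and the real calculator also factors in disk reads/writes and log throughput, not just CPU):

```python
# Given utilization samples collected from your existing SQL Server
# (e.g. perf counters sampled every second), report the average and the
# peak -- you size the pool for the peak plus headroom, not the mean.
def summarize(samples):
    return {"avg": sum(samples) / len(samples), "peak": max(samples)}

cpu_percent = [12, 15, 10, 80, 95, 14, 11, 9, 70, 13]  # hypothetical 1s samples
stats = summarize(cpu_percent)
print(f"avg {stats['avg']:.1f}%, peak {stats['peak']}% "
      "-> size for the peak plus headroom, not the average")
```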
I’ll wrap up with a few gotchas. If you’re using Azure SQL single instances in the Standard tier, don’t use Azure SQL for session state: you’re limited to 600 concurrent connections. Premium raises this limit to 30k, but I’d personally say stick to Redis. I think that’s the reason the Managed Cloud Topologies I mentioned above use Redis and not Azure SQL for session state.
If you’re thinking of moving to Azure SQL, remember it’s Sitecore 8.2 onwards only: Azure SQL isn’t supported by Sitecore with 8.1 or below, and you’ll have issues if you try.
You can restrict Azure SQL to only allow certain IPs to access it, which works great for Sitecore hosted on VMs. But for Sitecore on Azure PaaS using Azure Web Apps, you won’t know the CD IP addresses if you’ve got autoscale enabled, so there’s a challenge to figure out there. You could lock it down to the Azure public IP address ranges, but that’s a very wide range, and other Azure customers would still be allowed to try to connect to your Azure SQL. The databases are, however, username and password protected, so using a system-generated username and password is highly recommended to mitigate brute-force attacks.
Finally, you might need to refactor some code if your Sitecore solution hits SQL a lot. One approach I’ve noticed is to index the data with a search engine and query that instead. If you schedule the indexing to run during off-peak hours, you’ll avoid impacting end-user performance during peak hours and potentially be able to drop an Azure SQL pricing tier or two.
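A minimal sketch of that pattern; a real Sitecore implementation would use Solr or Azure Search through the ContentSearch API, so the in-memory dict here is just a stand-in for the index:

```python
# "Query the index, not the database": one bulk read off-peak, then serve
# lookups from the index so peak-hour requests cost no DTUs.
def rebuild_index(fetch_all_rows):
    """Scheduled off-peak job: one bulk read from SQL, kept in the index."""
    return {row["id"]: row for row in fetch_all_rows()}

def lookup(index, item_id):
    """Per-request path: no SQL round trip, so no DTU cost at peak."""
    return index.get(item_id)

# Illustrative data source standing in for the SQL query.
def fetch_all_rows():
    return [{"id": 1, "title": "Home"}, {"id": 2, "title": "Products"}]

index = rebuild_index(fetch_all_rows)
print(lookup(index, 2))
```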
I suspect many customers can move to Azure SQL, but it does come with limitations. Above a 1,000 eDTU Premium RS ElasticPool I suspect you’ll be better off with SQL Server, but for those where it’s a good fit, you can save a lot of money vs SQL Server.