Why Choose Amazon Aurora over Regular RDS
Amazon RDS is the managed relational database service of AWS. You leave the setup and maintenance of your database to AWS and focus on using it. You can launch and maintain community edition MySQL and PostgreSQL databases as well as commercial Oracle and SQL Server databases on Amazon RDS. However, a few years ago, AWS developed its own cloud-native, enterprise-level database engine called Amazon Aurora.
Aurora provides MySQL and PostgreSQL compatibility. In this post, I will discuss some of its unique features and why you should choose it instead of community edition MySQL and PostgreSQL databases.
Recently, Aurora also launched its serverless and multi-master versions, and either of these features alone can be the reason to choose it. However, in this post, we will focus on single-master Aurora deployments and their advantages over regular RDS.
What is Amazon Aurora?
First of all, Amazon Aurora is neither MySQL nor PostgreSQL. It is a different, cloud-native database engine developed by AWS, with versions compatible with these two databases. You also launch and manage your Aurora database clusters with the help of Amazon RDS. This is why you see it on the Amazon RDS Console and why it provides all the advantages of RDS. Hence, we can consider it an enhanced version of Amazon RDS.
Let’s say you need to create a MySQL or a PostgreSQL database on AWS. I see that many clients are already using Amazon RDS and do not even consider Aurora. However, especially if you are developing a serious, enterprise-level application, Aurora provides additional features in terms of performance, reliability and durability. Therefore, although it is slightly more expensive, I think it is the better choice in the long term and worth the cost. So let’s discuss what makes Aurora different.
Aurora’s storage is fault tolerant by design
When you use regular Amazon RDS, the architecture is similar to installing the database on Amazon EC2 manually, except that you leave the provisioning and maintenance to AWS. Of course, RDS provides many features like automatic failovers, backups, etc. But essentially, you have an instance and an EBS volume attached to it.
To achieve reliability on this architecture, you need to enable the Multi-AZ feature on your RDS instance, which replicates it synchronously to a standby replica in another Availability Zone. So you gain 2 copies of your database in 2 Availability Zones, which is great. You can see the diagram of the regular RDS below.
However, Aurora provides more reliability in terms of storage. Its database storage is separated from the instances, and your data is stored in 10 GB chunks, each with 6 copies distributed across 3 Availability Zones (2 copies per zone). Hence, even if you have only one Aurora instance, your data will have 6 copies.
Besides, Aurora regularly scans each copy of the data on the storage nodes and, if it finds a failure in one, repairs it using one of the remaining copies. Your database storage is reliable and fault-tolerant by design.
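To make the numbers concrete, here is a rough sketch of the arithmetic behind this layout. It is only an illustration: the Availability Zone labels and the storage_layout function are made up for this example, and the real placement logic inside Aurora is not something you see or control.

```python
import math

SEGMENT_GB = 10                   # chunk size mentioned above
COPIES_PER_AZ = 2                 # 6 copies spread evenly over 3 zones
AZS = ["az-a", "az-b", "az-c"]    # hypothetical zone labels


def storage_layout(volume_gb):
    """Return (chunk_count, total_chunk_copies, copies_per_zone) for a volume."""
    chunks = math.ceil(volume_gb / SEGMENT_GB)
    copies_per_zone = {az: COPIES_PER_AZ for az in AZS}
    copies_per_chunk = sum(copies_per_zone.values())
    return chunks, chunks * copies_per_chunk, copies_per_zone


chunks, total_copies, per_zone = storage_layout(95)
print(chunks, total_copies, per_zone)
```

For a 95 GB database this gives 10 chunks and 60 chunk copies in total, regardless of how many database instances you run.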
Aurora’s performance is higher and more consistent!
When you use community edition MySQL or PostgreSQL, performance degrades as the load increases, partly because of the synchronous replication to the standby replica. However, thanks to its unique storage design, Aurora delivers up to 3x the throughput of PostgreSQL and up to 5x the throughput of MySQL, according to AWS.
Aurora writes logs directly to the storage; it does not keep log buffers on the instances. The replication to the replicas is asynchronous and only for cached data. As the replicas also share the same storage cluster, the replica lag is very small and consistent over time. The main reason for using a read replica is high availability and sometimes read scalability, because, as I said before, the storage cluster is separated from your DB instances and you don’t need a read replica to replicate your data. Your data is durable by design.
But there are 6 copies of data, right? Then what about the write performance? Aurora manages the acknowledgements using a quorum. As there are 6 copies on the storage nodes, acknowledgements from only 4 of them are enough to treat a write as successful. The remaining 2 get the data from the other copies. So Aurora does not have to wait for the slowest copies, which improves write performance greatly.
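The quorum rule can be sketched in a few lines. This is a toy model under my own assumptions, not Aurora's actual implementation: write_succeeds is a made-up name, and each storage copy is reduced to a single True/False acknowledgement.

```python
TOTAL_COPIES = 6
WRITE_QUORUM = 4   # 4 of 6 acknowledgements are enough


def write_succeeds(acks):
    """acks holds one boolean per storage copy: True if that copy acknowledged."""
    acks = list(acks)
    if len(acks) != TOTAL_COPIES:
        raise ValueError("expected one acknowledgement flag per copy")
    return sum(acks) >= WRITE_QUORUM


# All copies healthy.
print(write_succeeds([True] * 6))
# A whole Availability Zone (2 copies) is unreachable: still a success.
print(write_succeeds([True] * 4 + [False] * 2))
# A zone plus one more copy is unreachable: the write cannot be acknowledged.
print(write_succeeds([True] * 3 + [False] * 3))
```

The second case is the interesting one: losing an entire Availability Zone takes away only 2 of the 6 copies, so writes keep succeeding.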
More read replicas on Aurora and reader endpoints
On Aurora, you can scale your read queries by creating up to 15 read replicas, compared to 5 on regular RDS.
As you know, on RDS there is a cluster endpoint which you use for your write queries. This is the DNS endpoint pointing to the current master db instance. During a failover, RDS points this endpoint to the new master by a simple DNS change. However, for read replicas, you have to balance the load in your application using the instance endpoints. Regular RDS does not provide a load balancer for read replicas.
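On regular RDS, this balancing usually ends up as a small piece of code in the application. A minimal round-robin sketch could look like the following; the class name and the hostnames are placeholders I made up, and a production version would also need health checks.

```python
import itertools


class RoundRobinReplicas:
    """Hand out read replica instance endpoints in turn."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one replica endpoint is required")
        self._cycle = itertools.cycle(endpoints)

    def next_endpoint(self):
        return next(self._cycle)


# Hypothetical instance endpoints of two read replicas.
replicas = RoundRobinReplicas([
    "replica-1.example.internal",
    "replica-2.example.internal",
])
for _ in range(3):
    print(replicas.next_endpoint())
```

Every time you add or remove a replica, you have to update this list yourself, which is exactly the work a managed load-balanced endpoint takes off your hands.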
On Aurora, you still use the cluster endpoint for your write queries. But Aurora also provides a reader endpoint that acts as a load balancer for your read replicas. So you can simply use this endpoint for your read queries and create up to 15 read replicas behind it. In case of a failover, one of the read replicas becomes the master and is removed from this reader set.
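With two endpoints, the application only has to decide whether a query reads or writes. Here is a simplified sketch of that decision. The hostnames are placeholders, endpoint_for is a hypothetical helper, and looking at the first keyword is a deliberately naive rule: statements such as SELECT ... FOR UPDATE, or reads that must see your latest write, should still go to the cluster endpoint.

```python
# Placeholder hostnames; use the endpoints shown for your own cluster.
WRITER_ENDPOINT = "mycluster.cluster-abc123.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-abc123.us-east-1.rds.amazonaws.com"

# Naive assumption: statements starting with these keywords only read data.
READ_ONLY_KEYWORDS = {"select", "show"}


def endpoint_for(sql):
    """Pick the reader endpoint for plain reads, the cluster endpoint otherwise."""
    words = sql.split()
    first_word = words[0].lower() if words else ""
    if first_word in READ_ONLY_KEYWORDS:
        return READER_ENDPOINT
    return WRITER_ENDPOINT


print(endpoint_for("SELECT * FROM orders"))
print(endpoint_for("UPDATE orders SET status = 'paid' WHERE id = 1"))
```

Anything the helper does not recognise goes to the cluster endpoint, which is the safe default because the master can serve both reads and writes.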
Failover time is faster on Aurora!
On regular Amazon RDS, because of the native storage structure of both community edition DB engines, failover time depends on the load and deteriorates over time. However, on Aurora, because the storage is separated and logs are kept in the storage layer, only DNS propagation matters during a failover. Aurora does not keep a log buffer on the instances. DNS propagation takes around 30 seconds. So the failover is also faster, providing higher availability to your applications.
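On the application side, a failover of this length is usually handled by retrying the connection for a short while. The sketch below is one way to do it, assuming your database driver's connection failures can be mapped to ConnectionError; run_with_retry and its parameters are my own illustration, not part of any AWS SDK.

```python
import time


def run_with_retry(operation, attempts=6, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on ConnectionError wait and retry, doubling the delay.

    With the defaults, the waits are 1, 2, 4, 8 and 16 seconds (31 in total),
    which roughly covers a failover of around 30 seconds.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # give up after the last attempt
            sleep(delay)
            delay *= 2


# Demo with a fake operation that fails twice before succeeding.
# The waits are recorded instead of slept so the example runs instantly.
state = {"calls": 0}


def flaky_query():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("endpoint not reachable yet")
    return "ok"


recorded_waits = []
print(run_with_retry(flaky_query, sleep=recorded_waits.append), recorded_waits)
```

Make sure each retry opens a new connection, so that the endpoint name is resolved again and the client picks up the new master after the DNS change.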
What about security and backups?
Both regular RDS and Aurora share the same security and backup features of Amazon RDS. Hence, you still achieve security using the same tools like security groups, IAM authentication, encryption at rest and in transit, etc. Nothing is different in terms of security.
Both Aurora and regular RDS provide point-in-time recovery and a backup retention period of 1 to 35 days. So the backup features are also the same. This is because Aurora is just a different engine managed by Amazon RDS. Hence, what we actually compare here is Aurora against the community edition MySQL and PostgreSQL engines.
Aurora’s unique storage design provides many advantages when compared to community edition RDS databases. Although there is a small increase in hourly bills when compared to RDS, I highly recommend using it for your enterprise level applications.
For those looking for more technical details about the storage design and how logging is handled, I recommend watching the re:Invent 2018 deep dives on Aurora. I believe you will find lots of useful information there.
Thanks for reading!