AWS re:Invent 2020 Day 10: The Future of Cloud Runs on ARM

Graviton2, AWS’ latest generation of ARM-based processors, was announced a year ago at re:Invent 2019. But the number of times Graviton2 was mentioned in the re:Invent 2020 keynotes almost made me believe it was this year’s major announcement. Let’s look at why Amazon seems to be all in on ARM, and what to expect in the coming years.

The history of ARM and x86

ARM has been around since the 1980s. The company Arm Ltd. (also known as Arm Holdings) designs processor architectures and cores based on a reduced instruction set (RISC). These designs are licensed to third parties, who use them to build their own microcontrollers (MCUs), systems-on-chip (SOCs) and CPUs. Examples are Qualcomm’s Snapdragon, Apple’s mobile CPUs (A4, A5, currently up to A14), Apple’s desktop CPU M1, and Amazon’s Graviton and Graviton2 processors. There are hundreds of other types of processors built on the ARM architecture, but these are some of the most famous ones.

ARM processors were originally designed for use in personal computers. However, their efficiency and the fact that third parties could design their own specialized SOCs and CPUs around the ARM core made ARM especially popular for embedded devices like printers, network appliances and mobile devices. Windows’ rise to dominance in the desktop market, combined with its requirement for the x86 architecture, seemed to seal ARM’s fate as an architecture for small, efficient devices only.

Since the start of the century, smartphones have become ubiquitous. These devices need to be small and efficient, which made ARM processors a logical choice. In fact, almost every smartphone runs on ARM. The first iPhone was based on a Samsung 32-bit RISC ARM processor, and the first Android phone - the HTC Dream - was based on a Qualcomm ARM11 processor. There have been a few phones based on the x86 instruction set, but none have been successful.

At the same time, the x86 processors produced by Intel and AMD started to run into their design limits. For decades, these processors became more powerful by increasing their clock speed - they were simply able to run more operations per second. But somewhere between 4GHz and 5GHz, increasing the clock speed became nearly impossible; higher speeds required higher voltages, which produced more heat that had to be dissipated. At some point these CPUs either needed giant cooling systems or they would simply melt. The solution was to introduce multiple cores in a single CPU. Each of these cores was limited to a certain speed, but by executing multiple processes in parallel, systems could still become faster.

To increase the utilization of these cores, multithreading was introduced. With multithreading, a single core could time-share its capacity, allowing two or more threads to run semi-simultaneously on that core. To make this work, an SMT (simultaneous multithreading) controller had to be added to the CPU. The SMT controller managed and assigned the different threads for the CPU’s cores. However, SMT itself uses power, which made multithreading CPUs less efficient. Adding more components also made CPUs more complex and expensive. And as we will see in a later section, SMT controllers introduced new security risks.

Linux, Raspberry Pi, smartphones and the Apple M1

Although Windows could not run on ARM, Linux has supported desktops on ARM since about the year 2000. Propelled by the introduction of the Raspberry Pi in 2012, many developers and hobbyists have been experimenting with running full-fledged operating systems and desktops on ARM platforms. Additionally, the lucrative smartphone market pushed mobile phone developers to create ever faster and more efficient processors. Apple introduced their A4 SOC in the iPhone 4, and they have been pushing the performance of their ARM processors every year since. In 2018, AnandTech wrote:

What is quite astonishing, is just how close Apple’s A11 and A12 are to current desktop CPUs. I haven’t had the opportunity to run things in a more comparable manner, but taking our server editor, Johan De Gelas’ recent figures from earlier this summer, we see that the A12 outperforms a moderately-clocked Skylake CPU in single-threaded performance.

This has been a turning point in the history of processors. Since 2018, ARM processors have proven to be more powerful, more efficient, more secure and cheaper than their x86 siblings. Additionally, the performance growth path for x86 seems to be plateauing, while there is a lot of room for performance improvement in the ARM architecture. Given these facts, it comes as no surprise that Apple started looking into running their personal computer line on ARM years ago. This year they finally released their MacBook Air, MacBook Pro and Mac mini with the M1 CPU, based on the ARMv8.6-A architecture.

With Apple moving towards ARM, Amazon doing the same, almost all smartphones running ARM now and in the foreseeable future, and the existing market for embedded devices, the future looks bright for ARM. Nvidia definitely seems to agree - in September 2020 they announced a deal to acquire Arm Holdings from SoftBank for a cool $40 billion.

Arguments for ARM-based servers

Driven by ARM’s increasing popularity, many software platforms, operating systems (including Windows nowadays!), applications and libraries support the AArch64 (ARM 64-bit) architecture. These include popular web servers like NGINX and Apache, databases like MySQL and PostgreSQL, and programming languages like Python and Java.

With this broad support, switching from x86 to AArch64 is not quite as difficult as it used to be. There might be some compatibility issues, but most applications should be able to migrate without too much effort. This means there is a giant market of potential customers of ARM-based servers. But why would customers want to migrate?
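A first step in tracking down compatibility issues is knowing which architecture your code is actually running on. The snippet below is a minimal sketch of such a check, using Python’s standard platform module. The alias table and the function names normalize_arch and current_arch are my own illustration, not part of any AWS tooling, and the list of aliases is not exhaustive.

```python
import platform

# Values that platform.machine() commonly reports, mapped to two labels
# chosen for this example. Anything else is reported as "unknown".
_ALIASES = {
    "aarch64": "arm64",   # typical on 64-bit ARM Linux (e.g. Graviton)
    "arm64": "arm64",     # typical on Apple silicon macOS
    "x86_64": "x86_64",   # typical on Intel/AMD Linux and macOS
    "amd64": "x86_64",    # typical on Windows (reported as "AMD64")
}


def normalize_arch(machine: str) -> str:
    """Map a raw machine string to 'arm64', 'x86_64' or 'unknown'."""
    return _ALIASES.get(machine.lower(), "unknown")


def current_arch() -> str:
    """Return the normalized architecture of the running interpreter."""
    return normalize_arch(platform.machine())


if __name__ == "__main__":
    print(f"Running on: {current_arch()}")
```

A check like this can be useful in build or deploy scripts that need to pick the right native dependency or container image for the host.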

The answer is the obvious and only argument for any migration: money. An ARM processor is cheaper than an Intel or AMD processor, while - as discussed above - ARM equals or even tops x86 performance. Depending on your workload, AWS’ Graviton2 processors can offer up to 40% better price/performance than comparable x86 chips. And while 40% is only achievable in ideal situations, even 20% is very significant!
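To make the price/performance figure concrete, here is a small worked example. It assumes price/performance is defined as performance divided by price, and the ratios used are hypothetical inputs chosen for illustration, not measured benchmark results or actual AWS prices.

```python
def price_performance_gain(perf_ratio: float, price_ratio: float) -> float:
    """Relative price/performance improvement of a new instance over an old one.

    perf_ratio:  new performance / old performance (1.0 = equal performance)
    price_ratio: new price / old price             (0.8 = 20% cheaper)
    """
    return perf_ratio / price_ratio - 1.0


# Hypothetical case 1: same performance at a 20% lower price.
print(f"{price_performance_gain(1.00, 0.80):.0%}")  # prints 25%

# Hypothetical case 2: 12% more performance at a 20% lower price.
print(f"{price_performance_gain(1.12, 0.80):.0%}")  # prints 40%
```

Note that a lower price alone already yields a sizable gain: in the first case nothing gets faster, yet every dollar buys 25% more work.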

Another driver for ARM-based servers is security. As stated earlier, the SMT controller adds complexity to processors. The controller also needs access to all the cores and threads on the CPU to manage them effectively. Through a recent vulnerability, attackers were able to abuse SMT to access sensitive data belonging to other workloads on the same CPU. This is especially important in public cloud environments, where customers almost always share physical hardware with other customers. Graviton2 does not use SMT, so a core is never shared between threads: every vCPU is a physical core with its own L1 and L2 cache.

[Figure: Separate cores]

Driving sustainability with Graviton2

In the infrastructure keynote it became clear that Graviton2 is an important component in achieving AWS’ sustainability goals. ARM processors require less power to achieve the same performance as x86 processors. This means lower direct power consumption, but it also has indirect benefits: less power consumed means less excess heat, which means less cooling, which in turn means even less power consumption. Of course, these savings also translate into the lower prices customers pay for Graviton2.

Low-friction cost savings with managed services like RDS

Migrating from x86 to AArch64 is relatively easy. However, ‘relatively easy’ can still be ‘absolutely difficult’ in large and complex environments. If you want to benefit from Graviton’s improved price/performance over x86 as soon as possible, there might be some low-hanging fruit. Amazon is supporting Graviton in more and more of their managed services. They started with RDS, followed by ElastiCache, and they just announced support in Aurora. These databases are all reachable over a well-established protocol, and the inputs and outputs should be no different between one CPU architecture and another. This allows you to achieve instant cost savings without any architectural change.
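As an illustration of how small such a change can be, the sketch below maps an x86 RDS instance class to its Graviton2 counterpart and changes the class through boto3’s modify_db_instance call. The family mapping and the function names graviton_class and switch_to_graviton are my own assumptions for this example. Not every size or database engine version is available on Graviton2, so check the RDS documentation for your engine before trying this, and test on a non-production instance first.

```python
from typing import Optional

# Assumed mapping from x86 RDS families to Graviton2 families.
# Illustrative only: not every size exists in both families.
GRAVITON2_FAMILY = {
    "db.m5": "db.m6g",
    "db.r5": "db.r6g",
}


def graviton_class(instance_class: str) -> Optional[str]:
    """Return the Graviton2 class for e.g. 'db.m5.large', or None if unmapped."""
    family, _, size = instance_class.rpartition(".")
    target = GRAVITON2_FAMILY.get(family)
    return f"{target}.{size}" if target else None


def switch_to_graviton(db_identifier: str, current_class: str,
                       apply_immediately: bool = False) -> dict:
    """Request an instance class change for one RDS instance.

    Requires boto3 and AWS credentials. With apply_immediately=False the
    change is deferred to the next maintenance window.
    """
    import boto3  # imported here so the mapping above works without boto3

    new_class = graviton_class(current_class)
    if new_class is None:
        raise ValueError(f"No Graviton2 mapping for {current_class}")
    rds = boto3.client("rds")
    return rds.modify_db_instance(
        DBInstanceIdentifier=db_identifier,
        DBInstanceClass=new_class,
        ApplyImmediately=apply_immediately,
    )
```

For example, graviton_class("db.r5.xlarge") returns "db.r6g.xlarge", while a class without an entry in the mapping returns None. Changing the instance class typically involves a brief outage, which is why the sketch defaults to the maintenance window.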

Other services currently supporting Graviton2 processors are EMR, EKS, ECS, CodeBuild and Outposts. Switching architectures with these services will require more effort than the databases, but it’s good to be aware of their support.

Conclusion

I believe Amazon will bring ARM support to every single one of the services where customers select instance types, including SageMaker, Neptune, Redshift, DocumentDB, DMS, MSK, and so on. Additionally, I think they will internally use Graviton to power many services where the underlying hardware is invisible, such as S3, DynamoDB, Lambda, Cognito, Alexa, Route 53 and many others. It’s simply more cost efficient.

For AWS customers, ARM support has unlocked a large number of cost-saving opportunities. Frankly, I can’t think of any reason not to try RDS, ElastiCache or Aurora on Graviton instances. Likewise, you can’t go wrong exploring the cost savings Graviton2 offers on your compute (EC2, ECS, EKS) workloads. The investment for the migration process might be significant, but I think you’ll be surprised by the existing compatibility and the long-term savings you might gain. As they say in Dutch: you’re a thief of your own wallet if you don’t.

This article is part of a series published around re:Invent 2020.

Luc van Donkersgoed