AWS re:Invent 2020 Day 9: Roundup Week 2

Luc van Donkersgoed

The second week of re:Invent 2020 is done! Another week with more than 50 releases, mainly centered around the topics of machine learning, data analytics and a bit of networking. In this post we’ll review all the announcements and releases, from AWS Audit Manager to VPC Reachability Analyzer.

Tuesday December 8th - Machine Learning

On Tuesday Swami Sivasubramanian introduced all that’s new in the Machine Learning keynote. Before, during and after the event many new Machine Learning capabilities were announced and released. And that might be an issue. Let me introduce you to the list of releases:

Name | AWS News | AWS Blog
Introducing distributed training on Amazon SageMaker | link | -
Detect bias in ML models and explain model behavior with Amazon SageMaker Clarify | link | link
Announcing new capabilities for Amazon SageMaker Debugger with real-time monitoring of system resources and profiling training jobs | link | link
Introducing Amazon SageMaker JumpStart – Easily and quickly bring machine learning applications to market | link | link
Amazon SageMaker Model Monitor now supports new capabilities to maintain model quality in production | link | -
AWS introduces Amazon SageMaker Edge Manager - Model Management for Edge Devices | link | link
New – Managed Data Parallelism in Amazon SageMaker Simplifies Training on Large Datasets | - | link
Amazon SageMaker Simplifies Training Deep Learning Models With Billions of Parameters | - | link
Amazon Lookout for Metrics | |
Amazon Lookout for Equipment | link | link
Amazon Lookout for Vision | link | link
Amazon AppFlow now provides Amazon Lookout for Metrics connectivity to several cloud applications | link | -
Amazon announces Amazon Neptune ML: easy, fast, and accurate predictions for graphs | link | -
AWS announces Amazon Redshift ML (preview) | link | -
Amazon Kendra adds Google Drive connector | link | -
Amazon Kendra launches incremental learning | link | -
Amazon Kendra adds support for custom synonyms | link | -
Amazon Kendra launches connector library | link | -
Amazon CodeGuru Profiler adds Memory Profiling and Heap Summary | link | link
Announcing Amazon Forecast Weather Index – automatically include local weather to increase your forecasting model accuracy | link | -

So… SageMaker distributed training, SageMaker Clarify, SageMaker Debugger, SageMaker JumpStart, SageMaker Model Monitor, SageMaker Edge Manager, SageMaker Managed Data Parallelism, and SageMaker Model Parallelism. Are you keeping up?

The strength of SageMaker has always been its low-friction introduction to Machine Learning. It allowed engineers without a degree in mathematics to get started with training and hosting models, without immediately boggling them with all the complexities that make ML work. Now that core is still there - but it’s buried under a shroud of additional services with unintuitive names. If you want to train your first model, where do you even start?

Nevertheless SageMaker is still a powerful tool, and with all the new services and features it has become even more so. It now offers a broad feature set for machine learning novices, intermediates and experts.

Amazon Lookout

Then Amazon introduced a whole new namespace of services: Amazon Lookout for Equipment, Amazon Lookout for Vision and Amazon Lookout for Metrics. You can imagine that failure prevention, image recognition and anomaly detection are among the most common use cases for machine learning. With the Lookout series, Amazon is bringing common applications to the masses, with little to no machine learning skills required.

Amazon Kendra

Another service that saw a few new features is Amazon Kendra. This enterprise search service scans document and data stores like Jira, Dropbox, Google Drive and more. Then it allows users to perform natural language queries, which will return answers from all connected data stores. This enables large enterprises to provide their employees with all the information they need, without expecting them to know all the places they could possibly find it. This is another example of a common machine learning application. Amazon obviously aims to do the heavy lifting, and solve a problem that many of their customers might have.

With new Kendra features like the Google Drive connector and the connector library, more data sources can be connected, making Kendra even more powerful than before.

Wednesday December 9th - Data Analytics

On Wednesday we saw Rahul Pathak, vice president of analytics at AWS, deliver his leadership session. Around his talk a number of data analytics features were released. Again, let’s take a look at the list:

Name | AWS News | AWS Blog
Amazon Redshift now includes Amazon RDS for MySQL and Amazon Aurora MySQL databases as new data sources for federated querying (Preview) | link | -
Amazon Redshift announces support for native JSON and semi-structured data processing (preview) | link | -
Amazon Redshift introduces data sharing (preview) | link | -
Amazon Redshift announces Automatic Table Optimization | link | -
Amazon Redshift launches the ability to easily move clusters between AWS Availability Zones (AZs) | link | -
Amazon Redshift announces native console integration with partners (Preview) | link | -
Amazon Redshift launches ra3.xlplus nodes with managed storage | link | -
Introducing Amazon HealthLake to make sense of health data | link | link
Amazon EMR Studio makes it easier for data scientists to build and deploy code | link | -
Simplify running Apache Spark jobs with Amazon EMR on Amazon EKS | link | link
Announcing preview of AWS Lake Formation features: Transactions, Row-level Security, and Acceleration | link | -
Amazon QuickSight now supports Amazon Elasticsearch Service, and adds new box plot and filled map visuals | link | -

Amazon Redshift is the beating heart of the data analytics category at AWS. Redshift is one of the most used specialty services in AWS, and one of the most used data warehouses in the world. For a long time, Redshift had a quite rigid pricing and scaling structure: you could only select four instance classes, and each of them had a fixed amount of CPU, memory and storage. At re:Invent 2019, AWS introduced the new Redshift RA3 class, which separated compute and memory from storage, allowing you to scale them independently. However, the ra3.4xlarge and ra3.16xlarge classes were pretty expensive, at $3.26 per hour and $13.04 per hour, respectively.

The new ra3.xlplus class offers roughly one third of the resources of an ra3.4xlarge instance: 4 vCPUs (vs. 12), 32 GiB RAM (vs. 96), 0.65 GB/s IO (vs. 2.00 GB/s), at a price of $1.086 per hour (vs. $3.26). This brings the modern ra3 class within budget for more companies.
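A quick way to sanity-check the comparison is to compute the ratios from the numbers quoted above. This is only a sketch of the arithmetic; the `SPECS` table and the `ratios` helper are names made up for this example, and the figures are the ones from this post, not values fetched from AWS.

```python
# Figures quoted in the paragraph above, as (ra3.xlplus, ra3.4xlarge).
SPECS = {
    "vCPUs": (4, 12),
    "RAM (GiB)": (32, 96),
    "IO (GB/s)": (0.65, 2.00),
    "price ($/hour)": (1.086, 3.26),
}


def ratios(specs):
    """Return each ra3.xlplus figure as a fraction of the ra3.4xlarge figure."""
    return {name: small / large for name, (small, large) in specs.items()}


if __name__ == "__main__":
    for name, ratio in ratios(SPECS).items():
        print(f"{name}: {ratio:.3f}")
```

vCPUs, memory and price all land on one third (within rounding of the price), while IO throughput comes out slightly lower at 0.325.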

A cool non-Redshift release is the ability to run Apache Spark jobs with Amazon EMR on Amazon EKS. Previously, running EMR workloads required running an EMR EC2 cluster. Although EMR on EC2 supports spot and reserved instances, these clusters are often expensive and non-ideally utilized. With many companies running EKS clusters nowadays, EMR on EKS allows them to share resources, optimize scaling, and ultimately reduce costs.
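To give an idea of what submitting a Spark job to EMR on EKS looks like, here is a minimal sketch that builds the request for the `StartJobRun` API of the `emr-containers` service. The helper name, the job name, the default release label and the Spark parameters are assumptions chosen for illustration; the virtual cluster ID, role ARN and script location are placeholders you would supply yourself.

```python
def build_spark_job_request(virtual_cluster_id, role_arn, entry_point,
                            release_label="emr-6.2.0-latest"):
    """Build keyword arguments for an emr-containers StartJobRun call.

    The release label and Spark parameters are illustrative defaults,
    not recommendations.
    """
    return {
        "virtualClusterId": virtual_cluster_id,
        "name": "example-spark-job",  # hypothetical job name
        "executionRoleArn": role_arn,
        "releaseLabel": release_label,
        "jobDriver": {
            "sparkSubmitJobDriver": {
                "entryPoint": entry_point,
                "sparkSubmitParameters": "--conf spark.executor.instances=2",
            }
        },
    }
```

With boto3 installed and credentials configured, the request could be submitted with `boto3.client("emr-containers").start_job_run(**request)`. The job then runs as pods on the existing EKS cluster instead of on a dedicated EMR EC2 cluster.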

[Figure: EMR on EKS]

Thursday December 10th - Infrastructure

On Thursday Peter DeSantis held his annual Infrastructure Keynote. This has always been my favorite re:Invent event, because DeSantis generally lifts a bit of the covers on what makes AWS infrastructure tick. If you listen carefully you will often hear details you can’t get anywhere else. This year was no different, with insights on power supplies, blast radii, running Mac in AWS, and their custom silicon.

[Figure: Complexity and Blast Radius]

Name | AWS News | AWS Blog
AWS Global Accelerator launches custom routing | link | link
VPC Reachability Analyzer | link | link
AWS customers can now use industry standard Internet Group Management Protocol (IGMP) to easily deploy, manage and scale their multicast applications in AWS cloud | link | -
Introducing AWS Transit Gateway Connect to simplify SD-WAN branch connectivity | link | link
Amazon EC2 announces Spot Blueprints, an infrastructure code template generator to get started with EC2 Spot Instances | link | -
Amazon EC2 announces new network performance metrics for EC2 instances | link | link

We also saw some new releases, including Custom Routing for Global Accelerator, about which I wrote a separate blog post. VPC Reachability Analyzer is also notable: it allows engineers and developers to troubleshoot network connectivity between two points in one or more VPCs. This eases a pain point familiar to any engineer who has worked with EC2 for even a short amount of time: you’re trying to connect from an EC2 instance to another instance, but it just doesn’t work. You start by checking the security groups, then the NACLs, the route tables, the VPC peering, and so on, to find out where the issue lies. This is time consuming and frustrating. Now you can use VPC Reachability Analyzer to go through this entire process for you, in just a few clicks.
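The same analysis can also be scripted. The sketch below assumes a boto3 EC2 client is passed in (for example `boto3.client("ec2")`) and uses the Network Insights calls `create_network_insights_path`, `start_network_insights_analysis` and `describe_network_insights_analyses`. The function name `check_reachability`, the fixed TCP protocol and the simple polling loop are choices made for this example, not an official recipe.

```python
import time


def check_reachability(ec2, source_id, destination_id, port, poll_seconds=5):
    """Create a path, run one analysis, and return (reachable, explanations)."""
    # Describe the path to test: source, destination, protocol and port.
    path = ec2.create_network_insights_path(
        Source=source_id,
        Destination=destination_id,
        Protocol="tcp",
        DestinationPort=port,
    )
    path_id = path["NetworkInsightsPath"]["NetworkInsightsPathId"]

    # Start the analysis for that path.
    analysis = ec2.start_network_insights_analysis(NetworkInsightsPathId=path_id)
    analysis_id = analysis["NetworkInsightsAnalysis"]["NetworkInsightsAnalysisId"]

    # Poll until the analysis is no longer running.
    while True:
        result = ec2.describe_network_insights_analyses(
            NetworkInsightsAnalysisIds=[analysis_id]
        )["NetworkInsightsAnalyses"][0]
        if result["Status"] != "running":
            break
        time.sleep(poll_seconds)

    return result.get("NetworkPathFound", False), result.get("Explanations", [])
```

When no path is found, the explanations in the result point at the component that blocks the traffic, which replaces the manual walk through security groups, NACLs and route tables.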

AWS Transit Gateway Connect is also interesting, both technically and for the idea behind it. Transit Gateway Connect makes it easier to connect private data centers to AWS. Additionally, when using Transit Gateway Connect you get total bandwidth of up to 20 Gbps per Connect attachment. This release is strongly aligned with AWS’ move towards edge locations and hybrid scenarios, as detailed in my blog post “AWS is coming to a data center (or pizza parlor) near you!”

The last announcement that jumped out is the set of new network performance metrics for EC2 instances. Any EC2 instance has a number of networking limits it cannot exceed. These limits are often more restrictive on cheaper instances. Customers previously had no way of finding out whether an instance breached these limits, except for contacting AWS support. With this release, new metrics are available for inbound and outbound bandwidth, packets-per-second (PPS), connections tracked and PPS to link-local services. This provides more control and insight, and helps engineers prevent issues instead of figuring out what caused them after the fact.
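As far as I know, these metrics surface on the instance as counters of the ENA network driver, readable with `ethtool -S <interface>`. The sketch below parses that kind of output and reports which limits were hit. The counter names are my understanding of the ENA driver's naming, the sample output is invented for this example, and `exceeded_allowances` is a hypothetical helper.

```python
# Counter names as I understand the ENA driver to expose them (assumption).
ALLOWANCE_COUNTERS = (
    "bw_in_allowance_exceeded",
    "bw_out_allowance_exceeded",
    "pps_allowance_exceeded",
    "conntrack_allowance_exceeded",
    "linklocal_allowance_exceeded",
)


def exceeded_allowances(ethtool_output):
    """Return {counter: value} for every allowance counter above zero."""
    found = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name in ALLOWANCE_COUNTERS:
            count = int(value.strip())
            if count > 0:
                found[name] = count
    return found


# Invented sample in the style of `ethtool -S eth0` output.
SAMPLE = """NIC statistics:
     tx_timeout: 0
     bw_in_allowance_exceeded: 0
     bw_out_allowance_exceeded: 12
     pps_allowance_exceeded: 0
     conntrack_allowance_exceeded: 3
     linklocal_allowance_exceeded: 0
"""

if __name__ == "__main__":
    print(exceeded_allowances(SAMPLE))
```

A non-zero counter means packets were queued or dropped because the instance exceeded that particular limit, which is a signal to spread the load or pick a larger instance type.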

Other releases

Name | AWS News | AWS Blog
Amazon ECR announces cross region replication of images | link | link
AWS announces AWS Audit Manager | link | link
Amazon Braket tensor network simulator supports 50-qubit quantum circuits | link | -
Amazon Braket now supports PennyLane | link | link
AWS Security Hub integrates with AWS Audit Manager for simplified security posture management | link | -
Amplify CLI enables serverless container deployments using AWS Fargate | link | link
Introducing Amazon Aurora R6g instance types, powered by AWS Graviton2 processors, in preview | link | -
Amazon API Gateway now supports integration with Step Functions StartSyncExecution for HTTP APIs | link | -
Amazon EBS reduces the minimum volume size of Throughput Optimized HDD and Cold HDD Volumes by 75% | link | -
Simplify EC2 provisioning and viewing cloud resources in the ServiceNow CMDB with AWS Service Management Connector for ServiceNow | link | -
AWS IDE Toolkit now available for AWS Cloud9 | link | -
Amazon Aurora PostgreSQL Integrates with AWS Lambda | link | -
AWS Security Hub now supports bidirectional integration with ServiceNow ITSM | link | -

Aside from the topical releases we also saw a number of announcements for other services. There are two I’d like to highlight: ECR cross region replication and AWS Audit Manager. ECR cross region replication is a very welcome addition for everyone using containers in multiple regions. Previously, systems either had to fetch their images from another region, which is both slow and expensive, or developers had to upload their images to multiple regions, which is cumbersome. With this release, images can be pushed to ECR in one region, after which they are automatically forwarded to another region.
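Replication is configured once per registry. As a minimal sketch, the helper below builds the configuration document for the ECR `PutReplicationConfiguration` API; the function name and the example regions and account ID are assumptions made for illustration.

```python
def build_replication_config(destination_regions, registry_id):
    """Build an ECR replication configuration with one rule for all regions."""
    return {
        "rules": [
            {
                "destinations": [
                    {"region": region, "registryId": registry_id}
                    for region in destination_regions
                ]
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical account ID and target regions.
    print(build_replication_config(["eu-west-1", "us-east-1"], "111122223333"))
```

With boto3 this could be applied in the source region through `boto3.client("ecr").put_replication_configuration(replicationConfiguration=config)`, after which newly pushed images are copied to the listed regions.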

Audit Manager is a significant release for all customers who run sensitive workloads on AWS. In an earlier time, security officers prohibited use of public cloud because of perceived security issues in a multi-tenant environment. Nowadays, security officers push for a move to the cloud because of the controls and insights available to them. However, collecting information about your environments and actually proving they are well-architected, secure and compliant is still a difficult task. AWS Audit Manager automates the collection of evidence and makes it available in an easy-to-use portal. Additionally, it provides insights into missing information and any compliance issues it has found.

Conclusion

Week 2 at re:Invent 2020 was more focused around specific topics. The releases were also more iterative than big bangs, but nevertheless we have seen some great additions that clearly show the direction AWS is moving in: machine learning everywhere, hybrid cloud and migrations as a gateway to AWS, and a strong foundation on EC2, optionally based on their custom Graviton2 silicon. There is one more week to go. Let’s see what other surprises AWS has to offer!

This article is part of a series published around re:Invent 2020. If you would like to read more about re:Invent 2020, check out my other posts:

I share posts like these and smaller news articles on Twitter. Follow me there for regular updates! If you have questions or remarks, or would just like to get in touch, you can also find me on LinkedIn.
