The second week of re:Invent 2020 is done! Another week with more than 50 releases, mainly centered around the topics of machine learning, data analytics and a bit of networking. In this post we’ll review all the announcements and releases, from AWS Audit Manager to VPC Reachability Analyzer.
On Tuesday Swami Sivasubramanian introduced all that’s new in the Machine Learning keynote. Before, during and after the event many new Machine Learning capabilities were announced and released. And that might be an issue. Let me introduce you to the list of releases:
|Name||AWS News||AWS Blog|
|Introducing distributed training on Amazon SageMaker||link||-|
|Detect bias in ML models and explain model behavior with Amazon SageMaker Clarify||link||link|
|Announcing new capabilities for Amazon SageMaker Debugger with real-time monitoring of system resources and profiling training jobs||link||link|
|Introducing Amazon SageMaker JumpStart – Easily and quickly bring machine learning applications to market||link||link|
|Amazon SageMaker Model Monitor now supports new capabilities to maintain model quality in production||link||-|
|AWS introduces Amazon SageMaker Edge Manager - Model Management for Edge Devices||link||link|
|New – Managed Data Parallelism in Amazon SageMaker Simplifies Training on Large Datasets||-||link|
|Amazon SageMaker Simplifies Training Deep Learning Models With Billions of Parameters||-||link|
|Amazon Lookout for Metrics|
|Amazon Lookout for Equipment||link||link|
|Amazon Lookout for Vision||link||link|
|Amazon AppFlow now provides Amazon Lookout for Metrics connectivity to several cloud applications||link||-|
|Amazon announces Amazon Neptune ML: easy, fast, and accurate predictions for graphs||link||-|
|AWS announces Amazon Redshift ML (preview)||link||-|
|Amazon Kendra adds Google Drive connector||link||-|
|Amazon Kendra launches incremental learning||link||-|
|Amazon Kendra adds support for custom synonyms||link||-|
|Amazon Kendra launches connector library||link||-|
|Amazon CodeGuru Profiler adds Memory Profiling and Heap Summary||link||link|
|Announcing Amazon Forecast Weather Index – automatically include local weather to increase your forecasting model accuracy||link||-|
So… SageMaker distributed training, SageMaker Clarify, SageMaker Debugger, SageMaker JumpStart, SageMaker Model Monitor, SageMaker Edge Manager, SageMaker Managed Data Parallelism, and SageMaker Model Parallelism. Are you keeping up?
The strength of SageMaker has always been its low-friction introduction to Machine Learning. It allowed engineers without a degree in mathematics to get started with training and hosting models, without immediately overwhelming them with all the complexities that make ML work. That core is still there, but it’s buried under a shroud of additional services with unintuitive names. If you want to train your first model, where do you even start?
Nevertheless SageMaker is still a powerful tool, and with all the new services and features it has become even more so. It now offers a broad feature set for machine learning novices, intermediates and experts.
Then Amazon introduced a whole new namespace of services: Amazon Lookout for Equipment, Amazon Lookout for Vision and Amazon Lookout for Metrics. You can imagine that failure prevention, image recognition and anomaly detection are among the most common use cases for machine learning. With the Lookout series, Amazon is bringing common applications to the masses, with little to no machine learning skills required.
Another service that saw a few new features is Amazon Kendra. This enterprise search service scans document and data stores like Jira, Dropbox, Google Drive and more. Then it allows users to perform natural language queries, which will return answers from all connected data stores. This enables large enterprises to provide their employees with all the information they need, without expecting them to know all the places they could possibly find it. This is another example of a common machine learning application. Amazon obviously aims to do the heavy lifting, and solve a problem that many of their customers might have.
With new Kendra features like the Google Drive connector and the connector library, more data sources can be connected, making Kendra even more powerful than before.
On Wednesday we saw Rahul Pathak, vice president of analytics at AWS, deliver his leadership session. Around his talk a number of data analytics features were released. Again, let’s take a look at the list:
|Name||AWS News||AWS Blog|
|Amazon Redshift now includes Amazon RDS for MySQL and Amazon Aurora MySQL databases as new data sources for federated querying (Preview)||link||-|
|Amazon Redshift announces support for native JSON and semi-structured data processing (preview)||link||-|
|Amazon Redshift introduces data sharing (preview)||link||-|
|Amazon Redshift announces Automatic Table Optimization||link||-|
|Amazon Redshift launches the ability to easily move clusters between AWS Availability Zones (AZs)||link||-|
|Amazon Redshift announces native console integration with partners (Preview)||link||-|
|Amazon Redshift launches ra3.xlplus nodes with managed storage||link||-|
|Introducing Amazon HealthLake to make sense of health data||link||link|
|Amazon EMR Studio makes it easier for data scientists to build and deploy code||link||-|
|Simplify running Apache Spark jobs with Amazon EMR on Amazon EKS||link||link|
|Announcing preview of AWS Lake Formation features: Transactions, Row-level Security, and Acceleration||link||-|
|Amazon QuickSight now supports Amazon Elasticsearch Service, and adds new box plot and filled map visuals||link||-|
Amazon Redshift is the beating heart of the data analytics category at AWS. Redshift is one of the most used specialty services in AWS, and one of the most used data warehouses in the world. For a long time, Redshift had a quite rigid pricing and scaling structure: you could only select four instance classes, and each of them had a fixed amount of CPU, memory and storage. At re:Invent 2019, AWS introduced the new Redshift RA3 class, which separated compute and memory from storage, allowing you to scale them independently. However, the ra3.4xlarge and ra3.16xlarge classes were pretty expensive, at $3.26 per hour and $13.04 per hour, respectively.
The new ra3.xlplus class offers roughly one third of the ra3.4xlarge instance: 4 vCPUs (vs. 12), 32 GiB RAM (vs. 96), 0.65 GB/s IO (vs. 2.00 GB/s), at a price of $1.086 per hour (vs. $3.26). This brings the modern ra3 class within budget for more companies.
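A quick check of how the two node types compare, using only the figures quoted above. This is a minimal sketch; the dictionary keys and the `ratios` helper are names I made up for illustration.

```python
# Figures quoted in this post for the two RA3 node types.
RA3_4XLARGE = {"vcpus": 12, "ram_gib": 96, "io_gb_per_s": 2.00, "usd_per_hour": 3.26}
RA3_XLPLUS = {"vcpus": 4, "ram_gib": 32, "io_gb_per_s": 0.65, "usd_per_hour": 1.086}


def ratios(small: dict, large: dict) -> dict:
    """Return small/large for every dimension the two node types share."""
    return {key: small[key] / large[key] for key in small if key in large}


if __name__ == "__main__":
    for dimension, ratio in ratios(RA3_XLPLUS, RA3_4XLARGE).items():
        print(f"{dimension}: {ratio:.3f}")
```

vCPUs and RAM come out at exactly one third, while IO and price land just below it.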
A cool non-Redshift release is the ability to run Apache Spark jobs with Amazon EMR on Amazon EKS. Previously, running EMR workloads required running an EMR EC2 cluster. Although EMR on EC2 supports spot and reserved instances, these clusters are often expensive and underutilized. With many companies running EKS clusters nowadays, EMR on EKS allows them to share resources, optimize scaling, and ultimately reduce costs.
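To give an idea of what submitting a Spark job to EMR on EKS looks like, here is a minimal sketch that builds the request for the `start_job_run` call of the boto3 `emr-containers` client. The function name, the job name, the Spark parameters and the default release label are my own assumptions; the virtual cluster ID, role ARN and script location are placeholders you would supply yourself.

```python
def build_job_run_request(
    virtual_cluster_id: str,
    execution_role_arn: str,
    entry_point: str,
    release_label: str = "emr-6.2.0-latest",  # assumed label, check what your region offers
    name: str = "example-spark-job",  # hypothetical job name
) -> dict:
    """Build keyword arguments for emr-containers start_job_run."""
    return {
        "name": name,
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {
            "sparkSubmitJobDriver": {
                # Location of the PySpark script or JAR to run.
                "entryPoint": entry_point,
                # Illustrative setting only; tune for your own workload.
                "sparkSubmitParameters": "--conf spark.executor.instances=2",
            }
        },
    }
```

With credentials configured, the request could then be submitted with `boto3.client("emr-containers").start_job_run(**request)`. The EKS cluster must already be registered as a virtual cluster for this to work.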
On Thursday Peter DeSantis held his annual Infrastructure Keynote. This has always been my favorite re:Invent event, because DeSantis generally lifts the covers a bit on what makes AWS infrastructure tick. If you listen carefully you will often hear details you can’t get anywhere else. This year was no different, with insights on power supplies, blast radii, running Mac in AWS, and their custom silicon.
|Name||AWS News||AWS Blog|
|AWS Global Accelerator launches custom routing||link||link|
|VPC Reachability Analyzer||link||link|
|AWS customers can now use industry standard Internet Group Management Protocol (IGMP) to easily deploy, manage and scale their multicast applications in AWS cloud||link||-|
|Introducing AWS Transit Gateway Connect to simplify SD-WAN branch connectivity||link||link|
|Amazon EC2 announces Spot Blueprints, an infrastructure code template generator to get started with EC2 Spot Instances||link||-|
|Amazon EC2 announces new network performance metrics for EC2 instances||link||link|
We also saw some new releases, including Custom Routing for Global Accelerator, about which I wrote a separate blog post. VPC Reachability Analyzer is also notable: it allows engineers and developers to troubleshoot network connectivity between two points in one or more VPCs. This eases a pain point familiar to any engineer who has worked with EC2 for even a short amount of time: you’re trying to connect from an EC2 instance to another instance, but it just doesn’t work. You start by checking the security groups, then the NACLs, the route tables, the VPC peering, and so on, to find out where the issue lies. This is time consuming and frustrating. Now you can use VPC Reachability Analyzer to go through this entire process for you, in just a few clicks.
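The same check can also be scripted instead of clicked. Below is a minimal sketch, assuming a boto3 EC2 client is passed in as `ec2`. The function name and the polling interval are my own choices, and the code assumes a TCP check between two resource IDs, such as instance IDs.

```python
import time


def analyze_reachability(ec2, source_id, destination_id, port, poll_seconds=2.0):
    """Create a path, run one analysis, and return (path_found, explanations)."""
    path = ec2.create_network_insights_path(
        Source=source_id,
        Destination=destination_id,
        Protocol="tcp",
        DestinationPort=port,
    )
    path_id = path["NetworkInsightsPath"]["NetworkInsightsPathId"]

    started = ec2.start_network_insights_analysis(NetworkInsightsPathId=path_id)
    analysis_id = started["NetworkInsightsAnalysis"]["NetworkInsightsAnalysisId"]

    # Poll until the analysis leaves the "running" state.
    while True:
        result = ec2.describe_network_insights_analyses(
            NetworkInsightsAnalysisIds=[analysis_id]
        )["NetworkInsightsAnalyses"][0]
        if result["Status"] != "running":
            break
        time.sleep(poll_seconds)

    return result.get("NetworkPathFound", False), result.get("Explanations", [])
```

Called as `analyze_reachability(boto3.client("ec2"), "i-...", "i-...", 22)`, it returns whether a path was found and, if not, the explanations that point at the blocking component. Keep in mind that the path it creates stays around until you delete it.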
AWS Transit Gateway Connect is also interesting, from a technical standpoint as well as its underlying idea. Transit Gateway Connect makes it easier to connect private data centers to AWS. Additionally, when using Transit Gateway Connect you get a total bandwidth up to 20 Gbps per Connect attachment. This release is strongly aligned with AWS’ move towards edge locations and hybrid scenarios, as detailed in my blog post “AWS is coming to a data center (or pizza parlor) near you!”
The last announcement that jumped out is the set of new network performance metrics for EC2 instances. Any EC2 instance has a number of networking limits it cannot exceed. These limits are often more restrictive on cheaper instances. Customers previously had no way of finding out whether an instance breached these limits, except for contacting AWS support. With this release, new metrics are available for inbound and outbound bandwidth, packets-per-second (PPS), connections tracked and PPS to link-local services. This provides more control and insight, and helps engineers prevent issues instead of figuring out what caused them after the fact.
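On Linux instances with a recent ENA driver, these metrics show up as counters in the output of `ethtool -S <interface>`. The sketch below parses that output and keeps only the five allowance counters. The function name is my own, and the sample text in the usage note is made up for illustration rather than captured from a real instance.

```python
# Counter names exposed by the ENA driver for the five network allowances.
ALLOWANCE_COUNTERS = (
    "bw_in_allowance_exceeded",
    "bw_out_allowance_exceeded",
    "pps_allowance_exceeded",
    "conntrack_allowance_exceeded",
    "linklocal_allowance_exceeded",
)


def parse_allowance_counters(ethtool_output: str) -> dict:
    """Extract the allowance counters from `ethtool -S` text output."""
    counters = {}
    for line in ethtool_output.splitlines():
        name, separator, value = line.strip().partition(":")
        if separator and name in ALLOWANCE_COUNTERS:
            counters[name] = int(value)
    return counters
```

You could feed it the result of `subprocess.run(["ethtool", "-S", "eth0"], capture_output=True, text=True).stdout`. Any counter that keeps increasing means traffic is being queued or dropped because the instance hit that limit.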
|Name||AWS News||AWS Blog|
|Amazon ECR announces cross region replication of images||link||link|
|AWS announces AWS Audit Manager||link||link|
|Amazon Braket tensor network simulator supports 50-qubit quantum circuits||link||-|
|Amazon Braket now supports PennyLane||link||link|
|AWS Security Hub integrates with AWS Audit Manager for simplified security posture management||link||-|
|Amplify CLI enables serverless container deployments using AWS Fargate||link||link|
|Introducing Amazon Aurora R6g instance types, powered by AWS Graviton2 processors, in preview||link||-|
|Amazon API Gateway now supports integration with Step Functions StartSyncExecution for HTTP APIs||link||-|
|Amazon EBS reduces the minimum volume size of Throughput Optimized HDD and Cold HDD Volumes by 75%||link||-|
|Simplify EC2 provisioning and viewing cloud resources in the ServiceNow CMDB with AWS Service Management Connector for ServiceNow||link||-|
|AWS IDE Toolkit now available for AWS Cloud9||link||-|
|Amazon Aurora PostgreSQL Integrates with AWS Lambda||link||-|
|AWS Security Hub now supports bidirectional integration with ServiceNow ITSM||link||-|
Aside from the topical releases we also saw a number of announcements for other services. There are two I’d like to highlight: ECR cross region replication and AWS Audit Manager. ECR cross region replication is a very welcome addition for everyone using containers in multiple regions. Previously, systems either had to fetch their images from another region, which is both slow and expensive, or developers had to upload their images to multiple regions, which is cumbersome. With this release, images can be pushed to ECR in one region, after which they are automatically forwarded to another region.
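Replication is configured once per registry. As a minimal sketch, the helper below builds the configuration document that the boto3 ECR client accepts in `put_replication_configuration`. The helper name is my own, and the account ID and regions in the usage note are placeholders.

```python
def build_replication_configuration(destination_regions, registry_id):
    """Build an ECR replication configuration with one rule for all destinations."""
    return {
        "rules": [
            {
                "destinations": [
                    # registry_id is the AWS account ID that owns the target registry.
                    {"region": region, "registryId": registry_id}
                    for region in destination_regions
                ]
            }
        ]
    }
```

For example, `boto3.client("ecr", region_name="eu-west-1").put_replication_configuration(replicationConfiguration=build_replication_configuration(["us-east-1"], "123456789012"))` would replicate images pushed in eu-west-1 to us-east-1. As far as I know, only images pushed after the configuration is in place are replicated.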
Audit Manager is a significant release for all customers who run sensitive workloads on AWS. In an earlier time, security officers prohibited use of public cloud because of perceived security issues in a multi-tenant environment. Nowadays, security officers push for a move to the cloud because of the controls and insights available to them. However, collecting information about your environments and actually proving it is well-architected, secure and compliant is still a difficult task. AWS Audit Manager automates the collection of evidence and makes it available in an easy-to-use portal. Additionally, it provides insights into missing information and any compliance issues it has found.
Week 2 at re:Invent 2020 was more focused around specific topics. The releases were also more iterative than big bangs, but nevertheless we have seen some great additions that clearly show the direction AWS is moving in: machine learning everywhere, hybrid cloud and migrations as a gateway to AWS, and a strong foundation on EC2, optionally based on their custom Graviton2 silicon. There is one more week to go, so let’s see what other surprises AWS has to offer!
This article is part of a series published around re:Invent 2020.
I share posts like these and smaller news articles on Twitter. Follow me there for regular updates! If you have questions or remarks, or would just like to get in touch, you can also find me on LinkedIn.