Pre:Invent 2019 Update - Load Balancers and Auto Scaling Groups

Re:Invent 2019 is almost upon us, and as every year, AWS is releasing a large number of features in the run-up to the main event. In this Pre:Invent 2019 post I will highlight some significant updates to Elastic Load Balancing (ELB) and Auto Scaling Groups (ASG).

In this post we’ll look at the following features and why they are significant: Weighted Target Groups, Maximum Instance Lifetime, and Least Outstanding Requests.

Weighted Target Groups

When I open the official AWS Quick Start on Blue/Green deployments today, it contains this image:


In the diagram, you can see two load balancers and two auto scaling groups. Routing / weighting between the two is achieved through DNS with Route 53.

With weighted target groups we still have two auto scaling groups, represented by two load balancing target groups. However, we only have one application load balancer, which we can configure to send a certain percentage of traffic to each target group.

This greatly simplifies the setup and minimizes the number of moving parts, leading to a smaller risk of misconfiguration and incidents.

Additionally, we can configure the load balancer to apply group-level stickiness, increasing the likelihood of related requests being processed by the same auto scaling group.
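To make the weighting and group-level stickiness concrete, here is a minimal sketch that builds a weighted forward action for an ALB listener. The helper name `weighted_forward_action`, the placeholder ARNs and the 90/10 split are assumptions for illustration; the dictionary layout follows the ELBv2 `ForwardConfig` structure, and the actual API call is left as a comment so the snippet runs without AWS credentials.

```python
def weighted_forward_action(blue_arn, green_arn, green_percent, sticky_seconds=None):
    """Build a 'forward' listener action that splits traffic between two target groups."""
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")
    forward_config = {
        "TargetGroups": [
            {"TargetGroupArn": blue_arn, "Weight": 100 - green_percent},
            {"TargetGroupArn": green_arn, "Weight": green_percent},
        ]
    }
    if sticky_seconds is not None:
        # Group-level stickiness: keep a client on the same target group for a while.
        forward_config["TargetGroupStickinessConfig"] = {
            "Enabled": True,
            "DurationSeconds": sticky_seconds,
        }
    return {"Type": "forward", "ForwardConfig": forward_config}


# Placeholder ARNs (hypothetical): send 10% of traffic to the green group.
action = weighted_forward_action(
    "arn:blue-placeholder", "arn:green-placeholder",
    green_percent=10, sticky_seconds=300,
)

# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("elbv2").modify_listener(ListenerArn=..., DefaultActions=[action])
```

Shifting traffic during a deployment then comes down to calling the helper again with a higher `green_percent`, rather than changing DNS records.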


For more information, see the AWS news article: Application Load Balancer simplifies deployments with support for weighted target groups

Maximum Instance Lifetime

In a cloud native environment, we should be able to terminate instances at any moment without impact. However, the fact that we can does not mean we actually do. That might lead to long-running instances, which in turn might impact performance, security and application reliability.

The solution might be to periodically replace the instances, for example by running a Lambda function that checks the age of an instance and terminates it. However, implementing a custom solution like this costs time and effort to build and maintain, and again, increases the number of moving parts.

With the new Maximum Instance Lifetime feature, the auto scaling group can do all this heavy lifting for us.
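As a rough sketch of what replaces the custom Lambda, the snippet below builds the parameters for setting a maximum lifetime on an auto scaling group. The helper name `max_lifetime_params`, the group name `my-example-asg` and the 14-day value are assumptions for illustration. `MaxInstanceLifetime` is expressed in seconds; check the current AWS documentation for the allowed range before picking a value.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_lifetime_params(asg_name, max_age_days):
    """Build keyword arguments for an update_auto_scaling_group call."""
    if max_age_days <= 0:
        raise ValueError("max_age_days must be positive")
    return {
        "AutoScalingGroupName": asg_name,
        # The API expects the lifetime in seconds, not days.
        "MaxInstanceLifetime": max_age_days * SECONDS_PER_DAY,
    }


# Hypothetical group name and lifetime.
params = max_lifetime_params("my-example-asg", 14)

# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("autoscaling").update_auto_scaling_group(**params)
```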

For more information, see the AWS news article: Amazon EC2 Auto Scaling Now Supports Maximum Instance Lifetime

Least Outstanding Requests

Previously, Application Load Balancers only supported round robin balancing. This means that if there are two instances behind a load balancer, the requests would be routed to instance A, instance B, A, B, A, B, and so on. With three instances it would be A, B, C, A, B, C, A, B, C.

There is no inspection of the requests and the load of the target servers is not taken into account. Theoretically, there might be a situation where there are two servers, every odd request is simple (GET /status) and every even request is complex (POST /database/purge). In this hypothetical situation, all heavy requests end up on the same instance. This server would be overburdened, while the other is idling, even though the requests were equally spread.
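The hypothetical situation above can be reproduced with a small simulation. This is a toy model, not the load balancer's actual implementation: the function name `round_robin_load` and the request costs (1 unit for a simple request, 50 units for a complex one) are made up for illustration.

```python
from itertools import cycle


def round_robin_load(instances, request_costs):
    """Assign requests to instances in strict rotation; return the total cost per instance."""
    load = {name: 0 for name in instances}
    for name, cost in zip(cycle(instances), request_costs):
        load[name] += cost
    return load


# Hypothetical costs: every odd request is cheap (1), every even request is expensive (50).
costs = [1, 50] * 5  # ten requests in total
load = round_robin_load(["A", "B"], costs)
print(load)  # {'A': 5, 'B': 250}
```

Both instances receive exactly five requests, yet instance B ends up with fifty times the work of instance A.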

This becomes even more interesting in an auto scaling group that scales out at 70% average CPU load: with one instance close to 100% and the other close to idle, the average load is only about 50%, so the ASG won’t kick in to solve the problem.
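The arithmetic behind this is easy to check. The CPU figures below are hypothetical values chosen to match the scenario of one overburdened and one idle instance.

```python
from statistics import mean

SCALE_OUT_THRESHOLD = 70  # percent average CPU, as in the scenario above

# Hypothetical per-instance CPU load: A is idle, B is saturated.
cpu_percent = {"A": 0, "B": 100}

average = mean(cpu_percent.values())
should_scale_out = average >= SCALE_OUT_THRESHOLD

print(average, should_scale_out)  # 50 False
```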

The problem becomes even bigger when sticky sessions have been enabled on the load balancer (but if you need sticky sessions you might want to take a good look at the cloud nativity of your application anyway).

With the new Least Outstanding Requests feature, the load balancer will use the number of outstanding requests (the number of requests the instance still needs to process) as a metric to determine where to route requests, instead of using ‘dumb’ round robin routing. This will reduce the burden on busy servers, and might actually make sticky sessions usable!
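The idea can be sketched with a toy simulation. This is a simplified model under stated assumptions, not the ALB's actual algorithm: one request arrives per tick, each request occupies an instance for a fixed number of ticks, and ties go to the first instance in the list. The function name and the durations are hypothetical.

```python
def least_outstanding_requests(instances, durations):
    """Route one request per tick to the instance with the fewest unfinished requests."""
    finish_ticks = {name: [] for name in instances}  # finish tick of each in-flight request
    assigned = {name: [] for name in instances}      # durations routed to each instance
    for tick, duration in enumerate(durations):
        # Forget requests that have completed by the current tick.
        for name in instances:
            finish_ticks[name] = [t for t in finish_ticks[name] if t > tick]
        # Pick the instance with the fewest outstanding requests.
        target = min(instances, key=lambda name: len(finish_ticks[name]))
        finish_ticks[target].append(tick + duration)
        assigned[target].append(duration)
    return assigned


# Same pattern as before: cheap (1 tick) and expensive (10 ticks) requests alternate.
assigned = least_outstanding_requests(["A", "B"], [1, 10] * 4)
heavy = {name: sum(1 for d in reqs if d == 10) for name, reqs in assigned.items()}
print(heavy)  # {'A': 2, 'B': 2}
```

In this model the expensive requests end up split evenly between the two instances, whereas strict rotation would have sent all four of them to instance B.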

For more information, see the AWS news article: Application Load Balancer now supports Least Outstanding Requests algorithm for load balancing requests

Luc van Donkersgoed