What I learned using the AWS CDK in the past year

In 2019, I was working as a Developer when I decided to change my profile and become an AWS Architect. I started working with the CDK in October 2019, and I have learned a lot about the CDK and about AWS itself.
Working with the AWS CDK was not enough, so I started contributing to the project. Since it’s an open-source project, I just needed to fork it on GitHub and start working on some small issues.

It has now been a year, and this is what I have learned so far:

High and Low level components

First of all, we need to clarify what the High and Low Level Components are.

    sqs.CfnQueue()
    ...
    sqs.Queue()

The modules prefixed with Cfn are the Low Level Components. These components are the closest you can get to the CloudFormation template definition.
When you inspect these Cfn components and compare them with what CloudFormation expects in its template, you see they are the same (unless CloudFormation released an update a few days ago).

Using our Queue as an example, we have:

High Level Component

class Queue(QueueBase, metaclass=jsii.JSIIMeta, jsii_type="@aws-cdk/aws-sqs.Queue"):
    """A new Amazon SQS queue."""

    def __init__(
        self,
        scope: constructs.Construct,
        id: builtins.str,
        *,
        content_based_deduplication: typing.Optional[builtins.bool] = None,
        data_key_reuse: typing.Optional[aws_cdk.core.Duration] = None,
        dead_letter_queue: typing.Optional[DeadLetterQueue] = None,
        delivery_delay: typing.Optional[aws_cdk.core.Duration] = None,
        encryption: typing.Optional[QueueEncryption] = None,
        encryption_master_key: typing.Optional[aws_cdk.aws_kms.IKey] = None,
        fifo: typing.Optional[builtins.bool] = None,
        max_message_size_bytes: typing.Optional[jsii.Number] = None,
        queue_name: typing.Optional[builtins.str] = None,
        receive_message_wait_time: typing.Optional[aws_cdk.core.Duration] = None,
        retention_period: typing.Optional[aws_cdk.core.Duration] = None,
        visibility_timeout: typing.Optional[aws_cdk.core.Duration] = None,
    ) -> None:

Low Level Component

class CfnQueue(
    aws_cdk.core.CfnResource,
    metaclass=jsii.JSIIMeta,
    jsii_type="@aws-cdk/aws-sqs.CfnQueue",
):
    """A CloudFormation ``AWS::SQS::Queue``.

    :see: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-sqs-queues.html
    :cloudformationResource: AWS::SQS::Queue
    """

    def __init__(
        self,
        scope: aws_cdk.core.Construct,
        id: builtins.str,
        *,
        content_based_deduplication: typing.Optional[typing.Union[builtins.bool, aws_cdk.core.IResolvable]] = None,
        delay_seconds: typing.Optional[jsii.Number] = None,
        fifo_queue: typing.Optional[typing.Union[builtins.bool, aws_cdk.core.IResolvable]] = None,
        kms_data_key_reuse_period_seconds: typing.Optional[jsii.Number] = None,
        kms_master_key_id: typing.Optional[builtins.str] = None,
        maximum_message_size: typing.Optional[jsii.Number] = None,
        message_retention_period: typing.Optional[jsii.Number] = None,
        queue_name: typing.Optional[builtins.str] = None,
        receive_message_wait_time_seconds: typing.Optional[jsii.Number] = None,
        redrive_policy: typing.Any = None,
        tags: typing.Optional[typing.List[aws_cdk.core.CfnTag]] = None,
        visibility_timeout: typing.Optional[jsii.Number] = None,
    ) -> None:

The main difference here is how much work the High Level Component does for you. Take the message retention period as an example: it is a duration, the number of seconds to retain a message. Using the Cfn component, you pass message_retention_period as an integer and do the calculation yourself (1 day = 24 hours = 86,400 seconds). The High Level Component instead expects an aws_cdk.core.Duration in retention_period, so if you would like to pass 1 day to this parameter, you just need:

sqs.Queue(
    self, "SampleAppQueue",
    retention_period=core.Duration.days(1),
)
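
For comparison, here is the calculation that the Low Level Component leaves to you, as a minimal sketch using only the Python standard library (no CDK involved; the variable name is just for illustration):

```python
from datetime import timedelta

# The low-level parameter takes a plain number of seconds,
# so one day has to be converted by hand.
retention_seconds = int(timedelta(days=1).total_seconds())

print(retention_seconds)  # 86400
```

That integer is what you would pass as message_retention_period to sqs.CfnQueue.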

So, I just need to stick with the High Level Components?

Before you change everything you have to High Level Components, be aware that they have some problems as well.

Many Low Level Components don’t have a corresponding High Level Component. If we look at the Global Accelerator resource, there are only the Cfn modules, so no High Level Component is available. This can change in the future when some contributor decides to implement one. But for now, we are stuck creating Global Accelerator resources with Low Level Components.

Sometimes Low Level is just the Best Solution

You need to know your use-case!

Not so long ago, I needed to create a Secrets Manager secret from a template and let the client provide the credentials inside the secret.

The High Level Module does not provide an easy way to create a secret without generating a random value for it! I would need to add an extra key to create my template, so I just went to the Low Level:

cfn_secrets = secretsmanager.CfnSecret(
    scope, 'MyCfnSecret',
    secret_string=json.dumps({
        "api_token": "",
        "pass_token": "",
        "download_token": "",
        "external_ids": ""
    })
)

After that, I just need to import it into the High Level Component and use its benefits:

_secrets = secretsmanager.Secret.from_secret_arn(
    scope, 'MySecret',
    cfn_secrets.ref
)

# Grant the user permissions to read and update the Secret
_secrets.grant_read(iam_user)
_secrets.grant_write(iam_user)

Experimental High Level Components

This is an ongoing discussion in the AWS CDK repository.

Sometimes they update an HLC, and this update replaces the Resource you deployed.

You need to be careful with modules marked as Experimental. Sometimes it is best to stick with the LLC to avoid a headache in the future.

But how can I know which Component is Experimental?

  • Check the page of the module
    HTTP API Doc

  • Check the Documentation inside of the Library
    HTTP API Python

Overall

I DO believe in using the HLC, but it depends on your case. I would like you to keep these points in mind:

  • Are you comfortable with experimental components?
  • Does the High Level Component offer everything you need?
  • Is there a High Level Component for what you need?

A better way to structure AWS CDK projects

I already wrote an article about this: how to transform Stacks into Libraries and re-use them!

Take some time to look at it:

A better way to structure AWS CDK projects around Nested Stacks

Importing Resources

I already spoke briefly about this in the section High and Low level components, but I would like to focus on it here.

Whatever the case (importing existing resources or importing newly created resources), it’s good to know how you can import Resources into an HLC and take advantage of it.

Sharing a role between Pipelines

One use case was that the Role used by CodeBuild in one Stack needed to be used in another Stack.

After we create the Pipeline in the first Stack, we save the Role into Systems Manager Parameter Store:

ssm.StringParameter(
    self,
    'CdkDeployRole',
    parameter_name='/codebuild/role',
    string_value=cdk_build_and_deploy.role.role_arn
)

In the other Stack, we grab the Parameter and import the role to be used by the CodeBuild:

imported_role_arn = ssm.StringParameter.from_string_parameter_attributes(
    self,
    'CdkDeployRole',
    parameter_name='/codebuild/role'
).string_value

cicd_role = iam.Role.from_role_arn(
    self,
    'CiCdSharedRole',
    imported_role_arn
)

Again, learning how to import resources can save you some time, and since the Systems Manager Parameter Store is low, you can avoid some problems in the CDK.

Core Components

We talked about HLC and LLC, but another important set of components to take a look at are the Core Components.

There are many components here, but I would like to take a look at just a few of them.

core.CfnResource

Sometimes there is no Low Level Component for a CloudFormation Resource. Maybe you are using an old version, or the Resource is very new and the CDK Team hasn’t had time to update the CloudFormation library. So, what can you do? Not use the Resource? Ask the CDK Team to keep up the pace? Or just use core.CfnResource until the module is updated in the CDK?

I would go with the last option!

But sometimes you are just more comfortable defining all the Properties yourself, and you don’t trust anyone!

As an example, the VpcLink for ApiGatewayV2 was released on 12-03-2020, and the CDK added this Resource on 13-08-2020. I needed to deploy it before the Resource was added to the CDK, so I went to core.CfnResource:

vpc_link = core.CfnResource(
    self,
    "VPCLink",
    type="AWS::ApiGatewayV2::VpcLink",
    properties={
        "Name": 'VpcLink',
        "SecurityGroupIds": [security_group.security_group_id],
        "SubnetIds": [subnet_1, subnet_2, subnet_3]
    }
)

core.CfnJson

I learned the hard way that if you need a dynamic value in the key of a JSON object to deploy a Resource in CloudFormation, you will have some problems! And you cannot just pass the JSON as a string: CloudFormation expects an object, and you need to give it an object!

When creating a Role used by a service account in K8S, the condition for the Trust Policy has a key that includes the OIDC Server address. This is dynamic information and depends on the deployment of the EKS Cluster. How can I produce this?

{
    "Condition": {
        "StringEquals": {
            "oidc.eks.eu-west-1.amazonaws.com/id/EKS_CLUSTER_ID:sub": "system:serviceaccount:namespace:service-account"
        }
    }
}

So the solution was to create the Condition for the Role using core.CfnJson:

condition = core.CfnJson(
    self,
    'RoleCondition',
    value={
        f"oidc.eks.{region}.amazonaws.com/id/{EKS_CLUSTER_ID}:sub": (
            "system:serviceaccount:namespace:service-account"
        )
    }
)
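
One rough way to picture why a dynamic key is a problem, as a plain-Python sketch that does not use the CDK: in a template, a value that is only known at deploy time shows up as an object, and an object can be a value but never a key. The logical ID "Cluster" and the attribute "SomeAttribute" below are hypothetical placeholders, not real names from my stack.

```python
import json

# A deploy-time value appears in the template as an object.
# "Cluster" and "SomeAttribute" are hypothetical placeholders.
deploy_time_value = {"Fn::GetAtt": ["Cluster", "SomeAttribute"]}

# Used as a VALUE, the object serializes without trouble.
as_value = json.dumps({"StringEquals": {"some:key": deploy_time_value}})

# Used as a KEY, it cannot even be written down: keys have to be strings.
try:
    broken = {deploy_time_value: "system:serviceaccount:namespace:service-account"}
    key_accepted = True
except TypeError:
    key_accepted = False

print(key_accepted)  # False
```

This is only an analogy for the limitation, but it is the situation where I reach for core.CfnJson.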

core.CfnOutput

Of all the core components, this is the one I use the most!

Every time I need to debug the output of a deployed resource (Cluster ID, Instance ID, Region…), I add a CfnOutput just to print the value, and sometimes it stays there afterwards… ;)

Using the previous example, to check whether the EKS Cluster ID is what we expect, we would just need:

core.CfnOutput(
    self,
    'DebugOutput',
    value=EKS_CLUSTER_ID
)

Override component values

What would you think if I told you that developers sometimes make mistakes and add properties where they should not be added?

The methods add_deletion_override() and add_override() help solve many problems when you find that the CDK is adding more properties than it should, or when you want to set a different value for a specific property.

.add_deletion_override()

The CDK adds a Role to the CodeStar Connection action, which doesn’t work well for the CodeStar Connection. I have opened a couple of Support Cases with AWS about this, and sometimes it is just best not to add this information!

CDK adds this information whether you want it or not! So how can we remove this property from the CloudFormation template?

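# Property names follow the AWS::CodePipeline::Pipeline resource (Stages, Actions);
# the indexes select the stage and the action that hold the CodeStar Connection.
pipeline.node.default_child.add_deletion_override(
    "Properties.Stages.0.Actions.0.RoleArn"
)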

Here, pipeline is the CodePipeline construct. The CodeStar Connection is added to an Action, which in turn is added to a Stage. The .add_deletion_override() method is a base resource method, meaning it is called not on the CodeStar Connection but on the CodePipeline, because you don’t deploy a CodeStar Connection. Instead, you deploy a CodePipeline that has a CodeStar Connection (ufff).

.add_override()

And what if, instead of removing information, you want to add more?

The .add_override() method changes the information you want inside a Resource, so if you want to add more Principals to a Trust Policy, you add:

_role.node.default_child.add_override(
    'Properties.AssumeRolePolicyDocument.Statement.0.Principal.AWS',
    [
        f'arn:aws:iam::{_account_1}:root',
        f'arn:aws:iam::{_account_2}:root'
    ]
)

This allows both accounts to assume the Role through the Trust Policy!
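
To get a feeling for how these dotted paths address a property, here is a simplified model in plain Python. It is not how the CDK implements the overrides (for one thing, this sketch only works on keys that already exist); the helper names set_at_path and delete_at_path, the trimmed resources, and the account IDs are all made up for the illustration.

```python
import copy
from typing import Any, List


def _walk(node: Any, parts: List[str]) -> Any:
    """Follow path segments; numeric segments index into lists."""
    for part in parts:
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def set_at_path(resource: dict, path: str, value: Any) -> dict:
    """Return a copy of the resource with the property at path set to value."""
    result = copy.deepcopy(resource)
    *parents, last = path.split(".")
    _walk(result, parents)[last] = value
    return result


def delete_at_path(resource: dict, path: str) -> dict:
    """Return a copy of the resource with the property at path removed."""
    result = copy.deepcopy(resource)
    *parents, last = path.split(".")
    _walk(result, parents).pop(last, None)
    return result


# Hypothetical, heavily trimmed resources for the two examples.
pipeline_resource = {
    "Properties": {
        "Stages": [
            {"Name": "Source", "Actions": [
                {"Name": "Checkout", "RoleArn": "arn:aws:iam::111111111111:role/example"}
            ]}
        ]
    }
}
role_resource = {
    "Properties": {
        "AssumeRolePolicyDocument": {
            "Statement": [
                {"Effect": "Allow", "Principal": {"AWS": "arn:aws:iam::111111111111:root"}}
            ]
        }
    }
}

without_role = delete_at_path(
    pipeline_resource, "Properties.Stages.0.Actions.0.RoleArn"
)
with_two_accounts = set_at_path(
    role_resource,
    "Properties.AssumeRolePolicyDocument.Statement.0.Principal.AWS",
    ["arn:aws:iam::111111111111:root", "arn:aws:iam::222222222222:root"],
)
```

Each segment is either a key or, when the current node is a list, an index. That is why the paths in the examples above contain a 0.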

CDK Development Tips

Now, in this section, I am sharing my tips: what I usually use when developing IaC using the CDK in Python!

Here is a mix of Code Standards, Debug Tools, and Documentation!

NOTE: These are my ideas. If you have more, please let me know!

DRY vs. WET

From Wikipedia:

DRY

Don’t repeat yourself (DRY, or sometimes do not repeat yourself) is a principle of software development aimed at reducing repetition of software patterns, replacing it with abstractions or using data normalization to avoid redundancy.

WET

Write Every Time (WET) solutions are common in multi-tiered architectures where a developer may be tasked with, for example, adding a comment field on a form in a web application. The text string “comment” might be repeated in the label, the HTML tag, in a read function name, a private variable, database DDL, queries, and so on.

Paradigms like OOP can help here when developing with the CDK. My article A better way to structure AWS CDK projects around Nested Stacks can help you start converting Stacks into Libraries and re-using the code. I’m taking a step further and have begun abstracting the Alarms for the Auto Scaling Group.

We normally have 2 Alarms (CPUUtilization and StatusCheckFailed) that we want for each ASG, so:

    @staticmethod
    def asg(
        scope: core.Construct,
        section: str,
        asg_name: str,
        topic: sns.Topic
    ) -> typing.Dict[str, cloudwatch.Alarm]:
        '''
        :param scope: Scope from the CDK
        :param section: Which section these ASG Alarms refer to (EKS, Bastion)
        :param asg_name: The ASG name
        :param topic: The SNS Topic to send the notifications to

        Creates the necessary alarms for the Auto Scaling Group.
        example:
        Alarms.asg(scope, 'EKS', eks.auto_scaling_group_name, topic)
        '''

        # region Monitoring Alarms (CPUUtilization)
        cpu_utilization = cloudwatch.Alarm(
            scope,
            f'{section}CpuUtilization',
            metric=cloudwatch.Metric(
                metric_name='CPUUtilization',
                namespace='AWS/EC2',
                dimensions={
                    'AutoScalingGroupName': asg_name
                },
                unit=cloudwatch.Unit.PERCENT
            ),
            evaluation_periods=evaluation_period,
            threshold=threshold_cpu_usage,
            actions_enabled=True,
            alarm_description=(
                'Alarm - CPUUtilization '
                f'high {section} Instance ASG-{asg_name}'
            ),
            comparison_operator=HalloumiAlarms.GREATER_THAN_THRESHOLD,
            period=core.Duration.minutes(1),
            statistic='Average',
            treat_missing_data=cloudwatch.TreatMissingData.NOT_BREACHING
        )
        cpu_utilization.add_alarm_action(
            cloudwatch_actions.SnsAction(topic=topic)
        )
        # endregion

        # region Monitoring Alarms (StatusCheckFailed)
        status_check = cloudwatch.Alarm(
            scope,
            f'{section}StatusCheckFailed',
            metric=cloudwatch.Metric(
                metric_name='StatusCheckFailed',
                namespace='AWS/EC2',
                dimensions={
                    'AutoScalingGroupName': asg_name
                },
                # StatusCheckFailed is reported as a count, not a percentage
                unit=cloudwatch.Unit.COUNT
            ),
            evaluation_periods=evaluation_period,
            threshold=threshold_status_check,
            actions_enabled=True,
            alarm_description=(
                'Alarm - StatusCheckFailed '
                f'high {section} Instance ASG-{asg_name}'
            ),
            comparison_operator=HalloumiAlarms.GREATER_THAN_THRESHOLD,
            period=core.Duration.minutes(1),
            statistic='Minimum',
            treat_missing_data=cloudwatch.TreatMissingData.NOT_BREACHING
        )
        status_check.add_alarm_action(
            cloudwatch_actions.SnsAction(topic=topic)
        )
        # endregion

        return {
            'cpu_utilization': cpu_utilization,
            'status_check': status_check
        }

Now every time we need Alarms for an ASG, the developer can re-use this Library!

Think of all the code you type every day. I believe that you can abstract at least 10% of it!

Returning Types

This is for those who are not yet familiar with the programming language and are developing their own libraries!

When you define the type your method (or a simple function) returns, you help yourself in the future. When you use this method, you already know from the -> iam.Role: annotation that it will return a Role.

Your IDE will also be able to help you inspect the available methods and properties of the new object you just created.

def new_role(
    scope: core.Construct,
    statements: typing.List[iam.PolicyStatement]
) -> iam.Role:
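    '''
    :param scope: Scope from the CDK
    :param statements: List of Statements containing the policies

    Create a new Role and add the policies from the statements
    '''
    the_role = iam.Role(
        scope,
        'MyRole',
        # assumed_by is required by iam.Role; the EC2 service principal
        # is only a placeholder for this example
        assumed_by=iam.ServicePrincipal('ec2.amazonaws.com')
    )
    for statement in statements:
        the_role.add_to_policy(statement)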

    return the_role
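
The same idea can be tried without the CDK installed. In the sketch below, the Role dataclass is a hypothetical stand-in for iam.Role and the statements are plain strings, so treat it as an illustration of the annotation only:

```python
import typing
from dataclasses import dataclass, field


@dataclass
class Role:
    """Hypothetical stand-in for iam.Role, so the example runs without the CDK."""
    name: str
    statements: typing.List[str] = field(default_factory=list)


def new_role(name: str, statements: typing.List[str]) -> Role:
    """Create a Role and attach every statement to it."""
    the_role = Role(name)
    for statement in statements:
        the_role.statements.append(statement)
    return the_role


# The annotation is available to tools without calling the function.
hints = typing.get_type_hints(new_role)
print(hints["return"].__name__)  # Role
```

This is the information an IDE or a type checker relies on when it suggests methods and properties for the returned object.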

Python Doc

This is the most controversial topic. Some say “Write good code that doesn’t need documentation”; others say “Comment every step”.
I like it when I write good code and just need a simple explanation of how to re-use what I just made!

I need to be honest: I don’t do this for 100% of my work! But when I see that a method needs some explanation, I usually follow this Python documentation style:

def new_function(
    param1: str,
    param2: str
) -> str:
    '''
    :param param1: Little explanation for param1
    :param param2: Little explanation for param2

    What are the use cases for this function!!!

    Example:
    Do we need an example?
    '''

I don’t need to go inside the function and start detailing “here I do this, after receiving this value I call this method”, and so on. But some introduction is required, so I use this documentation for methods, classes, and simple functions!
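
A nice side effect is that the docstring travels with the function. As a small sketch using only the standard library (the function body is invented for the example), this is how the text can be read back:

```python
import inspect


def new_function(param1: str, param2: str) -> str:
    '''
    :param param1: Little explanation for param1
    :param param2: Little explanation for param2

    Joins both parameters with a dash.
    '''
    return f'{param1}-{param2}'


# inspect.getdoc() returns the docstring with the indentation cleaned up.
doc = inspect.getdoc(new_function)
print(doc.splitlines()[0])  # :param param1: Little explanation for param1
```

Calling help(new_function) in a Python shell shows this documentation as well, so whoever re-uses the library does not need to open the source file.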

CDK Debugging

Recently I took some time to look at this article from Kah Tang (a colleague here at Sentia), where he talks about the debugging process using the Python extension in VS Code. I recommend you take some time to read it as well.

Debugging a CDK Python project in Visual Studio Code

Finalizing

It’s been a year, with many projects and lots of coding! I was able to learn a lot and start refactoring my libraries. The CDK continues to evolve, and my code needs to follow the same path!

So what about you? What have you created with the CDK?

Rafael Zamana Kineippe

AWS Consultant at Sentia Consulting