Make your APIs faster, cheaper and safer with Rust and Go

Luc van Donkersgoed

The AWS Well-Architected Framework consists of five pillars: operational excellence, security, reliability, performance efficiency, and cost optimization. It is said that the last three pillars follow the ‘pick two’ principle: you can have reliability and performance, for example, but not at a low cost. But sometimes… sometimes you can have your cake and eat it too. Migrating your Python and NodeJS Lambda functions to Go or Rust is one of those moments.

In this article we will explore a common serverless scenario for APIs in AWS: an API frontend, for example API Gateway or AppSync (GraphQL), is backed by Lambda functions. These functions query DynamoDB, parse the content, and return it to the user.

Layers

The Lambda processing layer is often written in NodeJS or Python. These languages are easy to understand and relatively fast. I say relatively fast because they will execute simple tasks quickly enough. But they are also interpreted languages, which means the code is processed by a runtime executable. The runtime translates your code into machine language - the code the CPU will actually run - while the program executes; in runtimes like NodeJS's V8 engine this translation is done through just-in-time (JIT) compilation. Compare this to compiled languages, where the conversion to machine language takes place in an earlier stage: when the developer or build pipeline packages the app. The result of compilation is a binary file composed of the machine instructions the CPU understands. This code executes faster because no conversion needs to take place at runtime.

Interpreted vs. Compiled

Rust and Go are compiled languages that have seen massive gains in popularity in the last decade. They are almost as fast as C and C++, historically the kings of performance. Additionally, Rust and Go are statically typed, which makes them safer than Python or NodeJS, while Rust's ownership model and smart pointers and Go's garbage collection make them memory-safe in ways C and C++ are not.

AWS has seen these same benefits, which has led to a large increase in the use of Rust across their teams. You can read more about that in their articles How our AWS Rust team will contribute to Rust’s future successes and Congratulations, Rustaceans, on the creation of the Rust Foundation!.

Testing the impact of programming languages on performance

Common tasks for APIs include processing large amounts of JSON and performing cryptographic operations like hashing with MD5 or SHA. In this article we will combine these tasks in an example Lambda function that takes a large JSON file, filters it, and generates a hash of each of its objects. We have written three versions of this Lambda function - one in Python, one in Rust and one in Go. The input and output of the three functions are exactly the same, so we can objectively compare their performance. The JSON file is embedded in the function through a Lambda Layer. This eliminates external sources like S3 or DynamoDB, which might otherwise influence the results.

Although not an exact copy of an API Gateway - Lambda - DynamoDB setup, this testbed will show how different languages influence the execution layer of the architecture. The full source of the test can be found on GitHub.

We will run the benchmark with multiple memory configurations. In AWS Lambda, CPU processing power scales with memory. The highest single-thread performance is achieved at 3008 MB of memory; anything above that value only benefits multi-threaded workloads. Additional details on the chosen memory values can be found in my article Optimizing Lambda Cost with Multi-Threading.

We will also run the benchmark with two randomly generated JSON files. The first has 10.000 objects, the second has 100.000 objects. A single object in these dictionaries looks like this:

{
    "af9137aa-e76e-4819-944e-143f51868d6c": {
        "make": "TOYOTA",
        "model": "SEQUOIA 2WD",
        "license_plate": "DM-362-K",
        "origin": {
            "country": "Poland",
            "year": 1982
        }
    }
}
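The article doesn't show the script that produced the test data, but a data set in this shape could be generated with a short Go sketch like the one below. Everything in it - the makes, the plate format, the pseudoUUID helper - is illustrative, not the author's actual generator:

```go
package main

import (
	"crypto/rand"
	"encoding/json"
	"fmt"
)

// pseudoUUID builds a UUID-formatted random key from 16 random bytes;
// a real generator might use a dedicated UUID library instead.
func pseudoUUID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

func main() {
	makes := []string{"TOYOTA", "FORD", "VOLVO"}
	data := map[string]interface{}{}
	for i := 0; i < 3; i++ {
		// Each entry mirrors the sample object shown above.
		data[pseudoUUID()] = map[string]interface{}{
			"make":          makes[i%len(makes)],
			"model":         fmt.Sprintf("MODEL-%d", i),
			"license_plate": fmt.Sprintf("A%c-%03d-K", 'A'+i, i),
			"origin": map[string]interface{}{
				"country": "Poland",
				"year":    1980 + i,
			},
		}
	}
	out, _ := json.MarshalIndent(data, "", "    ")
	fmt.Println(string(out))
}
```

Scaling the loop to 10.000 or 100.000 iterations produces files of the sizes used in the benchmark.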

Benchmark step one: parsing JSON

The first test compares how fast Python, Go and Rust can read a JSON file, store it in memory and parse it into a native object. In Python the code looks like this:

    # Load the file from JSON into memory
    t1 = time.time()
    file_name = os.environ.get('TEST_DATA_FILE')
    with open(f'/opt/{file_name}', 'r') as file_handle:
        test_data = json.load(file_handle)
    print(f'JSON parsing took {int((time.time() - t1) * 1000)} ms')

In Go it is:

    // Load the file from JSON into memory
    t1 := time.Now()
    file_name := os.Getenv("TEST_DATA_FILE")
    jsonFile, err := os.Open(fmt.Sprintf("/opt/%s", file_name))
    if err != nil {
        fmt.Println(err)
    }
    defer jsonFile.Close()

    byteValue, _ := ioutil.ReadAll(jsonFile)
    var sourceMap map[string]Vehicle
    if err := json.Unmarshal(byteValue, &sourceMap); err != nil {
        fmt.Println(err)
    }
    fmt.Printf("JSON parsing took %d ms\n", time.Since(t1).Milliseconds())

And in Rust it’s implemented as:

    // Load the file from JSON into memory
    let t1 = Instant::now();
    let file_name = env::var("TEST_DATA_FILE").unwrap();
    let data = fs::read_to_string(
        format!("/opt/{}", file_name)
    ).expect("Unable to read file");
    let map: HashMap<String, Vehicle> = serde_json::from_str(&data).unwrap();
    println!("JSON parsing took {:?} ms", t1.elapsed().as_millis());

After running this code 1.000 times and filtering the outliers, the results are surprising!

Read 10.000 JSON Objects

Against all expectations, Go is actually slower than Python. After a bit of googling I learned that this is a well-documented shortcoming in Go: the standard Go JSON library is quite inefficient at handling large JSON files - which is exactly what we’re doing here. The real takeaway, however, is that Rust is blazing fast at parsing JSON. At the 128 MB memory configuration Rust is 35% faster than Python, and at 1769 MB it’s a whopping 72% faster.

Memory size | Average time Python | Average time Rust | Improvement
128 MB      | 688,08 ms           | 443,86 ms         | 35%
256 MB      | 331,16 ms           | 103,5 ms          | 69%
832 MB      | 66,98 ms            | 57,92 ms          | 14%
1769 MB     | 50,88 ms            | 14,06 ms          | 72%
3008 MB     | 51 ms               | 17,04 ms          | 67%

At 100.000 objects we see the same pattern. The Python 128 MB benchmark is missing because Python required more memory to parse and filter this number of objects, while Rust was able to do it in 115 MB and Go in exactly 128 MB.

Read 100.000 JSON Objects

Benchmark step two: filtering the hashmap and SHA256-hashing strings

The next step of the benchmark involves reading the objects in the hashmap, splitting a license plate string on dashes and calculating SHA256 values. In Python this looks like:

    t2 = time.time()
    filtered_list = []
    # For every item, check its license plate. If it has an 'A' in the first
    # section and a 0 in the second section, add it to the list. For example
    # 'AT-001-B' matches, but 'A-924-VW' doesn't.
    for obj in test_data.values():
        license_plate_components = obj['license_plate'].split('-')
        if 'A' in license_plate_components[0] and '0' in license_plate_components[1]:
            # If the license plate matches, add a new field 'make_model_hash'
            # to the object. This field contains the sha256 hash of the make and model.
            obj['make_model_hash'] = hashlib.sha256(
                f"{obj['make']}{obj['model']}".encode()
            ).hexdigest().upper()

            # Add it to the results list
            filtered_list.append(obj)
    print(f'Object filtering took {int((time.time() - t2) * 1000)} ms')

In Go it is this:

    t2 := time.Now()
    vector := []VehicleWithHash{}
    // For every item, check its license plate. If it has an 'A' in the first
    // section and a 0 in the second section, add it to the list. For example
    // 'AT-001-B' matches, but 'A-924-VW' doesn't.
    for _, vehicle := range sourceMap {
        comps := strings.Split(vehicle.LicensePlate, "-")
        if strings.Contains(comps[0], "A") && strings.Contains(comps[1], "0") {
            makeModelHash := sha256.Sum256([]byte(fmt.Sprintf(
                "%s%s", vehicle.Make, vehicle.Model,
            )))

            // Add it to the results list
            vector = append(
                vector,
                VehicleWithHash{
                    Make:          vehicle.Make,
                    Model:         vehicle.Model,
                    LicensePlate:  vehicle.LicensePlate,
                    Origin:        vehicle.Origin,
                    MakeModelHash: strings.ToUpper(hex.EncodeToString(makeModelHash[:])),
                },
            )
        }
    }
    fmt.Printf("Object filtering took %d ms\n", time.Since(t2).Milliseconds())

And in Rust it looks like this:

    let t2 = Instant::now();
    let mut vec = Vec::<VehicleWithHash>::new();
    // For every item, check its license plate. If it has an 'A' in the first
    // section and a 0 in the second section, add it to the list. For example
    // 'AT-001-B' matches, but 'A-924-VW' doesn't.
    for (_, v) in &map {
        let comps: Vec<&str> = v.license_plate.split('-').collect();
        if comps[0].contains("A") && comps[1].contains("0") {
            let mut hasher = Sha256::new();
            hasher.update(format!("{}{}", v.make, v.model));

            // Add it to the results list
            vec.push(
                VehicleWithHash {
                    make_model_hash: format!("{:X}", hasher.finalize()),
                    make: v.make.clone(),
                    model: v.model.clone(),
                    origin: v.origin.clone(),
                    license_plate: v.license_plate.clone()
                }
            );
        }
    }
    println!("Object filtering took {:?} ms", t2.elapsed().as_millis());

In the 10.000-objects benchmark Rust is slightly faster than Python, while in the 100.000-objects test it is slightly slower. This time around the performance crown goes to Go, which is faster than Python and Rust for both source JSONs.

Parse and filter 10.000 Objects

Memory size | Average time Python | Average time Go | Average time Rust | Improvement for Go | Improvement for Rust
256 MB      | 60 ms               | 21,44 ms        | 19 ms             | 11%                | 4%
832 MB      | 4 ms                | 2 ms            | 3 ms              | 50%                | 25%
1769 MB     | 3 ms                | 2 ms            | 3 ms              | 33%                | 0%
3008 MB     | 3 ms                | 2 ms            | 3 ms              | 33%                | 0%

These results are somewhat coarse because at 832 MB and up the processes finish very quickly. In the 100.000-objects test the results look like this:

Parse and filter 100.000 Objects

Memory size | Average time Python | Average time Go | Average time Rust | Improvement for Go | Improvement for Rust
256 MB      | 275,06 ms           | 239,58 ms       | 297,08 ms         | 13%                | -8%
832 MB      | 80,9 ms             | 75,5 ms         | 84 ms             | 7%                 | -4%
1769 MB     | 37 ms               | 33,7 ms         | 39 ms             | 9%                 | -5%
3008 MB     | 37 ms               | 32 ms           | 39 ms             | 14%                | -5%

Security and performance

At the top of the article I mentioned that Rust and Go are statically typed, while Python and NodeJS are not. This became obvious in our second benchmark. Where in Python it is fine to add fields to objects on the fly…

obj['make_model_hash'] = hashlib.sha256(
    f"{obj['make']}{obj['model']}".encode()
).hexdigest().upper()

… this is impossible in Rust and Go, where we had to create a new object with additional fields:

vec.push(
    VehicleWithHash {
        make_model_hash: format!("{:X}", hasher.finalize()),
        make: v.make.clone(),
        model: v.model.clone(),
        origin: v.origin.clone(),
        license_plate: v.license_plate.clone()
    }
);

The statically typed solution is safer: you define exactly what data should look like, and the code will fail when the data does not match this definition. This prevents mistakes that might otherwise go unnoticed. However, it is also more complex: you need to specify objects or structs to define the shape of your data, and you’re constantly copying data around when you’re reshaping objects (see the number of clone() calls in the Rust example). Another upside of statically typed languages is that the compiler knows exactly how much memory your objects will occupy. This allows for additional performance gains.

What can Rust and Go do for you?

In this article we have seen two very specific workloads. In the first, Rust absolutely shone; in the second, Go displayed obvious improvements. Your application will almost always be more complex than this example - it will contain some JSON parsing, some hashing, some calculations, a few comparisons and maybe some compression. Which language serves your application best will vary case by case. Generally speaking, though, Rust and Go will be faster than Python and Node, they will be safer, and they will use less memory.

In the serverless world, faster and smaller directly translate to cheaper. And because Rust and Go are statically typed, they are also more reliable. Therefore choosing a powerful compiled language can lead to big wins across the three pillars of reliability, performance efficiency, and cost optimization. You don’t have to pick two - just enjoy your cake.

I share posts like these and smaller news articles on Twitter, follow me there for regular updates! If you have questions or remarks, or would just like to get in touch, you can also find me on LinkedIn.
