Common Caching Pitfalls

We all know and love caching, right? It makes all of our complex algorithms and expensive database or filesystem lookups seem so fast and simple. But there are downsides to caching. For example, how do you handle invalidation? Or distributing the cache over multiple systems? Or what happens when process A is reading the cache while process B is writing to it? Those downsides and more are the topic of this blog post.

Caching benefits

Caching has many benefits. The most obvious one is that you no longer need to make an expensive call to an external system. Whether we’re talking about a database or an external web service, you can easily save a hundred milliseconds of latency per call. The same goes for reading something from the filesystem. A hard disk easily takes 10+ ms to read something, more if it’s in a cloud environment.

When you have to get data like this, it’s often relatively easy to cache it. Most, if not all, web frameworks have some kind of cache system included. Some store the cache in memory, others on disk. Maybe you’ll even have the option of storing it in an external system, so you don’t have to worry about distributing it between multiple instances.

Caching problems

However, there are some problems with caching. What happens when you update a piece of information that is cached? For example, what if you update the styling of a website while it’s cached?

For styling and other web resources, the easy option is to add a so-called “cache buster” to the URL. That’s a fancy name for adding a unique value to the query string, so instead of https://example.com/css/main.css you’ll be requesting https://example.com/css/main.css?v=1. When you update the styling, you just change the value in the URL, and browsers will treat it as a new resource.
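
Changing that value by hand is easy to forget, so one option is to derive it from the file contents. Below is a minimal sketch of that idea, not a definitive implementation: the CacheBuster class and its AppendVersion method are made-up names, and the choice of SHA-256 and an 8-character version are assumptions of mine.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derives the cache buster from the file contents.
public static class CacheBuster
{
    // Appends a short content hash as the "v" query parameter.
    // The parameter name "v" follows the example URL; the hash length is arbitrary.
    public static string AppendVersion(string url, byte[] fileContents)
    {
        using var sha = SHA256.Create();
        byte[] hash = sha.ComputeHash(fileContents);

        // The first 4 bytes as hex (8 characters) are enough to tell versions apart.
        string version = BitConverter.ToString(hash, 0, 4).Replace("-", "").ToLowerInvariant();

        // Respect a query string that is already present.
        string separator = url.Contains("?") ? "&" : "?";
        return url + separator + "v=" + version;
    }
}
```

Because the value only changes when the file changes, browsers keep using their cached copy until you actually deploy new styling.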

Invalidating your cache

Unfortunately, backend caching is often a lot harder to invalidate properly. Not so much because you have a lot of systems, but because deciding exactly when to invalidate it can be hard. For example, when you have a database, and some cache depends on the data in there, how are you going to invalidate the cache without doing a call to the database every time?

You might be inclined to add a simple version counter to the table and auto-increment it on every change. But now you’re back to calling the database every time. Sure, you’re downloading less data from it, but still. Instead, you might have some “master key” which you invalidate every time the data in the row changes. A proper caching system will allow you to add a related key to your cache entries. When that key is invalidated, all the related items are invalidated with it.
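
To make the related key idea a bit more concrete, here is a rough sketch of a tiny in-memory cache that supports it. DependentCache and its method names are hypothetical and not taken from any particular framework; a real caching library will have its own API for this.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory cache in which entries can be tied to a "master key".
public class DependentCache
{
    private readonly object _sync = new object();
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();
    private readonly Dictionary<string, HashSet<string>> _dependents = new Dictionary<string, HashSet<string>>();

    public void Add(string key, object value, string masterKey)
    {
        lock (_sync)
        {
            _items[key] = value;

            // Remember which entries hang off this master key.
            if (!_dependents.TryGetValue(masterKey, out var keys))
            {
                keys = new HashSet<string>();
                _dependents[masterKey] = keys;
            }
            keys.Add(key);
        }
    }

    public bool TryGet<T>(string key, out T value)
    {
        lock (_sync)
        {
            if (_items.TryGetValue(key, out var stored) && stored is T typed)
            {
                value = typed;
                return true;
            }
        }
        value = default;
        return false;
    }

    // Removes every entry that was added with this master key.
    public void Invalidate(string masterKey)
    {
        lock (_sync)
        {
            if (!_dependents.TryGetValue(masterKey, out var keys)) return;
            foreach (var key in keys) _items.Remove(key);
            _dependents.Remove(masterKey);
        }
    }
}
```

The code that writes the row then only has to call Invalidate with the master key for that row, without knowing which cache entries were built from it.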

If you want to implement this over multiple instances, you can of course add something like Redis or Memcached. Using one of these as a shared cache backend solves the problem, because every instance reads from the same cache. However, when running in the cloud it means another container or service you have to manage. Instead, you could have a background process that detects changes to the cache. When a change is detected, it calls an API endpoint on the other instances to invalidate their cache. However, now every instance needs to know the endpoints of all the others. A solution could be running a service bus (e.g. Azure Service Bus or Amazon EventBridge) and having all instances listen to it. That’s still an extra service, but often cheaper than running Redis or Memcached.
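
The service bus approach could look roughly like the sketch below. To keep it runnable without any cloud account, the bus is an in-process stand-in. IInvalidationBus, InMemoryInvalidationBus and InstanceCache are all names I made up for this example; a real implementation would wrap the client library of whichever bus you pick.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical abstraction over a message bus.
public interface IInvalidationBus
{
    void Publish(string key);
    void Subscribe(Action<string> handler);
}

// In-process stand-in so the sketch runs without any external service.
public class InMemoryInvalidationBus : IInvalidationBus
{
    private readonly List<Action<string>> _handlers = new List<Action<string>>();

    public void Subscribe(Action<string> handler)
    {
        lock (_handlers) _handlers.Add(handler);
    }

    public void Publish(string key)
    {
        // Copy the handlers first so they run outside the lock.
        Action<string>[] snapshot;
        lock (_handlers) snapshot = _handlers.ToArray();
        foreach (var handler in snapshot) handler(key);
    }
}

// Each application instance keeps its own memory cache and listens for invalidations.
public class InstanceCache
{
    private readonly ConcurrentDictionary<string, object> _items = new ConcurrentDictionary<string, object>();
    private readonly IInvalidationBus _bus;

    public InstanceCache(IInvalidationBus bus)
    {
        _bus = bus;
        _bus.Subscribe(key => _items.TryRemove(key, out _));
    }

    public void Set(string key, object value) => _items[key] = value;

    public bool TryGet(string key, out object value) => _items.TryGetValue(key, out value);

    // Call this after the underlying data changed; every instance, including this one, drops the key.
    public void NotifyChanged(string key) => _bus.Publish(key);
}
```

The nice part of this design is that no instance needs to know about the others. They only need to know where the bus is.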

Timed invalidations

While a timed invalidation might seem like a nice workaround, it comes with its own issues. The two most common issues are user expectations and performance.

User expectations might be that an update is visible immediately. But when you have a timed invalidation, the user will have to wait for that time window to pass. Not that big of a deal when it’s 5 minutes, but it’s going to be a lot harder to explain when it takes a day.

Performance might suffer because the data on the backend you’re calling might not have changed yet. So while you might be invalidating your cache of currency conversion rates every 5 minutes, the backend only updates them hourly or daily. Now every time a user comes in after the cache has been invalidated, they’ll have to wait for the backend service instead of just your cache, only to get the same data back.

Timed invalidations work fine when you know the data is updated infrequently, or if you have an easy way to flush the cache when it does change. For example, the number of subscribers doesn’t need to be updated for every new subscriber, so you could easily cache that for a few hours. The same goes for the number of views of a page; even YouTube does this.
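
As an illustration, a timed invalidation with an explicit flush could be sketched like this. TimedCache is a hypothetical class written for this post, and the injectable clock is only there so the expiry can be tested without actually waiting.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache with a fixed time-to-live per entry.
public class TimedCache<T>
{
    private readonly ConcurrentDictionary<string, (T Value, DateTimeOffset ExpiresAt)> _items
        = new ConcurrentDictionary<string, (T Value, DateTimeOffset ExpiresAt)>();
    private readonly TimeSpan _timeToLive;
    private readonly Func<DateTimeOffset> _now;

    // The clock is injectable so the expiry logic can be tested without waiting.
    public TimedCache(TimeSpan timeToLive, Func<DateTimeOffset> now = null)
    {
        _timeToLive = timeToLive;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    public T GetOrAdd(string key, Func<T> load)
    {
        if (_items.TryGetValue(key, out var entry) && entry.ExpiresAt > _now())
        {
            return entry.Value;
        }

        // Expired or missing: go to the backend and start a new time window.
        // Concurrent callers may each call load(); the last write wins.
        T value = load();
        _items[key] = (value, _now() + _timeToLive);
        return value;
    }

    // Explicit flush for when you do know the data changed.
    public void Remove(string key) => _items.TryRemove(key, out _);
}
```

With a time-to-live of a few hours, a subscriber count would hit the backend only once per window, and Remove gives you a way out when someone insists on seeing their change right now.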

Updating after adding to cache

Another issue you might run into, especially with a cache backend that stores objects in memory, is when you update an object after adding it. Let’s take a look at the following code:

[HttpPost]
public IActionResult AnswerQuestion(AnswerModel answer)
{
    var question = _cacheService.Get<Question>("question_hp_20201010");

    if (question == null)
    {
        question = _questionRepository.GetQuestionOfTheDay();
        _cacheService.Add("question_hp_20201010", question);
    }

    // Careful: if the cache holds a reference, this also changes the cached instance.
    question.Answer = answer.Value;

    _questionService.AnswerQuestion(question);

    return View(question);
}

What happens to the cached question after setting the Answer? This depends on the implementation of the _cacheService. Does it serialize the object passed in? Or does it just store a reference to keep everything fast?

When it just stores a reference, you’ll most likely update the item in the cache as well. So the next time you get the question to show it, it’ll already be answered! Worse still, it will be answered with someone else’s data and not some kind of default. If this happens in a form that collects personal information (even if it’s only a newsletter subscription form), you have a hidden data leak.

Doing a deep copy or serializing the object prevents this, but takes a lot longer to perform.
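
A minimal sketch of the serializing approach, assuming System.Text.Json is available (it ships with .NET Core 3.0 and later). SnapshotCache is a hypothetical name, and the Question class here is only a stand-in, since the real type isn’t shown in this post.

```csharp
using System.Collections.Concurrent;
using System.Text.Json;

// Hypothetical cache that stores a JSON snapshot instead of the object itself.
public class SnapshotCache
{
    private readonly ConcurrentDictionary<string, string> _items = new ConcurrentDictionary<string, string>();

    public void Add<T>(string key, T value)
    {
        // Serializing here means later changes to 'value' cannot reach the cache.
        _items[key] = JsonSerializer.Serialize(value);
    }

    public T Get<T>(string key) where T : class
    {
        // Every caller gets a fresh copy; returns null on a cache miss.
        return _items.TryGetValue(key, out var json)
            ? JsonSerializer.Deserialize<T>(json)
            : null;
    }
}

// Minimal stand-in for the Question type; its real shape is an assumption.
public class Question
{
    public string Text { get; set; }
    public string Answer { get; set; }
}
```

Setting the Answer on an object you got from this cache, or on the object you just added, no longer touches what other requests will read. The price is a serialize on every Add and a deserialize on every Get.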

Reading while someone else is writing

This is probably one of the hardest things to solve, or at least to solve without degrading performance. The simplest solution would be to just use whatever your language has for locking. In C#, this would look something like this:

using System.Collections.Generic;

public class CacheService
{
	// One lock guards every read and write of the dictionary.
	private static readonly object cacheLock = new object();
	private static readonly Dictionary<string, object> cacheDict = new Dictionary<string, object>();

	public static void Add(string key, object value)
	{
		lock (cacheLock)
		{
			cacheDict[key] = value;
		}
	}

	public static object Get(string key)
	{
		lock (cacheLock)
		{
			// Throws KeyNotFoundException when the key is missing.
			return cacheDict[key];
		}
	}
}

Note that I do not recommend using anything like this in production. Not only will it throw an exception when you get a cache key that doesn’t exist, it’ll also perform terribly, because only one thread at a time can read from or write to the cache. It is, however, perfectly safe to use from multiple threads because of the locking.

This isn’t something with a simple solution. My best advice here would be to use whatever your platform offers.
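
In .NET, for example, the platform offers ConcurrentDictionary, which handles concurrent readers and writers for you. The wrapper below is only a sketch; ConcurrentCacheService and its methods are names I made up, and it deliberately leaves out expiration and size limits.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical wrapper around ConcurrentDictionary; no expiration or size limit.
public class ConcurrentCacheService
{
    private readonly ConcurrentDictionary<string, object> _items = new ConcurrentDictionary<string, object>();

    public void Add(string key, object value) => _items[key] = value;

    // Returns false instead of throwing when the key is missing or has another type.
    public bool TryGet<T>(string key, out T value)
    {
        if (_items.TryGetValue(key, out var stored) && stored is T typed)
        {
            value = typed;
            return true;
        }
        value = default;
        return false;
    }

    // Note: under contention the factory may run more than once; only one result is kept.
    public T GetOrAdd<T>(string key, Func<T> factory) => (T)_items.GetOrAdd(key, _ => factory());
}
```

Reads don’t block each other here, and a missing key is a normal result instead of an exception.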

How are CMS systems handling this?

Episerver has the ISynchronizedObjectInstanceCache for this. This works by using memory caching on all instances, and synchronizing this between servers via the Azure Service Bus.

Umbraco has the ICacheRefresher interface that can keep your cache up-to-date. A few sample implementations can be found on their GitHub.

And when in doubt, your platform probably has some caching features as well. For ASP.NET applications, have a look at ASP.NET Caching Overview. For .NET Core, the MemoryCache class might be of use to you.

Further reading

This blog post barely scratches the surface of caching and its difficulties. There are far more of them than a single blog post could ever cover, at least while staying at a readable length.

If you want to know more about this, have a look at your platform documentation. And if you want to know more about the nitty-gritty details, you could always look at how the different caching strategies work. Keywords for that are read-through, write-through and write-back.

Sanne Bregman

.NET Developer