Cache-aside is one of the most commonly used caching strategies. Here the cache and the database are independent, and it is the responsibility of the application code to manage both of them to maintain data consistency. Let’s understand this from another perspective!
Some applications use read-through or write-through patterns, where the cache system defines the logic for updating or invalidating the cache and acts as a transparent interface to the application. If the cache system does not provide these features, it is the responsibility of the application code to manage cache lookups, cache updates on writes, and database queries on cache misses.
This is where the cache-aside pattern comes into the picture: the application interacts with both the cache and the database, and the cache doesn’t interact with the database at all. So the cache is “kept aside” as a scalable in-memory data store.
For reading data, the application first checks the cache. If the data is available (cache hit), it is returned directly to the application. If the data is not available (cache miss), the application code retrieves it from the database, updates the cache, and returns the data to the user.
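To make this concrete, here is a minimal sketch of the read path in Python. The `cache` and `db` objects and their `get`, `set`, and `query_user` methods are hypothetical placeholders, not a specific library:

```python
def get_user(user_id, cache, db):
    """Cache-aside read: check the cache first, fall back to the database."""
    key = f"user:{user_id}"

    user = cache.get(key)          # 1. Look up the cache
    if user is not None:           # 2. Cache hit: return cached data directly
        return user

    user = db.query_user(user_id)  # 3. Cache miss: read from the database
    if user is not None:
        cache.set(key, user)       # 4. Populate the cache for future reads
    return user
```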
Now, the critical question is: How do we perform write operations? For writing data (add, modify, or delete), we can think of two approaches.
One approach is to update the database and invalidate or evict the corresponding data in the cache (if it exists there). This ensures that the next read operation retrieves the updated data from the database. So, when the same data is read again, there will be a cache miss, and the application will retrieve it from the database and add it back to the cache.
Here is one observation: after updating the database, the data will not be stored in the cache until the next read request for the same data. So this strategy loads data into the cache on demand (lazy loading). It makes no assumptions about which data the application will require in advance.
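A sketch of this write path, using the same hypothetical `cache` and `db` objects as above:

```python
def update_user(user_id, new_data, cache, db):
    """Cache-aside write (invalidation variant):
    update the database first, then evict the cached copy."""
    db.update_user(user_id, new_data)   # 1. Write to the database
    cache.delete(f"user:{user_id}")     # 2. Invalidate the cache entry (if present)
    # The next read of this user will miss the cache and lazily
    # reload the fresh value from the database.
```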
There is a critical question: Can we invalidate the cache first and then write to the database? If we invalidate the cache first, there is a small window of time during which a user might fetch the data before the database is updated. This will result in a cache miss (because the data was removed from the cache), and the earlier version of the data will be fetched from the database and added back to the cache. This can leave stale data in the cache.
The second approach is similar to the write-through strategy: after updating the database, the application also updates the cache with the latest modified data. This improves subsequent read performance, reduces the likelihood of stale data, and helps keep the cache and database consistent.
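A sketch of this variant, again with hypothetical `cache` and `db` objects:

```python
def update_user_and_cache(user_id, new_data, cache, db):
    """Write variant that refreshes the cache after a successful database write."""
    db.update_user(user_id, new_data)        # 1. Write to the database
    cache.set(f"user:{user_id}", new_data)   # 2. Update the cache with the new value
    # Subsequent reads are served from the cache without touching the database.
```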
In the cache-aside pattern, the application can try to keep the cached data as up to date as possible. But in practice, cached data can become inconsistent with the data in the database. For example, an item in the database can be changed at any time by some external process, and this change might not be reflected in the cache until the next read of the same item. This can be a major issue in systems that replicate data across data stores, where synchronization occurs frequently.
So application developers should use proper cache invalidation schemes or expiration policies to handle such inconsistency. Here is another example: suppose the cache is updated successfully but the database write fails. The code needs to implement retries, and in the worst case, while the retries keep failing, the cache contains a value that the database doesn’t.
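One possible way to handle this scenario is sketched below. This is only an illustration, not a complete solution: the retry count, the exponential backoff, and the choice to roll back by evicting the cache entry are all assumptions.

```python
import time

def update_user_with_retry(user_id, new_data, cache, db, max_retries=3):
    """Retry the database write; if it keeps failing, evict the cache entry
    so the cache does not keep a value the database doesn't have."""
    cache.set(f"user:{user_id}", new_data)   # cache already holds the new value
    for attempt in range(max_retries):
        try:
            db.update_user(user_id, new_data)
            return True                      # database and cache are now consistent
        except Exception:
            time.sleep(2 ** attempt)         # simple exponential backoff before retrying
    cache.delete(f"user:{user_id}")          # roll back the cache on persistent failure
    return False
```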
If an application repeatedly accesses the same data, cache-aside can be used with a local (in-memory) cache. Here the local cache is private to each application instance. This creates a challenge: if multiple instances are dealing with the same data, each instance might store its own copy in its local cache, so the cached data across instances could become inconsistent. For example, if one instance updates the shared database, the other instances’ caches won’t be updated automatically, resulting in stale or outdated data being served.
To address this limitation of local caching, we recommend exploring distributed caching. Here the system maintains a centralized cache accessible to all instances of the application, so all instances can read and update the same cached data (better consistency). If required, we can also spread the cache across multiple nodes (improving scalability and fault tolerance).
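For example, the read path shown earlier can point at a shared cache such as Redis. Here is a sketch using the redis-py client; the hostname, key format, and `load_user_profile` helper are assumptions for illustration:

```python
import json
import redis

# A shared cache that every application instance connects to
# (assumes a Redis server is reachable at this address).
shared_cache = redis.Redis(host="cache.internal", port=6379, decode_responses=True)

def get_user_profile(user_id, db):
    key = f"user:{user_id}"
    cached = shared_cache.get(key)
    if cached is not None:
        return json.loads(cached)                    # served from the shared cache

    profile = db.load_user_profile(user_id)          # cache miss: read the database
    if profile is not None:
        shared_cache.set(key, json.dumps(profile))   # now visible to all instances
    return profile
```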
What will happen when cached data is not refreshed for a long period? As discussed above, there is a chance that it becomes stale. To solve this inconsistency problem, one solution is to implement an expiration policy in the application code.
With a cache expiration policy, each item in the cache is associated with a time-to-live (TTL) value, which specifies how long the item can remain in the cache before it is considered expired. When the data expires, the application is forced to fetch the latest data from the main data source and update the cache with the fresh value. By doing this at regular intervals, the application can keep the cached data up to date.
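A minimal sketch of such an expiration policy implemented in application code (the default TTL of 300 seconds is just an example value):

```python
import time

class TTLCache:
    """Minimal in-memory cache where each entry carries a time-to-live (TTL)."""

    def __init__(self, default_ttl=300):
        self.default_ttl = default_ttl     # seconds
        self._store = {}                   # key -> (value, expires_at)

    def set(self, key, value, ttl=None):
        expires_at = time.time() + (ttl if ttl is not None else self.default_ttl)
        self._store[key] = (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None                    # never cached
        value, expires_at = entry
        if time.time() >= expires_at:
            del self._store[key]           # expired: force a fresh read from the database
            return None
        return value
```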
But it may result in unnecessary evictions if the data remains valid for longer than the TTL. So the question is: what is the best value for the TTL? For this, we need to ensure that the expiration policy matches the access pattern of the data in the application.
Two ideas are important for determining an appropriate TTL value: 1) understanding the rate of change of the underlying data, and 2) evaluating the risk of outdated data being returned to your application. For example, we can cache static data (rarely updated data) with a longer TTL and dynamic data (data that changes often) with a shorter TTL. This lowers the risk of returning outdated data while still providing a buffer to offload database requests.
When caching multiple items, if they are all stored around the same time with the same fixed TTL, they will expire simultaneously after that duration. This can lead to a “cache stampede” or “thundering herd,” where all the expired items are requested and refreshed from the database at the same time. Such a sudden surge in requests can cause high load and strain on the database.
To mitigate this problem, the TTL-with-jitter technique (TTL = initial TTL value + jitter) introduces randomness into the TTL values of individual cached items. Instead of giving all items the same fixed TTL, a small random delta or offset is added to each item’s TTL. This spreads out the expiration times of the cached items, so they will not all expire simultaneously.
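A small sketch of this idea (the base TTL and jitter range are arbitrary example values):

```python
import random

BASE_TTL = 300     # seconds
MAX_JITTER = 60    # seconds

def ttl_with_jitter():
    """Spread out expirations: TTL = base TTL + a small random offset."""
    return BASE_TTL + random.randint(0, MAX_JITTER)

# Example usage with the TTLCache sketch above (or a Redis set(key, value, ex=ttl) call):
# cache.set("user:42", user_data, ttl=ttl_with_jitter())
```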
What will happen when the cache is full? We should implement a cache replacement policy in the application code to handle this situation. The goal is to evict some existing cache entries to make space for new ones. Two common cache replacement policies are least recently used (LRU), which evicts the entry that has not been accessed for the longest time, and least frequently used (LFU), which evicts the entry with the fewest accesses.
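A minimal LRU sketch built on Python’s `OrderedDict` (the capacity value is arbitrary):

```python
from collections import OrderedDict

class LRUCache:
    """Least-recently-used cache: when full, evict the entry
    that has not been accessed for the longest time."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self._store = OrderedDict()          # key -> value, ordered by recency

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)         # mark as most recently used
        return self._store[key]

    def set(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used entry
```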
Instead of waiting for actual data requests from the application to trigger caching, some applications proactively load the cache with relevant data beforehand. This is the idea of cache priming: populating the cache, during the application’s startup or initialization process, with data it is expected to require.
One of the primary reasons for priming the cache is to speed up the system from the very start. The reason is simple: when actual requests arrive, there is a higher chance that the requested data is already in the cache. So in scenarios with heavy read traffic, this reduces initial read latency and the initial load on the database.
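A hypothetical priming hook might look like the sketch below; the `load_popular_products` query and the choice of what to preload are assumptions for illustration.

```python
def prime_cache(cache, db):
    """Preload data the application is expected to need, so early
    requests hit the cache instead of the database."""
    for product in db.load_popular_products(limit=100):   # assumed query
        cache.set(f"product:{product['id']}", product)

# Called once during application startup/initialization, before serving traffic.
```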
The effectiveness of cache priming depends on understanding the data access patterns and choosing the right data to preload into the cache. Over-priming or preloading unnecessary data can lead to wasted cache space and reduced overall cache performance.
Please write in the message below if you want to share some feedback or if you want to share more insight. Enjoy learning, Enjoy system design!