Have you ever noticed that a website takes longer to load on the first visit but loads significantly faster on subsequent visits? This happens because of caching, a technique used to improve the performance of distributed systems. In this blog, we will discuss the caching concept, how it works, and its types.
Caching is the process of storing frequently accessed data in temporary storage called a cache to improve the speed of data access. Now, the question arises: what is a cache? A cache is a high-speed storage layer that holds a small proportion of critical data so that requests for that data can be served faster.
Now, let's understand how caching works in practice when a user requests some data: the application first checks the cache. If the data is present (a cache hit), it is returned immediately. If not (a cache miss), the application fetches the data from the database, stores a copy in the cache, and then returns it to the user.
The above idea is one of the most commonly used caching strategies: the cache-aside strategy. Here, the cache and the database are independent, and the application code is responsible for managing operations on both. There are other caching strategies, such as the read-through, write-through, write-around, and write-back strategies. We will discuss these later in this blog.
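To make this concrete, here is a minimal sketch of the cache-aside pattern in Python; the `cache` dictionary and the `db_get` function are stand-ins for a real cache server and database client:

```python
# Minimal cache-aside sketch: the application code manages both
# the cache and the database itself.

cache = {}

def db_get(key):
    # Placeholder for a real database read
    return f"value-for-{key}"

def get(key):
    # 1. Check the cache first
    if key in cache:
        return cache[key]          # cache hit
    # 2. On a cache miss, read from the database
    value = db_get(key)
    # 3. Store the result in the cache for future requests
    cache[key] = value
    return value

print(get("user:42"))  # miss: reads the database and fills the cache
print(get("user:42"))  # hit: served directly from the cache
```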
Note: Cache hits and misses are critical metrics for measuring cache performance. The hit ratio (hits divided by total requests) tells us how effectively the cache is serving requests.
Cache eviction policies are algorithms that manage the data stored in a cache. When the cache is full, some data must be removed to make room for new data, and the eviction policy determines which data to remove based on certain criteria. Common cache eviction policies include Least Recently Used (LRU), Least Frequently Used (LFU), First In First Out (FIFO), and Random Replacement.
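As an illustration, here is a minimal Least Recently Used (LRU) cache sketch built on Python's `collections.OrderedDict`; production caches like Redis or Memcached implement eviction internally, so this is only to show the idea:

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                      # cache miss
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" becomes most recently used
cache.put("c", 3)       # capacity exceeded: evicts "b"
print(cache.get("b"))   # None, because "b" was evicted
```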
When data in the database is constantly being updated, it is important to ensure that the cache is also updated to reflect these changes. Otherwise, the application will serve outdated or stale data. So we use cache invalidation techniques to maintain cache consistency.
We mostly use five types of caching strategies to solve the problem of cache invalidation: cache-aside, read-through, write-through, write-around, and write-back. We have already discussed the cache-aside strategy in the section above, so let's discuss the rest of them.
In the read-through strategy, the cache sits between the application code and the database. In case of a cache miss, the cache itself retrieves the data from the database, stores it, and returns it to the application. This makes read-through caches a good choice for read-heavy systems.
Cache-aside and read-through look similar, but there is a key difference: in cache-aside, the application code is responsible for retrieving data from the database and storing it in the cache, while in read-through, the cache provider handles this. So the read-through strategy simplifies application code by abstracting away the complexity of cache management.
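Here is a rough sketch of the read-through idea: the application talks only to the cache object, and the cache itself loads data on a miss. The `load_from_db` function is an assumption standing in for a real database query:

```python
# Read-through sketch: the cache, not the application, loads data on a miss.

class ReadThroughCache:
    def __init__(self, loader):
        self.loader = loader      # function the cache calls on a miss
        self.store = {}

    def get(self, key):
        if key not in self.store:
            # The cache provider fetches from the database itself
            self.store[key] = self.loader(key)
        return self.store[key]

def load_from_db(key):
    # Placeholder for a real database query
    return f"value-for-{key}"

cache = ReadThroughCache(loader=load_from_db)
print(cache.get("product:7"))   # miss: the cache loads from the database
print(cache.get("product:7"))   # hit: served from the cache
```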
In both of these strategies, the most basic approach to writes is to update the database directly. But then the cache can become inconsistent with the database. So what is the solution?
In a write-through cache, writes are first made to the cache and then to the database, and the write operation is considered successful only if both writes succeed. This keeps the cache and database consistent and reduces the risk of data loss, since every write is persisted in the database.
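A minimal write-through sketch in Python, using in-memory dictionaries as stand-ins for the cache and database:

```python
# Write-through sketch: every write goes to the cache and the database
# before the operation is acknowledged.

cache = {}
database = {}

def write_through(key, value):
    cache[key] = value        # 1. write to the cache
    database[key] = value     # 2. write to the database
    return True               # acknowledged only after both succeed

write_through("order:1", {"status": "paid"})
assert cache["order:1"] == database["order:1"]   # cache and DB stay consistent
```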
In a write-around cache, write operations bypass the cache and go directly to the database. Because the cache is not updated on writes, recently written data may not be available in the cache, so a read for that data will miss the cache until it is loaded again.
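Here is a corresponding write-around sketch with the same kind of in-memory stand-ins; dropping any stale cache entry on write is an optional companion step shown only for illustration:

```python
# Write-around sketch: writes go only to the database; the cache is
# filled later, on a read (for example via cache-aside).

cache = {}
database = {}

def write_around(key, value):
    database[key] = value       # write directly to the database
    cache.pop(key, None)        # optional: drop any stale cached copy

write_around("order:1", {"status": "shipped"})
print(cache.get("order:1"))     # None: the cache is only filled on a later read
```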
The write-back cache is used in systems with high write activity to improve write performance. Writes are temporarily stored in a cache layer, acknowledged quickly, and then asynchronously written to the database. This results in lower write latency and higher write throughput.
However, this technique carries the risk of data loss if the cache layer fails, because the cache holds the only copy of the written data until it is flushed to the database. To minimize this risk, it is recommended to have multiple cache replicas acknowledge each write. This way, if one cache fails, the data can still be recovered from another replica.
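A simplified write-back sketch, again with in-memory stand-ins; the asynchronous flush is modeled here as an explicit `flush()` call, whereas a real system would run it in the background:

```python
# Write-back (write-behind) sketch: writes are acknowledged once they hit
# the cache and are persisted to the database later, in the background.

cache = {}
database = {}
dirty_keys = set()          # keys written to the cache but not yet persisted

def write_back(key, value):
    cache[key] = value
    dirty_keys.add(key)     # acknowledged here: low write latency

def flush():
    # In a real system this runs asynchronously (timer, queue, or batch job).
    for key in list(dirty_keys):
        database[key] = cache[key]
        dirty_keys.discard(key)

write_back("counter:views", 105)
write_back("counter:views", 106)   # repeated writes coalesce in the cache
flush()                            # only the latest value reaches the database
print(database["counter:views"])   # 106
```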
If you're wondering how websites load quickly on subsequent visits, one of the reasons is browser caching. The browser temporarily stores resources like images, HTML, and JavaScript files in a local cache. When you revisit the same website, the browser retrieves these resources from the cache instead of the network. This is also known as client-side caching.
The browser cache has limited capacity and stores resources only for a specific duration. When the cache reaches capacity, or a resource passes its expiration date, the browser evicts it and retrieves an updated copy from the network on the next visit. Users can also clear their browser cache manually.
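Under the hood, browser caching is driven by HTTP response headers such as `Cache-Control`. Here is a minimal sketch using Python's built-in `http.server` module; the response body and the one-day `max-age` value are illustrative assumptions:

```python
# Sketch: a server hinting browsers to cache a response for one day
# via the Cache-Control header.
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"console.log('hello');"            # pretend static JS file
        self.send_response(200)
        self.send_header("Content-Type", "application/javascript")
        self.send_header("Cache-Control", "max-age=86400")  # cache for 24 hours
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), Handler).serve_forever()
```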
Web server caching improves performance by storing resources on the server side, which reduces the load on the server. There are several ways to implement it, such as a reverse proxy cache or a key-value store like Memcached or Redis.
The reverse proxy cache acts as an intermediary between the browser and the web server. When a user makes a request, the reverse proxy checks whether it has a copy of the requested data. If it does, it serves the cached version to the user instead of forwarding the request to the web server.
We can also use a key-value database such as Memcached or Redis to cache application data. These databases are typically accessed directly by the application code. Unlike reverse proxies, which cache HTTP responses for specific requests, key-value databases can cache any user-specific or frequently accessed data based on need.
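As an example, here is a rough sketch of application-level caching with Redis using the `redis-py` client; the key format, the five-minute TTL, and the `fetch_user_from_db` helper are assumptions made for illustration:

```python
# Sketch: caching application data in Redis with a time-to-live (TTL).
# Requires the `redis` package and a running Redis server.
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def fetch_user_from_db(user_id):
    # Placeholder for a real database query
    return {"id": user_id, "name": "Alice"}

def get_user(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)              # cache hit
    user = fetch_user_from_db(user_id)         # cache miss: query the database
    r.setex(key, 300, json.dumps(user))        # cache for 5 minutes
    return user

print(get_user(42))
```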
A Content Delivery Network (CDN) is a group of proxy servers designed to improve the delivery speed of static content like web pages, images, videos, and other media files. These proxy servers are placed at strategic locations around the world to reduce the distance between the end user and the origin server, which reduces latency.
When a user requests content from a website that uses a CDN, the CDN fetches the content from the origin server and stores a copy of it. If the user requests the same content again, the CDN serves the cached copy directly rather than fetching it from the origin server again.
Think of a CDN like a chain of grocery stores: instead of traveling all the way to the farms where food is grown, which could be hundreds of miles away, customers go to their local grocery store. The store stocks food from those faraway farms, so customers can get what they need in minutes rather than days.
Distributed caching is the practice of using multiple caching servers spread across a network. Unlike traditional caches, which are usually limited to the memory of a single machine, a distributed cache can scale beyond that limit by linking together multiple machines into a distributed cluster.
In distributed caching, each caching server maintains a portion of the cached data, and requests for data are directed to the appropriate server based on a hashing algorithm or some distribution strategy.
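As a simple illustration, here is a sketch of how a client could route keys to cache servers using modulo hashing; the server addresses are placeholders, and real systems often prefer consistent hashing so that adding or removing a server re-maps only a small portion of the keys:

```python
# Sketch: routing keys to cache servers with modulo hashing.
import hashlib

servers = ["cache-1:6379", "cache-2:6379", "cache-3:6379"]

def pick_server(key):
    digest = hashlib.md5(key.encode()).hexdigest()
    index = int(digest, 16) % len(servers)   # map the key to one server
    return servers[index]

print(pick_server("user:42"))      # every client maps this key to the same server
print(pick_server("product:7"))    # different keys spread across servers
```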
In summary, caching is a useful technique for improving performance, reducing cost, and increasing the scalability of a system by storing frequently accessed data in a fast storage layer called a cache!
Thanks to Chiranjeev and Navtosh for their contribution in creating the first version of this content. If you have any queries or feedback, please write us at contact@enjoyalgorithms.com. Enjoy system design!