For large-scale applications, system design provides a high-level understanding of a system's components and their relationships. It helps us define the architecture by breaking the system down into smaller components. At the same time, effective system design can determine the success of a software project by ensuring that the resulting system is well-structured, maintainable, and meets the needs of its users.
So tech professionals must understand various system design concepts to make informed decisions about the trade-offs involved. System design is also one of the important topics that tech companies ask about during interviews.
System design deals with designing the structure of a software system to ensure that it meets desired functionality, performance, scalability, reliability, and other quality attributes. It is the process of defining the architecture, components, data, etc., for a system to satisfy functional and non-functional requirements.
When designing such systems, it is essential to have an excellent understanding of key system design terms like scalability (horizontal and vertical), caching, load balancing, data partitioning, various types of databases, network protocols, database sharding, etc. Similarly, one must consider various trade-offs like latency vs throughput, performance vs scalability, consistency vs availability, etc. Overall, system design is an open-ended discussion topic. That's why most tech firms prefer to include system design interview rounds.
Now let’s dive in to get familiar with the essential concepts used in system design.
Availability means that the system should remain up and return a response to every client request. In other words, whenever a user wants to use the service, the system must be available to satisfy that request.
Availability can be quantified by measuring the percentage of time the system remains operational in a given time window. We usually express the availability of a system in terms of the number of 9s: for example, 99.9% availability ("three nines") allows roughly 8.8 hours of downtime per year, while 99.999% ("five nines") allows only about 5 minutes.
To design a highly available system, we need to consider several factors like replication, load balancing, failover mechanisms, disaster recovery strategies, etc. Our goal should be to minimize downtime and ensure that the system is always accessible and functioning as intended.
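As a quick sanity check, the downtime budget implied by a given number of 9s can be computed directly. The short sketch below (plain Python, with a few illustrative availability values) shows the arithmetic:

```python
# Convert an availability target (e.g. "three nines") into an annual downtime budget.
MINUTES_PER_YEAR = 365 * 24 * 60

def downtime_per_year(availability_percent: float) -> float:
    """Return the allowed downtime in minutes per year for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

for nines in (99.0, 99.9, 99.99, 99.999):
    print(f"{nines}% availability -> {downtime_per_year(nines):.1f} minutes of downtime/year")
```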
In computer systems, throughput is one of the key metrics used to measure the performance and capacity of a system in handling tasks, transactions, or operations within a given time interval. Similarly, in the context of a network, it represents the amount of data transferred successfully over the network within a certain timeframe.
High throughput is crucial when handling large volumes of data or a high number of concurrent requests. One effective way to achieve high throughput is to split the requests and distribute them across multiple machines.
Latency is another important metric related to system performance. It measures the time taken to produce a result: in other words, the time between a client sending a request and receiving the response from the server.
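To make the two metrics concrete, here is a minimal sketch that measures both for a batch of requests; `handle_request` is a hypothetical stand-in for whatever work your server actually does.

```python
import time

def handle_request() -> None:
    """Hypothetical request handler; replace with real work (DB query, RPC, etc.)."""
    time.sleep(0.01)  # simulate 10 ms of processing

num_requests = 100
latencies = []

start = time.perf_counter()
for _ in range(num_requests):
    t0 = time.perf_counter()
    handle_request()
    latencies.append(time.perf_counter() - t0)   # latency: time per request
elapsed = time.perf_counter() - start

print(f"average latency : {sum(latencies) / len(latencies) * 1000:.1f} ms")
print(f"throughput      : {num_requests / elapsed:.1f} requests/second")
```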
Almost every system depends on a network. It acts as the platform for communication between clients and servers, or between different servers. Protocols, in turn, are the rules that govern how machines communicate over the network. Some of the common network protocols are HTTP, TCP/IP, etc.
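For example, the snippet below uses Python's standard http.client module to issue a plain HTTP GET (example.com is used here only as a placeholder host); under the hood, the request travels over a TCP connection.

```python
import http.client

# Open a TCP connection to the server and speak HTTP over it.
conn = http.client.HTTPSConnection("example.com", timeout=5)
conn.request("GET", "/")          # HTTP request line + headers
response = conn.getresponse()     # HTTP status line, headers, and body

print(response.status, response.reason)   # e.g. 200 OK
body = response.read()
print(f"received {len(body)} bytes")
conn.close()
```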
Load balancers are machines that distribute the load among various servers. To scale the system, we can keep adding more servers, so there must be a way to direct requests to these servers such that no single server gets overloaded. In other words, load balancers help prevent a single point of failure.
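A very common distribution strategy is round robin, where requests are sent to each server in turn. Below is a minimal sketch of that idea (the server addresses and request loop are purely illustrative):

```python
import itertools

# Pool of backend servers (illustrative addresses).
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

# itertools.cycle yields the servers in a repeating round-robin order.
round_robin = itertools.cycle(servers)

def pick_server() -> str:
    """Return the next server that should receive a request."""
    return next(round_robin)

for request_id in range(6):
    print(f"request {request_id} -> {pick_server()}")
```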
A proxy sits between the client and the server. When a client sends a request, it passes through the proxy before reaching the server. Proxies are of two types: forward proxy and reverse proxy. A forward proxy acts as a mask for clients and hides the client's identity from the server. Similarly, a reverse proxy acts as a mask for servers and hides the server's identity from the client.
Systems are associated with data that needs to be stored for later retrieval. One way to classify databases is by their structure, which can be either relational or non-relational. Relational databases, such as MySQL and PostgreSQL, enforce strict relationships among data and have a highly structured organization. Non-relational databases, such as Cassandra and Redis, have flexible structures and can store both structured and unstructured data. They are often used in highly distributed systems that require high speed.
Database partitioning is a way of dividing a database into smaller chunks to increase the performance of the system. It is another critical technique used to improve latency and throughput so that more requests can be served.
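One simple partitioning scheme is hash-based partitioning, where a hash of a row's key decides which partition (or shard) the row lives on. A minimal sketch of that idea, with a made-up set of user IDs and three partitions, might look like this:

```python
import hashlib

NUM_PARTITIONS = 3

def partition_for(key: str) -> int:
    """Map a row key to one of NUM_PARTITIONS partitions using a stable hash."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

# Each partition could live on a different database server.
partitions = {i: [] for i in range(NUM_PARTITIONS)}

for user_id in ["alice", "bob", "carol", "dave", "erin"]:
    partitions[partition_for(user_id)].append(user_id)

for pid, rows in partitions.items():
    print(f"partition {pid}: {rows}")
```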
Relational and non-relational databases offer different consistency guarantees. Relational databases are associated with ACID compliance, while non-relational databases are associated with BASE compliance.
ACID: Atomicity, Consistency, Isolation, Durability
ACID compliance preserves the relational nature of databases and ensures that transactions are executed in a controlled, reliable manner. A transaction is a single logical unit of work performed against the database.
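As a small illustration of atomicity, the sketch below uses Python's built-in sqlite3 module: a money transfer either commits fully or is rolled back entirely. The accounts table and amounts are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(amount: int, src: str, dst: str) -> None:
    """Move money between accounts as a single atomic transaction."""
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        conn.commit()       # both updates become visible together
    except sqlite3.Error:
        conn.rollback()     # neither update is applied
        raise

transfer(30, "alice", "bob")
print(conn.execute("SELECT * FROM accounts").fetchall())  # [('alice', 70), ('bob', 80)]
```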
BASE: Basically Available, Soft State, Eventual Consistency
BASE compliance maintains the integrity of NoSQL databases and ensures their proper functioning. It is a key factor in the scalability of NoSQL databases.
While designing any application, one needs to be clear about the type of storage that suits the system requirements. If the system is distributed in nature and scalability is essential, then NoSQL databases are the best choice. NoSQL databases are also preferred when the amount of data is huge.
SQL databases are favourable when the structure of the data is more important. They are preferred when queries are complex and the database requires fewer updates. However, there is always a trade-off when choosing between a NoSQL and a SQL database. Sometimes, depending on business requirements, a polyglot architecture comprising both SQL and NoSQL databases is used to meet application performance goals.
As a service grows and the number of requests increases, the system can become slow and performance may suffer. Scaling is the best way to address this issue: it involves increasing the capacity of the system to handle more requests. There are two main ways to scale: horizontal scaling and vertical scaling.
Horizontal scaling: Adding more servers to distribute requests. This allows the system to handle more traffic by spreading it across multiple servers.
Vertical scaling: Increasing the capacity of a single machine by upgrading its resources like CPU, RAM, etc. This allows the system to handle more traffic by increasing the capacity of a single machine.
Caching helps to improve the performance of a system by reducing its latency. It involves storing frequently used data in a cache so that it can be accessed quickly, instead of querying the database. However, implementing a cache also adds complexity because it is important to maintain consistency between data in the cache and data in the main database.
Cache memory is relatively expensive, so the size of a cache is limited. To ensure the best performance, various cache eviction algorithms like LIFO, FIFO, LRU, and LFU are used to manage the cache. These algorithms determine which data should be removed from the cache when space is needed.
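As an illustration, here is a minimal LRU (Least Recently Used) cache sketch built on Python's OrderedDict; when the cache is full, the entry that was used least recently is evicted. The keys and values are made up for the example.

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache: evicts the least recently used key when capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()   # keys ordered from least to most recently used

    def get(self, key):
        if key not in self.items:
            return None              # cache miss: caller falls back to the database
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(capacity=2)
cache.put("user:1", {"name": "Alice"})
cache.put("user:2", {"name": "Bob"})
cache.get("user:1")                      # user:1 is now the most recently used
cache.put("user:3", {"name": "Carol"})   # evicts user:2
print(cache.get("user:2"))               # None (evicted)
```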
Consistent hashing is a widely used concept in distributed systems because it offers flexibility in scaling. It is an improvement over traditional hash-based distribution, which becomes ineffective when servers are added or removed because most keys then have to be remapped.
In consistent hashing, users and servers are placed on a virtual circular structure called a hash ring. The ring is considered infinite and can accommodate any number of servers, which are assigned positions based on a hash function. This allows requests or data to be distributed efficiently among servers and helps us achieve horizontal scaling, which increases throughput and reduces latency.
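A minimal sketch of a hash ring is shown below (MD5 is used only as a convenient hash function, and the server names are made up): each key is assigned to the first server found clockwise from the key's position on the ring.

```python
import bisect
import hashlib

def ring_position(value: str) -> int:
    """Map a server name or key to a position on the hash ring."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

servers = ["server-a", "server-b", "server-c"]
ring = sorted((ring_position(s), s) for s in servers)
positions = [pos for pos, _ in ring]

def server_for(key: str) -> str:
    """Walk clockwise from the key's position to the next server on the ring."""
    idx = bisect.bisect(positions, ring_position(key)) % len(ring)
    return ring[idx][1]

for key in ["user:1", "user:2", "photo:99"]:
    print(f"{key} -> {server_for(key)}")
```

Real implementations usually place several virtual nodes per server on the ring so that the load is spread more evenly.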
The CAP theorem is an essential concept for designing networked shared-data systems. It states that a distributed database system can provide only two of these three properties at the same time: consistency, availability, and partition tolerance. We can make trade-offs between these properties based on the use case of our system.
System design is an essential skill to have and is equally important from the interview point of view. One needs to be well aware of all the trade-offs while designing any system. In this blog, we tried to cover some basic concepts necessary for getting started.
We look forward to your feedback in the message below. Enjoy learning, enjoy system design!