Difficulty: Medium, Asked-in: Microsoft, Amazon, Goldman Sachs, Intel, Yahoo, Qualcomm, Bloomberg.
Merge sort is one of the fastest algorithms for sorting an array (or linked list) in O(nlogn) time. Before exploring the design and analysis of merge sort, let's understand its importance from various angles:
Suppose we need to sort an array A[l...r] of n integers starting from index l and ending at index r. The critical question is: Can we solve the sorting problem of size n using the solution of smaller sub-problems or applying the idea of recursion? Let's think!
If we observe the above diagram, the idea looks like this:
Divide: Divide the problem of size n into two equal sub-problems of size n/2.
Conquer: Solve each sub-problems recursively i.e. sort both smaller subarrays recursively using the same function. During this process, we should not worry about the solution to the sub-problems because the recursive call to the subproblems will do this work for us. This is a recursive leap of faith!
Combine part: Merge both sorted subarrays to get the final sorted array. In other words, combine the solution of two n/2 size sub-problems to solve sorting problems of size n.
Suppose the function mergeSort(int A[], int l, int r) sorts the entire array A[].
void mergeSort(int A[], int l, int r)
{
if(l >= r)
return
int mid = l + (r - l)/2
mergeSort(A, l, mid)
mergeSort(A, mid + 1, r)
merge(A, l, mid, r)
}
After the conquer step, both left part A[l, mid] and right part A[mid + 1, r] will be sorted. Now we need to combine the solution of smaller sub-problems to build a solution to the larger problem, i.e., merging both sorted halves to create the larger sorted array. The critical question is: Both sorted halves are part of the same array A[]. Can we merge these two halves in O(n) time without using extra space? This will be not possible! You can try this by doing some swapping and comparison.
O(n) time idea using O(n) extra space: If we store sorted halves into two extra arrays of size n/2 (X[] and Y[]), we can transform this problem into merging sorted arrays X[] and Y[] into the larger sorted array A[]. For this, we compare values one by one in both smaller arrays and build the larger array sorted A[].
How can we implement it? Here is an idea: We use two pointers i and j to traverse X[] and Y[] from the start. During this process, we compare X[i] and Y[j], place the smaller value on A[] and move one of the pointers. Another way of thinking would be: After each comparison, we add one element to the sorted output and incrementally build a partially sorted array A[].
Step 1: Memory allocation and data copy.
We allocate two extra arrays of size equal to the size of left and right sorted parts i.e. size of left sorted part = mid - l + 1, size of right sorted part = r - mid. After this, we copy left and right sorted parts of A[] into extra arrays. Note: We are including the value at mid-index in the left part.
int n1 = mid - l + 1
int n2 = r - mid
int X[n1], Y[n2]
for(int i = 0; i < n1; i = i + 1)
X[i] = A[l + i]
for(int j = 0; j < n2; j = j + 1)
Y[j] = A[mid + 1 + j]
Step 2: Now start the merging process.
Note: This is a two-pointer approach of problem-solving where we build a partial solution by moving pointers i and j in the same direction.
int i = 0, j = 0, k = l
while(i < n1 && j < n2)
{
if(X[i] <= Y[j])
{
A[k] = X[i]
i = i + 1
}
else
{
A[k] = Y[j]
j = j + 1
}
k = k + 1
}
Step 3: Loop termination and boundary conditions.
The loop will stop when one of the two pointers reaches the end of its array (either i = n1 or j = n2). At this stage, there will be two boundary conditions:
Boundary condition 1: If we exit the loop due to j = n2, then we have reached the end of array Y[] and placed all values of Y[] in A[]. But there may be remaining values in X[] that still need to be put in the array A[]. So we copy the remaining values of X[] into the end of the A[].
while(i < n1)
{
A[k] = X[i]
i = i + 1
k = k + 1
}
Boundary condition 2: If we exit the loop due to the condition i = n1, we have reached the end of array X[] and placed all its values in A[]. But there may still be some values remaining in Y[] that need to be placed in the array A[]. So we copy the remaining values of Y[] into the end of array A[].
while(j < n2)
{
A[k] = Y[j]
j = j + 1
k = k + 1
}
void merge(int A[], int l, int mid, int r)
{
int n1 = mid - l + 1
int n2 = r - mid
int X[n1], Y[n2]
for(int i = 0; i < n1; i = i + 1)
X[i] = A[l + i]
for(int j = 0; j < n2; j = j + 1)
Y[j] = A[mid + 1 + j]
int i = 0, j = 0, k = l
while(i < n1 && j < n2)
{
if(X[i] <= Y[j])
{
A[k] = X[i]
i = i + 1
}
else
{
A[k] = Y[j]
j = j + 1
}
k = k + 1
}
while(i < n1)
{
A[k] = X[i]
i = i + 1
k = k + 1
}
while(j < n2)
{
A[k] = Y[j]
j = j + 1
k = k + 1
}
}
This is an excellent code to understand the analysis of iterative algorithms. To understand this better, let's visualize the time complexity of each critical step and calculate the overall time complexity.
Overall time complexity = O(1)+ O(n) + O(n) + O(n1) + O(n2) = O(n). If we observe closely, time complexity depends on the complexity of the merging loop where comparison, assignment, and increment are critical operations. There could be two different perspectives to understand this:
Space complexity = Extra space for left part + Extra space for right part = O(n1) + O(n2) = O(n1 + n2) = O(n).
Suppose T(n) is the time complexity for sorting n elements using merge sort. So to calculate the value of T(n), we can break down the time complexities as follows.
To calculate T(n), we need to add up the time complexities of the divide, conquer, and combine parts.
=> T(n) = O(1) + 2T(n/2) + O(n) = 2T(n/2) + O(n) = 2T(n/2) + cn.
Here is the simplified recurrence relation:
Merge sort function works correctly when the number of elements is not even. To simplify the analysis, we assume that n is a power of 2. This assumption does not affect the order of growth in the analysis.
In this method, we draw the recursion tree and calculate the total number of operations at each level of recursion. Finally, we calculate the overall time complexity by doing the sum of operations count level by level. Note: To learn about the fundamentals of recursion analysis, you can explore the blog post on time complexity analysis of recursion.
Based on the above recursion tree, the cost at each level is O(n), and there are O(logn) levels. So the time complexity of the merge sort algorithm is the sum of the cost at each level, which is O(logn)*O(n) = O(nlogn).
The best and worst-case time complexity of the merge sort algorithm is O(nlogn). The idea is simple: irrespective of the input, merge sort divides the input into equal halves and takes O(n) time at each level.
The Master method is a direct way to obtain the solution for recurrences that can be transformed to the type T(n) = aT(n/b) + O(n^k), where a ≥ 1 and b > 1. There are three cases for analysis using the master theorem:
Let's compare with the merge sort recurrence:
Here, a = 2, b = 2 (a > 1 and b > 1).
So k = logb(a). This means that the merge sort recurrence satisfies the 2nd case of the master theorem. So merge sort time complexity T(n) = O(n^k * logn) = O(n^1 * logn) = O(nlogn).
The space complexity of the merge sort depends on the extra space used by the merging process and the size of the recursion call stack.
The recursive merge sort uses a call stack to store intermediate values of function parameters l and r and other information. On the other side, we can implement it iteratively without using an explicit auxiliary stack.
The idea behind iterative merge sort is to consider window sizes in exponential order, i.e., 1, 2, 4, 8, ... over the input array. For each window of size k, we merge all adjacent pairs of windows into a temporary space and then put them back into the array. The critical question is: What would be the time and space complexity of this approach? Explore and think!
Pseudocode of iterative merge sort
void iterativeMergeSort(int A[], int n)
{
int windowSize
int left
for(windowSize = 1; windowSize < n ; windowSize = 2 * windowSize)
{
for(left = 0; left < n - 1; left = left + 2 * windowSize)
{
int mid = min(left + windowSize - 1, n - 1)
int right = min(left + 2 * windowSize - 1, n - 1)
// Code of this merge function is same as the code
// used in the above recursive implementation
merge(A, left, mid, right)
}
}
}
Content reference: Algorithms by CLRS
Please write in the message below if you find anything incorrect, or you want to share more insight. Enjoy learning, Enjoy algorithms!