Why do we study the design, analysis, and comparison of sorting algorithms? Here are some critical reasons:
- We can learn several problem-solving approaches using sorting: Incremental approach (selection and insertion sort), Divide and conquer approach (merge and quicksort), Two pointers approach (merging and partition), Problem-solving using data structures (heapsort and tree sort), etc.
- Sorting is one of the best ways to learn complexity analysis of recursive and iterative code.
- We also use sorting as a problem-solving approach, i.e., we can efficiently solve several problems by organizing data into sorted order.
On the other side, to choose the correct sorting algorithm for a given input scenario or set of constraints, we need to understand the critical comparisons between sorting algorithms based on different parameters like time complexity, space complexity, in-place sorting, stability, online vs. offline sorting, etc.
Comparison-based sorting
In comparison-based sorting, we get the sorted order output by comparing elements in the input. It's important to note that all comparison-based sorting algorithms share a worst-case lower bound of Ω(nlogn). In other words, any comparison-based sorting algorithm must take at least on the order of nlogn time to sort an array of n elements in the worst case.
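Where does this Ω(nlogn) bound come from? Any comparison sort can be modeled as a binary decision tree whose leaves correspond to the n! possible orderings of n distinct elements. Here is a brief sketch of the standard counting argument:

```latex
% A binary tree of height h has at most 2^h leaves, and the decision
% tree must have at least n! leaves (one per permutation of n elements):
2^h \ge n! \;\Rightarrow\; h \ge \log_2(n!)
% Keeping only the largest n/2 factors of n! (each at least n/2):
\log_2(n!) \ge \frac{n}{2}\log_2\frac{n}{2} = \Omega(n\log n)
```

The worst-case number of comparisons equals the height h of this tree, so no comparison sort can beat Ω(nlogn) in the worst case.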
Here are some examples of sorting algorithms which use comparison as a critical operation:
- Bubble sort: Compares elements to bubble up the maximum to the end.
- Selection sort: Compares elements to find the minimum element in the unsorted part and places that element at the end of the sorted part.
- Insertion sort: Compares elements to determine the position of an element in the partially sorted array.
- Merge sort: Compares elements of two sorted halves to merge them into the final sorted array.
- Quick sort: Compares elements to partition the array into two parts around a chosen pivot.
- Heapsort: Compares elements during the heapify process to place the elements at the correct position in the sorted array.
The worst-case time complexities of the above sorting algorithms fall into two categories: O(n^2) sorting (bubble, selection, and insertion sort) and O(nlogn) sorting (merge sort and heapsort). Quicksort sits in between: O(nlogn) on average, but O(n^2) in the worst case.
In addition to comparison operations, these sorting algorithms also use other types of operations.
- Bubble sort: Swapping
- Selection sort: Swapping
- Insertion sort: Shifting
- Merge sort: Allocating extra memory and copying data.
- Quicksort: Swapping
- Heapsort: Swapping
However, the count of these operations is at most proportional to the count of comparison operations. That's why the comparison operation is the deciding factor for the time complexity.
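To see this concretely, here is a minimal Python sketch (illustrative, not from the original article) that instruments bubble sort to count both operations; a swap can only follow a comparison, so the swap count never exceeds the comparison count:

```python
def bubble_sort_instrumented(arr):
    """Bubble sort that counts comparisons and swaps.

    Returns (comparisons, swaps); swaps <= comparisons always holds,
    because a swap can only happen after a comparison.
    """
    a = list(arr)
    n = len(a)
    comparisons = swaps = 0
    for i in range(n - 1):
        for j in range(n - 1 - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1
    return comparisons, swaps

print(bubble_sort_instrumented([5, 1, 4, 2, 8]))  # (10, 4)
```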
Linear-time sorting
There are sorting algorithms that run faster than O(nlogn), but they require special assumptions about the input to determine the sorted order. These algorithms rely on operations other than comparisons and work in O(n) time. So the Ω(nlogn) lower bound does not apply to them.
Examples of such sorting algorithms are counting sort, radix sort, and bucket sort. Counting sort and radix sort assume that the input consists of integers in a small range. Meanwhile, bucket sort assumes that the input is generated by a random process that distributes elements uniformly over the interval [0, 1).
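For instance, here is a minimal counting sort sketch in Python, under the stated assumption that the input holds non-negative integers in a small range [0, k]. Note that no element is ever compared with another; the value itself is used as an index:

```python
def counting_sort(arr, k):
    """Sort non-negative integers in the range [0, k] in O(n + k) time.

    No comparisons between elements: count occurrences of each value,
    compute prefix sums, then place each element at its final position.
    """
    count = [0] * (k + 1)
    for x in arr:              # count occurrences of each value
        count[x] += 1
    for v in range(1, k + 1):  # prefix sums: count[v] = end of value v's slot
        count[v] += count[v - 1]
    output = [0] * len(arr)
    for x in reversed(arr):    # reversed traversal keeps the sort stable
        count[x] -= 1
        output[count[x]] = x
    return output

print(counting_sort([4, 2, 2, 8, 3, 3, 1], k=8))  # [1, 2, 2, 3, 3, 4, 8]
```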
Here is a comparison of the time and space complexities of some popular sorting algorithms:

| Algorithm | Best case | Average case | Worst case | Extra space |
|---|---|---|---|---|
| Bubble sort | O(n) | O(n^2) | O(n^2) | O(1) |
| Selection sort | O(n^2) | O(n^2) | O(n^2) | O(1) |
| Insertion sort | O(n) | O(n^2) | O(n^2) | O(1) |
| Merge sort | O(nlogn) | O(nlogn) | O(nlogn) | O(n) |
| Quicksort | O(nlogn) | O(nlogn) | O(n^2) | O(logn) |
| Heapsort | O(nlogn) | O(nlogn) | O(nlogn) | O(1) |
| Counting sort | O(n + k) | O(n + k) | O(n + k) | O(n + k) |
| Radix sort | O(d(n + k)) | O(d(n + k)) | O(d(n + k)) | O(n + k) |
| Bucket sort | O(n + k) | O(n + k) | O(n^2) | O(n + k) |

Here, k denotes the range of input values (or the number of buckets) and d the number of digits.
In-place sorting algorithms
An in-place sorting algorithm sorts the elements without using extra memory: at any time, only a constant number of elements are stored temporarily outside the array. Examples of in-place sorting are bubble sort, selection sort, insertion sort, quicksort (if we ignore its O(logn) recursion stack), and heapsort. In contrast, merge sort and counting sort are not in-place because they require additional memory proportional to the input size.
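To see the contrast, here is a minimal top-down merge sort sketch in Python: the merged list allocated at each step is exactly the O(n) auxiliary memory that makes merge sort non-in-place:

```python
def merge_sort(arr):
    """Top-down merge sort: O(nlogn) time, O(n) auxiliary space."""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left, right = merge_sort(arr[:mid]), merge_sort(arr[mid:])
    # Merge step: the `merged` list is the extra O(n) memory.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:      # <= keeps the merge stable
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:] or right[j:])
    return merged

print(merge_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```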
Stable sorting algorithms
A sorting algorithm is stable if it preserves the relative order of elements with the same value. For example, if A[i] is equal to A[j] and i < j, then A[i] must appear before A[j] in the sorted output.
- Stability is important when we sort the same data set successively by multiple keys: a stable sort preserves the order established by earlier passes for elements that tie on the current key. If all elements have unique values, stability is not a concern.
- Bubble sort, insertion sort, merge sort, counting sort, and radix sort are stable sorting algorithms. Selection sort, quicksort, and heapsort are non-stable sorting algorithms.
We can make non-stable sorting algorithms stable by extending the comparison operation to include tie-breaking rules based on the order of the original input. This process may require additional time and space to remember the original order.
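Here is a minimal Python sketch of this idea; the stabilize helper is hypothetical (not from the original article), and a heapsort built on the standard heapq module stands in for any non-stable sort:

```python
import heapq

def heapsort(a):
    """A non-stable sort used as the demo: heapify, then pop in order."""
    heapq.heapify(a)
    a[:] = [heapq.heappop(a) for _ in range(len(a))]

def stabilize(sort_fn, arr, key=lambda x: x):
    """Make any non-stable sort_fn stable by tie-breaking on input order.

    Each element is decorated with its original index, so equal keys
    compare by position in the input. Costs O(n) extra space.
    """
    decorated = [(key(x), i, x) for i, x in enumerate(arr)]
    sort_fn(decorated)  # tuples compare by (key, original index)
    return [x for _, _, x in decorated]

data = [("apple", 2), ("banana", 1), ("apple", 1)]
print(stabilize(heapsort, data, key=lambda p: p[0]))
# [('apple', 2), ('apple', 1), ('banana', 1)] -> equal keys keep input order
```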
Online and offline sorting algorithms
An algorithm that sorts data as it becomes available is called an online sorting algorithm. Such algorithms process input sequentially and can begin sorting before the entire input is available. For example, insertion sort considers one input element at a time and inserts it into its proper place in a partially sorted array, maintaining a sorted output as each new element arrives. So insertion sort is an online sorting algorithm.
In contrast, an offline algorithm requires all input data to be present in memory before sorting can start. For example, the selection sort algorithm sorts an array by finding the minimum element in the unsorted part and placing it in the partially sorted array. This process requires access to the entire input. So, selection sort is an offline sorting algorithm.
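A minimal Python sketch of this online behavior, using the standard bisect module to keep a running sorted list as elements arrive (the input stream here is just an illustrative list):

```python
import bisect

def online_sort(stream):
    """Maintain a sorted list while consuming elements one at a time.

    After each element arrives, `sorted_so_far` is a correctly sorted
    version of the input seen so far -- the defining property of an
    online sorting algorithm like insertion sort.
    """
    sorted_so_far = []
    for x in stream:
        bisect.insort(sorted_so_far, x)  # binary-search the position, insert
        print(f"after {x}: {sorted_so_far}")
    return sorted_so_far

online_sort([7, 3, 9, 1])
# after 7: [7]
# after 3: [3, 7]
# after 9: [3, 7, 9]
# after 1: [1, 3, 7, 9]
```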
Comparison based on problem-solving approaches
Sorting is also an effective way to master several problem-solving techniques that apply to a wide range of coding problems.
How to choose the best sorting algorithm?
The choice of the best sorting algorithm depends on several factors, like the size of input data, the order of the data, memory usage, stability, performance, etc.
- For small input data, a simple algorithm like insertion sort can work best. However, for larger data sets, quicksort, merge sort, or heapsort are the best choices.
- If data is almost sorted, insertion sort works best with O(n) time complexity. If the data is random, quicksort, merge sort, or heapsort can be better options.
- When memory usage is an important consideration, algorithms like heapsort [O(1) extra space] or quicksort [O(logn) extra space] are preferred over merge sort [O(n) extra space].
- For sorting linked lists, merge sort is the optimal choice: it is relatively simple to implement and requires O(nlogn) time and only O(1) extra space (a sketch follows this list). Linked lists have slow random access, which results in poor performance for algorithms such as quicksort and makes others like heapsort infeasible.
- In a parallel computing environment, merge sort is often the preferred choice due to its divide-and-conquer approach. This method divides the input equally at each stage, and each smaller sub-problem is independent of the others. This makes it easy to process data in parallel across multiple cores or machines.
- Quick sort and merge sort can be relatively simple to implement, but heapsort may require a deeper understanding of binary heaps.
- If stability is a concern, merge sort is the best choice because quicksort and heapsort are not stable sorting algorithms.
- Quick sort can be cache-friendly due to its in-place sorting and sequential access pattern during partitioning, with fewer memory transfers compared to merge sort. Heapsort, in contrast, tends to have poor cache performance because heapify jumps between distant array positions (parent i, children 2i + 1 and 2i + 2).
- Regardless of the order of the data, when guaranteed O(nlogn) performance is required, merge sort and heapsort are the best choices for sorting. Quick sort performs exceptionally well on random data with O(nlogn) average time complexity, but with a poor pivot choice (e.g., always picking the first or last element), its performance can degrade to O(n^2) on sorted or nearly sorted data.
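As mentioned in the linked-list point above, here is a minimal Python sketch of merge sort on a singly linked list; the ListNode class is an assumed helper for illustration. The list is split with the slow/fast-pointer trick and merged by relinking nodes, so no extra array is needed:

```python
class ListNode:
    """Minimal singly linked list node, assumed for this illustration."""
    def __init__(self, val, next=None):
        self.val, self.next = val, next

def merge_sort_list(head):
    """Sort a singly linked list in O(nlogn) time and O(1) extra space
    per merge (plus the O(logn) recursion stack)."""
    if head is None or head.next is None:
        return head
    # Split in half using slow/fast pointers.
    slow, fast = head, head.next
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
    mid, slow.next = slow.next, None
    left, right = merge_sort_list(head), merge_sort_list(mid)
    # Merge the two sorted halves by relinking nodes (two-pointers idea).
    dummy = tail = ListNode(0)
    while left and right:
        if left.val <= right.val:        # <= keeps the merge stable
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left or right
    return dummy.next

# Build 4 -> 2 -> 1 -> 3, sort, and print.
node = merge_sort_list(ListNode(4, ListNode(2, ListNode(1, ListNode(3)))))
while node:
    print(node.val, end=" ")             # 1 2 3 4
    node = node.next
```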
Critical ideas to think!
- How can we improve the running time of quicksort by taking advantage of the fast running time of insertion sort when its input is almost sorted? (One possible direction is sketched after this list.)
- Can we modify the quicksort or heapsort algorithm to make it stable?
- How can we implement quicksort and merge sort iteratively?
- Some additional sorting algorithms to explore: Shell sort, Tree sort, and Tournament sort.
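On the first question above, one common direction (a sketch under assumptions, not the only answer) is a hybrid sort: recurse with quicksort while subarrays are large and finish small subarrays with insertion sort, which runs in O(n) time on almost-sorted input. The CUTOFF value below is an assumption; the best threshold is machine-dependent:

```python
import random

CUTOFF = 16  # assumed threshold; the best value is machine-dependent

def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place; fast when the range is small or almost sorted."""
    for i in range(lo + 1, hi + 1):
        key, j = a[i], i - 1
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]              # shift larger elements right
            j -= 1
        a[j + 1] = key

def hybrid_quicksort(a, lo=0, hi=None):
    """Quicksort with random pivots, switching to insertion sort below CUTOFF."""
    if hi is None:
        hi = len(a) - 1
    if hi - lo + 1 <= CUTOFF:
        insertion_sort(a, lo, hi)
        return
    pivot = a[random.randint(lo, hi)]    # random pivot avoids O(n^2) on sorted input
    i, j = lo, hi
    while i <= j:                        # two-pointers partition around the pivot
        while a[i] < pivot: i += 1
        while a[j] > pivot: j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i, j = i + 1, j - 1
    hybrid_quicksort(a, lo, j)
    hybrid_quicksort(a, i, hi)

data = random.sample(range(1000), 100)
hybrid_quicksort(data)
print(data == sorted(data))  # True
```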
Enjoy learning, Enjoy sorting!