Inversion Count in an Array

Difficulty: Medium, Asked In Microsoft, Amazon, Adobe

Key takeaway: An excellent problem to learn problem-solving using the divide and conquer idea similar to the merge sort algorithm.

Let’s understand the problem

Given an array of n integers X[], write a program to find the total number of inversion counts in X[]. An inversion occurs when there are two elements in the array such that i < j and X[i] > X[j]. In this situation, the pair (i, j) is called an inversion of X[] and the inversion count represents the count of such inversions present in the array.

Problem Note

  • The inversion count indicates how close the array is to being sorted. If the array is sorted in increasing order, the inversion count is 0. If the array is sorted in reverse order, the inversion count is at its maximum.
  • Assume that all elements are unique.
  • Before attempting to solve this problem, we highly recommend learning or revising the idea of the divide and conquer approach and the merge sort algorithm.

Examples

Input: X[] = [5, 2, 6, 3] Output: 3

Explanation: There are 3 inversions: (5, 2), (5, 3), and (6, 3).

Input: X[] = [3, 2, 1] Output: 3

Explanation: There are 3 inversions: (3, 2), (3, 1), and (2, 1).

Input: X[] = [1, 2, 3] Output: 0

Explanation: The array is already sorted in increasing order, so there are no inversions.

Discussed Solution Approaches

  • Brute force approach using nested loops
  • Brute force approach using divide and conquer approach
  • Efficient divide and conquer approach using merge sort

Brute force approach using nested loops

Solution idea

One basic idea is to traverse the array and find the inversion count for every element. To do this, we count the number of smaller elements on its right side. During this process, we also keep adding the inversion count for each element to obtain the overall inversion count.

Solution steps

We can implement the above idea using two nested loops.

  1. We Initialize a variable invCount to track the total inversion count.
  2. Now we traverse the array from i = 0 to n - 2 using the outer loop. For each element, X[i], we find the count of elements smaller than X[i] on the right side (j = i + 1 to n - 1) using an inner loop. We track this count using the variable smallerCount.
  3. By the end of the inner loop, we add smallerCount to invCount.
  4. By the end of the outer loop, we return the value stored in invCount.

Solution code C++

int countInversions(int X[], int n) 
{
    int invCount = 0;

    for (int i = 0; i < n - 1; i = i + 1) 
    {
        int smallerCount = 0;

        for (int j = i + 1; j < n; j = j + 1) 
        {
            if (X[j] < X[i])
                smallerCount = smallerCount + 1;
        }

        invCount = invCount + smallerCount;
    }

    return invCount;
}

Another implementation using nested loops

The idea is to compare each pair of elements (X[i] and X[j] where i < j). If the element at index i is greater than the element at index j, it indicates an inversion. So, we increase the inversion count by one.

int countInversions(int X[], int n) 
{
    int invCount = 0;
    for (int i = 0; i < n - 1; i = i + 1) 
    {
        for (int j = i + 1; j < n; j = j + 1) 
        {
            if (X[i] > X[j])
                invCount = invCount + 1;
        }
    }
    return invCount;
}

Time and space complexity analysis

We are using two nested loops and performing an O(1) operation at each iteration. The total number of loop iterations = (n - 1) + (n - 2) + … + 2 + 1 = n(n - 1)/2. So time complexity = O(n²). As we are using constant extra space, the space complexity = O(1).

Brute force approach using divide and conquer

Solution idea

In this approach, we divide the unsorted array into two equal halves and recursively find the inversion count of the first half (invCountLeft) and the inversion count of the second half (invCountRight). The total inversion count up to this point is equal to invCountLeft + invCountRight.

To calculate the overall inversion count, we need to consider an additional scenario: the type of inversion where one element is in the left part and another element is in the right part (invCountCrossing). To find this, we traverse the left half and, for each element x in the left half, we count all elements y in the right half that are less than x. This operation takes O(n²) time because both arrays have a size of n/2.

Total inversion count = invCountLeft + invCountRight + invCountCrossing.

Solution code C++

int countInversions(int X[], int left, int mid, int right) 
{
    int invCount = 0;
    for (int i = left; i <= mid; i = i + 1) 
    {
        for (int j = mid + 1; j <= right; j = j + 1) 
        {
            if (X[i] > X[j])
                invCount = invCount + 1;
        }
    }
    return invCount;
}

int findInversionCount(int X[], int left, int right) 
{
    int mid;
    int invCountLeft = 0;
    int invCountRight = 0;
    int invCountCrossing = 0;
    if (right > left) 
    {
        mid = left + (right - left)/ 2;

        // Find inversion counts in the two halves recursively
        invCountLeft = findInversionCount(X, left, mid);
        invCountRight = findInversionCount(X, mid + 1, right);

        // Count inversions across the two halves
        invCountCrossing = countInversions(X, left, mid, right);
    }
    return (invCountLeft + invCountRight + invCountCrossing);
}

Time and space complexity analysis

We solve the problem of size n by recursively solving the two sub-problems of size n/2 and performing extra O(n²) operations to find invCountCrossing. So the recurrence relation for time complexity is T(n) = 2T(n/2) + O(n^2), which is a divide-and-conquer recurrence. We can apply the master theorem to solve it. Note: If you want to learn about recursion analysis, explore this blog: Analysis of recursion.

Master theorem

If T(n) = aT(n/b) + O(n^k), where a ≥ 1 and b > 1, there are three cases of the master theorem:

  • If k < logb(a): T(n) = O(n^logb(a))
  • If k = logb(a): T(n) = O((n^k) * logn)
  • If k > logb(a): T(n) = O(n^k)

Comparing the given recurrence T(n) = 2T(n/2) + O(n^2) with the master theorem recurrence:

  • T(n) = 2T(n/2) + O(n^2)
  • T(n) = aT(n/b) + O(n^k)

Here, a = 2, b = 2, and k = 2 => logb(a) = log2(2) = 1. Since k > logb(a), it satisfies the third case of the master theorem. So the time complexity = O(n^k) = O(n^2). There is no improvement in time complexity compared to the previous approach!

The space complexity depends on the size of the recursion call stack, which is equal to the maximum depth of the recursion tree. Since the input parameters decrease by a factor of 2, the maximum depth of the recursion tree is O(logn). So, the space complexity is O(logn).

Efficient divide and conquer approach using merge sort algorithm

Solution idea

This is an interesting approach that leverages the idea of merge sort to optimize the previous divide and conquer approach. The key question is: What is the thought process behind this idea? Let's think!

The previous divide and conquer approach does not work well because its time complexity is O(n²). This is primarily due to the time-consuming process of counting inversions across the two halves. Can we think of a way to optimize this approach? Let's think!

Consider the following scenario: After finding the inversion count of both the left and right halves, suppose both halves also get sorted in increasing order. Now, sorting will help us to easily count the inversions across the two halves in O(n) time. The idea is simple: since both halves are sorted, we can use a two-pointer merging process similar to merge sort!

Upon further observation, we can realize that this idea aligns closely with the merge sort algorithm, with a slight modification:

  • Recursive calls to both halves in merge sort will help us find the inversion count of each half. The advantage is that both halves will also become sorted.
  • The merging process in merge sort will also help in identifying the inversion count across the two halves. In the subsequent section, we will delve into this idea in detail.

Solution steps

Here is an efficient approach that utilizes an idea similar to merge sort:

  1. Divide the array into two halves.
  2. Recursively sort each half and calculate the inversion counts.
  3. Merge the sorted halves together, while also tracking the inversion count across both halves.
  4. Return the total inversion count, which is the sum of the inversion counts in the first half, the second half, and the inversions during the merging process.

Inversion count solution using merge sort

Solution code C++

int mergeSort(int X[], int l, int r) 
{
    int invCountLeft = 0;
    int invCountRight = 0;
    int invCountCrossing = 0;
    if (l < r) 
    {
        int mid = l + (r - l) / 2;
        invCountLeft = mergeSort(X, l, mid);
        invCountRight = mergeSort(X, mid + 1, r);
        invCountCrossing = merge(X, l, mid, r);
    }

    return (invCountLeft + invCountRight + invCountCrossing);
}

Implementation of merge function: Counting inversion count crossing both halves

During the merge process of merge sort, we can determine the inversion count by using a two-pointer approach to compare elements from both sorted halves. How? Let's think!

Suppose the size of the left and right halves are n1 and n2, respectively. Here n1 = mid - l + 1 and n2 = r - mid. Now, following the merging process of merge sort, we create two auxiliary memory spaces, A[n1] and B[n2], to store copies of both sorted halves.

Now, we initialize two pointers, i and j, where i is used to traverse the first sorted half and j is used to traverse the second sorted half. We traverse both sorted halves simultaneously using a loop condition: while (i < n1 && j < n2).

  • If A[i] > B[j], it indicates an inversion. Since the left and right subarrays are sorted, there are (n1 - i) inversions. The reason is simple: All the remaining elements in the left subarray (A[i+1], A[i+2], ..., A[n1-1]) will also be greater than B[j].
  • If A[i] ≤ B[j], there is no inversion. The remaining steps are similar to the merging process of merge sort.

Counting inversion during merge process of merge sort algorithm

int merge(int X[], int l, int mid, int r) 
{
    int n1 = mid - l + 1;
    int n2 = r - mid;
    int invCount = 0;
    int A[n1];
    int B[n2];

    for (int i = 0; i < n1; i = i + 1)
        A[i] = X[l + i];
    for (int j = 0; j < n2; j = j + 1)
        B[j] = X[mid + 1 + j];

    int i = 0, j = 0, k = l;
    while (i < n1 && j < n2) 
    {
        if (A[i] > B[j]) 
        {
            X[k] = B[j];
            j = j + 1;
            invCount = invCount + (n1 - i);  // Update inversion count
        } 
        else 
        {
            X[k] = A[i];
            i = i + 1;
        }
        k = k + 1;
    }

    while (i < n1) 
    {
        X[k] = A[i];
        i = i + 1;
        k = k + 1;
    }
    while (j < n2) 
    {
        X[k] = B[j];
        j = j + 1;
        k = k + 1;
    }

    return invCount;
}

Time and space complexity analysis

The time and space complexity of this approach is the same as the time complexity of the merge sort algorithm. Time complexity = O(nlogn), space complexity = O(n).

Critical ideas to think!

  • Can we solve this problem using a hash table, binary search tree, or segment tree?
  • During the merge process, if A[i] > B[j], why do we increment the inversion count by n1 - i?
  • Can we consider solving this problem using the insertion sort or bubble sort algorithm?
  • If there are O(n) inversions in the array, what would be the time complexity of the insertion sort algorithm?
  • What approach should we take if we need to count the inversion count for every element?

Similar coding questions to practice

If you have any queries or feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy algorithms!

More from EnjoyAlgorithms

Self-paced Courses and Blogs