Difficulty: Medium, Asked-In: Google, Facebook
Key takeaway: An excellent problem to learn time complexity optimization using various approaches. The two-pointer and hash table solutions are intuitive and worth exploring.
Given two unsorted arrays X[] and Y[] of sizes m and n respectively, write a program to find the intersection of these two arrays. Suppose m > n and the elements in both arrays are distinct.
Example 1
Input: X[] = [3, 4, 6, 2, 8, 5, 9], Y[] = [6, 3, 2, 7, 5, 1]
Output: [3, 6, 2, 5]
Explanation: Common elements are 3, 6, 2, and 5. So the intersection of both arrays is 3, 6, 2, and 5.
Example 2
Input: X[] = [3, 4, 6, 7, 10, 12, 5], Y[] = [7, 11, 18, 15, 12]
Output: [7, 12]
Explanation: Common elements are 7 and 12. So the intersection of both arrays is 7 and 12.
Example 3
Input: X[] = [3, 4, 6, 10, 5], Y[] = [7, 11, 18, 15, 12]
Output: There are no common elements. So the output is an empty array.
One basic idea is to traverse the array X[] and linearly search each element X[i] in Y[]. If element X[i] is present in Y[], then X[i] will be a common element in both arrays.
vector<int> arrayIntersection(int X[], int Y[], int m, int n)
{
vector<int> intersection;
for (int i = 0; i < m; i = i + 1)
{
for (int j = 0; j < n; j = j + 1)
{
if (X[i] == Y[j])
{
intersection.push_back(X[i]);
break;
}
}
}
return intersection;
}
def arrayIntersection(X, Y, m, n):
intersection = []
for i in range(m):
for j in range(n):
if X[i] == Y[j]:
intersection.append(X[i])
break
return intersection
We are running two nested loops and performing O(1) operations at each iteration. In the worst case, the inner loop will run n times. So time complexity = O(m) * O(n) * O(1) = O(mn).
We are using constant extra space, so space complexity = O(1). Why are we not considering the output list intersection as part of space complexity? Think!
The critical question is: How can we improve the time complexity of the above approach? Searching is an essential operation in the above code. Can we think to improve the efficiency of searching?
One idea is: If we sort the array Y[], we can search each element X[i] using binary search. This will help us to improve time complexity because binary search will take O(logn) time to search an element in the sorted array.
int binarySearch(int Y[], int n, int target)
{
int l = 0, r = n;
while (l <= r)
{
int mid = l + (r - l) / 2;
if (Y[mid] == target)
return mid;
else if (Y[mid] > target)
r = mid - 1;
else
l = mid + 1;
}
return -1;
}
vector<int> arrayIntersection(int X[], int Y[], int m, int n)
{
vector<int> intersection;
sort(Y, Y + n);
for (int i = 0; i < m; i = i + 1)
{
if (binarySearch(Y, n, X[i]) != -1)
intersection.push_back(X[i]);
}
return intersection;
}
def binarySearch(Y, n, target):
l, r = 0, n
while l <= r:
mid = l + (r - l) // 2
if Y[mid] = target:
return mid
elif Y[mid] > target:
r = mid - 1
else:
l = mid + 1
return -1
def arrayIntersection(X, Y, m, n):
intersection = []
Y = sorted(Y)
for i in range(m):
if binarySearch(Y, n, X[i]) != -1:
intersection.append(X[i])
return intersection
Suppose, we are using heap sort and iterative binary search for the implementation. So time complexity = Time complexity of sorting array Y[] using heap sort + Time complexity of searching each element X[i] in Y[] using binary search = O(nlogn) + m * O(logn) = O(nlogn + mlogn). If m > n, then mlogn > nlogn, and the time complexity will be O(mlogn).
Space complexity = Space complexity of heap sort + Space complexity of the iterative binary search = O(1) + O(1) = O(1). What will be the space complexity if we use merge sort and recursive binary search for the implementation? Think!
In the above approach, the time complexity is dominated by the searching operation of m elements, which is O(mlogn). The critical question is: Can we improve the time complexity of searching further? Here's an idea: If we sort both arrays, we can find the intersection elements in both arrays by applying the two-pointers approach, similar to the merging algorithm of merge sort. How can we implement this? Let's think!
After sorting both arrays, we start comparing the elements of both arrays from the start using two pointers, i and j. If X[i] == Y[j], then we have found a common element. So we add one of them to the output list and move to the next elements in both arrays. Otherwise, we move to the next element in the array that includes the smaller element. Think!
Now we run a loop while i < m and j < n:
vector<int> arrayIntersection(int X[], int Y[], int m, int n)
{
vector<int> intersection;
sort(X, X + m);
sort(Y, Y + n);
int i = 0, j = 0;
while (i < m && j < n)
{
if (X[i] == Y[j])
{
intersection.push_back(X[i]);
i = i + 1;
j = j + 1;
}
else if (X[i] < Y[j])
i = i + 1;
else
j = j + 1;
}
return intersection;
}
def arrayIntersection(X, Y, m, n):
intersection = []
X.sort()
Y.sort()
i, j = 0, 0
while i < m and j < n:
if X[i] == Y[j]:
intersection.append(X[i])
i = i + 1
j = j + 1
elif X[i] < Y[j]:
i = i + 1
else:
j = j + 1
return intersection
Suppose we are using heap sort for the implementation.
For m > n, if we compare the time complexity of this approach with the previous approach, then it is not better! The idea is: We have improved the time complexity of searching, but time complexity is dominated by the sorting algorithm, which is O(mlogm). The time complexity of the previous approach is O(mlogn), which is better than O(mlogm). Think!
We can improve the time complexity of searching using a hash table because a hash table can perform searching in O(1) time average. How can we implement this? Let's think!
The idea is to first insert all elements of Y[] into the hash table and then search each element X[i]. If X[i] is present in the hash table, then it is a common element to both arrays. So we add it to the output list.
vector<int> arrayIntersection(int X[], int Y[], int m, int n)
{
vector<int> intersection;
unordered_set<int> H;
for (int i = 0; i < n; i = i + 1)
H.insert(Y[i]);
for (int i = 0; i < m; i = i + 1)
{
if (H.find(X[i]) != H.end())
intersection.push_back(X[i]);
}
return intersection;
}
List<Integer> arrayIntersection(int[] X, int[] Y, int m, int n)
{
List<Integer> intersection = new ArrayList<>();
HashSet<Integer> H = new HashSet<>();
for (int i = 0; i < n; i = i + 1)
H.add(Y[i]);
for (int i = 0; i < m; i = i + 1)
{
if (H.contains(X[i]))
intersection.add(X[i]);
}
return intersection;
}
def arrayIntersection(X: List[int], Y: List[int], m: int, n: int) -> List[int]:
intersection = []
H = set()
for i in range(n):
H.add(Y[i])
for i in range(m):
if X[i] in H:
intersection.append(X[i])
return intersection
Time complexity = Time complexity of inserting n elements of Y[] in hash table + Time complexity of searching m elements of X[] in hash table = n * O(1) + m * O(1)) = O(m + n). Space complexity = O(n), for storing values in the hash table.
If you have any queries/doubts/feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy algorithms!