Difficulty: Medium, Asked-in: Microsoft, Amazon, Morgan Stanley.
Key takeaway: An excellent problem to learn problem solving using sliding window approach. We can solve a lot of problems using a similar idea.
You are given a string of size n, write a program to find the length of the longest substring without repeating characters. Here substring is a continuous subpart of a string.
Input: str = "abcbbcab", Output: 3
Explanation: There are three substrings of unique characters with the longest length of 3: "abc", "bca", and "cab". So the output is 3.
Input: str = "bbbbb", Output: 1
Explanation: There is only one unique character, so the output is 1.
Input: str = "enjoyalgorithms", Output: 11
Explanation: The only repeating character is "o", and the answer is "yalgorithms", which has a length of 11.
One basic idea is to explore all substrings, and for each substring, we check whether it contains unique characters or not. For a string with n characters, there will be n*(n + 1)/2 substrings.
Step 1: We initialize a variable maxLength to track longest substring with unique characters.
Step 2: Now, we explore all substrings using two nested loops: an outer loop from i = 0 to n - 1, and an inner loop from j = i to n - 1. The idea is to pick a starting character via the outer loop and explore all substrings starting from that character using the inner loop.
Step 3: By the end of the nested loops, we return the value stored in maxLength.
bool isUnique(string str, int i, int j)
{
bool visited[256] = {false}
for (int k = i; k <= j; k = k + 1)
{
if (visited[str[k]] == true)
return false
visited[str[k]] = true
}
return true
}
int longestSubstring(string str, int n)
{
int maxLength = 0
for (int i = 0; i < n; i = i + 1)
{
for (int j = i; j < n; j = j + 1)
{
if (isUnique(str, i, j) == true)
maxLength = max(maxLength, j - i + 1)
}
}
return maxLength
}
We are exploring all n(n + 1)/2 substrings using nested loops. For each substring, function isUnique(str, i, j) takes O(j - i + 1) time to check whether the substring str[i, j] has unique characters or not.
Time complexity = n(n + 1)/2 * O( j - i + 1) = O(n^2) * O( j - i + 1). For the upper bound of the worst-case analysis, we consider O(j - i +1) as O(n). So, overall time complexity in the worst case = O(n^2) * O(n) = O(n^3). We are using a few extra variables and a constant-size set visited[]. So space complexity is O(1).
Now, a critical question is: How can we optimize the time complexity of the brute-force approach? In the brute-force, we repeatedly check a substring starting from the character str[i] to see if it has duplicate characters. In other words, we use O(n) time for every substring. Can we reduce this computation?
Here’s an insight: If we know that the substring str[i, j-1] has no duplicate characters, then for the substring str[i, j], we only need to check if str[j] is already present in the substring str[i, j-1]. To do this, we can use a visited array to check whether the character str[j] is in the substring str[i, j-1], which can be done in O(1) time.
Step 1: Initialize the variable maxLength to keep track of the longest substring.
Step 2: Now we run an outer loop from i = 0 to n - 1 to explore the substring starting from the character str[i]. Inside the loop, we use visited[256] to store the status of characters in the current substring.
Step 3: Now, we run an inner loop from j = i to n - 1 to find the longest substring with unique characters starting from the character str[i]. At each iteration of the inner loop, we check if visited[str[j]] is false. If it is, we mark the status of str[j] in visited[] and update the maxLength. In other words, we are sliding the current window one position to the right by incrementing j until str[j] is present in the substring [i, j - 1].
Step 4: The inner loop will stop when the character str[j] is already present in the current window. Then, we move to the next iteration of the outer loop to check the new window of the substring starting from index i + 1.
Step 5: At the end of the outer loop, we return the value stored in the variable maxLength.
int longestSubstring(string str, int n)
{
int maxLength = 0
for (int i = 0; i < n; i++)
{
bool visited[256] = {false}
for (int j = i; j < n; j++)
{
if (visited[str[j]] == false)
{
maxLength = max(maxLength, j - i + 1)
visited[str[j]] = true
}
else
break // Stop further checking if a duplicate character is found
}
}
return maxLength
}
We are using nested loops and performing an O(1) operation at each iteration. The worst case occurs when the inner loop runs until the end every time. In other words, the worst case occurs when all characters in the string are unique. In such a situation, a nested loop will explore all possible substrings i.e. n(n + 1)/2. Time complexity = n(n + 1)/2 * O(1) = O(n^2). We are using a constant size set visited[] of size 256, so space complexity = O(1).
Now, the critical questions are: Can we optimize the above approach further and solve this problem in O(n) time? Can we eliminate the inner loop? Here is an insight from the previous approach!
In the ith iteration of the outer loop, we run the inner loop from j = i to find the longest substring with unique characters starting from index i. Now in the (i+1)th iteration, we ignore the value of j from the previous iteration and reset it to j = i+1 to run the inner loop. But there’s no need to do this! If the characters in the substring [i, j] are unique, then the characters in the substring [i+1, j] will also be unique. So, in the next iteration of the outer loop, there’s no need to start from j = i+1.
Note: Sliding window is an excellent technique for solving array or string problems. A window is a range of elements defined by the start and end indices, where the window "slides" forward by incrementing either the left or right end.
int longestSubstring(string str, int n)
{
bool visited[256] = {false}
int maxLength = 0
int i = 0, j = 0
while (i < n && j < n)
{
if (visited[str[j]] == false)
{
visited[str[j]] = true
j = j + 1
maxLength = max(maxLength, j - i)
}
else
{
visited[str[i]] = false
i = i + 1
}
}
return maxLength
}
At each iteration of the while loop, we either increment pointer i or pointer j. So, the loop will run n times and perform a constant number of operations at each iteration. The time complexity is O(n). We are using a constant-size set visited[] of size 256, so space complexity is O(1).
If you have any queries/doubts/feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy algorithms!