Difficulty: Medium, Asked-In: Amazon, Microsoft, Yahoo, Zoho, Visa.
Key takeaway: An excellent problem to learn problem solving using dynamic programming.
We are given n items, where each item has a weight and profit associated with it. We are also given a knapsack with capacity C (knapsack can hold at most C weight). Write a program to determine the maximum profit that can be obtained by selecting a subset of items without exceeding the knapsack capacity.
Constraint: Each item can either be included in the knapsack or left out (0-1 knapsack); we cannot take a fraction of an item or include the same item more than once.
Input: n = 3, W[] = [3, 1, 4], P[] = [25, 20, 20], C = 5
Output: 45
Explanation: There are 7 possible non-empty subsets of items (written in terms of their weights): {3}, {1}, {4}, {3, 1}, {3, 4}, {1, 4}, and {3, 1, 4}. Out of these choices, {3, 4} and {3, 1, 4} have combined weights that exceed the maximum capacity of 5, so we do not consider them in the solution. Among the remaining subsets {3}, {1}, {4}, {3, 1}, and {1, 4}, the subset {3, 1} gives the maximum profit of 25 + 20 = 45.
Input: n = 3, W[] = [4, 5, 1], P[] = [1, 2, 3], C = 4
Output: 3
Explanation: There are two items which have a weight less than or equal to 4. If we select the item with weight 4, the profit is 1. And if we select the item with weight 1, the profit is 3. So the maximum possible profit is 3. Note: we cannot put items with weights 4 and 1 together because the capacity of the knapsack is 4.
Input: n = 3, W[] = [4, 5, 6], P[] = [1, 2, 3], C = 3
Output: 0
Explanation: The weight of each item exceeds the maximum capacity of the knapsack, so we cannot select any items. The maximum possible profit is 0.
One basic solution is to generate all possible subsets of items and calculate the combined weight of each subset. If the weight of a subset is greater than the knapsack capacity C, we ignore that subset. Otherwise, we calculate the profit associated with that subset and update the maximum profit. In other words: we only consider the subsets whose total weight is less than or equal to the knapsack capacity C, and among these subsets, we identify the one with the maximum profit.
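Before moving to the recursive formulation below, here is a minimal brute-force sketch of this idea, assuming Python and a hypothetical helper name bruteForceKnapsack (the article builds its own recursive subset generation next); it enumerates subsets of item indices with itertools.combinations and keeps the best feasible profit:

from itertools import combinations

def bruteForceKnapsack(P, W, n, C):
    # Hypothetical helper for illustration: try every non-empty subset of item
    # indices, skip subsets that exceed the capacity, and track the best profit.
    maxProfit = 0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            if sum(W[i] for i in subset) <= C:
                maxProfit = max(maxProfit, sum(P[i] for i in subset))
    return maxProfit

print(bruteForceKnapsack([25, 20, 20], [3, 1, 4], 3, 5))  # 45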
The critical question is: How can we generate all subsets of items? Here is an idea! There are two choices for every item: either the item is included in the knapsack or it is excluded from the knapsack. Let's move forward to discuss these two scenarios separately. Suppose we use the function knapsack(P[], W[], n, C) to find the maximum profit for n items with capacity C.
If the weight of the nth item (W[n - 1]) is less than or equal to the knapsack capacity C, we can include the nth item in the knapsack. If we include it, we acquire P[n - 1] profit, and the remaining knapsack capacity becomes C - W[n - 1].
Now the problem reduces to a smaller version of the knapsack problem: maximizing the profit from the remaining n - 1 items with capacity C - W[n - 1]. So, we call the same function recursively with new input parameters, i.e., knapsack(P[], W[], n - 1, C - W[n - 1]).
if (C >= W[n - 1])
    profitInclude = P[n - 1] + knapsack(P[], W[], n - 1, C - W[n - 1])
If we exclude the nth item, the capacity of the knapsack remains the same, but we have only n - 1 items left. Now, the problem is reduced to another smaller version of the knapsack problem: Maximizing the profit from the remaining n - 1 items with a capacity of C. So, we call the same function recursively with updated input parameters, i.e., knapsack(P[], W[], n - 1, C).
profitExclude = knapsack(P[], W[], n-1, C)
The maximum profit obtained from n items will be the maximum of the above two cases, i.e., max(profitInclude, profitExclude).
Base case: If there are no more items to consider (n == 0) or the knapsack capacity becomes zero (C == 0), we can’t select any more items. So, we return 0 as the maximum profit.
int knapsack(int P[], int W[], int n, int C)
{
    // Base case: no items left or no remaining capacity.
    if (n == 0 || C == 0)
        return 0;
    // Case 1: include the nth item (only possible if it fits in the capacity).
    int profitInclude = 0;
    if (C >= W[n - 1])
        profitInclude = P[n - 1] + knapsack(P, W, n - 1, C - W[n - 1]);
    // Case 2: exclude the nth item.
    int profitExclude = knapsack(P, W, n - 1, C);
    return max(profitInclude, profitExclude);
}
def knapsack(P, W, n, C):
    if n == 0 or C == 0:
        return 0
    profitInclude = 0
    if C >= W[n - 1]:
        profitInclude = P[n - 1] + knapsack(P, W, n - 1, C - W[n - 1])
    profitExclude = knapsack(P, W, n - 1, C)
    return max(profitInclude, profitExclude)
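As a quick sanity check, here is a usage sketch (assuming the Python version above) on the three sample inputs:

print(knapsack([25, 20, 20], [3, 1, 4], 3, 5))  # 45
print(knapsack([1, 2, 3], [4, 5, 1], 3, 4))     # 3
print(knapsack([1, 2, 3], [4, 5, 6], 3, 3))     # 0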
There are two possibilities for each of the n items. So the total number of possible subsets = (2 * 2 * 2 * ...n times) = 2^n. In the worst case, recursion will explore all these possible combinations of items. So time complexity = O(2^n).
This solution is inefficient because recursion will solve the same sub-problems repeatedly. We can observe repeated sub-problems by drawing the recursion tree. In the following image, one sub-problem is repeated. If we expand the recursion tree further, we will observe more repeated sub-problems.
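To see this repetition programmatically, here is a small illustrative sketch (assuming Python; the function name countedKnapsack and the chosen input are hypothetical, not from the original article) that counts how many times each (n, C) sub-problem is visited by the plain recursive solution:

from collections import Counter

calls = Counter()

def countedKnapsack(P, W, n, C):
    # Same recursion as before, instrumented to record each (n, C) visit.
    calls[(n, C)] += 1
    if n == 0 or C == 0:
        return 0
    profitInclude = 0
    if C >= W[n - 1]:
        profitInclude = P[n - 1] + countedKnapsack(P, W, n - 1, C - W[n - 1])
    profitExclude = countedKnapsack(P, W, n - 1, C)
    return max(profitInclude, profitExclude)

countedKnapsack([2, 3, 4, 5, 6], [1, 1, 1, 2, 2], 5, 5)
print({key: count for key, count in calls.items() if count > 1})  # several sub-problems repeat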
Now, the critical question is: How can we improve the time complexity? This is an optimization problem where sub-problems are repeated during recursion. So we can apply the dynamic programming approach and store the sub-problem solutions in extra memory to avoid repeated calculations.
One idea is to use the top-down approach of dynamic programming (the memoized version of the above recursive solution). Here is the idea: When any subproblem appears for the first time during recursion, we calculate it and store its value in extra memory or a look-up table. When the same subproblem appears again, we directly retrieve the stored solution from the look-up table in constant time.
We initialize a 2D table dp[][] of size (n + 1) * (C + 1) and set all values to -1. We use this table to store the solutions of the sub-problems. The critical question is: Why (n + 1) * (C + 1)? During recursion, the item count decreases from n towards 0 and the capacity decreases from C towards 0, so the first parameter can take n + 1 possible values and the second can take C + 1 possible values. So there can be at most (n + 1) * (C + 1) distinct sub-problems, and we need to allocate an entry for each one of them.
Now, we call a helper function knapsack(P[], W[], i, C, dp), where the initial value of i is n.
If dp[i][C] < 0, the result is not yet computed, so we proceed with the recursive calculation and store the answer in dp[i][C]. Otherwise, we simply return the stored value.
int knapsack(int P[], int W[], int i, int C, int** dp)
{
    if (i == 0 || C == 0)
        return 0;
    if (dp[i][C] < 0)
    {
        int profitInclude = 0;
        if (C >= W[i - 1])
            profitInclude = P[i - 1] + knapsack(P, W, i - 1, C - W[i - 1], dp);
        int profitExclude = knapsack(P, W, i - 1, C, dp);
        dp[i][C] = max(profitInclude, profitExclude);
    }
    return dp[i][C];
}
int zeroOneKnapSack(int P[], int W[], int n, int C)
{
    int** dp = new int*[n + 1];
    for (int i = 0; i <= n; i = i + 1)
    {
        dp[i] = new int[C + 1];
        for (int j = 0; j <= C; j = j + 1)
            dp[i][j] = -1;
    }
    int result = knapsack(P, W, n, C, dp);
    // Free the dynamically allocated table before returning.
    for (int i = 0; i <= n; i = i + 1)
        delete[] dp[i];
    delete[] dp;
    return result;
}
def knapsack(P, W, i, C, dp):
    if i == 0 or C == 0:
        return 0
    if dp[i][C] < 0:
        profitInclude = 0
        if C >= W[i - 1]:
            profitInclude = P[i - 1] + knapsack(P, W, i - 1, C - W[i - 1], dp)
        profitExclude = knapsack(P, W, i - 1, C, dp)
        dp[i][C] = max(profitInclude, profitExclude)
    return dp[i][C]

def zeroOneKnapSack(P, W, n, C):
    dp = [[-1] * (C + 1) for _ in range(n + 1)]
    result = knapsack(P, W, n, C, dp)
    return result
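A quick usage sketch (the larger random instance below is a hypothetical example, not from the original article): the memoized version returns the same answer as before on the first sample and also handles inputs where the plain O(2^n) recursion would be far too slow.

import random

print(zeroOneKnapSack([25, 20, 20], [3, 1, 4], 3, 5))  # 45

# A larger random instance: 40 items would need up to ~2^40 recursive calls
# without memoization, but only about 40 * 101 sub-problems with it.
random.seed(0)
n, C = 40, 100
W = [random.randint(1, 20) for _ in range(n)]
P = [random.randint(1, 50) for _ in range(n)]
print(zeroOneKnapSack(P, W, n, C))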
We solve each sub-problem once using recursion. So time complexity = O(n*C). This is a significant improvement compared to the previous approach!
We require extra memory of size (n + 1) * (C + 1). So space complexity = O(n*C). There will be additional space overhead for the recursion call stack. So, what would be the size of the recursion call stack? Explore and think!
This is a dynamic programming problem. So, we can efficiently solve it using the iterative idea of the bottom-up approach. How can we do this? Let's think!
We start by solving subproblems with smaller sizes (0 items and 0 capacity). Then, we build the solution iteratively for larger subproblems using the previously solved smaller subproblems until we find the solution to the given problem (n items and C capacity). During this process, we use additional memory to store the solution of each subproblem.
Table size: We define a 2D table dp[n + 1][C + 1] to store the solution for each possible subproblem.
Table initialization: Now, we initialize the table with the solution of the smallest version of the subproblem (similar to the base case in the recursive solution). For this, we set the value 0 in the first row and the first column of the table.
Iterative structure to fill the table: Now, we define an iterative structure to fill the table and store the solution of subproblems in a bottom-up manner. We can get this insight from the recursive structure of the above recursive solution. Here are the steps!
We use nested loops to fill the remaining cells of the dp table. The outer loop runs from i = 1 to n to iterate over each item, and the inner loop runs from j = 1 to C to consider each possible capacity value. For each cell (i, j):
if (W[i - 1] > j)
    dp[i][j] = dp[i - 1][j];
else
    dp[i][j] = max(P[i - 1] + dp[i - 1][j - W[i - 1]], dp[i - 1][j]);
Returning the final solution: After filling the entire dp table, we return the value stored in the bottom-right cell dp[n][C], which represents the maximum profit after considering all n items and knapsack capacity C.
int knapsack(int P[], int W[], int n, int C)
{
    int dp[n + 1][C + 1];
    for (int i = 0; i <= n; i++)
        dp[i][0] = 0;
    for (int j = 0; j <= C; j++)
        dp[0][j] = 0;
    for (int i = 1; i <= n; i++)
    {
        for (int j = 1; j <= C; j++)
        {
            if (W[i - 1] > j)
                dp[i][j] = dp[i - 1][j];
            else
                dp[i][j] = max(P[i - 1] + dp[i - 1][j - W[i - 1]], dp[i - 1][j]);
        }
    }
    return dp[n][C];
}
def knapsack(P, W, n, C):
    dp = [[0] * (C + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, C + 1):
            if W[i - 1] > j:
                dp[i][j] = dp[i - 1][j]
            else:
                dp[i][j] = max(P[i - 1] + dp[i - 1][j - W[i - 1]], dp[i - 1][j])
    return dp[n][C]
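To make the bottom-up filling concrete, here is the dp table for the first sample input (n = 3, W[] = [3, 1, 4], P[] = [25, 20, 20], C = 5), worked out by hand from the recurrence above; each row i is built only from row i - 1:

j:                        0    1    2    3    4    5
i = 0 (no items):         0    0    0    0    0    0
i = 1 (item w=3, p=25):   0    0    0   25   25   25
i = 2 (item w=1, p=20):   0   20   20   25   45   45
i = 3 (item w=4, p=20):   0   20   20   25   45   45

The bottom-right cell dp[3][5] = 45 matches the expected output.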
We use two separate loops for initialization and one nested loop for storing results in the table. During each iteration of the loop, we perform constant operations. So time complexity = O(n) + O(C) + O(n*C) = O(n*C). We use an extra 2D table of size (n + 1)*(C + 1), so space complexity = O(n*C).
This approach is slightly more efficient compared to the top-down approach. The reason is simple: there is no overhead of recursive calls and call stack.
The above solution uses O(n*C) space. The critical question is: Can we reduce the space further? Can we use a table with two rows to get the solution? Let's think!
In the above solution, we can observe a pattern: To fill the (i, j) entry, we use two values stored in the previous row at (i - 1, j) and (i - 1, j - W[i - 1]). In other words, for filling the ith row, we use the value stored in the (i - 1)th row. So, we only need two rows at a time: previous row and current row.
We define a 2D dp[][] array of size 2 * (C + 1) to represent the two rows: dp[0] and dp[1]. We store the ith row values at index i % 2 and the previous row values at index (i - 1) % 2. Taking the modulus with 2 helps us switch between rows based on the current row index i.
For example, for i = 3, the previous row will be stored at dp[0] and the current row will be filled at dp[1]. Then, for i = 4, dp[1] will contain the entries of the previous row and the current row will be filled at dp[0] (overwriting the values previously stored in dp[0]). This is the idea of a rolling array in dynamic programming.
So there are two major changes in the above solution: 1) Instead of an (n + 1) x (C + 1) table, we use a 2 x (C + 1) table. 2) Instead of using i and i - 1, we use i % 2 and (i - 1) % 2. The remaining ideas will remain the same!
int knapsack(int P[], int W[], int n, int C)
{
    int dp[2][C + 1];
    for (int i = 0; i <= n; i++)
    {
        for (int j = 0; j <= C; j++)
        {
            if (i == 0 || j == 0)
                dp[i % 2][j] = 0;
            else if (W[i - 1] <= j)
                dp[i % 2][j] = max(P[i - 1] + dp[(i - 1) % 2][j - W[i - 1]], dp[(i - 1) % 2][j]);
            else
                dp[i % 2][j] = dp[(i - 1) % 2][j];
        }
    }
    return dp[n % 2][C];
}
def knapsack(P, W, n, C):
    dp = [[0] * (C + 1) for _ in range(2)]
    for i in range(n + 1):
        for j in range(C + 1):
            if i == 0 or j == 0:
                dp[i % 2][j] = 0
            elif W[i - 1] <= j:
                dp[i % 2][j] = max(P[i - 1] + dp[(i - 1) % 2][j - W[i - 1]], dp[(i - 1) % 2][j])
            else:
                dp[i % 2][j] = dp[(i - 1) % 2][j]
    return dp[n % 2][C]
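As a quick check that the rolling-array version behaves exactly like the full-table version, here is a hypothetical sanity test (assuming Python, and assuming the two bottom-up functions above are given the distinct names knapsackFullTable and knapsackTwoRows, since both are called knapsack in the article):

import random

random.seed(1)
for _ in range(100):
    n = random.randint(1, 8)
    C = random.randint(0, 15)
    W = [random.randint(1, 10) for _ in range(n)]
    P = [random.randint(1, 30) for _ in range(n)]
    # Both versions implement the same recurrence, so their answers must match.
    assert knapsackFullTable(P, W, n, C) == knapsackTwoRows(P, W, n, C)
print("Rolling-array results match the full-table results on 100 random tests")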
The time complexity of this approach will be the same as the previous approach, i.e., O(n*C). However, the space complexity will be reduced to O(C) because we are using only a table with two rows.
The above solution works well in O(C) space. The critical question is: Can we solve the 0-1 knapsack problem using a single row, i.e., a 1-D array? The idea is: if we traverse each row from right to left, it can be done with a single row only. How? Let's think!
Suppose we take a 1-D table dp[C + 1] and initialize all values with 0. Suppose that after the (i - 1)th iteration, the table stores the values of the (i - 1)th row. To calculate each cell dp[j] in the ith iteration, we need two values from the (i - 1)th row: dp[j] and dp[j - W[i - 1]].
In other words, if we traverse from right to left in the ith iteration, then at the moment we are about to fill dp[j] (equivalent to dp[i][j]), the cells at indices j and smaller have not been updated yet in this iteration. So dp[j] still holds the value of dp[i - 1][j], and dp[j - W[i - 1]] still holds the value of dp[i - 1][j - W[i - 1]].
So, when we fill the value at dp[j] while traversing from right to left, both required values from the (i - 1)th row are still available.
int knapsack(int P[], int W[], int n, int C)
{
    // Note: memset requires the <cstring> header.
    int dp[C + 1];
    memset(dp, 0, sizeof(dp));
    for (int i = 1; i < n + 1; i++)
    {
        // Traverse from right to left so that dp[j - W[i - 1]] still holds
        // the value from the (i - 1)th row when we update dp[j].
        for (int j = C; j >= 0; j--)
        {
            if (W[i - 1] <= j)
                dp[j] = max(dp[j], dp[j - W[i - 1]] + P[i - 1]);
        }
    }
    return dp[C];
}
def knapsack(P, W, n, C):
    dp = [0] * (C + 1)
    for i in range(1, n + 1):
        # Traverse from right to left so dp[j - W[i - 1]] still holds the
        # value from the (i - 1)th row when we update dp[j].
        for j in range(C, -1, -1):
            if W[i - 1] <= j:
                dp[j] = max(dp[j], dp[j - W[i - 1]] + P[i - 1])
    return dp[C]
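Why does the traversal direction matter? If we traverse j from left to right instead, dp[j - W[i - 1]] may already contain the ith row value, so the same item can be counted multiple times; this is in fact the standard recurrence for the unbounded knapsack. Here is a hypothetical illustration (the function name knapsackLeftToRight is not from the original article):

def knapsackLeftToRight(P, W, n, C):
    dp = [0] * (C + 1)
    for i in range(1, n + 1):
        for j in range(C + 1):  # wrong direction for 0-1 knapsack
            if W[i - 1] <= j:
                dp[j] = max(dp[j], dp[j - W[i - 1]] + P[i - 1])
    return dp[C]

print(knapsack([25, 20, 20], [3, 1, 4], 3, 5))             # 45 (correct 0-1 answer)
print(knapsackLeftToRight([25, 20, 20], [3, 1, 4], 3, 5))  # 100 (the item of weight 1 is reused 5 times)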
The time complexity of this approach is the same as the previous approach, i.e., O(n*C), and the space complexity is still O(C). However, the actual space usage is reduced because we use a single 1-D table of size C + 1 instead of a two-row 2D table of size 2 * (C + 1).
If you have any queries or feedback, please write us at contact@enjoyalgorithms.com. Enjoy learning, Enjoy algorithms!