0/1 Knapsack Problem and Dynamic Programming

189469

Dynamic Programming Career Knapsack Problem Algorithm

0/1 Knapsack is important problem for dynamic programming study since it provides many useful insights.

Statement: Given a set of n items numbered from 1 up to n, each with a weight wi and a value vi, along with a maximum weight capacity W, maximize the sum of the values of the items in the knapsack so that the sum of the weights is less than or equal to the knapsack's capacity.

Naive Solution:
Lets see naive solution - there are only 2 choices for each item, either to include in knapsack or ignore the item.
If item is included then check for remaining items (N - 1) by decreasing capacity W - vi and accumulating value of item. Else, check for remaining items (N - 1) without change in capacity and value. Again, each next item will have two choices. If you visualise as tree it would something like below decision tree:

Each level d there are 2 ^ d choices, there are N items so complexity is 2 ^N.

Another way to consider each item as bit then we check for all possible combinations of setting and unsetting, and find maximum value obtained while satisifying weight constraint. It is clear we need to check (1 << n) or 2 ^ N iterations. So, naive solution is 2 ^ N.

Tabular Method:

Consider very simple example - Weights = {1, 2, 3} and Values ={6, 10, 12} and we have knapsack of capacity 5.

Now, let implement the same using tabular method

On columns, we are increasing capacity from 0 to W, i.e., 0 to 5 the max capacity. On each row, we consider the items, we have noted its weights and values. For each row, we only consider items considered in previous rows and for each column we consider that much capacity. The base case is for 0 weight (no item) the value is 0 regardless of capacity and similarly if we have 0 capacity then we cannot put any item so value would be 0.

The first row (row with weight 1) is simple we have weight 1, so we can fill its value from capacity 1. Since it is only the entire row would have value 6

For second row, now weight is 2, we can fill values same as row above it upto capacity 2. For capacity 2 it would be 2 choice - include or exclude current item. If we exclude current item then value would be same as top row 6. If we include then value would be = current value (10) + d(1, the current capacity (2) - weight(2)) = 10 + d(1, 2- 2) = 10 + 0 = 10. The max is 10 so result is 10. Now, the d function is previous item (1) and zero weight = 0.
This is how we got formula:
d(i, w) = Math.Max( d(i - 1, w), d(i - 1, w - weight[i]) + value[i])

Consider 3rd row and capacity 4, excluding item 3, we got 16 value from above row and including it we have find the value = 12 + d(2, 1) = 12 + 6 = 18.

It is evident that we are computing only once for each capacity 0 to W and each item 0 to N, so complexity is O(NW).

Code for reference:

// given N, maxWeight, weights and values
long[,] d = new long[N + 1, maxWeight + 1];

for (long i = 0; i < N; i++)
{
	for (long w = 0; w <= maxWeight; w++)
	{                    
		if (weights[i] <= w)
		{
			// Exclude or include
			d[i + 1, w] = Math.Max(d[i, w], d[i, w - weights[i]] + values[i]);
		}

		else
		{
			// Exclude
			d[i + 1, w] = d[i, w];
		}                    
	}
}

Wait, how we improve time complexity from O(2 ^ N) to O(N * W)?
It is because we have reused the solutions that is already calculated. For example if capacity = 7, instead of trying different combinations of items like 4 + 3, 2 + 5, 1 + 6, 2 + 4, etc. We are doing only one calculation either to exclude or include current item. When we include current item, we are reusing solution already found reduced capacity and fewer items.

Many dynamic programming problems follows similar patterns like

We have optimisation function - maximise value, minimise distance, etc
Optimal substructure - Recursively find optimal solutions to subproblems
Overlapping subproblems - Same subproblems are solved again and again.

Dynamic programming solves each subproblem once and reuse results. There are 2 approaches:
1. Top Down: During recursive computation of subproblems, we store the results, so again when we attempt to subproblem, we directly use stored results instead of recomputing again. So, results should be stored in way that it can be retrieved in O(1) time - like using arrays/dictionary.

2. Bottom Up: Here we try to solve smaller sub-problems like fewer items and capacity above and then arrive at bigger problems. The solution to bigger subproblems are generating by using solutions of already computed subproblems.

In either case, we need to find out states on which subproblems build up. For example, the items considered and remaining capacity is our state, regardless of number of items remaining and total capacity for same i and w we have same value. Identifying states is crucial for dynamic programming.

Bonus exercise: Above algorithm is not suitable for large weights. Consider small number of items (N <= 100) but huge capacacity (10 ^ 9) and max value (V <= 10 ^ 3), how will modify algorithm? I will update solution after some days.

Comments (19)