Subset Sum

←	• HOME • UP •	→
	Top/NP-complete/Subset sum

12.5. The Subset Sum Problem

The Subset Sum problem

The subset sum problem is as follows.

Input. A list of positive integers x₁, x₂, …, x_m and a positive integer K.

Question. Does there exist an index set I ⊆ {1, 2, …, m} so that ∑_{i ∈ I} x_i = K?

For example, suppose that the list is (5, 2, 9, 3, 5, 8) and K = 20. Then the answer is true: 9 + 3 + 8 = 20.

Similarly, if the list is (5, 2, 9, 3, 5, 8) and K = 12, the answer is true: 5 + 2 + 5 = 12.

But if the list is (5, 2, 9, 3, 5, 8) and K = 4, the answer is clearly false.

Subset Sum is in NP

The question for Subset Sum is: Does there exist an index set I ⊆ {1, 2, …, m} so that ∑_{i ∈ I} x_i = K?

So a nondeterministic algorithm for Subset Sum is as follows.

Evidence: An index set I ⊆ {1, 2, …, m}.

Accept evidence if: ∑_{i ∈ I} x_i = K.

It should be clear that

the evidence is short (bounded in length by a polynomial in n, the length of the input) and
testing the evidence can be done in polynomial time.

Subset Sum is NP-complete

Theorem. The Subset Sum problem is NP-complete.

We have seen that Subset Sum is in NP. All that is left is to reduce some known NP-complete problem to Subset Sum.

We reduce 3-SAT to Subset Sum.

The reduction function takes a clausal formula φ with 3 literals per clause and it yields a list (x₁, x₂, …, x_m) and a positive integer K.

Here is how the reduction works.

Suppose that φ = c₁ ∧ c₂ ∧ … ∧ c_u, and that φ contains v variables.

As a running example, suppose that φ is: (x ∨ y ∨ ¬z) ∧ (¬y ∨ w ∨ z) ∧ (¬w ∨ ¬x ∨ y).

There are m = 2(u + v) numbers in the list (x₁, x₂, …, x_m). They are as follows, written in standard decimal notation, for the example. The lines are just to break them up.

	w	x	y	z		c₁	c₂	c₃
w	1	0	0	0	\|	0	1	0
w	1	0	0	0	\|	0	0	1
x		1	0	0	\|	1	0	0
x		1	0	0	\|	0	0	1
y			1	0	\|	1	0	1
y			1	0	\|	0	1	0
z				1	\|	0	1	0
z				1	\|	1	0	0
					\|	1	0	0
					\|	1	0	0
					\|		1	0
					\|		1	0
					\|			1
					\|			1

	1	1	1	1		3	3	3

The value of K is shown at the end: 1111333.

In general, there are two numbers for each variable x, one standing for x and one for ¬x. Let's write N_x and N_x for those two variables.

There is also a column for each variable. Notice that K has a 1 in each of those columns. Any collection of these numbers that adds up to K must have exactly one of the numbers N_x and N_x.

Selecting N_x corresponds to choosing x to be true.

Selecting N_x corresponds to choose x to be false.

Now look at the columns labeled by clauses.

There is a 1 in column c_i of N_x if clause c_i contains literal x.
There is a 1 in column c_i of N_x if clause c_i contains literal ¬x.
Among the numbers that stand for variables, there are no more than 3 1's in a given column, since each clause contains only 3 literals.
Each clause column has two additional 1's that only occur in that column.
So each clause column has no more that 5 1's. That means, even if you select all of the columns, there can never be a carry. The columns add up independently of one another.
Notice that there are only two 1's in a clause node that are not in numbers that correspond to variables, and each clause column must add up to 3.

So for each clause c_i, at least one of the numbers that corresponds to a variable (or its negation) that has a 1 in column c_i must be selected.
Choosing N_x, having a 1 in column c_i, corresponds to x being true.

N_x has a 1 in column c_i because clause c_i contains x. So clause c_i is true.
Choosing N_x, having a 1 in column c_i, corresponds to x being false.

N_x has a 1 in column c_i because clause c_i contains ¬x. So again, clause c_i is true.

There is a direct correspondence between variable assignments that make φ true and collections of numbers from the given list that add up to K.

So φ is satisfiable if and only if the Subset Sum problem just described has a solution. ◊

←	• HOME • UP •	→
	Top/NP-complete/Subset sum

	w	x	y	z		c₁	c₂	c₃
w	1	0	0	0	\|	0	1	0
w	1	0	0	0	\|	0	0	1
x		1	0	0	\|	1	0	0
x		1	0	0	\|	0	0	1
y			1	0	\|	1	0	1
y			1	0	\|	0	1	0
z				1	\|	0	1	0
z				1	\|	1	0	0
					\|	1	0	0
					\|	1	0	0
					\|		1	0
					\|		1	0
					\|			1
					\|			1

	1	1	1	1		3	3	3

	w	x	y	z		c₁	c₂	c₃
w	1	0	0	0	\|	0	1	0
w	1	0	0	0	\|	0	0	1
x		1	0	0	\|	1	0	0
x		1	0	0	\|	0	0	1
y			1	0	\|	1	0	1
y			1	0	\|	0	1	0
z				1	\|	0	1	0
z				1	\|	1	0	0
					\|	1	0	0
					\|	1	0	0
					\|		1	0
					\|		1	0
					\|			1
					\|			1

	1	1	1	1		3	3	3

	w	x	y	z		c₁	c₂	c₃
w	1	0	0	0	\|	0	1	0
w	1	0	0	0	\|	0	0	1
x		1	0	0	\|	1	0	0
x		1	0	0	\|	0	0	1
y			1	0	\|	1	0	1
y			1	0	\|	0	1	0
z				1	\|	0	1	0
z				1	\|	1	0	0
					\|	1	0	0
					\|	1	0	0
					\|		1	0
					\|		1	0
					\|			1
					\|			1

	1	1	1	1		3	3	3