Independent set problem
In mathematics, the independent set problem (IS) is a well-known problem in graph theory and combinatorics. Its decision version is NP-complete. It is equivalent to the clique problem on the complement graph.
Description
Given a graph G, an independent set is a subset of its vertices no two of which are adjacent. In other words, the subgraph induced by these vertices has no edges, only isolated vertices. The independent set problem asks: given a graph G and a positive integer k, does G have an independent set of cardinality at least k?
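The definition can be checked directly by testing every pair of vertices in the candidate subset. The following is a minimal sketch, assuming a graph given as a list of undirected edges; the function name `is_independent_set` and the example graph are illustrative and not part of the original text.

```python
from itertools import combinations

def is_independent_set(edges, subset):
    """Return True if no two vertices of `subset` are joined by an edge."""
    # Store each undirected edge as a frozenset so (u, v) and (v, u) match.
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset((u, v)) not in edge_set
               for u, v in combinations(subset, 2))

# Hypothetical example: the path graph 1 - 2 - 3 - 4.
path_edges = [(1, 2), (2, 3), (3, 4)]
print(is_independent_set(path_edges, {1, 3}))  # True: 1 and 3 are not adjacent
print(is_independent_set(path_edges, {2, 3}))  # False: 2 and 3 are adjacent
```

The check performs one adjacency test per pair of chosen vertices, so it runs in time polynomial in the size of the graph.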
The corresponding optimization problem is the maximum independent set problem, which asks for a largest independent set in a graph. Given a procedure that solves the decision problem, binary search over k determines the size of a maximum independent set with O(log |V|) calls to that procedure. Unless P = NP, the optimization problem has no polynomial-time constant-factor approximation algorithm.
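The binary search can be sketched as follows. The decision procedure is passed in as `oracle`; the brute-force `has_independent_set` used here is only a stand-in for illustration, and all names are assumptions rather than anything from the original text.

```python
from itertools import combinations

def has_independent_set(vertices, edges, k):
    """Stand-in decision oracle (brute force): is there an independent set of size >= k?"""
    edge_set = {frozenset(e) for e in edges}
    # Checking size exactly k suffices: subsets of independent sets are independent.
    return any(
        all(frozenset(pair) not in edge_set for pair in combinations(subset, 2))
        for subset in combinations(vertices, k)
    )

def max_independent_set_size(vertices, edges, oracle=has_independent_set):
    """Binary search for the largest k the oracle accepts."""
    lo, hi = 0, len(vertices)  # the empty set is always independent
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if oracle(vertices, edges, mid):
            lo = mid          # a set of size mid exists; try larger
        else:
            hi = mid - 1      # no set of size mid; try smaller
    return lo

# Hypothetical example: the 5-cycle 0 - 1 - 2 - 3 - 4 - 0.
cycle_vertices = [0, 1, 2, 3, 4]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(max_independent_set_size(cycle_vertices, cycle_edges))  # 2
```

The search is valid because the answers are monotone in k: if an independent set of size k exists, so does one of every smaller size. Note that this yields only the size; recovering an actual maximum set needs further calls to the oracle.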
Independent set problems and clique problems may be easily translated into each other: an independent set in a graph G is a clique in the complement graph of G, and vice versa.
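As an illustration of this translation, the sketch below builds the complement graph and checks, on a small made-up example, that a vertex subset is independent in G exactly when it is a clique in the complement. The function names are hypothetical.

```python
from itertools import combinations

def complement_edges(vertices, edges):
    """Edges of the complement graph: all vertex pairs that are NOT edges of G."""
    edge_set = {frozenset(e) for e in edges}
    return {frozenset(pair) for pair in combinations(vertices, 2)} - edge_set

def is_independent_set(edges, subset):
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) not in edge_set for pair in combinations(subset, 2))

def is_clique(edges, subset):
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(subset, 2))

# Hypothetical example: the path graph 1 - 2 - 3 - 4 and its complement.
vertices = [1, 2, 3, 4]
path_edges = [(1, 2), (2, 3), (3, 4)]
co_edges = complement_edges(vertices, path_edges)

print(is_independent_set(path_edges, {1, 3}), is_clique(co_edges, {1, 3}))  # True True
print(is_independent_set(path_edges, {3, 4}), is_clique(co_edges, {3, 4}))  # False False
```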
Algorithms
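The simplest brute-force algorithm for independent set examines every vertex subset of size k and checks whether it is an independent set. Larger subsets need not be examined, because every subset of an independent set is itself independent. This requires examining n!/((n - k)! k!) subsets, where n is the order of the graph, and each subset can be checked with O(k^2) adjacency tests.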
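A minimal sketch of this brute-force search, assuming the graph is given as a vertex list and an edge list; `find_independent_set` is an illustrative name, not an established API.

```python
from itertools import combinations
from math import comb

def find_independent_set(vertices, edges, k):
    """Return some independent set of size k, or None if none exists."""
    edge_set = {frozenset(e) for e in edges}
    for subset in combinations(vertices, k):      # n choose k candidates
        if all(frozenset(pair) not in edge_set
               for pair in combinations(subset, 2)):
            return set(subset)
    return None

# Hypothetical example: the 5-cycle 0 - 1 - 2 - 3 - 4 - 0.
cycle_vertices = [0, 1, 2, 3, 4]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

print(sorted(find_independent_set(cycle_vertices, cycle_edges, 2)))  # [0, 2]
print(find_independent_set(cycle_vertices, cycle_edges, 3))          # None
print(comb(5, 2))  # 10 candidate subsets of size 2 in the worst case
```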
A much easier problem to solve is that of finding a maximal independent set, which is an independent set not properly contained in any other independent set. We begin with a single vertex. We find a vertex not adjacent to it and add it, then find a vertex adjacent to neither of these and add it, and so on until we can find no more vertices to add. At this point the set is a maximal independent set, although it need not be a maximum one.
More complex algorithms are known for listing all maximal independent sets, but the number of such sets can be exponential in the number of vertices. This can be seen by constructing a graph consisting of n disjoint triangles. A maximum independent set has size n, taking one vertex from each triangle. Every maximal independent set also takes exactly one vertex from each triangle, and the three choices per triangle are independent, so the number of maximal independent sets is 3^n, that is, 3^(|V|/3) for a graph on |V| = 3n vertices (Moon & Moser 1965).
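The greedy procedure and the triangle construction can both be sketched in a few lines. This is an illustrative sketch under the assumption that the graph is given as a vertex list and an edge list; the function names are hypothetical, and the counting routine is a plain exhaustive enumeration suitable only for tiny graphs.

```python
from itertools import combinations

def adjacency(vertices, edges):
    neighbors = {v: set() for v in vertices}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    return neighbors

def greedy_maximal_independent_set(vertices, edges):
    """Scan vertices in the given order, adding each one with no chosen neighbor."""
    neighbors = adjacency(vertices, edges)
    chosen = set()
    for v in vertices:
        if not (neighbors[v] & chosen):
            chosen.add(v)
    return chosen

def count_maximal_independent_sets(vertices, edges):
    """Exhaustively count maximal independent sets (exponential time)."""
    neighbors = adjacency(vertices, edges)
    count = 0
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            s = set(subset)
            independent = all(not (neighbors[v] & s) for v in s)
            # Maximal: every vertex outside s has a neighbor inside s.
            maximal = all(neighbors[v] & s for v in vertices if v not in s)
            if independent and maximal:
                count += 1
    return count

def disjoint_triangles(n):
    vertices = list(range(3 * n))
    edges = []
    for t in range(n):
        a, b, c = 3 * t, 3 * t + 1, 3 * t + 2
        edges += [(a, b), (b, c), (a, c)]
    return vertices, edges

# A star with center 0: the scan order decides which maximal set is found.
star_edges = [(0, 1), (0, 2), (0, 3)]
print(sorted(greedy_maximal_independent_set([0, 1, 2, 3], star_edges)))  # [0]
print(sorted(greedy_maximal_independent_set([1, 2, 3, 0], star_edges)))  # [1, 2, 3]

print(count_maximal_independent_sets(*disjoint_triangles(2)))  # 9, i.e. 3**2
```

The star example shows that a maximal independent set found greedily can be much smaller than a maximum one.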
Proof of NP-completeness
It's easy to see that the problem is in NP: given a subset of at least k vertices as a certificate, we can check in polynomial time that no two of them are joined by an edge. To show the problem is NP-hard, we will use a reduction from another NP-complete problem.
Assume we already know Cook's result that the boolean satisfiability problem is NP-complete. Any boolean formula can be converted in polynomial time into a formula in conjunctive normal form (CNF) that is satisfiable exactly when the original is. In conjunctive normal form:
- The formula is a conjunction (and) of clauses.
- Each clause is a disjunction (or) of literals.
- Each literal is either a variable or its negation.
For example, the following formula is in CNF, where ~ denotes negation:
- (x1 or ~x2 or ~x3) and (x1 or x2 or x4)
Such a formula is satisfiable if we can assign true/false values to the variables in such a way that at least one literal in every clause is true. For example, any assignment with x2 false and x4 true satisfies the above formula. The problem of determining whether a CNF formula is satisfiable is also NP-complete and is called CNF-SAT.
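To make the satisfiability condition concrete, the sketch below evaluates the example formula under several assignments. The encoding is an assumption chosen for illustration: a formula is a list of clauses, and the integer i stands for the variable xi while -i stands for its negation ~xi.

```python
from itertools import product

def satisfies(formula, assignment):
    """True if every clause contains at least one literal made true."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

# (x1 or ~x2 or ~x3) and (x1 or x2 or x4)
formula = [[1, -2, -3], [1, 2, 4]]

# Every assignment with x2 false and x4 true satisfies the formula.
for x1, x3 in product([False, True], repeat=2):
    print(satisfies(formula, {1: x1, 2: False, 3: x3, 4: True}))  # True each time

# x1 false, x2 true, x3 true falsifies the first clause.
print(satisfies(formula, {1: False, 2: True, 3: True, 4: True}))  # False
```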
Now, we describe a polynomial-time many-one reduction from CNF-SAT to the independent set problem. First, create one vertex for every occurrence of a literal in the formula, so that a literal occurring in several clauses gets a separate vertex for each occurrence. Put an edge between:
- Any two literals which are negations of one another.
- Any two literals which are in the same clause.
In the example above, x2 would be adjacent to ~x2 (they are negations of one another), the first x1 would be adjacent to ~x3 (same clause), and the second x1 would be adjacent to x4 (same clause).
We argue now that this graph has an independent set of size at least k, where k is the number of clauses, if and only if the original formula is satisfiable.
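The construction can be sketched as follows for the example formula. The encoding is an assumption made for illustration: the integer i stands for xi and -i for ~xi, and each vertex is a pair (clause index, position in clause), so repeated literals get distinct vertices. The name `cnf_to_graph` is hypothetical.

```python
from itertools import combinations

def cnf_to_graph(formula):
    """Build the reduction graph: one vertex per literal occurrence."""
    vertices = [(c, i) for c, clause in enumerate(formula)
                for i in range(len(clause))]
    literal = {(c, i): formula[c][i] for (c, i) in vertices}
    edges = set()
    for u, v in combinations(vertices, 2):
        same_clause = u[0] == v[0]
        negations = literal[u] == -literal[v]
        if same_clause or negations:
            edges.add((u, v))
    return vertices, literal, edges

# (x1 or ~x2 or ~x3) and (x1 or x2 or x4)
formula = [[1, -2, -3], [1, 2, 4]]
vertices, literal, edges = cnf_to_graph(formula)
k = len(formula)  # target independent set size: the number of clauses

print(len(vertices), len(edges), k)   # 6 7 2
print(((0, 1), (1, 1)) in edges)      # True: ~x2 and x2 are negations
print(((0, 0), (1, 0)) in edges)      # False: the two copies of x1 are not adjacent
```

The graph has three edges inside each clause plus the single edge between ~x2 and x2, and it is built with one test per pair of vertices, which is polynomial in the length of the formula.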
Suppose we have an assignment satisfying the original formula. Then we can choose one literal from each clause which is made true by this assignment. This set is independent, because it only includes one literal from each clause (no edges of type 2), and because no assignment makes both a literal and its negation true (no edges of type 1).
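This direction of the argument can be illustrated on the example formula. The sketch below assumes the same illustrative encoding (integer i for xi, -i for ~xi, vertices as (clause index, position) pairs); the helper names are hypothetical.

```python
from itertools import combinations

def independent_set_from_assignment(formula, assignment):
    """Pick the first true literal of each clause as that clause's vertex."""
    chosen = []
    for c, clause in enumerate(formula):
        for i, lit in enumerate(clause):
            if assignment[abs(lit)] == (lit > 0):
                chosen.append((c, i))
                break
        else:
            raise ValueError("assignment does not satisfy clause %d" % c)
    return chosen

def adjacent(formula, u, v):
    """Edge rule of the reduction: same clause, or complementary literals."""
    return u[0] == v[0] or formula[u[0]][u[1]] == -formula[v[0]][v[1]]

# (x1 or ~x2 or ~x3) and (x1 or x2 or x4)
formula = [[1, -2, -3], [1, 2, 4]]
assignment = {1: False, 2: False, 3: True, 4: True}

chosen = independent_set_from_assignment(formula, assignment)
print(chosen)  # [(0, 1), (1, 2)], i.e. ~x2 from clause 1 and x4 from clause 2
print(any(adjacent(formula, u, v) for u, v in combinations(chosen, 2)))  # False
```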
On the other hand, suppose we have an independent set of size k or greater. It can't contain any two literals in the same clause, since these are pairwise adjacent. Since it has at least k vertices and there are only k clauses, it must therefore contain exactly one vertex from each clause. It also can't contain both a literal and its negation, because there are edges between these. Setting every chosen literal to true, and the remaining variables arbitrarily, is therefore a consistent assignment that makes at least one literal in each of the k clauses true, so it satisfies the original formula.
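The converse direction can be sketched in the same way: an independent set with one vertex per clause is turned into a satisfying assignment. As before, the encoding (integer i for xi, -i for ~xi, vertices as (clause index, position) pairs) and the function names are assumptions for illustration, and unconstrained variables are arbitrarily set to false.

```python
def assignment_from_independent_set(formula, chosen):
    """Make every chosen literal true; other variables default to False."""
    assignment = {abs(lit): False for clause in formula for lit in clause}
    for c, i in chosen:
        lit = formula[c][i]
        # Consistent because an independent set never holds both xi and ~xi.
        assignment[abs(lit)] = lit > 0
    return assignment

def satisfies(formula, assignment):
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

# (x1 or ~x2 or ~x3) and (x1 or x2 or x4)
formula = [[1, -2, -3], [1, 2, 4]]

# An independent set of size 2: x1 from the first clause, x4 from the second.
chosen = [(0, 0), (1, 2)]
assignment = assignment_from_independent_set(formula, chosen)
print(assignment)                      # {1: True, 2: False, 3: False, 4: True}
print(satisfies(formula, assignment))  # True
```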
What makes this reduction to independent set so simple is the capacity of the edges in the graph to express constraints, such as the necessity of never choosing both a literal and its negation. The graph coloring problem also benefits from this useful property.
References
- Richard Karp (1972). Reducibility Among Combinatorial Problems. Proceedings of a Symposium on the Complexity of Computer Computations.
- Michael R. Garey and David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman. ISBN 0-7167-1045-5. A1.2: GT20, pg.194.