Advanced algorithms & complexity
(Polynomial transformation: problems solvable in polynomial time) Theorem. Let L1 ∝ L2 and L2 ∈ P. Then
L1 ∈ P. (Equivalently: if L1 ∉ P, then L2 ∉ P.) Proof. Since L2 ∈ P, there exists a TM M2 deciding in polynomial time, for y ∈ A∗2, whether or not y ∈ L2. Now we construct a new TM, M1, deciding membership in L1. Let f : A∗1 → A∗2 be a function realizing the polynomial transformation of L1 to L2, and Mf be a TM computing f in polynomial time. For a given input x ∈ A∗1 the machine M1 works in two stages: in the first, Mf computes f(x); in the second, M1 decides whether f(x) ∈ L2 using the machine M2 as a subroutine. Since x ∈ L1 if and only if f(x) ∈ L2, M1 decides L1. To complete the proof it remains to estimate the complexity of M1. Let Tf(n) and T2(n) be the complexities of Mf and M2 respectively (we may assume T2 is non-decreasing). Then the complexity of M1 is O(Tf(|x|) + T2(|f(x)|)). But |f(x)| ≤ Tf(|x|) (because Mf computes f(x) in Tf(|x|) steps, it must take at least as many steps as the output has letters), therefore the complexity of M1 is O(Tf(|x|) + T2(Tf(|x|))), i.e., is polynomial in |x|. The theorem is proved.
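The two-stage machine M1 from the proof can be sketched as a composition of two functions. This is only an illustration: the toy languages (unary words of even length as L1, binary numerals of even numbers as L2) and all function names are assumptions made for the example, not part of the notes.

```python
def make_decider_for_l1(f, decide_l2):
    """Return a decider for L1, given a transformation f and a decider for L2."""
    def decide_l1(x):
        y = f(x)             # stage 1: M_f computes f(x)
        return decide_l2(y)  # stage 2: M_2 decides whether f(x) is in L2
    return decide_l1


# Toy instance (assumption): L1 = unary words of even length over {"1"},
# L2 = binary numerals of even numbers.
def unary_to_binary(x):
    return bin(len(x))[2:]


def is_even_binary(y):
    return y.endswith("0")


decide_even_unary = make_decider_for_l1(unary_to_binary, is_even_binary)
```

If both `f` and `decide_l2` run in polynomial time, so does the composed decider, which is the content of the complexity estimate above.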
Theorem for the approximation algorithm for Travelling Salesman with triangle inequality
Theorem. (1) The approximation algorithm for Travelling Salesman with triangle inequality is correct (i.e., produces a tour). (2) The algorithm has polynomial complexity. (3) The algorithm has a ratio bound ρ(n) = 2. Proof. (1) By its definition, the Preorder Walk is the Full Walk from which duplicates are deleted. The Full Walk includes all vertices of G, thus L consists of all vertices of G. (2) Straightforward analysis. (3) Let L∗ denote an optimal tour, d(L) be the length of L, d(L∗) be the length of L∗. Observe that a spanning tree for G can be obtained by deleting any edge from any tour. Thus, for a minimal spanning tree T, with sum of weights on all edges d(T), the following inequality holds d(T) ≤ d(L∗). Let W be a Full Walk of T, and d(W) be its length. Since W visits every edge of T exactly twice, d(W) = 2d(T). It follows that d(W) ≤ 2d(L∗). By the triangle inequality, Preorder Walk L is not longer than W: d(L) ≤ d(W). As a result, we have: d(L) ≤ 2d(L∗), or, equivalently, d(L)/d(L∗) ≤ 2
Cook's Theorem: formulation
Theorem. The Boolean Satisfiability problem is NP-complete. Actually we will prove that the CNF-satisfiability problem is NP-complete. Idea of the proof: we represent an arbitrary problem from the class NP by a Nondeterministic Turing Machine M which solves this problem in polynomial time. To prove the theorem we need to produce a function f from the definition of the polynomial transformation. An argument x for this function will be an input of M, while the value f(x) will be a Boolean formula F expressing the statement that M accepts x. If M accepts x, then F = f(x) is satisfiable, else F is not satisfiable. (Go over the long proof in the 3rd set of notes.)
Transitivity of ∝
Theorem. For three languages L1, L2, L3, if L1 ∝ L2 and L2 ∝ L3, then L1 ∝ L3. Proof. Let f1,2 and f2,3 be functions realizing the polynomial transformations L1 ∝ L2 and L2 ∝ L3. Define the function f1,3 as the composition of f1,2 and f2,3: f1,3(x) = f2,3(f1,2(x)). It is easy to check that f1,3 realizes a polynomial transformation L1 ∝ L3: x ∈ L1 if and only if f1,2(x) ∈ L2 if and only if f1,3(x) ∈ L3, and f1,3 is computable in polynomial time because |f1,2(x)| is polynomial in |x|. The theorem is proved.
Theorem: No ratio bound for general Travelling Salesman
Theorem. If P ≠ NP then there is no approximation algorithm for general Travelling Salesman with polynomial complexity and constant ratio bound. Proof. We show how to solve the Hamiltonian Cycle (HC) problem in polynomial time using a hypothetical approximation algorithm for general Travelling Salesman (TS), with polynomial complexity and constant ratio bound ρ, as a subroutine. Since HC is NP-complete, this would imply P = NP. Let G = (V, E) be an input of HC. Construct an input of TS with the set of cities V and the distance function: d(u, v) = 1 if (u, v) ∈ E; d(u, v) = ρ|V| + 1 if (u, v) ∉ E. Clearly, this input of TS can be constructed from G in polynomial time. If G contains a Hamiltonian cycle, then that cycle is an optimal tour of length |V|, so the approximation algorithm returns a tour of length at most ρ|V|. If G does not contain a Hamiltonian cycle, then in any tour there is a pair (u, v) of consecutive cities such that (u, v) ∉ E, thus the length of any tour is at least (ρ|V| + 1) + (|V| − 1) = (ρ + 1)|V| > ρ|V|. Thus, we can decide whether G contains a Hamiltonian cycle by approximately solving the instance of TS: if the length of the solution tour is ≤ ρ|V|, then G contains a Hamiltonian cycle, else G has no Hamiltonian cycles.
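The construction of the TS instance from G can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, ρ is taken to be an integer, and an exhaustive search stands in for the hypothetical approximation algorithm (it has ratio 1 ≤ ρ but is of course not polynomial).

```python
from itertools import permutations


def hc_to_ts(vertices, edges, rho):
    """Build the TS distance function from a Hamiltonian Cycle instance."""
    edge_set = {frozenset(e) for e in edges}
    far = rho * len(vertices) + 1  # distance between non-adjacent vertices
    return {
        (u, v): 1 if frozenset((u, v)) in edge_set else far
        for u in vertices for v in vertices if u != v
    }


def optimal_tour_length(vertices, d):
    """Exhaustive search, a stand-in for the hypothetical approximation algorithm."""
    first, rest = vertices[0], vertices[1:]
    best = None
    for perm in permutations(rest):
        tour = (first,) + perm
        n = len(tour)
        length = sum(d[tour[i], tour[(i + 1) % n]] for i in range(n))
        best = length if best is None else min(best, length)
    return best


def has_hamiltonian_cycle(vertices, edges, rho=2):
    """Decide HC through the threshold test: tour length <= rho * |V|."""
    d = hc_to_ts(vertices, edges, rho)
    return optimal_tour_length(vertices, d) <= rho * len(vertices)
```

On a 4-cycle the optimal tour has length 4 ≤ 2·4, while on a path with 4 vertices every tour uses at least one "far" pair and exceeds the threshold.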
What is the complexity for computing a minimum spanning tree for a graph?
Theorem. There is an algorithm, having polynomial complexity, for computing a minimum spanning tree for a graph. The proof can be conducted by presenting a concrete algorithm. There are two famous ones, due to Kruskal and to Prim, both of which are quite elementary and simple.
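As an illustration of such a concrete algorithm, here is a minimal sketch of Kruskal's algorithm with a simple union-find structure. The input format (vertices 0..n−1, edges as `(weight, u, v)` triples) and the function name are assumptions chosen for the example; the graph is assumed to be connected.

```python
def kruskal(n, weighted_edges):
    """Return (total_weight, tree_edges) of a minimum spanning tree.

    Vertices are 0..n-1; weighted_edges is a list of (weight, u, v).
    The graph is assumed to be connected.
    """
    parent = list(range(n))

    def find(x):
        # Find the representative of x's component (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(weighted_edges):  # edges in order of increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two different components, so keep it
            parent[ru] = rv
            total += w
            tree.append((u, v))
    return total, tree
```

Sorting dominates the running time, so the complexity is polynomial in the size of the graph.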
Input encoding
There can be many ways of encoding an input for an algorithm (so that it is suitable for a Turing machine). E.g., an undirected graph with edges (v5, v7), (v1, v3), (v1, v7) could be encoded as a word in the alphabet {v, 0, 1, 2, . . . , 9}: v5v7v1v3v1v7, or it could be represented by its adjacency matrix. Writing the rows of the adjacency matrix in a single line end-to-end we get an encoding of the graph as a word in the alphabet {0, 1}. Input encodings may differ in length but they shouldn't differ by more than a polynomial factor. For any computational problem Π and any two reasonable encoding schemes S1, S2 there exists a polynomial p(n) such that for any input I the following inequality holds: |S1(I)| ≤ p(|S2(I)|), where |Si(I)| denotes the length of the code Si(I). A scheme is "reasonable" if it is "compact" enough, that is, if it does not lead to artificial "blowing up" of the input. Blowing up can increase the input so much that an exponential-time algorithm turns into a polynomial-time one.
Theorem 2. The relation ∝T is ...
Transitive. The proof is straightforward.
Class P examples
(1) "Even unary non-negative numbers" is in P, take n + 1 as polynomial p(n) from the definition; (2) "Palindromes" is in P, p(n) = cn² for a large enough positive constant c; (4) "Primality" is in P. This is the problem of deciding for a given integer N > 0, represented in decimal (or binary) form, whether N is prime. "Traveling Salesman" and "Satisfiability" are probably not in P, but no proof is known for either of them.
Theorem. If P ≠ NP, then
(1) NPI ≠ ∅; (2) there exist problems A, B ∈ NPI such that neither A ∝ B, nor B ∝ A. The proof is nontrivial, we don't consider it in this unit (see the book by M. Garey and D. Johnson). Item (2) of the Theorem states that under the hypothesis P ≠ NP the set NPI is divided into more than one equivalence class with respect to polynomial transformation.
Theorem for the approximation algorithm for vertex covering
(1) The approximation algorithm for Vertex Covering is correct (i.e., indeed produces a covering). (2) The algorithm has polynomial complexity. (3) The algorithm has a ratio bound ρ(n) = 2. Proof. (1) Correctness of the algorithm is obvious since it loops until every edge is covered by some vertex. (2) Straightforward observation. (3) Let A denote the set of all edges that are chosen during the execution of step (3) of the algorithm. No two edges of A have a common vertex, since once the edge (v, w) is chosen, all edges having either v or w as a vertex are deleted. Thus, each execution of step (3) adds exactly two new vertices to S, and |S| = 2|A|. Any covering, in particular an optimal one S∗, must cover every edge in A, so it must include at least one endpoint of each edge in A; since no two edges in A share an endpoint, any covering has at least |A| vertices. So |S∗| ≥ |A| = |S|/2. It follows that |S|/|S∗| ≤ 2.
Examples of NP problems
(1) Travelling Salesman ∈ NP. To show this, we need to construct an NTM for solving this problem with polynomial complexity. In the guessing stage the NTM produces an arbitrary permutation of cities, and in the verifying stage, working like an "ordinary" TM, it checks whether the length of the guessed tour does not exceed the bound B. An accepting computation for an input x exists if and only if there is a short tour. This computation consists of writing out a guess and finding the corresponding sum of all distances. Clearly the complexity of this computation is polynomial. According to the definition, a non-accepting computation does not contribute to the complexity. (2) Boolean satisfiability ∈ NP. The NTM first guesses a satisfying assignment of truth values to variables, and then verifies the guess in polynomial time.
What is the lower bound for the problem of sorting by comparisons?
Ω(n log n) (see the Examples of decision and computation trees pdf). The complexity of the sorting problem in this model is the height of a tree as a function of the size of the input. Since the number of leaves of a tree for sorting n elements must be at least the number of all possible outputs, i.e., n!, the height cannot be less than Ω(n log n). Indeed, for any binary tree with l leaves and height h we have h ≥ log₂ l, so h ≥ log₂(n!) ≥ log₂((n/2)^(n/2)) = (n/2) log₂(n/2) = (n/2)(log₂ n − 1) ≥ c n log₂ n for a constant c > 0 and all large enough n, i.e., h = Ω(n log n).
Definition. Let L ⊂ Σ∗ be a language over an alphabet Σ. The set of words L(bar)
= Σ∗ \ L is called the complement of L. Observe that L(bar) is not necessarily a language.
Proof that q-clique problem is NP complete.
A Nondeterministic Turing Machine guesses a q-clique and then checks the choice in polynomial time. Therefore, q-clique ∈ NP. Now it is sufficient to prove that CNF Satisfiability ∝ q-clique by constructing an appropriate transformation function f. Let F = F1 ∧ · · · ∧ Fq be a CNF-formula with Fi = (yi1 ∨ · · · ∨ yiki). The value f(F) is, firstly, a graph G(V, E) whose vertices are in one-to-one correspondence with the set of all positions of literals, and, secondly, the integer q. For two vertices ij and kl there is an edge (ij, kl) ∈ E if and only if (1) i ≠ k and (2) yij ≠ ¬ykl. Obviously, the defined function f can be computed in polynomial time. Let F be a satisfiable formula with a satisfying vector of truth values. Then each conjunction member Fi contains a literal yij with value t. Choosing one such vertex ij for every 1 ≤ i ≤ q gives a q-clique. Conversely, given a q-clique, set every literal corresponding to a vertex of the clique to true; this is consistent by (2), and by (1) the q vertices lie in q distinct conjunction members, so every Fi contains a true literal. All other variables can be set arbitrarily.
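The transformation f can be sketched in code. The encoding of literals as nonzero integers (with −x standing for ¬x) and the function names are assumptions made for this illustration; the exhaustive clique check is exponential and is included only to try the reduction on tiny formulas.

```python
from itertools import combinations


def cnf_to_clique(clauses):
    """Map a CNF formula to (vertices, edges, q) as in the proof.

    A literal is a nonzero integer, -x standing for the negation of x
    (an encoding assumed here for illustration).
    """
    # One vertex per literal position (i, j): j-th literal of the i-th clause.
    vertices = [(i, j) for i, clause in enumerate(clauses)
                for j in range(len(clause))]
    edges = set()
    for (i, j), (k, l) in combinations(vertices, 2):
        # (1) different clauses, (2) literals are not complementary
        if i != k and clauses[i][j] != -clauses[k][l]:
            edges.add(((i, j), (k, l)))
    return vertices, edges, len(clauses)


def has_clique(vertices, edges, q):
    """Exhaustive check, exponential in general; only for tiny examples."""
    return any(
        all((a, b) in edges for a, b in combinations(group, 2))
        for group in combinations(vertices, q)
    )
```

For example, `has_clique(*cnf_to_clique([[1, 2], [-1, 2], [1, -2]]))` holds because the formula is satisfied by setting both variables to true.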
Can we modify the Turing machine to change the complexity of recognising palindromes?
A difficult theorem states that for any TM solving the palindrome problem the complexity is Ω(n²). That theorem remains true for all "reasonable" modifications of the concept of TM with one tape and one working head. If we, however, allow two tapes (with one working head per tape) it's trivial to construct a machine for recognizing palindromes in linear time O(n).
2 Tape-Turing machine (non-detailed explanation) and how it can solve the recognising palindrome problem in O(n)
A formal definition of 2-tape Turing Machine (2TM) straightforwardly follows the pattern of the definition of ordinary TM. Not going into the details, let us mention that the input code appears on the tape 1, on a given step the machine performs writing-erasing and shifting independently and simultaneously on each tape. Thus, the transition function is defined on triples (state, symbol on tape 1, symbol on tape 2), and its value is a 5-tuple (new state, new symbol on tape 1, new symbol on tape 2, direction of shift on tape 1, direction of shift on tape 2). A 2TM for palindrome recognition first copies the input code on tape 2 (O(n) steps), then returns the working head on tape 1 to the beginning of the input (O(n) steps), and then compares the words on both tapes reading the one on tape 1 from left to right, and the one on tape 2 from right to left (O(n) steps). The total amount of steps is O(n)
Complexity lower bounds
A function L(n) is called a (complexity) lower bound for a computational problem L in a given class M of models of computation (e.g. Turing machines) if for any algorithm from M which solves L with complexity T(n) the following relation holds: T(n) = Ω(L(n))
Optimal Travelling Salesman (OptTS) problem
A modification of Travelling Salesman problem as a search problem in which the output is the tour of the smallest length.
What class of problems does 1-CNF and 2-CNF fall into?
For a formula F in 1-CNF (a conjunction of literals), an obvious polynomial-time algorithm for deciding satisfiability is based on the observation that F is not satisfiable if and only if F contains a pair of literals (xi, xj) such that xi = ¬xj. The algorithm examines all the pairs and checks this property. For formulae in 2-CNF there is a much less trivial polynomial-time algorithm (called the method of resolutions). So both problems are in P.
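The 1-CNF test can be sketched as below. Literals are encoded as nonzero integers with −x standing for ¬x (an assumption for the example), and a set lookup replaces the explicit examination of all pairs described above; the answer is the same.

```python
def one_cnf_satisfiable(literals):
    """Decide satisfiability of a conjunction of literals.

    The formula is unsatisfiable exactly when some literal occurs
    together with its negation.
    """
    seen = set(literals)
    return not any(-x in seen for x in seen)
```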
Complexity for Turing machines
Complexity or running time or working time of a TM which terminates on all inputs is a function T : N −→ N from natural numbers to natural numbers defined by equality: T(n) = max{m : there exists an input w consisting of n letters such that TM uses m steps.}
k-CNF
Definition. A Boolean formula in conjunctive normal form (CNF) such that each conjunction member ("clause") contains at most k literals is said to be in k-CNF
Polynomial equivalence
Definition. A language L1 is polynomially equivalent to a language L2 if L1 ∝ L2 and L2 ∝ L1. We will denote this relation by L1 p.e. L2.
NP-hard
Definition. A search problem L is NP-hard if for any language L' ∈ NP the following holds: L' ∝T L.
Turing transformation: definition
Definition. Let L1,L2 be search problems, herewith L2 is a problem of computing a function g of the form g : Γ∗ −→ Γ∗. We say that L1 is Turing transformable to L2 (notation: L1 ∝T L2) if there exists OTMg for solving L1 using as oracle the function g and having polynomial complexity
Yes-No problem / Language equivalence
Every Yes - No computational problem corresponds to a language over a certain alphabet, namely to the set of all input codes having output Yes. Conversely, for a given (by a formal grammar) language L over an alphabet A one considers a Yes - No membership problem: Input : A word w in A. Output : Yes if w ∈ L, else No
Approximation algorithm for Vertex Covering
First we recall Vertex Covering in its optimization version. Input : Graph G = (V, E). Output : A minimal S ⊂ V such that for every (v, w) ∈ E either v ∈ S or w ∈ S (here "minimal" means having the smallest possible number of elements). Approximation algorithm for Vertex Covering: (1) S ← ∅; (2) E' ← E; (3) while E' ≠ ∅ do: let (v, w) be an arbitrary edge of E'; S ← S ∪ {v, w}; remove from E' every edge having either v or w as a vertex; (4) return S. An example of how this algorithm works is presented on a separate sheet.
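Steps (1) to (4) translate almost line by line into code. The sketch below assumes edges are given as pairs of hashable vertex labels and always picks the first remaining edge as the "arbitrary" one; the function name is illustrative.

```python
def approx_vertex_cover(edges):
    """2-approximation for Vertex Covering, following steps (1)-(4)."""
    cover = set()              # (1) S <- empty set
    remaining = list(edges)    # (2) E' <- E
    while remaining:           # (3) loop until E' is empty
        v, w = remaining[0]    # an arbitrary edge of E'
        cover.update((v, w))
        # Drop every edge with an endpoint already in the cover; edges touching
        # earlier choices were removed before, so this removes the edges at v or w.
        remaining = [(a, b) for a, b in remaining
                     if a not in cover and b not in cover]
    return cover               # (4) return S
```

On the path 0, 1, 2, 3 the sketch returns all four vertices, while the optimal covering {1, 2} has two, which meets the ratio bound 2 exactly.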
If L ∈ NP, is the complement L(bar) ∈ NP?
For all problems from NP the complement is indeed a language, however it is not at all obvious that if L ∈ NP, then L(bar) ∈ NP. That is because the definition of NTM (unlike the definition of Yes - No TM) is non-symmetrical with respect to Yes - No output. Example. For the problem CNF SAT, the complement is the following Yes - No problem. Input: Boolean formula F in conjunctive normal form. Output: Yes if F is not satisfiable (i.e., identically false), else No. There is no apparent way of taking advantage of the non-determinism (i.e., the guessing module) for solving this problem. There seems to be nothing essentially better than just examining one by one every true/false assignment to all variables and checking whether it gives the value false to F.
Examples of Turing Machine complexity for recognising palindromes
For each letter a of the input alphabet Σ TM has two states qa, q'a herein for two distinct letters a, b ∈ Σ the states qa, q'a, qb, q'b are pair-wise distinct. Let the input code be a1a2 · · · an. TM "remembers" a1 in the input code by adopting the corresponding state qa1, erases a1 and, remaining in the state qa1, moves to the end of the input code. The last letter an in the input is the one appearing immediately before the first occurrence of the blank b. If an does not coincide with a1 (does not match the state qa1), then TM terminates its work by adopting the state qN , else TM erases the last letter an = a1, "remembers" an−1 by adopting the state q'an−1, erases an−1, and moves to the beginning of the word on tape which now starts with a2. If a2 does not coincide with an−1, then TM adopts the state qN , else it erases a2, "remembers" a3, and the cycle continues in a similar way. It is easy to understand how TM should behave on the last steps in case the input is indeed a palindrome. The complexity of the described TM is O(n²).
What is the formula that has the interpretation G1: At any moment i the machine M is in exactly one state.
G1 : Q[i, 0] ∨ Q[i, 1] ∨ · · · ∨ Q[i, r], 0 ≤ i ≤ p(n); ¬Q[i, j] ∨ ¬Q[i, j′], 0 ≤ i ≤ p(n), 0 ≤ j < j′ ≤ r.
Six groups G1, . . . , G6 of disjunctions needed in the CNF Boolean formula F (to prove Cook's theorem)
G1: At any moment i the machine M is in exactly one state. G2: At any moment i the working head observes exactly one cell. G3: At any moment i every cell contains exactly one letter from Γ. G4: At the moment 0 the computation is in the initial configuration of the verification stage with input x. G5: Not later than after p(n) steps the machine M adopts the state qY. G6: For any moment i, the configuration of M at the moment i + 1 is obtained from the configuration at the moment i by a single application of the transition function δ. It is clear that the disjunctions in groups G1, . . . , G6 are simultaneously true if and only if M accepts the input x.
What is the formula that has the interpretation G2: At any moment i the working head observes exactly one cell.
G2 : H[i, −p(n)] ∨ H[i, −p(n) + 1] ∨ · · · ∨ H[i, p(n) + 1], 0 ≤ i ≤ p(n); ¬H[i, j] ∨ ¬H[i, j′], 0 ≤ i ≤ p(n), −p(n) ≤ j < j′ ≤ p(n) + 1.
What is the formula that has the interpretation G3: At any moment i every cell contains exactly one letter from Γ.
G3 : S[i, j, 0] ∨ S[i, j, 1] ∨ · · · ∨ S[i, j, v], 0 ≤ i ≤ p(n), −p(n) ≤ j ≤ p(n) + 1; ¬S[i, j, k] ∨ ¬S[i, j, k′], 0 ≤ i ≤ p(n), −p(n) ≤ j ≤ p(n) + 1, 0 ≤ k < k′ ≤ v. Here v is the largest index of a letter in the tape alphabet Γ = {s0, . . . , sv}.
Candidates for problems in NPI
Graph Isomorphism and Linear Programming. The status of Graph Isomorphism is still unknown, but Linear Programming was found to be in P.
Boolean satisfiability problem (and obvious algorithm)
Input : A Boolean formula F with variables x1, . . . , xn. Output : Yes if F is satisfiable, else No. The obvious "brute force" algorithm examines in arbitrary order all possible evaluations of the variables. If under some evaluation the value of F is true, then F is satisfiable and the output is Yes, else the output is No (up to 2ⁿ evaluations).
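A minimal sketch of the brute-force algorithm follows. Representing the formula F as a Python function on a tuple of n truth values is an assumption made for the example.

```python
from itertools import product


def brute_force_sat(formula, n):
    """Try all 2**n evaluations; formula takes a tuple of n booleans."""
    return any(formula(values) for values in product((False, True), repeat=n))
```

For instance, `brute_force_sat(lambda v: (v[0] or v[1]) and not v[0], 2)` finds the evaluation x1 = false, x2 = true.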
Travelling salesman problem (and obvious algorithm)
Input : A finite set of "cities" {1, . . . , n} and a set of positive integer "distances" d(i, j) between i and j for each pair i < j (with d(i, j) = d(j, i)). Output : A permutation i1, . . . , in of cities 1, . . . , n such that the sum ∑(1≤j≤n−1) d(ij, ij+1) + d(i1, in) is as small as possible. This sum is the length of a tour starting in the city i1, visiting all the cities in a certain order and returning back to i1. The obvious algorithm is to try every permutation of the cities (n! permutations).
Yes-No travelling salesperson problem
Input : A finite set of "cities" {1, . . . , n}, a set of integer "distances" d(i, j) > 0 between i and j for each pair i < j, and a natural number B > 0. Output : Yes, if there exists a permutation i1, . . . , in of cities 1, . . . , n such that ∑(1≤j≤n−1) d(ij , ij+1) + d(i1, in) ≤ B, else No
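The Yes-No version can be illustrated with a certificate check (the part an NTM would do in its verifying stage) and the obvious exhaustive decision procedure. The 0-based city numbering, the symmetric matrix `d`, and the function names are assumptions made for the example.

```python
from itertools import permutations


def tour_length(d, tour):
    """Length of the closed tour, including the return to the first city."""
    n = len(tour)
    return sum(d[tour[j]][tour[(j + 1) % n]] for j in range(n))


def verify_certificate(d, bound, tour):
    """Polynomial-time check that tour is a permutation of length <= bound."""
    return sorted(tour) == list(range(len(d))) and tour_length(d, tour) <= bound


def decide_ts(d, bound):
    """Exhaustive decision procedure: try every tour starting at city 0."""
    n = len(d)
    return any(verify_certificate(d, bound, (0,) + p)
               for p in permutations(range(1, n)))
```

Fixing city 0 as the start loses nothing, since every tour is a cycle; the search still examines (n − 1)! permutations.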
membership to a circle problem
Input : A point (x, y) ∈ R². Output : Yes if and only if (x, y) satisfies the inequality x² + y² ≤ 1.
The following is k-Vertex Covering problem
Input : Graph G = (V, E) and natural number k. Output : Yes if G has a vertex covering consisting of k vertices, else No
Minimum spanning tree problem
Input : Graph G = (V, E), weight function w(u, v)≥ 0 for every edge (u, v) ∈ E. Output : Subgraph T of G with the set of vertices V , herein T is a tree and the value ∑((u,v)∈T) w(u, v) is minimal possible.
Travelling Salesman with triangle inequality. The optimization version of the Travelling Salesman problem is formulated as follows
Input : Integral matrix of distances ||dij||, 1 ≤ i, j ≤ m, dij > 0. Output : Permutation i1, i2, . . . , im (called "tour") of numbers 1, 2, . . . , m such that the value ∑(1≤j<m) d(ij, ij+1) + d(im, i1) is minimal possible. Travelling Salesman with triangle inequality has an additional property of the input: it is required that for any three indices i, j, k the following inequality holds: dij ≤ dik + dkj. It can be proved (not a part of this course) that Travelling Salesman with triangle inequality is NP-hard.
Linear Programming problem
Input : System of linear inequalities in several variables with integer coefficients. Output : Yes if the system is consistent (i.e., has a solution) in real numbers, else No. The status of Linear Programming was open for a long time until in late seventies a polynomial-time algorithm was discovered. Even before that it seemed quite unlikely for the problem to be NP-complete since the complement to Linear Programming is in NP (follows from so-called Duality Theorem in linear programming). If Linear Programming was NP-complete, then, by Theorem from previous section, NP = co−NP, which is probably wrong.
Graph Isomorphism problem
Input: Two graphs G = (V, E) and G' = (V', E'). Output: Yes if there is a bijective (one-to-one) map (function), called isomorphism, f : V −→ V' such that (v, w) ∈ E is equivalent to (f(v), f(w)) ∈ E' ; else No. The status of Graph Isomorphism is unknown. Despite many efforts no polynomial time algorithm was found. On the other hand, the problem seems to be too "rigid" to be NP-complete.
Isomorphism to Subgraph problem
Input: Two graphs G = (V, E) and G' = (V', E'). Output: Yes if there is a subgraph in G which is isomorphic to G'; else No. Isomorphism to Subgraph is NP-complete problem which is proved by polynomially transforming the k-clique problem to it (take as G' the complete graph with k vertices).
What is the formula that has the interpretation G6: For any moment i, the configuration of M at the moment i + 1 is obtained from the configuration at the moment i by a single application of the transition function δ.
G6.1 : ¬S[i, j, l] ∨ H[i, j] ∨ S[i + 1, j, l], with i, j, l in the same ranges as in G6.2 below. Interpretation: if the working head at the moment i does not observe the cell j, then the letter in the cell j will not change at the moment i + 1 (the disjunction is equivalent to (S[i, j, l] ∧ ¬H[i, j]) → S[i + 1, j, l]). G6.2 : For every 4-tuple i, j, k, l such that 0 ≤ i < p(n), −p(n) ≤ j ≤ p(n) + 1, 0 ≤ k ≤ r, 0 ≤ l ≤ v the following three disjunctions: ¬H[i, j] ∨ ¬Q[i, k] ∨ ¬S[i, j, l] ∨ H[i + 1, j + ∆], ¬H[i, j] ∨ ¬Q[i, k] ∨ ¬S[i, j, l] ∨ Q[i + 1, k′], ¬H[i, j] ∨ ¬Q[i, k] ∨ ¬S[i, j, l] ∨ S[i + 1, j, l′], where ∆, k′, l′ are defined as follows: if qk ∈ Q \ {qY, qN}, then δ(qk, sl) = (qk′, sl′, ∆); if qk ∈ {qY, qN}, then ∆ = 0, k′ = k, l′ = l. Interpretation: Passing from one configuration of M to the next is done according to the transition function δ.
Theorem 1. For any two Yes - No problems (languages L1,L2), if L1 ∝ L2, then
L1 ∝T L2. Proof. Trivial, since, as we have seen before, ∝ is a particular case of ∝T. Remark. It is not known whether the converse of Theorem 1 is true or not for languages.
Theorem. k-Vertex Covering problem is NP-complete.
Lemma. S ⊂ V is a q-clique in G if and only if V \ S is a (|V| − q)-vertex covering in G(bar). Proof. Let S be a q-clique in G, let v ∈ S. Then either there is no edge from v in G(bar), or every edge (v, w) from v in G(bar) has w ∈ V \ S. Thus, V \ S is a vertex covering. Conversely, let V \ S be a vertex covering of G(bar), then there are no edges (v, w) in G(bar) with v, w ∈ S. It follows that S is a clique in G. Proof of the Theorem. It is sufficient to show that q-Clique ∝ k-Vertex Covering. We construct the polynomial transformation f by setting f(x) = (G(bar), |V| − q) for an argument x = (G, q).
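The transformation f(G, q) = (G(bar), |V| − q) can be sketched as follows; the representation of graphs as vertex lists with edge pairs and the helper names are assumptions made for this illustration.

```python
from itertools import combinations


def clique_to_vertex_cover(vertices, edges, q):
    """Map a q-Clique instance (G, q) to the instance (complement of G, |V| - q)."""
    present = {frozenset(e) for e in edges}
    complement = [(u, v) for u, v in combinations(vertices, 2)
                  if frozenset((u, v)) not in present]
    return vertices, complement, len(vertices) - q


def is_clique(edges, s):
    """Check that every two vertices of s are joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(s, 2))


def is_vertex_cover(edges, s):
    """Check that every edge has at least one endpoint in s."""
    return all(u in s or v in s for u, v in edges)
```

For a triangle {0, 1, 2} with a pendant vertex 3 attached to 2, the clique S = {0, 1, 2} corresponds to the covering V \ S = {3} of the complement graph, as the Lemma states.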
a language
Let A be a finite alphabet. An arbitrary (finite or infinite) set of words (strings of letters) in A is called a language (over A). Remark. The set of words in the definition of the language is supposed to be recursively enumerable, i.e., generated by a formal grammar
Approximation algorithm for Travelling Salesman with triangle inequality
Let G = (V, E) be a complete graph with the set of vertices V = {1, 2, . . . , n} and weights w(i, j) = dij where dij are the distances from the input of Traveling Salesman. (1) Select a vertex r ∈ V to serve as a root for the future tree; (2) Build a minimum spanning tree T for G with the root r using a polynomial-time minimum spanning tree algorithm as a subroutine; (3) Construct the list L of vertices being a Preorder Walk of T; (4) Return L as an approximate tour in G.
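Steps (1) to (4) can be sketched as below. The sketch uses Prim's algorithm as the minimum spanning tree subroutine (the notes allow any polynomial-time one), numbers the vertices from 0, and takes the distances as a symmetric matrix `d`; these choices and the function name are assumptions for the example.

```python
def approx_tsp_tour(d, root=0):
    """Preorder walk of a minimum spanning tree of the complete graph."""
    n = len(d)
    # (2) Prim's algorithm, recording each vertex's children in the tree.
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known connection to the tree
    parent = [None] * n
    best[root] = 0
    children = {v: [] for v in range(n)}
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and d[u][v] < best[v]:
                best[v], parent[v] = d[u][v], u
    # (3) Preorder walk: visit a vertex, then its subtrees in order.
    tour, stack = [], [root]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour  # (4) the approximate tour
```

When the distances satisfy the triangle inequality, the length of the returned tour (including the return to the root) is at most twice the optimum.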
q-clique problem
Let G = (V, E) be an undirected graph. A clique in G is a subset S ⊂ V such that for any two vertices v1, v2 ∈ S there exists an edge (v1, v2) ∈ E. The following is q-Clique problem: Input : Graph G = (V, E) and an integer q > 0. Output : Yes if G has a clique consisting of q vertices, else No.
Vertex covering
Let G = (V, E) be an undirected graph. Vertex covering of G is a subset S ⊂ V such that for any edge (v1, v2) ∈ E either v1 ∈ S or v2 ∈ S
Let L ∈ NP. Complexity relations between P and NP
Let L ∈ NP. Then there exist a polynomial p(n) and a TM for solving L with complexity O(2^(p(n))). Proof. Let N be an NTM for solving L with complexity TN(n) ≤ r(n) where r(n) is a polynomial. By the definition of NTM, for every accepted input x with |x| = n there is a guess word in the tape alphabet Γ of length not greater than r(n) (because the NTM has to process it in at most r(n) steps) such that N adopts the state qY in no more than r(n) steps. The total number of possible guesses is less than |Γ|^(r(n)+1). Now we construct a (deterministic) TM M as follows. M examines every possible guess in turn, and for each of them runs the verification stage of N for up to r(n) steps. That will take O(r(n)|Γ|^(r(n))) steps. By the definition of the O-symbol, that's the same as O(2^(p(n))) for a certain polynomial p(n). The theorem is proved.
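The deterministic machine M from the proof can be sketched as a loop over all guess words of length at most r. The verifier shown (for CNF satisfiability, with literals as nonzero integers and guesses as 0/1 strings) and all names are assumptions chosen for illustration.

```python
from itertools import product


def deterministic_decide(instance, verify, alphabet, r):
    """Examine every guess word of length <= r and run the verifier on it."""
    for length in range(r + 1):
        for guess in product(alphabet, repeat=length):
            if verify(instance, "".join(guess)):
                return True   # some guess leads to the state qY
    return False              # no accepting computation exists


def verify_sat(instance, guess):
    """Toy verifier (assumption): instance = (n, clauses), guess = 0/1 string."""
    n, clauses = instance
    if len(guess) != n:
        return False
    return all(
        any((guess[abs(lit) - 1] == "1") == (lit > 0) for lit in clause)
        for clause in clauses
    )
```

The loop runs the verifier fewer than |Γ|^(r+1) times, which matches the count of guesses used in the proof.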
Polynomial transformation: definition
Let L1 be a language over an alphabet A1, and L2 be a language over an alphabet A2. A∗i will denote the set of all words over Ai for i = 1, 2. Definition. We say that L1 is polynomially transformable to L2 if there exists a function f : A∗1 → A∗2 such that (1) f is computable in polynomial time; (2) for any x ∈ A∗1 x ∈ L1 if and only if f(x) ∈ L2. Notation: L1 ∝ L2.
Turing transformation: informal definition and generalisation for search problems
Let us first recall that for two languages L1,L2 the relation L1 ∝ L2 informally means that there is an algorithm such that: (1) for given input word x1 of the first problem the algorithm computes an input word x2 of the second problem, herewith x1 ∈ L1 if and only if x2 ∈ L2; (2) the algorithm uses a "subroutine" to solve the problem 2 on the input x2 and finds output y2; (3) y2 is also the answer for problem 1. The algorithm uses polynomial time for all its work, except the "subroutine". Now we generalize this informal definition for search problems L1 and L2: We say that L1 is Turing transformable to L2 if there exists an algorithm which for a given input x1 of L1 computes the required output using (zero, or one, or more times) an algorithm for the problem L2 as a subroutine. The running time of the algorithm is polynomial if each call of the subroutine is counted as one elementary step.
Theorem. Let L be an NP-complete problem such that L(bar) ∈ NP. Then
NP = co−NP. Proof is relatively straightforward. We don't consider it in this course.
Theorem. TS ∝T OptTS, therefore OptTS is...
NP-hard. Proof. The OTM for TS with threshold B first uses the oracle to solve OptTS with the same set of cities and distances as the input of TS (one step of computation), then in polynomial time finds the length l of the produced optimal tour and checks whether B ≥ l. If the latter inequality is true, then the answer is Yes, else No. Observe that this example does not use the relation ∝T in full force, addressing the oracle only once.
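The oracle machine from the proof can be sketched with the oracle passed in as a function. The brute-force oracle below is only a stand-in so that the example runs (a real OTM counts the oracle call as one step); the 0-based distance matrix and all names are assumptions.

```python
from itertools import permutations


def tour_length(d, tour):
    """Length of the closed tour, including the return to the first city."""
    n = len(tour)
    return sum(d[tour[j]][tour[(j + 1) % n]] for j in range(n))


def brute_force_opt_ts(d):
    """Stand-in oracle for OptTS: returns an optimal tour by exhaustive search."""
    n = len(d)
    candidates = ((0,) + p for p in permutations(range(1, n)))
    return min(candidates, key=lambda t: tour_length(d, t))


def solve_ts(d, bound, oracle=brute_force_opt_ts):
    """Decide TS with threshold bound using a single oracle call."""
    tour = oracle(d)                      # one oracle step
    return tour_length(d, tour) <= bound  # polynomial-time check B >= l
```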
Nondeterministic Turing Machine
A Nondeterministic Turing Machine (NTM) consists of the same ingredients as an "ordinary" Turing Machine, together with a guessing module equipped with a guessing head; the two differ in how they work. Initially the working head observes cell 1 and the guessing head observes cell −1. NTM works in two stages: (1, guessing stage) The guessing head moves along the tape from right to left, shifting to the next cell on each step. On each step the guessing head writes a letter from Γ on the tape. The process may or may not terminate. During this stage the control module of NTM and its working head remain passive; NTM is not in any "state". When (if ever) the guessing stage terminates, NTM moves to the second stage. (2, verifying stage) NTM adopts the initial state q0 and works as an ordinary TM with the combination of the "guessed word" and the original input word as its input. During this second stage the guessing module and its guessing head are passive.
O-notation definition
O(g(n)) = {f(n) : there exist constants c > 0 and n₀> 0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n₀} Convention. If f(n) ∈ O(g(n)), then we write: f(n) = O(g(n)).
Oracle Turing Machine (OTM)
Oracle Turing Machine (OTM) differs from an ordinary (deterministic) TM with arbitrary output in the following way. Apart from the usual working tape, OTM has an additional oracle tape equipped with an oracle read-write head which is attached to the control module. OTM works like an ordinary TM, but can at an arbitrary moment adopt (via an "asking" state) a special asking mode in which: (1) the oracle head writes a word (which was generated during the previous normal work of the machine) on the oracle tape (oracle input); (2) after (1) is done, the oracle head writes a word on the oracle tape (oracle output). Passage from (1) to (2) is determined by a function g : Γ∗ −→ Γ∗, where Γ is the tape alphabet. Each application of the asking mode counts as one step for complexity. Since OTM depends on a concrete built-in function g, the notation OTMg might be more appropriate. The formal definition of OTM and its complexity is straightforward.
How is the hypothesis NP ≠ co−NP stronger than P ≠ NP?
P = co−P, thus if P = NP, then NP = co−NP. Hence NP ≠ co−NP implies P ≠ NP.
Give two examples of equivalence classes of the relation 'polynomial equivalence'
P and the class of NP-complete languages, i.e., any two problems in the class P are p.e. (apart from the trivial languages ∅ and Σ∗) and any two NP-complete languages are p.e.
Class P
P is the class of all Yes - No computational problems (languages) L such that there exists TM and a polynomial p(n) such that TM solves L with complexity T(n) ≤ p(n) for all n ≥ 1
Theorem. P ⊂ NP.
Proof. Let L be a Yes - No problem from P, and M be a (deterministic) TM for solving L in polynomial time. We get an NTM for solving L in polynomial time by imitating M in the verification stage and ignoring the guessing stage. The theorem is proved.
What is the formula that has the interpretation G4: At the moment 0 the computation is in the initial configuration of the verification stage with input x.
Q[0, 0], H[0, 1], S[0, 0, 0], S[0, 1, k1], S[0, 2, k2], . . . , S[0, n, kn], S[0, n + 1, 0], S[0, n + 2, 0], . . . , S[0, p(n) + 1, 0], where x = sk1sk2· · · skn.
What variables do we need in the CNF Boolean formula F (to prove Cook's theorem)? (Use some parts of the machine M.) As any NTM, the machine M has Γ = {s0 = b, s1, s2, . . . , sv}, the tape alphabet; Q = {q0, q1 = qY, q2 = qN, q3, . . . , qr}, the set of states; δ, the transition function such that δ(qk, sl) = (qk′, sl′, ∆), where ∆ ∈ {−1, 0, 1} plays the role of T ∈ {L, S, R} in our previous version of NTM. Let p(n) be a polynomial with integer coefficients such that the complexity T_M(n) of M satisfies the inequality T_M(n) < p(n) for all n ≥ 1.
Q[i, k], 0 ≤ i ≤ p(n), 0 ≤ k ≤ r: at the moment i the machine M is in the state qk. H[i, j], 0 ≤ i ≤ p(n), −p(n) ≤ j ≤ p(n) + 1: at the moment i the working head observes the cell j. S[i, j, k], 0 ≤ i ≤ p(n), −p(n) ≤ j ≤ p(n) + 1, 0 ≤ k ≤ v: at the moment i the cell j contains the letter sk.
What is the formula that has the interpretation G5: Not later than after p(n) steps the machine M adopts the state qY .
Q[p(n), 1].
How do you prove NP-completeness of a Yes - No problem?
Show that: (1) L ∈ NP; (2) for any problem L′, if L′ ∈ NP, then L′ ∝ L. OR, instead of (2): for at least one already known NP-complete problem L′ it holds that L′ ∝ L.
Complexity of the membership to a circle
The complexity of this algorithm (the height of the tree) is 5. This is quite natural since we assume that the size of the input is a constant (= 2). (See the "Examples of decision and computation trees" pdf.)
Turing machine
A Turing Machine consists of the following ingredients: (1) Tape alphabet Γ = {a1, . . . , al, b} (where b denotes blank); (2) Input alphabet Σ ⊂ Γ \ {b}; (3) Set of states Q = {q0, . . . , qm, qY , qN }, where q0 is the initial state, qY is the final Yes-state, qN is the final No-state; (4) Transition function δ : (Q \ {qY , qN }) × Γ −→ Q × Γ × {R, S, L}. An input word is written on a tape that is infinite in both directions and is divided into cells. Each cell of the tape contains one letter from Γ. Before the computation starts, an input word is written on the tape occupying subsequent cells from left to right, the TM is in the initial state q0, and its "working head" observes the leftmost letter of the input word (or, in other words, the cell in which that letter is written). At a generic step the TM is in a state qi and its working head observes a certain cell containing a letter aj. The TM applies the transition function δ to the pair (qi, aj) to obtain a triple (ql, ar, T) where T ∈ {R, S, L}. The TM adopts the state ql, replaces aj by ar in the cell observed, and shifts the working head one cell to the right (if T = R) or to the left (if T = L) or does not shift the head at all (if T = S). The computation process terminates if and only if the TM adopts either the state qY or the state qN. We will say that a TM accepts an input w of a Yes - No problem if the computation terminates on w with the state qY. Otherwise the TM terminates on w with the state qN, and it does not accept w. In the first case we will also say that the output on w is Yes, in the second case we will say that the output on w is No.
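The generic step described above translates almost directly into a simulator. The following is a minimal sketch, not part of the original notes: the example machine EVEN_LENGTH, the dictionary encoding of δ, and the max_steps guard are assumptions made for illustration.

```python
from collections import defaultdict

BLANK = "b"  # the blank symbol b from the definition

# Hypothetical example machine over the input alphabet {a}:
# it answers Yes exactly when the input has an even number of letters.
EVEN_LENGTH = {
    ("q0", "a"): ("q1", "a", "R"),
    ("q1", "a"): ("q0", "a", "R"),
    ("q0", BLANK): ("qY", BLANK, "S"),
    ("q1", BLANK): ("qN", BLANK, "S"),
}

SHIFT = {"R": 1, "S": 0, "L": -1}


def run_tm(delta, word, max_steps=10_000):
    """Simulate a TM given by delta on word; return (accepted, steps)."""
    tape = defaultdict(lambda: BLANK)   # tape infinite in both directions
    for cell, letter in enumerate(word):
        tape[cell] = letter
    state, head, steps = "q0", 0, 0     # head starts on the leftmost letter
    while state not in ("qY", "qN"):
        if steps >= max_steps:          # guard added for the sketch only
            raise RuntimeError("step limit reached without terminating")
        new_state, new_letter, move = delta[(state, tape[head])]
        tape[head] = new_letter         # replace the observed letter
        head += SHIFT[move]             # shift the working head
        state = new_state
        steps += 1
    return state == "qY", steps
```

For instance, run_tm(EVEN_LENGTH, "aaaa") reads the four letters and then meets a blank in state q0, so the machine terminates in qY after five steps.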
P = co−P ?
Yes
NP-completeness definition.
Yes - No problem (language) L is called NP-complete if (1) L ∈ NP; (2) for any problem L′, if L′ ∈ NP, then L′ ∝ L.
Class NP
The class of all Yes - No problems (languages) L such that there exist a polynomial p(n) and an NTM for solving L with complexity T(n) ≤ p(n) for any n ≥ 1.
Is 3-CNF-Satisfiability NP-complete?
Yes. Theorem. CNF-Satisfiability is polynomially transformable to 3-CNF-Satisfiability. Thus, 3-CNF-Satisfiability is NP-complete. Proof: on the sheet titled "NP-completeness of some problems".
Algebraic computation tree T in variables X1, . . . , Xn is
a tree with the root v0 such that to every vertex v (except leaves) an arithmetic operation (addition, subtraction or multiplication) and a polynomial fv are attached. At the root v0 the corresponding arithmetic operation, say +, is performed on a pair of variables, say Xi , Xj , and fv0 = Xi + Xj is the result of this operation. Let v0, v1, . . . , vl be the sequence of vertices along the (unique) branch leading from the root v0 to vl. An arithmetic operation at vl is performed on a pair from X1, . . . , Xn, fv0, . . . , fvl−1 and real constants a ∈ R, and fvl is the result of this operation. Every v has three children in T corresponding to the sign of fv (> 0, < 0, = 0). Let ∗i ∈ {>, <, =} for 0 ≤ i < l be the sign of fvi . Then to vl one can assign a semialgebraic set Uvl = {fv0 ∗0 0, fv1 ∗1 0, . . . , fvl−1 ∗l−1 0}. Finally, to each leaf w of T an output Yes or No is assigned. We call Uw an accepting set if to w the output Yes is assigned. We say that T tests the membership to the union of all accepting sets. The computation process works as follows. A specific point x ∈ Rn is taken as an input. Then the value fv0 (x) is computed and the sign of this value is determined. According to the sign, the algorithm goes to the corresponding child v1 of v0. Then the arithmetic operation attached to v1 is performed (i.e., the value fv1 (x) is computed) and so on. If the process eventually arrives at a Yes-vertex, then x belongs to an accepting set, and, therefore, to the union of all accepting sets.
A palindrome in alphabet A
a word a1a2 · · · an such that ai ∈ A and ai = an−i+1 for any i, 1 ≤ i ≤ n, e.g. abba is a palindrome in alphabet {a, b}.
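The condition ai = an−i+1 can be checked directly. A minimal sketch in Python (the function name is hypothetical; indices start at 0, so the condition becomes word[i] == word[n − 1 − i]):

```python
def is_palindrome(word):
    """Check a_i == a_(n-i+1) for all i, with indices shifted to start at 0."""
    n = len(word)
    return all(word[i] == word[n - 1 - i] for i in range(n))
```

With this check, abba and aba are palindromes, while abab is not.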
Theorem. If L1,L2 ∈ NP, L1 ∝ L2, and L1 is NP-complete, then L2 is
also NP-complete. Proof. Let L′ be an arbitrary language from NP. Because L1 is NP-complete, L′ ∝ L1. From transitivity of ∝ it then follows that L′ ∝ L2. Since L2 ∈ NP by assumption, L2 is NP-complete. Theorem is proved.
Theorem. Polynomial equivalence is an
equivalence relation on the set of all Yes - No problems (languages), i.e., for any two problems L1, L2 the following holds: (1) L1 p.e. L1; (2) if L1 p.e. L2 then L2 p.e. L1; (3) relation p.e. is transitive. Proof. (1), (2) are trivial; (3) is the content of Theorem, Section 8.
How are search problems "harder" than the corresponding Yes - No problems
any algorithm for a search problem automatically solves the Yes - No version
A tree
any connected graph without cycles. Sometimes a vertex is designated in the tree, called the root.
If we assume P ≠ NP hypothesis, then what serves as a lower bound for any NP-complete problem in the class of all Turing machines?
any polynomial
Definition. Let G = (V, E) be an undirected graph. Graph G(bar) = (V, E(bar)) is
called the complement to graph G if (v, w) ∈ E is equivalent to (v, w) ∉ E(bar).
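As a sketch of this definition, the complement can be computed by listing all pairs of distinct vertices that are not edges of G. The representation of edges as unordered pairs and the exclusion of loops (v, v) are assumptions of this example, and the function name is hypothetical.

```python
from itertools import combinations


def complement(vertices, edges):
    """Return the edge set of the complement graph.

    Edges are stored as frozensets {v, w}, so (v, w) and (w, v) coincide.
    Assumption: loops (v, v) are not considered.
    """
    edge_set = {frozenset(e) for e in edges}
    return {
        frozenset(pair)
        for pair in combinations(vertices, 2)
        if frozenset(pair) not in edge_set
    }
```

Applying the function twice returns the original edge set, in agreement with the symmetry of the definition.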
Exercise. Prove that recognizing membership to an n-dimensional ball defined by the inequality X1² + · · · + Xn² ≤ 1 can be done on the algebraic computation tree model with complexity 2n + 1.
draw out the tree
Theorem. A search problem L is NP-hard if and only if
for at least one (thus, for any) NP-complete language L' the relation L' ∝T L (**) is true. Proof. If L is NP-hard, then for any language from NP, in particular, for any NP-complete language L', the relation (**) is true. Conversely, if there exists an NP-complete language L' such that (**) is true, then for any language L'' ∈ NP we have L'' ∝ L' ∝T L, which, by Theorems 1 and 2 of Section 8, implies that L'' ∝T L, thus L is NP-hard by the definition.
We say that NTM solves the Yes - No problem L if
it accepts exactly all Yes inputs in L
Definition. A subset S ⊂ Rⁿ is called (basic) semialgebraic if
it is of the form S = {f1 = · · · = fk = 0, fk+1 > 0, . . . , fk+r > 0}, where fi, 1 ≤ i ≤ k + r are polynomials in n variables, i.e., S is a set of all solutions of a system of equations and strict inequalities.
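Membership of a point in a basic semialgebraic set amounts to evaluating the polynomials and checking the signs. A minimal sketch, assuming exact arithmetic (integers or fractions; floating point inputs would need a tolerance for the equalities); the example set and all names are hypothetical.

```python
def in_semialgebraic(point, equalities, inequalities):
    """Test membership in {f1 = ... = fk = 0, f(k+1) > 0, ..., f(k+r) > 0}."""
    return (all(f(*point) == 0 for f in equalities)
            and all(f(*point) > 0 for f in inequalities))


# Hypothetical example in R^2: the open upper half of the unit circle,
# {X1^2 + X2^2 - 1 = 0, X2 > 0}.
def circle(x, y):
    return x * x + y * y - 1


def upper(x, y):
    return y
```

Here the point (0, 1) belongs to the set, while (1, 0) fails the strict inequality and (0, 2) fails the equation.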
What is the objective function of the following optimization search problems. (1) Travelling Salesman (2) Clique (3) Vertex Covering
length of a tour, size of a clique (number of vertices), size of a covering.
Full Walk
See "Diagrams for approximation algorithms" (pdf). Note that the Full Walk traverses every EDGE of the tree exactly twice.
Let C∗ denote the value of the objective function at an optimal solution. An approximation algorithm for an optimization problem has ratio bound ρ(n) if for any input of the size n the value C of the objective function at the approximate solution, produced by this algorithm, satisfies ...
max{C/C∗, C∗/C} ≤ ρ(n). Observe that for maximization problem 0 < C ≤ C∗ , thus max{C/C∗, C∗/C} = C∗/C, while for minimization problem 0 < C∗ ≤ C, thus max{C/C∗, C∗/C} = C/C∗. Obviously, ρ(n) ≥ 1. Equality ρ(n) = 1 means that approximate solution is optimal. Of course, a large ratio bound might mean that approximate solution is much worse than an optimal one.
Θ-notation definition
If for two functions f and g both f(n) = O(g(n)) and f(n) = Ω(g(n)) hold, we write f(n) = Θ(g(n)). In other words, recalling that O(g(n)) and Ω(g(n)) are sets of functions, we define Θ(g(n)) := O(g(n)) ∩ Ω(g(n)).
Preorder walk
obtained from Full Walk by deleting from the Full Walk every non-first occurrence of a vertex.
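Both walks are easy to compute for a rooted tree. The sketch below assumes the usual depth-first reading of the Full Walk (a vertex is recorded again each time the walk returns to it) and a tree given as a dictionary from a vertex to its list of children; the example tree and all names are hypothetical.

```python
def full_walk(tree, root):
    """List the vertices in the order a depth-first traversal visits them,
    recording a vertex again each time the walk returns to it."""
    walk = [root]
    for child in tree.get(root, []):
        walk += full_walk(tree, child)
        walk.append(root)
    return walk


def preorder_walk(walk):
    """Keep only the first occurrence of every vertex of a Full Walk."""
    seen = set()
    result = []
    for v in walk:
        if v not in seen:
            seen.add(v)
            result.append(v)
    return result


# Hypothetical example tree with edges a-b, a-c, b-d, rooted at a.
TREE = {"a": ["b", "c"], "b": ["d"]}
```

For this tree the Full Walk is a, b, d, b, a, c, a: it makes 6 moves, i.e., it traverses each of the 3 edges exactly twice. The Preorder Walk obtained from it is a, b, d, c.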
The complexity for algebraic computation tree model is
the height of the tree.
Let NTM accept x. Complexity of accepting x is
the number of steps in the shortest accepting computation for x. The complexity of an NTM is the function T(n) = max{m : there exists x, |x| = n, such that the complexity of accepting x is m}, i.e., take the length of the shortest accepting computation for each accepted x and then take the maximum over all such x with |x| = n. If no such x exists, let T(n) = 1.
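The order of min and max in this definition can be made concrete with a small sketch. It assumes, purely for illustration, that the lengths of all accepting computations are available as a table; the table format and the function name are hypothetical.

```python
def ntm_complexity(accepting_lengths, n):
    """Compute T(n) from a table of accepting computation lengths.

    accepting_lengths -- hypothetical dict: accepted input x -> list of the
                         lengths (in steps) of its accepting computations
    """
    per_input = [
        min(lengths)                      # complexity of accepting x
        for x, lengths in accepting_lengths.items()
        if len(x) == n and lengths
    ]
    # Maximum over all accepted x of length n; T(n) = 1 if there is none.
    return max(per_input) if per_input else 1
```

For the table {"ab": [7, 3], "ba": [5], "a": [2, 9]} this gives T(2) = max{3, 5} = 5, T(1) = 2, and T(3) = 1 because no input of length 3 is accepted.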
NTM accepts an input x if
there exists a computation (for x) ending with the state qY . Observe that if NTM accepts x there might also be computations (for x) ending with qN . There might be many computations (for x, because of each guess) leading to qY .
Why do we need a rigorous definition of an algorithm?
To prove that no algorithm exists for a problem, we have to make clear what exactly it is that does not exist.
co−P
{L(bar) : L ∈ P}
co−NP
{L(bar) : L ∈ NP}
For C, C∗ and n defined as above, the relative error is the fraction...
|C∗ − C|/C∗. Any function ε(n) with |C∗ − C|/C∗ ≤ ε(n) is called relative error bound. Easy computation shows that ρ(n) − 1 ≥ |C∗ − C|/C∗, i.e., ρ(n) − 1 is a relative error bound.
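The "easy computation" can be written out by treating the two kinds of problems separately, using the ratio bound max{C/C∗, C∗/C} ≤ ρ(n):

```latex
\begin{align*}
\text{Minimization } (0 < C^* \le C):\quad
  \frac{|C^* - C|}{C^*} &= \frac{C}{C^*} - 1 \;\le\; \rho(n) - 1,\\[4pt]
\text{Maximization } (0 < C \le C^*):\quad
  \frac{|C^* - C|}{C^*} &= \frac{C^* - C}{C^*}
     \;\le\; \frac{C^* - C}{C} = \frac{C^*}{C} - 1 \;\le\; \rho(n) - 1.
\end{align*}
```

In the maximization case the middle inequality uses C ≤ C∗ together with C∗ − C ≥ 0.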
A → B is equivalent to
¬A ∨ B.
¬(A ∧ B) is equivalent to
¬A ∨ ¬B
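Both equivalences can be verified by checking all four truth assignments of A and B. A minimal sketch in Python, where implication is given by its truth table and the helper names are hypothetical:

```python
from itertools import product

# Truth table of A -> B: false only when A is true and B is false.
IMPLIES = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}


def equivalent(f, g):
    """Two formulas in A, B are equivalent if they agree on all assignments."""
    return all(f(a, b) == g(a, b)
               for a, b in product([True, False], repeat=2))


# A -> B versus (not A) or B
implication_ok = equivalent(lambda a, b: IMPLIES[(a, b)],
                            lambda a, b: (not a) or b)
# not (A and B) versus (not A) or (not B)
de_morgan_ok = equivalent(lambda a, b: not (a and b),
                          lambda a, b: (not a) or (not b))
```

Both checks succeed, whereas, for comparison, equivalent applied to A ∧ B and A ∨ B fails on the assignment A true, B false.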
Ω notation definition
Ω(g(n))= {f(n) : there exist constants c > 0, n0 > 0 such that 0 ≤ cg(n) ≤ f(n) for all n ≥ n