Formal Languages - Final Exam
Turing Machine Conventions
(a) one-way-infinite-to-the-right tape (b) difference between input alphabet and tape alphabet (They cannot be equal. At least one difference is the blank, which can't be part of the input.) (c) immediate halt upon entering accept or reject state (d) any transitions not shown in a state diagram are implicit transitions to the reject state (e) input begins on leftmost tape cell, end occurs at first blank space and rest of tape is blank (f) head not allowed to move off of left end (g) the accept state and reject state must be different states
Understand the difference between standard Turing Machines and common variants and when the computational power between them remains unchanged (e.g. 2-way infinite tapes, multiple tapes, 2D tapes, restricted head movement, limited tape, etc.).
2-way infinite tapes, multiple tapes, and 2D tapes all have the same power as the standard 1-way infinite tape. Restricted head movement depends on the restriction: if the head can never move left, the machine is not Turing universal. A limited (finite) tape is not Turing universal either: we can always pick an input longer than the tape, so the machine cannot even hold it.
Know the Recursion Theorem and understand its proof, and understand that it allows TM programs to access their own definitions.
The recursion theorem says a TM can obtain its own description and then compute with it. So a TM can find things out about itself, like how many states it has.
Know the definitions of frequently used languages such as A_TM, A_DFA, etc., E_TM , E_DFA, etc., EQ_TM , EQ_DFA, etc.
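A_TM → acceptance problem for TMs: {⟨M, w⟩ | M is a TM that accepts w}. A_DFA → acceptance problem for DFAs: {⟨B, w⟩ | B is a DFA that accepts w}. A_PDA → acceptance problem for PDAs. E_TM → emptiness problem for TMs: {⟨M⟩ | L(M) = ∅}. E_DFA → emptiness problem for DFAs: {⟨A⟩ | L(A) = ∅}. EQ_TM → equivalence problem for TMs (pairs of TMs): {⟨M1, M2⟩ | L(M1) = L(M2)}. EQ_DFA → equivalence problem for DFAs (pairs of DFAs): {⟨A, B⟩ | L(A) = L(B)}.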
Hierarchy of formal languages
All: the class of all possible languages. By a counting argument, there are more of these than there are TMs. RE: languages that can be enumerated or recognized. Either there is a machine that spits out every string in the language (in any order, possibly repeating strings), or equivalently there is a TM that eventually says yes for every string in the language. For a string that is not in the language, it may say no or get stuck in a loop. DEC: the subset of RE whose TM is always guaranteed to halt and say yes or no correctly. This is also where P and NP live. CFL: context-free languages (PDAs, context-free grammars). REG: regular languages (DFAs, NFAs, regular expressions).
What is the time complexity class TIME(t(n))?
All languages that can be decided in O(t(n)) time by a deterministic TM. So TIME(n^2) is every language that has a TM deciding it in big-O of n^2 steps. (By default, we mean deterministic.)
[___] means all physical methods of computing are equivalent.
Church-Turing Thesis. All general purpose programming languages and computational models (lambda calculus, digital computers, etc.) are equivalent in computational power to Turing Machines.
What does it mean for a language to be Co-Turing-Recognizable?
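Co means complement: the complement of a language is the set of all strings that are not in it. A language is co-Turing-recognizable if its complement is Turing-recognizable. Example: A_TM is Turing-recognizable (its recognizer only stops to say yes when the machine accepts), but it is not decidable. Its complement, the set of strings which are not in it, is not recognizable, because on exactly those strings the recognizer for A_TM either says no or gets stuck in an infinite loop. So the complement of A_TM is co-Turing-recognizable but not Turing-recognizable. A language is decidable iff it is both Turing-recognizable and co-Turing-recognizable.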
Understand the difference between "computational power" and "computational complexity".
Computational power (can you do it at all?): comparing an NTM and a TM, there is no difference. They can both do the same things: they recognize the same languages, decide the same languages, etc. Complexity (how efficiently can you do it?): here they may differ. We conjecture that there is an exponential separation between them.
Between sets: Recursively Enumerable (RE), Decidable (DEC), which is a subset of the other?
DEC is a subset of RE
A Turing Machine is a [___] if it halts on all inputs.
Decider. Such a TM decides its language conclusively: given any string, it is guaranteed to stop and say whether the string is in or not in L(M). Even if the answer is no, it never gets stuck in an infinite loop (that can only happen with a recognizer).
Know what a "minimal TM" is, what the language MIN_TM is, and understand why it is not Turing-Recognizable.
A minimal TM is one with no equivalent TM that has a shorter description; MIN_TM is the set of descriptions of all TMs that are minimal machines for their language. Proof idea that MIN_TM is not Turing-recognizable: suppose it were, so some enumerator E lists it. Build a TM C that, on input w, (1) obtains its own description via the recursion theorem to see how big it is, (2) runs E until it outputs a machine D that is bigger than C (one exists because MIN_TM is infinite), and (3) simulates D on w. Then C is equivalent to D but smaller than D, so D was not minimal after all, a contradiction. Just remember: because TMs can access their own definitions, you can always make a new TM that breaks any supposed recognizer for MIN_TM.
T/F: The number of languages is countable
False. The number of languages is not countable. That means there are languages that cannot be computed by any TM, because there are more languages than there are possible TMs to compute them.
Understand how every Nondeterministic Turing Machine can be simulated by a (deterministic) Turing Machine, and the structure of a computational tree of an Nondeterministic Turing Machine's computation.
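The deterministic TM searches the NTM's computation tree breadth-first (each node is a configuration, each branch is a nondeterministic choice) and accepts if it ever finds an accepting configuration. Depth-first search would not work, since it could follow an infinite branch forever. The deterministic simulation of an NTM is an exponential blow-up time-wise. This matters in complexity, but not in computability.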
Know what an enumerator is, how it operates (in general), that the class of Turing-recognizable languages is equivalent to the class of enumerable languages, and how an enumerating TM can be used to build a recognizing TM (and vice versa).
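If we have a recognizer, we can build an enumerator from it, and vice versa. Recognizer from enumerator: on input w, run the enumerator and accept if w ever appears in its output. Enumerator from recognizer: for i = 1, 2, 3, ..., run the recognizer for i steps on each of the first i strings and print every string it accepts.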
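A minimal Python sketch of turning a recognizer into an enumerator by dovetailing. The step-bounded interface `accepts_within(w, steps)` and the toy language (even number of 1s) are assumptions made only for this illustration; a real recognizer would be a TM simulated for a bounded number of steps.

```python
from itertools import count, islice, product

def all_strings(alphabet="01"):
    """Yield every string over the alphabet, shortest first."""
    for n in count(0):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)

def accepts_within(w, steps):
    """Hypothetical step-bounded recognizer: 'accepts' w after len(w) steps
    when w has an even number of 1s, and otherwise never accepts (it loops)."""
    return steps >= len(w) and w.count("1") % 2 == 0

def enumerate_language(recognizer):
    """Dovetail: in round i, run the recognizer for i steps on the first i strings."""
    printed = set()  # suppress repeats (an enumerator is allowed to repeat)
    for i in count(1):
        for w in islice(all_strings(), i):
            if w not in printed and recognizer(w, i):
                printed.add(w)
                yield w

first_five = list(islice(enumerate_language(accepts_within), 5))
print(first_five)
```

Because every string is eventually given every step budget, any string the recognizer accepts is printed sooner or later, even though the recognizer may loop on other strings.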
Be familiar with the Cook-Levin Theorem which shows that SAT ∈ P iff P = NP , and thus that SAT is NP-complete.
The implication is that if there is an efficient solver for SAT, then there is an efficient solver for all of NP.
If a string has no shorter representation of itself, then it is [___]
Incompressible
Understand time complexity relationships between different models of computation (e.g. standard TMs vs multi-tape TMs or NTMs).
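For multi-tape TMs the cost is polynomial: every t(n)-time multi-tape TM has an equivalent O(t^2(n))-time single-tape TM. For NTMs it is exponential: every t(n)-time NTM has an equivalent deterministic TM that simulates it in 2^O(t(n)) time.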
What does it mean for a function to be asymptotically faster than another?
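It means its running time grows more slowly as the input gets large. If you have the big-O measures of two functions and one is smaller than the other (for example, O(n) versus O(n^2)), then the smaller one is asymptotically faster: for all large enough n it takes fewer steps, regardless of constant factors.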
What does it mean for a Turing Machine to be a decider for its language?
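It will always halt: on every input it either accepts (the string is in the language) or rejects (it is not), and it never loops.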
Know the definition of the class P and how to show that a language is in P .
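It's all languages solvable in time bounded by some polynomial: O(n^k) for some constant k. P is the class of languages that can be decided in polynomial time by a deterministic TM. To show a language is in P, give a decider and argue that it has polynomially many stages, each of which runs in polynomial time.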
Know how time complexity is measured (i.e. the number of computational steps of a deciding TM as a function of the size of the input, in the worst case).
It's the number of steps of a TM that decides a language. We're always talking about deciders now. We are also referring to worst-case time. Our runtime, as it relates to complexity, is a function of the length of the input: if we say the runtime is t(n), that means that if the input length is n, at most t(n) steps are needed. That is the worst case over all strings of that length.
Know what a polynomial time computable function is, and what it means for one language to be polynomial time reducible to another (as well as the notation).
A polynomial time computable function just means that the TM that outputs the function's value on its tape runs in polynomial time. A polytime reduction from A to B, written A ≤_p B, is a polynomial time computable function f that converts an instance for the first language into an instance for the second language, s.t. the original is in A iff the output of the function is in B. It's a way of turning a problem in one domain into a problem in another domain: the reduction from SAT to CLIQUE, for instance.
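As a hedged illustration of the SAT-to-CLIQUE idea, here is a small Python sketch of the textbook construction: one node per literal occurrence, an edge between non-contradictory literals in different clauses, and k equal to the number of clauses. The encoding (literals as nonzero integers, -v meaning NOT v) and all function names are assumptions for this example. Only `sat_to_clique` is the polynomial-time reduction; the two brute-force checkers are exponential and exist only to confirm the "iff" on tiny inputs.

```python
from itertools import combinations, product

def sat_to_clique(clauses):
    """Reduction f: CNF formula -> (nodes, edges, k). Runs in polynomial time."""
    nodes = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = {frozenset((a, b)) for a, b in combinations(nodes, 2)
             if a[0] != b[0]        # different clauses
             and a[1] != -b[1]}     # not a variable and its negation
    return nodes, edges, len(clauses)

def satisfiable(clauses):
    """Brute-force SAT check (exponential; for testing the reduction only)."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

def has_clique(nodes, edges, k):
    """Brute-force k-clique check (exponential; for testing the reduction only)."""
    return any(all(frozenset(pair) in edges for pair in combinations(group, 2))
               for group in combinations(nodes, k))

phi_sat = [[1, 1, 2], [-1, -2, -2], [-1, 2, 2]]   # satisfiable: x1 = F, x2 = T
phi_unsat = [[1], [-1]]                            # x1 AND NOT x1
for phi in (phi_sat, phi_unsat):
    print(satisfiable(phi), has_clique(*sat_to_clique(phi)))
```

A k-clique must pick one literal from each clause with no contradictions, which is exactly a way to satisfy every clause.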
Know how to use TM configurations to form a computation history of a TM on an input.
Line the configurations up one right after the other. The starting configuration has the start state followed by the input. The next configuration shows the result of the first move (the new state, the symbol written, and the new head position), and so on: each configuration must legally follow from the one before it. The resulting list of configurations is the computation history.
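A minimal Python sketch of building a computation history by simulating a TM. The transition-table format, the state names, and the bracketed way of writing a configuration (tape left of the head, then the state, then the rest of the tape) are assumptions chosen for this example.

```python
BLANK = "_"

def computation_history(delta, start, accept, reject, w, max_steps=1000):
    """Return the list of configurations of the TM on input w.
    delta maps (state, symbol) -> (new_state, symbol_to_write, 'L' or 'R')."""
    tape, head, state = list(w) or [BLANK], 0, start
    history = []
    for _ in range(max_steps):
        # Configuration: tape left of head, [state], symbol under head and the rest.
        history.append("".join(tape[:head]) + f"[{state}]" + "".join(tape[head:]))
        if state in (accept, reject):
            break  # halt immediately on entering accept or reject
        # Transitions not listed are implicit moves to the reject state.
        state, write, move = delta.get((state, tape[head]), (reject, tape[head], "R"))
        tape[head] = write
        head = max(head - 1, 0) if move == "L" else head + 1  # cannot fall off left end
        if head == len(tape):
            tape.append(BLANK)  # the rest of the tape is blank
    return history

# Hypothetical example machine: accepts strings made only of 0s.
delta = {("q0", "0"): ("q0", "0", "R"),
         ("q0", BLANK): ("qa", BLANK, "R")}

for config in computation_history(delta, "q0", "qa", "qr", "00"):
    print(config)
```

Each printed line follows from the previous one by a single application of the transition function, which is what makes the list a valid history.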
Decidable languages are closed under the operations union, concatenation, star, intersection, and complementation. Turing-recognizable languages are closed under the operations union, concatenation, star, and intersection. (Why the difference?)
The main difference is that decidable languages are closed under complement and recognizable languages are not. For a decider, you can make the TM flip its answers and the result is still a decider. For a recognizer, the complement of the language is all the strings that are not in it, and on those the TM may be stuck in an infinite loop, so flipping its answers does not recognize the complement.
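Know what the time complexity class NTIME(t(n)) is.
NTIME(t(n)) is the nondeterministic counterpart of TIME(t(n)): all languages decided by an O(t(n))-time nondeterministic TM. It still has to be a decider, but it can be a nondeterministic machine. Between NTIME(t(n)) and TIME(t(n)), which one is a subset of the other? TIME(t(n)) is a subset of NTIME(t(n)), because a deterministic TM is a special case of an NTM. Taking the union over all polynomials gives P ⊆ NP, and if P = NP then the two polynomial classes are equal.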
Know the definition of the class NP and how to show (both ways!) that a language is in NP . Understand what a verifier is, and what a certificate is.
NP is the class of languages that can be decided in nondeterministic polynomial time, or equivalently that have a polynomial time verifier. Two definitions: 1) Can be decided in polynomial time by a nondeterministic TM, which gets the craziness of nondeterminism to try exponentially many things in parallel. 2) Has a verifier. The verifier is deterministic: it doesn't get the exponential splitting and trying of different things, but instead it gets a certificate, so it is a shortcut of sorts. For every string in the language, there must exist some certificate that makes the verifier say yes. It doesn't mean that every certificate makes it say yes (if you give it a bad answer, it may not work), but at least one must exist. For every string that is not in the language, there cannot be ANY certificate that makes it say yes: every one of them must be rejected. For SAT, a certificate is just an assignment of the variables; if the formula is not satisfiable, there cannot be any certificate that tricks the verifier. Both the NTM and the verifier have to run in polytime. To summarize: in the language means there exists an accepting certificate, not that all certificates are accepted; not in the language means every certificate is rejected. We will see this exact problem: it is on the quiz and the exam, and it's an important aspect of the definition. [Make sure you check the quiz for this one]
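A small Python sketch of a verifier for SAT, to make the "there exists a certificate" condition concrete. The formula encoding (a list of clauses, each a list of nonzero integers, with -v meaning NOT v) and the function name are assumptions for this illustration.

```python
def verify_sat(clauses, certificate):
    """Deterministic polynomial-time verifier.
    certificate: dict mapping variable number -> True/False (an assignment).
    Accept iff every clause contains at least one literal made true."""
    return all(
        any(certificate.get(abs(lit), False) == (lit > 0) for lit in clause)
        for clause in clauses
    )

phi = [[1, 2], [-1, 2], [-2, 3]]              # (x1 or x2)(not x1 or x2)(not x2 or x3)
good = {1: False, 2: True, 3: True}           # a certificate that works
bad = {1: True, 2: False, 3: False}           # a certificate that does not

print(verify_sat(phi, good))   # one accepting certificate is enough: phi is in SAT
print(verify_sat(phi, bad))    # a rejected certificate proves nothing by itself

unsat = [[1], [-1]]                           # x1 and not x1
print(any(verify_sat(unsat, {1: b}) for b in (False, True)))  # no certificate works
```

The verifier only checks an answer it is handed; finding a good certificate is the part that may need exponential search.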
Understand the definitions of NP-hard and of NP-complete and the two conditions necessary for a language to be NP-complete.
NP-hard means that every language in NP can be polynomial time reduced to it: given a language x that is NP-hard, all languages in NP can be reduced to x. That tells us that x is at least as hard as all the languages in NP: a solver for x can't be easier than solvers for them, but x can potentially be harder than them. It sets a lower bound on the difficulty of x. To be NP-complete, you have to (1) be in NP and (2) be NP-hard. If you're in NP and every NP problem can be turned into you, then you're exactly as hard as the hardest problems in NP (up to polynomial time).
Which sets are countable?
Naturals, integers, rationals
Be able to determine big-O classifications of functions (and understand that they are upper bounds, and the difference between "tighter" and "looser" bounds). Being in a particular TIME class means a function is also in all larger TIME classes.
Normally we are interested in the tight bound, but it is still not incorrect if the big-O is shown as larger (for example, a function that is O(n^2) is also O(n^3)). Big-O is a ceiling that says "this function grows no faster than the bound."
Know what the PCP problem is, and understand the proof of why it is undecidable.
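Post Correspondence Problem: given a collection of dominoes, each with a string on top and a string on the bottom, is there a sequence of dominoes (repeats allowed) whose top string equals its bottom string? Know that it is undecidable and have a general memory of why: we reduce from A_TM by building dominoes so that any match spells out configurations that follow each other, i.e. a computation history of the TM. The dominoes can only possibly match if the TM has a sequence of configurations that leads to an accepting state.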
Formal definition of a Turing Machine: A 7-tuple (Q, Σ, Γ, δ, q_0, q_accept, q_reject). What is Q?
Q is the set of states
A [___] is used to prove that languages are not decidable or recognizable.
Reduction. For examples of reductions, review the proofs that E_TM and REG_TM are undecidable. The general idea is to assume the language were decidable, then use its decider to break something we know is undecidable. For E_TM: if E_TM were decidable, I could make a TM that uses its decider to answer A_TM. Take an arbitrary TM and string as input, and make a new wrapper TM that has those built into it: whatever input the wrapper receives, it accepts iff the built-in TM accepts the built-in string. So if the built-in TM accepts the built-in string, the wrapper machine will accept everything; otherwise it will accept nothing. Feed the wrapper to the E_TM decider and it tells us if the wrapper's language is empty, which immediately tells us whether the built-in TM accepts the built-in string. That would decide A_TM, a contradiction.
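The wrapper construction can be sketched in Python under heavy simplifying assumptions: "machines" are modeled as ordinary Python functions that always return True or False (so looping is not modeled), and `toy_emptiness_check` is a stand-in that only samples a few strings. No real decider for E_TM exists; the stand-in is adequate here only because a wrapper's language is either everything or nothing.

```python
def build_wrapper(M, w):
    """Return a machine with M and w built in: it ignores its own input x
    and accepts iff M accepts w."""
    def wrapper(x):
        return M(w)
    return wrapper

def decide_A_TM(M, w, decide_E_TM):
    """IF a decider for E_TM existed, this would decide A_TM:
    the wrapper's language is non-empty exactly when M accepts w."""
    return not decide_E_TM(build_wrapper(M, w))

def toy_emptiness_check(machine):
    """Stand-in for the (impossible) E_TM decider: samples a few inputs.
    Valid only for all-or-nothing machines such as the wrappers above."""
    return not any(machine(x) for x in ["", "0", "1"])

def ends_in_one(s):
    """Example 'machine': accepts strings that end in 1."""
    return s.endswith("1")

print(decide_A_TM(ends_in_one, "01", toy_emptiness_check))
print(decide_A_TM(ends_in_one, "10", toy_emptiness_check))
```

The point of the sketch is the shape of the argument: the input to the hypothetical decider is a machine built from M and w, not M itself.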
Understand why the K complexity of a string x is never more than a constant value larger than the length of the string, and so is the K complexity of xx. Also understand why the bound on the K complexity of a concatenation xy of two different strings is greater than just the sum of their individual K complexities and a constant value.
Reminder: if we want two copies of a string x, that is pretty easy to do with a constant-size routine that doubles its input, so K(xx) ≤ K(x) + c, which is in turn just a constant larger than the length of one copy of x. But if we want two different strings x and y, then the whole issue is in the coding: once the two descriptions are concatenated, we need to know where one ends and the other begins. We ended up at a bound where we add about twice the log of the number of bits of the shorter of the two descriptions, used as a prefix that tells us where the first ends and the second begins: K(xy) ≤ 2 log(K(x)) + K(x) + K(y) + c. General ideas.
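The coding issue can be illustrated with a small Python sketch of a self-delimiting pair encoding: write the length of the first string in binary with every bit doubled, then the marker 01, then the two strings. The function names and this particular encoding are assumptions for illustration; the overhead is 2 × (number of bits in the length) + 2, which is where a term of the form 2 log(...) comes from.

```python
def encode_pair(x, y):
    """Encode two bit strings so the boundary between them can be recovered."""
    length_bits = format(len(x), "b")                  # len(x) in binary
    header = "".join(b + b for b in length_bits) + "01"  # doubled bits, then end marker
    return header + x + y

def decode_pair(s):
    """Invert encode_pair."""
    i, length_bits = 0, ""
    while s[i:i + 2] != "01":      # doubled bits are '00' or '11', never '01'
        length_bits += s[i]
        i += 2
    n = int(length_bits, 2)
    body = s[i + 2:]
    return body[:n], body[n:]

x, y = "1011", "001"
code = encode_pair(x, y)
print(code, decode_pair(code))
print(len(code) - len(x) - len(y))   # overhead in bits
```

Plain concatenation x + y has no overhead but cannot be decoded, because nothing marks where x stops.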
[___] proves that determining if a TM's language has a nontrivial property is undecidable. (Can't analyze TMs to learn anything about the languages they recognize because they're too powerful)
Rice's Theorem
Given a string x, know what the minimal description of x means, as well as the descriptive complexity (a.k.a. Kolmogorov complexity).
The minimal description of x is the shortest representation of a pair of a TM and an input to that TM s.t. when that TM runs on that input, it halts with the string of interest on its tape. The descriptive complexity K(x) is the length of that minimal description. So what's the smallest TM and input pair that will give us x back? K(x) is never more than a constant longer than x itself, because that constant can just be the definition of a trivial TM that does nothing but immediately halt. Pair that TM with x as its input, and you have a TM that on that input halts with x on the tape. So for any string, the K complexity isn't more than a constant longer than the string itself. But for a lot of strings it is a lot shorter: you can compress some strings and make them very small.
How does a TM differ from other automata?
The main differences: 1) can read and write tape 2) head can move left or right 3) tape is infinite 4) accept or reject states take effect immediately Example: TM for B = {w#w | w∈ {0, 1}*} Given the following strings: Accept 0011010#0011010 Reject 0011010#00110100
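A high-level Python sketch of the usual zig-zag strategy for B = {w#w | w ∈ {0, 1}*}: match the symbols on each side of the # one pair at a time, crossing them off as they are checked. The list-based tape and the function name are assumptions; this mirrors the idea of the TM, not its state diagram.

```python
def decides_w_hash_w(s):
    """Accept iff s has the form w#w with w over {0, 1}."""
    if any(c not in "01#" for c in s) or s.count("#") != 1:
        return False
    tape = list(s)
    mid = tape.index("#")
    left, right = 0, mid + 1
    while left < mid:
        # Zig-zag: compare the next unchecked symbol on each side of '#'.
        if right >= len(tape) or tape[left] != tape[right]:
            return False
        tape[left] = tape[right] = "x"   # cross off the matched pair
        left += 1
        right += 1
    # Everything left of '#' is crossed off; nothing may remain on the right.
    return right == len(tape)

print(decides_w_hash_w("0011010#0011010"))
print(decides_w_hash_w("0011010#00110100"))
```

On a real single-tape TM each comparison costs a trip across the tape, so this strategy takes on the order of n^2 steps.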
Which sets are uncountable?
The reals and every interval of the reals are not countable. Neither is the set of infinite binary sequences (or infinite sequences over any alphabet with more than one symbol), and therefore neither is the set of all languages, because you can do the diagonalization argument.
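The diagonalization step can be shown on a finite prefix with a short Python sketch. Here `rows[i]` stands for the first n bits of the i-th sequence in some claimed list of all infinite binary sequences; the function name and the sample table are assumptions for illustration.

```python
def diagonal_escape(rows):
    """Flip the i-th bit of the i-th row: the result differs from row i at position i."""
    return [1 - rows[i][i] for i in range(len(rows))]

rows = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
d = diagonal_escape(rows)
print(d)
print(all(d[i] != rows[i][i] for i in range(len(rows))))
```

The same flip works for any list, finite or infinite, so no list can contain every infinite binary sequence.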
TM Variants
There are several variants, and they're all equivalent in power. Base model: 1-way infinite tape, {L, R} moves. S-model: 1-way infinite tape, {L, R, S} moves. Multi-tape: k tapes, each with its own read/write head, with δ: Q×Γ^k → Q×Γ^k×{L, R, S}^k.
How does the TM SELF work?
SELF is the TM that ignores its input and prints its own description. It is the warm-up for the recursion theorem: it is possible for TMs to access their own definitions. This is a very powerful feature: a program can obtain its own code and print it, analyze it, use it, etc. We used this for Kolmogorov complexity.
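The same trick works in ordinary programming languages. Below is a hedged Python sketch in the spirit of SELF: a template (the data part) is filled in with a quoted copy of itself (the code part), producing the text of a two-line program that recomputes its own text. The variable names are assumptions, and this is an analogy to the TM construction, not the construction itself.

```python
# Data part: a template for the whole two-line program, with a hole (%r) for itself.
template = 'template = %r\nprogram = template %% template'

# Code part: fill the hole with a quoted copy of the template.
program = template % template

print(program)  # the two lines of source that rebuild `program`
```

Running the printed two-line program defines `program` as exactly the same text again, which is the sense in which the program has access to its own description.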
What does it mean for a function to be computable?
This means that there is a TM that gets an input, and rather than us caring about whether it accepts or rejects, we care about what it leaves on the tape. Computable function f: there is some TM such that, for every input w, it halts with just f(w) on its tape.
T/F: A language is Decidable if and only if some Nondeterministic Turing Machine decides it.
True
T/F: A language is Turing-Recognizable if and only if some Nondeterministic Turing Machine recognizes it.
True
T/F: Arbitrary mathematical objects can be encoded as strings which can be input to Turing Machines.
True
T/F: Every decidable language is Turing Recognizable
True
T/F: The collection of decidable languages (DEC) is closed under complementation
True
T/F: The collection of decidable languages (DEC) is closed under concatenation
True
T/F: The collection of decidable languages (DEC) is closed under intersection
True
T/F: The collection of decidable languages (DEC) is closed under star
True
T/F: Every Nondeterministic TM has a deterministic TM that is equivalent.
True. This is a theorem that shows that nondeterminism doesn't give you more power, just more efficiency.
T/F: The number of Turing Machines is countable
True. We know the number of TMs is countable because every TM can be encoded as a finite string.
A language is called [___], [___], or [___] if some Turing Machine decides it.
Turing-Decidable, Decidable (DEC), Recursive. The TM will always halt, even if the answer is no: for all inputs, it will halt.
A language is [___] if some Turing Machine recognizes it.
Turing-Recognizable, a.k.a. Recursively Enumerable (RE). RE is the biggest class of languages that TMs can handle. If the recognizer should accept, it will. If it needs to reject, it may reject, or it may get stuck in an infinite loop, but it never accidentally accepts. To be Turing-recognizable is the same as being enumerable. An enumerator can spit out all the strings of a language, but it might not spit them out in order, and it can output any of them any number of times. If you can explicitly enumerate them in order (shortest first), then you can decide the language.
Understand that not all strings are compressible and the counting argument of why that is true. Know that incompressible strings resemble random strings in many ways.
Understand that, for every length, there must be some strings that are incompressible. We made a compressor and a de-compressor on HW and it came down to a counting problem. There are 2^n binary strings of length n, but the number of strings that are shorter is only 2^n − 1, so there are fewer shorter things and we can't compress every string of length n: two of the strings we want to compress would map to the same shorter string, and we couldn't get both of them back. Recall you can always make a compressor that compresses one particular string to one bit if we want, by hard-coding it. It just means we can't have a universal compressor that shrinks all strings of any length. Given length n, whatever compressor and de-compressor we make will always be crummy on at least one string.
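The counting argument can be checked numerically with a few lines of Python, assuming binary strings; the function name is made up for this example.

```python
def count_strings(n):
    """Return (number of binary strings of length n,
               number of binary strings of length strictly less than n)."""
    exactly_n = 2 ** n
    shorter = sum(2 ** k for k in range(n))   # lengths 0, 1, ..., n-1
    return exactly_n, shorter

for n in range(1, 6):
    exactly_n, shorter = count_strings(n)
    print(n, exactly_n, shorter)
```

For every n the second count is one less than the first, so by the pigeonhole principle no lossless compressor can shorten every string of length n.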
Know the definition of A_TM and how to show that it is not decidable via diagonalization (and therefore understand the proof technique of diagonalization).
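We showed that A_TM is undecidable by logical contradiction (the same idea works for the halting problem): assume a decider for the problem exists, and build a new machine that uses the decider and then puts a negator on it that flips the answer. On input the description of a machine M, the new machine asks whether M accepts its own description and says the opposite. You can make that into a diagonalization argument: in the table of machines versus machine descriptions, the new machine differs from every machine on the diagonal, and when it is run on its own description the contradiction lies there, because it has been programmed to say the opposite of what it says. Know the general idea as to why that problem can't be solved.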
Understand the (conjectured) differences between P and NP and their relationship.
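P is a subset of NP. We don't think they're equal (we conjecture that P is a proper subset of NP), but if P = NP then they are. NP is also a subset of exponential time. We don't know whether NP equals EXPTIME, but if P = NP then NP is strictly smaller than EXPTIME, because P is known to be strictly smaller than EXPTIME.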
Be able to analyze an algorithm and determine its time complexity.
We have a problem where we are given the algorithm for a TM and are asked to select a time complexity from a list. We should recognize where a loop goes over all of the input, and whether it does that inside another loop. In general, look for the multiplier: how many times does each loop repeat, and how much work is each repetition? If instead you're crossing off every other symbol on each pass, how will that grow? Understand when a loop requires a linear number of iterations and when it requires only a logarithmic number. Nesting multiplies the costs. Know basic analysis techniques.
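Understand the relationship of NP to EXPTIME.
NP is within exponential time: NP ⊆ EXPTIME, since an NTM can be simulated deterministically with an exponential blow-up. We think there is a division between NP and P, but that is not proven.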
Formal definition of a Turing Machine: A 7-tuple (Q, Σ, Γ, δ, q_0, q_accept, q_reject). What is q_0?
q_0 is the start state.
Formal definition of a Turing Machine: A 7-tuple (Q, Σ, Γ, δ, q_0, q_accept, q_reject). What is q_accept?
q_accept is the accept state.
Formal definition of a Turing Machine: A 7-tuple (Q, Σ, Γ, δ, q_0, q_accept, q_reject). What is q_reject?
q_reject is the reject state, where q_reject ≠ q_accept
Formal definition of a Turing Machine: A 7-tuple (Q, Σ, Γ, δ, q_0, q_accept, q_reject). What is Γ?
Γ is the tape alphabet, where _∈ Γ and Σ⊂Γ
Formal definition of a Turing Machine: A 7-tuple (Q, Σ, Γ, δ, q_0, q_accept, q_reject). What is Σ?
Σ is the input alphabet, does not include _ (blank)
Formal definition of a Turing Machine: A 7-tuple (Q, Σ, Γ, δ, q_0, q_accept, q_reject). What is δ?
δ is the transition function: Q×Γ→Q×Γ×{L, R}
Nondeterministic Turing Machines:
δ: Q×Γ → P(Q×Γ×{L, R}). NTMs have multiple choices they can make. Like our other nondeterministic machines, every computation path is run in parallel, and the NTM accepts if and only if at least one path accepts. The big change is that you get the power set on the right-hand side: sets of (state, symbol, direction) triples.
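A minimal Python sketch of simulating an NTM deterministically by breadth-first search over its configurations. The transition-table format (each entry is a set of choices), the state names, and the `max_configs` budget are assumptions for this illustration; the budget is needed because a real NTM may have infinite branches.

```python
from collections import deque

BLANK = "_"

def ntm_accepts(delta, start, accept, w, max_configs=10_000):
    """Return True if some computation path reaches the accept state.
    delta maps (state, symbol) -> set of (new_state, symbol_to_write, 'L' or 'R').
    Returns False if every path dies or the exploration budget runs out."""
    start_cfg = (start, 0, tuple(w) or (BLANK,))
    queue, seen = deque([start_cfg]), {start_cfg}
    while queue and len(seen) <= max_configs:
        state, head, tape = queue.popleft()
        if state == accept:
            return True
        # One child configuration per nondeterministic choice.
        for new_state, write, move in delta.get((state, tape[head]), ()):
            cells = list(tape)
            cells[head] = write
            new_head = max(head - 1, 0) if move == "L" else head + 1
            if new_head == len(cells):
                cells.append(BLANK)
            cfg = (new_state, new_head, tuple(cells))
            if cfg not in seen:
                seen.add(cfg)
                queue.append(cfg)
    return False

# Hypothetical NTM: guesses where the substring "11" starts.
delta = {
    ("q0", "0"): {("q0", "0", "R")},
    ("q0", "1"): {("q0", "1", "R"), ("q1", "1", "R")},  # stay scanning, or guess
    ("q1", "1"): {("qa", "1", "R")},
}

print(ntm_accepts(delta, "q0", "qa", "0110"))
print(ntm_accepts(delta, "q0", "qa", "0101"))
```

Breadth-first order is what guarantees that an accepting path is found if one exists, at the cost of visiting a number of configurations that can grow exponentially with the path length.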