CS325 - Parsing 2 (Grammar manipulation)
Eliminate left recursion from the following grammar:
S -> Aa | b
A -> Ac | Sd | Ɛ
Using the order S, A:
S -> Aa | b
A -> Ac | Sd | Ɛ
===> (Substitute S into A)
A -> Ac | Aad | bd | Ɛ
===> (Eliminate direct left recursion)
A -> bd A' | A'
A' -> c A' | ad A' | Ɛ
===> (Final)
S -> Aa | b
A -> bd A' | A'
A' -> c A' | ad A' | Ɛ
Name some ways to transform a grammar to make it suitable for top-down parsing (i.e. to reduce backtracking)
-Eliminate left recursion
-Use a predictive grammar (a backtrack-free one)
-Eliminate backtracking with left factoring
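Left factoring, named in the list above, pulls a shared prefix out of competing alternatives so the parser can commit after seeing it. The following is a minimal single-pass sketch, assuming a rule is stored as a list of right-hand sides (each a list of symbols). The names `common_prefix` and `left_factor_once` and the example rule A -> a b c | a b d | e are illustrative assumptions, not part of the notes.

```python
def common_prefix(seqs):
    """Longest sequence of symbols shared at the start of every list in seqs."""
    prefix = []
    for symbols in zip(*seqs):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor_once(nt, prods):
    """Factor alternatives of `nt` that start with the same symbol (one pass)."""
    # Group alternatives by their first symbol; Ɛ (the empty list) goes under None.
    groups = {}
    for p in prods:
        groups.setdefault(p[0] if p else None, []).append(p)

    result = {nt: []}
    count = 0
    for first, group in groups.items():
        if first is None or len(group) == 1:
            result[nt].extend(group)      # nothing to factor
            continue
        prefix = common_prefix(group)
        count += 1
        new_nt = nt + "'" * count         # hypothetical naming scheme: A', A'', ...
        result[nt].append(prefix + [new_nt])
        result[new_nt] = [p[len(prefix):] for p in group]
    return result


# A -> a b c | a b d | e   becomes   A -> a b A' | e,  A' -> c | d
print(left_factor_once("A", [["a", "b", "c"], ["a", "b", "d"], ["e"]]))
```

A single pass may leave common prefixes inside the new nonterminal, so in general the step would be repeated until no two alternatives share a first symbol.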
How do you eliminate Ɛ-productions?
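-Identify the nullable nonterminals on the RHS of each production
-For each one, add a new production with that nonterminal omitted (i.e. replaced by Ɛ)
-Remove the Ɛ-productions themselves (keep Ɛ only for the start symbol, if it is nullable)
e.g.
S -> aA
A -> Ɛ | a
===>
S -> aA | a
A -> a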
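The steps above can be sketched in code. This is one possible implementation, assuming a grammar is a dict from nonterminal to a list of right-hand sides, with the empty list standing for Ɛ. The function names `nullable_set` and `eliminate_epsilon` are made up for this example.

```python
from itertools import product


def nullable_set(grammar):
    """Nonterminals that can derive Ɛ, found by iterating to a fixed point."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            if nt not in nullable and any(all(s in nullable for s in p) for p in prods):
                nullable.add(nt)
                changed = True
    return nullable


def eliminate_epsilon(grammar, start):
    """Return an equivalent grammar whose only Ɛ-production (if any) is start -> Ɛ."""
    nullable = nullable_set(grammar)
    result = {}
    for nt, prods in grammar.items():
        new_prods = []
        for p in prods:
            # Each nullable symbol may be kept or dropped; other symbols are always kept.
            options = [[[s], []] if s in nullable else [[s]] for s in p]
            for choice in product(*options):
                rhs = [s for part in choice for s in part]
                # Skip Ɛ, useless self-loops (A -> A), and duplicates.
                if rhs and rhs != [nt] and rhs not in new_prods:
                    new_prods.append(rhs)
        result[nt] = new_prods
    if start in nullable:
        result[start].append([])          # the language still contains the empty string
    return result


example = {"S": [["a", "A"]], "A": [[], ["a"]]}
print(eliminate_epsilon(example, "S"))    # S -> aA | a,  A -> a
```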
Outline the general process for eliminating indirect left recursion
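-Order the nonterminals A1, A2, ..., An
-For each Ai in order, replace every production Ai -> Aj γ with j < i by substituting each RHS of Aj for the leading Aj
-Eliminate direct left recursion from the resulting Ai productions
-Repeat for the next nonterminal until all have been processed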
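A compact sketch of this process follows, assuming a grammar is a dict from nonterminal to a list of right-hand sides (the empty list stands for Ɛ). The names `remove_direct` and `remove_left_recursion` are hypothetical. The textbook version of this algorithm is only guaranteed for grammars without cycles or Ɛ-productions, so treat the sketch as illustrative rather than general.

```python
def remove_direct(nt, prods):
    """Rewrite nt -> nt α | β as nt -> β nt', nt' -> α nt' | Ɛ."""
    alphas = [p[1:] for p in prods if len(p) > 1 and p[0] == nt]
    betas = [p for p in prods if not p or p[0] != nt]
    if not alphas:
        return {nt: prods}                # no direct left recursion to remove
    new = nt + "'"
    return {nt: [b + [new] for b in betas],
            new: [a + [new] for a in alphas] + [[]]}


def remove_left_recursion(grammar, order):
    g = {nt: [list(p) for p in prods] for nt, prods in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for p in g[ai]:
                if p and p[0] == aj:
                    # Substitute every alternative of the earlier nonterminal aj.
                    expanded += [q + p[1:] for q in g[aj]]
                else:
                    expanded.append(p)
            g[ai] = expanded
        g.update(remove_direct(ai, g[ai]))
    return g


grammar = {"S": [["A", "a"], ["b"]], "A": [["A", "c"], ["S", "d"], []]}
for nt, prods in remove_left_recursion(grammar, ["S", "A"]).items():
    print(nt, "->", " | ".join(" ".join(p) or "Ɛ" for p in prods))
```

With the order S, A this produces S -> A a | b, A -> b d A' | A', and A' -> c A' | a d A' | Ɛ.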
Eliminate left recursion from: A -> AR | AT | b
A -> bA'
A' -> RA' | TA' | Ɛ
What is left recursion?
A grammar is left recursive if it has a nonterminal A such that there is some derivation A -+> Aα (one or more steps) for some string α, i.e. expanding A can lead back to A as the leftmost symbol, so you can keep choosing the same nonterminal and never terminate
Eliminate left recursion from:
E -> E + T | T
T -> T * F | F
F -> ( E ) | id
E -> T E'
E' -> + T E' | Ɛ
T -> F T'
T' -> * F T' | Ɛ
F -> ( E ) | id
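Once left recursion is gone, this expression grammar can be parsed top-down with a single token of lookahead. The following is a minimal recursive-descent recognizer sketch, with one function per nonterminal. It assumes the input is already tokenized into a list of strings such as "id", "+", "*", "(" and ")". The name `parse_expression` and the token representation are assumptions for illustration.

```python
def parse_expression(tokens):
    """Return True if tokens can be derived from E, otherwise False."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        pos += 1

    def E():            # E -> T E'
        T()
        E_prime()

    def E_prime():      # E' -> + T E' | Ɛ
        if peek() == "+":
            expect("+")
            T()
            E_prime()

    def T():            # T -> F T'
        F()
        T_prime()

    def T_prime():      # T' -> * F T' | Ɛ
        if peek() == "*":
            expect("*")
            F()
            T_prime()

    def F():            # F -> ( E ) | id
        if peek() == "(":
            expect("(")
            E()
            expect(")")
        else:
            expect("id")

    try:
        E()
        if peek() is not None:            # success requires the input to be exhausted
            raise SyntaxError("unexpected trailing input")
    except SyntaxError:
        return False
    return True


print(parse_expression(["id", "+", "id", "*", "id"]))   # True
print(parse_expression(["id", "+"]))                    # False
```

Every recursive call is preceded by consuming a token or by a different nonterminal that must consume one, so the recursion always makes progress.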
When does top-down parsing terminate?
Either: the fringe of the parse tree contains only terminal symbols, and the input stream is exhausted (success)
Or: a mismatch occurs between the fringe of the partial tree and the input stream (either a syntax error, or a wrong production was chosen earlier)
nullable(a) = ? (where a is a terminal)
False
Why is left recursion bad for top-down parsers?
It can cause them to loop indefinitely, without generating a leading terminal symbol that the parser can match
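A small demonstration of the problem, assuming a naive recursive-descent function written directly from the left-recursive production E -> E + T. The function names are hypothetical. Python's recursion limit turns the endless loop into a `RecursionError`, which the helper catches.

```python
def parse_E_left_recursive(tokens, pos=0):
    # Production E -> E + T: the first step is to parse an E at the same
    # position, so the function calls itself before consuming any token.
    pos = parse_E_left_recursive(tokens, pos)
    # The code that would match "+" and T is never reached.
    return pos


def loops_forever(tokens):
    """Report whether the naive parser recursed without making progress."""
    try:
        parse_E_left_recursive(tokens)
    except RecursionError:
        return True
    return False


print(loops_forever(["id", "+", "id"]))   # True
```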
How do you eliminate direct left recursion? e.g. eliminate left recursion from the following production: X -> X a | b
Change the production to feature right recursion:
X -> b X'
X' -> a X' | Ɛ
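The same rewrite can be expressed as a short function. This is a sketch, assuming a rule is a list of right-hand sides (each a list of symbols, with the empty list standing for Ɛ). The name `eliminate_direct_left_recursion` and the primed naming of the new nonterminal are choices made for this example.

```python
def eliminate_direct_left_recursion(nt, productions):
    """Rewrite nt -> nt α1 | ... | β1 | ... using right recursion."""
    # The α parts: what follows the leading nt in each left-recursive alternative.
    alphas = [p[1:] for p in productions if len(p) > 1 and p[0] == nt]
    # The β parts: alternatives that do not start with nt (including Ɛ).
    betas = [p for p in productions if not p or p[0] != nt]
    if not alphas:
        return {nt: productions}          # nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


# X -> X a | b   becomes   X -> b X',  X' -> a X' | Ɛ
print(eliminate_direct_left_recursion("X", [["X", "a"], ["b"]]))
```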
Eliminate Ɛ-productions from the following grammar:
S -> XX | Y
X -> aXb | Ɛ
Y -> aYb | Z
Z -> bZa | Ɛ
S -> XX | Y | X | Ɛ
X -> aXb | ab
Y -> aYb | ab | Z
Z -> bZa | ba
Outline how bottom-up parsing works, with respect to a parse tree
Start at the leaves, and grow towards the root:
-At each step, identify a contiguous substring at the parse tree's upper fringe which matches the RHS of a production
-Build a node for the rule's LHS, and connect it to the tree
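The idea can be illustrated with a toy shift-reduce loop in which a stack plays the role of the upper fringe. This is a deliberately simplified sketch: it reduces greedily whenever the top of the stack matches a right-hand side (longest first), whereas real bottom-up parsers such as LR parsers use tables to decide when to reduce. The function name `shift_reduce` and the example grammar E -> E + id | id are assumptions made for this illustration, and the greedy strategy is not correct for arbitrary grammars.

```python
def shift_reduce(tokens, productions, start):
    """Return True if greedy shift-reduce collapses tokens to the start symbol."""
    # Try longer right-hand sides first so E + id wins over a bare id.
    rules = sorted(productions, key=lambda rule: len(rule[1]), reverse=True)
    stack = []
    for tok in tokens:
        stack.append(tok)                 # shift the next input token
        reduced = True
        while reduced:                    # reduce while some RHS matches the stack top
            reduced = False
            for lhs, rhs in rules:
                if rhs and stack[-len(rhs):] == rhs:
                    stack[-len(rhs):] = [lhs]   # replace the RHS with a node for the LHS
                    reduced = True
                    break
    return stack == [start]


toy_grammar = [("E", ["E", "+", "id"]), ("E", ["id"])]
print(shift_reduce(["id", "+", "id", "+", "id"], toy_grammar, "E"))   # True
print(shift_reduce(["id", "+"], toy_grammar, "E"))                    # False
```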
Outline how top-down parsing works, with respect to a parse tree
Start at the root, growing towards the leaves:
-At each step, select some nonterminal node at the lower fringe of the tree
-Extend it with a subtree, representing the RHS of a production that rewrites the nonterminal
e.g. Expr ===> Expr | Factor
Which nonterminals of the following grammar are nullable?
S -> XX | Y
X -> aXb | Ɛ
Y -> aYb | Z
Z -> bZa | Ɛ
All of them!
-nullable(X) and nullable(Z) are trivially true
-nullable(Y) is true, since Y -> Z and nullable(Z)
-nullable(S) is true, since S -> Y and nullable(Y)
nullable(abc.xyz) = ? (where "abc" and "xyz" are sequences of grammar symbols)
nullable(abc) AND nullable(xyz)
nullable(N) = ? (where N is a nonterminal)
nullable(n1) OR nullable(n2) OR ... where N has productions n1, n2, ...
nullable(Ɛ) = ?
True
When is nullable(A) true?
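If the nonterminal A can derive Ɛ, i.e. A -*> Ɛ (either directly via A -> Ɛ, or via a production whose RHS consists only of nullable symbols)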
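The nullable rules on these cards translate into a fixed-point computation: keep applying them until no new nullable nonterminal is found. This is a minimal sketch, assuming a grammar is a dict from nonterminal to a list of right-hand sides, with the empty list standing for Ɛ. The function name `nullable_nonterminals` is hypothetical.

```python
def nullable_nonterminals(grammar):
    """Return the set of nonterminals that can derive Ɛ."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            if nt in nullable:
                continue
            # nullable(N) is an OR over N's productions;
            # nullable of a sequence is an AND over its symbols.
            # all() of an empty production (Ɛ) is True, and terminals are
            # never added to the set, so they always count as not nullable.
            if any(all(sym in nullable for sym in p) for p in prods):
                nullable.add(nt)
                changed = True
    return nullable


grammar = {
    "S": [["X", "X"], ["Y"]],
    "X": [["a", "X", "b"], []],
    "Y": [["a", "Y", "b"], ["Z"]],
    "Z": [["b", "Z", "a"], []],
}
print(sorted(nullable_nonterminals(grammar)))   # ['S', 'X', 'Y', 'Z']
```

The loop is needed because nullable(Y) depends on nullable(Z), which may only be discovered in a later pass over the rules.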
What is the root of the parse tree?
The input grammar's start symbol!