Programming Language Concepts Chapter 4
addChar
Adds the character in nextChar to the end of lexeme.
Responsibilities of a syntax analyzer, or parser
Determine whether the input program is syntactically correct. Produce a parse tree.
State Diagram
Directed Graph (it recognizes names, integer literals, parentheses, and arithmetic operators)
Parsers are categorized by
The direction in which they build the parse tree: top-down or bottom-up
A recursive-descent parser
is coded directly from the BNF description of the syntax of a language. An alternative is to use a parsing table rather than code.
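As a rough illustration of "coded directly from the BNF", the sketch below gives each nonterminal of a small expression grammar its own method. The grammar, the `tokenize` helper, and the `Parser` class are assumptions made for this example. The result is a nested tuple rather than a full parse tree, and `tokenize` silently skips characters it does not recognize.

```python
import re

def tokenize(text):
    # Hypothetical helper: names, integer literals, and single-character operators.
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Current token, or None at end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # <expr> -> <term> {(+ | -) <term>}
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        # <term> -> <factor> {(* | /) <factor>}
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        # <factor> -> id | int_constant | ( <expr> )
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok[0].isalpha() or tok[0] == "_" or tok.isdigit():
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

For example, `parse("a + b * 2")` groups the multiplication under the addition because `term` is called from inside `expr`.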
Bottom-up parsers are often called
shift-reduce algorithms
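The name comes from the two basic actions: shift the next input token onto a stack, or reduce a right-hand side on top of the stack to its left-hand side. The sketch below is a hand-written recognizer for the toy grammar S -> ( S ) | a, chosen as an assumption for this example. It decides when to reduce by inspecting the stack top directly, which works for this grammar only; a real LR parser consults a parsing table instead.

```python
def shift_reduce_accepts(tokens):
    """Recognize the toy grammar  S -> ( S ) | a  by shifting and reducing."""
    stack = []
    for tok in tokens:
        if tok not in ("(", ")", "a"):
            return False                    # not a terminal of this grammar
        stack.append(tok)                   # shift
        while True:                         # reduce while a right-hand side is on top
            if stack[-1:] == ["a"]:
                stack[-1:] = ["S"]          # reduce by S -> a
            elif stack[-3:] == ["(", "S", ")"]:
                stack[-3:] = ["S"]          # reduce by S -> ( S )
            else:
                break
    # Accept only if the whole input was reduced to the start symbol.
    return stack == ["S"]
```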
Three ways to Implement Programming Languages
Compilation, Pure interpretation, and Hybrid implementation
lookup
Computes the token code for single-character tokens (parentheses and arithmetic operators).
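A minimal sketch of such a routine, assuming tokens are integer codes referenced through named constants. The numeric values, the `UNKNOWN_TOKEN` fallback, and the table name are hypothetical choices for this example.

```python
# Hypothetical token codes; only the names matter to the rest of the analyzer.
ADD_OP, SUB_OP, MULT_OP, DIV_OP = 21, 22, 23, 24
LEFT_PAREN, RIGHT_PAREN = 25, 26
UNKNOWN_TOKEN = -1  # assumption: returned for any other character

SINGLE_CHAR_TOKENS = {
    "(": LEFT_PAREN,
    ")": RIGHT_PAREN,
    "+": ADD_OP,
    "-": SUB_OP,
    "*": MULT_OP,
    "/": DIV_OP,
}

def lookup(ch):
    """Return the token code for a single-character token."""
    return SINGLE_CHAR_TOKENS.get(ch, UNKNOWN_TOKEN)
```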
Syntax Analyzer
Deals with large-scale constructs, such as expressions, statements, and program units.
Lexical Analyzer
Deals with small-scale language constructs, such as names and numeric literals.
getChar
Gets the next input character and puts it in a global variable named nextChar. Also determines the character class of the input character and puts it in the global variable charClass.
Efficiency
It becomes easier to optimize the lexical analyzer.
Which analyzers do all three implementation approaches use?
Both a lexical analyzer and a syntax analyzer
Mixed strings (terminals and/or nonterminals)
Lowercase Greek letters (α, β, γ, δ)
Terminal symbols
Lowercase letters at the beginning of the alphabet (a, b, ...)
Strings of terminals
Lowercase letters at the end of the alphabet (w, x, y, z)
Parsing
Often referred to as syntax analysis
Bottom-up
Parsers build the tree from the leaves upward to the root.
Top-down
Parsers build the tree from the root downward to the leaves.
Simplicity
Removes the details of lexical analysis from the syntax analyzer, which makes the syntax analyzer smaller and less complex.
Why do the compilers separate the analyzers?
Simplicity, Efficiency, and Portability
getNonBlank
Skips white space
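The helper routines getChar, addChar, and getNonBlank work together on shared state. The sketch below keeps that state as attributes of a hypothetical `Scanner` class instead of global variables, and reads from a string rather than a file. The character-class codes are assumed values.

```python
LETTER, DIGIT, UNKNOWN, EOF = 0, 1, 99, -1  # hypothetical character-class codes

class Scanner:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.next_char = ""
        self.char_class = EOF
        self.lexeme = ""
        self.get_char()  # prime next_char with the first input character

    def get_char(self):
        """Read the next character and record its character class."""
        if self.pos < len(self.text):
            self.next_char = self.text[self.pos]
            self.pos += 1
            if self.next_char.isalpha():
                self.char_class = LETTER
            elif self.next_char.isdigit():
                self.char_class = DIGIT
            else:
                self.char_class = UNKNOWN
        else:
            self.next_char = ""
            self.char_class = EOF

    def add_char(self):
        """Append next_char to the end of the lexeme being built."""
        self.lexeme += self.next_char

    def get_non_blank(self):
        """Skip white space until next_char is a non-blank character or input ends."""
        while self.char_class != EOF and self.next_char.isspace():
            self.get_char()
```

A caller would typically run `get_non_blank()`, then alternate `add_char()` and `get_char()` while the character class stays the same.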
Handle
The correct right-hand side (RHS) in a right sentential form to reduce at a given step of a bottom-up parse
LL Parser
The first L in LL specifies a left-to-right scan of the input; the second L specifies that a leftmost derivation is generated.
Portability
The lexical analyzer reads input files and often buffers that input, so it may be somewhat platform-dependent; the syntax analyzer can remain platform-independent.
A state diagram has:
Nodes labeled with state names and arcs labeled with input characters. An arc may also include actions to be done when the transition is taken.
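One way to picture this is to write the diagram as a transition table keyed by (state, character class). The sketch below covers only names and integer literals; the state names and token names are assumptions for this example, and the actions attached to arcs are left out.

```python
def char_class(ch):
    # Group input characters into classes so the table stays small.
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

# Arcs of the diagram: (current state, character class) -> next state.
TRANSITIONS = {
    ("start", "letter"): "in_name",
    ("start", "digit"): "in_int",
    ("in_name", "letter"): "in_name",
    ("in_name", "digit"): "in_name",
    ("in_int", "digit"): "in_int",
}

# Accepting states and the token each one recognizes.
ACCEPTING = {"in_name": "IDENT", "in_int": "INT_LIT"}

def classify(lexeme):
    """Follow the arcs for each character; return the token name or None."""
    state = "start"
    for ch in lexeme:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return None  # no arc for this character: not recognized
    return ACCEPTING.get(state)
```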
Advantages of LR parsers
They can be built for all programming languages. They can detect syntax errors as soon as possible in a left-to-right scan. The LR class of grammars is a proper superset of the class parsable by LL parsers.
Nonterminal symbols
Uppercase letters at the beginning of the alphabet (A, B, ...)
Terminals or nonterminals
Uppercase letters at the end of the alphabet (W, X, Y, Z)
simple phrase
A phrase that is derived from a nonterminal in a single step.
phrase
a string consisting of all of the leaves of the partial parse tree that is rooted at one particular internal node of the whole parse tree.
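To make phrases concrete, the sketch below lists every phrase of a partial parse tree and marks the simple ones. The tree encoding (tuples for internal nodes, strings for leaves) and the example tree for the sentential form E + T * id are assumptions for illustration. The `handle` function assumes the usual definition that the handle of a right sentential form is its leftmost simple phrase.

```python
def leaves(node):
    """Leaves of the subtree rooted at node, left to right."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node[1:]:
        result.extend(leaves(child))
    return result

def phrases(node):
    """Return (phrase, is_simple) for every internal node, in preorder."""
    if isinstance(node, str):
        return []
    children = node[1:]
    # A simple phrase comes from a node whose children are all leaves,
    # i.e. it is derived from the nonterminal in a single step.
    is_simple = all(isinstance(c, str) for c in children)
    found = [(" ".join(leaves(node)), is_simple)]
    for child in children:
        found.extend(phrases(child))
    return found

def handle(node):
    """Leftmost simple phrase, or None if the tree has no internal node."""
    for phrase, is_simple in phrases(node):
        if is_simple:
            return phrase
    return None

# Partial parse tree for the sentential form  E + T * id
tree = ("E", "E", "+", ("T", "T", "*", ("F", "id")))
```

Here `phrases(tree)` yields the phrases `E + T * id`, `T * id`, and `id`; only `id` is simple, so it is the handle.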
Lexemes
are recognized by matching the input against patterns.
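A common way to express such patterns is with regular expressions. The sketch below is one possible arrangement, using Python's `re` module with one named group per token class. The token names and the pattern set are assumptions for this example, not a fixed standard.

```python
import re

# Hypothetical token classes and the patterns that describe their lexemes.
TOKEN_PATTERNS = [
    ("INT_LIT", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))

def lex(text):
    """Return a list of (token name, lexeme) pairs for the input text."""
    pos = 0
    out = []
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"illegal character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":      # white space separates lexemes but is not a token
            out.append((m.lastgroup, m.group()))
        pos = m.end()
    return out
```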
Tokens
are usually coded as integer values, but for the sake of readability, they are often referenced through named constants.
A lexical analyzer collects
input characters into groups (lexemes) and assigns an internal code (a token) to each group
LR
The L specifies a left-to-right scan and the R specifies that a rightmost derivation is generated.
pairwise disjointness test
used to test a non-left-recursive grammar to determine whether it can be parsed in a top-down fashion. This test requires computing FIRST sets.
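A minimal sketch of the test, assuming the grammar has no left recursion (direct or indirect) and no empty right-hand sides; with either, this simple `first` function would not work. The dictionary encoding of the grammar and the two example grammars are assumptions for illustration. Any symbol that is not a key of the dictionary is treated as a terminal.

```python
def first(symbol, grammar):
    """FIRST set of a symbol: the terminals that can begin strings derived from it."""
    if symbol not in grammar:
        return {symbol}                  # a terminal begins with itself
    result = set()
    for rhs in grammar[symbol]:
        result |= first(rhs[0], grammar)  # no empty RHSs, so only the first symbol matters
    return result

def pairwise_disjoint(grammar):
    """True if, for every nonterminal, the FIRST sets of its RHSs do not overlap."""
    for alternatives in grammar.values():
        seen = set()
        for rhs in alternatives:
            f = first(rhs[0], grammar)
            if f & seen:
                return False             # two RHSs can start with the same terminal
            seen |= f
    return True

# A -> aB | bAb | Bb      B -> cB | d
passes = {"A": [["a", "B"], ["b", "A", "b"], ["B", "b"]],
          "B": [["c", "B"], ["d"]]}

# A -> aB | BAb           B -> aB | b
fails = {"A": [["a", "B"], ["B", "A", "b"]],
         "B": [["a", "B"], ["b"]]}
```

In the second grammar both right-hand sides of A can begin with `a`, so a top-down parser cannot choose between them by looking at one token.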