LEXICAL ANALYSIS
DISADVANTAGES
1. Complexity 2. Limited error detection 3. Increased code size 4. Reduced flexibility
How a Lexical Analyzer works
1. Input preprocessing 2. Tokenization 3. Token classification 4. Token validation 5. Output generation
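A minimal Python sketch of these five stages, assuming a toy language with '#' line comments and a small keyword set; the function name lex, the keyword set, and the token class names are illustrative, not taken from any particular compiler:

    import re

    KEYWORDS = {"if", "else", "while", "return"}        # assumed toy keyword set

    def lex(source):
        # 1. Input preprocessing: strip '#' line comments from the character stream.
        source = re.sub(r"#[^\n]*", "", source)

        # 2. Tokenization: split the stream into candidate lexemes.
        lexemes = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)

        tokens = []
        for lexeme in lexemes:
            # 3. Token classification: decide which class each lexeme belongs to.
            if lexeme in KEYWORDS:
                kind = "KEYWORD"
            elif lexeme[0].isdigit():
                kind = "NUMBER"
            elif lexeme[0].isalpha() or lexeme[0] == "_":
                kind = "IDENTIFIER"
            elif lexeme in "+-*/=<>(){};,":
                kind = "OPERATOR"
            # 4. Token validation: anything else is a lexical error.
            else:
                raise SyntaxError(f"illegal character: {lexeme!r}")
            tokens.append((kind, lexeme))

        # 5. Output generation: the token stream handed on to the syntax analyzer.
        return tokens

    print(lex("if (x > 10) y = y + 1;  # bump y"))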
The role of the Lexical Analyzer in compiler design
1. Reads the stream of characters from the source program, 2. checks for legal tokens, 3. passes tokens to the syntax analyzer when it demands them, 4. helps enter the identified tokens into the symbol table, 5. removes white-space and comments from the source program, 6. correlates error messages with the source program.
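A hedged sketch of points 3 and 4 above: a Lexer class (a hypothetical name) that hands tokens to the syntax analyzer one at a time, on demand, and records identifiers in a symbol table as it goes:

    # Token stream as (kind, lexeme) pairs, e.g. produced by a scanner like the one above.
    TOKENS = [("KEYWORD", "if"), ("OPERATOR", "("), ("IDENTIFIER", "x"),
              ("OPERATOR", ">"), ("NUMBER", "10"), ("OPERATOR", ")"),
              ("IDENTIFIER", "y"), ("OPERATOR", "="), ("NUMBER", "1"),
              ("OPERATOR", ";")]

    class Lexer:
        """Hands tokens to the syntax analyzer one at a time, on demand."""

        def __init__(self, tokens):
            self._tokens = iter(tokens)
            self.symbol_table = {}                       # identifier -> attributes

        def next_token(self):
            # Called by the parser whenever it needs the next token.
            kind, lexeme = next(self._tokens, ("EOF", None))
            if kind == "IDENTIFIER":
                # Help enter the identifier into the symbol table on first sight.
                self.symbol_table.setdefault(lexeme, {})
            return kind, lexeme

    lexer = Lexer(TOKENS)
    while lexer.next_token()[0] != "EOF":                # the parser would drive this loop
        pass
    print(lexer.symbol_table)                            # {'x': {}, 'y': {}}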
Lexical error
A character sequence that cannot be scanned to form any valid token is a lexical error, for example an illegal character such as @ in a language whose alphabet does not include it.
Efficiency
A lexer may do the simple parts of the work faster than the more general parser can. Furthermore, the size of a system that is split in two may be smaller than a combined system.
Regular expression
An algebraic notation for describing sets of strings
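For example, the regular expression [0-9]+ describes the set of all non-empty digit strings. A quick illustration with Python's re module; the pattern names are just for the example:

    import re

    # Each regular expression denotes a set of strings (a regular language).
    IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")   # letters/digits, not starting with a digit
    NUMBER     = re.compile(r"[0-9]+")                   # one or more digits

    print(bool(IDENTIFIER.fullmatch("count1")))          # True: "count1" is in the set
    print(bool(IDENTIFIER.fullmatch("1count")))          # False: starts with a digit
    print(bool(NUMBER.fullmatch("2024")))                # True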
Semantic error
An error in a program that makes it do something other than what the programmer intended.
Runtime error
An error that occurs while the program is running after being successfully compiled
Languages for Specifying Lexical Analyzers
An example of such a tool is Lex
Lexical Analysis can be implemented with
a Deterministic Finite Automaton (DFA)
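A small hand-coded DFA, sketched in Python, that accepts exactly the set of unsigned integers described by [0-9]+; the state names are illustrative:

    # A DFA for unsigned integers: states, a transition function, and accepting states.
    START, IN_NUMBER, DEAD = "start", "in_number", "dead"
    ACCEPTING = {IN_NUMBER}

    def step(state, ch):
        # Transition function: stay in the number state on digits, otherwise reject.
        if state in (START, IN_NUMBER) and ch.isdigit():
            return IN_NUMBER
        return DEAD

    def accepts(text):
        state = START
        for ch in text:
            state = step(state, ch)
        return state in ACCEPTING

    print(accepts("12345"))   # True
    print(accepts("12a45"))   # False: 'a' sends the DFA to the dead state
    print(accepts(""))        # False: the start state is not accepting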
Advantages of lexical analyzer
1. Efficiency 2. Flexibility 3. Error detection 4. Code optimization
Reasons for keeping lexical and syntax analysis phases separate:
Efficiency, Tradition, Modularity
Logical error
An error where the instructions given to the program do not accomplish the intended goal.
Tradition
Languages are often designed with separate lexical and syntactical phases in mind, and the standard documents of such languages typically separate lexical and syntactical elements of the languages.
Lexical analysis
Lexical Analysis is the first phase of the compiler. It converts the input string into a sequence of tokens, each of which corresponds to a symbol in the programming language.
Lexical analyzer/lexer
Programs that perform Lexical Analysis in compiler design are called lexical analyzers or lexers. A lexer contains a tokenizer or scanner. Additionally, it filters out whatever separates the tokens (the white-space) and comments.
Modularity
The syntactical description of the language need not be cluttered with small lexical details such as white-space and comments.
Construction of a Lexical Analyzer
There are two general ways to construct a lexical analyzer: • Hand implementation • Automatic generation of the lexical analyzer
How to write a lexer by hand
You first read past the initial white-space, then test, in sequence, whether the next token is a keyword, a number, a variable, and so on.
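A hand-written sketch of that approach in Python, under the assumption of a toy keyword set and simple token classes; the names are illustrative:

    KEYWORDS = {"if", "else", "while", "return"}           # assumed toy keyword set

    def next_token(text, pos):
        """Return (token, new_pos) for the token starting at or after pos."""
        # Read past initial white-space.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return ("EOF", None), pos

        ch = text[pos]
        # A number: a run of digits.
        if ch.isdigit():
            start = pos
            while pos < len(text) and text[pos].isdigit():
                pos += 1
            return ("NUMBER", text[start:pos]), pos
        # A keyword or variable: a letter or underscore followed by letters, digits, underscores.
        if ch.isalpha() or ch == "_":
            start = pos
            while pos < len(text) and (text[pos].isalnum() or text[pos] == "_"):
                pos += 1
            lexeme = text[start:pos]
            return ("KEYWORD" if lexeme in KEYWORDS else "IDENTIFIER", lexeme), pos
        # Anything else is treated here as a single-character operator or punctuation.
        return ("OPERATOR", ch), pos + 1

    pos, tok = 0, None
    while tok != ("EOF", None):
        tok, pos = next_token("while x1 < 10", pos)
        print(tok)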
Error Recovery techniques in Lexical Analyzer
Error recovery techniques: * Remove one character from the remaining input * Insert a missing character into the remaining input * Replace a character with another character * Transpose two adjacent characters
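A minimal sketch of the first technique (recovery by deleting the offending character) in Python; the token classes and the set of characters treated as legal are assumptions made for the example:

    def tokenize_with_recovery(text):
        """Scan a toy language; on an illegal character, report it and delete it
        from the remaining input (the simplest of the recovery techniques above)."""
        tokens, errors, pos = [], [], 0
        while pos < len(text):
            ch = text[pos]
            if ch.isspace():
                pos += 1
            elif ch.isdigit():
                start = pos
                while pos < len(text) and text[pos].isdigit():
                    pos += 1
                tokens.append(("NUMBER", text[start:pos]))
            elif ch.isalpha():
                start = pos
                while pos < len(text) and text[pos].isalnum():
                    pos += 1
                tokens.append(("IDENTIFIER", text[start:pos]))
            elif ch in "+-*/=<>":
                tokens.append(("OPERATOR", ch))
                pos += 1
            else:
                # Recovery by deletion: report, skip the character, and keep scanning.
                errors.append(f"illegal character {ch!r} at position {pos}")
                pos += 1
        return tokens, errors

    print(tokenize_with_recovery("x = 1 @ y"))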
The biggest drawback of using a lexical analyzer is that
additional runtime overhead is required to generate the lexer tables and construct the tokens.
Lexer generator
Lexers are normally constructed by lexer generators, which transform human-readable specifications of tokens and white-space into efficient programs.
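A small Python sketch of the idea behind a lexer generator: a human-readable token specification is combined into a single regular expression, and the white-space rule is filtered out of the output. The specification format and names are illustrative, not Lex's actual input syntax:

    import re

    # A human-readable token specification, in the spirit of a Lex input file.
    # Rule order matters; white-space is matched so it can be filtered out.
    TOKEN_SPEC = [
        ("NUMBER",     r"\d+"),
        ("IDENTIFIER", r"[A-Za-z_]\w*"),
        ("OPERATOR",   r"[+\-*/=<>]"),
        ("SKIP",       r"\s+"),
    ]

    # "Generate" the scanner: combine the rules into one regex with named groups.
    MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

    def scan(text):
        for match in MASTER.finditer(text):
            if match.lastgroup != "SKIP":          # drop the white-space tokens
                yield match.lastgroup, match.group()

    print(list(scan("total = total + 42")))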