compilers (final exam)
NFA: Example of non-determinism, construct NFA from REs using Thompson's rules, what are the 5 rules, cycle of construction sequence
1: One initial state. 2: One final state. 3: The number of states is linear in the size of the regular expression E. 4: The number of transitions leaving any state is at most two. 5: Any RE can be converted into an NFA.
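A minimal sketch of a Thompson-style construction, to make the properties above concrete. The representation is an assumption made for illustration: each fragment is a `(start, final, edges)` tuple, `None` stands for an epsilon edge, and the helper names (`lit`, `concat`, `union`, `star`, `accepts`) are hypothetical.

```python
from itertools import count

_ids = count()  # source of fresh state numbers

def lit(ch):
    """Fragment for one symbol: start --ch--> final."""
    s, f = next(_ids), next(_ids)
    return s, f, [(s, ch, f)]

def concat(a, b):
    """a then b: epsilon edge from a's final state to b's start state."""
    return a[0], b[1], a[2] + b[2] + [(a[1], None, b[0])]

def union(a, b):
    """a | b: new start branches to both, both finals join a new final."""
    s, f = next(_ids), next(_ids)
    extra = [(s, None, a[0]), (s, None, b[0]), (a[1], None, f), (b[1], None, f)]
    return s, f, a[2] + b[2] + extra

def star(a):
    """a*: allow skipping a entirely or looping back through it."""
    s, f = next(_ids), next(_ids)
    extra = [(s, None, a[0]), (s, None, f), (a[1], None, a[0]), (a[1], None, f)]
    return s, f, a[2] + extra

def eps_closure(states, edges):
    """All states reachable from `states` using only epsilon edges."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, sym, dst in edges:
            if src == q and sym is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, text):
    """Simulate the NFA by tracking the set of states it could be in."""
    start, final, edges = nfa
    current = eps_closure({start}, edges)
    for ch in text:
        moved = {dst for src, sym, dst in edges if src in current and sym == ch}
        current = eps_closure(moved, edges)
    return final in current

# NFA for the regular expression a(b|c)*
nfa = concat(lit("a"), star(union(lit("b"), lit("c"))))
```

Each helper adds at most two new states and never gives a state more than two outgoing edges, which is where properties 3 and 4 come from. The set of current states in `accepts` is the non-determinism: after reading `a`, the machine may be in several states at once.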
Which errors are handled at what level
1P: syntax errors; arithmetic errors, like 1/0. 2P:
Ad hoc syntax-directed translation and usage of the symbol table
A common method of syntax-directed translation is translating a string into a sequence of actions by attaching one such action to each rule of a grammar.
Constant folding and constant propagation.
Constant folding: evaluate a constant expression at compile time and replace the expression with its result. Constant propagation: a constant assigned to a variable is propagated through the flow graph and substituted at each use of the variable, as long as no other assignment can reach that use.
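A small sketch of both ideas on straight-line code. The IR is an assumption made for illustration: an expression is an `int`, a variable name, or a tuple `(op, left, right)`, and a program is a list of `(target, expr)` assignments. The names `fold` and `optimize` are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr, env):
    """Simplify expr using the known constants in env."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        # Constant propagation: replace a variable by its known constant.
        return env.get(expr, expr)
    op, left, right = expr
    left, right = fold(left, env), fold(right, env)
    if isinstance(left, int) and isinstance(right, int):
        # Constant folding: both operands are known, so compute now.
        return OPS[op](left, right)
    return (op, left, right)

def optimize(program):
    """Fold and propagate through a list of (target, expr) assignments."""
    env, out = {}, []
    for target, expr in program:
        value = fold(expr, env)
        if isinstance(value, int):
            env[target] = value       # target now holds a known constant
        else:
            env.pop(target, None)     # target is no longer a known constant
        out.append((target, value))
    return out

program = [
    ("x", ("+", 2, 3)),      # x = 2 + 3
    ("y", ("*", "x", 4)),    # y = x * 4
    ("z", ("+", "y", "n")),  # z = y + n   (n is unknown at compile time)
]
optimized = optimize(program)
# optimized == [("x", 5), ("y", 20), ("z", ("+", 20, "n"))]
```

With branches and loops, propagation needs a data-flow analysis over the flow graph; this sketch only covers a single basic block.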
Responsibilities of each of the passes and phases
First pass: referred to as (a) the front end, (b) the analysis part, (c) platform independent. Second pass: referred to as (a) the back end, (b) the synthesis part, (c) platform dependent.
DFA: drawing the DFA of a string, acceptance criteria, REs vs DFAs (why DFA needed)
An RE only describes a pattern; it has no states to step through, so it cannot be executed directly. In a DFA, every (state, input symbol) pair leads to exactly one next state, so we can write simple code that reads the string one symbol at a time and accepts if it ends in an accepting state.
Errors that can be identified by the semantic analysis component
Semantic errors: type mismatch, undeclared variables, reserved identifier misuse.
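A sketch of why a DFA is easy to turn into code: the whole recognizer is one table lookup per input symbol. The table layout (a dict keyed by `(state, symbol)`) and the name `run_dfa` are assumptions for illustration; the example DFA recognizes the RE `a(b|c)*`.

```python
def run_dfa(transitions, start, accepting, text):
    """Return True if the DFA accepts text."""
    state = start
    for ch in text:
        state = transitions.get((state, ch))
        if state is None:        # no transition defined: reject
            return False
    return state in accepting    # acceptance: end in an accepting state

# DFA for a(b|c)*: state 0 is the start, state 1 is accepting.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 1,
}

print(run_dfa(TRANSITIONS, 0, {1}, "abcb"))  # True
print(run_dfa(TRANSITIONS, 0, {1}, "ba"))    # False
```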
DFA minimization: Hopcroft's algorithm
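Hopcroft's algorithm starts by partitioning the states into accepting and non-accepting sets, then keeps splitting any set whose states disagree on which set some input symbol leads to. When no set can be split further, the remaining sets are the equivalence classes, and each one becomes a single state of the minimal DFA.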
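A sketch of the split-until-stable idea. This is the simple refinement loop (closer to Moore's formulation), not Hopcroft's worklist version, which reaches the same partition with less work. The function name and the dict-based transition table are assumptions for illustration, and the table must be total (every state has a transition on every symbol).

```python
def equivalence_classes(states, alphabet, delta, accepting):
    """Split states into groups that no input string can tell apart."""
    partition = [set(accepting), set(states) - set(accepting)]
    partition = [block for block in partition if block]
    changed = True
    while changed:
        changed = False
        block_of = {s: i for i, block in enumerate(partition) for s in block}
        refined = []
        for block in partition:
            # Group states by which block each symbol sends them to.
            groups = {}
            for s in block:
                key = tuple(block_of[delta[(s, a)]] for a in alphabet)
                groups.setdefault(key, set()).add(s)
            refined.extend(groups.values())
            if len(groups) > 1:
                changed = True   # this block was split; go around again
        partition = refined
    return {frozenset(block) for block in partition}

# DFA over {a, b} accepting strings that end in "ab"; state 3 duplicates state 0.
DELTA = {
    (0, "a"): 1, (0, "b"): 3,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
classes = equivalence_classes([0, 1, 2, 3], ["a", "b"], DELTA, {2})
# classes == {frozenset({0, 3}), frozenset({1}), frozenset({2})}
```

The first pass splits `{0, 1, 3}` because state 1 moves to the accepting block on `b` while states 0 and 3 do not; the second pass finds nothing left to split.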
Terminologies: Tokens, Lexemes, types, microsyntax
Tokens: categorized representations of the lexemes, passed on to the next step. Lexemes: the actual words (character sequences) within the language. Types: Microsyntax: describes how elements of the alphabet are grouped together, e.g. alphabetic letters are grouped from left to right to form a word and a blank space terminates a word.
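A tiny scanner sketch showing the token/lexeme distinction. The token names, the patterns (the microsyntax), and the `LexicalError` class are all assumptions chosen for illustration, not a fixed standard.

```python
import re

class LexicalError(Exception):
    """Raised when no microsyntax rule matches the input."""

# Microsyntax: one regular expression per token type.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),  # whitespace separates words but is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Turn a character string into a list of (token type, lexeme) pairs."""
    tokens, pos = [], 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise LexicalError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(tokenize("count = count + 42"))
# [('IDENT', 'count'), ('OP', '='), ('IDENT', 'count'), ('OP', '+'), ('NUMBER', '42')]
```

Here `count` and `42` are lexemes, while `IDENT` and `NUMBER` are the token types the parser sees. A character that fits no rule, such as `$`, is reported as a lexical error.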
Dead code elimination, unreachable code elimination, which one improves execution time, what is the relationship between dead code and unreachable code?
Unreachable code: code that can never be executed. Removing it shrinks the program but does not improve execution time. Dead code: code that executes but has no effect on program output. Removing it may improve execution time.
Given an example, identify the line numbers where different variables are live.
A variable is live at a point in the program if it holds a value that might be read later, before that value is overwritten.
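A sketch of how the live lines can be computed for straight-line code by scanning backward. The instruction encoding, a `(defined_variable, used_variables)` pair per line, and the function name are assumptions for illustration.

```python
def live_before_each(instructions, live_out=()):
    """Return the set of live variables just before each instruction."""
    live = set(live_out)
    result = [None] * len(instructions)
    for i in range(len(instructions) - 1, -1, -1):
        defined, used = instructions[i]
        live.discard(defined)  # the old value is overwritten here
        live.update(used)      # these values are needed by this line
        result[i] = set(live)
    return result

code = [
    ("a", []),          # 1: a = 1
    ("b", ["a"]),       # 2: b = a + 2
    ("c", ["a", "b"]),  # 3: c = a * b
    (None, ["c"]),      # 4: return c
]
liveness = live_before_each(code)
# liveness == [set(), {"a"}, {"a", "b"}, {"c"}]
```

Reading the result: `a` is live going into lines 2 and 3, `b` going into line 3, and `c` going into line 4. With branches, the same rule is applied per block and repeated until the sets stop changing.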
Instruction scheduling
Reordering instructions, without changing the program's result, to minimize pipeline stalls, e.g. placing independent instructions in the cycles spent waiting on a load. Think pipelining.
What is the benefit of loop unrolling?
Better utilization of instruction-level parallelism and reduced loop overhead (fewer branch and counter-update instructions per element processed).
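A sketch of what the transformation looks like, unrolling by a factor of 4 (the factor and function names are arbitrary choices for illustration). Python is used only to show the shape of the code; the speedup itself shows up in compiled code, where each loop test and counter update is a real instruction.

```python
def sum_rolled(values):
    """Original loop: one test and one counter update per element."""
    total = 0
    i = 0
    while i < len(values):
        total += values[i]
        i += 1
    return total

def sum_unrolled_by_4(values):
    """Unrolled loop: one test and one counter update per four elements."""
    total = 0
    n = len(values)
    limit = n - n % 4  # largest multiple of 4 that fits
    i = 0
    while i < limit:
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
        i += 4
    while i < n:       # cleanup loop for the leftover 0-3 elements
        total += values[i]
        i += 1
    return total
```

The four loads in the unrolled body are independent of each other, which is what gives the hardware (or the instruction scheduler) room to overlap them.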
Peephole optimization
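Examining a small window (the peephole) of adjacent instructions and replacing expensive or redundant sequences with more efficient ones. Per Amdahl's law, the overall gain is bounded by how much of the running time those instructions account for.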
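A sketch of a peephole pass over a made-up three-field instruction format `(op, register, operand)`. The opcodes `addi`, `muli`, `slli`, `load`, and `store` are hypothetical here and stand for "add immediate", "multiply by immediate", "shift left by immediate", and memory access.

```python
def peephole(instructions):
    """Apply three local rewrites while copying instructions to the output."""
    out = []
    for op, reg, operand in instructions:
        if op == "addi" and operand == 0:
            continue                    # r = r + 0 does nothing: drop it
        if op == "muli" and operand == 2:
            op, operand = "slli", 1     # r * 2 becomes the cheaper r << 1
        if op == "load" and out and out[-1] == ("store", reg, operand):
            continue                    # value was just stored from reg: still there
        out.append((op, reg, operand))
    return out

before = [
    ("muli", "r1", 2),
    ("addi", "r1", 0),
    ("store", "r1", "x"),
    ("load", "r1", "x"),
    ("load", "r2", "x"),
]
after = peephole(before)
# after == [("slli", "r1", 1), ("store", "r1", "x"), ("load", "r2", "x")]
```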
Terminologies: compile time vs runtime
Compile time: the period when source code is converted into an executable. Run time: the period when the executable is running.
Input, output, and the responsibilities of semantic analysis
Input: parse tree and symbol table. Output: annotated syntax tree and any semantic errors. Responsibilities: type checking, label checking.
Difference between a compiler and interpreter
Interpreter: executes the source program directly, without first producing a separate executable. Compiler: translates the source program into machine code, which is then run.
Responsibilities, errors that can be handled by scanner
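The scanner groups characters into tokens and catches lexical errors, such as illegal characters or malformed tokens. On an error it stops intermediate code generation and points to the error. Syntax errors are caught later, by the parser.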
Left factoring vs left recursion
Left factoring: factoring out a prefix that is common to two or more productions of the same nonterminal. Left recursion: a production whose right-hand side begins with its own left-hand side, i.e. Expression ::= Expression '*' Expression
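A sketch of the standard rewrite for immediate left recursion: A ::= A alpha | beta becomes A ::= beta A' and A' ::= alpha A' | empty. The grammar encoding (each production is a list of symbols, `[]` is the empty production) and the function name are assumptions for illustration. Indirect left recursion is not handled.

```python
def eliminate_left_recursion(nonterminal, productions):
    """Rewrite the productions of one nonterminal so none starts with itself."""
    recursive = [p[1:] for p in productions if p and p[0] == nonterminal]
    others = [p for p in productions if not p or p[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}
    tail = nonterminal + "'"  # fresh helper nonterminal, e.g. Expr'
    return {
        nonterminal: [beta + [tail] for beta in others],
        tail: [alpha + [tail] for alpha in recursive] + [[]],
    }

# Expr ::= Expr '+' Term | Term
grammar = eliminate_left_recursion("Expr", [["Expr", "+", "Term"], ["Term"]])
# grammar == {
#     "Expr":  [["Term", "Expr'"]],
#     "Expr'": [["+", "Term", "Expr'"], []],
# }
```

The rewritten grammar generates the same strings, but a top-down parser can now consume a `Term` before it ever has to expand `Expr'`.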
The input to different passes, and phases of a compiler
Lexical analysis: takes in a character string, outputs tokens. Syntax analysis: takes in tokens, outputs syntax errors and builds the parse tree. Semantic analysis: takes in the parse tree and symbol table, outputs errors such as a type mismatch, incompatible operands, a function called with improper arguments, an undeclared variable, etc. Intermediate code generation (end of first pass): input: annotated syntax tree; output: intermediate code. Code optimization: input: intermediate code; output: optimized code that takes fewer resources. Code generation: input: optimized intermediate code; output: machine code (MIPS or ARM).
Examples of instructions that are expensive and their potential alternatives (e.g., moves instead of loads and stores)
Loads and stores are expensive because they access memory. When the value is already in a register, a register-to-register move is the cheaper alternative.
Types of assembly instruction: memory, arithmetic, branch
Memory: load and store are the only memory operations in MIPS. Arithmetic: operate only on registers. Branch: change the program counter based on a condition.
What are the properties of every transformation (safety, profitability, applicability)? What is NOT a property of the compiler transformations?
Safety: it must not change the output of the original code. Profitability: it should give some amount of speedup or reduction in the resources needed. Applicability: it should be usable at least once; the more uses, the better.
Context free grammar (CFG):
A set of grammar rules (productions), each rewriting a single nonterminal into a sequence of terminals and nonterminals, used to generate the strings of a language.
Given an example identify the edges where a variable is live (following the example on the slides)
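Simple code tracing: walk the flow graph and mark each edge on which the variable's current value may still be used later.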
When is procedure inlining beneficial? Why is it beneficial?
When: the function body is small and the function is called a substantial number of times. Why: it removes the function-calling overhead.
What is spatial locality, what is temporal locality?
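Spatial: data located physically close to recently accessed data is likely to be accessed soon. Temporal: data that was accessed recently is likely to be accessed again soon, i.e. the distance in time between uses of the same data is short.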
Why abstraction layers are important
They hide the complex aspects of compilation from the user and from other parts of the project.
What are the goals/benefits of instruction selection and scheduling?
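To make the generated code use fewer resources, i.e. cycles and memory: selection picks cheaper instructions, and scheduling orders them to avoid stalls.
Top-down parsing: with backtracking, definition, example, left recursion and elimination
Top-down parsing with backtracking starts from the start symbol and tries each alternative of a production in turn. When an alternative fails to match the input, the parser backs up and tries the next one, and it does not move on until it finds a match. Left recursion must be eliminated first, otherwise the parser can keep expanding the same nonterminal without consuming any input.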
Types of parsing algorithm - top down and bottom up, which one is recursive, which one is table driven (bottom up)
Top-down: recursive. Bottom-up: table driven.
What are different types of compiler transformations (machine dependent, independence, high-level, low-level) and examples of each
By when applied: high level: loop interchange, tiling; low level: loop unrolling, procedure inlining. By applicability: machine dependent: register allocation, instruction scheduling; machine independent: dead code elimination, constant propagation, and constant folding.
How is precedence handled in CFG, rules to enforce precedence
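You enforce precedence by giving each precedence level its own nonterminal, with higher-precedence operators deeper in the grammar so they bind tighter, e.g. Expr ::= Expr '+' Term | Term and Term ::= Term '*' Factor | Factor. Left or right recursion within a level then sets the associativity of its operators.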
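A sketch of a recursive-descent (top-down) evaluator in which precedence falls out of the grammar's layering. The grammar is an assumption chosen for illustration: Expr ::= Term (('+' | '-') Term)*, Term ::= Factor (('*' | '/') Factor)*, Factor ::= NUMBER | '(' Expr ')'. The loops replace left recursion, and `/` is integer division to keep the example small.

```python
import re

def evaluate(text):
    """Evaluate an integer arithmetic expression with the usual precedence."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def expr():    # lowest precedence: + and -
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():    # higher precedence: * and /
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value //= factor()
        return value

    def factor():  # highest precedence: numbers and parentheses
        tok = advance()
        if tok == "(":
            value = expr()
            if advance() != ")":
                raise ValueError("expected ')'")
            return value
        if not tok.isdigit():
            raise ValueError(f"unexpected token {tok!r}")
        return int(tok)

    result = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result

print(evaluate("2+3*4"))    # 14
print(evaluate("(2+3)*4"))  # 20
print(evaluate("10-4-3"))   # 3
```

Because `expr` can only reach `*` by calling `term`, every multiplication is finished before the surrounding addition sees its operand. The left-to-right loops give left associativity, which is why `10-4-3` is 3 and not 9.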