Chapter 3: Describing Syntax and Semantics
What three extensions are common to most EBNFs?
1.) An optional part of the right-hand side can be placed in brackets. 2.) Braces can be used on the right-hand side to indicate that the enclosed part can be repeated indefinitely or left out altogether. 3.) Options (choices) can be placed in parentheses and separated by the OR (|) operator.
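The three extensions map directly onto the control flow of a hand-written parser. The following is a minimal Python sketch, assuming a hypothetical EBNF grammar `<expr> -> <term> {(+ | -) <term>}` and `<term> -> [-] NUMBER`; the grammar, the function names, and the whitespace-separated input are assumptions made for this illustration.

```python
def parse_term(tokens, pos):
    """<term> -> [-] NUMBER   (brackets: the minus sign is optional)."""
    sign = 1
    if pos < len(tokens) and tokens[pos] == "-":
        sign = -1
        pos += 1
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError(f"expected a number at position {pos}")
    return sign * int(tokens[pos]), pos + 1


def parse_expr(tokens, pos=0):
    """<expr> -> <term> {(+ | -) <term>}"""
    value, pos = parse_term(tokens, pos)
    # Braces: the enclosed part repeats zero or more times, so a loop suffices.
    # Parentheses with |: exactly one of the alternatives + or - is chosen.
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        value = value + right if op == "+" else value - right
    return value, pos


def evaluate(text):
    """Parse a whitespace-separated expression and return its value."""
    tokens = text.split()
    value, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError("leftover tokens")
    return value
```

For example, `evaluate("7 - 2 + 3")` returns 8, because the loop combines terms from left to right.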
What does a compiler do?
1.) Contains a language recognizer that looks at a sequence of tokens to determine if it matches the syntax of the language 2.) Contains a tokenizer that breaks a string into lexemes in a process called lexical analysis 3.) Produces machine code
Explain the primary uses of a methodology and notation for describing the semantics of programming languages.
1.) Programmers need to know precisely what the statements of a language do before they can use them effectively in their programs. 2.) Compiler writers must know exactly what language constructs mean to design implementations for them correctly. 3.) If there were a precise semantics specification of a programming language, programs written in the language could be proven correct without testing. Compilers could be shown to produce programs that exhibit exactly the behavior given in the language definition (their correctness could be verified). 4.) A complete specification of the syntax and semantics of a programming language could be used by a tool to generate a compiler for the language automatically. 5.) Language designers could discover ambiguities and inconsistencies in their designs.
What are attribute computation functions?
Also known as semantic functions and associated with grammar rules, they are used to specify how attribute values are computed. A semantic function is applied as a rule is resolved.
If a grammar generates a sentence with more than one leftmost derivation or more than one rightmost derivation, then the grammar is what?
Ambiguous
What are predicate functions?
Associated with grammar rules, they state the static semantic rules of a language.
What does lexical analysis mean?
Breaking the code into tokens
What is a context-free grammar?
Chomsky's generative device for context-free languages
What happens when a programming language syntax is ambiguous?
Different compiler writers may build compilers that assign different meanings to the same program.
BNF is almost identical to what class of grammars identified by Chomsky?
Context-free grammars
Parsing uses which grammars?
Context-free grammars
What does syntax do?
Defines all the legal sentences of a language; it must be precisely defined
Rules enforced at runtime are called what?
Dynamic
True or False: BNF is not a metalanguage.
False
What is a metasymbol?
Notational tools (brackets, braces, and parentheses)
What is a token?
A category of a language's lexemes (identifiers, including the names of variables, methods, and classes; operators; and keywords)
What is a leftmost derivation?
A derivation in which the replaced nonterminal is always the leftmost nonterminal in the previous sentential form
What is a rightmost derivation?
A derivation in which the replaced nonterminal is always the rightmost nonterminal in the previous sentential form
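The difference between the two derivation orders can be made concrete with a small sketch. The grammar in `RULES` and the helper `derive` are assumptions made for this example; each entry in `choices` is the index of the right-hand side to use for whichever nonterminal is replaced at that step.

```python
RULES = {
    "<assign>": [["<id>", "=", "<expr>"]],
    "<id>": [["A"], ["B"]],
    "<expr>": [["<id>", "+", "<expr>"], ["<id>"]],
}


def derive(start, choices, rules, leftmost=True):
    """Return every sentential form of a leftmost (or rightmost) derivation."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Positions of all nonterminals in the current sentential form.
        positions = [i for i, sym in enumerate(form) if sym in rules]
        i = positions[0] if leftmost else positions[-1]
        rhs = rules[form[i]][choice]
        form = form[:i] + rhs + form[i + 1:]
        steps.append(" ".join(form))
    return steps


# Both derivations end in the same sentence, A = B + A.
left = derive("<assign>", [0, 0, 0, 1, 1, 0], RULES)
right = derive("<assign>", [0, 0, 1, 0, 1, 0], RULES, leftmost=False)
```

The third sentential form is `A = <expr>` in the leftmost derivation and `<id> = <id> + <expr>` in the rightmost one; only the last form in each list is a sentence.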
What is an attribute grammar?
A device used to describe more of the structure of a programming language than can be described with a context-free grammar; a BNF grammar plus attributes plus semantic functions
What is a grammar?
A formal language-generation mechanism for specifying syntax; a generative device for defining languages
What is BNF (Backus-Naur Form) grammar?
A formal notation for specifying programming language syntax developed by John Backus and modified by Peter Naur
What is an ambiguous grammar?
A grammar that generates a sentential form for which there are two or more distinct parse trees
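One way to see ambiguity is to count parse trees. The sketch below assumes the classic ambiguous grammar `<E> -> <E> + <E> | <E> * <E> | id` (an assumption for this example, not a grammar given in these notes) and counts how many distinct parse trees derive a given token sequence.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees for <E> -> <E> + <E> | <E> * <E> | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from <E>.
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        # Try each operator in the span as the root of the tree.
        for k in range(lo + 1, hi - 1):
            if tokens[k] in ("+", "*"):
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))
```

`count_parse_trees("id + id * id".split())` returns 2: one tree groups the addition first and the other groups the multiplication first, so the grammar is ambiguous.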
What is a parse tree?
A hierarchical syntactic structure
What is a metalanguage?
A language used to describe other languages
A parser takes a stream of tokens and checks it against the language grammar by building what?
A parse tree
How is the order of evaluation of attributes determined for the trees of a given attribute grammar?
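A parse tree of an attribute grammar is based on its underlying BNF grammar, with a possibly empty set of attribute values attached to each node. The evaluation order follows the attribute dependencies: if all attributes are synthesized, evaluation can proceed bottom-up; if all are inherited, top-down; a mix of both requires an order consistent with the dependencies.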
Why is ambiguous grammar inappropriate for programming languages?
A programmer can write a compiler that will produce executable machine code, and another programmer can write a different compiler based on the same grammar and produce different executable code.
What is associativity?
A semantic rule that specifies which operator is evaluated first when two adjacent operators have the same precedence
What is the difference between a sentence and a sentential form?
A sentence is a sentential form consisting only of terminals (lexemes). A sentential form is any string in a derivation; it need not consist only of terminals.
What is a sentence?
A sentential form consisting of only terminals, or lexemes
What is a derivation?
A sequence of rule applications that generate the sentences/statements of a language
What is a language?
A set of strings of characters from some alphabet
What is a start symbol?
A special nonterminal that indicates the beginning of a grammar
What is a nonterminal symbol?
A symbol that can be expanded or broken down further
What is a terminal symbol?
A symbol that cannot be expanded or broken down further; a character from the set of characters composing the language
What does an alphabet contain?
All the legal characters (digits, letters, and punctuation) in the language
What is the primary use of attribute grammars?
Describe more of the structure of a programming language than can be described with a context-free grammar
What does an unambiguous grammar remove?
Rules that are both left-recursive and right-recursive for the same operator (choose either left recursion or right recursion)
True or false: Programming language syntax can be ambiguous. There can be two meanings for one sentence.
False
True or false: Compilers do not look ahead at tokens coming in the stream to determine which rule to choose next.
False (Some compilers do look ahead.)
If all the attribute values in a parse tree have been computed, a parse tree is said to be what?
Fully attributed
What does static mean?
Happens at compile time
What does dynamic mean?
Happens at runtime
What are attributes?
Information that can be gathered about tokens; similar to variables in that they can have values assigned to them
Who are language descriptions for?
Initial evaluators, implementors, and users
Describe the operation of a general language generator.
It generates the sentences of a language. The particular sentence produced by a generator is unpredictable. A language generator can take a syntax as input and produce all possible legal sentences in the language.
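A generator can be sketched as a random expansion of grammar rules. The grammar below and the `max_depth` cutoff are assumptions made for this example; the cutoff only guarantees that the expansion stops and is not part of the general definition of a generator.

```python
import random

GRAMMAR = {
    "<assign>": [["<id>", "=", "<expr>"]],
    "<id>": [["A"], ["B"], ["C"]],
    "<expr>": [["<id>", "+", "<expr>"], ["<id>", "*", "<expr>"], ["<id>"]],
}


def generate(symbol="<assign>", rng=random, depth=0, max_depth=6):
    """Expand symbol into a list of terminals by picking rules at random."""
    if symbol not in GRAMMAR:  # terminal: nothing left to expand
        return [symbol]
    alternatives = GRAMMAR[symbol]
    # Past max_depth, take the last (non-recursive) alternative so the
    # expansion is guaranteed to terminate.
    rhs = alternatives[-1] if depth >= max_depth else rng.choice(alternatives)
    sentence = []
    for sym in rhs:
        sentence.extend(generate(sym, rng, depth + 1, max_depth))
    return sentence
```

Each call such as `" ".join(generate())` yields one legal sentence, and which sentence comes out depends on the random choices, which matches the observation that the output of a generator is unpredictable.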
What does decorating the parse tree mean?
It is the process of computing the attribute values of the parse tree.
Describe the operation of a general language recognizer.
It reads strings of characters from the language's alphabet. It indicates whether a given input string is or is not in the language. (It accepts or rejects the string.) A language recognizer can take a syntax and a sentence as input and tell you whether the sentence is legal.
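The accept-or-reject behavior can be sketched for a small grammar. The grammar in the docstring and the function name `recognize` are assumptions made for this example; a real recognizer would be derived from the full language grammar.

```python
def recognize(sentence):
    """Return True if the sentence is in the language of the grammar
    <assign> -> <id> = <expr>
    <expr>   -> <id> + <expr> | <id> * <expr> | <id>
    <id>     -> A | B | C
    """
    tokens = sentence.split()

    def is_id(pos):
        return pos < len(tokens) and tokens[pos] in ("A", "B", "C")

    def expr(pos):
        # Returns the position just past a matched <expr>, or None on failure.
        if not is_id(pos):
            return None
        if pos + 1 < len(tokens) and tokens[pos + 1] in ("+", "*"):
            return expr(pos + 2)
        return pos + 1

    if not is_id(0) or len(tokens) < 2 or tokens[1] != "=":
        return False
    # Accept only if <expr> consumes every remaining token.
    return expr(2) == len(tokens)
```

`recognize("A = B + C * A")` returns True, while `recognize("A = B +")` and `recognize("A = B C")` return False because the input either ends too early or has leftover tokens.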
Define a left-recursive grammar rule.
Its left-hand side also appears at the beginning of its right-hand side (specifies left associativity)
Define a right-recursive grammar rule.
Its left-hand side also appears at the end of its right-hand side (specifies right associativity)
Programming language recognition has what two components?
Lexical analysis and parsing
What takes regular expressions (a regular grammar) and a stream of characters as input, recognizes tokens in the character stream, and produces a stream of tokens from it?
Lexical analyzer (uses regular expressions as the basis of a pattern-matching process)
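A regular-expression-driven lexical analyzer can be sketched with Python's standard `re` module. The token names and patterns in `TOKEN_SPEC` are assumptions made for this example; each result pairs a token (the category) with its lexeme (the matched string).

```python
import re

# (token name, regular expression) pairs; order matters for alternation.
TOKEN_SPEC = [
    ("INT_LIT", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("ASSIGN_OP", r"="),
    ("ADD_OP", r"\+"),
    ("MULT_OP", r"\*"),
    ("SEMICOLON", r";"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn a character stream into a list of (token, lexeme) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "SKIP":  # whitespace separates lexemes but is not a token
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        tokens.append((kind, lexeme))
    return tokens
```

For the input `index = 2 * count + 17;`, the lexemes `index` and `count` both fall under the token `IDENT`, while `2` and `17` fall under `INT_LIT`.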
What two things make up the part of a compiler that recognizes if your program is syntactically valid?
Lexical analyzers and parsers
What are static semantics?
Meanings that are determined or enforced at compile time (operator precedence, associativity, operand evaluation order, automatic type conversions, and the rule that a variable must be declared before it is used)
Which two people, in unrelated research efforts, developed the same syntax descriptions for programming language syntax?
Noam Chomsky and John Backus
Every internal node of a parse tree is labeled with what kind of symbol?
Nonterminal symbol
What is a sentential form?
One of the strings in a derivation
Inherited attributes are used to do what?
Pass semantic information down and across a tree (The value of an inherited attribute on a parse tree node depends on the attribute values of that node's parent node and its sibling nodes.)
Synthesized attributes are used to do what?
Pass semantic information up a parse tree (The value of a synthesized attribute on a parse tree node depends only on the values of the attributes on that node's children nodes.)
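The flow of inherited and synthesized attributes can be sketched by decorating a tiny parse tree. The rules `<assign> -> <var> = <expr>` and `<expr> -> <var> + <var>`, the `SYMBOL_TABLE` contents, and the attribute names `actual_type` and `expected_type` are assumptions made for this example.

```python
from dataclasses import dataclass, field

SYMBOL_TABLE = {"A": "int", "B": "real", "C": "int"}  # hypothetical declarations


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)


def make_assign(target, left, right):
    """Build the parse tree for: target = left + right."""
    return Node("<assign>", [Node(target), Node("<expr>", [Node(left), Node(right)])])


def decorate(assign):
    """Compute all attribute values, then evaluate the predicate."""
    target, expr = assign.children
    left, right = expr.children
    # Intrinsic attributes: leaf types are determined outside the parse tree.
    for var in (target, left, right):
        var.attrs["actual_type"] = SYMBOL_TABLE[var.label]
    # Synthesized attribute: computed from the children and passed up.
    both_int = left.attrs["actual_type"] == right.attrs["actual_type"] == "int"
    expr.attrs["actual_type"] = "int" if both_int else "real"
    # Inherited attribute: computed from a sibling and passed across.
    expr.attrs["expected_type"] = target.attrs["actual_type"]
    # Predicate function: the static semantic rule that must hold.
    return expr.attrs["actual_type"] == expr.attrs["expected_type"]
```

`decorate(make_assign("A", "A", "B"))` returns False, because the expression synthesizes the type `real` while the target variable `A` makes the expected type `int`.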
What purpose do predicates serve in an attribute grammar?
Predicate functions state the semantic rules of a language.
In BNF, what is used to represent repetition?
Recursion
BNF describes what classes of grammars?
Regular and context-free
Which two classes of Chomsky's four language generator classes are useful for describing the syntax of programming languages?
Regular and context-free
Lexical analysis uses which grammars?
Regular grammars
What are Noam Chomsky's four classes of language generators?
Regular, context-free, context sensitive, and recursively enumerable
Distinguish between static and dynamic semantics.
Static semantics has to do with the legal forms of programs (syntax rather than semantics). The analysis required to check these specifications is done at compile time. Dynamic semantics is the meaning of expressions, statements, and program units. The analysis required to check these specifications is done at runtime.
What are intrinsic attributes?
Synthesized attributes of leaf nodes whose values are determined outside the parse tree
What does parsing mean?
Taking the tokens and determining whether together they make a legal sentence; parsing is one phase of compiling
Every leaf of a parse tree is labeled with what kind of symbol?
Terminal symbol
BNF is composed of what two types of symbols?
Terminal symbols (terminals: the lexemes and tokens) and nonterminal symbols (nonterminals: the abstractions in a BNF grammar, indicated by angle brackets)
What is syntax?
The form of a programming language's expressions, statements, and program units; the rules for specifying which strings are in a language; rules that tell you if a sentence is not legal; the grammar of a language
What is a rule or production?
The left-hand and right-hand sides in a BNF grammar
What is semantics?
The meaning of a programming language's expressions, statements, and program units; the meaning of sentences
What is dynamic semantics?
The meaning of expressions, statements, and program units
What are identifiers?
The names of variables, methods, and classes
What is a lexeme?
The smallest (lowest-level) syntactic unit; the smallest syntactic unit that has meaning (numeric literals, variables, methods, classes, operators, and special words)
What are sentences?
The strings of a language; also known as statements
What is the left-hand side of a rule or production?
The text on the left side of the arrow; it is the abstraction being defined and is a nonterminal symbol; it is the name of the rule
What is the right-hand side of a rule or production?
The text to the right of the arrow; it is the definition of the left-hand side (consists of a mixture of lexemes and tokens); contains a set of terminals and nonterminals; it is the expansion of the rule
True or false: A program is equivalent to a sentence.
True
True or False: All programming languages have a syntax that must be followed.
True
True or false: If the compiler can build a completed parse tree from the BNF rules, without any leftover tokens, then the sentence you gave it is valid.
True
True or false: Language developers must be able to specify the syntax so that others can understand it.
True
True or false: Many static semantics cannot be specified in BNF rules.
True
True or false: The lexical analyzer gives the compiler the token class and the string associated with it; it can also tell the compiler what line a token is on.
True
When is a rule recursive?
When the rule's left-hand side appears in its right-hand side
What does left-associative mean?
When two operators have the same precedence, you work from left to right.
What does right-associative mean?
When two operators have the same precedence, you work from right to left.
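The two grouping orders can be compared with a short sketch. The helper names `eval_left_assoc` and `eval_right_assoc` are assumptions made for this example; each one applies a single binary operator to a list of operands.

```python
import operator
from functools import reduce


def eval_left_assoc(operands, op):
    """Group as ((a op b) op c): work from left to right."""
    return reduce(op, operands)


def eval_right_assoc(operands, op):
    """Group as (a op (b op c)): work from right to left."""
    return reduce(lambda acc, x: op(x, acc), reversed(operands))


# 10 - 4 - 3: left grouping gives (10 - 4) - 3, right gives 10 - (4 - 3).
left_sub = eval_left_assoc([10, 4, 3], operator.sub)
right_sub = eval_right_assoc([10, 4, 3], operator.sub)
```

Here `left_sub` is 3 and `right_sub` is 9, so the choice of associativity changes the result even though the operands and operators are identical.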