ICS365 - Chapter 3

Statements or Sentences

-The strings of a language
-The syntax rules of a language specify which strings are in the language

Ambiguous grammar

<assign> → <id> = <expr>
<id> → a | b | c
<expr> → <expr> + <expr> | <id>
›For the sentence a = a + b + c this grammar is ambiguous because the associativity of + is not defined: the compiler cannot tell whether a + b or b + c should be computed first
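The ambiguity can be made concrete by counting parse trees. The following is a minimal sketch, not part of the original card: the function name `count_trees` and the list-of-tokens input are assumptions made for illustration. It counts how many distinct parse trees the rule `<expr> → <expr> + <expr> | <id>` allows for a given right-hand side.

```python
from functools import lru_cache

def count_trees(tokens):
    """Count parse trees of <expr> -> <expr> + <expr> | <id> for a token list."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from <expr>.
        if hi - lo == 1:
            return 1 if tokens[lo] in ("a", "b", "c") else 0
        total = 0
        # Try every '+' strictly inside the span as the root operator.
        for i in range(lo + 1, hi - 1):
            if tokens[i] == "+":
                total += count(lo, i) * count(i + 1, hi)
        return total

    return count(0, len(tokens))

print(count_trees(["a", "+", "b", "+", "c"]))  # 2
```

A result greater than 1 means the grammar is ambiguous for that input; here the two trees correspond to (a + b) + c and a + (b + c).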

Left Associativity

<assign> → <id> = <expr>
<id> → a | b | c
<expr> → <expr> + <term> | <term>
<term> → <id>
›This grammar is no longer ambiguous
-There is only one parse tree for each assignment
›The operator + is left associative

BNF Rules

<assign> → <var> = <expression>
›The <assign> abstraction is defined as an instance of the abstraction <var>, followed by the lexeme =, followed by an instance of the abstraction <expression>
›An example sentence whose syntactic structure is described by the rule is Total = subtotal1 + subtotal2
›Nonterminals: the abstractions of BNF rules
›Terminals: the lexemes and tokens of BNF rules
›BNF grammar or description: a finite non-empty set of rules
›Nonterminals can have two or more distinct definitions
<if_stmt> → if (<logic_expr>) <stmt>
<if_stmt> → if (<logic_expr>) <stmt> else <stmt>
›Multiple definitions can be written as a single rule, separated by the symbol |
<if_stmt> → if (<logic_expr>) <stmt> | if (<logic_expr>) <stmt> else <stmt>

Grammars Ambiguity Example 2

<expr> → <expr> <op> <expr> | const
<op> → / | -

BNF Grammars Example

<program> → begin <stmt_list> end
<stmt_list> → <stmt> | <stmt>; <stmt_list>
<stmt> → <var> = <expression>
<var> → a | b | c
<expression> → <var> + <var> | <var> - <var> | <var>
›Some of the statements (sentences) generated by this grammar
-begin a = b + c end
-begin b = a - c; a = b end
-begin b = a; c = a - b; a = a + b end

Language

A set of strings of characters from some alphabet

Left Associativity Example

For the assignment a = a + b + c, the leftmost derivation under the left-associative grammar (<expr> → <expr> + <term> | <term>, <term> → <id>) is:
<assign> => <id> = <expr>
=> a = <expr>
=> a = <expr> + <term>
=> a = <expr> + <term> + <term>
=> a = <term> + <term> + <term>
=> a = <id> + <term> + <term>
=> a = a + <term> + <term>
=> a = a + <id> + <term>
=> a = a + b + <term>
=> a = a + b + <id>
=> a = a + b + c

BNF Derivations (Rightmost Derivations) example2

For the grammar
<assign> → <id> = <expr>
<id> → a | b | c
<expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
To generate the sentence a = b * (a + c), a rightmost derivation of a program in this language follows:
<assign> => <id> = <expr>
=> <id> = <id> * <expr>
=> <id> = <id> * (<expr>)
=> <id> = <id> * (<id> + <expr>)
=> <id> = <id> * (<id> + <id>)
=> <id> = <id> * (<id> + c)
=> <id> = <id> * (a + c)
=> <id> = b * (a + c)
=> a = b * (a + c)

Right Associativity

Exponentiation (**) is right associative and has the highest precedence
<E> → <E> + <T> | <T>
<T> → <T> * <F> | <F>
<F> → <G> ** <F> | <G>
<G> → id
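Because the rule `<F> → <G> ** <F> | <G>` is right recursive, a recursive function that mirrors it produces a right-nested tree directly. The sketch below is an illustration only: the name `parse_power`, the token-list input, and the tuple representation of tree nodes are assumptions, and `<G>` is simplified to any single token.

```python
def parse_power(tokens):
    """Parse <F> -> <G> ** <F> | <G>, treating each <G> as a single token."""

    def factor(pos):
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        left = tokens[pos]  # <G> -> id
        if pos + 1 < len(tokens) and tokens[pos + 1] == "**":
            # The recursion is on the right operand, exactly as in the rule.
            right, nxt = factor(pos + 2)
            return (left, "**", right), nxt
        return left, pos + 1

    tree, end = factor(0)
    if end != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return tree

print(parse_power(["a", "**", "b", "**", "c"]))
# ('a', '**', ('b', '**', 'c'))
```

The nested tuple shows that b ** c is grouped first, which is what right associativity means.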

A lexeme

The lowest level syntactic unit of a program (operator, identifier)

Lexemes and Tokens

The lexemes and tokens of the statement index = 2 * count + 17;

Lexemes | Tokens
index   | identifier
=       | equal_sign
2       | int_literal
*       | mult_op
count   | identifier
+       | plus_op
17      | int_literal
;       | semicolon
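A small lexical analyzer can produce this pairing mechanically. This is a minimal sketch under stated assumptions: the regular expressions in `TOKEN_SPEC` and the function name `tokenize` are illustrative choices, and only the token categories listed on the card are covered.

```python
import re

# Token names follow the card; the patterns themselves are assumptions.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

def tokenize(text):
    """Return (lexeme, token) pairs; whitespace between lexemes is skipped."""
    pairs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = PATTERN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        # lastgroup is the name of the alternative that matched.
        pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs

for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<6} {token}")
```

Each lexeme is the matched text and each token is the category it belongs to, so two different lexemes such as index and count share the token identifier.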

Language Generators

›A device used to generate the sentences of a language
›The particular sentence a generator produces on a given run is unpredictable
›Whether a particular sentence is syntactically correct can be determined by comparing it to the structure of the generator
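A grammar can act as such a generator if each nonterminal is expanded with a randomly chosen rule. The sketch below is illustrative only: it borrows the begin/end grammar from the "BNF Grammars Example" card, and the dictionary layout `GRAMMAR`, the function name `generate`, and the treatment of `;` as its own token are assumptions.

```python
import random

# Each nonterminal maps to its alternative right-hand sides.
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expression>"]],
    "<var>": [["a"], ["b"], ["c"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}

def generate(symbol, rng):
    """Expand symbol into a list of terminals, choosing rules at random."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: emit unchanged
    sentence = []
    for sym in rng.choice(GRAMMAR[symbol]):
        sentence.extend(generate(sym, rng))
    return sentence

rng = random.Random()  # unseeded, so each run may print a different sentence
print(" ".join(generate("<program>", rng)))
```

Every output is a sentence of the language, but which sentence appears cannot be predicted in advance, which is why a generator alone is of limited use for checking a given program.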

Grammars Ambiguity

›A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees
›The following grammar
<assign> → <id> = <expr>
<id> → a | b | c
<expr> → <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>
is ambiguous because the sentence a = b + c * a has two distinct parse trees

Parse Trees

›A hierarchical representation of a derivation
›Every internal node is labeled with a nonterminal
›Every leaf is labeled with a terminal
›For the grammar
<assign> → <id> = <expr>
<id> → a | b | c
<expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
the parse tree of the statement a = b + (a + c) is (children indented under their parent):
<assign>
  <id>
    a
  =
  <expr>
    <id>
      b
    +
    <expr>
      (
      <expr>
        <id>
          a
        +
        <expr>
          <id>
            c
      )

EBNF Multiple Choice ()

›Alternative parts of RHSs are placed inside parentheses and separated by vertical bars
BNF
<term> → <term> * <factor> | <term> / <factor> | <term> % <factor>
EBNF
<term> → <term> (* | / | %) <factor>

Attribute Grammars

›An attribute grammar is a device used to describe more of the structure of a programming language than can be described with a context-free grammar
›It is an extension to a context-free grammar
›It conveniently describes certain language rules, such as type compatibility

EBNF Example

›BNF
<expr> → <expr> + <term> | <expr> - <term> | <term>
<term> → <term> * <factor> | <term> / <factor> | <factor>
<factor> → <exp> ** <factor> | <exp>
<exp> → (<expr>) | id
›EBNF
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
<factor> → <exp> { ** <exp>}
<exp> → (<expr>) | id
›In BNF the rule <expr> → <expr> + <term> forces + to be left associative
›In EBNF the rule <expr> → <term> {+ <term>} does not imply the direction of associativity
›A syntax analyzer based on an EBNF grammar overcomes this problem by enforcing the intended associativity in its own design
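One common way for a syntax analyzer to enforce associativity is to turn each EBNF repetition into a loop that folds the result to the left. The following is a minimal sketch, assuming a token list as input and a simplified `<factor>` that is always a single identifier (no `**` and no parentheses); the name `parse_expr` and the tuple tree nodes are illustrative.

```python
def parse_expr(tokens):
    """Parse <expr> -> <term> {(+ | -) <term>}, <term> -> <factor> {(* | /) <factor>}."""
    pos = 0

    def factor():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok  # simplification: every factor is one identifier

    def term():
        nonlocal pos
        node = factor()
        # The braces { ... } of the EBNF rule become this loop.
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            node = (node, op, factor())  # fold left: earlier operators nest deeper
        return node

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            node = (node, op, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse_expr("a - b - c".split()))  # (('a', '-', 'b'), '-', 'c')
print(parse_expr("a + b * c".split()))  # ('a', '+', ('b', '*', 'c'))
```

The left fold in each loop is a design decision of the analyzer, not something the EBNF rule itself dictates; folding to the right instead would make the same rule right associative.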

BNF Fundamentals

›BNF is a metalanguage
-A language that is used to describe another language
›Uses abstractions for syntactic structures
›Example: a simple Java assignment statement
-might be represented by the abstraction <assign>
<assign> → <var> = <expression>
-Left-hand side (LHS): the text on the left side of the arrow, the abstraction being defined
-Right-hand side (RHS): the text on the right side of the arrow, the definition of the LHS
-Rule or production: the whole definition, here the definition of the assignment statement
›<var> and <expression> must also be defined to make the <assign> definition useful

Backus-Naur Form and Context Free Grammar

›BNF is a natural notation for describing syntax
›BNF is still the most popular method of concisely describing programming language syntax
›BNF is nearly identical to Chomsky's generative devices for context-free languages
›Grammars written in BNF are therefore commonly called context-free grammars

BNF Derivations

›A derivation is a sequence of applications of the grammar's rules, starting from the start symbol and ending with a sentence in the language
›Each successive string is derived from the previous string by
-replacing one of the nonterminals with one of that nonterminal's definitions
›Each of the strings in the derivation is called a
-sentential form
›In a leftmost derivation the replaced nonterminal is always the leftmost nonterminal of the previous sentential form
›The derivation continues until the sentential form contains no nonterminals
›The sentential form consisting of only terminals, or lexemes, is the generated sentence

BNF Grammars

›A generative device for defining languages
›The sentences are generated through a sequence of applications of the rules
-The sequence begins with a special nonterminal of the grammar called the start symbol
-This sequence is called a derivation
-In a grammar for a complete programming language, the start symbol represents a complete program and is often named <program>

Backus-Naur Form BNF

›In 1959 John Backus introduced a new formal notation for specifying programming language syntax, which he used to describe ALGOL 58
›In 1960 Backus's notation was slightly modified by Peter Naur for the description of ALGOL 60
-This revised method of syntax description is known as Backus-Naur Form, or BNF

BNF Derivations (Leftmost Derivations) example1

For the grammar
<program> → begin <stmt_list> end
<stmt_list> → <stmt> | <stmt>; <stmt_list>
<stmt> → <var> = <expression>
<var> → a | b | c
<expression> → <var> + <var> | <var> - <var> | <var>
To generate the sentence begin a = b + c; b = c end, a leftmost derivation of a program in this language follows:
<program> => begin <stmt_list> end
=> begin <stmt>; <stmt_list> end
=> begin <var> = <expression>; <stmt_list> end
=> begin a = <expression>; <stmt_list> end
=> begin a = <var> + <var>; <stmt_list> end
=> begin a = b + <var>; <stmt_list> end
=> begin a = b + c; <stmt_list> end
=> begin a = b + c; <stmt> end
=> begin a = b + c; <var> = <expression> end
=> begin a = b + c; b = <expression> end
=> begin a = b + c; b = <var> end
=> begin a = b + c; b = c end

Static Semantic Example

In the Ada programming language, the name at the end of a procedure must match the procedure's name
Syntax rule: <proc_def> → procedure <proc_name>[1] <proc_body> end <proc_name>[2];
Predicate: <proc_name>[1].string = <proc_name>[2].string
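The predicate can be checked once the syntax rule has matched. The sketch below is only an illustration of that two-stage idea: the function name `check_proc_def` and the pre-split token list are assumptions, the procedure body is not parsed at all, and names are compared as plain strings exactly as the predicate is written.

```python
def check_proc_def(tokens):
    """Match 'procedure <name> <body> end <name> ;' and evaluate the predicate."""
    well_formed = (
        len(tokens) >= 5
        and tokens[0] == "procedure"
        and tokens[-3] == "end"
        and tokens[-1] == ";"
    )
    if not well_formed:
        # The syntax rule itself failed; the predicate is never reached.
        raise SyntaxError("expected: procedure <proc_name> <proc_body> end <proc_name> ;")
    name_1 = tokens[1]   # <proc_name>[1].string
    name_2 = tokens[-2]  # <proc_name>[2].string
    return name_1 == name_2

good = "procedure Swap is begin null ; end Swap ;".split()
bad = "procedure Swap is begin null ; end Sort ;".split()
print(check_proc_def(good))  # True
print(check_proc_def(bad))   # False
```

Both inputs satisfy the syntax rule; only the predicate tells them apart, which is the kind of rule BNF alone cannot express.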

A token

The category of the lexemes

Static Semantic

›Some characteristics of programming languages are difficult or impossible to describe in BNF
-Example in Java: a floating-point value cannot be assigned to an integer type variable, but the opposite is legal
›This can be specified in BNF but is difficult: it requires additional nonterminal symbols and rules
-The common rule that all variables must be declared before they are referenced
›This rule cannot be specified in BNF
›Static semantics is only indirectly related to the meaning of a program during execution
-It concerns the legal forms of programs (syntax rather than semantics)
›So named because the analysis required to check these specifications can be done at compile time
›An attribute grammar is one formal approach to describing and checking the correctness of the static semantics rules of a program
›Attribute grammars are context-free grammars with added
1.Attributes: associated with grammar symbols
2.Attribute computation functions (semantic functions): associated with grammar rules, used to specify how to compute attribute values
3.Predicate functions: associated with grammar rules, they state the static semantic rules of the language

Unambiguous Grammar for if-else

›Java if-then-else grammar
<stmt> → <if_stmt>
<if_stmt> → if (<logic_expr>) <stmt> | if (<logic_expr>) <stmt> else <stmt>
-Ambiguous
›An unambiguous grammar for if-then-else
<stmt> → <matched> | <unmatched>
<matched> → if (<logic_expr>) <matched> else <matched> | any non-if statement
<unmatched> → if (<logic_expr>) <stmt> | if (<logic_expr>) <matched> else <unmatched>

Context Free Grammars

›Mid-1950s
›Noam Chomsky, a noted linguist
-Described four generative devices (grammars) that define four classes of languages
-Two of these grammar classes are used for describing the syntax of programming languages
-Context-free grammars and regular grammars
›The forms of tokens can be described by regular grammars
›The syntax of a whole programming language can be described using a context-free grammar

Backus-Naur Form and Context Free Grammars

›Middle to late 1950s
›Noam Chomsky (linguist)
›John Backus (computer scientist)
›Unrelated research efforts
›Developed the same syntax description formalism
-Used widely to describe programming language syntax

Language Recognizers

›One distinct way to formally define a language
›Suppose
-L is a language
-L uses an alphabet ∑ of characters
›To define L formally using the recognition method, we construct a mechanism R, called a recognition device
-R can read strings of characters from the alphabet ∑
-R indicates whether a given input string is in L or not
›The recognizer R either accepts or rejects the given string
-Works like a filter
-Separates legal sentences from incorrectly formed ones
›If R accepts a string of characters over ∑ if and only if the string is in the language L
-Then R is a description of L
›The syntax analysis part of a compiler is a recognizer for the language that the compiler translates
-Determines whether a given program is in the language or not
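As a concrete instance of a recognition device R, the sketch below accepts or rejects token lists for the small begin/end language from the "BNF Grammars Example" card. It is a hand-written check rather than a general parser; the names `recognize` and `is_stmt` and the token-list input are assumptions for illustration.

```python
VARS = {"a", "b", "c"}

def is_stmt(tokens):
    """Accept <var> = <var>, <var> = <var> + <var>, or <var> = <var> - <var>."""
    if len(tokens) not in (3, 5) or tokens[0] not in VARS or tokens[1] != "=":
        return False
    if tokens[2] not in VARS:
        return False
    if len(tokens) == 5:
        return tokens[3] in ("+", "-") and tokens[4] in VARS
    return True

def recognize(tokens):
    """Return True if and only if tokens form a sentence of the begin/end language."""
    if len(tokens) < 2 or tokens[0] != "begin" or tokens[-1] != "end":
        return False
    # Split the body into statements at each ';'.
    statements, current = [], []
    for tok in tokens[1:-1]:
        if tok == ";":
            statements.append(current)
            current = []
        else:
            current.append(tok)
    statements.append(current)
    # An empty piece (empty body or trailing ';') fails is_stmt and is rejected.
    return all(is_stmt(stmt) for stmt in statements)

print(recognize("begin a = b + c ; b = c end".split()))  # True
print(recognize("begin a = b + end".split()))            # False
```

The function only answers yes or no, which is all a recognizer has to do; a compiler's syntax analyzer additionally builds a parse tree and reports where the error is.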

EBNF Optional []

›Optional parts are placed in brackets [ ]
BNF
<if_stmt> → if (<logic_expr>) <stmt> | if (<logic_expr>) <stmt> else <stmt>
EBNF
<if_stmt> → if (<logic_expr>) <stmt> [else <stmt>]

EBNF Repetition {}

›Repetitions (0 or more) are placed inside braces { }
BNF
<ident_list> → identifier | identifier, <ident_list>
EBNF
<ident_list> → identifier {, identifier}
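The braces translate naturally into a loop, where the recursive BNF rule would translate into a recursive call. A minimal sketch, assuming the input is already split into tokens and that Python's `str.isidentifier` is an acceptable stand-in for the token identifier; the function name `is_ident_list` is hypothetical.

```python
def is_ident_list(tokens):
    """Accept identifier {, identifier}: one identifier, then zero or more ', identifier'."""
    if not tokens or not tokens[0].isidentifier():
        return False
    pos = 1
    # Each pass through the loop consumes one repetition of ', identifier'.
    while pos < len(tokens):
        if tokens[pos] != ",":
            return False
        if pos + 1 >= len(tokens) or not tokens[pos + 1].isidentifier():
            return False
        pos += 2
    return True

print(is_ident_list(["x"]))                      # True
print(is_ident_list(["x", ",", "y", ",", "z"]))  # True
print(is_ident_list(["x", ","]))                 # False
```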

BNF Describing lists

›Syntactic lists are described using recursion
<ident_list> → identifier | identifier, <ident_list>
-This rule defines <ident_list> as either a single identifier token or
-an identifier followed by a comma and another instance of <ident_list>
›A rule is recursive if its LHS appears in its RHS

Extended BNF (EBNF)

›The extensions do not enhance the descriptive power of BNF
›They only increase its readability and writability
›Three extensions are commonly included in the various versions of EBNF
-Optional part [ ]
-Repetition { }
-Multiple choice ( )

Operator Precedence

›When an expression includes two different operators, as in x + y * z
›The order in which the two operators are evaluated is a semantic issue
›This semantic question can be answered by
-Assigning different precedence levels to operators
›An operator lower in the parse tree has higher precedence, because it is evaluated first
Unambiguous grammar:
<assign> → <id> = <expr>
<id> → a | b | c
<expr> → <expr> + <term> | <term>
<term> → <term> * <factor> | <factor>
<factor> → ( <expr> ) | <id>
Now for the statement a = b + c * a there is only one parse tree (children indented under their parent):
<assign>
  <id>
    a
  =
  <expr>
    <expr>
      <term>
        <factor>
          <id>
            b
    +
    <term>
      <term>
        <factor>
          <id>
            c
      *
      <factor>
        <id>
          a

Associativity of Operators

›When an expression includes two operators of the same precedence, as in a = b + c + a
›A semantic rule is required to specify which operator is evaluated first
Two types of associativity:
›Left associativity: the leftmost operator is evaluated first
-In BNF the LHS appears at the left end of its RHS (left recursion)
<expression> → <expression> + <id> | <id>
›Right associativity: the rightmost operator is evaluated first
-In BNF the LHS appears at the right end of its RHS (right recursion)
<expression> → <id> + <expression> | <id>
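The choice of associativity changes the result whenever the operator is not mathematically associative. The sketch below swaps in subtraction for the card's + to make the difference visible on integers (an assumption made purely for illustration); the function names `eval_left` and `eval_right` are hypothetical.

```python
from functools import reduce

def eval_left(nums):
    """Evaluate n0 - n1 - n2 ... grouping from the left: ((n0 - n1) - n2) ..."""
    return reduce(lambda acc, x: acc - x, nums)

def eval_right(nums):
    """Evaluate n0 - n1 - n2 ... grouping from the right: n0 - (n1 - (n2 ...))."""
    if len(nums) == 1:
        return nums[0]
    return nums[0] - eval_right(nums[1:])

print(eval_left([10, 4, 3]))   # 3, computed as (10 - 4) - 3
print(eval_right([10, 4, 3]))  # 9, computed as 10 - (4 - 3)
```

The left fold mirrors the left-recursive rule and the recursive call on the tail mirrors the right-recursive rule.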

