CS 424 Test 1


Understand basic BNF and EBNF notation. (I will give you the meaning of EBNF meta-symbols)
- Be able to interpret simple grammars and say whether certain strings are part of the language described by the grammar
- Be able to write a simple grammar given a description
- Be able to do a leftmost or rightmost derivation of a string given a grammar definition
- Be able to draw a parse tree of a string given a grammar definition
I WILL ADD THE HW QUESTIONS HERE

-

When given a DFSA, be able to tell whether or not it recognizes a given string. Be able to show the moves it makes.

-
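As a way to practice tracing, here is a minimal sketch of a table-driven DFSA simulator. The machine itself is an assumption made up for illustration (it is not from the course slides): it accepts unsigned reals such as 10.53 or .36, and the state names `start`, `int`, `dot`, and `frac` are hypothetical. `run_dfsa` returns whether the string is accepted along with the list of moves taken.

```python
# Hypothetical DFSA for unsigned real numbers: digits* '.' digits+
# State names and the transition table are illustrative assumptions.
TRANSITIONS = {
    ("start", "digit"): "int",
    ("start", "dot"): "dot",
    ("int", "digit"): "int",
    ("int", "dot"): "dot",
    ("dot", "digit"): "frac",
    ("frac", "digit"): "frac",
}
ACCEPTING = {"frac"}


def char_class(ch):
    """Map an input character to the symbol class used in the table."""
    if ch in "0123456789":
        return "digit"
    if ch == ".":
        return "dot"
    return "other"


def run_dfsa(text):
    """Return (accepted, moves); each move is (state, char, next_state)."""
    state = "start"
    moves = []
    for ch in text:
        nxt = TRANSITIONS.get((state, char_class(ch)))
        if nxt is None:
            # No transition defined: the machine is stuck and rejects.
            return False, moves
        moves.append((state, ch, nxt))
        state = nxt
    return state in ACCEPTING, moves
```

Calling `run_dfsa(".3")` lists two moves, `start` to `dot` on "." and `dot` to `frac` on "3", and accepts because `frac` is an accepting state.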

What are lexical and syntax analyzers? What are some reasons to separate lexical and syntax analysis? How do they work together? NOT FINISHED

- A low-level part called a lexical analyzer (mathematically, a finite automaton based on a regular grammar)
  • A lexical analyzer is a pattern matcher for character strings
  • A lexical analyzer is a "front-end" for the parser
  • Identifies substrings of the source program that belong together - lexemes
- A high-level part called a syntax analyzer, or parser (mathematically, a push-down automaton based on a context-free grammar, or BNF)
Reasons to Separate Lexical and Syntax Analysis:
  • Simplicity - less complex approaches can be used for lexical analysis; separating them simplifies the parser
  • Efficiency - separation allows optimization of the lexical analyzer
  • Portability - parts of the lexical analyzer may not be portable, but the parser is always portable
How they work together:
  • The lexical analysis and syntax analysis phases are typically interleaved
  • The lexical analyzer is usually a function that is called by the parser when it needs the next token

What is a lexeme and a token? What are the various token categories that are generally found in programming languages? What other things need to be recognized by a lexer? Be able to show how to split a statement into lexemes

- Identifies substrings of the source program that belong together - lexemes
- Lexemes match a character pattern, which is associated with a lexical category called a token
Token Categories:
▪ Identifiers
▪ Literals
▪ Keywords
▪ Operators
▪ Punctuation
Other things the lexer needs to recognize:
▪ Whitespace (e.g., space or tab)
▪ Comments (e.g., // to end-of-line)
▪ End-of-line
▪ End-of-file
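To make the lexeme/token split concrete, here is a small sketch of a regex-based lexer. The token patterns, the `KEYWORDS` set, and the function name `tokenize` are assumptions chosen for illustration, not a definition from the course; a real language would have many more patterns.

```python
import re

# Illustrative token patterns; order matters (COMMENT before OPERATOR so
# that "//" is not read as two division operators).
TOKEN_SPEC = [
    ("WHITESPACE", r"[ \t]+"),
    ("COMMENT", r"//[^\n]*"),
    ("LITERAL", r"[0-9]+"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OPERATOR", r"[+\-*/=]"),
    ("PUNCTUATION", r"[;(),{}]"),
]
KEYWORDS = {"if", "else", "while", "int"}  # hypothetical keyword set
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Split source into (token category, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind in ("WHITESPACE", "COMMENT"):
            continue  # recognized by the lexer but not passed to the parser
        if kind == "IDENTIFIER" and lexeme in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, lexeme))
    return tokens
```

For the statement `int sum = a + 42; // total`, this yields the lexemes `int`, `sum`, `=`, `a`, `+`, `42`, and `;`, each paired with its token category, while the whitespace and comment are recognized and dropped.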

What is the difference between operator precedence and associativity? Be able to determine the precedence and associativity of operators in a BNF grammar by looking at it.

- Precedence: which operator is evaluated first, for example, in the expression "a + b / c"
- Associativity: evaluation order for adjacent operators that have equal precedence, for example, in the expression "a - b + c"
Slide 50 Ch3 p1

What do top-down and bottom-up parsers do?

- Top down - produce the parse tree, beginning at the root
  • Order is that of a leftmost derivation
  • Traces or builds the parse tree in preorder
- Bottom up - produce the parse tree, beginning at the leaves
  • Order is that of the reverse of a rightmost derivation

What is the von Neumann bottleneck?

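- When the connection speed between a computer's memory and its processor is slower than the speed at which the processor can execute instructions, so the connection limits the speed of the computer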

What is the definition of a recognizer?

A device that reads input strings over the alphabet of the language and decides whether the input strings belong to the language

What is the definition of metalanguage

A metalanguage is a language used to define other languages. - A grammar is a metalanguage used to define the syntax of a language (it's a language generator)

What does it mean for a problem to be decidable? How does that relate to why context-sensitive and unrestricted grammars are not appropriate for developing translators?

A problem is decidable if there is an algorithm that is guaranteed to solve it in a finite number of steps (guaranteed to halt) for every input. A translator has to decide, for any input, whether it is a valid sentence of the language. For unrestricted grammars that membership question is undecidable in general, and for context-sensitive grammars it is decidable but far too expensive to be practical. Regular and context-free grammars have efficient recognizers, so translators are built on them. Where a context-free grammar is ambiguous, the language definition can explain in ordinary language how the ambiguity is handled.

Which operator has the lowest precedence in the following grammar? Start symbol is <Expr>
<Expr> → <Expr> + <Term> | <Term>
<Term> → <Factor> * <Term> | <Factor>
<Factor> → <Primary> - <Factor> | <Primary>
<Primary> → <Primary> / <Digit> | <Digit>
<Digit> → 0 | 1 | ... | 9
A. + B. * C. - D. /

A. +

What is a disadvantage of LL parsers? A. Cannot handle left recursion B. Cannot handle right recursion C. Cannot handle BNF extensions D. None of the above

A. Cannot handle left recursion

Which stage of compilation takes source code as input and produces lexemes and tokens as output? A. Lexical analysis B. Syntax analysis C. Semantic analysis D. Code optimization E. Code generation

A. Lexical analysis

What is used first to determine which operators are evaluated first in an expression such as a + b - d / c? A. Precedence B. Associativity

A. Precedence

Which grammar is typically used to define the lexical syntax of a language? A. Regular grammar B. Context-free grammar C. Context-sensitive grammar D. Unrestricted grammar

A. Regular grammar

What is abstract syntax? In which phase is each type of syntax used?

Abstract Syntax: • An internal representation of the program, which favors content over form - Derived during parsing (syntax analysis) • Abstract syntax is used during the semantic analysis phase

Advantages and disadvantages of Hybrid Implementation Systems?

Advantages: Faster than pure interpretation; a high-level language program is translated to an intermediate language that allows easy interpretation. Disadvantages: Slower execution than compiled code, slower translation than pure interpretation. It is a compromise, so it gives up part of each approach's benefits to gain part of the other's.

Advantages and disadvantages of Pure Interpretation?

Advantages: No translation up front, Easier implementation of programs (run-time errors can easily and immediately be displayed) Disadvantages: Often requires more space, Slower execution (10 to 100 times slower than compiled programs)

Advantages and disadvantages of Compilation?

Advantages: fast execution Disadvantages: System dependent, Slow translation, Most user programs also need additional code to execute

What are the advantages of an abstract syntax tree representation over a parse tree?

A parse tree is used to check the syntactic correctness of the source code. An AST is used to analyze the source code and prepare it for the later phases of compilation.
AST Purpose: an intermediate form of the source code, represented by a simpler grammar than the concrete syntax.
• It does not have to follow the full BNF grammar, and can leave off a lot of the 'syntactic sugar'
• The abstract syntax grammar can be ambiguous, because the tree itself already records how the operands are grouped
• An AST is smaller than a parse tree, as it contains fewer symbols
• An AST is easier to comprehend than a parse tree, as it keeps only the information that matters for meaning
• An AST is more efficient to work with than a parse tree, as later phases have fewer nodes to visit

Which of the following is typically used in the syntax analysis phase of compilation? A. Lexical syntax B. Concrete syntax C. Abstract syntax D. None of the above

B. Concrete syntax

Ambiguity should never occur within a programming language's grammar. A. True B. False

B. False

What is a language used to define other languages called? A. Syntax B. Metalanguage C. Metadata D. Metacharacter

B. Metalanguage

What does the R in LR parser stand for? A. Does right-to-left scan of input B. Produces rightmost derivation C. Requires right recursive grammar D. None of the above

B. Produces rightmost derivation

Is the + operator in the following grammar left or right associative?
<Expr> → <Expr> - <Term> | <Term>
<Term> → <Digit> + <Term> | <Digit>
<Digit> → 0 | 1 | ... | 9
A. Left B. Right C. Neither (non-associative)

B. Right

a. Define a context free grammar (using either BNF or EBNF) for a set of identifiers. The identifiers must start with either a letter, or an underscore _ followed by one letter The identifiers can only consist of _, lowercase letters, and numerical digits. Make sure your grammar includes all of the 4 parts of a valid grammar.

BNF
Identifier -> Letter IdStr | _Letter IdStr | Letter | _Letter
IdStr -> ValidChar IdStr | ValidChar
ValidChar -> _ | 0 | . . . | 9 | Letter
Letter -> a | . . . | z
EBNF
IdStr -> (Letter | _Letter) {ValidChar}
Letter -> a | . . . | z
ValidChar -> _ | 0 | . . . | 9 | Letter

Which of the following is an internal representation of a program, favoring content over form? A. Lexical syntax B. Concrete syntax C. Abstract syntax D. None of the above

C. Abstract syntax

Which of these is the least powerful grammar in the Chomsky hierarchy? A. Context-sensitive grammar B. Context-free grammar C. Regular grammar D. Unrestricted grammar

C. Regular grammar

Which regex matches floating point numbers with an optional sign? (? is 0 or 1; + is 1 or more; * is 0 or more) A. [+-][0-9]*\.[0-9]+ B. [+-]?[0-9]*\.[0-9]* C. [+-]?[0-9]+\.[0-9]+ D. [+-][0-9]+\.[0-9]+

C. [+-]?[0-9]+\.[0-9]+
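A quick way to check the four options is to run them against a few test strings. This is a sketch using Python's `re.fullmatch`, on the assumption that a match must cover the whole string; the `OPTIONS` table and the helper name `accepts` are made up for illustration.

```python
import re

# The four answer choices from the question, keyed by letter.
OPTIONS = {
    "A": r"[+-][0-9]*\.[0-9]+",
    "B": r"[+-]?[0-9]*\.[0-9]*",
    "C": r"[+-]?[0-9]+\.[0-9]+",
    "D": r"[+-][0-9]+\.[0-9]+",
}


def accepts(option, text):
    """True if the chosen pattern matches the entire string."""
    return re.fullmatch(OPTIONS[option], text) is not None


for text in ["3.14", "-0.5", "+2.0", ".", "5."]:
    results = {letter: accepts(letter, text) for letter in OPTIONS}
    print(text, results)
```

Options A and D reject `3.14` because they require a sign, and option B accepts a lone `.`, which leaves C as the pattern that makes the sign optional while still requiring digits on both sides of the decimal point.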

What are the 3 implementation methods?

Compilation:
• Programs are translated into machine language
• Use: Large commercial applications
Pure Interpretation:
• Programs are interpreted by another program known as an interpreter
• Use: Small programs or when efficiency is not an issue
Hybrid Implementation Systems:
• A compromise between compilers and pure interpreters; includes JIT systems
• Use: Small and medium systems when efficiency is not the first concern

What is concrete syntax? In which phase is each type of syntax used?

Concrete Syntax: •The representation of a language's programs using lexical symbols as its alphabet - i.e., how the symbols are put together to write statements, expressions, and programs. • Concrete syntax is used during the syntax analysis (parsing) phase

Which type of grammar is typically used for syntax analysis

Context-free grammar (BNF)

b. Derive the following identifier, using your grammar, with a leftmost derivation: _foo_

EBNF
IdStr -> _Letter ValidChar ValidChar ValidChar
-> _f ValidChar ValidChar ValidChar
-> _f Letter ValidChar ValidChar
-> _fo ValidChar ValidChar
-> _fo Letter ValidChar
-> _foo ValidChar
-> _foo_

AFTER THIS IS AN OLD TEST, NOT ALL OF THIS WILL BE ON THE TEST BECAUSE DIFF THINGS WERE COVERED


What are the 4 main language categories? What is a feature of each?

Imperative and Object-Oriented - Functional - Logic - Markup/Programming Hybrid

How does a recursive descent parser work?

It is a kind of Top-Down Parser. A top-down parser builds the parse tree from the top to down, starting with the start non-terminal. The basic idea of recursive-descent parsing is to associate each non-terminal with a procedure. The goal of each such procedure is to read a sequence of input characters that can be generated by the corresponding non-terminal, and return a pointer to the root of the parse tree for the non-terminal. The structure of the procedure is dictated by the productions for the corresponding non-terminal. The procedure attempts to "match" the right hand side of some production for a non-terminal. To match a terminal symbol, the procedure compares the terminal symbol to the input; if they agree, then the procedure is successful, and it consumes the terminal symbol in the input (that is, moves the input cursor over one symbol). To match a non-terminal symbol, the procedure simply calls the corresponding procedure for that non-terminal symbol (which may be a recursive call, hence the name of the technique)
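The description above can be sketched as a small recursive descent parser with one procedure per non-terminal. The grammar here is an assumption chosen for illustration (Expr -> Term { + Term }, Term -> Factor { * Factor }, Factor -> digit | ( Expr )), written in EBNF so there is no left recursion. For brevity the procedures return nested tuples that look more like an abstract syntax tree than a full parse tree.

```python
class Parser:
    """One method per non-terminal of the illustrative grammar."""

    def __init__(self, text):
        self.text = text.replace(" ", "")
        self.pos = 0  # the input cursor

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def match(self, expected):
        # Compare a terminal with the input; consume it if they agree.
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {self.pos}")
        self.pos += 1

    def expr(self):
        # Expr -> Term { + Term }
        node = self.term()
        while self.peek() == "+":
            self.match("+")
            node = ("+", node, self.term())
        return node

    def term(self):
        # Term -> Factor { * Factor }
        node = self.factor()
        while self.peek() == "*":
            self.match("*")
            node = ("*", node, self.factor())
        return node

    def factor(self):
        # Factor -> digit | ( Expr )
        ch = self.peek()
        if ch is not None and ch in "0123456789":
            self.match(ch)
            return int(ch)
        if ch == "(":
            self.match("(")
            node = self.expr()  # recursive call back into expr
            self.match(")")
            return node
        raise SyntaxError(f"unexpected {ch!r} at position {self.pos}")


def parse(text):
    parser = Parser(text)
    tree = parser.expr()  # begin with the start non-terminal
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r} at position {parser.pos}")
    return tree
```

Because `term` is called from inside `expr`, `*` binds tighter than `+`, so `parse("2 + 3 * 4")` groups the multiplication first. The loops in `expr` and `term` build left-leaning trees, which makes both operators left associative in this sketch.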

What is JIT?

Just-in-Time Compilation Systems
• Initially translate programs to an intermediate language
• Then compile the intermediate language of the subprograms into machine code when they are called
• Machine code version is kept for subsequent calls
• JIT systems are widely used for Java programs

What is the difference between LL and LR parsers? Know advantages and disadvantages of each

LL is a top-down parser
- Recursive descent - a coded implementation
- A table driven implementation
- Less complex and can be hand written
- In an LL parse, it's usually pretty easy to emit useful compiler errors
- In LL parsing, error recovery is a lot easier
LR is a bottom-up parser
- LR parsing can handle a larger range of languages than LL parsing, and is also better at error reporting
- LR parsers detect errors fast
- Drawback: it is too much work to construct an LR parser by hand

How does ambiguity in a grammar cause problems? How do different languages solve that problem?

Ambiguity can result in different compilers resolving the ambiguity differently, which causes programs to work differently on different systems. Languages will avoid ambiguity by restructuring the language to avoid the ambiguity. The grammar can be ambiguous as long as the language isn't.
Dangling else ambiguity:
• Algol 68, Modula, Ada: use an explicit delimiter to end every conditional (e.g., if...fi)
• Java: rewrite the grammar to limit what can appear in a conditional, using 2 different kinds of if statements
• Algol 60, C, C++, Pascal: provide an informal English-language description of how the attachment should be made

What is lexical syntax? In which phase is each type of syntax used?

Lexical Syntax: • Describes the rules for basic language symbols (names, operators, literals, punctuation, etc.) • Lexical syntax is used during the lexical analysis phase

How does the imperative paradigm mirror the von Neumann architecture?

NOT IN SLIDES

a. What is a binding? b. Give an example of a binding. c. For your example, identify a typical time in the life of a program the binding may occur. If the binding time differs for different programming languages, state that it depends on the programming language, and discuss examples of times at which it can be bound.

NOT IN SLIDES

List the 4 languages in the Chomsky Hierarchy. Which 2 are typically used in programming language design?

Regular grammar (used in lang design), Context-free grammar (used in lang design), Context sensitive grammar, Unrestricted Grammar

Which type of grammar is typically used for lexical analysis

Regular grammar - least powerful

What are the steps in the compilation process? What each phase does, including the input and output of each phase

The source program passes down through these phases:
• Lexical analysis: converts characters in the source program into lexical units. Input is the source program, output is lexical units.
• Syntax analysis: transforms lexical units into parse trees, which represent the syntactic structure of the program. Input is lexical units, output is parse trees.
• Semantic analysis: generates intermediate code. Input is parse trees, output is intermediate code.
• Optimization (optional): gets its input from semantic analysis and outputs to the intermediate code.
• Code generation: machine code is generated. Input is intermediate code, output is machine language.
That machine language goes into a computer along with input data, which outputs the results.
There is also a symbol table that gets input from lexical analysis and syntax analysis and outputs to semantic analysis and code generation.

a. Write a regular expression that recognizes real (floating point) numbers.
The regular expression should recognize numbers
- with digits before and after the decimal point (like 10.53)
- with digits after the decimal point but not before (like .36)
The grammar should not accept as valid
- a decimal point by itself (like .)
- a number with no digits after the decimal point (like 9.)

[0-9]*\.[0-9]+
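The four cases in the question can be checked directly. This is a small sketch using Python's `re` module; the name `is_real` is made up for illustration, and `fullmatch` is used on the assumption that the whole string must be a real number.

```python
import re

# digits (zero or more), a literal decimal point, digits (one or more)
REAL = re.compile(r"[0-9]*\.[0-9]+")


def is_real(text):
    """True if the entire string matches the real-number pattern."""
    return REAL.fullmatch(text) is not None


for text in ["10.53", ".36", ".", "9."]:
    print(repr(text), is_real(text))
```

The `*` before the decimal point is what allows `.36`, and the `+` after it is what rejects both `.` and `9.`.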

An advantage of a(n)_____________ language is increased performance /efficiency. a. Compiled b. Interpreted c. None of the above

a. Compiled

___________ refers to when an expression in an assignment operation is evaluated to a value, which is copied to the target of the assignment. a. Copy semantics b. Reference semantics c. None of the above

a. Copy semantics

____________ allows the programmer to be concerned mainly with the interface between a function and what it computes, ignoring the details of how the computation is accomplished a. Procedural abstraction b. Data abstraction c. Object decomposition d. None of the above

a. Procedural abstraction

______________ is the late binding of a call to one of several implementations of a method in an inheritance hierarchy. a. Runtime polymorphism b. Parametric polymorphism c. None of the above

a. Runtime polymorphism

A design technique for imperative programming is _____________ a. Stepwise refinement b. Object decomposition c. None of the above

a. Stepwise refinement

______________ is the representation of a language's programs using lexical symbols as its alphabet. a. Lexical syntax b. Concrete syntax c. Abstract syntax

b. Concrete syntax

____________ is a mechanism that allows logically related constants, types, variables, methods, and so on, to be grouped into a single unit. a. Inheritance b. Encapsulation c. Information hiding d. Data abstraction

b. Encapsulation

An advantage of _________ is increased flexibility for the programmer. a. Early binding b. Late binding c. None of the above NOT ON THIS TEST

b. Late binding

During ___________, each source program instruction is decoded and performed one at a time when the program is executed. a. Compilation b. Pure Interpretation c. Hybrid Interpretation

b. Pure Interpretation

The phase of compilation that takes a set of tokens as input and produces a parse tree or abstract syntax tree as output is____________ . a. Lexical analysis b. Syntactic analysis c. Semantic analysis d. Code generation

b. Syntactic analysis

_________________ is an internal representation of a program, favoring content over form. a. Lexical syntax b. Concrete syntax c. Abstract syntax

c. Abstract syntax

One of the purposes of ____________ is to protect the rest of a program from implementation changes (the implementation may change, but interface doesn't). a. Inheritance b. Encapsulation c. Information hiding d. Data abstraction

c. Information hiding

_____________ are the local variables of a class. a. Global variables b. Class variables c. Instance variables d. Local variables

c. Instance variables

_______________ is used in object-oriented programming to model the is-a relationship. a. Data abstraction b. Encapsulation c. Information hiding d. Inheritance

d. Inheritance

____________ is a distinguishing characteristic of imperative programming languages. a. Statements are commands b. Program state is modified c. Statements closely reflect machine language d. a and c both e. a, b, and c

e. a, b, and c

a. Draw a parse tree for the expression: 6 + 8 % 2

no

#4 I dont have the fancy version of quizlet so no diagram pictures, pg 9 on the practice exam

peepee

b. Draw a Deterministic Finite State Automata that accepts the same real numbers as your grammar.

srry

What is the Chomsky Hierarchy of grammars? Which two grammars are typically used for programming languages and why?

The Chomsky hierarchy is a containment hierarchy of classes of formal grammars.
• Regular grammar (regular expressions) - least powerful
  - Used to define lexical syntax (used during lexical analysis)
• Context-free grammar (BNF)
  - Used to define concrete syntax
  - Used in syntax analysis
  - Concrete syntax uses lexemes derived from the lexical syntax alphabet to form sentences
• Context-sensitive grammar
  - Able to express some type rules
• Unrestricted grammar - most powerful
  - Can express all features of a language
Regular and context-free grammars are the two used for programming languages because they have efficient recognizers (finite automata and push-down automata). Context-sensitive and unrestricted grammars are too powerful to be recognized efficiently, so they are not used to build compilers.

What is the definition of syntax

the form or structure of the expressions, statements, and program units

What is the definition of semantics

the meaning of the expressions, statements, and program units

What is a feature of the Imperative and Object-Oriented language category?

• Central features are variables, assignment statements, and iteration • Include languages that support object-oriented programming• Include scripting languages • Include the visual languages • Examples: C, Java, Perl, JavaScript, Visual BASIC .NET, C++

What is a feature of the Functional language category?

• Main means of making computations is by applying functions to given parameters • Examples: LISP, Scheme, ML, F#, Haskell

What is a feature of the Markup/Programming Hybrid language category?

• Markup languages have extensions to support some programming • Examples: Java Server Pages Standard Tag Library (JSTL), eXtensible Stylesheet Language Transformations (XSLT)

What is a feature of the Logic language category?

• Rule-based (rules are specified in no particular order) • Example: Prolog, SQL

What is the definition of generators

•A device that generates sentences of a language •One can determine if a particular sentence is syntactically correct by comparing it to the structure of the generator

List three types of tokens that a typical lexical analyzer (lexer) will identify.

▪ Identifiers▪ Literals▪ Keywords▪ Operators▪ Punctuation

Be able to draw a DFSA when given a description of the strings it should recognize

-

Be able to write a regular expression, given a description of what it should match Be able to interpret a regular expression and say whether certain strings would cause a match (Given the regular expression notation)

-

What are the definitions of the three types of semantics? Advantages and disadvantages of each? (e.g., operational semantics is complex if used formally)

1. Operational Semantics
- Describe the meaning of a program by executing its statements on a machine, either simulated or actual
- The change in the state of the machine (memory, registers, etc.) defines the meaning of the statement
- To use operational semantics for a high-level language, a virtual machine is needed
- A better alternative: A complete computer simulation
• Uses of operational semantics
  - Language manuals and textbooks
  - Teaching programming languages
• Evaluation
  - Good if used informally (language manuals, etc.)
  - Extremely complex if used formally (e.g., VDL)
2. Denotational Semantics
• Based on recursive function theory
• The most abstract semantics description method
• The meaning of language constructs is defined by only the values of the program's variables
• Can be used to prove the correctness of programs
• Provides a rigorous way to think about programs
• Can be an aid to language design
• Has been used in compiler generation systems
• Because of its complexity, it is of little use to language users
3. Axiomatic Semantics
• Based on formal logic (predicate calculus)
• Original purpose: formal program verification
• Axioms or inference rules are defined for each statement type in the language (to allow transformations of logic expressions into more formal logic expressions)
• The logic expressions are called assertions
• An assertion before a statement (a precondition) states the relationships and constraints among variables that are true at that point in execution
• An assertion following a statement is a postcondition
• A weakest precondition is the least restrictive precondition that will guarantee the postcondition
• Developing axioms or inference rules for all of the statements in a language is difficult
• It is a good tool for correctness proofs, and an excellent framework for reasoning about programs, but it is not as useful for language users and compiler writers
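As a worked illustration of the weakest precondition idea, the usual axiom for assignment substitutes the assigned expression for the variable in the postcondition. The statement and postcondition below are assumptions chosen as an example, not taken from the slides.

```latex
% Assignment axiom: replace x by E in the postcondition Q
\{\, Q_{x \rightarrow E} \,\} \quad x = E \quad \{\, Q \,\}

% Worked instance: statement a = b + 1 with postcondition a > 1
\{\, b + 1 > 1 \,\} \quad a = b + 1 \quad \{\, a > 1 \,\}
\qquad\Longrightarrow\qquad
\text{weakest precondition: } \{\, b > 0 \,\}
```

A stronger assertion such as b > 10 would also guarantee the postcondition, but b > 0 is the least restrictive one that does.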

List two different ways to solve ambiguity that may occur in a programming language grammar. Give a very brief description (one or two sentences) of each.

1.) In the grammar - rewrite the grammar rules to remove the ambiguity 2.) Outside the grammar - explain in the language description (in English or some other way) how the ambiguity should be resolved

List the 3 main characteristics of an object oriented programming language

1.) Supports an encapsulation mechanism with information hiding by defining abstract data types. (classes) 2.) Dynamic method binding / polymorphism 3.) Inheritance

