CSCE-221H Midterm
Balanced search trees vs. basic binary search tree
AVL + Red-Black provide worst case logarithmic time for the main operations, while the basic BST has simpler insert + remove algorithms
Big-Omega Notation
asymptotic LOWER bound (loosely associated with the BEST-CASE)
Big-Theta Notation
tight bound: the upper bound (Big-Oh) and lower bound (Big-Omega) are the SAME (loosely: BEST-CASE + WORST-CASE are the SAME)
How does the "search-tree sort" algorithm work?
By inserting the items to be sorted into a search tree and then outputting the items in the tree using inorder traversal
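A minimal sketch of search-tree sort, assuming integer keys and a plain unbalanced BST. The names `Node`, `insert`, `inorder`, and `treeSort` are illustrative, not taken from the course code.

```cpp
#include <memory>
#include <vector>

// Node of a plain (unbalanced) binary search tree.
struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

// Insert key into the subtree rooted at root; duplicates go to the right.
void insert(std::unique_ptr<Node>& root, int key) {
    if (!root) {
        root = std::make_unique<Node>(key);
        return;
    }
    if (key < root->key) insert(root->left, key);
    else insert(root->right, key);
}

// Inorder traversal: left subtree, node, right subtree (sorted order).
void inorder(const Node* node, std::vector<int>& out) {
    if (!node) return;
    inorder(node->left.get(), out);
    out.push_back(node->key);
    inorder(node->right.get(), out);
}

// Step 1: insert every item. Step 2: read the tree back inorder.
std::vector<int> treeSort(const std::vector<int>& items) {
    std::unique_ptr<Node> root;
    for (int x : items) insert(root, x);
    std::vector<int> sorted;
    sorted.reserve(items.size());
    inorder(root.get(), sorted);
    return sorted;
}
```

With an unbalanced BST, already-sorted input degrades this to quadratic time; a balanced tree (AVL or red-black) would keep it at O(n log n).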
Applications of different traversal on any kind of search tree
Inorder -> output the elements of the tree in sorted order; Postorder -> compute the height of each node in the tree (need the heights of a node's children to define its own height); Preorder -> compute the depth of each node in the tree
[Q4] What is the main data structure used in class to convert postfix expression to an expression tree?
Stack
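One possible sketch of the stack-based conversion, assuming single-character tokens (letters or digits as operands, anything else as a binary operator) and well-formed input. `ExprNode`, `buildExprTree`, and `toInfix` are hypothetical names used only for illustration.

```cpp
#include <cctype>
#include <memory>
#include <stack>
#include <string>

struct ExprNode {
    char token;
    std::unique_ptr<ExprNode> left, right;
    explicit ExprNode(char t) : token(t) {}
};

// Operands are pushed as leaves; an operator pops two subtrees and
// pushes a new node with those subtrees as its children.
std::unique_ptr<ExprNode> buildExprTree(const std::string& postfix) {
    std::stack<std::unique_ptr<ExprNode>> st;
    for (char c : postfix) {
        if (std::isspace(static_cast<unsigned char>(c))) continue;
        auto node = std::make_unique<ExprNode>(c);
        if (!std::isalnum(static_cast<unsigned char>(c))) {
            // The top of the stack is the RIGHT operand.
            node->right = std::move(st.top());
            st.pop();
            node->left = std::move(st.top());
            st.pop();
        }
        st.push(std::move(node));
    }
    return std::move(st.top());
}

// Inorder traversal with parentheses gives a fully parenthesized infix form.
std::string toInfix(const ExprNode* n) {
    if (!n) return "";
    if (!n->left && !n->right) return std::string(1, n->token);
    return "(" + toInfix(n->left.get()) + n->token + toInfix(n->right.get()) + ")";
}
```

For the postfix string `ab+c*`, the root is `*`, its left child is the `+` subtree, and its right child is the leaf `c`.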
depth + height of a Tree + its nodes
TREE: > depth = max. depth of any node > height = height of the root NODES: > depth = number of edges between root and node > height = length of longest path from node to a leaf
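A short sketch of these definitions, assuming a binary tree with raw child pointers. `TreeNode`, `height`, and `depths` are illustrative names; the convention that an empty subtree has height -1 (so a leaf has height 0) is an assumption that matches counting edges.

```cpp
#include <algorithm>
#include <map>

struct TreeNode {
    int key;
    TreeNode* left = nullptr;
    TreeNode* right = nullptr;
    explicit TreeNode(int k) : key(k) {}
};

// Postorder computation: a node's height depends on its children's heights.
int height(const TreeNode* n) {
    if (!n) return -1;  // empty subtree, so a leaf ends up with height 0
    return 1 + std::max(height(n->left), height(n->right));
}

// Preorder computation: record a node's depth before visiting its children.
void depths(const TreeNode* n, int depth, std::map<int, int>& out) {
    if (!n) return;
    out[n->key] = depth;
    depths(n->left, depth + 1, out);
    depths(n->right, depth + 1, out);
}
```

Calling `height` on the root gives the height of the tree, and the largest value recorded by `depths(root, 0, out)` gives the depth of the tree; the two numbers are equal.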
[Q2] What is the asymptotic BEST running-time for binary search, assuming A is already in memory? (use big-theta notation)
Theta(1) > occurs if 'x' is in the middle position
[Q2] Recall that Euclid's Algorithm takes as input two integers 'm' + 'n' and outputs gcd(m,n); what is the asymptotic WORST running-time as a function of 'n'? (use big-theta notation)
Theta(log n)
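For reference, a minimal sketch of the iterative form of Euclid's Algorithm, assuming non-negative inputs that are not both zero. The function name `euclidGcd` is illustrative.

```cpp
// Repeatedly replace (m, n) with (n, m mod n) until n becomes 0.
unsigned long long euclidGcd(unsigned long long m, unsigned long long n) {
    while (n != 0) {
        unsigned long long r = m % n;
        m = n;
        n = r;
    }
    return m;
}
```

The logarithmic bound comes from the fact that the remainder at least halves every two iterations, so the loop runs O(log n) times.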
[Q2] What is the asymptotic WORST case running-time for binary search, assuming 'A' is already in memory? (use big-theta notation)
Theta(log n)
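A minimal sketch of binary search over a sorted `std::vector<int>`, written to make the best and worst cases visible. The name `binarySearch` and the choice to return -1 on failure are assumptions for illustration.

```cpp
#include <vector>

// Returns the index of x in the sorted vector a, or -1 if x is absent.
int binarySearch(const std::vector<int>& a, int x) {
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;  // avoids overflow of lo + hi
        if (a[mid] == x) return mid;   // best case: hit on the first probe
        if (a[mid] < x) lo = mid + 1;  // discard the left half
        else hi = mid - 1;             // discard the right half
    }
    return -1;  // range is empty after about log2(n) halvings
}
```

If `x` sits in the middle position, the first probe finds it, which is the Theta(1) best case; otherwise each iteration halves the remaining range, which gives the Theta(log n) worst case.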
Big-Oh Notation
asymptotic UPPER bound (loosely associated with the WORST-CASE)
[Q2] What is one reasonable built-in data structure that can be used to implement the list abstract data type
array (or linked list?)
AVL Tree
binary search tree (BST) in which the heights of the left subtree and the right subtree of any node differ by at most 1; Height: O(log n)
Red-Black Tree
binary search tree (bst) in which every "missing child" is replaced with a dummy sentinel node, every data node has exactly two children, and the nodes are colored according to a set of rules (described later)
[Q1] What syntactic feature in C++ captures the idea of an ADT?
classes
WORST-CASE Running-Time of Binary Search Tree (BST)
contains(x) -> Theta(n) findMax() -> Theta(n) findMin() -> Theta(n) insert(x) -> Theta(n) remove(x) -> Theta(n) (the worst case is a degenerate tree shaped like a linked list)
Running-Time of Linked-List vs. Array
findPos(k) better in array, insertFront(x) + removeFront() better in linked-list
What is asymptotic analysis?
focusing on the "abstract" measure of time as the input size grows infinitely
Queue ADT used for ___
graph breadth-first search
[Q1] What are the two informal simplifications done as part of the asymptotic analysis of a function?
ignore ... (1) multiplicative constants (2) lower-order terms
What is a vector?
improved version of an array that supports resizing & bounds-checking for indexing
Why use asymptotic analysis?
input size growing infinitely results in multiplicative constants and lower order terms becoming negligible
Running-Time of AVL Tree
insert(x) -> O(log n) remove(x) -> O(log n)
Handling Hash Table Collisions with Separate Chaining
keep a singly-linked list of all elements that hash to the same value (each index holds a pointer to the head of a linked list holding the keys that coincide with that position)
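A rough sketch of separate chaining, assuming string keys, `std::hash` as the hash function, and a fixed prime table size with no rehashing. The class name `ChainedHashSet` and its methods are hypothetical.

```cpp
#include <cstddef>
#include <forward_list>
#include <functional>
#include <string>
#include <vector>

class ChainedHashSet {
public:
    explicit ChainedHashSet(std::size_t buckets = 11) : table_(buckets) {}

    bool contains(const std::string& key) const {
        for (const auto& k : table_[index(key)])
            if (k == key) return true;
        return false;
    }

    // Returns false if key was already present.
    bool insert(const std::string& key) {
        if (contains(key)) return false;
        table_[index(key)].push_front(key);  // O(1) insert at the head
        return true;
    }

    // Returns false if key was not found.
    bool remove(const std::string& key) {
        auto& chain = table_[index(key)];
        auto prev = chain.before_begin();
        for (auto it = chain.begin(); it != chain.end(); prev = it, ++it) {
            if (*it == key) {
                chain.erase_after(prev);
                return true;
            }
        }
        return false;
    }

private:
    std::size_t index(const std::string& key) const {
        return std::hash<std::string>{}(key) % table_.size();
    }

    // Each slot holds the head of a singly-linked list of colliding keys.
    std::vector<std::forward_list<std::string>> table_;
};
```

Every operation first hashes the key to pick a slot, then walks only that slot's chain, so the cost depends on the chain length rather than on the total number of keys.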
Implementations of Queue?
- linked-list - array with both head + tail indices - circular-array
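A minimal sketch of the circular-array approach, assuming a fixed capacity and integer elements. `CircularQueue` and its member names are illustrative; this version tracks a head index plus a size instead of a separate tail index.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

class CircularQueue {
public:
    explicit CircularQueue(std::size_t capacity) : data_(capacity) {}

    bool empty() const { return size_ == 0; }
    bool full() const { return size_ == data_.size(); }
    std::size_t size() const { return size_; }

    void enqueue(int x) {
        if (full()) throw std::overflow_error("queue is full");
        // The tail position wraps around the end of the array.
        data_[(head_ + size_) % data_.size()] = x;
        ++size_;
    }

    int dequeue() {
        if (empty()) throw std::underflow_error("queue is empty");
        int x = data_[head_];
        head_ = (head_ + 1) % data_.size();  // head wraps around too
        --size_;
        return x;
    }

private:
    std::vector<int> data_;
    std::size_t head_ = 0;
    std::size_t size_ = 0;
};
```

Both `enqueue` and `dequeue` run in constant time because no elements are ever shifted; slots freed at the front are reused once the tail wraps around.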
Queue ADT implemented by _____
list
A linked-list is built with ...
nodes (holds data, pointer to next node, and sometimes pointer to prev node)
[Q3] What is the main advantage of postfix notation over infix notation?
parentheses are not necessary to indicate order of operations
Running-Time of Linked-List
printAll() -> O(n) findVal(x) -> O(n) findPos(k) -> O(n) insertFront(x) -> O(1) insertBack(x) -> O(1) (with a tail pointer) removeFront() -> O(1) removeBack() -> O(n) (singly-linked) / O(1) (doubly-linked list)
Methods of Stack?
push(x) pop() top() size()
Running-Time of Red-Black Tree
search -> O(log n) insert -> O(log n) (worst case, like the other balanced search trees)
Binary Tree
special case of Tree in which every node has at MOST 2 children
Abstract Data Type (ADT) covers ....
state of an object + operations that act on the object
[LW] Time complexity of failed binary search?
still O(log n)
B-Trees
store more than one data item in each tree node (NOTE: barely mentioned)
Definition + Application of Inorder Traversal of a Tree
useful in binary trees: visit left child, then current node, then right child app: use a binary tree to represent an arithmetic expression
Definition + Application of Postorder Traversal of a Tree
visit each node AFTER recursively traversing the subtrees app: calculating the total amount of disk space used by a directory in a file system
Definition + Application of Preorder Traversal of a Tree
visit each node BEFORE recursively traversing the subtrees app: printing names of files + subdirectories in a file system with indentation to indicate structure
What are data structures?
ways to organize data (information)
How is C++ container "list" implemented?
with a doubly-linked list
[Q2] What property must be true of list A in order for binary search to work correctly
'A' must be in sorted order
[Q2] What are two likely explanations for why experimental results might show faster running times than an asymptotic analysis?
(1) asymptotic analysis was too pessimistic (2) analysis is just an estimation and experimental results show the full spectrum of possibilities
Rules of Red-Black Tree coloring
(1) every node is colored red or black (2) the root is black (3) the children of a red node are black (4) every leaf (sentinel node) is black (5) every path from the root down to a leaf has the same number of black nodes
What is an iterator?
- abstraction of a pointer - crucial for using containers
List ADT implemented by _____
- array - linked list
[Q1] What is the justification for focusing primarily on worst-case behavior of an algorithm instead of best-case or average-case?
- best-case is too optimistic - average-case is too difficult to figure out
Priority Queue ADT implemented by ____
- binary heap > (tree or array) - skip list
Search Tree ADT implemented by ____
- binary search tree (bst) - AVL tree - B-tree - red-black tree ... (etc.)
Dictionary ADT used for ____
- database systems - file systems - operating system page table - compiler symbol table ... (etc.)
Search Tree ADT used for _____
- dictionary
Why C++?
- efficient (built on C, close to hardware) - powerful (built-in data abstraction) - flexible (supports generic programming) - standard template library
Applications of Stack?
- evaluating postfix arithmetic expressions, - checking for syntax errors related to balancing punctuation in a program - converting infix to postfix - runtime system handling of function calls
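As an illustration of the first application, here is a minimal sketch of stack-based postfix evaluation. It assumes single-digit operands, the four operators + - * /, integer division, and no division by zero; `evalPostfix` is a hypothetical name.

```cpp
#include <cctype>
#include <stack>
#include <stdexcept>
#include <string>

// Operands are pushed; an operator pops two values and pushes the result.
int evalPostfix(const std::string& expr) {
    std::stack<int> st;
    for (char c : expr) {
        if (std::isspace(static_cast<unsigned char>(c))) continue;
        if (std::isdigit(static_cast<unsigned char>(c))) {
            st.push(c - '0');
            continue;
        }
        if (st.size() < 2) throw std::invalid_argument("malformed expression");
        int rhs = st.top();  // the right operand is on top
        st.pop();
        int lhs = st.top();
        st.pop();
        switch (c) {
            case '+': st.push(lhs + rhs); break;
            case '-': st.push(lhs - rhs); break;
            case '*': st.push(lhs * rhs); break;
            case '/': st.push(lhs / rhs); break;
            default: throw std::invalid_argument("unknown operator");
        }
    }
    if (st.size() != 1) throw std::invalid_argument("malformed expression");
    return st.top();
}
```

No parentheses or precedence rules are needed: the order of the tokens alone determines the order of evaluation.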
Stack ADT used for _______
- expression evaluation - run-time system
Main functions of search trees
- hold a collection of data items - insert new items - remove existing items - find item in the collection
Applications of Queue?
- job queues to access services such as printers - output stream - fundamental graph theory algorithms
Stack ADT implemented by ______
- list
Dictionary ADT implemented by _____
- search tree - hashing schemes
[LW] Advantages of linked list?
- size easily adjusted while program runs - avoids overallocating space - inserting can be done in constant time
Priority Queue ADT used for ____
- sorting - selection - discrete event simulation
List ADT used for _____
- stack - queue - sorting
[Q1] What are the two parts of an Abstract Data Type (ADT)
- state - operations
Creating hash functions
=> table size should be a prime number => info about the actual items needs to be stored => function should depend on the entire key, not just part of it
Array-Based Lists vs. Linked-Lists
> array-based lists are faster at direct access (indexing), but require extra space to avoid frequent copying of the array to resize it > linked-lists are faster at inserting/removing from arbitrary positions in the list, but require extra space for the pointers
[LW: T/F ] Searching for two different values of 'x,' neither of which is present in a[1..N], always takes the same number of steps to determine failure
False
[Q2: T/F ] The specification of an abstract data type should indicate which implementation is the fastest
False
What is a Stack?
Last IN, First OUT (LIFO): a list in which elements can be added to the end only and removed from the end only
Asymptotic Order of Big-O
O(c) O(log(n)) ... O(n) O(n*log(n)) O(n^2) ... O(2^n) O(3^n) ...
[LW] Time complexity of successful binary search?
O(log n) > cuts the tree/list/etc. in half with each element checked
Average-Case Running-Time of Binary Search Trees (BST)
O(log n), if sequences in which 'n' data elements are inserted are equally likely AND no deletions are done
[LW] f(n) = 3^(log_2(n)) is ....
O(n^2), NOT Omega(n^2) or Theta(n^2)
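A short worked derivation of why this holds, rewriting the function as a power of n:

```latex
\begin{align*}
f(n) = 3^{\log_2 n}
     &= \left(2^{\log_2 3}\right)^{\log_2 n}
      = \left(2^{\log_2 n}\right)^{\log_2 3}
      = n^{\log_2 3} \approx n^{1.585}, \\
\lim_{n \to \infty} \frac{f(n)}{n^2}
     &= \lim_{n \to \infty} n^{\log_2 3 - 2} = 0
     \quad \text{since } \log_2 3 < 2 .
\end{align*}
```

Because the ratio tends to 0, f(n) grows strictly slower than n^2: it is O(n^2) (in fact little-oh of n^2), so it cannot be Omega(n^2) or Theta(n^2).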
Binary Search Tree (BST)
a binary tree with data items drawn from a totally ordered set - every item in the left subtree < curNode - every item in the right subtree > curNode
What happens when the number of elements stored in the array exceeds its capacity
a new array with typically double the capacity is allocated and the old data is copied into this new array (amortized running-time per insertion is constant)
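A simplified sketch of the doubling strategy, assuming integer elements and a starting capacity of 0 that grows to 1 on the first insertion. `IntVector` and its members are hypothetical and much smaller than `std::vector`.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>

class IntVector {
public:
    void pushBack(int x) {
        if (size_ == capacity_) grow();  // array is full: resize first
        data_[size_++] = x;
    }

    // Bounds-checked indexing.
    int at(std::size_t i) const {
        if (i >= size_) throw std::out_of_range("index out of range");
        return data_[i];
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    // Allocate an array of double the capacity and copy the old data over.
    void grow() {
        std::size_t newCap = (capacity_ == 0) ? 1 : 2 * capacity_;
        auto bigger = std::make_unique<int[]>(newCap);
        for (std::size_t i = 0; i < size_; ++i) bigger[i] = data_[i];
        data_ = std::move(bigger);
        capacity_ = newCap;
    }

    std::unique_ptr<int[]> data_;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
};
```

A single `pushBack` that triggers `grow` costs O(n), but doubling makes such copies rare enough that n insertions cost O(n) in total.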
What is an Algorithm?
a sequence of well-defined steps to solve a problem
What is pseudocode?
a style of describing algorithms that is easy to translate into any particular computer language
Little-Oh Notation
analog for "<" (less than)
Little-Omega Notation
analog for ">" (greater than)