CSE2100 Final Exam Review
applications of doubly linked list
- A great way to represent a deck of cards in a game
- The browser history that allows you to hit the BACK button (a linked list of URLs)
- Applications that have a Most Recently Used (MRU) list (a linked list of file names)
- A stack, hash table, and binary tree can be implemented using a doubly linked list
- Undo functionality in Photoshop or Word (a linked list of states)
- Time sharing of a computer among several users
AVL tree
-AVL trees are balanced
-an AVL tree is a binary search tree such that for every internal node v of T, the heights of the children of v differ by at most 1
Updates with linear probing
-DEFUNCT: a special marker that replaces deleted entries
-remove(k):
  -we search for an entry with key k
  -if such an entry (k, o) is found, we replace it with the special item DEFUNCT and we return the element o
  -else, we return null
-put(k, o):
  -we throw an exception if the table is full
  -we start at cell h(k)
  -we probe consecutive cells until one of the following occurs:
    -a cell i is found that is either empty or stores DEFUNCT, or
    -N cells have been unsuccessfully probed
  -we store (k, o) in cell i
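The update rules above can be sketched as a minimal open-addressing map. This is an illustrative sketch, not the course's reference code; the class and field names are hypothetical, and `hash` here is just a simple modular compression.

```java
// Minimal linear-probing map illustrating the DEFUNCT marker convention.
public class ProbeMap {
    private static final Object DEFUNCT = new Object(); // marks deleted cells
    private final Object[] keys, values;
    private final int N;

    public ProbeMap(int capacity) {
        N = capacity;
        keys = new Object[N];
        values = new Object[N];
    }

    private int hash(Object k) { return Math.abs(k.hashCode()) % N; }

    public void put(Object k, Object v) {
        int firstAvailable = -1;
        for (int j = 0; j < N; j++) {
            int i = (hash(k) + j) % N;                 // probe consecutive cells
            if (keys[i] == null || keys[i] == DEFUNCT) {
                if (firstAvailable < 0) firstAvailable = i;
                if (keys[i] == null) break;            // key cannot be further on
            } else if (keys[i].equals(k)) {            // key already present
                values[i] = v;
                return;
            }
        }
        if (firstAvailable < 0) throw new IllegalStateException("table full");
        keys[firstAvailable] = k;
        values[firstAvailable] = v;
    }

    public Object remove(Object k) {
        for (int j = 0; j < N; j++) {
            int i = (hash(k) + j) % N;
            if (keys[i] == null) return null;          // empty cell: search fails
            if (keys[i] != DEFUNCT && keys[i].equals(k)) {
                Object old = values[i];
                keys[i] = DEFUNCT;                     // replace with DEFUNCT
                values[i] = null;
                return old;
            }
        }
        return null;
    }

    public Object get(Object k) {
        for (int j = 0; j < N; j++) {
            int i = (hash(k) + j) % N;
            if (keys[i] == null) return null;
            if (keys[i] != DEFUNCT && keys[i].equals(k)) return values[i];
        }
        return null;
    }
}
```

Note that `get` must skip over DEFUNCT cells rather than stop at them, since the key may live further along the probe sequence.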
Queue
-FIFO -insertions are at the rear, removals are at the front -can get first element of queue with first() but cannot get last
hash codes
-by default, a Java object's hash code is derived from its memory address
-this default is fine for generic objects, but not for numeric and string keys
deterministic selection
-O(n) worst-case time
-recursively use the selection algorithm itself to find a good pivot for quick-select:
  -divide S into n/5 sets of 5 elements each
  -find a median in each set
  -recursively find the median of the "baby" medians
(2,4) trees
-a (2,4) tree is a multi-way search tree with the following properties:
  -Node-Size Property: every internal node has at most four children
  -Depth Property: all the external nodes have the same depth
-depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node, or a 4-node
height of (2,4) tree
-a (2,4) tree storing n items has height O(logn)
-proof: let h be the height of a (2,4) tree with n items
  -since there are at least 2^i items at depth i = 0, ..., h-1 and no items at depth h, we have:
    n >= 1 + 2 + 4 + ... + 2^(h-1) = 2^h - 1
  -thus, h <= log(n+1)
-searching in a (2,4) tree with n items takes O(logn) time
Binary Search Tree
-a binary search tree is a binary tree storing keys (or key-value entries) at its internal nodes and satisfying the following property:
  -let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v; we have: key(u) <= key(v) <= key(w)
-external nodes do not store items
-an inorder traversal of a binary search tree visits the keys in increasing order
divide-and-conquer
-a general algorithm design paradigm:
  -Divide: divide the input data S into two disjoint subsets S1 and S2
  -Recur: solve the subproblems associated with S1 and S2
  -Conquer: combine the solutions for S1 and S2 into a solution for S
-the base cases for the recursion are subproblems of size 0 or 1
map
-a map models a searchable collection of key-value pairs -the main operations of a map are for searching, inserting, and deleting items -multiple entries of the same key are not allowed -we can implement it with a doubly linked list of arbitrary order
randomized algorithm
-a randomized algorithm performs coin tosses (ie, random bits) to control its execution
-it contains statements of the type:
  b = random()
  if b = 0
    do A...
  else //b = 1
    do B...
-its running time depends on the outcomes of the coin tosses
-we analyze the expected running time of a randomized algorithm under the following assumptions:
  -the coins are unbiased, and
  -the coin tosses are independent
-the worst-case running time of a randomized algorithm is often large but has very low probability (eg, it occurs when all the coin tosses give "heads")
-we use a randomized algorithm to insert items into a skip list
quick select
-a randomized selection algorithm based on the prune-and-search paradigm:
  -Prune: pick a random element x (called the pivot) and partition S into:
    L: elements less than x
    E: elements equal to x
    G: elements greater than x
  -Search: depending on k, either the answer is in E, or we need to recur in either L or G
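The prune-and-search steps above can be sketched directly in Java. This is an illustrative sketch (class and method names are my own), selecting the k-th smallest element with a random pivot:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Quick-select sketch: partition around a random pivot, then recur
// in L or G, or stop if the answer falls in E.
public class QuickSelect {
    private static final Random rand = new Random();

    // Returns the k-th smallest element of s (k is 1-based).
    public static int select(List<Integer> s, int k) {
        if (s.size() == 1) return s.get(0);
        int pivot = s.get(rand.nextInt(s.size()));
        List<Integer> L = new ArrayList<>(), E = new ArrayList<>(), G = new ArrayList<>();
        for (int x : s) {                       // Prune: partition into L, E, G
            if (x < pivot) L.add(x);
            else if (x == pivot) E.add(x);
            else G.add(x);
        }
        if (k <= L.size()) return select(L, k);           // answer is in L
        if (k <= L.size() + E.size()) return pivot;       // answer is in E
        return select(G, k - L.size() - E.size());        // recur in G
    }
}
```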
search table
-a search table is an ordered map implemented by means of a sorted sequence: -we store the items in an array-based sequence, sorted by key -we use an external comparator for the keys
skip list summary
-a skip list is a data structure for maps that uses a randomized insertion algorithm
-in a skip list with n entries:
  -the expected space used is O(n)
  -the expected search, insertion, and deletion time is O(logn)
-using more complex probabilistic analysis, one can show that these performance bounds also hold with high probability
-skip lists are fast and simple to implement in practice
iterator and methods
-a software design pattern that abstracts the process of scanning through a sequence of elements, one element at a time
-hasNext() - returns true if there is at least one additional element in the sequence, false otherwise
-next() - returns the next element in the sequence
-Java defines a parameterized interface, named Iterable, that includes the following single method:
  -iterator() - returns an iterator of the elements in the collection
-an instance of a typical collection class in Java, such as an ArrayList, is iterable (but not itself an iterator); it produces an iterator for its collection as the return value of the iterator() method
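A small example of implementing both interfaces: the `Range` class below is hypothetical (not from the course materials), but it shows the `iterator()` factory method and the `hasNext()`/`next()` contract described above.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// An Iterable over the integers [from, to); its iterator() method hands
// out a fresh Iterator each time it is called.
public class Range implements Iterable<Integer> {
    private final int from, to;

    public Range(int from, int to) { this.from = from; this.to = to; }

    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int cursor = from;
            public boolean hasNext() { return cursor < to; }  // more elements left?
            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                return cursor++;                              // return, then advance
            }
        };
    }
}
```

Because `Range` is Iterable, it works directly in a for-each loop.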
radix sort
-a specialization of lexicographic sort that uses bucket sort as the stable sorting algorithm in each dimension -radix sort is applicable to tuples where the keys in each dimension are integers in the range [0, N-1] -radix sort runs in O(d(n+N)) time
array-based stack
-add elements left to right
-throws FullStackException if push is called when the array is full
set methods
-add(e) - adds the element e to S (if it does not already exist)
-remove(e) - removes the element e from S if it is present
-contains(e) - returns whether e is an element of S
-iterator() - returns an iterator of the elements of S
-there is also support for traditional mathematical set operations of union, intersection, and subtraction of two sets S and T:
  -addAll(T) - updates S so that it contains all elements of set T, effectively replacing S by S U T (union) -> S or T
  -retainAll(T) - updates S so that it contains only elements that are in both S and T, effectively replacing S with S ^ T (intersection) -> S and T
  -removeAll(T) - updates S by removing all of the elements that also occur in T, effectively replacing S by S - T
applications of map
-address book -student record book
effect of growth rate by changing hardware/software
-affects T(n) by a constant factor -does not alter the growth rate
lower bound of comparison based sorting
-any comparison-based sorting algorithm takes at least log(n!) time
-so any comparison-based sorting algorithm must run in Omega(nlogn) time, since log(n!) >= log((n/2)^(n/2)) = (n/2)log(n/2)
applications of binary tree
-arithmetic expression trees -decision tree
Doubly linked list
-can be traversed forward and backward -you can add in between nodes
binary search
-can perform nearest neighbor queries on an ordered map that is implemented with an array
-at each step, the number of candidate items is halved
-terminates after O(logn) steps
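The halving described above looks like this in Java; a minimal sketch over a sorted `int[]` (the class name is illustrative):

```java
// Binary search over a sorted array: each iteration halves the
// range of candidate indices [lo, hi].
public class BinarySearch {
    // Returns the index of key in sorted array a, or -1 if absent.
    public static int search(int[] a, int key) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;          // midpoint of remaining candidates
            if (a[mid] == key) return mid;
            else if (a[mid] < key) lo = mid + 1;  // discard left half
            else hi = mid - 1;                    // discard right half
        }
        return -1;                                // key not found
    }
}
```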
radix sort for binary numbers
-consider a sequence of b-bit integers x = x(b-1) ... x1 x0
-we represent each element as a b-tuple of integers in the range [0, 1] and apply radix sort with N = 2
-this application of the radix sort algorithm runs in O(bn) time
-for example, we can sort a sequence of 32-bit integers in linear time
performance of BST
-consider an ordered map with n items implemented by means of a binary search tree of height h: -the space used is O(n) -methods get, put, and remove take O(h) time -the height h is O(n) in the worst case and O(logn) in the best case
seven common runtime functions
-constant - 1
-logarithmic - log(n)
-linear - n
-n-log-n - nlog(n)
-quadratic - n^2
-cubic - n^3
-exponential - 2^n
AVL tree performance
-the data structure uses O(n) space
-a single restructuring takes O(1) time, using a linked-structure binary tree
-searching takes O(logn) time: the height of the tree is O(logn), and no restructures are needed
-insertion takes O(logn) time:
  -the initial find is O(logn)
  -restructuring up the tree, maintaining heights, is O(logn)
-removal takes O(logn) time:
  -the initial find is O(logn)
  -restructuring up the tree, maintaining heights, is O(logn)
underflow and fusion
-deleting an entry from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys
-to handle an underflow at node v with parent u, we consider two cases:
  Case 1: the adjacent siblings of v are 2-nodes:
    -Fusion operation: we merge v with an adjacent sibling w and move an entry from u to the merged node v'
    -after a fusion, the underflow may propagate to the parent u
  Case 2: an adjacent sibling w of v is a 3-node or a 4-node:
    -Transfer operation:
      1) we move a child of w to v
      2) we move an item from u to v
      3) we move an item from w to u
    -after a transfer, no underflow occurs
double hashing
-double hashing uses a secondary hash function d(k) and handles collisions by placing an item in the first available cell of the series:
  (i + jd(k)) mod N   for j = 0, 1, ..., N-1, where i = h(k)
-the secondary hash function d(k) cannot have zero values
-the table size N must be a prime to allow probing of all the cells
-common choice of compression function for the secondary hash function:
  d2(k) = q - (k mod q)
  where q < N and q is a prime
-the possible values for d2(k) are 1, 2, ..., q
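The probe sequence above can be computed directly; in this sketch, N = 13 and q = 7 are example primes chosen for illustration, not prescribed values:

```java
// Double-hashing probe sequence (i + j*d(k)) mod N with the common
// secondary compression d2(k) = q - (k mod q), which is never zero.
public class DoubleHash {
    static final int N = 13, q = 7;   // example primes, q < N

    static int h(int k)  { return k % N; }        // primary hash
    static int d2(int k) { return q - (k % q); }  // secondary hash, in 1..q

    // The j-th cell probed for key k.
    static int probe(int k, int j) {
        return (h(k) + j * d2(k)) % N;
    }
}
```

For example, key 18 has h(18) = 5 and d2(18) = 3, so its probe sequence starts 5, 8, 11, ...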
positional list implementation
-doubly-linked list
four things a node object stores(for binary tree)
-element -parent node -left child node -right child node -implemented by the Position ADT
three things a node object stores(for tree)
-element -parent node -sequence of children nodes -implemented by the Position ADT
Queue methods
-enqueue(object) - adds an element at the rear (an array-based queue throws IllegalStateException if full)
-dequeue() - removes and returns the first object in the queue
-first() - returns the element at the front without removing it
-size() - returns the number of elements stored
-isEmpty() - indicates whether no elements are stored
positional list methods
-first() - returns the position of the first element of L (or null if empty)
-last() - returns the position of the last element of L (or null if empty)
-before(p) - returns the position of L immediately before position p (or null if p is the first position)
-after(p) - returns the position of L immediately after position p (or null if p is the last position)
-isEmpty() - returns true if list L does not contain any elements
-size()
-addFirst(e) - inserts a new element at the front of the list, returning the position of the new element
-addLast(e) - inserts a new element at the back of the list, returning the position of the new element
-addBefore(p, e) - inserts a new element e in the list, just before position p, returning the position of the new element
-addAfter(p, e) - inserts a new element e in the list, just after position p, returning the position of the new element
-set(p, e) - replaces the element at position p with element e, returning the element formerly at position p
-remove(p) - removes and returns the element at position p in the list, invalidating the position
skip list
-for a set S of distinct (key, element) items, it is a series of lists S0, S1, ..., Sk such that:
  -each list Si contains the special keys +infinity and -infinity
  -list S0 contains the keys of S in nondecreasing order
  -each list is a subsequence of the previous one
  -list Sk contains only the two special keys
Euler Tour traversal
-generic traversal of a binary tree
-includes as special cases the preorder, postorder, and inorder traversals
-walk around the tree and visit each node three times:
  -on the left (preorder)
  -from below (inorder)
  -on the right (postorder)
map methods
-get(k) - if the map M has an entry with key k, return its associated value; else, return null
-put(k, v) - insert entry (k, v) into the map M; if key k is not already in M, then return null; else, return the old value associated with k
-remove(k) - if map M has an entry with key k, remove it from M and return its associated value; else, return null
-size()
-isEmpty()
-entrySet() - returns an iterable collection of the entries in M
-keySet() - returns an iterable collection of the keys in M
-values() - returns an iterator of the values in M
-in a list-based map, put, get, and remove take O(n) time, since in the worst case you will traverse through each element of the map and not find an entry with the given key
multimap methods
-get(k) - returns a collection of all values associated with key k in the multimap
-put(k, v) - adds a new entry to the multimap associating key k with value v, without overwriting any existing mappings for key k
-remove(k, v) - removes an entry mapping key k to value v from the multimap, if one exists
-removeAll(k) - removes all entries having key equal to k from the multimap
-size() - returns the number of entries of the multimap, including multiple associations
-entries() - returns a collection of all entries in the multimap
-keys() - returns a collection of keys for all entries in the multimap, including duplicates for keys with multiple bindings
-keySet() - returns a nonduplicative collection of keys in the multimap
-values() - returns a collection of values for all entries in the multimap
Singly linked list
-has head and tail nodes
-can only go to the next element, not back
-removes elements from the front; there is no constant-time way to remove the element at the tail, because the node before the tail (which must become the new tail) cannot be reached without traversing the whole list
-can add to the beginning and end of the list
trinode reconstruction
-given an unbalanced node z, let y be the child of z with the larger height and x the child of y with the larger height
-relabel x, y, z as a, b, c according to their inorder ordering; b becomes the root of the subtree, with a and c as its children, and the four remaining subtrees are reattached in inorder
-(diagrams in notebook and slides; difficult to visualize without them)
AVL insertion
-insertion is as in a binary search tree
-always done by expanding an external node
-it is then followed by a trinode restructuring if the insertion unbalances the tree
entry and methods
-a key-value pair in a priority queue
-priority queues store entries to allow for efficient insertion and removal based on keys
-it is an interface
-getKey() - returns the key for this entry
-getValue() - returns the value associated with this entry
total order relations of priority queue
-keys in a priority queue can be arbitrary objects on which an order is defined
-two distinct entries in a priority queue can have the same key
-mathematical concept of total order relation:
  -comparability property: either x <= y or y <= x
  -antisymmetric property: x <= y and y <= x -> x = y
  -transitive property: x <= y and y <= z -> x <= z
Stack
-last in, first out
additional methods of binary tree(compared to tree)
-left(p) - returns the position of the left child of node p -right(p) - returns the position of the right child of node p -sibling(p) - returns the right child of node p's parent if node p is the left child of its parent, and returns the left child of node p's parent if p is the right child of its parent -these methods return null if there is no left, right, or sibling of p
separate chaining
-let each cell in the table point to a linked list of the entries that map there
-separate chaining is simple, but requires additional memory outside of the table
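A minimal separate-chaining sketch, assuming a fixed-capacity table (names like `ChainMap` are illustrative, not from the course code):

```java
import java.util.LinkedList;

// Each table cell holds a linked list (chain) of the entries that hash there.
public class ChainMap {
    private static class Entry {
        Object key, value;
        Entry(Object k, Object v) { key = k; value = v; }
    }

    private final LinkedList<Entry>[] table;

    @SuppressWarnings("unchecked")
    public ChainMap(int capacity) {
        table = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) table[i] = new LinkedList<>();
    }

    private int index(Object k) { return Math.abs(k.hashCode()) % table.length; }

    public void put(Object k, Object v) {
        for (Entry e : table[index(k)])
            if (e.key.equals(k)) { e.value = v; return; }   // overwrite existing key
        table[index(k)].add(new Entry(k, v));               // else append to the chain
    }

    public Object get(Object k) {
        for (Entry e : table[index(k)])                     // scan only this chain
            if (e.key.equals(k)) return e.value;
        return null;
    }
}
```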
generic merge
-merging two sets takes O(nA + nB) time, where nA and nB are the number of elements in sets A and B respectively
-this is given that the auxiliary methods for the sets run in O(1) time
-auxiliary methods:
  -aIsLess
  -bIsLess
  -bothAreEqual
-intersection and union can be implemented using generic merge
throw vs throws
-a method "throws" exceptions; lines of code in the method "throw" the exception
-a try-catch statement can be used to handle exceptions
-if a method throws a checked exception, the caller needs a catch block (or must declare the exception with throws)
-"throw" will instantiate an exception
-"throws" can be implicit to the method
properties of proper binary trees
-n - number of nodes
-e - number of external nodes
-i - number of internal nodes
-h - height
properties:
-e = i + 1
-n = 2e - 1 (the two most important properties)
-h <= i
-h <= (n-1)/2
-e <= 2^h
-h >= log(base 2) e
-h >= log(base 2) (n+1) - 1
linear probing
-open addressing: the colliding item is placed in a different cell of the table
-linear probing: handles collisions by placing the colliding item in the next (circularly) available table cell
-each table cell inspected is referred to as a "probe"
-colliding items lump together, causing future collisions to produce longer sequences of probes
multi-way search tree
-an ordered tree such that:
  -each internal node has at least two children and stores d-1 key-element items (ki, oi), where d is the number of children
  -for a node with children v1 v2 ... vd storing keys k1 k2 ... kd-1:
    -keys in the subtree of v1 are less than k1
    -keys in the subtree of vi are between ki-1 and ki (i = 2, ..., d-1)
    -keys in the subtree of vd are greater than kd-1
applications of trees
-organization charts -file systems -programming environments
applications of stacks
-page-visited history in a web browser -undo sequence in a text editor -chain of method calls in the Java Virtual Machine(this is what allows for recursion) -parentheses matching -HTML tag matching indirect applications: -auxiliary data structure for algorithms -component of other data structures
Stack methods
-push(object)- inserts an element -pop() - removes and returns the last inserted element -top() - returns the last inserted element without removing it -size() - returns the number of elements stored -isEmpty() - indicates whether no elements are stored
quick-sort
-a randomized sorting algorithm based on the divide-and-conquer paradigm:
  -Divide: pick a random element x (called the pivot) and partition S into:
    L: elements less than x
    E: elements equal to x
    G: elements greater than x
  -Recur: sort L and G
  -Conquer: join L, E, and G
-the partition step takes O(n) time
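The divide/recur/conquer steps above can be sketched as follows. This version returns a new list rather than sorting in place, to mirror the L/E/G description directly (the class name is illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Quick-sort via three-way partition around a random pivot.
public class QuickSort {
    private static final Random rand = new Random();

    public static List<Integer> sort(List<Integer> s) {
        if (s.size() <= 1) return new ArrayList<>(s);     // base case
        int pivot = s.get(rand.nextInt(s.size()));
        List<Integer> L = new ArrayList<>(), E = new ArrayList<>(), G = new ArrayList<>();
        for (int x : s) {                                  // Divide: O(n) partition
            if (x < pivot) L.add(x);
            else if (x == pivot) E.add(x);
            else G.add(x);
        }
        List<Integer> out = new ArrayList<>(sort(L));      // Recur on L
        out.addAll(E);                                     // Conquer: join L, E, G
        out.addAll(sort(G));                               // Recur on G
        return out;
    }
}
```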
removal of AVL tree
-removal begins as in a binary search tree, which means the node removed will become an empty external node; its parent, w, may cause an imbalance
-the imbalance is fixed with a trinode restructuring
tree terminology
-root - node without a parent
-internal node - node with at least one child
-external node (leaf) - node without children
-ancestors of a node - parent, grandparent, great-grandparent, etc.
-depth of a node - number of ancestors
-height of a tree - maximum depth of any node
-descendant of a node - child, grandchild, great-grandchild, etc.
-subtree - tree consisting of a node and its descendants
search table performance
-searches take O(logn) time, using binary search
-inserting a new item takes O(n) time, since in the worst case we have to shift n items to make room for the new item
-removing an item takes O(n) time, since in the worst case we have to shift n items to compact the items after the removal
-the lookup table is effective only for ordered maps of small size or for maps on which searches are the most common operation, while insertions and removals are rarely performed (eg: credit card authorization)
performance of hashing
-searches, insertions, and removals on a hash table take O(n) time in the worst case
-the worst case occurs when all the keys inserted into the map collide
-the load factor a = n/N affects the performance of a hash table
-assuming that the hash values are like random numbers, it can be shown that the expected number of probes for an insertion with open addressing is 1/(1-a)
-the expected running time of all the dictionary (Java map) ADT operations in a hash table is O(1)
-in practice, hashing is very fast provided the load factor is not close to 100%
multi-way searching
-similar to search in a binary search tree
-at each internal node with children v1 v2 ... vd and keys k1 k2 ... kd-1:
  -k = ki (i = 1, ..., d-1): the search terminates successfully
  -k < k1: we continue the search in child v1
  -ki-1 < k < ki (i = 2, ..., d-1): we continue the search in child vi
  -k > kd-1: we continue the search in child vd
-reaching an external node terminates the search unsuccessfully
list methods
-size()
-isEmpty()
-get(i) - gets the element at index i; error if i is not in range
-set(i, e) - replaces the element at index i with element e; error if index i is out of range
-add(i, e) - inserts a new element into the list at index i, moving all subsequent elements one index later in the list; error occurs if i is not in range
-remove(i) - removes and returns the element at index i; error if i is not in range
tree methods
-size()
-isEmpty()
-iterator()
-positions() - returns an Iterable of positions
-root()
-parent(p) - returns the position of node p's parent
-children(p) - returns an iterable of the node at position p's children
-numChildren(p) - returns the number of children of the node at position p
-isInternal(p) - returns true if the node at position p is internal, false otherwise
-isExternal(p) - returns true if the node at position p is external, false otherwise
-isRoot(p) - returns true if the node at position p is the root of the tree (ie, if it has no parent), and false otherwise
-parent(p), children(p), numChildren(p), isInternal(p), isExternal(p), and isRoot(p) all declare "throws IllegalArgumentException"
applications of hash tables
-small databases -compilers -browser caches
array-based list
-space is O(n)
-add and remove run in O(n)
-when the array is full and add() is called, we can replace the array with a larger one
-best done by doubling the array (the amortized time of push is O(1), as opposed to O(n) if we grew the array by a constant c each time)
stack performance
-space used is O(n) -each operation runs in O(1)
applications of priority queues
-standby flyers -auctions -stock market
priority queue and methods
-stores a collection of entries
-each entry is a pair (key, value)
-insert(k, v) - inserts an entry with a key and a value
-removeMin() - removes and returns the entry with the smallest key, or null if the priority queue is empty
-min() - returns but does not remove an entry with the smallest key, or null if the priority queue is empty
-size()
-isEmpty()
analysis of merge-sort
-the height h of the merge-sort tree is O(logn):
  -at each recursive call we divide the sequence in half
-the overall amount of work done at the nodes of depth i is O(n):
  -we partition and merge 2^i sequences of size n/2^i
  -we make 2^(i+1) recursive calls
-thus the total running time of merge-sort is O(nlogn)
height of an AVL tree
-the height of an AVL tree storing n keys is O(logn)
-proof (by induction): let us bound n(h), the minimum number of internal nodes of an AVL tree of height h
-we easily see that n(1) = 1 and n(2) = 2
-for h > 2, an AVL tree of height h contains the root node, one AVL subtree of height h-1 and another of height h-2
-so n(h) = 1 + n(h-1) + n(h-2), which gives n(h) > 2n(h-2), n(h) > 4n(h-4), n(h) > 8n(h-6), ... (by induction) n(h) > 2^i n(h-2i)
-solving the base case we get: n(h) > 2^(h/2 - 1)
-taking logarithms: h < 2log n(h) + 2
-thus the height of an AVL tree is O(logn)
height of skip list
-the running time of the search and insertion algorithms is affected by the height h of the skip list
-we show that with high probability, a skip list with n items has height O(logn)
-we use the following additional probabilistic fact:
  -fact 3: if each of n events has probability p, the probability that at least one event occurs is at most np
-consider a skip list with n entries:
  -by fact 1, we insert an entry in list Si with probability 1/2^i
  -by fact 3, the probability that list Si has at least one item is at most n/2^i
-by picking i = 3logn, we have that the probability that S(3logn) has at least one entry is at most:
  n/2^(3logn) = n/n^3 = 1/n^2
-thus a skip list with n entries has height at most 3logn with probability at least 1 - 1/n^2
search and update times in skip list
-the search time in a skip list is proportional to:
  -the number of drop-down steps, plus
  -the number of scan-forward steps
-the drop-down steps are bounded by the height of the skip list and thus are O(logn) with high probability
-to analyze the scan-forward steps, we use yet another probabilistic fact:
  -fact 4: the expected number of coin tosses required in order to get tails is 2
-when we scan forward in a list, the destination key does not belong to a higher list:
  -a scan-forward step is associated with a former coin toss that gave tails
-by fact 4, in each list the expected number of scan-forward steps is 2
-thus, the expected number of scan-forward steps is O(logn)
-we conclude that a search in a skip list takes O(logn) expected time
-the analysis of insertion and deletion gives similar results
space usage of skip list
-the space used by a skip list depends on the random bits used by each invocation of the insertion algorithm
-we use the following two basic probabilistic facts:
  -fact 1: the probability of getting i consecutive heads when flipping a coin is 1/2^i
  -fact 2: if each of n entries is present in a set with probability p, the expected size of the set is np
-consider a skip list with n entries:
  -by fact 1, we insert an entry in list Si with probability 1/2^i
  -by fact 2, the expected size of the list Si is n/2^i
-the expected number of nodes used by the skip list is the sum over all levels i of n/2^i, which is less than 2n (equation in slides, skip lists slide 8)
-thus, the expected space usage of a skip list with n items is O(n)
worst case runtime of quick sort
-the worst case for quick-sort occurs when the pivot is the unique minimum or maximum element
-one of L and G has size n-1 and the other has size 0
-the running time is proportional to the sum n + (n-1) + ... + 2 + 1
-thus the worst-case running time of quick-sort is O(n^2)
insertion into a skip list
-to insert an entry (x, o) into a skip list, we use a randomized algorithm:
  -we repeatedly toss a coin until we get tails, and we denote with i the number of times the coin came up heads
  -if i >= h, we add to the skip list new lists Sh+1, ..., Si+1, each containing only the two special keys
  -we search for x in the skip list and find the positions p0, p1, ..., pi of the items with the largest key less than x in each list S0, S1, ..., Si
  -for j = 0, ..., i, we insert item (x, o) into list Sj after position pj
insertion BST
-to perform operation put(k,o) , we search for key k (using TreeSearch) -assume k is not already in the tree, and let w be the leaf reached by the search -we insert k at node w and expand w into an internal node
deletion BST
-to perform operation remove(k), we search for key k
-assume key k is in the tree and let v be the node storing k
-if node v has a leaf child w, we remove v and w from the tree with operation removeExternal(w), which removes w and its parent
-we consider the case where the key k to be removed is stored at a node v whose children are both internal:
  -we find the internal node w that follows v in an inorder traversal
  -we copy key(w) into node v
  -we remove node w and its left child z (which must be a leaf) by means of operation removeExternal(z)
deletion in skip list
-to remove an entry with key x from a skip list, we proceed as follows:
  -we search for x in the skip list and find the positions p0, p1, ..., pi of the items with key x, where position pj is in the list Sj
  -we remove positions p0, p1, ..., pi from the lists S0, S1, ..., Si
  -we remove all but one list containing only the two special keys
Searching BST
-to search for a key k, we trace a downward path starting at the root -the next node visited depends on the comparison of k with the key of the current node -if we reach a leaf, the key is not found -the algorithms for the nearest neighbor queries are similar
array-based queue
-use array of size N in a circular fashion -two variables keep track of the front and size (f and sz) -when the queue has fewer than N elements, array location r = (f + sz) mod N is the first empty slot past the rear of the queue
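The f/sz bookkeeping above can be sketched as a small class (the name `ArrayQueue` is illustrative); note how both enqueue and dequeue wrap indices with mod:

```java
// Circular array-based queue: f is the front index, sz the element count,
// and (f + sz) mod N is the first empty slot past the rear.
public class ArrayQueue {
    private final Object[] data;
    private int f = 0, sz = 0;

    public ArrayQueue(int capacity) { data = new Object[capacity]; }

    public int size() { return sz; }
    public boolean isEmpty() { return sz == 0; }

    public void enqueue(Object e) {
        if (sz == data.length) throw new IllegalStateException("queue is full");
        data[(f + sz) % data.length] = e;   // place at the rear slot
        sz++;
    }

    public Object first() { return isEmpty() ? null : data[f]; }

    public Object dequeue() {
        if (isEmpty()) return null;
        Object e = data[f];
        data[f] = null;                     // help garbage collection
        f = (f + 1) % data.length;          // advance front circularly
        sz--;
        return e;
    }
}
```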
applications of queues
-waiting lists -access to shared resource(eg: printer) -multiprogramming -round robin scheduler by enqueuing the element that was just dequeued each time indirect applications: -auxiliary data structure for algorithms -component of other data structures
applications of singly linked list
-waiting lists(add new items to the end of the LL) -bucket for collisions in hashmaps
multi-way inorder traversal
-we can extend the notion of inorder traversal from binary trees to multi-way search trees
-namely, we visit item (ki, oi) of node v between the recursive traversals of the subtrees of v rooted at children vi and vi+1
set implementation
-we can implement a set with a list -this would take O(n) space
implementation of skip lists
-we can implement a skip list with quad-nodes
-a quad-node stores:
  -entry
  -link to the node prev
  -link to the node next
  -link to the node below
  -link to the node above
-also, we define special keys PLUS_INF and MINUS_INF, and we modify the key comparator to handle them
overflow and split
-we handle an overflow at a 5-node v with a split operation:
  -let v1 ... v5 be the children of v and k1 ... k4 be the keys of v
  -node v is replaced by nodes v' and v":
    -v' is a 3-node with keys k1 k2 and children v1 v2 v3
    -v" is a 2-node with key k4 and children v4 v5
  -key k3 is inserted into the parent u of v (a new root may be created)
-the overflow may propagate to the parent node u
-we perform O(logn) splits
insertion in (2,4) tree
-we insert a new item (k, o) at the parent v of the leaf reached by searching for k
-we preserve the depth property, but
-we may cause an overflow (ie, node v may become a 5-node)
-runs in O(logn) time: locating the node takes O(logn) time, and we perform O(logn) splits
polynomial accumulation
-we partition the bits of the key into a sequence of components of fixed length (eg 8, 16, or 32 bits): a0 a1 ... an-1
-we evaluate the polynomial p(z) = a0 + a1*z + a2*z^2 + ... + an-1*z^(n-1) at a fixed value z, ignoring overflows
-especially suitable for strings (eg, the choice z = 33 gives at most 6 collisions on a set of 50,000 English words)
-polynomial p(z) can be evaluated in O(n) time using Horner's rule:
  -the following polynomials are successively computed, each from the previous one in O(1) time:
    p0(z) = an-1
    pi(z) = an-i-1 + z*pi-1(z)   (i = 1, 2, ..., n-1)
  -we have p(z) = pn-1(z)
(slide 9 of HashTables)
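Horner's rule as described above amounts to one multiply-add per component. A minimal sketch over a string's characters (the class name is illustrative; overflow wraps silently, which is intentional here):

```java
// Polynomial hash code evaluated with Horner's rule: each step computes
// p_i(z) = a_{n-i-1} + z * p_{i-1}(z), so the whole hash is one pass.
public class PolyHash {
    public static int hash(String s, int z) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * z + s.charAt(i);   // one O(1) Horner step per character
        }
        return h;
    }
}
```

With z = 33, hash("ab", 33) is 'a'*33 + 'b' = 97*33 + 98 = 3299.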
Component sum
-we partition the bits of the key into components of fixed length (eg 16 or 32 bits) and we sum the components (ignoring overflows)
-suitable for numeric keys of fixed length greater than or equal to the number of bits of the integer type (eg long and double in Java)
deletion in (2,4) tree
-we reduce deletion of an entry to the case where the item is at a node with leaf children
-otherwise, we replace the entry with its inorder successor (or, equivalently, with its inorder predecessor) and delete the latter entry
-deletion also takes O(logn) time: O(logn) to find the node, plus the fusion operations, each of which takes O(1) time
Integer cast
-we reinterpret the bits of the key as an integer -suitable for keys of length less than or equal to the number of bits of the integer type(eg. byte, short, int, and float in Java)
Improved times of location aware heap
Improved times thanks to location-aware entries (highlighted in red in the slides):

Method        | Unsorted List | Sorted List | Heap
size, isEmpty | O(1)          | O(1)        | O(1)
insert        | O(1)          | O(n)        | O(log n)
min           | O(n)          | O(1)        | O(1)
removeMin     | O(n)          | O(1)        | O(log n)
remove        | O(1)          | O(1)        | O(log n)
replaceKey    | O(1)          | O(n)        | O(log n)
replaceValue  | O(1)          | O(1)        | O(1)

(good view in slides, PriorityQueues 10)
iterator as for-each loop
Iterator<ElementType> iter = collection.iterator();
while (iter.hasNext()) {
    ElementType variable = iter.next();
    loopBody
}
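The while-loop above is exactly what Java's for-each syntax compiles to: it implicitly calls iterator(), hasNext(), and next(). A small runnable sketch (the class and method names are illustrative):

```java
import java.util.List;

// Summing a collection with for-each; the compiler inserts the
// iterator()/hasNext()/next() calls shown in the explicit loop above.
public class ForEachDemo {
    public static int sum(List<Integer> collection) {
        int total = 0;
        for (Integer variable : collection) {   // implicit iterator
            total += variable;
        }
        return total;
    }
}
```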
how nodes of binary tree are stored in an array(A)
Node v is stored at A[rank(v)]
-rank(root) = 0
-if node is the left child of parent(node), rank(node) = 2 * rank(parent(node)) + 1
-if node is the right child of parent(node), rank(node) = 2 * rank(parent(node)) + 2
expected runtime of quick sort
-the expected running time is O(nlogn)
-good call: the sizes of L and G are each less than 3s/4, where s is the size of the sequence
-bad call: one of L and G has a size greater than 3s/4
-a call is good with probability 1/2:
  -1/2 of the possible pivots cause good calls
-probabilistic fact: the expected number of coin tosses required in order to get k heads is 2k
-for a node of depth i, we expect:
  -i/2 ancestors are good calls
  -the size of the input sequence for the current call is at most (3/4)^(i/2) n
-for a node of depth 2log(base 4/3) n, the expected input size is one
-the expected height of the quick-sort tree is O(logn)
-the amount of work done at the nodes of the same depth is O(n)
-thus the expected running time of quick-sort is O(nlogn)
Big-Theta Notation of f(n)
There are constants c' > 0 and c'' > 0 and an integer constant n0 >= 1 such that c'g(n) <= f(n) <= c''g(n) for n >= n0 -i.e., f(n) is asymptotically equal to g(n)
array-based heap implementation
We can represent a heap with n keys by means of an array of length n For the node at rank i: -the left child is at rank 2i + 1 -the right child is at rank 2i + 2 Links between nodes are not explicitly stored Operation add corresponds to inserting at rank n (the first free cell) Operation remove_min corresponds to removing the entry at rank n - 1 (after swapping it with the entry at rank 0) Yields in-place heap-sort
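A minimal sketch of such an array-based min-heap, assuming integer keys and using java.util.ArrayList as the backing array (the class name is made up):

```java
import java.util.ArrayList;

// Array-based min-heap sketch: the children of rank i sit at ranks
// 2i+1 and 2i+2, so no parent/child links need to be stored.
public class ArrayHeap {
    private final ArrayList<Integer> a = new ArrayList<>();

    public void add(int key) {               // insert at the first free rank, then up-heap
        a.add(key);
        int i = a.size() - 1;
        while (i > 0 && a.get((i - 1) / 2) > a.get(i)) {
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }

    public int removeMin() {                 // swap root with the last rank, remove, down-heap
        int min = a.get(0);
        swap(0, a.size() - 1);
        a.remove(a.size() - 1);
        int i = 0;
        while (2 * i + 1 < a.size()) {
            int c = 2 * i + 1;               // index of the smaller child
            if (c + 1 < a.size() && a.get(c + 1) < a.get(c)) c++;
            if (a.get(i) <= a.get(c)) break; // heap-order restored
            swap(i, c);
            i = c;
        }
        return min;
    }

    private void swap(int i, int j) {
        int t = a.get(i); a.set(i, a.get(j)); a.set(j, t);
    }
}
```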
merge-sort
a sorting algorithm based on the divide-and-conquer paradigm -like heap-sort: -it has O(nlogn) running time -unlike heap-sort: -it does not use an auxiliary priority queue -it accesses data in a sequential manner (suitable to sort data on a disk) Merge-sort on an input sequence S with n elements consists of three steps: Divide: partition S into two sequences S1 and S2 of about n/2 elements each Recur: recursively sort S1 and S2 Conquer: merge S1 and S2 into a unique sorted sequence
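The three steps can be sketched as follows (a simple out-of-place version on int arrays; the class name is illustrative):

```java
import java.util.Arrays;

// Divide-and-conquer merge-sort sketch on an int array.
public class MergeSortDemo {
    public static int[] mergeSort(int[] s) {
        if (s.length <= 1) return s;                             // base case
        int mid = s.length / 2;
        int[] s1 = mergeSort(Arrays.copyOfRange(s, 0, mid));     // divide + recur
        int[] s2 = mergeSort(Arrays.copyOfRange(s, mid, s.length));
        return merge(s1, s2);                                    // conquer
    }

    // merge two sorted runs into one sorted output
    private static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length)
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }
}
```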
Trick to solving Big-Oh Notation
add up the coefficients of the terms and use that sum as your c, then choose the lowest n0 that works
compare(x, y) returns
compare(x, y): returns an integer i such that i < 0 if x < y, i = 0 if x = y, i > 0 if x > y An error occurs if x and y cannot be compared.
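A sketch of a comparator obeying this contract, comparing strings by length (the class name is made up for illustration):

```java
import java.util.Comparator;

// Comparator following the compare(x, y) contract: negative, zero, or
// positive according to whether x < y, x = y, or x > y in this order.
public class LengthComparator implements Comparator<String> {
    @Override
    public int compare(String x, String y) {
        return Integer.compare(x.length(), y.length());
    }
}
```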
expected running time of quick select
consider a recursive call of quick-select on a sequence of size s -good call: the sizes of L and G are each less than 3s/4 -bad call: one of L and G has a size greater than 3s/4 -a call is good with probability 1/2: -1/2 of the possible pivots cause a good call -the expected running time is O(n)
hash table
consists of two things for a given key: -a hash function h -an array (called a table) of size N -when implementing a map with a hash table, the goal is to store item (k,o) at index i = h(k)
primitive operations and their runtimes
constant runtime -evaluating an expression -assigning a value to a variable -indexing into an array -calling a method -returning from a method
compression functions
division: -h2(y) = y mod N -the size N of the hash table is usually chosen to be a prime multiply, add, and divide (MAD): -h2(y) = (ay + b) mod N -a and b are nonnegative integers such that a mod N != 0 -otherwise, every integer would map to the same value b
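Both compression functions can be sketched directly (N, a, and b are caller-supplied example values; Math.floorMod keeps results in [0, N-1] even for negative hash codes):

```java
// Sketch of the two compression functions from the card.
public class Compression {
    public static int divisionCompress(int y, int N) {
        return Math.floorMod(y, N);          // y mod N, always nonnegative
    }

    public static int madCompress(int y, int a, int b, int N) {
        return Math.floorMod(a * y + b, N);  // requires a mod N != 0
    }
}
```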
declaration of array
elementType[] arrayName = {initialValue0, ..., initialValueN-1}; or elementType[] arrayName = new elementType[N];
applications of postorder traversal
evaluating arithmetic expressions: -recursive method returning the value of a subtree -when visiting an internal node, combine the values of its subtrees
selection problem
given an integer k and n elements x1,x2, ... ,xn taken from a total order, find the k-th smallest element in this set -of course, we can sort the set in O(nlogn) time and then index the k-th element, but is there a faster way?
Big-Oh Notation of f(n)
given f(n), we say that f(n) is O(g(n)) for some function g(n) if there are positive constants c and n0 such that: f(n) <= cg(n) for n >= n0 -this is like an upper bound for the algorithm runtime -i.e., this will give you the worst-case runtime
height of heap theorem
height of a heap storing n elements is O(log n) Let h be the height of a heap storing n keys Since there are 2^i keys at depth i = 0, ..., h - 1 and at least one key at depth h, we have n >= 1 + 2 + 4 + ... + 2^(h-1) + 1 Thus, n >= 2^h, i.e., h <= log n
in place quick sort
in-place: no external data structure -in the partition step, we use replace operations to rearrange the elements of the input sequence such that: -the elements less than the pivot have rank less than h -the elements equal to the pivot have rank between h and k -the elements greater than the pivot have rank greater than k -the recursive calls consider: -elements with rank less than h -elements with rank greater than k
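One way to realize this is a three-way (Dutch-national-flag style) partition; this is a sketch, not necessarily the exact in-class version:

```java
// In-place quick-sort sketch with a three-way partition: afterwards,
// ranks < h hold elements less than the pivot, ranks h..k hold elements
// equal to it, and ranks > k hold greater elements.
public class InPlaceQuickSort {
    public static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[hi];
        int h = lo, k = hi, i = lo;
        while (i <= k) {
            if (a[i] < pivot)      swap(a, i++, h++);  // grow the "less" region
            else if (a[i] > pivot) swap(a, i, k--);    // grow the "greater" region
            else                   i++;                // equal to pivot: leave in place
        }
        sort(a, lo, h - 1);   // recurse on ranks below h
        sort(a, k + 1, hi);   // recurse on ranks above k
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}
```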
inOrder traversal
inOrder(v)
  if left(v) != null
    inOrder(left(v))
  visit(v)
  if right(v) != null
    inOrder(right(v))
bucket-sort
let S be a sequence of n (key, element) items with keys in the range [0, N-1] -bucket-sort uses the keys as indices into an auxiliary array B of sequences (buckets) -phase 1: empty sequence S by moving each entry (k,o) into bucket B[k] -phase 2: for i = 0, ..., N-1, move the entries of bucket B[i] to the end of sequence S
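The two phases can be sketched as follows (assuming, for simplicity, that the keys themselves are the elements; the class name is illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Bucket-sort sketch: keys must be integers in [0, N-1] and are used
// directly as indices into the bucket array B.
public class BucketSortDemo {
    public static List<Integer> bucketSort(List<Integer> s, int N) {
        List<List<Integer>> b = new ArrayList<>();
        for (int i = 0; i < N; i++) b.add(new ArrayList<>());
        for (int key : s) b.get(key).add(key);             // phase 1: scatter into buckets
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < N; i++) out.addAll(b.get(i));  // phase 2: gather in key order
        return out;
    }
}
```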
lexicographic sort
lexicographic sort sorts a sequence of d-tuples in lexicographic order by executing the algorithm stableSort d times, once per dimension -MUST sort from dimension d down to 1; going from the front would not produce the correct order -runs in O(dT(n)) time, where T(n) is the running time of stableSort
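A sketch for 2-tuples, relying on the fact that java.util.Arrays.sort on object arrays is stable (the class name is made up):

```java
import java.util.Arrays;
import java.util.Comparator;

// Lexicographic-sort sketch for 2-tuples: run a stable sort once per
// dimension, from the LAST dimension back to the first.
public class LexSortDemo {
    public static void lexSort(int[][] tuples) {
        // Arrays.sort on an object array is stable, as the algorithm requires
        Arrays.sort(tuples, Comparator.comparingInt((int[] t) -> t[1])); // dimension 2
        Arrays.sort(tuples, Comparator.comparingInt((int[] t) -> t[0])); // dimension 1
    }
}
```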
arrayMax() growth rate
linear
interface
main structural element in Java that enforces an application programming interface (API) -collection of method declarations with no data and no bodies -do not have constructors -cannot be directly instantiated -when a class implements an interface, it must implement all of the methods declared
hash function h
maps keys of a given type to integers in a fixed interval [0, N-1] -a hash function uses a hash code and then a compression function -the hash code turns the key into an integer -the compression function turns the integer into an integer that is within [0, N-1]
abstract data type (ADT)
model of a data structure that specifies the type of data stored, the operations supported on them, and the types of parameters of the operations -an ADT specifies what each operation does, but not how it does it -cannot be instantiated but can define general methods
analysis of bucket sort
phase 1 takes O(n) time phase 2 takes O(n + N) time bucket-sort takes O(n + N) time -keys are used as indices and cannot be arbitrary objects -no external comparator stable-sort property: the relative order of any two items with the same key is preserved after the execution of the algorithm
postorder traversal
postOrder(v)
  for each child w of v
    postOrder(w)
  visit(v)
preorder traversal
preOrder(v)
  visit(v)
  for each child w of v
    preOrder(w)
inorder traversal applications
printing arithmetic expressions: -print operand or operator when visiting a node -print "(" before traversing the left subtree -print ")" after traversing the right subtree
recursion
review the English-ruler and puzzle-solving examples in the slides
Big-Omega Notation of f(n)
same as Big-Oh, except f(n) >= cg(n) -this is more like a lower bound for the algorithm runtime -i.e., this will give you the best-case runtime
runtime of algorithms
selection-sort on an unsorted sequence: O(n^2) insertion-sort on an already-sorted sequence: O(n) (its best case; O(n^2) in the worst case)
set, multiset, multimap
set - an unordered collection of elements without duplicates that typically supports efficient membership tests: -elements of a set are like keys of a map but without any auxiliary values multiset (aka bag) - a set-like container that allows duplicates multimap - similar to a traditional map, in that it associates values with keys; however, in a multimap the same key can be mapped to multiple values: -for example, the index of a book maps a term to one or more locations at which the term occurs
Big-Oh and Growth Rate
slide 21 in chapter 4
position instance
supports only the following method: p.getElement() - returns the element stored at position p -a position is unaffected by changes elsewhere in the list -the only way in which a position becomes invalid is if an explicit command is issued to delete it
search with linear probing
suppose we have a hash table A that uses linear probing -get(k): -we start at cell h(k) -we probe consecutive locations until one of the following occurs: -an item with key k is found, or -an empty cell is found, or -N cells have been unsuccessfully probed
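A sketch of a linear-probing table with put(k,o) and get(k) (DEFUNCT handling from remove(k) is omitted for brevity; the class name is made up):

```java
// Linear-probing hash table sketch: probe consecutive cells starting
// at h(k) until the key, an empty cell, or N probed cells is reached.
public class ProbeTable {
    static class Entry {
        int key; Object value;
        Entry(int k, Object v) { key = k; value = v; }
    }

    private final Entry[] table;                 // null means empty cell
    public ProbeTable(int N) { table = new Entry[N]; }

    private int h(int k) { return Math.floorMod(k, table.length); }

    public void put(int k, Object v) {
        for (int j = 0; j < table.length; j++) {
            int i = (h(k) + j) % table.length;   // probe consecutive cells
            if (table[i] == null) { table[i] = new Entry(k, v); return; }
        }
        throw new IllegalStateException("table full");
    }

    public Object get(int k) {
        for (int j = 0; j < table.length; j++) {
            int i = (h(k) + j) % table.length;
            if (table[i] == null) return null;   // empty cell: key not present
            if (table[i].key == k) return table[i].value;
        }
        return null;                             // N cells unsuccessfully probed
    }
}
```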
hash value
the value of a hash function h(k)
binary tree
tree with two properties: -each node has at most two children (exactly two for a proper binary tree) -the children of a node are an ordered pair -these trees have left and right children recursive definition, either a: -tree consisting of a single node, or -tree whose root has an ordered pair of children, each of which is a binary tree
partition of quick sort
we partition an input sequence as in the quick-sort algorithm: -we remove in turn each element y from S and -we insert y into L, E, or G, depending on the result of the comparison with the pivot x -each insertion and removal is at the beginning or at the end of a sequence, and hence takes O(1) time -thus, the partition step of quick-select takes O(n) time
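The partition step can be sketched as follows, returning the three sequences L, E, and G (names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Partition sketch shared by quick-select and quick-sort: split S into
// L (< pivot), E (= pivot), and G (> pivot). Each element is appended
// to the end of one list in O(1), so the whole step is O(n).
public class PartitionDemo {
    public static List<List<Integer>> partition(List<Integer> s, int x) {
        List<Integer> l = new ArrayList<>(), e = new ArrayList<>(), g = new ArrayList<>();
        for (int y : s) {
            if (y < x)       l.add(y);
            else if (y == x) e.add(y);
            else             g.add(y);
        }
        return List.of(l, e, g);
    }
}
```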
searching in a skip list
we search for a key x in a skip list as follows: -we start at the first position on the top list -at the current position p, we compare x with y=key(next(p)): - x=y we return element(next(p)) - x > y we "scan forward" - x < y we "drop down" to the next list -if we try to drop down past the bottom list we return null because the key is not in the list
casting rules
you cannot narrow cast an object into another class that implements the same parent (eg: you cannot cast RaceHorse as (FarmHorse) even though they are both subclasses of the class Horse)