Test 2 CSC 221

Perfect Balance

- Could require rebuilding a complete tree after every operation, which is too expensive to maintain

AVL Trees

- Height-balanced binary search trees
- Used to bound the worst-case performance of binary search trees to O(log N)
- They are always balanced
- A balance factor is calculated at every node; the heights of the left and right subtrees can differ by no more than 1

Trie

- Ordered tree data structure used to store an associative array where the keys are usually strings
- Specialized tree for word searches, spell checking, spelling correction, and predictive typing
- No node in the trie stores the key associated with that node
- The node's position in the tree shows its key
- Descendants of any node have a common prefix
- Values are normally not associated with every node

Map implementation

- Use a binary search tree or a hash table as the internal storage container
- In a HashMap implementation, keys are not in sorted order
- Tree map implementation: an AVL tree is used to implement the map, which stores keys in sorted order

Trie implementation

2D array, linked list, or binary search tree

Maximum Number of Nodes in a Perfect Binary Tree

2^(h+1) - 1, where h is the height of the tree

Full binary tree

A binary tree in which each node has exactly 0 or 2 children

Complete binary tree

A binary tree in which every level, except possibly the deepest, is completely filled. At depth n, the height of the tree, all nodes are as far left as possible.

Perfect Binary tree

A binary tree with all leaf nodes at the same depth. All internal nodes have exactly two children. Has 2^(n+1) - 1 nodes, where n is the height of the tree.

Binary Heap

A complete binary tree with a structure property and an ordering property. REMEMBER: On bottom level, each node is filled from left to right

Imperfect Hash function

A hash function which doesn't have a one-to-one mapping of keys to hash values

Priority Queue

A queue that allows line jumping. A collection of data with each item having a priority.

Binary Search Tree

A tree where each node has at most 2 children, every node's left subtree holds values less than the node's value, and every node's right subtree holds values greater than the node's value.
- An in-order traversal will provide the elements in ascending order
- Average case of O(log N) for add, access, and remove
- Worst case of O(N) per operation, when the tree degenerates into a linked list
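
The ordering rule and the two cost cases can be illustrated with a minimal sketch. The names `Node`, `insert`, `in_order`, `height`, and `build` are hypothetical helpers chosen for this example, and duplicates are simply ignored, which is an assumption rather than part of the definition.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value into the BST rooted at root and return the root."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root  # equal values are ignored (assumption for this sketch)


def in_order(root):
    """Left subtree, then root, then right subtree."""
    if root is None:
        return []
    return in_order(root.left) + [root.value] + in_order(root.right)


def height(root):
    """Height in edges; an empty tree has height -1."""
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))


def build(values):
    root = None
    for v in values:
        root = insert(root, v)
    return root
```

Inserting already-sorted values produces a chain whose height is N - 1, which is where the O(N) worst case comes from.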

Height balanced

A tree where the difference between the heights of the left and right subtrees is not more than 1

Binary tree

A tree with at most 2 children for each node

Tree

Abstract data type
- The only entry point is the root
- Nodes are leaves or internal nodes

Shifting

An alternative to folding. Uses <<, which shifts the bits of a value left by a given amount. This is a good (but more complicated) alternative to folding because the order of the letters affects the result: for example, "God" and "Dog" get different hash values, whereas simply adding a value for each letter gives both words the same hash.
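
A small sketch of the difference, assuming character codes are used as the per-letter values and a shift of 4 bits (both are illustrative choices, and `additive_hash` and `shift_hash` are hypothetical names):

```python
def additive_hash(key):
    """Folding by addition: letter order does not affect the result."""
    return sum(ord(ch) for ch in key)


def shift_hash(key, shift=4):
    """Shift the running value left before adding each letter,
    so the position of each letter matters."""
    h = 0
    for ch in key:
        h = (h << shift) + ord(ch)
    return h
```

With these definitions, `additive_hash("God")` and `additive_hash("Dog")` are both 282, while `shift_hash` gives 20052 and 19287 respectively.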

Overcrowding when Hashing

An insert using probing techniques cannot work with a load factor of 1 or more.
- Quadratic probing can fail if λ > ½
- Linear probing and double hashing are slow if λ > ½
- Separate chaining becomes slow once λ > 1
To relieve the pressure on the hash table, REHASH.

Ancestors

Any node for which this node is a descendant

Descendants of a node

Any nodes that can be reached by following one or more edges downward from this node

Implementation of Binary Heap

Can be represented using an array. This is better than using pointers because it takes up less space, multiplying and dividing by 2 are easy to do, and there is a simple way to find parents from children and vice versa
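
The parent/child arithmetic can be sketched as follows. This version assumes 0-based array indices; a 1-based layout (root at index 1) would use `i // 2`, `2 * i`, and `2 * i + 1` instead. The helper names are hypothetical.

```python
def parent(i):
    return (i - 1) // 2


def left(i):
    return 2 * i + 1


def right(i):
    return 2 * i + 2


def is_min_heap(a):
    """Check the heap order property for an array-based min heap."""
    return all(a[parent(i)] <= a[i] for i in range(1, len(a)))
```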

How to Rehash

Create a larger hash table, then hash each of the current values again into the larger table (indices must be recomputed because the table size has changed)
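
A minimal sketch of rehashing, assuming a separate-chaining table stored as a list of lists of `(key, value)` pairs and Python's built-in `hash` as the hash function (both are assumptions for illustration; `rehash` is a hypothetical name):

```python
def rehash(buckets, new_size):
    """Return a new table of new_size buckets holding the same entries."""
    new_buckets = [[] for _ in range(new_size)]
    for chain in buckets:
        for key, value in chain:
            # The index is recomputed with the NEW table size.
            new_buckets[hash(key) % new_size].append((key, value))
    return new_buckets


# Keys 3, 8 and 13 all collide in a table of size 5 ...
old = [[] for _ in range(5)]
for k, v in [(3, "a"), (8, "b"), (13, "c")]:
    old[hash(k) % 5].append((k, v))

# ... and spread out after rehashing into a table of size 11.
new = rehash(old, 11)
```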

Chaining

Each element of the hash table can be another data structure.
- Linked list, balanced binary tree
- More space, but somewhat easier
- Everything goes in its spot
Resize at a given load factor or when any chain reaches some size limit.

Properties of a trie

- Each internal node has between 1 and k children, where k is the size of the alphabet
- Each link of the tree has a matching character
- Each leaf node corresponds to a stored word, which can be collected on the path from the root to that node
- Uses O(N) space
- Search/insert/delete in O(d × m) time, where d is the length of the parameter string and m is the size of the alphabet

Heap operations

- findMin
- insert(val): percolate up
- deleteMin: percolate down

Problem with Linked List

Finding an item takes O(N). Using a balanced binary search tree reduces access to O(log N).

Ways to Build Heaps

Floyd's method: Add all elements in arbitrary order to form a complete tree. Next, fix the heap order property by percolating nodes up or down as necessary. The heap can be constructed in O(N) time.
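
One common way to realize this idea is to percolate down every internal node, starting from the last one and working back to the root. The sketch below assumes a min heap stored in a 0-based array; `percolate_down` and `build_heap` are hypothetical names.

```python
def percolate_down(a, i, n):
    """Move a[i] down until both children are >= it (min heap)."""
    while True:
        smallest = i
        l, r = 2 * i + 1, 2 * i + 2
        if l < n and a[l] < a[smallest]:
            smallest = l
        if r < n and a[r] < a[smallest]:
            smallest = r
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest


def build_heap(a):
    """Rearrange the list a in place into a min heap and return it."""
    n = len(a)
    # Leaves are already heaps, so start at the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        percolate_down(a, i, n)
    return a
```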

Load factor in double hashing

- For any λ < 1, double hashing will find an empty slot (given an appropriate table size and hash2)
- Search cost approaches optimal
- Costly as λ nears 1

Load Factor in Linear Probing

- For any λ < 1, linear probing will find an empty slot
- Performance degrades quickly for any λ > ½

Binary Heap order property

For every non-root node X, the value in the parent of X is less than or equal to the value in X

Tries Applications

- Full text search
- Storing word lists
- Search engine indices
- Kept in memory, therefore fast
- Biological applications (DNA, genome sequencing)
- Game applications (Boggle)

Internal node

Has 1 or more children. Is the parent of its child nodes.

Leaf

Has no children

Ways to Use Priority Queues and Heaps

Heapsort: Add all items to a heap, then remove them one at a time in sorted order.
To find the median: Add N elements to an array. Apply the build heap algorithm to the array. Then perform ⌈N/2⌉ deleteMin operations. The last item extracted from the heap is the median.
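
Both uses can be sketched with Python's standard `heapq` module, which provides an array-based min heap. The function names are hypothetical, and for an even number of items this version returns the lower of the two middle values, which is an assumption about how "median" is defined.

```python
import heapq


def heapsort(items):
    """Build a heap, then remove items one at a time in sorted order."""
    heap = list(items)
    heapq.heapify(heap)  # linear-time build heap
    return [heapq.heappop(heap) for _ in range(len(heap))]


def median_by_heap(items):
    """Perform ceil(N/2) deleteMin operations; the last one is the median."""
    heap = list(items)
    heapq.heapify(heap)
    result = None
    for _ in range((len(heap) + 1) // 2):
        result = heapq.heappop(heap)
    return result
```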

Balance factor

Height of left subtree - height of right subtree
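
A minimal sketch of computing the balance factor, assuming an empty subtree has height -1 so that a leaf has height 0. `Node`, `height`, `balance_factor`, and `is_height_balanced` are hypothetical names.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def height(node):
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    """Height of left subtree minus height of right subtree."""
    return height(node.left) - height(node.right)


def is_height_balanced(node):
    """True when every node's balance factor is -1, 0, or 1."""
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_height_balanced(node.left)
            and is_height_balanced(node.right))
```

This recomputes heights on every call for clarity; an AVL implementation would typically cache the height in each node instead.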

Insert in Heap (Percolate Up)

Idea: Put val at the next available leaf position, then percolate up by repeatedly exchanging the node with its parent until no longer needed
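
The idea can be sketched for a min heap stored in a 0-based Python list, where the next available leaf position is simply the end of the list. `heap_insert` is a hypothetical name.

```python
def heap_insert(heap, val):
    """Append val as the next leaf, then percolate it up."""
    heap.append(val)
    i = len(heap) - 1
    while i > 0:
        p = (i - 1) // 2          # index of the parent
        if heap[p] <= heap[i]:
            break                  # heap order holds, stop
        heap[p], heap[i] = heap[i], heap[p]
        i = p
```

Inserting 5, 3, 8, 1 in that order into an empty list leaves it as `[1, 3, 8, 5]`.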

Probing

If a bucket isn't empty, search forward or backward for an open space.
Linear probing:
- Move forward one spot, check; next spot, check; and so on
- When deleting/removing, leave a marker: null if the slot was never occupied, blank if it was once occupied
Quadratic probing:
- Check 1 spot past the home slot, then 4, then 9, then 16 (offsets of i^2)
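
A rough sketch of linear probing with the two kinds of empty marker. It assumes a fixed-size table with no resizing, Python's built-in `hash`, and sentinel objects named `EMPTY` (never occupied) and `DELETED` (once occupied); all of these names are hypothetical.

```python
EMPTY = object()    # slot was never occupied
DELETED = object()  # slot was occupied, then removed


class LinearProbingSet:
    def __init__(self, size=11):
        self.slots = [EMPTY] * size

    def _probe(self, key):
        start = hash(key) % len(self.slots)
        for i in range(len(self.slots)):
            yield (start + i) % len(self.slots)

    def add(self, key):
        first_free = None
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                if first_free is None:
                    first_free = idx
                break
            if slot is DELETED:
                if first_free is None:
                    first_free = idx  # reusable, but keep looking for key
            elif slot == key:
                return  # already present
        if first_free is None:
            raise RuntimeError("table is full")
        self.slots[first_free] = key

    def contains(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return False  # a never-occupied slot ends the search
            if slot is not DELETED and slot == key:
                return True
        return False

    def remove(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is EMPTY:
                return
            if slot is not DELETED and slot == key:
                self.slots[idx] = DELETED
                return
```

The `DELETED` marker matters because a search stops at the first never-occupied slot; without it, removing one key could hide keys stored further along the same cluster.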

Load factor in quadratic probing

If table size is prime and λ ≤ ½, quadratic probing will find an empty slot; for greater λ, may not
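
The reason behind the λ ≤ ½ condition can be seen by listing which slots a quadratic probe sequence can reach. The helper below is a hypothetical illustration using the probe function f(i) = i^2.

```python
def quadratic_probe_sequence(home, table_size):
    """Slots visited by quadratic probing: (home + i^2) mod table_size."""
    return [(home + i * i) % table_size for i in range(table_size)]
```

For a table of size 7 and home slot 0, the sequence is `[0, 1, 4, 2, 2, 4, 1]`: only 4 of the 7 slots are ever visited. The first 4 probes are distinct, so an empty slot is found while at most half the table is full, but slots 3, 5, and 6 are never reached, so an insert can fail once the table is fuller than that.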

Priority Queue Operations

insert and deleteMin (min is an arbitrary choice; a max version works the same way)

Word matching tree

Insert words into a trie. Each leaf stores the occurrences of its word in the text.

Mapping

- Works for integer keys, or keys that can easily be converted to integer values
- Transforms the hashed key value into a legal index in the hash table
- Place the item by taking the result of the hash function on the key, then taking the remainder of dividing that result by the size of the table
- Prime table sizes work best
- The hash table uses an array as its underlying storage container (the array is like a series of buckets)
- See the example under Hash Functions

Map as a Dictionary

- The key is a word; each key is unique
- The value is the definition
- Together, they form a pair stored in the map
- One value may be associated with many keys (i.e., synonyms)

Formula for Leaves

L = n(k-1) + 1, where L = number of leaves, n = number of internal nodes, and k = number of children per internal node in a full k-ary tree

Collision resolution function (Probing)

Linear: f(i) = i. Quadratic: f(i) = i^2.

Advantages of tries

- Looking up keys is faster: O(m) to find a key of length m vs. O(log N) for a BST with N elements
- Lookups depend on the depth (length) of the key
- Require less space when they contain a large number of short strings, because shared prefixes are stored once
- Faster in the worst case, O(m), than an imperfect hash table, which may have collisions; there are no collisions in a trie

Quadratic probing

Main idea: Spread out the search for an empty slot by incrementing by i^2 instead of i

Calculating Rolling/Running Medians

Maintain 2 heaps: a max heap and a min heap.
- Values <= the median are stored in the max heap
- Values >= the median are stored in the min heap
- Maintain balance: the number of values in the two heaps can differ by at most 1
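
A minimal sketch of the two-heap approach using the standard `heapq` module. Since `heapq` only provides a min heap, the max heap is simulated by storing negated values; this trick, the class name `RunningMedian`, and averaging the two middle values for an even count are all assumptions of this example.

```python
import heapq


class RunningMedian:
    def __init__(self):
        self.low = []   # max heap of the smaller half (values negated)
        self.high = []  # min heap of the larger half

    def add(self, x):
        if not self.low or x <= -self.low[0]:
            heapq.heappush(self.low, -x)
        else:
            heapq.heappush(self.high, x)
        # Rebalance so the sizes differ by at most 1.
        if len(self.low) > len(self.high) + 1:
            heapq.heappush(self.high, -heapq.heappop(self.low))
        elif len(self.high) > len(self.low) + 1:
            heapq.heappush(self.low, -heapq.heappop(self.high))

    def median(self):
        """Median of the values added so far (requires at least one)."""
        if len(self.low) > len(self.high):
            return -self.low[0]
        if len(self.high) > len(self.low):
            return self.high[0]
        return (-self.low[0] + self.high[0]) / 2
```

Each `add` costs O(log N) and each `median` call is O(1), which is what makes this suitable for a changing sequence of values.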

Hashing techniques

Mapping, folding, and shifting

Big O of priority Queue

- Unsorted list: O(1) to insert, O(n) to delete
- Sorted list: O(n) to insert, O(1) to delete

Hash Code Contract

Objects that are equal must have the same hash code within a running process.
Does not imply either of these misconceptions:
- Unequal objects will have different hash codes
- Objects with the same hash code must be equal
Leads to these guidelines:
- Whenever you implement equals, you should also implement hashCode
- Don't use hashCode directly as a key
- Don't use hashCode in distributed applications

Collisions

Occurs when two different keys hash to the same value
- Keys 18 and 35 both map to 1 with table size 17
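
The example can be checked directly, assuming the index is computed as the key modulo the table size (`index_for` is a hypothetical name):

```python
def index_for(key, table_size):
    """Map an integer key to a bucket index by taking the remainder."""
    return key % table_size


# 18 = 1 * 17 + 1 and 35 = 2 * 17 + 1, so both land in bucket 1.
collide = index_for(18, 17) == index_for(35, 17)
```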

Offline vs. Online Algorithms

Offline: Algorithms that compute a property of a static collection of values.
Online: Algorithms that compute some property of a changing sequence of values.

Folding

- Partition the key into several parts, then combine the integer values of the parts
- The parts may be hashed first
- Combine using addition, multiplication, or shifting

Depth of a node

Path length from the root to a specific node

Binary tree traversals

- Preorder traversal
- In order traversal
- Post order traversal
- Level order traversal
To determine the result, draw a path around the tree.
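
The four traversals can be sketched side by side. `Node` and the function names are hypothetical; level order uses a queue (`collections.deque`), while the other three are written recursively.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def preorder(n):
    return [] if n is None else [n.value] + preorder(n.left) + preorder(n.right)


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.value] + inorder(n.right)


def postorder(n):
    return [] if n is None else postorder(n.left) + postorder(n.right) + [n.value]


def level_order(root):
    result = []
    queue = deque([root] if root is not None else [])
    while queue:
        n = queue.popleft()
        result.append(n.value)
        if n.left is not None:
            queue.append(n.left)
        if n.right is not None:
            queue.append(n.right)
    return result


# Example tree:      4
#                  /   \
#                 2     6
#                / \   / \
#               1   3 5   7
tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
```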

Applications of PQueue

- Print jobs in order of decreasing length
- Select the most frequent symbols for compression

Methods to Resolve Collisions

Probing (closed hashing) and separate chaining (open hashing)

In order traversal

Process the LEFT subtree, then the ROOT, then the RIGHT subtree. Always gives the elements of a BST in increasing order.

Preorder traversal

Process ROOT, then sub trees from LEFT to RIGHT

Post order traversal

Process the LEFT subtree, then RIGHT subtree, then ROOT

Hashing with Chaining

Put a pointer at each entry.
- Choose the chain type as appropriate
- A common chain is an unordered linked list
Properties:
- Performance degrades with the length of the chains
- λ can be greater than 1
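
A minimal sketch of a chained table, assuming each chain is an unordered Python list of `(key, value)` pairs and that Python's built-in `hash` plays the role of the hash function. `ChainedHashMap` is a hypothetical name, and resizing is left out.

```python
class ChainedHashMap:
    def __init__(self, size=7):
        self.buckets = [[] for _ in range(size)]
        self.count = 0

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # replace the existing entry
                return
        chain.append((key, value))
        self.count += 1

    def get(self, key, default=None):
        for k, v in self._chain(key):
            if k == key:
                return v
        return default

    def load_factor(self):
        return self.count / len(self.buckets)
```

Because every bucket can hold any number of entries, storing 5 keys in a table of size 2 still works; the load factor is then 2.5 and lookups simply scan longer chains.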

Delete in Heap (Percolate Down)

- Remove the root
- Put the last leaf at the root
- Find the smallest child of the node
- Swap the node with its smallest child if needed
- Repeat the swap until no swaps are needed
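
These steps can be sketched for a min heap stored in a 0-based Python list, where the last leaf is the final list element. `delete_min` is a hypothetical name.

```python
def delete_min(heap):
    """Remove and return the smallest value from an array-based min heap."""
    if not heap:
        raise IndexError("delete_min from an empty heap")
    smallest = heap[0]
    last = heap.pop()            # remove the last leaf
    if heap:
        heap[0] = last           # put it at the root
        i, n = 0, len(heap)
        while True:
            l, r = 2 * i + 1, 2 * i + 2
            child = i
            if l < n and heap[l] < heap[child]:
                child = l
            if r < n and heap[r] < heap[child]:
                child = r
            if child == i:
                break            # no swap needed
            heap[i], heap[child] = heap[child], heap[i]
            i = child
    return smallest
```

Starting from `[1, 3, 2, 7, 4]`, one call returns 1 and leaves the list as `[2, 3, 4, 7]`.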

Load factor with separate chaining

Search cost:
- Unsuccessful search: the whole chain; the average chain length is λ
- Successful search: half a chain; the average is λ/2 + 1
Optimal load factor:
- Zero! But between ½ and 1 is fast and makes good use of memory

Maps (Map <K,V>)

Also called a search table, dictionary, or associative array.
- Data structure optimized for a specific kind of search/access
- Access by asking "Give me the value associated with this key"
- Keys are unique, but one value may be associated with multiple keys

Drawbacks of tries

- Slower in some cases than hash tables
- Not all keys are easy to represent as strings (numbers, for example)
- Tries are less space efficient than hash tables

Double Hashing (Probing)

Spread out the search for an empty slot by using a second hash function. Near optimal when the load factor is near ½.
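
A sketch of the probe sequence, assuming integer keys, a prime table size, and the second hash function h2(key) = 1 + (key mod (table_size - 1)). That particular h2 is one common textbook choice and is an assumption here, as is the function name.

```python
def double_hash_probes(key, table_size):
    """Slots visited, in order, for key under double hashing."""
    h1 = key % table_size
    h2 = 1 + (key % (table_size - 1))  # step size; never zero
    return [(h1 + i * h2) % table_size for i in range(table_size)]
```

For key 10 and table size 7 the sequence starts 3, 1, 6: the step size depends on the key, so keys that share a home slot usually follow different paths. Because the table size is prime, every step size visits all 7 slots before repeating.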

Level order traversal

Starting from the root of a tree, process all nodes at the same depth from left to right, then proceed to the nodes at the next depth

Height of AVL Trees

The height of an AVL tree storing n keys is O(log n)

Patricia Trie

Substitute a chain of one-child nodes with a single edge labeled with a unique string. Each non-leaf node except the root has at least 2 children.

Hash Functions

Takes a large piece of data and reduces it to a smaller piece of data, usually an integer. There are different types of hash functions. Ex: Take the 3rd letter of a name, divide its numeric value by 6, and take the remainder.
Normally a 2-step process:
- Transform the key into an integer value
- Map the integer into a valid index in the hash table
Locations in a hash table are often called buckets.

Edge

The link from one node to another

Height of a node

The maximum distance (path length) of any leaf from this node
- A leaf has a height of 0
- The "height of a tree" is the height of the root of the tree

Path length

The number of edges that must be traversed to get from one node to another

Search and Insertion in Tries

The search algorithm follows the path from the root towards a leaf, and can result in the word being found or not found. New string insertion checks whether the current character is present at the current level of the tree, starting from the root. If yes, it proceeds down the branch labeled with that character. If not, it inserts a new branch at that level.
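
Search and insertion as described can be sketched as follows. This version stores children in a dictionary keyed by character and marks the end of a word with an `is_word` flag so that one word may be a prefix of another; both are assumptions of this sketch, as are the names `TrieNode`, `Trie`, `contains`, and `starts_with`.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a stored word ends here


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()  # new branch at this level
            node = node.children[ch]            # follow the branch
        node.is_word = True

    def _walk(self, text):
        """Follow text from the root; return the final node or None."""
        node = self.root
        for ch in text:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word):
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix):
        return self._walk(prefix) is not None
```

Both operations touch one node per character, so their cost depends on the length of the string rather than on how many words are stored.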

Rotations

These can be used to rebalance AVL trees. Outside cases require a single rotation; inside cases require a double rotation.
- Left left case (outside)
- Right right case (outside)
- Right left case (inside)
- Left right case (inside)

Siblings

Two nodes that have the same parent

Hash Tables

Uses a hash function to compute an index from a key; the index identifies where the corresponding element is stored in the table. The locations where elements are stored are often called buckets.

Drawbacks of Linear Probing

- Works until the array is full, but as the number of items N approaches TableSize (as λ gets close to 1), access time approaches O(N)
- Very prone to clustering: if a key hashes anywhere into a cluster, finding a free cell involves going through the entire cluster, and the insert makes the cluster grow
- Often the table ends up empty except for a few clusters, so keys are not distributed uniformly

Perfect hash functions

Yield a one-to-one mapping of keys to hash values

Left left

Single left rotation, used when the tree is right-heavy along its outside edge:
a
 \
  b
   \
    c
b becomes the new root. a takes ownership of b's left child as its right child (in this case, null). b takes ownership of a as its left child.
  b
 / \
a   c

Right Left Double rotation

a
 \
  c
 /
b
First, perform a right rotation on the right subtree. After performing a rotation on our right subtree, we have prepared our root to be rotated left. Here is our tree now:
a
 \
  b
   \
    c
Looks like we're ready for a left rotation. Let's do that:
  b
 / \
a   c

Left right Double rotation

  c
 /
a
 \
  b
First, make our left subtree left-heavy. We do this by performing a left rotation on our left subtree. Doing so leaves us with this situation:
    c
   /
  b
 /
a
This is a tree which can now be balanced using a single right rotation. We can now perform our right rotation rooted at c. The result:
  b
 / \
a   c

Right Right

Single right rotation, used when the tree is left-heavy along its outside edge:
    c
   /
  b
 /
a
b becomes the new root. c takes ownership of b's right child as its left child (in this case, null). b takes ownership of c as its right child.
  b
 / \
a   c
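
The single and double rotations described in these cards can be sketched with two small functions that return the new root of the rotated subtree. `Node`, `rotate_left`, and `rotate_right` are hypothetical names, and height bookkeeping is omitted.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_left(a):
    """a's right child b becomes the root of this subtree."""
    b = a.right
    a.right = b.left   # a takes b's left child as its right child
    b.left = a         # b takes a as its left child
    return b


def rotate_right(c):
    """c's left child b becomes the root of this subtree."""
    b = c.left
    c.left = b.right   # c takes b's right child as its left child
    b.right = c        # b takes c as its right child
    return b


# Double rotation for the shape a \ c / b:
root = Node("a", right=Node("c", left=Node("b")))
root.right = rotate_right(root.right)  # now a \ b \ c
root = rotate_left(root)               # now b with children a and c
```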

Hash Tables in java

hashCode is a method in Java.
hashCode and equals:
- If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result
- If a class overrides equals, it should also override hashCode

Load factor of Hash Table

Denoted by the symbol λ. The ratio of the number of values in the hash table to the table size. A way to measure the efficiency of hash table implementations.

