3114 Final Exam


Basic rules for shaping a 2-3 tree

1. A node contains one or two keys. 2. Every internal node has either two children (if it contains one key) or three children (if it contains two keys). Hence the name. 3. All leaves are at the same level in the tree, so the tree is always height balanced

How to find the pivot for quicksort

1. Use the middle element: divide the array length by two, rounding down if needed, and take the element at that index. 2. Take the median of the first, middle, and last elements. 3. Take three fixed positions and use the median of their values.
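As an illustration of the median-of-three idea, here is a minimal Python sketch. The function name `median_of_three_index` and the sample data are made up for this example and do not come from the course material.

```python
def median_of_three_index(a, lo, hi):
    """Return the index (lo, mid, or hi) that holds the median of those three values."""
    mid = (lo + hi) // 2  # middle index, rounded down
    candidates = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return candidates[1][1]  # index belonging to the middle value


data = [9, 1, 5, 3, 7]
pivot_index = median_of_three_index(data, 0, len(data) - 1)
print(pivot_index, data[pivot_index])  # 4 7
```

The first, middle, and last values here are 9, 5, and 7, so the median is 7 at index 4.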

Good characteristics of hashing functions

1. Evenly distribute keys in the table. 2. If we do not know about the data, it is important not to put too much weight on high- or low-order bits. 3. We would like to pick a hash function that maps keys to slots in a way that gives each slot in the hash table an equal probability of being filled for the actual set of keys being used.

What is a b-tree of order m

1. Every node has at most m children. 2. A non-leaf node with k children contains k - 1 keys (One more child than key for any given node) 3. The root has at least two children if it is not a leaf node. 4. All leaves appear in the same level

Steps for inserting in a B-Tree

1. Find the leaf node where it will fit. 2a. If the leaf can accommodate another item simply insert it in the correct location. 2b. If node is full: Split node in two with smaller values in one and larger values in the other. Promote the median to the parent and continue to do this until no more splitting needs to occur. If it reaches the root node it will cause the height to grow by one.

Quadratic Probing Pros and Cons

1. Pros: helps reduce clustering and is easy to implement. 2. Cons: its probe sequence typically will not visit all slots in the hash table.
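A small Python sketch can show the missing slots. It assumes the simplest quadratic probe function, `(home + i*i) % table_size`, and a table of size 10; both are choices made for this example only.

```python
def quadratic_probe_slots(home, table_size):
    """Return every slot reachable by the probe sequence (home + i*i) % table_size."""
    # Probing past i = table_size - 1 only repeats slots, because
    # (i + table_size)**2 and i**2 leave the same remainder mod table_size.
    return {(home + i * i) % table_size for i in range(table_size)}


visited = quadratic_probe_slots(home=0, table_size=10)
print(sorted(visited))  # [0, 1, 4, 5, 6, 9]
```

With these assumptions, slots 2, 3, 7, and 8 are never probed from home slot 0, so a record could fail to find a free slot even though the table is not full.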

Algorithm

A method or process to solve a problem

Decision Tree

A model of the behavior of an algorithm

What is the best case input for Shellsort?

A sorted array because each sublist is sorted in linear time

Problem

A task to be performed. The description must not include "how", only "what" and the constraints. Maps inputs to outputs.

What is the average-case time for Mergesort to sort an array of n records?

Θ(n log n)

What is the worst-case time for Mergesort to sort an array of n records?

Θ(n log n)

What are the stable sorting algorithms?

Bubble Sort, Insertion Sort, Mergesort

Kruskal's Algorithm

Choose the smallest-weight edges first, skipping any edge that would create a cycle. A chosen edge does not have to be connected to one that is already chosen.
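A minimal Python sketch of Kruskal's algorithm follows. The edge format `(weight, u, v)`, the small union-find helper, and the sample graph are assumptions made for this example.

```python
def kruskal(num_vertices, edges):
    """edges: list of (weight, u, v). Returns the edges chosen for the spanning tree."""
    parent = list(range(num_vertices))  # each vertex starts in its own component

    def find(x):
        # Follow parent links to the root of x's component (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    for weight, u, v in sorted(edges):  # smallest weights first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # different components, so no cycle is created
            parent[root_u] = root_v
            chosen.append((weight, u, v))
    return chosen


edges = [(1, 0, 1), (3, 1, 2), (2, 0, 2), (4, 2, 3), (5, 1, 3)]
mst = kruskal(4, edges)
print(mst)  # [(1, 0, 1), (2, 0, 2), (4, 2, 3)]
```

The edge of weight 3 is skipped because vertices 1 and 2 are already connected through vertex 0.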

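Prim's Algorithm

Always choose the smallest-weight edge that connects a vertex already in the tree to a vertex not yet in it (the smallest single edge, not the smallest total path).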
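Here is a short Python sketch of Prim's algorithm using a priority queue. The adjacency-list format (`vertex -> list of (weight, neighbor)`) and the sample graph are assumptions for this example, and the function only returns the total tree weight.

```python
import heapq


def prim(graph, start):
    """graph: dict vertex -> list of (weight, neighbor). Returns the total MST weight."""
    visited = {start}
    frontier = list(graph[start])  # edges leaving the tree built so far
    heapq.heapify(frontier)
    total = 0
    while frontier and len(visited) < len(graph):
        weight, v = heapq.heappop(frontier)  # smallest edge leaving the tree
        if v in visited:
            continue  # both ends already in the tree, skip
        visited.add(v)
        total += weight
        for edge in graph[v]:
            if edge[1] not in visited:
                heapq.heappush(frontier, edge)
    return total


graph = {
    0: [(1, 1), (2, 2)],
    1: [(1, 0), (3, 2), (5, 3)],
    2: [(2, 0), (3, 1), (4, 3)],
    3: [(4, 2), (5, 1)],
}
print(prim(graph, 0))  # 7
```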

BST search, insert, and delete operations typically run in time O(d). What is d?

The depth of the node in the tree where the operation takes place

How to balance an external sort

Ensure that as little I/O is invoked as possible. Also load as many items into memory as possible while sorting and before writing to the disk.

External vs. Internal Sorts

External: When the files are too large to store in main memory. Internal: When all data is in main memory.

Open hashing has the advantage that it can answer range queries or questions like what is the largest key in the database.

False

How full should a hash table be at any given time?

Half full

TO STUDY

Hashing: searching, insertion, deletion; collision handling: open hashing, chaining (probing techniques); hashing functions: good characteristics, common functions, perfect functions, Cichelli's technique, Cuckoo hashing. Algorithm analysis. Asymptotics. Sorting: internal and external. Secondary storage. Buffer pools. 2-3 Trees. B-Trees. AVL Trees. Splay Trees: amortized analysis, bottom-up splaying. Graphs: terminology, representations, traversals, spans, topological orderings.

Asymptotic Analysis of Algorithms

Measures the efficiency of an algorithm, or its implementation as a program, as the input size becomes large.

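Bucket Hashing vs Alternate Bucket Hashing

Bucket hashing: take the key modulo the number of buckets and put the record in that bucket; if the bucket is full, put it in the overflow bucket. Alternate: take the key modulo the table size (10 in the course example) so that the last digit picks the home slot; if that slot is taken, use another free slot in the same bucket, otherwise put the record in overflow.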

Is heapsort Stable?

No

The order of the input records has what impact on the number of comparisons required by Radix Sort (as presented in this module)?

None

Lower Bounds and Theta Notation

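Using the limit of f(n)/g(n) as n grows: a limit of zero means f(n) is in O(g(n)); a nonzero constant means f(n) is in Θ(g(n)); infinity means f(n) is in Ω(g(n)).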

Open vs Closed Hashing

Open hashing stores data in linked lists that are pointed to by each slot. In closed hashing, data is stored directly in the hash table.

Pros and Cons of linear Probing

Pros: Easy to implement. Cons: Causes primary clustering.

Simple Alternatives to Linear Probing (that are not much better)

1. Random probing (the best of the two). 2. Linear probing with a different step size.

Empirical Comparison

Running two programs and comparing their running times.

What are the unstable sorting algorithms?

Selection Quicksort Heapsort

Flyweight

Something used in programming to decrease memory use by having many objects point to one reference for their data.

Stable Sorting function

A sorting function that preserves the original relative order of records with duplicate key values.

MultiWay Merge

Start by checking the first value in each input buffer, then take the smallest and put it in the output array. Keep going, refilling each input buffer as it empties, until everything is merged. May take multiple passes.
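The merging step can be sketched in Python as below. This is an in-memory simplification: each sorted run is a plain list standing in for an input buffer, and refilling buffers from disk is left out. The name `multiway_merge` is made up for this example.

```python
def multiway_merge(runs):
    """Merge several sorted runs by repeatedly taking the smallest front value."""
    positions = [0] * len(runs)  # current index into each run
    merged = []
    while True:
        best = None  # index of the run whose front value is smallest so far
        for r, run in enumerate(runs):
            if positions[r] < len(run):
                if best is None or run[positions[r]] < runs[best][positions[best]]:
                    best = r
        if best is None:  # every run is exhausted
            return merged
        merged.append(runs[best][positions[best]])
        positions[best] += 1


result = multiway_merge([[1, 4, 9], [2, 3, 10], [5, 6, 7, 8]])
print(result)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```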

How does quicksort work?

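Pick a pivot, partition the array so that values smaller than the pivot are on its left and larger values are on its right, then recursively sort each partition.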
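One way quicksort can be written is sketched below in Python. It is not taken from the course material: it assumes the middle element as the pivot and a simple single-pass partition, and other pivot and partition choices work just as well.

```python
def quicksort(a, lo=0, hi=None):
    """Sort list a in place between indices lo and hi (inclusive)."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return  # zero or one element: already sorted
    mid = (lo + hi) // 2
    a[mid], a[hi] = a[hi], a[mid]  # move the pivot out of the way
    pivot = a[hi]
    boundary = lo  # everything in a[lo:boundary] is smaller than the pivot
    for i in range(lo, hi):
        if a[i] < pivot:
            a[i], a[boundary] = a[boundary], a[i]
            boundary += 1
    a[boundary], a[hi] = a[hi], a[boundary]  # put the pivot in its final slot
    quicksort(a, lo, boundary - 1)
    quicksort(a, boundary + 1, hi)


data = [5, 2, 9, 1, 5, 6]
quicksort(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```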

Improved String Hash Function

Takes a string and interprets it one chunk at a time. It creates an integer value based on the ASCII values of four characters at a time. Order of characters sometimes matters, unlike the simple string hasher.
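A Python sketch of this kind of four-characters-at-a-time hash follows. The name `sfold`, the base-256 weighting inside each chunk, and the table size of 101 are assumptions for this example rather than details stated on the card.

```python
def sfold(s, table_size):
    """Fold a string four characters at a time into a slot number."""
    total = 0
    mul = 1
    for i, ch in enumerate(s):
        # Restart the multiplier at each 4-character chunk, otherwise
        # shift the character one byte further left within the chunk.
        mul = 1 if i % 4 == 0 else mul * 256
        total += ord(ch) * mul
    return total % table_size


print(sfold("ab", 101))  # 36
print(sfold("ba", 101))  # 84
print(sfold("abcdefgh", 101) == sfold("efghabcd", 101))  # True
```

Swapping characters inside a chunk changes the result, but swapping whole four-character chunks does not, which is why order only sometimes matters.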

Advantage of Alternate Bucket Hashing Approach

The advantage of this approach is that initial collisions are reduced, because any slot can be a home position rather than just the first slot in the bucket.

Cuckoo Hashing

The basic idea of cuckoo hashing is to resolve collisions by using two hash functions instead of only one. This provides two possible locations in the hash table for each key. Lookup requires inspection of just two locations in the hash table, which takes constant time in the worst case. This is in contrast to many other hash table algorithms, which may not have a constant worst-case bound on the time to do a lookup. Deletions, also, may be performed by blanking the cell containing a key, in constant worst case time, more simply than some other schemes such as linear probing.

How much auxiliary space or overhead (beyond the array holding the records) is needed by Heapsort?

Θ(1)

The {best|average|worst} case time complexity of Binsort is:

Θ(n+MaxKeyValue)

Assuming that the number of digits used is not excessive, the worst-case cost for Radix Sort when sorting n keys with distinct key values is:

Θ(nlogn)

What is the worst-case time for Heapsort to sort an array of n records that each have unique key values?

Θ(nlogn)

Which is the divide-by-twos increment series for an array of 29 elements?

The first increment must be a power of 2 that generates sublists with two elements. Each succeeding increment must be half the size of the previous one. For 29 elements this gives 16, 8, 4, 2, 1.

The binning hash function makes use of:

The high order digits or bits in the key

Why do bucket hashing systems not work well on the disk?

The overflow table is expensive to process when on the disk.

Insertion Sort

The sort where you start at index 1 and compare each item with the item to its left, swapping it leftward until the item on its left is no larger than it.
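A minimal Python sketch of Insertion Sort, written with swaps to match the description above (the sample data is made up):

```python
def insertion_sort(a):
    """Sort list a in place and return it."""
    for i in range(1, len(a)):  # start at index 1
        j = i
        # Swap the item leftward while its left neighbor is larger.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```

On input that is already sorted, the inner loop never runs, so each item costs only one comparison.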

(Assuming no duplicate key values:) The order of the input records has what impact on the number of comparisons required by Heapsort (as presented in this module)?

There is a constant factor difference

What are B+ Trees?

Trees that have internal nodes with keys and only leaf nodes hold data.

Binary Tree != BST

True

Radix sort processes one digit at a time, and it uses a Binsort to sort the n records on that digit. However, any stable sort could have been used in place of the Binsort.

True

The lower bound of the sorting problem is Ω(n log n) because we can prove that this is the best cost that any sorting algorithm could reach.

True

Measuring algorithm/data structure efficiency

Typically you will analyze the time required for an algorithm (or the instantiation of an algorithm in the form of a program), and the space required for a data structure.

Folding method hash function

Uses the ASCII values of a string's characters (giving even weight to all characters) to determine the slot. It sums all of the ASCII values and then takes the sum modulo the table size to get the slot number.
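In Python, this sum-and-modulus hash can be sketched as follows. The function name and the table size of 101 are assumptions for this example.

```python
def simple_string_hash(s, table_size):
    """Sum the character codes of s and reduce the sum to a slot number."""
    return sum(ord(ch) for ch in s) % table_size


print(simple_string_hash("ab", 101))  # 94
print(simple_string_hash("ba", 101))  # 94
```

Because addition ignores order, any two strings made of the same characters land in the same slot.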

AVL Tree

Uses balance factors, computed by subtracting the height of the left subtree from the height of the right subtree, and rotates when a node's subtree heights differ by more than one.

Cichelli's Method

Uses this equation: Hash value = key length + value of first letter + value of last letter

How are the sublists sorted in shell sort?

Using Insertion Sort because Shellsort generally makes each subarray reasonably close to sorted

Visitor Design Pattern

Using generic functions in which specific step instructions are passed in.

Shellsort

Using increments to make "better sorted" lists

Open Hashing

Having each position be a linked list, so a single slot can hold all the records that collide there.

You must merge 2 sorted lists of size m and n, respectively. The number of comparisons needed in the worst case by the merge algorithm will be:

m + n - 1
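The count can be checked with a short Python sketch that merges two sorted lists and tallies comparisons. The function name `merge_count` and the sample lists are made up for this example.

```python
def merge_count(left, right):
    """Merge two sorted lists; return (merged list, number of comparisons made)."""
    merged, comparisons = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two still has items
    merged.extend(right[j:])
    return merged, comparisons


print(merge_count([1, 3, 5], [2, 4, 6]))  # ([1, 2, 3, 4, 5, 6], 5)
print(merge_count([1, 2, 3], [4, 5, 6]))  # ([1, 2, 3, 4, 5, 6], 3)
```

The interleaved input forces a comparison for every output value except the last, giving 3 + 3 - 1 = 5.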

The total number of pairs of records among n records is

n(n - 1) / 2

Sequential search is what efficiency

Θ(n)

5 Properties of Algorithms

1. Must be correct. 2. Composed of doable steps. 3. There can be no ambiguity as to what the next step is. 4. Composed of finite steps. 5. Must terminate (no infinite loops).

Insertion of a Node on a 2-3 Tree Simple Promotion

1. Traverse to the leaf. 2. The middle value gets promoted. 3. Pointers are redirected.

Time Complexity of a B-Tree

Always O(log n) for insertion, deletion, and search

Strategy design pattern

Allows for new operation types while objects stay similar thanks to having a separate class that can extract data from types being added to the function

Composite Design Pattern

Allows new object types while operations stay similar

Which of these inputs will cost the most for Shellsort when using divide-by-twos increments on an array with a size where n is a power of 2?

An array where even positions store values 1 to n/2 and odd positions store values n/2+1 to n

Instance of a problem

Any specific set of parameters inputted into the "function"

Difference between 2-3 trees and b-trees

In 2-3 trees there are either 2 or 3 children and 1 or 2 keys. B-trees can be of different orders (varying numbers of children/keys). A 2-3 tree is a B-tree of order 3.

B tree vs B+ tree

In a B+ tree, data is only stored in the leaf nodes.

Function

In programming it is the mapping of parameters to a specific output, aka the answer to the problem.

Consider what happens if someone accidentally calls sort on a file that is already sorted. Which of the following sorting methods will be the most efficient if the input is already in sorted order? Bubble Sort, Mergesort, Insertion Sort, Selection Sort

Insertion Sort

Can deletion from hash tables using tombstones affect the efficiency?

It causes a slight increase in the average time to insert and search

Program in terms of algorithms

It is an instantiation of an algorithm in a programming language.

What does it take for a Shellsort increment series to be valid?

It must end in 1 and be strictly decreasing (no same numbers)

The three cases to consider in 2-3 Deletion

When deleting a record from the 2-3 tree, there are three cases to consider. The simplest occurs when the record is to be removed from a leaf node containing two records. In this case, the record is simply removed, and no other nodes are affected. The second case occurs when the only record in a leaf node is to be removed. The third case occurs when a record is to be removed from an internal node. In both the second and the third cases, the deleted record is replaced with another that can take its place while maintaining the correct order, similar to removing a node from a BST. If the tree is sparse enough, there is no such record available that will allow all nodes to still maintain at least one record. In this situation, sibling nodes are merged together. The delete operation for the 2-3 tree is excessively complex and will not be described further. Instead, a complete discussion of deletion will be postponed until the next section, where it can be generalized for a particular variant of the B-tree.

What is open hashing better for?

When the hashtable is kept in main memory. If it were stored on a disk the lists may be stored in separate disk blocks which would be incredibly inefficient when taking multiple disk accesses

What is closed hashing better for?

When the hashtable is kept on the disk. It will not be on separate disk blocks.

What is an inversion?

When there is a value bigger than the current value being looked at to the left of it in an array.
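Counting inversions directly from this definition can be sketched in Python as below. This is the straightforward quadratic check over every pair, and the function name is made up for this example.

```python
def count_inversions(a):
    """Count pairs (i, j) with i < j where the earlier value is larger."""
    return sum(
        1
        for i in range(len(a))
        for j in range(i + 1, len(a))
        if a[i] > a[j]
    )


print(count_inversions([3, 1, 2]))     # 2
print(count_inversions([1, 2, 3]))     # 0
print(count_inversions([4, 3, 2, 1]))  # 6
```

A reversed list of 4 values has every pair inverted, which is 4 * 3 / 2 = 6.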

In which cases are the time complexities the same for Selection Sort?

Worst, Average and Best

Is Mergesort stable?

Yes

What is the best-case cost of Shellsort for an array whose size n is a power of 2 when using divide-by-twos increments?

Θ(n log n). Think about whether the sorted list is a best-case input or not: sorted input is the best case. How many increments are there for the divide-by-twos increment series? With divide-by-twos increments, there are log n increments. log n passes over an array of length n must cost n log n.
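A Python sketch of Shellsort with divide-by-twos increments follows. It assumes the first increment is the largest power of 2 below the array size, and it runs the sublist insertion sorts interleaved in a single loop per increment. The helper names are made up for this example.

```python
def divide_by_twos_increments(n):
    """Return the increment series: largest power of 2 below n, halving down to 1."""
    gap = 1
    while gap * 2 < n:
        gap *= 2
    series = []
    while gap >= 1:
        series.append(gap)
        gap //= 2
    return series


def shellsort(a):
    """Sort list a in place and return it."""
    for gap in divide_by_twos_increments(len(a)):
        # Insertion sort on every sublist of items that are gap positions apart.
        for i in range(gap, len(a)):
            j = i
            while j >= gap and a[j - gap] > a[j]:
                a[j - gap], a[j] = a[j], a[j - gap]
                j -= gap
    return a


print(divide_by_twos_increments(16))  # [8, 4, 2, 1]
print(shellsort([7, 3, 8, 1, 6, 2, 5, 4]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

For n = 16 there are 4 = log n increments, and on sorted input each pass does one comparison per item without swapping.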

What is the average case cost of Shellsort for a good increment series?

Θ(n√n)

