CS310

Define separate chaining as used in reference to hash tables:

A strategy for open hashing when the data is stored in linked lists.
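A minimal sketch of separate chaining, assuming a fixed-size bucket array and no resizing. The class name `ChainedTable` and its methods are hypothetical, chosen only for illustration; they are not from the course materials.

```java
import java.util.LinkedList;

// Hypothetical class; a sketch of separate chaining, not a full Dictionary ADT.
class ChainedTable<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    // Each slot of the table holds a linked list (the "chain").
    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedTable(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // Map the hash code to a non-negative bucket index.
    private int indexFor(K key) {
        return (key.hashCode() & 0x7FFFFFFF) % buckets.length;
    }

    // Insert or update; keys that collide simply share a chain.
    void put(K key, V value) {
        LinkedList<Entry<K, V>> chain = buckets[indexFor(key)];
        for (Entry<K, V> e : chain) {
            if (e.key.equals(key)) { e.value = value; return; }
        }
        chain.add(new Entry<>(key, value));
    }

    // Returns the stored value, or null if the key is absent.
    V get(K key) {
        for (Entry<K, V> e : buckets[indexFor(key)]) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }
}
```

Because the data lives in the lists rather than in the array slots, the table never becomes "full"; the chains just get longer as the load factor grows.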

What is a balanced binary tree?

A tree whose height = O(log n)

Define open hashing as used in reference to hash tables:

Any method where the data is stored external to the table itself.

Define clustering as used in reference to hash tables:

Collisions tend to occur in close proximity. This occurs for several reasons, such as the fact that data is seldom truly random.

Define closed hashing as used in reference to hash tables:

Data is stored within the table itself.

Both Hashtables and balanced trees are often used for Dictionary ADT structures. What are the advantages of Hashtables for this purpose?

O(1) average-case performance for search, insert, and delete.

What is the average runtime complexity for Heap

O(n log n)

What is the average runtime complexity for Merge

O(n log n)

What is the average runtime complexity for Quick

O(n log n)

What is the best runtime complexity for Heap

O(n log n)

What is the best runtime complexity for Merge

O(n log n)

What is the best runtime complexity for Quick

O(n log n)

What is the worst runtime complexity for Heap

O(n log n)

What is the worst runtime complexity for Merge

O(n log n)

What is the average runtime complexity for Radix/Bucket

O(n)

What is the average runtime complexity for Selection

O(n^2)

What is the best runtime complexity for Bubble

O(n^2) for the basic algorithm; O(n) if it stops early after a pass with no swaps.

What is the best runtime complexity for Selection

O(n^2)

What is the worst runtime complexity for Bubble

O(n^2)

The array to be sorted has 1,000,000 elements. The array is in almost perfect non-ascending order (only a few elements are out of place). You must sort it in ascending order. There are no duplicate keys in the array. Indicate which sorting algorithm you would choose, and why. The machine has sufficient memory to hold approximately 10,000,000 array elements.

Reverse the array in a loop, which is O(n), and then use InsertionSort, which is almost O(n) in this case because the reversed array is nearly sorted.
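The two steps can be sketched as follows. This is an illustrative sketch on `int` arrays; the class and method names are hypothetical.

```java
// Hypothetical helper class: reverse, then insertion sort.
class NearlyDescendingSort {
    // Reverse the array in place: one pass, O(n).
    static void reverse(int[] a) {
        for (int lo = 0, hi = a.length - 1; lo < hi; lo++, hi--) {
            int tmp = a[lo];
            a[lo] = a[hi];
            a[hi] = tmp;
        }
    }

    // Standard insertion sort; close to O(n) when few elements are out of place.
    static void insertionSort(int[] a) {
        for (int out = 1; out < a.length; out++) {
            int temp = a[out];
            int in = out;
            while (in > 0 && a[in - 1] > temp) {
                a[in] = a[in - 1];
                in--;
            }
            a[in] = temp;
        }
    }

    static void sortAscending(int[] a) {
        reverse(a);        // nearly descending becomes nearly ascending
        insertionSort(a);  // only the few misplaced elements need shifting
    }
}
```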

What is the minimum number of nodes in an AVL tree of height 7? Hint: The minimum number of nodes is given by the recursive formula S(h) = S(h-1) + S(h-2) +1. For h=0, S(h) = 1. For h=1, S(h) = 2

S(2) = 4, S(3) = 7, S(4) = 12, S(5) = 20, S(6) = 33, S(7) = 54. The minimum is 54 nodes.
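The recurrence from the hint can be checked with a short loop. This is a sketch; `AvlMinNodes` and `minNodes` are hypothetical names.

```java
// Hypothetical helper: evaluates S(h) = S(h-1) + S(h-2) + 1 with S(0) = 1, S(1) = 2.
class AvlMinNodes {
    static long minNodes(int h) {
        if (h == 0) return 1;
        if (h == 1) return 2;
        long prev2 = 1; // S(h-2)
        long prev1 = 2; // S(h-1)
        for (int i = 2; i <= h; i++) {
            long cur = prev1 + prev2 + 1;
            prev2 = prev1;
            prev1 = cur;
        }
        return prev1;
    }

    public static void main(String[] args) {
        for (int h = 0; h <= 7; h++) {
            System.out.println("S(" + h + ") = " + minNodes(h));
        }
    }
}
```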

Both Hashtables and balanced trees are often used for Dictionary ADT structures. What are the disadvantages of balanced trees for this purpose?

Slower than hash tables

Define secondary clustering as used in reference to hash tables:

When strategies other than linear probing are used, clustering tends to occur at the secondary location for the data.

Define collision as used in reference to hash tables:

When two or more elements hash to the same index.

Hashtables are usually implemented using either probing or chaining. Describe each method including the advantages and disadvantages of each one.

With probing, you store data items in the table array itself. If a collision occurs and the needed index is not available, some heuristic is used to identify a new location for the data item. Probing is somewhat faster than chaining as long as the load factor is low. It is not very useful for applications that require many insertions or deletions from the table. Many applications require fast lookups from a collection that doesn't change (e.g. a spell checker), and probing is a good strategy for these applications. With the chaining strategy, the array in the table becomes an array of linked lists. When data is stored, it is inserted into the linked list at the indicated index. This solves the collision problem by allowing more than one data item to be stored at any index.

Insertion sort can be improved by using binary search to find the next insertion point. However, this does not change the overall complexity of the algorithm. Why?

You must still shift elements to insert, which is O(n). The shifting dominates: O(log n) + O(n) = O(n) per insertion. We do this for each element in the array, so with binary search the cost is (O(log n) + O(n)) * O(n) = O(n^2).

Red/Black trees require that some additional information, beyond what is required for standard binary trees, be stored in the Node. Show the fields that would be required in red/black tree Node class

class RBNode<K,V> { private K key; private V value; private RBNode<K,V> leftChild; private RBNode<K,V> rightChild; private boolean isRed; public RBNode(K k, V v) { key = k; value = v; isRed = true; leftChild = rightChild = null; } }

Define open addressing as used in reference to hash tables:

Elements are stored directly in the table. NOTE: Open Addressing = Closed Hashing and Closed Addressing = Open Hashing.

Is Heap sort Stable?

no

Is Merge sort in place?

no

Is Quick sort Stable?

no

Is Radix/bucket sort in place?

no

Is Selection sort Stable?

no

Is Shell sort Stable?

no

Rewrite the following Java code for InsertionSort so that it will sort any Object that implements the Comparable interface: public static int[] insertionSort(int array[]) { int [] n = array; int in, out, temp; for(out = 1; out < n.length; out++) { temp = n[out]; in = out; while(in > 0 && n[in-1] >= temp) { n[in] = n[in-1]; in--; } n[in] = temp; } return n; }

public static <E extends Comparable<E>> E[] insertionSort(E[] array) { E[] on = array; int in, out; E temp; for (out = 1; out < on.length; out++) { temp = on[out]; in = out; /* replaces: on[in-1] > temp */ while (in > 0 && on[in-1].compareTo(temp) > 0) { on[in] = on[in-1]; in--; } on[in] = temp; } return on; }

What is the maximum number of nodes that would fit into a binary search tree with 8 levels? What is the minimum number?

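Maximum: 2^(height+1) - 1 = 2^8 - 1 = 255, since a tree with 8 levels has height 7. Minimum: 8, one node per level (a degenerate tree).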

Both Hashtables and balanced trees are often used for Dictionary ADT structures. What are the disadvantages of Hashtables for this purpose?

Data is not stored in key order

Both Hashtables and balanced trees are often used for Dictionary ADT structures. What are the advantages of balanced trees for this purpose?

Data is ordered by key

Describe the selection sort algorithm in plain English.

Find the largest element in the array. Swap this element with the element in last place. Then find the largest element in the section of the array minus the end. Swap that max element with the next-to-last one. Continue this process. With each pass of the loop, the largest element in the unsorted section is swapped with the element in last place in the unsorted partition. The unsorted partition becomes one smaller and the sorted partition one larger.
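The description above can be sketched in Java as follows. This is an illustrative version on `int` arrays; the class name is hypothetical.

```java
// Hypothetical class: selection sort that moves the largest element to the end on each pass.
class SelectionSortSketch {
    static void selectionSort(int[] a) {
        // 'last' is the final index of the unsorted partition.
        for (int last = a.length - 1; last > 0; last--) {
            int maxIdx = 0;
            // Find the largest element in a[0..last].
            for (int i = 1; i <= last; i++) {
                if (a[i] > a[maxIdx]) {
                    maxIdx = i;
                }
            }
            // Swap it into the last place of the unsorted partition.
            int tmp = a[maxIdx];
            a[maxIdx] = a[last];
            a[last] = tmp;
        }
    }
}
```

Both loops always run in full, which is why the best, average, and worst cases are all O(n^2), while the number of swaps is at most n-1.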

What is a hash function? Describe at least three commonly used hash methods.

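A hash function maps a key to an integer hash code, which is then converted to an index in the table. Commonly used methods include folding, multiplication, addition, xor, exponentiation and shifting.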

Priority Queues are often implemented using binary heaps. What advantage does this have over an ordered array implementation?

Heaps provide O(log n) enqueue and dequeue operations. With arrays, either the enqueue or the dequeue operation must be O(n), depending upon whether or not an ordered array is used. This is because of the shifting that must be done after an insertion or deletion.

Define the terms "in place" and "stable" as they are used to describe sorting algorithms.

In place means the algorithm does not require extra storage that grows with the input size n. Stable means the algorithm preserves the relative order of elements with equal keys.

'A number of arrays have to be sorted. Most are quite small (n < 100), but a few have as many as 1000 elements.' Indicate which sorting algorithm you would choose, and why. The machine has sufficient memory to hold approximately 10,000,000 array elements.

Insertion Sort. It has very little overhead, so it is fast on small arrays.

To sort an array in almost sorted order, which algorithm would you choose? Why?

Insertion sort. It is O(n) when the array is already sorted.

'You know that the array to be sorted cannot have more than 10 elements.' Indicate which sorting algorithm you would choose, and why. The machine has sufficient memory to hold approximately 10,000,000 array elements.

It doesn't really matter, though avoid Bubble Sort.

If you were required to write an in place HeapSort algorithm, would you use a min heap or a max heap? Explain your choice.

Max heap. Each time you remove the max from the heap, the heap shrinks by one element, which frees a slot at the end of the array. That slot is exactly where the removed max belongs in ascending order.
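A sketch of an in-place heap sort built on a max heap, assuming an `int` array and the usual array layout where the children of index i are at 2i+1 and 2i+2. The class and method names are hypothetical.

```java
// Hypothetical class: in-place heap sort using a max heap stored in the array itself.
class HeapSortSketch {
    static void heapSort(int[] a) {
        int n = a.length;
        // Build the max heap bottom-up, starting from the last parent node.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n);
        }
        // Repeatedly move the max to the slot freed at the end of the shrinking heap.
        for (int end = n - 1; end > 0; end--) {
            int tmp = a[0];
            a[0] = a[end];
            a[end] = tmp;
            siftDown(a, 0, end); // heap now occupies a[0..end-1]
        }
    }

    // Restore the max-heap property for the subtree rooted at 'root' within a[0..size-1].
    private static void siftDown(int[] a, int root, int size) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= size) return;
            if (child + 1 < size && a[child + 1] > a[child]) {
                child++; // pick the larger child
            }
            if (a[root] >= a[child]) return;
            int tmp = a[root];
            a[root] = a[child];
            a[child] = tmp;
            root = child;
        }
    }
}
```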

You must sort an array of 50,000 elements. The array has many duplicate keys and you must preserve the ordering of duplicate keys. Indicate which sorting algorithm you would choose, and why. The machine has sufficient memory to hold approximately 10,000,000 array elements.

Merge Sort. It is stable and O(n log n), and the machine has enough memory for the auxiliary array it needs.

If you were required to sort a very large file that would not fit in memory, what algorithm would you choose? Describe the steps you would perform to sort the file.

Merge sort. Read from the file as many elements as will fit in memory. Sort the elements and then write them to a temporary file. Repeat until the whole file has been processed. Merge the temporary files back and overwrite the original file.

The array to be sorted has 8,000,000 elements, and nothing is known about the initial arrangement of the elements. Indicate which sorting algorithm you would choose, and why. The machine has sufficient memory to hold approximately 10,000,000 array elements.

Modified quick sort, or heap sort.

'You must write a sort routine that will be included in a library that other programmers will use. You have no information about the size or ordering of arrays that your routine will be called to sort.' Indicate which sorting algorithm you would choose, and why. The machine has sufficient memory to hold approximately 10,000,000 array elements.

No mention is made of the need for stability. It is always a good idea to use stable algorithms for default implementations. The only fast algorithm that is stable is merge sort. If stability is unimportant, the modified quick sort is the fastest algorithm.

Suppose a programmer has to write a spell checker application. The words in the dictionary will be stored in a hashtable. For the hash method, the programmer decides to sum the ascii codes of the characters for each word. He reasons that since each word is distinct (no dups), this method will generate a unique hash for each word. Is his reasoning correct?

No. This is a bad idea. Many words will generate the same hash code. "ate" and "eat", for instance, will generate the same code because they contain the same letters.
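The collision is easy to demonstrate. The sketch below implements the programmer's proposed hash (sum of character codes); the class name is hypothetical.

```java
// Hypothetical class: the flawed "sum of ASCII codes" hash from the question.
class AsciiSumHash {
    static int hash(String word) {
        int sum = 0;
        for (int i = 0; i < word.length(); i++) {
            sum += word.charAt(i); // addition ignores letter order
        }
        return sum;
    }

    public static void main(String[] args) {
        // Anagrams always collide under this hash.
        System.out.println(hash("ate") + " " + hash("eat") + " " + hash("tea"));
    }
}
```

Since addition is commutative, every anagram of a word hashes to the same value, and short words are squeezed into a narrow range of sums.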

Because deletion from a balanced binary tree can be computationally intensive, sometimes lazy deletion is used. Describe this technique

Nodes are marked as deleted, but not removed from the tree.

What is the best runtime complexity for Insertion

O(n)

What is the best runtime complexity for Radix/Bucket

O(n)

What is the worst runtime complexity for Radix/Bucket

O(n)

What is the average runtime complexity for Bubble

O(n^2)

What is the average runtime complexity for Insertion

O(n^2)

What is the worst runtime complexity for Insertion

O(n^2)

What is the worst runtime complexity for Quick

O(n^2)

What is the worst runtime complexity for Selection

O(n^2)

What is the worst runtime complexity for Shell* *give the estimated range

O(n^(3/2))

What is the average runtime complexity for Shell* *give the estimated range

O(n^(4/3))

What is the best runtime complexity for Shell* *give the estimated range

O(n^(6/5))

For tables implemented using probing, sometimes quadratic probing is used. Describe this method.

Quadratic probing means using a non-linear probing strategy. If an index i is unavailable, then with quadratic probing you move down the array in the sequence i+1^2, i+2^2, i+3^2, i+4^2 and so on until an available slot is found.
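The probe sequence can be sketched as below. The wrap-around with the modulus operator and the give-up condition after table-size probes are assumptions added for the example; the names are hypothetical.

```java
// Hypothetical class: finds a free slot using the probe sequence i, i+1^2, i+2^2, i+3^2, ...
class QuadraticProbe {
    // 'occupied[j]' is true when slot j is in use; 'home' is the non-negative index from the hash.
    // Returns the first free index found, or -1 if none is found within occupied.length probes.
    static int findSlot(boolean[] occupied, int home) {
        int size = occupied.length;
        for (int k = 0; k < size; k++) {
            // Assumption: wrap around the end of the table with modulus.
            int idx = (int) ((home + (long) k * k) % size);
            if (!occupied[idx]) {
                return idx;
            }
        }
        return -1;
    }
}
```

Note that quadratic probing is not guaranteed to visit every slot, so a search can fail even when the table still has free space.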

Although they are difficult to code, red/black trees are often the implementation of choice for balanced trees. Why are red/black trees chosen instead of AVL trees?

Red/Black trees have a looser balance condition, which means that fewer adjustments to the structure are required, while still preserving O(log n) behavior. AVL trees also have more overhead. The left and right height for each node must either be stored in the node and maintained, or calculated on the fly for each node along the insertion path.

List the structural and ordering properties for red/black trees

The Red/Black tree is a balanced binary search tree. Each node in the tree has zero, one or two children. For each node n in the tree, all entries to the left of n must be smaller than n, and all entries to the right must be larger than n. The balance condition of the tree is dictated by the following rules: - Every node is colored either red or black. - The root node is always black. - New insertions (except for root) are always colored red. - Every path from root to leaf must contain exactly the same number of black nodes. - No path can have two red nodes in a row. That is, if a node is red, it cannot have a red child or a red parent. - Null children are always black. A red violation (two reds in a row along the insertion path) indicates that the tree is out of balance and must be adjusted. Adjustments are always made to the grandparent of the node that caused the violation according to the following two rules: 1) If the node that caused the violation has a red aunt, then perform a color flip. 2) If the node that caused the violation has a black aunt, then rotate.

You have written a sorting method that takes an array of Object as its parameter. What limitations are imposed on the users of this method?

The class of objects to be sorted must implement Comparable<E>.

Of all the sorting algorithms we have studied, bubble sort is consistently the worst performer. What makes it the worst of the O(n^2) algorithms?

The excessive number of swaps makes Bubble Sort the worst performer.

Define primary clustering as used in reference to hash tables:

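Clustering around the primary location, the index produced by converting the hash code to an index. With linear probing, colliding items fill neighboring slots and form long runs of occupied cells.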

Why can't a random number generator be used in a hash function?

The hash code must be repeatable. Every time you call hashCode() on an object, it must return the same value.

The standard quick sort algorithm is O(n^2) in the worst case. What is the worst case? What modifications can be made to the algorithm to provide better behavior in this case?

The worst case is an array that is already in sorted or reverse sorted order. In these cases, the partition sizes are 1 and n-1, which means n*n = O(n^2). In the worst case, standard quick sort usually runs out of stack space and crashes. The reason is that if the array is already in sorted order, the pivot will be the largest element in the section of the array being processed. During the partitioning phase, you will end up with one partition holding one element, and the other partition holding section size-1 elements. When processing the array recursively, you will end up partitioning the array n-1 times. Strategies for dealing with this problem vary, but always involve a way to ensure that the largest (or smallest) element in the section is not chosen for the pivot. Any strategy must be efficient: doing a lot of processing to determine the pivot will slow the algorithm down, which is not desirable. A useful strategy is to swap the element at the right end with the one in the middle. This does not guarantee that you will avoid worst case behavior, but it does handle the case of sorted (or reverse sorted) arrays quickly.
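A sketch of the middle-element swap described above, assuming a quick sort that uses the rightmost element of each section as the pivot. The partition scheme and all names here are illustrative choices, not the course's reference implementation.

```java
// Hypothetical class: quick sort that swaps the middle element to the right end before partitioning.
class QuickSortSketch {
    static void quickSort(int[] a) {
        quickSort(a, 0, a.length - 1);
    }

    private static void quickSort(int[] a, int left, int right) {
        if (left >= right) return;
        // The modification: move the middle element into the pivot position.
        int mid = left + (right - left) / 2;
        swap(a, mid, right);
        int p = partition(a, left, right);
        quickSort(a, left, p - 1);
        quickSort(a, p + 1, right);
    }

    // Partition around a[right]; returns the pivot's final index.
    private static int partition(int[] a, int left, int right) {
        int pivot = a[right];
        int store = left;
        for (int i = left; i < right; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, right);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

On an already sorted section, the middle element is the median, so the partitions come out roughly equal instead of 1 and n-1.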

'An array of 8,000,000 elements contains only integer keys in the range 1..50,000 inclusive. You need not preserve ordering of duplicates. Your algorithm must run in O(n) time.' Indicate which sorting algorithm you would choose, and why. The machine has sufficient memory to hold approximately 10,000,000 array elements.

This is a bit tricky. Make an array of size 50,000. Then loop through the array and count the number of times each integer occurs. Rewrite the original array using the counts in your auxiliary array. n + n = O(n).
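The counting approach can be sketched as follows, assuming keys in the range 1..maxKey. The auxiliary array is sized maxKey + 1 here so that a key can be used directly as an index; the names are hypothetical.

```java
// Hypothetical class: sort by counting occurrences of each key, then rewriting the array.
class CountingSortSketch {
    // Assumes every element of 'a' lies in 1..maxKey inclusive.
    static void countingSort(int[] a, int maxKey) {
        int[] counts = new int[maxKey + 1]; // index 0 is unused
        // First pass: count how many times each key occurs.
        for (int key : a) {
            counts[key]++;
        }
        // Second pass: rewrite the original array in key order.
        int out = 0;
        for (int key = 1; key <= maxKey; key++) {
            for (int c = 0; c < counts[key]; c++) {
                a[out++] = key;
            }
        }
    }
}
```

The rewrite pass does n writes in total plus one visit per possible key, so with 50,000 possible keys and 8,000,000 elements the cost stays linear in n.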

Hashtables do not keep data ordered by keys. This makes some operations more difficult. Suppose that you are required to write the following method for a hashtable that uses chaining: public Object [] getRange(String first, String last) This method returns an ordered array of all keys in the table that fall within the range first..last inclusive. Your method must be as efficient in terms of time and storage complexity as possible. Describe how you might implement the method.

This is most easily done with the table's iterators, though every key in the table must still be visited because the table is unordered. The most expensive part of the process is the sort. Therefore, the most efficient way is to go through the table, collect only the keys that are within the required range, and then sort just those.
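A sketch of the filter-then-sort approach. It assumes the table can expose its keys as an `Iterable<String>` (for example through a key iterator); that parameter stands in for the course's hashtable class, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class: collects the keys in first..last, then sorts only those.
class RangeQuerySketch {
    static Object[] getRange(Iterable<String> keys, String first, String last) {
        List<String> hits = new ArrayList<>();
        // One pass over every key in the table: O(n).
        for (String k : keys) {
            if (k.compareTo(first) >= 0 && k.compareTo(last) <= 0) {
                hits.add(k);
            }
        }
        // Sort only the m keys that matched: O(m log m).
        Collections.sort(hits);
        return hits.toArray();
    }
}
```

Filtering before sorting keeps both the extra storage and the sort cost proportional to the number of matching keys rather than the size of the whole table.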

Hashtables are generally made somewhat larger than the maximum number of elements to be inserted. Why?

This is done to minimize collisions. We use more space than is needed to optimize performance.

If a hashtable requires integer keys, what hash algorithm would you choose? Write Java code for your hash algorithm.

If the key can be any integer, not some range of integers, then the key itself may be used as the hash code. The following code is from the Java API for class Integer: private final int value; ... public int hashCode() { return value; }

Is Bubble sort Stable?

yes

Is Bubble sort in place?

yes

Is Heap sort in place?

yes

Is Insertion sort Stable?

yes

Is Insertion sort in place?

yes

Is Merge sort Stable?

yes

Is Quick sort in place?

yes

Is Radix/bucket sort Stable?

yes

Is Selection sort in place?

yes

Is Shell sort in place?

yes

