Algorithms Final Exam
Priority Queue optional operations
Decrease_key(n, k) - in min priority queues with indexed values, this decreases the priority value of item n to priority k
what is counting sort?
Direct access sorting + chaining. Make sure the chaining method retains input order when dealing with collisions (this is what makes the sort stable) - just use a dynamic array per key and append to the end
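A runnable Python sketch of this scheme (function and parameter names are illustrative; assumes non-negative integer keys):

```python
def counting_sort(a, key=lambda x: x):
    """Direct-access sort with chaining; appending to each chain keeps the sort stable."""
    if not a:
        return a
    u = max(key(x) for x in a)
    chains = [[] for _ in range(u + 1)]  # one dynamic array per possible key
    for x in a:
        chains[key(x)].append(x)         # append to the end: equal keys keep input order
    i = 0
    for chain in chains:                 # read the chains back in key order
        for x in chain:
            a[i] = x
            i += 1
    return a
```

Sorting tuples by their first field shows the stability: items with equal keys come out in input order.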
Are all destructive algorithms in-place?
False
what is radix sort?
Radix Sort is a method of sorting numbers by looking at one digit at a time. It starts with the smallest digit (like the ones place) and works its way up to the largest digit (like the hundreds or thousands place). It groups numbers into "buckets" based on the current digit being sorted, then combines the groups in order. This process repeats for each digit until the list is sorted.
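A minimal runnable sketch of this process in Python (least-significant-digit radix sort on non-negative integers; the base and names are illustrative):

```python
def radix_sort(a, base=10):
    """Sort non-negative ints one digit at a time, smallest digit first; returns a new list."""
    if not a:
        return a
    u = max(a)
    exp = 1                       # current digit place: 1s, then 10s, 100s, ...
    while exp <= u:
        buckets = [[] for _ in range(base)]
        for x in a:
            buckets[(x // exp) % base].append(x)   # group by the current digit
        a = [x for b in buckets for x in b]        # recombine buckets in order
        exp *= base
    return a
```

Each pass is stable, so the ordering from earlier (smaller) digits is preserved as later digits are processed.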
Abstract Data Structures: Queue
Simple FIFO data structure (i.e. waiting in line)
Abstract Data structures: Stack
Simple LIFO data structure (i.e. deck of cards)
Steps to implement bubble sort
- Start at the Beginning: begin at the first element of the list
- Compare Adjacent Elements: compare the current element with the next element
- Swap if Necessary: if the current element is greater than the next element, swap them
- Move to the Next Pair: shift to the next pair of adjacent elements and repeat the compare-and-swap steps
- Repeat for the Entire List: continue this process until you reach the end of the list
- Repeat for the Remaining Elements: after each pass, the largest element will be in its correct position; ignore this element in subsequent passes
- Stop When Sorted: repeat the above steps until no swaps are needed in a full pass through the list
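The steps above can be sketched as a short Python function (name and early-exit flag are illustrative):

```python
def bubble_sort(a):
    """In-place bubble sort; stops early if a full pass makes no swaps."""
    n = len(a)
    for _ in range(n):
        any_change = False
        for i in range(n - 1):
            if a[i + 1] < a[i]:
                a[i], a[i + 1] = a[i + 1], a[i]  # swap adjacent out-of-order pair
                any_change = True
        if not any_change:  # no swaps in a full pass: already sorted
            break
    return a
```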
Abstract data structure: Sets and Maps
Stores a set of items with unique keys. When keys have an associated value, we call it a map
Abstract Data Structures: Priority Queue
Stores items with some ordering according to some priority value
data structure refers to...
a particular implementation of that interface which may have different run times for the operations involved
What is a stable sort?
a sort which does not alter the relative order of equal items compared to the input array
Most common method for amortized analysis
accounting method
what is the array based storage method
contiguous blocks of memory. Frequently we need to copy data either by shifting values or by allocating a new block and copying all values over to the new location
merge sort runtime
O(n log n). The recurrence T(n) = 2T(n/2) + O(n) can be solved multiple ways: the substitution method, the recursion tree method, or the master method
Quick select runtime
expected O(n)
what is the accounting method?
for each operation, pay the cost of that operation, but you can also pay some additional into an account that will be used to pay debts when needed
Hashing
hash function: choose a table size m that is some multiple of the number of elements n, so m = O(n). Find a mapping from k ∈ {0, ..., u-1} (or whatever the key space is) to {0, ..., m-1}; most easily done using modulo: h(k) = k % m. The general term for this mapping is a hash function
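A tiny Python illustration of the modulo scheme (the element count n and the sample keys are made up for the example):

```python
def hash_fn(k, m):
    """h(k) = k % m maps any integer key into a slot in 0..m-1."""
    return k % m

n = 5        # number of elements we expect to store
m = 2 * n    # table size m = O(n): a small multiple of n
keys = [17, 42, 9, 1234567, 10]
slots = [hash_fn(k, m) for k in keys]
assert all(0 <= s < m for s in slots)          # every key lands inside the table
assert hash_fn(17, m) == hash_fn(1234567, m)   # a collision: distinct keys, same slot
```

Note the last assertion: because the key space is much larger than m, collisions like this are unavoidable, which is why collision handling (e.g. chaining) is needed.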
Asymptotic time
how much does the time grow as a function of the input size
Master method
For recurrences of the form T(n) = aT(n/b) + f(n), compare f(n) against n^(log_b a):
- If f(n) = O(n^(log_b a - e)) for some e > 0, then T(n) = Θ(n^(log_b a))
- If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) * log n)
- If f(n) = Ω(n^(log_b a + e)) for some e > 0 (and a regularity condition holds), then T(n) = Θ(f(n))
Example: merge sort has a = 2, b = 2, f(n) = O(n); n^(log_2 2) = n matches f(n), so T(n) = Θ(n log n)
Is quick sort in place or stable?
in place, but not stable. Being in place is part of why it is typically faster than merge sort in practice
Data structures purpose
store data in a way that efficiently supports how we want to access and change that data
Abstract data structure: Sequence
stores a sequence of items
what is an abstract data structure
the set of operations it allows (eg append, get_min, remove(i), etc) sometimes also known as the interface or API
Amortized Analysis Accounting example
- x = [] - O(1); capacity 1, bank at 0
- x.append(1) - 1 operation, but we pay 3, so the bank now has 2
- x.append(2) - 2 operations because we had to make a new array (size 2) and copy the values; we pay 3, bank at 3
- x.append(3) - 3 operations because we resize to 4 and copy 2 values; we pay 3, bank still at 3
- x.append(4) - 1 operation because capacity is 4; we pay 3, bank at 5
- x.append(5) - 5 operations because we have to make a new array (size 8) and copy values; we pay 3, bank still in the green at 3
etc. We never go below a bank account balance of 0, but we never pay more than a constant (3) per append, which is O(1). Sooooo, append/insert_last is amortized O(1)
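The accounting argument can be sanity-checked by simulation. This sketch assumes a doubling array and a flat charge of 3 per append (an assumed constant big enough to cover the write plus future copies):

```python
def simulate_appends(total):
    """Simulate a doubling dynamic array, paying a flat 3 per append (accounting method)."""
    capacity, length, bank = 1, 0, 0
    for _ in range(total):
        cost = 1                  # real cost: write the new element
        if length == capacity:    # array full: allocate double and copy everything
            cost += length
            capacity *= 2
        length += 1
        bank += 3 - cost          # pay 3 into the account, spend the real cost
        assert bank >= 0          # the balance never goes negative
    return bank
```

Running `simulate_appends(1000)` completes with the internal assertion never firing, which is exactly the amortized O(1) claim: constant payment per operation, debt always covered.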
How to choose a pivot for quick sort?
- First element in the array
- Last element in the array
- Random element in the array
- Median of 3 random elements in the array
what is insertion sort?
Maintain the invariant that the subarray on the left is in sorted order
How to implement the Recursion tree method
• Visualize the recursion tree and count the number of operations at each level and how many levels
Correctness
- Must have a loop or recursion, so we prove correctness for any input size
- Proof by induction
Abstract data structure: Sets and maps
- Stores a set of items with unique keys; when keys have an associated value we call it a map
Operations:
- build(x, y, z...) - no order to items, unlike sequences
- contains_key(k)
- get(k) - if a map, returns the value associated with key k; otherwise returns none or errors if no key k in the map
- insert(x) - insert an item, often insert(k, x) for maps; rarely but sometimes called push
- delete(k) - deletes item at key k
Abstract Data Structure Refers to:
- the set of operations it allows (eg append, get_min, remove(i), etc) sometimes aka the interface
A collision in hashing:
- two unique keys ki and kj could map to the same location in the array
How fast can search be? Using comparison computational model:
- We can only differentiate items via comparisons: <, >, <=, >=, !=, ==
- In this model, search over n items requires Ω(log n) comparisons in the worst case
Linked List Queue: Implementation
- With a linked list, insert_last and delete_first can be O(1) if we keep a tail pointer; easy solution
- enqueue(x), dequeue() - O(1), but while theoretically identical, the deque_vec array-backed implementation is slightly faster
Stack operations
- Build()
- Push(x) - push item to the top of the stack
- Pop() - get and remove item from the top of the stack
Sequence optional dynamic operations
- Delete(n) - delete item at position n
- Insert(n, x) - insert item at position n without losing the current value at position n; generally retains order otherwise
- Delete_last() - often called pop
- Insert_last(x) - often append or push
- Delete_first()
- Insert_first()
*** All O(n)
What is divide and conquer
-Divide the problem into two or more subproblems recursively until they are small enough to solve -The subproblems then combined into a solution of the larger problem
Steps to implement selection sort:
- Find the smallest element in the array, put it at position 0
- Find the next smallest element, put it at position 1, etc., until the array is sorted
- insert(x, n) would be O(n); swapping items can be done in O(1) time, so instead use swap(A, n, j), which takes the array A and swaps elements n and j
- Use a temporary variable in order to not overwrite the elements in the array
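These steps as a runnable Python sketch (variable names follow the pseudocode conventions in these notes):

```python
def selection_sort(a):
    """Repeatedly swap the smallest remaining element into the front of the unsorted part."""
    n = len(a)
    for pointer in range(n - 1):
        mindex = pointer
        for i in range(pointer + 1, n):
            if a[i] < a[mindex]:
                mindex = i               # track index of the smallest remaining element
        a[pointer], a[mindex] = a[mindex], a[pointer]  # O(1) swap, no O(n) insert needed
    return a
```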
what is a Monte Carlo algorithm
- Fixed or bounded runtime, but its correctness is a function of the randomness
implementation of Array Sequence
-Sequence backed by static array - a consecutive chunk of memory - Benefit: I can compute the memory address of any index of the array
Array Queue: Implementation
- With a deque_vec backing, insert_first and delete_first run in O(1) time
- enqueue(x), dequeue() - O(1)
Sequences core operations
- build - O(n)
- get(n) - read value at position n - O(1)
- set(n, x) - set value at position n to x
- iter() - iterate through items one by one in sequence order
- len() - number of items stored
Pseudo: Bubble Sort
procedure BUBBLE_SORT(A)
    for iteration <-- 0 to n-1 do
        any_change <-- false
        for i <-- 0 to n-2 do
            if A[i+1] < A[i] then
                swap(A, i, i+1)
                any_change <-- true
        if not any_change then
            break
Pseudo: Any Duplicates
procedure CONTAINS_DUPS(arr)
    for i in 0..len(arr) do
        for j in 0..len(arr) do
            if i != j and arr[i] == arr[j] then
                return true
    return false
Pseudo: Counting Sort
procedure COUNTING_SORT(A)
    u <-- max([x.key for x in A])
    D <-- [[] for i in 0..u]    // direct access array of chains
    for x in A do
        D[x.key].append(x)
    i <-- 0
    for chain in D do
        for x in chain do
            A[i++] <-- x
Pseudo: Base Digit Tuple
procedure DIGIT_TUPLE(x, n, c)
    high <-- x.key
    digits <-- []
    for i in 0..c do
        low <-- high % n
        high <-- high // n
        digits.append(low)    // builds the digits array from least significant to most significant
    return digits
Pseudo: Insertion Sort
procedure INSERTION_SORT(A)
    for i <-- 1 to n-1 do
        tmp <-- A[i]
        j <-- i-1
        while j >= 0 and A[j] > tmp do
            j <-- j-1
        for k <-- i down to j+2 do
            A[k] <-- A[k-1]    // shift elements to the right
        A[j+1] <-- tmp    // insert item in its sorted order in the left subarray
Pseudo: Merge
procedure MERGE(L, R, A)
    l, r, a <-- 0    // pointers for each array
    while l < len(L) or r < len(R) do
        if l == len(L) then
            A[a] <-- R[r++]    // copy R[r] into A[a], then increment pointer r
        else if r == len(R) then
            A[a] <-- L[l++]
        else if L[l] <= R[r] then
            A[a] <-- L[l++]
        else
            A[a] <-- R[r++]
        a += 1
Pseudo: Merge Sort
procedure MERGE_SORT(A)
    if len(A) <= 1 then
        return
    L, R <-- A[0:len(A)/2], A[len(A)/2:len(A)]
    merge_sort(L)
    merge_sort(R)
    merge(L, R, A)    // merge L and R into A
Pseudo: Partitioning
procedure PARTITION(A, pivot)
    swap(A, pivot, 0)    // move the pivot to the front
    pivot_val <-- A[0]
    low <-- 1
    high <-- n-1
    while low <= high do
        while low <= high and A[low] <= pivot_val do
            low += 1
        while low <= high and A[high] > pivot_val do
            high -= 1
        if high > low then    // only swap if the high pointer is still to the right of the low pointer
            swap(A, low, high)
    swap(A, 0, high)    // place the pivot in its final location
    return high
Pseudo: Quick Select
procedure QUICK_SELECT(A, k)
    pivot <-- rand_int(0, n)
    pivot_pos <-- partition(A, pivot)
    if pivot_pos == k - 1 then
        return A[pivot_pos]
    else if pivot_pos > k - 1 then
        return quick_select(A[0:pivot_pos], k)
    else
        return quick_select(A[pivot_pos+1:len(A)], k - pivot_pos - 1)
Pseudo: Quick Sort
procedure QUICK_SORT(A)
    pivot <-- rand_int(0, n)    // this is just one way of choosing a pivot
    pivot_pos <-- partition(A, pivot)
    quick_sort(A[0:pivot_pos])
    quick_sort(A[pivot_pos+1:n])
Pseudo: Selection Sort
procedure SELECTION_SORT(A)
    for pointer <-- 0 to n-1 do
        mindex <-- pointer
        for i <-- pointer to n do
            if A[i] < A[mindex] then
                mindex <-- i
        swap(A, mindex, pointer)
Pseudo: Swap
procedure SWAP(A, i, j)
    tmp <-- A[i]
    A[i] <-- A[j]    // if we didn't have tmp, we would be losing the value at A[i]
    A[j] <-- tmp
Pseudo: Radix Sort
procedure RADIX_SORT(A)
    u <-- max([x.key for x in A])
    c <-- ceil(log_n(u))    // number of base-n digits
    D <-- [Obj() for i in 0..n]
    for i in 0..n do
        D[i].digits <-- digit_tuple(A[i], n, c)
        D[i].item <-- A[i]
    for i in 0..c do
        for j in 0..n do
            D[j].key <-- D[j].digits[i]
        counting_sort(D)
    for i in 0..n do
        A[i] <-- D[i].item

runtime: O(c * n), which is O(n) when u = n^O(1)
Two main storage methods:
Array-based and pointer-based
steps to implement Insertion sort
- At each step, insert the next element into the left subarray in its sorted position, shifting larger elements of that subarray to the right
- To begin, the left subarray of length 1 is in sorted order by definition
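A runnable Python sketch of these steps (the scan-left-and-shift is combined into one loop):

```python
def insertion_sort(a):
    """Keep a[0:i] sorted; insert a[i] into place by shifting larger items right."""
    for i in range(1, len(a)):
        tmp = a[i]              # the next element to insert
        j = i - 1
        while j >= 0 and a[j] > tmp:
            a[j + 1] = a[j]     # shift larger element one slot right
            j -= 1
        a[j + 1] = tmp          # drop it into its sorted position
    return a
```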
Implementation of Dynamic Array Sequence
- An array based sequence implementation that is efficient for insert_last and delete_last - Allocate more memory than you need - Capacity > length(n)
Priority Queue core operations
- Build()
- Enqueue(x) - add item to queue; the priority value is often the item value
- Dequeue() - get and remove an item in priority order
Queue Operations
- Build()
- Enqueue(x) - add item to queue, sometimes called push
- Dequeue() - get and remove an item in FIFO order, sometimes called pop
Sets and Maps operations
- Build(x, y, z...) - no order to items
- Contains_key(k)
- Get(k) - if a map, returns the value associated with key k
- Insert(x) - insert an item, often insert(k, x) for maps
- Delete(k) - deletes item at key k
Dynamic Array Stack: Implementation
- Great bc insert_last and delete_last are O(1) in a dynamic array/vector - push/pop - O(1)
what is a Las Vegas algorithm?
- Involve randomness, but guarantee an optimal solution
- Runtime depends on randomness
- Example: quick sort
Common Asymptotic Runtimes
- O(1): constant time
- O(log n): logarithmic
- O(n): linear
- O(n log n): log linear
- O(n^2): quadratic
- O(n^c): polynomial
- O(2^n): exponential
Implementation of Linked List Sequence
- Pointer based data structure - Each element has a value and a memory address pointing to the next element - Can change order, insert, or delete items just by changing pointers
Asymptotic Bounds
- Upper bound (O) - most used - max growth rate of f(n)
- Lower bound (Ω) - min growth rate of f(n)
- Tight bound (Θ) - f(n) grows at the same rate as g(n)
What is amortized analysis?
- a method of analyzing the costs associated with a data structure that averages the worst operations out over time
Data Structure refers to
- a particular implementation of that interface(abstract data structure) which may have different run times for the operations involved
What is an algorithm?
- a series of steps that takes in an input and generates a correct output for a computational problem - a function of inputs and outputs - input size arbitrary length, algorithm fixed length
Dynamic Array Sequence aka list(Python):
- Array-based sequence implementation; efficient for insert_last and delete_last
- You will allocate more memory than you need (typically double the current capacity when you run out of space)
A real implementation would look like:
- Start with capacity 1
- On insert_last when full, allocate a new array of double the size and copy over
- After resizing to capacity 4, you get two more O(1) insertions before another O(n) insertion
Sorted array backed set: Implementation
- build() - O(n log n)
- get(k)/contains_key(k) - binary search! O(log n)
- insert(k), delete(k) - O(n)
Amortized Analysis Accounting Method
- Easiest amortized analysis
- For each operation, I will pay the cost of that operation, but I can also pay some additional into an account that will be used to pay my debts when I need to
- build - O(n)
- get(i) - O(1)
- insert_last(x)/delete_last() - O(1) amortized
- insert_first(x)/delete_first() - O(n)
- insert(i, x)/delete(i) - O(n)
Direct Access Array
- First step: imagine our keys are unsigned integers, or otherwise values that can be mapped onto word-length unsigned integers (we relax this constraint later)
- Define the max allowed key as u, for universe
- Allocate an array of size u and store the element with key ki at array[ki]
- Suddenly contains_key(k) and get(k) run in constant time
- Downside: wasting a lot of memory
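A minimal sketch of the idea in Python (class name and the choice of u = 100 are made up for the example):

```python
class DirectAccessSet:
    """Stores integer keys 0..u-1 at their own index; O(1) lookups, O(u) memory."""
    def __init__(self, u):
        self.slots = [None] * u           # one slot per possible key: the memory cost

    def insert(self, k):
        self.slots[k] = k                 # O(1): the key is its own address

    def contains_key(self, k):
        return self.slots[k] is not None  # O(1)

    def delete(self, k):
        self.slots[k] = None

s = DirectAccessSet(100)
s.insert(42)
assert s.contains_key(42) and not s.contains_key(7)
```

The array has 100 slots but holds one item, which is exactly the memory-waste trade-off the card describes.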
Asymptotic Notation Rules
- If two algorithms differ only by a constant factor, we ignore that in asymptotic notation
- If an algorithm has 2 terms, one that dominates the other, we drop the lower-order term
- There are three possible types of asymptotic bounds
Linked List Stack: Implementation
- Push/pop at the head of the list (insert_first/delete_first), both O(1); no tail pointer needed
- push/pop - O(1)
- Constant factor is higher than in a dynamic array, so while theoretically equivalent, not as good
Constant time operations
- Integer and floating point arithmetic: addition, subtraction, multiplication, division
- Logical operations: and, or, not, ==, >, <, <=, >=
- Bitwise operations: &, |, <<, >>, etc.
Simple array backed set: Implementation
- Just an array with keys; not the way to do things
- get(k), delete(k), contains_key(k) must search the entire array, giving O(n) time
- build is O(n) because we have stipulated that keys are unique
Amortized Analysis:
- Killing off some debt
- Many methods for amortized analysis
- Some prove worst-case average runtime; others make use of probabilities to argue average runtime, such as the quicksort analysis
- Easiest amortized analysis is the accounting method
What is a computational problem
- mapping from inputs to correct outputs given some property which makes that output correct for that input - arbitrary input size
what is partitioning?
Choose a "pivot" value, and moving all values less than that pivot to the left and everything greater to the right
What is bubble sort?
Compares elements next to one another and swaps them if the right element is less than the left element
steps to implement the substitution method
- Identify the Recurrence Relation: start with the recurrence relation, e.g., T(n) = a * T(n/b) + f(n)
- Make an Initial Guess: propose a guess for T(n), such as T(n) = O(n^k log^m n)
- Substitute the Guess: plug your guess into the recurrence relation
- Verify the Inequality: prove that your guess satisfies the relation; usually you show T(n) <= (some bound matching your guess)
- Adjust the Guess if Necessary: if the substitution doesn't work, refine your guess and try again
- Finish the Proof: use mathematical induction to prove the bound holds for all n
Two categories of randomized algorithms
Las Vegas algorithms and Monte Carlo algorithms
functions needed for merge sort:
Merge_Sort(L) and Merge(L, R, A)
Insertion sort runtime
O(n^2)
what is selection sorts runtime?
O(n^2)
bubble sort runtime
O(n^2) worst case. but it can break early if we get lucky
What is Quickselect?
Quick select is an algorithm for finding the kth smallest item in a list. Commonly this is used to find the median, i.e., the len(A)//2-th smallest element (for odd-length arrays; for most purposes a median off by one is fine).
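A short non-in-place Python variant of quick select (list comprehensions instead of in-place partitioning; k is 1-indexed here, and the three-way split handles duplicates):

```python
import random

def quick_select(a, k):
    """Return the k-th smallest element of a (1-indexed), expected O(n)."""
    pivot_val = a[random.randrange(len(a))]
    left = [x for x in a if x < pivot_val]
    mid = [x for x in a if x == pivot_val]
    if k <= len(left):
        return quick_select(left, k)              # answer is left of the pivot
    if k <= len(left) + len(mid):
        return pivot_val                          # the pivot itself is the answer
    right = [x for x in a if x > pivot_val]
    return quick_select(right, k - len(left) - len(mid))  # recurse on one side only
```

The result is deterministic even though the pivot is random; only the runtime depends on the randomness (a Las Vegas algorithm).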
What sorting method uses partitioning?
Quick sort
What type of algorithm is quick sort?
Quick sort is historically a randomized algorithm in its choice of the pivot, but modern implementations may vary
what's the difference between quick sort and quick select?
Quick select uses the same idea as quick sort, but only needs to recurse on one side of the partition
Efficiency
We want our algorithms to be correct and fast
Are all in-place algorithms destructive?
True
what is the pointer based storage method
non-contiguous blocks of memory. More flexible, but no random access: you can only reach an element by following pointers. (With array storage, by contrast, you compute an element's address as the address of the array + i * sizeof(type).)
Quick Sort Steps
- Choose a pivot
- Partition the array against the pivot
- Recursively do the same to the subarrays on each side
Steps to merge sort
- Recursively split the data until we hit singleton arrays
- Once we return up the call stack, we are left with two sorted arrays
- Merge:
  - Use two pointers, one for each sorted array
  - At each step, take the minimum element from the two sorted arrays and add it to the merged array
  - Then increment the pointer associated with the value you just added
  - Proceed until all elements from both arrays have been added to the merged array
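The steps above can be sketched as a compact in-place Python function (the merge is folded into the same function for brevity):

```python
def merge_sort(a):
    """Recursively split a, sort the halves, then two-pointer merge back into a."""
    if len(a) <= 1:
        return
    mid = len(a) // 2
    L, R = a[:mid], a[mid:]     # copies of each half
    merge_sort(L)
    merge_sort(R)
    l = r = 0
    for i in range(len(a)):     # merge: always take the smaller front element
        if r == len(R) or (l < len(L) and L[l] <= R[r]):
            a[i] = L[l]
            l += 1
        else:
            a[i] = R[r]
            r += 1
```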
what is the recursion Tree method?
Visualize the recursion tree, count the number of operations at each level, and count how many levels there are
when do we apply the master method?
When a recurrence relation takes the form T(n) = aT(n/b) + f(n), we apply this method
What is an in-place algorithm?
one that only uses O(1) extra memory
What is a destructive algorithm?
one that overwrites the input array
Algorithm: Binary Search
procedure BINARY_SEARCH(A, x)
    if |A| == 0 then
        return false
    mid <-- |A| / 2
    if x == A[mid] then
        return mid
    else if x < A[mid] then
        return binary_search(A[0..mid], x)
    else
        return binary_search(A[mid+1..|A|], x)    // note: an index returned here is relative to the subarray
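An iterative Python variant of the same idea (avoids the slice-index bookkeeping of the recursive version; returns the index if found, else False):

```python
def binary_search(a, x):
    """Search sorted list a for x in O(log n); return an index of x, or False."""
    lo, hi = 0, len(a)            # current search window is a[lo:hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] == x:
            return mid
        if x < a[mid]:
            hi = mid              # discard the right half
        else:
            lo = mid + 1          # discard the left half and the midpoint
    return False
```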
