Set 4, 5
how do we typically characterize algorithm efficiency?
"Order of Growth" of running time: looking at a particularly large n, only keep the term where n grows the fastest, and ignore any constant coefficient. so for an^2 + bn + c, we get n^2
what is the theta and O running time of Merge?
Θ(r − l), the number of elements in the (sub)sequence, so linear.
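A minimal sketch of a linear-time Merge, assuming the usual signature Merge(A, l, m, r) merging the sorted subarrays A[l..m] and A[m+1..r]; the <= comparison is what makes it stable, as discussed further below:

```python
def merge(A, l, m, r):
    # Merge sorted A[l..m] and A[m+1..r] in place; Θ(r - l) work overall.
    left, right = A[l:m + 1], A[m + 1:r + 1]
    i = j = 0
    for k in range(l, r + 1):
        # <= keeps equal elements in their original order (stability)
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            A[k] = left[i]
            i += 1
        else:
            A[k] = right[j]
            j += 1
```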
four steps of algorithm design
- clarify what inputs and outputs are
- clarify the relationship between input and output
- define a sequence of elementary operations that can transform input into output
- prove that the algorithm is correct
what are examples of algorithms that use Divide and Conquer? describe what each does
- Mergesort: recursively splits the input in two until it reaches unit components that can be compared, then merges them back together in sorted order
- Quicksort: finds a pivot position via the Partition step, which uses two counters (j, which advances every round, and i, which advances only when a swap occurs). During partitioning, some elements get swapped and thus partially sorted. After placing the pivot, the function recurses on the left and right subarrays of that pivot point
- Binary search: assuming the list is sorted, it repeatedly halves the list until it finds the target element (sketch below)
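A minimal binary search sketch (iterative, returning an index or -1; names are illustrative):

```python
def binary_search(A, target):
    # A must be sorted; halve the search range until target is found.
    lo, hi = 0, len(A) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if A[mid] == target:
            return mid
        elif A[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not present
```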
what is the time complexity of each part of Counting Sort?
- build the table in Θ(n + k) time
- initialize table in Θ(k) time
- count how often each key occurs in Θ(n) time
- turn counts into cumulative sums in Θ(k) time
- overall runs in Θ(n + k)
what are the steps to greedy fractional KP??
- calculate the profit density of each item
- sort the items by profit density (descending)
- take as much of each item as possible: each time, check if the item's weight plus the total weight already in the KP is less than or equal to the weight capacity. If yes, take the whole item; if no, take the fitting fraction and stop (sketch below)
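A minimal sketch, assuming items are given as (profit, weight) pairs; names are illustrative:

```python
def fractional_kp(items, capacity):
    # items: list of (profit, weight) pairs
    # Sort by profit density p/w, descending: the O(n log n) dominant step.
    items = sorted(items, key=lambda pw: pw[0] / pw[1], reverse=True)
    total_profit, remaining = 0.0, capacity
    for profit, weight in items:
        if weight <= remaining:          # whole item fits
            total_profit += profit
            remaining -= weight
        else:                            # take the fitting fraction and stop
            total_profit += profit * remaining / weight
            break
    return total_profit
```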
pros and cons of Quicksort vs Mergesort
- quicksort requires less memory (not quite in-place, but almost)
- quicksort is usually faster (even though its worst-case asymptotic complexity is worse)
- mergesort is stable; quicksort is not (for running time purposes)
what are the three major properties of sorting algorithms?
- running time
- stability: whether it preserves the original order of EQUAL elements, so if two things are equal they will never be switched
- sorting in place: whether the extra memory grows with the input length or only constant additional memory is needed
what are the criteria that say an algorithm is correct?
- it terminates for every valid problem instance
- and transforms it into a valid (correct) output
RAM Model is a good assumption for computer model given.. (4 things)
- we limit ourselves to small numbers (integers) and limited precision (floats)
- the data fits into working memory
- we analyze sequential algorithms
- we discard constant factors anyway
how does Quicksort work?
1. Randomly pick a pivot value (or better: pick the 1st, last, and middle element, and go with the median value)... (our version takes the last element)
2. Compare all the rest of the elements to this value
3. If they are greater than this value, put them to the right of it
4. If they are less than this value, put them to the left
5. Put a "Low" counter at the first element, a "High" counter at the last
6. Advance Low forward until a value higher than the pivot is found, and advance High backward until a value lower than the pivot is found
7. When these two conditions are met, swap these numbers and repeat the process
8. When the Low and High counters line up, swap their value with the pivot
9. Now repeat the process for each partition: the left part and the right part (see the sketch below)
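The earlier card describes Partition with i/j counters and a last-element pivot; here is a minimal Lomuto-style sketch of that version (an illustrative reconstruction, not necessarily the course's exact pseudocode):

```python
def partition(A, l, r):
    pivot = A[r]                  # our version takes the last element
    i = l - 1                     # i advances only when a swap occurs
    for j in range(l, r):         # j advances every round
        if A[j] <= pivot:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]   # put the pivot in its final position
    return i + 1

def quicksort(A, l, r):
    if l < r:
        p = partition(A, l, r)
        quicksort(A, l, p - 1)    # recurse on the left part
        quicksort(A, p + 1, r)    # and the right part
```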
what kind of problems is dynamic programming suitable for? describe
1. ones that exhibit an optimal substructure (the smaller optimal solutions that make up the larger optimal solution exhibit the same structure)
2. ones in which sub-problems overlap/occur repeatedly
What is the smallest possible depth of a leaf in the decision tree (and why should we care about this question)?
n − 1: every element must take part in at least one comparison, so no leaf can be shallower than n − 1. This only yields a linear lower bound on the best case, which is why the Ω(n log n) bound argues via the tree's height (worst case) instead.
compare and contrast Divide and Conquer vs. Dynamic Programming
Divide and Conquer is top-down: start with a big problem, split it again and again, solve the smaller parts, and then combine the solutions until you have reached the overall solution. Dynamic Programming is bottom-up: first solve the smaller subproblems, tabulate them, then combine the tabulated solutions into the bigger solution.
How do we do change making with divide and conquer? what is the downside?
Divide and Conquer: recursively find the least number of coins required after each possible reduction (each possible coin takeaway value). very inefficient due to strong overlap
Give an example of backtracking in dynamic programming
In the dynamic change-making algorithm, we only keep a table of the minimum number of coins required for each amount, but not the coin that got us to each amount. If we include another column C with that information, then given the amount z, we add the coin C(z), then the coin C(z − C(z)), etc. (sketch below)
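A minimal sketch of dynamic change making with the backtracking column C (the names T, C, z, coins are illustrative; assumes the amount is payable with the given coins):

```python
def make_change(z, coins):
    # T[a] = minimum number of coins to pay amount a; C[a] = last coin used
    INF = float("inf")
    T = [0] + [INF] * z
    C = [0] * (z + 1)
    for a in range(1, z + 1):
        for c in coins:
            if c <= a and T[a - c] + 1 < T[a]:
                T[a] = T[a - c] + 1
                C[a] = c
    # backtrack: read off the coins that got us to z, in O(n) steps
    result = []
    while z > 0:
        result.append(C[z])
        z -= C[z]
    return result
```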
Can you think of a situation in which we can sort elements without comparing them (to beat the fundamental lower bound of comparison sort)?
Integers in a reasonably small range [0, ... , k]
compare and contrast the partitioning and merging in Mergesort vs Quicksort
Mergesort does trivial partitioning with non-trivial merging of the data. Quicksort does non-trivial partitioning with trivial merging of the data
what is the running time of ApproxKP algorithm?
O(n log n)
what is the running time of the fractional KP greedy algorithm and what is it dominated by?
O(n log n), dominated by the sorting step
of the sorting algorithms, what is the most widely used?
Quicksort
despite the fundamental lower bound, why do we usually choose quicksort over a linear sort?
Quicksort is still often faster due to constant factors or cache effects
what is the solution if we want to use linear sorting (like Counting Sort) but for bigger integers? (too large k)
Radix sort: counting sort in multiple passes, digit by digit, starting with the least significant one
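A minimal radix sort sketch, one stable counting-sort pass per digit (base k and digit count d as parameters; an illustrative reconstruction):

```python
def radix_sort(A, d, k):
    # Sort non-negative integers with d digits in base k,
    # least significant digit first.
    for p in range(d):
        digit = lambda x: (x // k ** p) % k
        count = [0] * k
        for x in A:                    # count occurrences of each digit
            count[digit(x)] += 1
        for i in range(1, k):          # cumulative sums
            count[i] += count[i - 1]
        out = [0] * len(A)
        for x in reversed(A):          # backwards pass keeps each pass stable
            count[digit(x)] -= 1
            out[count[digit(x)]] = x
        A = out
    return A
```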
what is RAM? describe
Random Access Machine: uses constant time operations for computation, data movement, and control flow
define divide and conquer and its three phases
Recursively break down a problem into smaller ones until it becomes very easy:
1. divide: define subproblems
2. conquer: solve subproblems recursively
3. combine: turn the results from 2. into a solution for the whole problem
What is memoization?
Storing the result of a computation to use it later
what does T(n) mean
T is a function that maps input size n to the worst-case running time over all inputs of that size; for any concrete n it is just a number.
how does the dynamic algorithm for the knapsack problem work?
Tabulate the maximum profit as P(i, w), with i being the number of objects considered and w a weight budget. We construct the table for all weights up to the capacity. Constructing P(1, w) is trivial; for i ≥ 2, you decide between taking object i (new profit and remaining weight) and not taking it (keep P(i − 1, w)), and record the better option (sketch below).
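A minimal sketch of the tabulation (0/1 knapsack; 0-based Python lists stand in for the 1-based P(i, w) table):

```python
def knapsack(profits, weights, W):
    n = len(profits)
    # P[i][w] = max profit using items 1..i within capacity w
    P = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            P[i][w] = P[i - 1][w]                      # skip item i
            if weights[i - 1] <= w:                    # item i fits
                take = P[i - 1][w - weights[i - 1]] + profits[i - 1]
                P[i][w] = max(P[i][w], take)
    return P[n][W]
```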
how do we optimize storage in dynamic KP?
We can save memory: only the previous row of the profit table is needed at any time, so Θ(W) values suffice, plus Θ(nW) bits recording for each entry whether the item was taken (needed to backtrack the solution).
define dynamic programming
a variant of divide and conquer for cases in which subproblems overlap or occur repeatedly
what is the limitation of Order of Growth analysis?
a very large coefficient (that we end up dropping) could render an algorithm impractical
explain how dynamic programming works
a DP algorithm recursively defines an optimal value in terms of the optimal values of smaller problems, storing computed values in a table along the way (memoization); the optimal solution itself then has to be constructed from information stored along the way
what is an invariant
an invariant is a property of a mathematical object that remains unchanged after operations or transformations of a certain type are applied to the object
explain the knapsack problem (fractional included)
as a thief, you want to fill a knapsack of limited weight capacity so that the total profit is maximized; objects with high profit density (p/w) give the most value per unit of weight. In the fractional variant, it is permitted to take parts of objects.
why is it okay to say that something that runs in O(logn) also runs in O(n)? could we say the same with theta?
because it's true: the running time grows no faster than n; the statement is just not precise. We cannot say the same with Θ, because Θ is a tight-bound description of a specific situation; with Θ we should name the case (e.g. worst case) we are describing.
what is the running time of each level of the mergesort recursion tree
each node at level i costs cn/(2^i), where n is the input size and i is the level index from 0; with 2^i nodes per level, every level costs cn in total (see below)
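Summed over all levels, this yields the Θ(n log n) total:

```latex
2^i \cdot \frac{cn}{2^i} = cn \text{ per level},\qquad
(\log_2 n + 1)\text{ levels}
\;\Rightarrow\; T(n) = cn(\log_2 n + 1) = \Theta(n \log n)
```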
define a comparison sort
does not assume anything about the objects except that they can be compared. Based on the idea that overloading the < operator can enable sorting
what does it mean if f = ω(g), i.e. g = o(f)?
f grows asymptotically faster than g
what does it mean if f = o(g), i.e. the limit of f(n)/g(n) as n → ∞ is 0?
f grows asymptotically slower than g
describe algorithm analysis
find out the efficiency of the algorithm in terms of computational demands, memory requirements, and running time
how can we prove optimality of the dynamic change making algorithm?
go backwards / proof by contradiction: if some solution needed fewer coins for an amount, removing its last coin would yield a better solution for a smaller amount, contradicting the optimality already established for that smaller amount
what does running time with theta mean
grows asymptotically exactly as fast (tight bound); we use this to describe a specific situation (worst case, best case, etc.), not as a blanket statement
describe running time of nlogn
grows faster (takes longer) than linear, but slower (not as long) than quadratic polynomials like n^2
define Approximation Algorithm
an algorithm that is guaranteed to achieve at least a certain fraction of the optimal solution
The solution from ApproxKP provides at least HOW MUCH OF the optimal profit
half
how do we know the time complexity of the lower bound for comparison sorts? (proof)
if we assume the correct ordering is unique, then a comparison sort corresponds to a decision tree which must select the correct permutation out of n! possibilities. The height h of the decision tree is the worst-case number of required comparisons, and thus a lower bound on the running time. With l leaves in a binary tree of height h: n! ≤ l ≤ 2^h, hence h ≥ log2(n!) = Ω(n log n) (see below).
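The last step, worked out (a standard estimate; Stirling's formula gives the same bound):

```latex
h \;\ge\; \log_2(n!) \;=\; \sum_{i=1}^{n} \log_2 i
\;\ge\; \frac{n}{2}\log_2\frac{n}{2} \;=\; \Omega(n \log n)
```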
are mergesort and insertion sort in place?
insertion sort is in place because it sorts within the input array and needs only constant additional memory. mergesort is NOT in place: the more input, the more temporary arrays/lists you need during merging (Θ(n) additional space required)
what is the advantage of dynamic programming over divide and conquer?
it avoids recomputation in case of overlapping subproblems
what is the modification (ApproxKP) we do to IntGreedyKP
it compares the total profit of the greedy solution against the profit of the single most valuable item, and returns whichever is higher. This guarantees getting at least 50% of the optimal profit (sketch below).
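A minimal sketch under this standard formulation of ApproxKP (greedy by density, then compare against the best single item; names are illustrative):

```python
def approx_kp(items, capacity):
    # items: list of (profit, weight); 0/1 knapsack, greedy by density
    items = sorted(items, key=lambda pw: pw[0] / pw[1], reverse=True)
    greedy_profit, remaining = 0, capacity
    for profit, weight in items:
        if weight <= remaining:
            greedy_profit += profit
            remaining -= weight
    # best single item that fits on its own
    best_single = max((p for p, w in items if w <= capacity), default=0)
    return max(greedy_profit, best_single)  # >= half the optimal profit
```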
what does running time with Ω mean?
it grows asymptotically at least as fast (lower bound); we can say this as a blanket statement for ALL CASES
what does running time with O mean?
it grows asymptotically no faster than that (upper bound); we can use this as a blanket statement to describe ALL CASES
what is programming (as opposed to algorithm design)
it is the implementation of an algorithm in some programming language; a programming language by itself is not suitable for the design and analysis of non-trivial algorithms
why don't we just change our Quicksort implementation?
it would come at a cost: the modified version would take longer
explain how Counting Sort works
its goal is to sort in linear time, to defeat the fundamental lower bound of comparison sort. It creates a lookup table mapping each input integer to its correct output position. Then you go backwards through the input, put each value at the position specified by the table, and decrement that table entry (sketch below).
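A minimal sketch of the passes listed earlier (integers in the range [0, k] assumed):

```python
def counting_sort(A, k):
    # Sort a list of integers in range [0, k] in Θ(n + k) time.
    count = [0] * (k + 1)            # initialize table: Θ(k)
    for x in A:                      # count occurrences: Θ(n)
        count[x] += 1
    for i in range(1, k + 1):        # cumulative sums: Θ(k)
        count[i] += count[i - 1]
    out = [0] * len(A)
    for x in reversed(A):            # backwards pass keeps the sort stable
        count[x] -= 1
        out[count[x]] = x
    return out
```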
is Quicksort in-place?
kind of. in our implementation, Partition does not require Θ(n) extra space, but the recursion stack requires Θ(log n) extra space
what is the height of a mergesort recursion tree
log2(n), where n is the input length, i.e. the number of nodes in the bottom (widest) layer
which technically grows faster- log8(n) or log2(n)?
log2(n), but only by a constant factor: log2(n) = 3·log8(n), which is why the base can be ignored asymptotically
why can we omit logarithm bases in O notation running times?
log_a(n) = Θ(log_b(n)); in other words, log_a(n) and log_b(n) differ only by a constant factor log_b(a), because log_a(n) = log_b(n)/log_b(a), and log_b(a) is a constant that can be ignored for these purposes
since Ω(n log n) is the fundamental lower bound for comparison sorts, what does this mean about the sorting algorithms we know?
merge sort is an asymptotically optimal comparison sort
are insertion sort and mergesort stable?
mergesort is stable because it breaks things down to units and only switches two elements if one is strictly less than the other: the pseudocode in line 8 says that if the left component is less than or EQUAL to the right component, the left component goes first, so equal components won't be switched. insertion sort works similarly: it only moves a component to an earlier part of the list if it is strictly less than the one on its left; if they are equal, it does not move past it (it is only shifted along with its neighbors). it is also stable
what are the most common functions we see in Order of Growth analyses of running times?
in order of growth: 1, log2(n), root n, n, n·log2(n), n^2, 2^n, n!
what is the theta and O running time of Mergesort
Θ(n log n) in every case, and therefore also O(n log n)
will the optimum of dynamic KP always be unique?
no
is counting sort in-place?
no, because it requires the extra counting table (Θ(k)) and an output array (Θ(n))
for Quicksort, is it worth taking the time to find the median and assigning it the pivot?
no, because finding the exact median would just increase the running time. a good compromise is to take the median of 3 (or some other small number of) elements.
is Quicksort stable?
no, it is not stable: pseudocode line 4 says that if an element is less than or equal to the pivot, it gets swapped, meaning equal elements can be swapped past each other regardless of their original order
what is the worst case lower bound time complexity for comparison sorts?
Ω(n log n)
what is dynamic programming usually used for?
optimization problems
how does IntGreedyKP do?
pretty poorly because we can no longer pack fractions, and the profit densities become misleading- one single very valuable object can trick the algorithm
What is a Quicksort trick we could use to speed up our implementation?
randomly choose an element and swap it with the last element at the beginning of Partition: Θ(1) effort, and it protects against the presorted worst case
what are the two main strategies for analyzing running time?
recursion tree and mathematical induction OR the Master Theorem
what does it mean to say O(1)?
running time does not depend on input size
in our version of Quicksort, when do we get the worst case running time?
since we choose the rightmost element as the pivot, it happens when the input is already correctly presorted
if we analyze an algorithm, what are we assuming?
the computer model
why was dynamic programming given its name?
the creator, Richard Bellman, said that the Secretary of Defense hated the word "research" so he coined it "dynamic programming" instead
in the change-making problem, what does optimality of the greedy algorithm depend on?
the denominations (coins available)
in Quicksort, what is the worst case running time? What does it depend on?
the running time depends on which pivot is chosen. the worst case happens when the largest or smallest element is chosen, because the recursion tree will be maximally unbalanced (so it partly comes down to the luck of the input). the worst case ends up being Θ(n^2)
what is the running time of binary search?
Θ(log n)
analyze the running time of insertion sort
Θ(n) best case, when the input is already sorted (the inner while loop never moves anything); Θ(n^2) worst case, for descending-order input (sketch below)
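For reference, a minimal insertion sort sketch:

```python
def insertion_sort(A):
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        # shift larger elements right; on sorted input this loop never runs
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key
```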
what is the expected Quicksort running time for random inputs?
Θ(n log n)
what is a greedy algorithm and what advantages do they hold?
they are algorithms that solve a problem in a sequence of steps, making a locally optimal (greedy) choice at each step. they are fast, easy to implement, and sometimes even end up finding the globally optimal solution
For Quicksort, Why not simply create and concatenate two lists?
this would require more and more memory (Θ(n) extra space), giving up quicksort's near-in-place advantage
given d digits in a k-ary number system, what is the time complexity and memory of Radix sort?
time: O(d(n + k)), memory: O(n + k)
In line 7 of the dynamic KP pseudocode, we record whether the current element was included. What does this allow us to do?
track back the solution in O(n) time
how do we show optimality of greedy KP algorithm?
use an exchange argument: transform an arbitrary optimal solution x* step by step towards the greedy solution and show that the profit is never reduced
define an algorithm
unambiguous specifications that can be followed to transform a specific input into a desired output in a finite number of steps
if you want to sort dates, what is the best method?
using Radix, because you can sort by day, then month, then year, in that order (least to most significant)
when is sorting stability the most relevant
when sorting key value pairs or sorting the same thing multiple times for comparison
In Quicksort, when do we expect the best case running time? What is the best case?
when the exact median is chosen as the pivot, because it creates a perfectly balanced recursion tree. the best case is Θ(n log n)
how do we define a problem Π
by specifying which inputs to expect and to which outputs they should be mapped
is counting sort stable?
yes, in our implementation it is, because we go backwards through the input: of two equal elements, the one that comes second in the input gets assigned the later output position
what is the time complexity of the dynamic change making algorithm?
Θ(z ⋅ M), with z being the amount to pay and M the number of denominations