Algorithms Midterm
O(n log n)
"n-log-n"; mergesort
𝐺 in single-source shortest path
G = (V; E) must be a weighted, undirected graph
L_{s,v} in single-source shortest path
(s, ..., v) is a shortest path from start vertex s to end vertex v; s, v ∈ V
Example of a spanning tree
0      1
|      |
2 ---- 3
|      |
4      5
Lehmer's algorithm
O(n / log(n))
steps of algorithm design
1. Be clear about what problem your algorithm will solve.
2. Pick a pattern.
3. Produce a first draft using the selected pattern.
4. Revise the draft by filling in blanks, eliminating potential infinite loops, or clarifying unclear passages.
5. Repeat until your pseudocode is correct, clear, and able to terminate.
6. Write a final, clean draft, ensuring your pseudocode will be clear to others.
7. Prove that your algorithm is correct, working from your final draft.
8. Prove the efficiency of your algorithm, working from your final draft.
Steps of Dijkstra's algorithm with min-heap PQ
1. Make a min heap PQ of size V, where V is the number of vertices in the given graph. Every node in the min heap holds a vertex number and that vertex's distance value.
2. Assign the source vertex/root a distance of 0 and every other vertex a distance of infinity.
3. While the min heap is not empty: get the vertex u with minimum distance from the min heap, check whether the vertices adjacent to u are still in the min heap, and update an adjacent vertex's distance value if it is in the min heap and its distance value is greater than u's distance value plus the weight of the u-adjacent edge.
Steps of Kruskal's Algorithm
1. Sort the edges in non-decreasing order of their weight.
2. Pick the smallest edge. Include it if it doesn't form a cycle with the spanning tree built so far.
3. Ignore the edge if it forms a cycle.
4. Repeat until the spanning tree has n-1 edges.
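A minimal runnable sketch of these steps, assuming the graph is given as a vertex count plus a list of (weight, u, v) edges and using a simple union-find structure for cycle detection (both details are illustrative assumptions, not part of the card):

```python
def kruskal_mst(num_vertices, edges):
    """Return a list of (weight, u, v) edges forming an MST (or forest)."""
    parent = list(range(num_vertices))      # union-find used to detect cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    mst = []
    for weight, u, v in sorted(edges):      # 1. sort edges by weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # 2. keep the edge if it forms no cycle
            parent[root_u] = root_v
            mst.append((weight, u, v))
            if len(mst) == num_vertices - 1:    # 4. stop at n-1 edges
                break
    return mst                              # 3. cycle-forming edges were skipped

# Example: a square 0-1-2-3 plus a heavy diagonal
print(kruskal_mst(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (10, 0, 2)]))
# [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
```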
Obtaining MST using Prim's Algorithm
1. Start at any node in the graph. That node is reached while everything else is unreached.
2. Find an edge of minimum cost that connects a reached node with an unreached one.
3. Add the edge to the minimum-cost spanning tree and mark the unreached node as reached.
4. Repeat steps 2 and 3 until all the nodes in the graph are reached.
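A short Python sketch of this procedure, assuming the graph is an adjacency-list dict {vertex: [(weight, neighbor), ...]} and using heapq to find the cheapest edge leaving the reached set (these representation choices are assumptions):

```python
import heapq

def prim_mst(adj, start):
    reached = {start}                                   # 1. start node is reached
    frontier = [(w, start, v) for w, v in adj[start]]   # edges leaving the reached set
    heapq.heapify(frontier)
    mst = []
    while frontier and len(reached) < len(adj):
        w, u, v = heapq.heappop(frontier)   # 2. minimum-cost edge out of the frontier
        if v in reached:
            continue                        # both ends already reached; skip
        reached.add(v)                      # 3. add the edge, mark node as reached
        mst.append((w, u, v))
        for w2, x in adj[v]:                # 4. extend the frontier and repeat
            if x not in reached:
                heapq.heappush(frontier, (w2, v, x))
    return mst

graph = {0: [(1, 1), (4, 2)], 1: [(1, 0), (2, 2)], 2: [(4, 0), (2, 1)]}
print(prim_mst(graph, 0))   # [(1, 0, 1), (2, 1, 2)]
```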
How Dijkstra's algorithm works
Maintain two lists, visited and unvisited; everything except the source vertex starts in unvisited.
1. Remove from unvisited the vertex with the least cost (this becomes the current vertex; all others must have greater or equal cost).
2. Add current to the visited set.
3. If an adjacent vertex adj is in visited, the minimum cost of reaching it is already known.
4. If adj isn't in visited, compute the new cost of arriving at adj by traversing the edge from current to adj.
exhaustive search
Two parts: enumerating all objects to be tested, and testing the objects
operations on dynamic sets
2 types: queries and modifying operations
Example of where greedy algo may not produce the optimal solution
        7
      /   \
     3     12
    / \   /  \
  99   8 5    6
problem notation
<problem name>: Input: <definition of input objects> Output: <definition of output objects>
acceptable solution
A candidate solution that constitutes a correct output for a given problem instance; a candidate solution with an incorrect output is unacceptable
multiple valid spanning
A graph may have _____ trees.
exchanges in permutation algorithms
A natural way to permute an array of elements on a computer is to exchange two of its elements. The fastest permutation algorithms operate in this way
boolean circuit assignment
An assignment A over k variables is a list of k Boolean True or False values A = <a0, a1, ..., ak-1>, where each ai corresponds to a value that may be assigned to variable xi in a corresponding k-variable Boolean circuit
spanning tree
A subgraph T = (V; K) of a graph G = (V; E) where K ⊆ E is a subset of edges in the main graph; the subgraph is connected and acyclic
ADT vs its implementation
ADT is language independent and needs to be implemented in an appropriate computer language
American Change Making Problem
Addresses question of finding the minimum number of coins that equal a specific monetary value using the greedy technique
Spanning Trees Properties
All possible spanning trees in graph G have the same number of edges and vertices.
Prim's Algorithm
Always yields a single connected component; works only on connected graphs
Johnson-Trotter Algorithm
An element x in the current permutation is mobile when the adjacent element in the direction x is looking exists and is smaller than x
generating permutations
Any sequence of n elements has exactly n! distinct permutations.
Dijkstra's algorithm
Applies to directed and undirected graphs with nonnegative weights only.
pseudocode checklist: vagueness
Are any steps vague? Check that every line of pseudocode could be translated into program code without elaboration
pseudocode checklist: undefined variables
Are any variables used before they are defined or initialized?
pseudocode checklist: input and output
Are the algorithm's inputs and outputs clear, and explicitly separated from other variables? Arguments should correspond to problem inputs; return value corresponds to problem output.
Limitations of greedy techniques
Because it always goes for the immediate greatest return, it won't look for alternate paths that could yield the optimal solution
boolean circuit
C = (T, X) is a directed tree T = (V, E) with a set of k indexed variables X = {x0, ..., x(k-1)}, where k < n and n = |V| vertices
O(|V|²)
Complexity of first step of Dijkstra's algorithm
data
Consists of finite mathematical objects that can be represented by strings of binary 0 and 1 digits
pseudocode checklist: dead code
Delete code that is never executed.
What to do when given T(n) and f(n)
Do the same steps as induction and check whether c(f(n)) is an upper bound of T(n), i.e., whether T(n) ∈ O(f(n))
pseudocode checklist: defined return value
Does every execution path have a defined return value?
pseudocode checklist: loop termination
Does every loop have a termination condition that prevents infinite loops?
pseudocode checklist: base case
Does every recursive function have a clearly-defined base case?
pseudocode checklist: return value data type
Does the data type of every returned value match the output in the problem definition?
pseudocode checklist: handles all cases
Does your algorithm have the potential to return every kind of valid output?
Random-Access Machine (RAM)
Each memory access takes 1 step
subset generating problem
Each subset contains at most 𝑛 elements
subset generating problem
Each 𝑛-bit integer value will map to one set 𝑆𝑖 containing up to 𝑛 elements; that means that each bit of 𝑖 will map to one particular element of 𝑈
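A small illustration of that bit-to-element mapping, where each integer i in [0, 2^n - 1] selects the subset whose elements correspond to the 1-bits of i (the generator name and example universe are assumptions):

```python
def subsets(U):
    U = list(U)
    n = len(U)
    for i in range(2 ** n):     # one integer per subset S_i
        yield [U[b] for b in range(n) if (i >> b) & 1]   # bit b of i picks U[b]

for s in subsets(['a', 'b', 'c']):
    print(s)   # [], ['a'], ['b'], ['a', 'b'], ['c'], ...
```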
ADT vs its implementation
Ex: implementing a stack with push and pop functions using either an array or a linked list
How Fibonacci Heap helps reduce MH Dijkstra's algorithm complexity
A Fibonacci heap (FH) takes O(1) time for the decrease-key operation, while a binary heap (BH) takes O(log n)
General format for pure selection sort
Find the smallest elem in unsorted U, remove it, and append to S. Repeat until nothing is left unsorted
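A sketch of this pure form in Python; the function name and the use of the built-in min and remove are illustrative assumptions:

```python
def selection_sort(U):
    unsorted_part = list(U)             # copy so U itself is left unchanged (pure)
    S = []
    while unsorted_part:
        smallest = min(unsorted_part)   # find the smallest remaining element
        unsorted_part.remove(smallest)  # remove it from the unsorted part
        S.append(smallest)              # append it to the sorted output
    return S

print(selection_sort([5, 2, 9, 1]))   # [1, 2, 5, 9]
```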
Dijkstra's algorithm with MH
Finding and updating each adj vertex's weight in MH = O(log(V)) + O(1) = O(log(V))
O(|V|) time
Finding the next current vertex in Dijkstra's algorithm
Why the complexity of Dijkstra's algorithm's first step is O(|V|²)
Finding the next current vertex takes O(|V|), and this happens |V| times
Dijkstra's algorithm
Finds the shortest paths to a graph's vertices in order of their distances from a given source.
proving efficiency classes with properties of O
For any complexity functions T(n) and f(n) with T(n) ≤ f(n), T(n) ∈ O(f(n))
MST Theorem
For any connected graph G = (V; E) with n = |V| vertices, there exists a spanning tree 𝑇 = (𝑉; 𝐾) for G. If n > 0 then any such T has exactly |𝐾| = 𝑛 − 1 edges.
Recursive permutation algorithm
Heap's algorithm (1963): updated version of the Steinhaus-Johnson-Trotter algorithm
Examples of greedy algorithms
Huffman encoding and Dijkstra's algorithm
real-valued function; |F(x) - L| < ε whenever x > k
If F(x) is a _____, then the limit of F(x) as x→∞ is L
univariate; 𝑇(𝑛) ∈ 𝑂(𝑓(𝑛))
If T and f are ______ complexity functions, f(n) > 0, and the limit as n→∞ of T(n)/f(n) = L (non-negative and constant w/ respect to n), then _____
pseudocode checklist
If a piece of pseudocode fails any of these tests, it is not good enough to specify an algorithm and needs more work.
pseudocode checklist
If pseudocode passes all these tests, it is probably, but not necessarily, at least adequate
Binary GCD algorithm
If u and v are both even, gcd(𝑢, 𝑣) = 2 gcd(𝑢/2, 𝑣/2). If u is even and v is odd, gcd(𝑢, 𝑣) = gcd(𝑢/2, 𝑣). Otherwise both are odd, and gcd(𝑢, 𝑣) = gcd(|𝑢 − 𝑣|/2, 𝑣)
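A recursive Python sketch of these rules; the zero base cases and the use of min() in the both-odd case (so the recursion always shrinks) are assumptions added to make the sketch runnable:

```python
def binary_gcd(u, v):
    if u == 0:
        return v
    if v == 0:
        return u
    if u % 2 == 0 and v % 2 == 0:
        return 2 * binary_gcd(u // 2, v // 2)   # both even
    if u % 2 == 0:
        return binary_gcd(u // 2, v)            # u even, v odd
    if v % 2 == 0:
        return binary_gcd(u, v // 2)            # u odd, v even (by symmetry)
    # both odd: |u - v| is even; pair the halved difference with the smaller value
    return binary_gcd(abs(u - v) // 2, min(u, v))

print(binary_gcd(48, 18))   # 6
```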
General form of MST Pseudocode
Initialize K as an empty set, and while K doesn't span G, greedily choose edges and add them to K. Return K once it forms a spanning tree
pseudocode checklist
Input and output, undefined variables, variable meanings, defined return value, return value data type, handles all cases, loop termination, base case, repetitive code, dead code, vagueness
American Change Making Problem
Input: some number of cents k > 0; Output: a list V of coin values such that the sum of V equals k and the length of V is minimal
pseudocode checklist: variable meanings
Is the intended meaning of every variable clear? Potentially-confusing variables should be explained with a comment
Trial division
It is impossible that all prime factors of a composite number 𝑛 are bigger than √n. Hence, test the divisors 2 ≤ 𝑑 ≤ √n.
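A sketch of trial division in Python, returning a nontrivial factor of n or None when n is prime (the function name and return convention are assumptions):

```python
def trial_division(n):
    d = 2
    while d * d <= n:       # only divisors 2 <= d <= sqrt(n) need testing
        if n % d == 0:
            return d        # d is a nontrivial factor of n
        d += 1
    return None             # no divisor up to sqrt(n), so n is prime

print(trial_division(91))   # 7
print(trial_division(97))   # None (97 is prime)
```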
How the greedy algorithm works
It makes the optimal choice at each step as it attempts to find the overall optimal way to solve the entire problem.
General form of greedy algorithms
Iterate through list of inputted elems to find result (for loop)
𝑂(𝐸 log𝑉), where V is the number of vertices.
Kruskal's algorithm's time complexity
Bridge Theorem
Let G = (V, E) be a weighted, undirected, connected graph, {L, R} a partition of V, B the set of all bridge edges, and b ∈ B a bridge edge of least weight; then there exists some minimum spanning tree T = (V, K) such that b ∈ K
common abstract data types
List, Set, Graph, Stack, Queue, Priority queue
steps for proving by induction: make an informed guess about which efficiency class to use
Look at the T(n) given and infer what efficiency class it is (ex: T(n) = 2n + 2 resembles linear efficiency O(n))
Kruskal's Algorithm
May generate a forest at any instant; may work on disconnected components
random permutations
Methods for generating "random" permutations of N elements are needed
Johnson-Trotter Algorithm
Models the notion of motion by assigning a direction to each element of the permutation.
nested cycling
N different permutations of P[1],... ,P[N] can be obtained by rotating the array
circuit satisfaction
NP-Complete problem
Spanning Tree properties
No cycles or loops
subset generating problem
No value of k can make n^k = 2^n, since n^k is a polynomial function and 2^n is an exponential function; need to iterate through the range 0 to 2^n - 1
bridge edge
Nodes from G are partitioned into 2 groups, L and R. Any edge that crosses between L and R is a ________
the big 8 efficiency classes
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < O(c^n) < O(n!)
MST by exhaustive search
O(2^m · (n + m))
circuit satisfaction
O(2^n · n)
subset generating
O(2^n · n)
Overall complexity of Dijkstra's algorithm with MH
O(E + V) · O(log V) = O((E + V) log V) = O(E log V)
proper exhaustive search
O(c) candidates, generating all candidates takes O(g) time, and verifying one candidate takes O(v) time
exhaustive optimization
O(c) candidates, generating all candidates takes O(g) time, verifying one candidate takes O(v) time, O(b) to compare the two candidates
exhaustive optimization
O(g+c(v+b))
proper exhaustive search
O(g+cv)
Euclidean Algorithm
O(log(min(a, b)))
Binary GCD algorithm
O(log(uv)/2)
TSP by exhaustive optimization
O(n! · n²)
exhaustive and greedy weaknesses
Often yields inefficient algorithms; Some exhaustive search algorithms are unacceptably slow; not very creative compared to other design techniques
Dijkstra's algorithm
One of the best-known algorithms for solving the single-source shortest-paths problem
Binary GCD algorithm
Repeated application of subtraction and division by 2; Mostly used for operations on binary representations.
pseudocode checklist: repetitive code
Repetitive code should be moved into a helper function or loop.
RSA algorithm
Rivest, Shamir, Adleman; an application of factorization.
Prim's Algorithm
Runs faster in dense graphs.
Kruskal's Algorithm
Runs faster in sparse graphs.
typical search operations for a set S
Search(S, k), Insert(S, x), Delete(S, x), Minimum(S, x), Maximum(S,x), Successor(S,x), Predecessor(S,x)
steps for proving by induction: prove the base case: T(n) ≤ c(f(n)) when n = n0, prove the inductive step that for any n > n0
Substitute n = n0 into T(n) and c(f(n)), using the chosen c, and check that T(n0) ≤ c(f(n0))
Prim's Algorithm
Starts building the MST from any vertex in the graph.
Kruskal's Algorithm
Starts building the MST from the edge carrying minimum weight in the graph.
generating candidates
Techniques for generating candidates include iterators and the use of integer range generator keywords
Johnson-Trotter Algorithm
The direction may be either positive (moving toward a higher index) or negative (moving toward a lower index).
Euclid's algorithm
The gcd of two positive integers a and b (a > b) is the same as the gcd of a - b and b
Spanning Trees Properties
The spanning tree is acyclic (no loops)
Prim-Jarnik algorithm, Kruskal's algorithm, and Boruvka's algorithm.
The three most widely known minimum spanning tree algorithms
Adjacent Exchanges
They discovered that it was possible to generate all N! permutations of N elements with N!-1 exchanges of adjacent elements
GCD problem
Time complexity: O(min(a, b)) Space complexity: O(min(a, b))
Dijkstra's algorithm with min heap priority queue
Traverses all vertices with breadth-first search and uses a min heap to store unreached vertices; the min heap is used to get the minimum-distance vertex from the unreached vertices
Prim's Algorithm
Traverses each node more than once to get the minimum distance.
Kruskal's Algorithm
Traverses each node only once.
American Change Making Problem
Use as many quarters as you can, then dimes, nickels, and pennies as needed
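A sketch of that greedy rule in Python for the US denominations 25, 10, 5, and 1 cents (function name assumed):

```python
def make_change(k):
    """Return a list of coin values summing to k cents, largest coins first."""
    coins = []
    for value in (25, 10, 5, 1):    # quarters, then dimes, nickels, pennies
        while k >= value:
            coins.append(value)
            k -= value
    return coins

print(make_change(67))   # [25, 25, 10, 5, 1, 1]
```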
Kruskal's Algorithm
Used for finding MST; uses greedy techniques to add next lowest weight edge that won't form a cycle in each step
Lehmer's algorithm
Used for larger numbers; iterate until one of a or b is zero; uses a 2x2 matrix representation
Fibonacci Heap
Used to reduce overall complexity of MH Dijkstra's algorithm to O(E + V log V)
Johnson-Trotter Algorithm
Uses index in the original position for ranking
greedy pattern
The view that achieving local optima leads to a global optimum
Sorting
Well-studied in computer science for theoretical and practical reasons
subset generating problem
When the universe set U contains n elements, its power set contains |P(U)| = 2^n subsets.
exhaustive and greedy strengths
Wide applicability; Simplicity; Results in reasonable algorithms for some important problems and computational tasks
O(V²), where V is the number of vertices
Worst case time complexity of Prim-Jarnik algorithm
O(n + m log n)
Worst case time complexity of Prim-Jarnik algorithm with priority queue
O(n); n
_____ is the set of all functions that are equivalent to f(n) = __ for the purposes of measuring algorithmic complexity
implementation
__________ of an algorithm is executable computer code that follows the process defined by the algorithm
Random-Access Machine (RAM)
a (simple) computational model
exhaustive search
a brute-force approach to combinatorial problems; suggests generating each and every element of the problem domain, selecting those that satisfy all the constraints, and finding a desired element
linked list
a collection of nodes where each node is connected to the next node through a pointer
linked list
a dynamic data structure where each element ( node) is made up of two items: the data and a reference (pointer) to the next node
dictionary
a dynamic set that supports insert, delete, and test membership functions; can also support other operations
complete graph
a graph in which each pair of vertices is connected by an edge
planar graph
a graph that can be embedded in a plane such that no two edges cross each other
dense graph
a graph where the number of edges is close to the maximum number (n(n-1)/2) of edges possible
connected graph
a graph where there is at least one path between each pair of vertices
Prim's/ Prim-Jarnik Algorithm
a greedy technique for finding the Minimum Spanning Tree (MST) of a given weighted, connected, and undirected graph
pseudocode
a human-readable format for communicating algorithms that may include code-like syntax, math notation, and prose
singly linked list
a list in which each element points to its successor; a list item stores an element and a pointer to its successor
priority queue
a list that maintains a set S of elements, each with an associated value called a key
abstract data type (ADT)
a logical description of data; It includes the views and the allowed operations without implementation details
graph
a mathematical object consisting of a set of vertices and a set of edges
complexity
a non-negative real number representing the amount of resources that is consumed by an algorithm when run on a specific instance
factor
a number that divides evenly into another number
directed graph
a pair 𝐺 = (𝑉, 𝐸) where V is a set whose elements are called vertices, and E is a set of ordered pairs of elements of V
Hamiltonian cycle
a path C = (c0, ..., ck-1) such that c0 = ck-1
Hamiltonian path
a path in G such that every v ∈ V appears in C exactly once
data structures
a scheme for organizing related pieces of information
array
a sequence of fixed-size data records, indexed by a system of integer coordinates
algorithm
a sequence of unambiguous instructions for solving a problem
process
a series of actions directed to some end
problem
a set of input instances and a task to be performed on the input instances
abstract data types (ADT)
a set of objects that are related to each other together with a set of operations
greedy algorithm
a simple, intuitive algorithm that is used in optimization problems.
matrix
a singular vector arranged into the specified dimensions
minimum spanning tree of a weighted graph
a spanning tree of minimal total edge weight
solution
a specific problem and instance is a valid concrete output corresponding to the problem and instance
(problem) instance
a specific, concrete input to a problem
weighted graph
a triple (𝑉, 𝐸, 𝑊) where (𝑉, 𝐸) is a graph (directed or undirected) and 𝑊 is a numerical value of the edge
matrix
a two-dimensional array but it can be extended to an arbitrary dimension
asymptotically efficient algorithms
an algorithm whose running time grows very slowly with input size
adjacency matrix of a weighted directed graph
a v x v matrix where 0 marks the node itself, ∞ indicates there is no edge connecting the two nodes, and any other constant is the weight of a connecting edge
adjacency matrix of an unweighted graph
a v x v matrix where 1 indicates a direct edge between the two nodes and 0 indicates there is no edge, or that the entry refers to the node itself
Euclidean Algorithm
a variant of Euclid's algorithm; the difference of the two numbers a and b is replaced by the remainder of the division of a by b.
data structures
act like containers in that they hold other data
Finding new cost in Dijkstra's algorithm
add the cost of getting to current to the weight of the edge e from current to adj; if the new cost is lower than the current cost of reaching adj, adj's cost is updated
handling loops in a sequence
add them together (i.e. O(n + n) → O(n))
enqueue
adding an element to a queue
in-place selection sort invariant
after k iterations of the loop, the elements at indices 0 through k-1 are in non-decreasing order, while the elements at the remaining indices k through n-1 may not be ordered
vector
aka array list, dynamic array, and resizable array
sequential search
aka linear search
Random-Access Machine (RAM)
all basic instructions take constant time to execute
complete graph
all pairs of distinct vertices are connected by edges; has n! Hamiltonian cycles
correctness
always produces a correct solution
sequential optimization vs sequential search
always return at least one thing vs return nothing if not found
ADT vs its implementation
an ADT can be implemented in several ways using the same programming language
verifier algorithm
an algorithm that takes a problem instance and candidate solution as input, and returns True when the candidate is acceptable and False when the candidate is unacceptable.
vector
an array with a non-fixed length
basic selection sort algorithm
an example of a pure algorithm
university classroom example
an example of lumpy cost
vector
an extensible array; it automatically grows
time complexity
an indication of the run time of an algorithm in terms of how quickly it grows relative to its input
logarithm
an inverse exponential function; reflects how many times we can double something until we get to n, or divide n by 2 until we get to 1
candidate solution
an object of the same data type as a problem output, which may or may not be a correct output.
traveling salesperson problem
an optimization problem
knapsack problem
an optimization problem that deals with doing more with less
algorithm
an ordered sequence of process/ steps - which produces a solution to a problem
tree
an undirected graph that is connected and has no cycles
permutation of a sequence with a defined order
another sequence that contains the same elements, but most likely in a different defined order
path in single-source shortest path
any non-empty sequence (p_0, ..., p_{k-1}) of vertices such that each p_i ∈ V and every pair of adjacent vertices is connected in G, so (p_i, p_{i+1}) ∈ E
when a graph is weighted
any of its spanning trees have a defined weight; some may be of minimum weight
basic operations
arithmetic (add, shift, floor, etc) and data movement (load, store, copy)
homogeneous data structures
array and matrix
how studying algorithms helps develop analytical skills
asks if the algorithm solves the problem and if it uses resources efficiently
Insert(S, x)
augment S with the element pointed by x, assuming that all the fields in x have been initialized (modifying op.)
amortized analysis
average cost over a sequence of operations
factoring an integer x
breaking x into the product of two or more positive factors
greedy pattern
builds up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit
steps for proving by induction: use algebra to solve for the constants c and n0 that seem likely to work
Rearrange T(n) ≤ c(f(n)) to c ≥ T(n)/f(n), pick a value of n to serve as n0, then substitute it into the reduced expression to solve for c
generating candidates
can be applied in generating pairs, subsets, and permutations
graph
can be directed or undirected
graph
can be represented by an adjacency matrix or adjacency list
knapsack problem
can be solved with dynamic programming or exhaustive search
Big O
can be used to categorize functions
proving by induction
can prove T(n) is a member of O(f(n)) by picking a pair of constants c and n0 and showing T(n) ≤ c(f(n)) for any n > n0
sorting algorithm
can solve different problems that don't look like sorting problems
cases for lim as n→∞ for f(n)/g(n)
case 1: 0; case 2: c; case 3: ∞
modifying operations
change the set
sequential search
checks each element in a sequence until the desired element is found or the list is exhausted
Bridge Theorem
clarifies how to make greedy choices in MST; tells which edges should be in MST's edges
3 attributes of algorithms
clarity, correctness, termination
traveling salesperson problem
a combination of an NP-hard problem and an optimization problem
'lumpy' cost
comes up in algorithm analysis when a data structure with a one-time construction cost is used
factor
composite positive integers can be factored; it is impossible to factor a number if it is prime
Shortest path problems
computing minimal-weight paths between vertices in weighted graphs
Single source shortest path problems
computing paths originating from a designated source (or start) vertex
algorithm growth rate
concerned with how the running time of an algo increases with input size
general rules for computing the step count
consecutive statements add up
O(1)
constant; evaluate a statement
how abstract data types can relate to each other
contain description of the data type, the relationships between the individual objects, and the operations performed on the objects
clarity
contains clear description for implementation
naive algorithms
contain essential building blocks of practical software development
O(n)
convert the list into a heap
how algorithm run time is measured
counting the number of steps
O(1)
create a heap, find/ return the min elem of a heap, test if the heap is empty
array operations
create an array, return array length, return an iterator for all elements, get the element at index i, set the element at index i to x
vector operations
create an empty vector, create a vector with n copies of x, return the length of a vector, return an iterator for all elements, get the element at index i, set the element at index i to x, add element x to the back (highest index), remove the element at the back (highest index)
linked list operations
create empty linked list, get length, get iterator, get first/last elem, get first/last node, get node at index i, add to front/back, insert before/after x and return new node, remove from front/back, remove anywhere, access a node's element
O(n³)
cubic; three nested loops
distance[v] from non-negative single source shortest paths problem
d_{s,v} if s and v are connected; infinity if they are not connected
why the greedy pattern is the simplest algorithm pattern
deals with one piece of input at a time, and repeats that until the input has been handled completely
abstract data type (ADT)
defined by its behavior/ operations; implementation may vary
abstract data types (ADT)
defines a data representation for objects of the type and the set of operations that can be performed on these objects
deleting elements from a binary heap
deleting at any intermediary position can be costly, so the root is replaced by the last element and the last elem is deleted
time complexity
denoted by an efficiency class; the amount of time taken by an algorithm to run, as a function of the length of the input
mathematical analysis
dependent on the activities/ step counts/ input sizes
iterative process
designing an algorithm
TSP by exhaustive optimization
1. Determine the starting node.
2. Generate every Hamiltonian path in G.
3. Confirm whether it is a Hamiltonian cycle by extending it back to the starting node (if yes, consider it a candidate solution).
4. Calculate the cost of every permutation and keep track of the minimum-cost permutation.
5. Return the permutation with the minimum cost.
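A brute-force Python sketch of these steps using itertools.permutations, assuming a complete graph given as a dict of dicts of edge weights; it is only practical for small n:

```python
from itertools import permutations

def tsp_exhaustive(weights, start):
    others = [v for v in weights if v != start]         # 1. fix the starting node
    best_cost, best_tour = None, None
    for perm in permutations(others):                   # 2. every Hamiltonian path
        tour = (start,) + perm + (start,)               # 3. extend back to the start
        cost = sum(weights[tour[i]][tour[i + 1]] for i in range(len(tour) - 1))
        if best_cost is None or cost < best_cost:       # 4. track the minimum cost
            best_cost, best_tour = cost, tour
    return best_tour, best_cost                         # 5. return the best tour

w = {'A': {'B': 1, 'C': 4, 'D': 3},
     'B': {'A': 1, 'C': 2, 'D': 5},
     'C': {'A': 4, 'B': 2, 'D': 1},
     'D': {'A': 3, 'B': 5, 'C': 1}}
print(tsp_exhaustive(w, 'A'))   # (('A', 'B', 'C', 'D', 'A'), 7)
```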
why we study algorithms
develop analytical skills; necessary for solving problems/ coding
forest
disconnected components
linked list variations
doubly-linked list; singly-linked list
knapsack problem
dynamic programming time complexity is O(b^w · n)
linked list advantages
dynamic; doesn't need to know how many nodes are in the list (created in memory as needed)
university classroom example
each classroom can be used for x years before needing $y in renovations. Instead of budgeting each classroom on a case-by-case basis, do it more efficiently with amortized cost
weighted graph
each edge may have a non-negative unit cost or numeric weight
dynamic sets
each element is represented by an object whose fields can be examined and manipulated
doubly linked list
each element points to its successor and to its predecessor; a list item stores an element and two pointers, one to its successor and one to its predecessor
adjacency list
each node has 2 values: the destination node, and the weight between the two nodes
tree nodes
each node in a tree has a parent except for the root node
circuit satisfaction
each variable has 2 possibilities (0, 1); 2^n total possibilities
Dijkstra's algorithm
each vertex becomes the current vertex exactly once
boolean circuit
each vertex is either a literal vertex, an output vertex, an and vertex, an or vertex, or a not vertex
linked list advantages
easy and fast insertions and deletions; no need to move other nodes, just reset some pointers
worst-case vs amortized
efficiency class is true for every operation vs efficiency class is true on average
sets
elements appear in key:value pairs
queue
elements are added to the back and removed from the front
priority queue implemented using a binary heap
enqueue and dequeue are O(log n); the heap is always balanced
priority queue
entries are stored according to numeric priority associated with the keys of the entries
experimental analysis and mathematical analysis
evidence based approaches for analyzing algorithm efficiency
exhaustive search vs greedy method
exhaustive explores all possible solutions and returns the best vs greedy starts with a partial solution and improves on it in a way that always gets better, but may not necessarily lead to the best solution
knapsack problem
exhaustive optimization time complexity is O(2^n · n)
MST by exhaustive search
expected output needs n-1 edges, no cycles, and minimum cumulative weight
defined
expected running time is _____ for randomized algorithms
O(c^n)
exponential; the subsets of an n-element set
random permutations
If N is very large, it is infeasible to generate all permutations of N elements
O(n!)
factorial; all permutations of an n-element sequence
circuit satisfaction
feeding it with an assignment that causes a True evaluation
traveling salesperson problem
find the shortest (minimal cost) and most efficient route to cover a list of destinations
TSP Graph
finding a Hamiltonian cycle in an arbitrary graph is NP-Complete
sequential search
finding an element of a list with a particular property
sequential optimization
finding the best solution among competing alternatives
steps for proving by induction: T(n) ≤ c(f(n)) implies T(n+1) ≤ c(f(n+1))
In T(n) and c(f(n)), substitute n+1 for n. Expand and solve. If T(n+1) ≤ c(f(n+1)), the inductive step holds
Fermat's factorization algorithm
for an integer n, find a and b such that n = a² - b²; then a + b and a - b are factors of n
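A Python sketch of Fermat's method, assuming n is an odd composite; the search starts at ceil(sqrt(n)) and stops at the first perfect-square difference:

```python
from math import isqrt

def fermat_factor(n):
    a = isqrt(n)
    if a * a < n:
        a += 1                  # start at ceil(sqrt(n))
    while True:
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:         # found n = a^2 - b^2
            return a + b, a - b # so a+b and a-b are factors of n
        a += 1

print(fermat_factor(5959))   # (101, 59)
```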
dropping additive constants
for any complexity function f(n) and constant c, O(f(n) + c) = O(f(n))
dropping dominated terms in functions
for any complexity functions f0(n) and f1(n), O(f0(n) + f1(n)) = O(max(𝑓0 (𝑛), 𝑓1 (𝑛)))
sequential search pseudocode
for elem in S:
    if elem satisfies the condition:
        return elem
return None  (nothing satisfied the condition)
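The same loop as runnable Python, with an example predicate (first even number) standing in for "satisfies the condition"; the function and parameter names are assumptions:

```python
def sequential_search(S, condition):
    for elem in S:
        if condition(elem):
            return elem     # first element satisfying the condition
    return None             # nothing satisfied the condition

print(sequential_search([3, 7, 8, 10], lambda x: x % 2 == 0))   # 8
print(sequential_search([3, 7, 9], lambda x: x % 2 == 0))       # None
```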
factor
for large numbers, no efficient (non-quantum) integer factorization algorithm is known
pros of mathematical analysis
formal, rigorous; no need to implement algorithms; machine-independent
lexicographic algorithms
generate the permutations in alphabetical ordering
selection sort
gets its name because it dedicates a lot of time and attention to selecting which elem to append next
Search(S, k)
given S and a key value k, return a pointer x to an element in S whose key is k (key[x]=k) or NIL, if S does not contain such an element (query)
Delete(S, x)
given a pointer x to an element of S, remove x from S (modifying op.). Note: x is a pointer and not a key value. (modifying op.)
Maximum(S,x)
given a totally ordered set S, return the element of S with the largest key value (query)
Minimum(S, x)
given a totally ordered set S, return the element of S with the smallest key value (query)
Successor(S,x)
given an element x whose key is from a totally ordered set S, return the next larger element in S, or NIL if x is the maximum element (query)
Predecessor(S,x)
given an element x whose key is from a totally ordered set S, return the next smaller element in S, or NIL if x is the minimum element (query)
sparse graph
graph in which the number of edges is close to the minimal number of edges (0)
amortized analysis
guarantees a less-expensive worst-case complexity for each operation
boolean circuit: output vertex
has exactly one incoming edge and no outgoing edges (tree's root)
traveling salesperson problem
has further applications in emergency management, post-disaster relief delivery and operational business scenarios
spanning tree
has n-1 edges, where n is the number of nodes
boolean circuit: not vertex
has one incoming and one outgoing edge
boolean circuit: and vertex
has two incoming edges and one outgoing edge
boolean circuit: or vertex
has two incoming edges and one outgoing edge
how analysis is important to understanding an algorithm
how to compare different algorithms for a problem
use a min-heap priority queue
how to improve time complexity of Dijkstra's algorithm
how analysis is important to understanding an algorithm
how to predict an algorithms performance
how analysis is important to understanding an algorithm
how well an algorithm scales up
Euclidean Algorithm
if a < b in gcd(a, b), swap them; divide a by b to get the remainder r; if r = 0, b is the GCD; if not, set a = b and b = r and repeat
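A Python sketch of this remainder-based version (function name assumed):

```python
def gcd(a, b):
    if a < b:
        a, b = b, a         # ensure a >= b
    while b != 0:
        a, b = b, a % b     # replace (a, b) with (b, remainder of a / b)
    return a

print(gcd(1071, 462))   # 21
```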
pure
if an algorithm leaves its arguments and global variables unchanged, it is ______
in-place
if an algorithm stores its output in the same data struct as its input, it is ______
all of them are valid edges
if there are multiple edges tied for least, then ____
general rules for computing the step count
if there's an if else statement, max(if, else) determines which time to use
experimental analysis steps
implement algorithm in a given programming language, measure runtime with several inputs, and infer running time from the inputs
Dijkstra's algorithm with MH
implementation uses an adjacency list
Dijkstra's algorithm
implementation uses an adjacency matrix
f(n) ≤ O(g(n)) (worst-case)
implies C x g(n) is an upper bound of f(n)
f(n) ≥ Ω(g(n)) (best case)
implies that C x g(n) is a lower bound of f(n)
f(n) = Θ(g(n)) (average case)
implies that C1 x g(n) is an upper bound of f(n) and C2 x g(n) is a lower bound of f(n)
case 3 for limits
implies that f(n) has a larger order of growth than g(n)
case 1 for limits
implies that f(n) has a smaller order growth than g(n)
case 2 for limits
implies that f(n) has the same order growth as g(n)
bridge edges
in any weighted, undirected, connected graph G = (V, E) and partition {L, R} of V, these ________ B ⊆ E are edges with one end in L and the other in R
outline of Greedy Patterns for finding MST
init K as an empty set and partition G, go through finding the minimum weight bridge edge, add it to K, update the partition. Keep doing that until K has n-1 edges, then return K
operations in a max priority queue
init(S), is_empty(S), maximum(S), extract_max(S), increase-key(S,x,k)
boolean circuit evaluation
input: A Boolean circuit 𝐶 = (𝑋; 𝑇) with 𝑘 variables and 𝑛 vertices, and Boolean circuit assignment 𝐴 with 𝑘 variables output: the Boolean output of circuit 𝐶 for assignment 𝐴
Dijkstra's algorithm + MH pseudocode (Binary heap for PQ implementation)
input: graph G and start vertex s
for every vertex v in G: set distance[v] = infinity and previous[v] = None, and add v to the min-heap PQ
set distance[s] = 0
while PQ is not empty:
    u = extract the minimum-distance vertex from PQ
    for each unvisited neighbor v of u:
        tempD = distance[u] + weight of edge (u, v)
        if tempD < distance[v]:
            distance[v] = tempD
            previous[v] = u
return distance[] and previous[]
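A runnable Python version of this pseudocode using heapq as the min-heap priority queue. Because heapq has no decrease-key, stale heap entries are skipped instead; that substitution and the adjacency-list format {vertex: [(neighbor, weight), ...]} are assumptions, not necessarily the course's implementation:

```python
import heapq

def dijkstra(adj, start):
    dist = {v: float('inf') for v in adj}
    prev = {v: None for v in adj}
    dist[start] = 0
    pq = [(0, start)]                       # (distance, vertex) entries
    while pq:
        d, u = heapq.heappop(pq)            # vertex with minimum distance
        if d > dist[u]:
            continue                        # stale entry; a shorter path was found
        for v, w in adj[u]:
            temp_d = dist[u] + w
            if temp_d < dist[v]:            # relax the edge u -> v
                dist[v] = temp_d
                prev[v] = u
                heapq.heappush(pq, (temp_d, v))
    return dist, prev

g = {'s': [('a', 2), ('b', 5)], 'a': [('b', 1)], 'b': []}
print(dijkstra(g, 's'))   # distances: s=0, a=2, b=3
```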
circuit satisfaction problem
input: a Boolean circuit 𝐶 with k variables and 𝑛 vertices output: a Boolean assignment 𝐴 that satisfies 𝐶, or 𝑁𝑜𝑛𝑒 if no such assignment exists
Minimum Spanning Tree Problem
input: a connected, undirected, and weighted graph G = (V; E) with n = |V| vertices and m = |E| edges output: set of edges K such that T = (V; K) is a minimum spanning tree for G
traveling salesperson problem
input: a graph G = (V, E) where each edge e ε E has a numeric weight output: a Hamiltonian cycle in G of minimum total weight, or None if no such cycle exists
permutation generation problem
input: a list L of length n output: a list of all permutations of L in unspecified order, where each permutation is a list of exactly n elements
the sorting problem
input: a list U of n comparable elements output: a list S containing the elements of U in non-decreasing order
the factoring problem
input: a positive integer n output: two integers a > 1 and b > 1 such that n = a·b, or None if a and b do not exist
subset generating problem
input: a set U of n distinct elements, represented by an iterable object output: an iterator for every subset S of U, where each subset is represented by a vector of distinct elements
sequential optimization problem
input: an iterator for a sequence S of n > 0 elements output: the element x of S that's the best according to <SPECIFIC MEASURE>
non-negative single source shortest paths problem
input: an undirected graph G = (V, E) with nonnegative edge weights and a start vertex s ∈ V output: two lists, distance and penultimate, each of length n, with an entry for every v ∈ V
0-1 knapsack problem
input: non-negative integers W (the size of the knapsack), n (the number of items), X (a list of n weights), V (a list of n values) such that V[i] is the value of item X[i] output: a list L of items chosen from the list X such that the sum of the weights does not exceed the size of the knapsack and the total value is maximal
GCD problem
input: two positive integers a and b output: the greatest positive integer d such that (a mod d = 0) and (b mod d = 0)
Random-Access Machine (RAM)
instructions are executed sequentially; nothing happens simultaneously
iterative process
involves working through many drafts
why basic selection sort is a pure algorithm
it takes in U as an argument, and returns a new list object S, leaving U unchanged
generating candidates
iterate through the elements of the problem instance for the inputs
exhaustive search
iterates through objects in search for a particular kind of object
exhaustive optimization
keep track of the best acceptable candidate that we have seen so far, and update best whenever a superior candidate is found
stack
last element in is the first one out
lower bound of sorting
lays the foundation for the lower bounds/ worst case run time of best possible algorithm of other problems
O(n)
linear; for loop
array vs linked list
linked list is more complex to code and manage than arrays, but has advantages
examples of data structures
lists, arrays, stacks, queues
O(log n)
logarithmic; search a balanced search tree
queue
maintains elements in First-in First-out (FIFO) order
sequential optimization method
Make a guess about which element is best and initialize best with any element; whenever a better element is found, replace best with it
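A sketch of this method in Python, with the "better" measure passed in as a comparison function (an illustrative assumption):

```python
def find_best(S, better):
    it = iter(S)
    best = next(it)             # guess: the first element is best
    for elem in it:
        if better(elem, best):  # a better element replaces best
            best = elem
    return best

print(find_best([3, 9, 4, 7], lambda x, y: x > y))   # 9
```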
steps for proving efficiency class with limits
make an informed guess about which efficiency class to use, take the limit [lim n→∞ T(n)/f(n)], if the limit is constant with respect to n and non-negative, conclude that 𝑇 (𝑛) 𝜖 𝑂(𝑓(𝑛)), otherwise try again with a different efficiency class
steps for proving by induction
make an informed guess about which efficiency class to use, use algebra to solve for the constants c and n0 that seem likely to work, prove the base case: T(n) ≤ c(f(n)) when n = n0, prove the inductive step that for any n > n0, T(n) ≤ c(f(n)) implies T(n+1) ≤ c(f(n+1)), conclude that T(n) ϵ O(f(n))
cons of mathematical analysis
math knowledge
Spanning Trees Properties
maximally acyclic; adding one edge to the spanning tree will create a cycle/ loop
worst-case complexity
maximum number of steps
problem definition
may introduce mathematical variables whose scope is limited to that problem definition
space
measured in units of bits, bytes, gigabytes, or generic words
input/output bandwidth (I/O)
measured in units of bytes or blocks
cache
measured in units of integers
energy
measured in units of kilowatt-hours
time
measured in units of seconds, CPU instructions, or generic steps
execution time and memory need
measurements of algorithm quality that this class focuses on
time complexity
measures the time taken to execute each statement of code in an algorithm
Spanning Trees Properties
minimally connected; Removing one edge from the spanning tree disconnects the graph
best case complexity
minimum number of steps
handling nested loops
multiply the inner loop's complexity by the outer loop's complexity (i.e. O((3n + 2)n) → O(n²))
generating candidates
need to be done to design an exhaustive search
Bill payer problem
need to pay bills, but you don't do them right away
general rules for computing the step count
nested loops are analyzed inside out
Single source shortest path applications
network routing, trip planning/driving directions, and other planning problems; robotics; airline crew scheduling
pros of experimental analysis
no math, straightforward method
cons of experimental analysis
not always reliable and is heavily dependent on the sample inputs + the programming language and environment
naive algorithm pattern
not based on any of the formal and structured patterns
loops and subroutine calls
not basic operations
traveling salesperson problem
number of roundtrip permutations and combinations grows extremely fast, with a slight change in the number of destinations
average case running time
often as bad as the worst case running time
Adjacent Exchanges: Johnson-Trotter algorithm (1962)
one of the most prominent permutation enumeration algorithms
simplifying assumptions made when analyzing the running time of an algorithm
only consider the leading term, ignore constants, solutions that take constant time have constant run time
improving the efficiency of an algorithm
optimize the algorithm
data structures
organized to ensure efficient processing
penultimate[v] from non-negative single source shortest paths problem
p_{s,v} if s and v are connected; s if s = v; infinity if s and v are not connected
Dijkstra's algorithm pseudocode
Pass in G and starting vertex s.
Initialize lists/vectors distance(G.V, None), penultimate(G.V, None), and seen(G.V, False); set distance[s] = 0, seen[s] = True, done = False.
while not done:
    find an edge b = {v, u} with v seen and u unseen that minimizes distance[v] + weight(b)
    if there is no such edge, done = True
    else: set distance[u] = distance[v] + weight(b), penultimate[u] = v, and seen[u] = True
return distance and penultimate
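A Python sketch of the O(|V|²) adjacency-matrix formulation, with an O(|V|) scan to find the next current vertex; the matrix convention (0 on the diagonal, INF for a missing edge) follows the adjacency-matrix card, but the exact code is an assumption:

```python
INF = float('inf')

def dijkstra_matrix(matrix, s):
    n = len(matrix)
    distance = [INF] * n
    penultimate = [None] * n
    seen = [False] * n
    distance[s] = 0
    for _ in range(n):
        # find the unseen vertex with the least known distance (O(|V|) scan)
        u = min((v for v in range(n) if not seen[v]),
                key=lambda v: distance[v], default=None)
        if u is None or distance[u] == INF:
            break                           # remaining vertices are unreachable
        seen[u] = True
        for v in range(n):                  # relax every edge leaving u
            if not seen[v] and distance[u] + matrix[u][v] < distance[v]:
                distance[v] = distance[u] + matrix[u][v]
                penultimate[v] = u
    return distance, penultimate

m = [[0, 2, INF],
     [2, 0, 3],
     [INF, 3, 0]]
print(dijkstra_matrix(m, 0))   # ([0, 2, 5], [None, 0, 1])
```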
experimental analysis steps
pose a question, state a hypothesis, make a prediction, test, analyze
data structures
present specialized format for organizing and storing data
mathematical analysis methodologies
proving efficiency class by induction, limits, or properties of big O
O(log n)
push an element into the heap, pop the minimum element and heapify, pop the top and heapify
inserting an element into a binary heap
increase the heap size by 1 and place the new element at the end of the heap, then fix up (heapify) by swapping it with its parent until the heap property is restored
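A sketch of min-heap insertion with the fix-up (sift-up) step, using a plain Python list where the children of index i live at 2*i + 1 and 2*i + 2:

```python
def heap_insert(heap, x):
    heap.append(x)                      # place the new element at the end of the heap
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] < heap[parent]:      # fix up: swap with the parent while smaller
            heap[i], heap[parent] = heap[parent], heap[i]
            i = parent
        else:
            break

h = []
for x in [5, 3, 8, 1]:
    heap_insert(h, x)
print(h)   # [1, 3, 8, 5] -- a valid min-heap
```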
O(n²)
quadratic; two nested loops
traveling salesperson problem
received a lot of attention due to its puzzlelike character, and its practical applicability
boolean circuit
recursion can be used to evaluate these
order of growth
relevant for large inputs
constructing a spanning tree from a complete graph
remove a maximum of e-n+1 edges
dequeue
removing an element from a queue
Euclid's algorithm
replace the larger number by the difference of the numbers, and repeating this until the two numbers are equal
graph edges
represents a connection between two vertices
queries
return info about the set, an element in the set, or a group of elements in the set
proper exhaustive search
returns the first acceptable candidate
sequential optimization vs sequential search
search for the best element according to a particular measure vs one element with a particular property
MST by exhaustive search
search for the minimum; need optimization instead of search
knapsack problem
selecting a set of items to fit inside knapsack. Each item has an integer weight and real-number value, the knapsack has an integer weight capacity. The goal is to choose a subset of items that maximizes the total value while fitting within the knapsack's weight capacity.
O(n)
sequential optimization time complexity (assuming the comparisons are in O(1) time)
dynamic sets
sets that can change over time
pseudocode
similar to program source code in a language such as Python or C, but is not required to be syntactically-perfect code.
general rules for computing the step count
simple operations take 1 unit of time
how selection sort operates
smallest element is selected in unsorted, then swapped with the leftmost elem to become part of sorted
naive algorithms
some real-world software development problems can be solved efficiently by ______
selection sort
sort an array by repeatedly finding the minimum element of the unsorted part and appending it to the end of the sorted part
generating pairs
space complexity = O(1)
minimum spanning tree
spanning tree with the lowest possible cost
nⁿ⁻²
spanning trees in a complete graph
strategies for bill payer problem
stack, queue, and heap strategies
examples of priority queues
stacks, queues, and heaps
deleting elements from a binary heap
standard deletion operation on Heap is to delete the element present at the root node of the Heap
in-place selection sort
start at the beginning of U, find the smallest elem and swap it to the front. Keep doing this and move down U until the entire thing is sorted
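A Python sketch of this in-place version; note that it swaps elements rather than removing them, so the input list is sorted in place:

```python
def selection_sort_in_place(a):
    n = len(a)
    for k in range(n - 1):
        smallest = k
        for j in range(k + 1, n):       # find the smallest element in a[k:]
            if a[j] < a[smallest]:
                smallest = j
        a[k], a[smallest] = a[smallest], a[k]   # swap it to the front of the unsorted part

nums = [5, 2, 9, 1]
selection_sort_in_place(nums)
print(nums)   # [1, 2, 5, 9]
```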
steps for proving by induction: conclude that T(n) ϵ O(f(n))
state that by induction T(n) ϵ O(f(n))
stack
stores a set of elements in a Last-in First-out (LIFO) order
sequential search
straightforward approach to this is to use a loop to check each element of S for the desired property; if a match is found, return the element and stop the loop
vector
supports adding and removing elements dynamically
vector
supports same operations as an array: retrieving by index, assigning by index, and iteration
boolean circuit: literal vertex
labeled with an index i corresponding to variable xi, no incoming edges, and exactly one outgoing edge (these are the tree's leaves)
how to prove efficiency classes with properties of O
take the limit of T(n)/f(n) as n goes to infinity; if the limit is nonnegative and constant with respect to n, conclude that T(n) ∈ O(f(n))
Johnson-Trotter Algorithm
takes O(n! · n) total time
termination
takes a finite amount of time/steps
lexicographic algorithms
takes longer because more than one exchange may be needed to generate the next element
step counting
the amount of computing represented by one step may be different from that represented by another
space complexity
the amount of memory required for the algorithm to execute; the less space needed, the better the quality
average case complexity
the input is chosen at random, the average number of steps
analysis
the key to understanding algorithms
exhaustive search
the list of candidate solutions is typically larger than the input size
general rules for computing the step count
the maximum step count of a for loop is the step count of the statements inside the for loop times the number of iterations
n-1 (n is the number of vertices)
the number of edges in the graph/ tree
step count / running time
the number of primitive operations or steps
p_{s,v} in single-source shortest path
the penultimate (next-to-last) vertex on a shortest path from s to v: L_{s,v} = (..., p_{s,v}, v)
power set problem
the power set of the set U is the set of all possible subsets of that set, including NULL
semiprimes
the product of two prime numbers
how algorithm quality is measured
the resources used when the algorithm is executed on a computer are measured
E in single-source shortest path
the set of graph edges, where each element of E is a set of exactly two vertices
V in single-source shortest path
the set of graph vertices
d_{s,v} in single-source shortest path
the shortest distance between s and v (aka the total weight of all edges visited by some L_{s,v})
greedy pattern
the simplest algorithm pattern
step count / running time
the sum of steps (or running times) for each executable statement
the total number of steps executed by a for loop
the sum of the number of steps executed in each individual iteration
find, remove, append
the three main steps of a selection sort; need to be detailed
time complexity
the time required for the algorithm to execute; the shorter it is, the better the quality
amortized analysis
the time required to perform a sequence of operations over all the operations performed
W(X) in single-source shortest path
the total weight of a path X
sorted and unsorted
the two subarrays in a selection sort array
ends of the edge
the two vertices connected by an edge
amortized analysis
the university classroom example is an example of what kind of analysis
worst-case running time of an algorithm
the upper bound on the running time for any input
in-place selection sort
the version of selection sort that is mostly used
w_e in single-source shortest path
the weight of any edge e ∈ E
why loops and subroutines aren't basic operations
they depend on the size of the data and the contents of a subroutine
generating pairs
time complexity = O(|L|·|R|)
knapsack problem
time complexity depends on method and data structures used
O(1)
time complexity for access a node's element node.element
O(1)
time complexity for access an entry's key entry.key
O(1)
time complexity for access an entry's value entry.value
O(1) amortized
time complexity for add element x to the back (highest index) vector.add_back(x)
O(1)
time complexity for add element x to the back and return the new node node = ll.add_back(x)
O(1)
time complexity for add element x to the front and return the new node node = ll.add_front(x)
O(1)
time complexity for all queue operations
O(1)
time complexity for all stack operations
O(n)
time complexity for create a vector with n copies of x vector = Vector(n, x)
O(n)
time complexity for create an array array = Array(n, x)
O(1)
time complexity for create an empty linked list ll = LinkedList()
O(1)
time complexity for create an empty vector vector = Vector()
O(1)
time complexity for get an iterator for all elements iter(ll)
O(1)
time complexity for get the element at index i vector[i]
O(1)
time complexity for get the element at index i array[i]
O(1)
time complexity for get the first element of a non-empty list x = ll.first()
O(1)
time complexity for get the first node of a non-empty list node = ll.first_node()
O(1)
time complexity for get the last element of a non-empty list x = ll.back()
O(1)
time complexity for get the last node of a non-empty list node = ll.last_node()
O(1)
time complexity for get the length of a linked list len(ll)
O(n)
time complexity for get the node at index i node = ll.node_at(i)
O(1)
time complexity for insert element x after node p and return the new node node=ll.insert_after(p,x)
O(1)
time complexity for insert element x before node p and return the new node node=ll.insert_before(p,x)
O(1)
time complexity for remove a node at an arbitrary position and return its element x = ll.remove(node)
O(1)
time complexity for remove and return the first element of a non-empty list x = ll.remove_first()
O(1)
time complexity for remove and return the last element of a non-empty list x = ll.remove_last()
O(1) amortized
time complexity for remove the element at the back (highest index) vector.remove_back()
O(1)
time complexity for return an iterator for all elements iter(array)
O(1)
time complexity for return an iterator for all elements iter(vector)
O(1)
time complexity for return array length len(array)
O(1)
time complexity for return the length of a vector len(vector)
O(1)
time complexity for set the element at index i to x array[i] = x
O(1)
time complexity for set the element at index i to x vector[i] = x
O(|V|²) time
time complexity of Dijkstra's algorithm using an adjacency matrix
O(n)
time complexity of sequential search (assumes each elem can be tested in O(1) time)
Less than O(|V|²)
time it takes to do the rest of Dijkstra's algorithm
can be used to measure efficiency
time, space, I/O bandwidth, cache, energy
subset generating problem
to create a vector with a subset of U, we need to map from an int i ∈ [0, 2^n - 1] to a specific subset S_i ⊆ U
exhaustive optimization
to look for and return an optimal candidate
n(n-1)/2
total number of edges in a complete graph
Huffman encoding
used in compressing data
index
used to access elements and to permit alteration of individual elements
order of growth for large inputs
used to determine the asymptotic efficiency of algorithms
Dijkstra's algorithm
used to find the shortest path through a graph
Random-Access Machine (RAM)
used to predict running time
adjacency list
uses linked list of vertex objects
mathematical analysis
uses math to estimate the running time of an algorithm
experimental analysis
uses scientific knowledge verification methods
RSA algorithm
uses semiprimes to calculate public and private keys
mathematical analysis
using the mathematical method of modeling, lemma, and proof
experimental analysis
using the scientific method of hypothesis, experiment, and empirical data analysis
naive algorithm pattern
usually an ad hoc approach
factor
usually concerns finding nontrivial factors that != 1 or n
MST by exhaustive search
the verify and compare sub-algorithms need to be kept separate
execution time increases
what happens when time complexity and input size increase
any kind
what kinds of nodes can be the root node
naive / simple
when ________ algorithms are possible and optimal, use them instead of complicated ones
when the edge weights aren't unique
when can a graph have more than one minimum spanning tree
when it consumes few resources
when is an algorithm efficient
when the first algorithm has a lower order of growth
when is an algorithm more efficient than another
the execution time depends on the processor's speed
why measuring the execution time of an algorithm is not useful
O(n²)
worst case time complexity for pure selection sort
O(n²)
worst case time complexity of in-place selection sort
subset generating
yields a sequence of 2^n vectors of length n at most
Johnson-Trotter Algorithm
yields a sequence of n! vectors of length n
in-place selection sort
|U| + |S| = n, because each iteration of the while loop moves an element out of U and into S
formal notation for for loop summation
∑_{x ∈ X} t_x
undirected graph
𝐺 = (𝑉, 𝐸) where 𝑉 is a set whose elements are called vertices, and 𝐸 is a set of unordered pairs of distinct elements of V
Single source shortest path elements
G, V, E, path, w_e, W(X), L_{s,v}, d_{s,v}, p_{s,v}