CS325 - Final Notes
If the solution obtained by an approximation algorithm is 10 and the optimal solution is 5, what will be the value of the approximation ratio? A) 2 B) 5 C) 0.5 D) 1
The correct answer is A) 2, since the ratio is max(10/5, 5/10) = 2.
In dynamic programming, the technique of storing the previously calculated values is called A) Storing value property B) Saving value property C) Memoization D) Mapping
The correct answer is C) Memoization
Consider the graph M with 3 vertices. Its adjacency matrix is shown below. M = [[0, 1, 1], [1, 0, 1], [1, 1, 0]] This graph has ____ number of spanning trees. A) 1 B) 4 C) 3 D) 2
The correct answer is C.
What does it mean for a problem to have optimal substructure?
A problem has optimal substructure when an optimal solution to the problem can be constructed from optimal solutions to its subproblems.
Is 5^n > n^5 when comparing asymptotic growth?
Yes. An exponential function such as 5^n grows faster than any polynomial such as n^5.
How many hamiltonian circuits does a complete graph of n elements have at most?
(n-1)!/2
List a few common known NP-Hard problems
Circuit SAT, 3SAT, Subset-Sum, Knapsack, Independent Set, and Traveling Salesman Problem
What is a Hamiltonian Cycle?
A Hamiltonian cycle is a cycle in a graph that visits each vertex exactly once. Given a graph G, determining whether G has a Hamiltonian cycle containing all vertices is an NP-Complete problem. Note: it's worth noting that if you drop an edge from a Hamiltonian cycle, you are left with a spanning tree of the graph (not necessarily a minimum one), which is why the Hamiltonian cycle and spanning tree problems look similar.
What is the loop invariant? How would you write one?
A loop invariant is a condition that holds at the start of every iteration of a loop (and still holds when the loop terminates). In insertion sort, for example, the invariant is that at each iteration the subarray A[1..i-1] consists of the elements originally in A[1..i-1], but in sorted order.
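As a small illustrative sketch (assuming a 0-indexed Python list rather than the 1-indexed A[1..i-1] above), insertion sort with its loop invariant written as a comment:

def insertion_sort(a):
    # Loop invariant: at the start of each iteration, a[0..i-1] holds the
    # elements originally in a[0..i-1], but in sorted order.
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]   # shift larger elements one slot to the right
            j -= 1
        a[j + 1] = key        # insert key into its sorted position
    return a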
What is an MST (Minimum Spanning Tree)?
A minimum spanning tree of G is a spanning tree of G whose total edge weight is no greater than that of any other spanning tree of G; it is the lowest-cost tree that connects all vertices. A complete graph on V vertices has V^(V-2) possible spanning trees (Cayley's formula). An MST of a graph with n vertices has (n-1) edges. The two most common algorithms for finding an MST are Prim's and Kruskal's.
Which of the following is true for Prim's algorithm? A) It can be implemented using a heap B) It never accepts cycles in the MST C) It is a greedy Algorithm
All of the above. Prim's algorithm can be implemented using a heap, it never accepts cycles in the MST, and it is a greedy algorithm.
What is a P Problem?
Any decision problem that can be solved in Polynomial time is in the set of P - for polynomial. The P set is a subset of the NP set - meaning that all members of P are also members of NP, but the reverse is not true.
What is an NP Problem?
Any decision problem whose solution can be verified in polynomial time is in NP, for Nondeterministic Polynomial time. NP includes every problem in P; it is not limited to problems that cannot be solved in polynomial time.
Given an array A of size n, we want to access the ith element in the array, 0<i<n. What will be the time complexity of this operation? A) O(1) B) O(n) C) O(n^2) D) O(log n)
Arrays are stored in contiguous memory, so we can access any index between 0 and n-1 in constant time. The correct answer is A - O(1)
Given two vertices s and t in a connected graph G, which of the two traversals, BFS and DFS can be used to find if there is a path from s to t?
Both BFS and DFS can be used to find if there is a path from s to t.
How do you calculate the approximation ratio?
C: the solution produced by the algorithm. C*: the optimal solution. ρ(n) = max(C/C*, C*/C)
What is the difference between Divide and Conquer and Dynamic Programming?
In Divide and Conquer the subproblems are independent and the work lies in combining their solutions, whereas in Dynamic Programming the subproblems overlap, so their results are stored and reused to build the solution to the larger problem.
Define the Divide and Conquer approach to programming
Divide and conquer is a recursive method that solves a problem in three steps. Divide - break the problem into smaller subproblems. Conquer - solve the subproblems. Combine - combine the solutions of the subproblems into the solution of the larger problem.
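As a hedged illustration (merge sort is a standard example, not one prescribed by these notes), the three steps look like this in Python:

def merge_sort(a):
    if len(a) <= 1:                      # base case: already sorted
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])           # Divide + Conquer the left half
    right = merge_sort(a[mid:])          # Divide + Conquer the right half
    merged, i, j = [], 0, 0              # Combine: merge the sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]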
Describe Dijkstra's Algorithm
Dijkstra's algorithm is a shortest-path algorithm for weighted graphs with non-negative edge weights; it does not work when weights are negative. It is a greedy-method algorithm. Solve using the following steps: 1) Initialize the distance of every vertex from the source to infinity. 2) Set the distance of the source vertex to zero. 3) While there are unvisited vertices, pick the unvisited vertex u with the smallest distance and, for each neighbour v of u, relax the edge: dist[v] = min(dist[v], dist[u] + weight(u, v)).
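A minimal sketch of these steps, assuming the graph is given as an adjacency dict {u: [(v, weight), ...]} and using Python's heapq as the min-priority queue:

import heapq

def dijkstra(graph, source):
    dist = {v: float('inf') for v in graph}   # step 1: all distances start at infinity
    dist[source] = 0                          # step 2: source distance is zero
    heap = [(0, source)]
    while heap:                               # step 3: repeatedly settle the closest vertex
        d, u = heapq.heappop(heap)
        if d > dist[u]:                       # stale heap entry, skip it
            continue
        for v, w in graph[u]:
            if d + w < dist[v]:               # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist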
Define the Dynamic Programming approach to programming
Dynamic Programming is used to find optimal solutions to problems that have optimal substructure and overlapping subproblems.
For every decision problem there is a polynomial time algorithm that solves it (T/F)
False
When performing the topological sort we always find a unique solution.
False. A DAG can have more than one valid topological ordering, so the result is not always unique.
If problem A is in NP then it is NP-complete. (T/F)
False. Explanation: For a problem to be NP-Complete, it has to be proven NP-Hard and be in the NP subset. In this case, we know that problem A is in the NP subset, but we have not proven that it is NP-Hard, so we cannot conclude it is NP-Complete.
If there is a polynomial time reduction from a problem A to Circuit SAT then A is NP-hard. (T/F)
False. Explanation: The reduction must go from a known NP-Hard problem to the problem we are trying to prove. If Circuit SAT (a known NP-Hard problem) could be reduced to problem A in polynomial time, then A would be proven NP-Hard. Here the reduction goes the other way - problem A is being reduced to Circuit SAT - which does not tell us whether A is NP-Hard or not.
A spanning tree of a graph should contain all the edges of the graph. (T/F)
False. A spanning tree of a graph should contain all the vertices of a graph, but not necessarily all the edges.
What is the definition of Big Theta notation?
For large values of n, the running time f(n) is at least a⋅g(n) and at most b⋅g(n), for some positive constants a and b.
What is the definition of Big Omega notation?
For large values of n, the running time f(n) is at least b⋅g(n), for some positive constant b.
What is the definition of Big O Notation?
For large values of n, the running time f(n) is at most b⋅g(n), for some positive constant b.
Describe the 3SAT problem
Given a Boolean formula in 3-CNF (Conjunctive Normal Form with three literals per clause), is the formula satisfiable? An example of a 3-CNF formula is (a ∨ b ∨ c) ∧ (b ∨ c ∨ ¬c) ∧ (d ∨ e ∨ ¬c).
What is DFS? Describe its implementation
In Depth First Search (DFS) we start at a source vertex and go as deep as we can along each branch, backtracking when we hit a dead end. This is usually implemented recursively (or with an explicit stack). Time complexity is O(V+E).
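A minimal recursive sketch, assuming the graph is an adjacency dict {u: [v1, v2, ...]}:

def dfs(graph, start, visited=None):
    if visited is None:
        visited = set()
    visited.add(start)
    for neighbour in graph[start]:
        if neighbour not in visited:    # go as deep as possible before backtracking
            dfs(graph, neighbour, visited)
    return visited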
What is BFS? Describe its implementation
In Breadth-First Search (BFS) we search the graph level by level. This is usually implemented with a queue, enqueueing neighbour nodes and visiting them in sequence. Time complexity is O(V+E).
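A minimal sketch using a queue, assuming the same adjacency-dict representation as above:

from collections import deque

def bfs(graph, start):
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        u = queue.popleft()             # dequeue the oldest discovered vertex
        order.append(u)
        for v in graph[u]:
            if v not in visited:        # enqueue unvisited neighbours
                visited.add(v)
                queue.append(v)
    return order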
Describe the 0/1 knapsack problem
In the 0/1 knapsack problem there is one copy of each item, and for each item you decide whether or not to include it; find the maximum value achievable within the target weight. It can be solved in the same manner as the unbounded knapsack problem, with the restriction that each item is considered at most once. A bottom-up approach with tabulation works, using the recurrence dp[i][x] = max(dp[i-1][x], dp[i-1][x-w_i] + v_i).
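A bottom-up tabulation sketch of that recurrence (the names weights, values, and capacity are illustrative):

def knapsack_01(weights, values, capacity):
    n = len(weights)
    # dp[i][x] = best value using the first i items with capacity x
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for x in range(capacity + 1):
            dp[i][x] = dp[i - 1][x]                  # skip item i
            if weights[i - 1] <= x:                  # or take it (at most once)
                dp[i][x] = max(dp[i][x],
                               dp[i - 1][x - weights[i - 1]] + values[i - 1])
    return dp[n][capacity]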
Describe the Knapsack problem
In the unbounded knapsack problem each item has unlimited copies. Using optimal substructure, solve the optimal value for smaller weights and build up to the larger weights - a bottom-up approach using tabulation: dp[x] = max(dp[x], dp[x - w_i] + v_i).
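A bottom-up sketch of the unbounded recurrence, using the same illustrative names as above:

def knapsack_unbounded(weights, values, capacity):
    # dp[x] = best value for capacity x when each item has unlimited copies
    dp = [0] * (capacity + 1)
    for x in range(1, capacity + 1):
        for w, v in zip(weights, values):
            if w <= x:
                dp[x] = max(dp[x], dp[x - w] + v)
    return dp[capacity]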
What is the Backtracking approach to programming?
In backtracking, you systematically explore every possibility, abandoning (backtracking from) a partial solution as soon as it violates a constraint and moving on to other routes. We continue until every viable path has been explored.
Describe the Traveling Salesman Problem
In simple terms, the Traveling Salesman Problem (TSP) is "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?". The simplest approach to the TSP is the greedy method: start at an arbitrary vertex and move to the closest vertex that has not yet been visited. This is called the nearest-neighbour (closest point) heuristic.
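A sketch of that closest-point (nearest-neighbour) heuristic, assuming the distances are given as a square matrix dist[i][j]; it produces an approximate tour, not necessarily the optimal one:

def tsp_nearest_neighbour(dist, start=0):
    n = len(dist)
    tour, visited = [start], {start}
    current = start
    while len(visited) < n:
        # greedily move to the closest city not yet visited
        nxt = min((c for c in range(n) if c not in visited),
                  key=lambda c: dist[current][c])
        tour.append(nxt)
        visited.add(nxt)
        current = nxt
    tour.append(start)                  # return to the origin city
    return tour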
Describe Kruskal's Algorithm
Kruskal's algorithm repeatedly takes the minimum-weight remaining edge in the entire graph G. This differs from Prim's algorithm, which only considers edges leaving the current tree. Kruskal checks whether the minimum edge would create a cycle; if it does not, the edge is added to the MST. The naive time complexity of Kruskal is O(EV), but using a disjoint-set (union-find) data structure it can be improved to O(E log V).
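A sketch using a simple disjoint-set (union-find) structure, assuming vertices are labelled 0..num_vertices-1 and edges are (weight, u, v) tuples:

def kruskal(num_vertices, edges):
    parent = list(range(num_vertices))

    def find(x):                        # find the set representative, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):       # consider edges in increasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # the edge does not create a cycle
            parent[ru] = rv             # union the two components
            mst.append((u, v, w))
    return mst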
What data structure can be used to implement the Dijkstra algorithm most efficiently?
Min Priority Queue
What is the most efficient data structure for Dijkstra?
A min-priority queue based on a min-heap. With this implemented, the time complexity of the algorithm is O((V+E) log V).
What is an NP Complete Problem? How would you prove NP-Complete?
Not all NP-Hard problems are in the NP subset. If a problem is NP-Hard AND in NP, it is called NP-Complete. To prove NP-Completeness, a problem must be verifiable in polynomial time (in NP) and proven NP-Hard.
What is the time complexity of Dijkstra's algorithm naively implemented?
O(V^2)
Describe Prim's Algorithm
Prim's algorithm uses the greedy method to find the MST of a graph. Prim's chooses an arbitrary starting vertex and then repeatedly adds the lowest-weight edge connecting the tree to a vertex outside it. The time complexity of Prim's is O(VE) if implemented naively; using a min-priority queue it can be reduced to O(E log V).
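A sketch of the min-priority-queue version, assuming the same adjacency-dict format used in the Dijkstra sketch above:

import heapq

def prim(graph, start):
    visited = {start}
    heap = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(heap)
    mst = []
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)   # lightest edge leaving the current tree
        if v in visited:
            continue
        visited.add(v)
        mst.append((u, v, w))
        for nxt, nw in graph[v]:
            if nxt not in visited:
                heapq.heappush(heap, (nw, v, nxt))
    return mst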
Prim's and Kruskal's algorithms to find the MST follows which of the algorithm paradigms? Brute Force Divide and Conquer Dynamic Programming Greedy Approach
Prim's and Kruskal's algorithms are both greedy methods. In slightly different ways, both build a larger solution from locally optimal choices and trust that the result will be globally optimal.
Describe the N-Queens Problem
The N-Queens problem relies on backtracking to exhaustively search for an arrangement that allows N queens to exist on the board without attacking each other. Place queens one row at a time in the first available safe square until you have placed them all or there are no more valid squares. If you place them all, return. If not, backtrack one queen and look for another placement that lets you continue. Repeat until you find a solution or exhaust every possibility. The time complexity of this backtracking algorithm is O(n!).
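A compact backtracking sketch that returns all solutions (each solution lists the column of the queen in each row):

def n_queens(n):
    solutions = []

    def place(row, cols):
        if row == n:                                   # all n queens placed
            solutions.append(cols[:])
            return
        for col in range(n):
            # safe if no earlier queen shares this column or a diagonal
            if all(col != c and abs(col - c) != row - r
                   for r, c in enumerate(cols)):
                cols.append(col)
                place(row + 1, cols)                   # try to fill the next row
                cols.pop()                             # backtrack

    place(0, [])
    return solutions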
What is an NP Hard Problem? How do you prove a problem is NP hard?
The NP-Hard set contains problems at least as hard as every problem in NP; despite the name, an NP-Hard problem need not itself be in NP. To prove a problem is NP-Hard, take a problem that is already known to be NP-Hard and reduce it to the problem you are trying to prove. If the reduction runs in polynomial time, the problem is NP-Hard.
For an undirected graph G, what will be the sum of degrees of all vertices. (degree of a vertex is the number of edges connected to it.) V: number of vertices, E: number of edges. A) |V| B) 2|E| C) |V|+|E| D) |E|
The answer is 2|E|, or twice the number of edges. The correct answer is B. Explanation: since the graph is undirected, every edge contributes 2 to the sum of degrees, so the sum of degrees is 2|E|.
Which of the following techniques can be called as intelligent exhaustive search? A) Backtracking B) Divide and Conquer Approach C) Greedy Approach D) Dynamic Programming
The answer is A) Backtracking
Select the correct inequality for the asymptotic order of growth: is the summation of i for i from 1 to n less than n^k, where k > 2?
True. The summation of i from 1 to n equals n(n+1)/2, which is Θ(n^2), and n^2 grows more slowly than n^k when k > 2, so the summation is asymptotically less than n^k.
Given a sorted array A of size n, we want to find if an element k belongs to this array. What will be the best time complexity to perform this search operation? O(n) O(1) O(n^2) O(log n)
Since the array is sorted, binary search finds k in O(log n) time. The correct answer is D.
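A sketch of the binary search behind that bound, assuming the array is sorted in ascending order:

def binary_search(a, k):
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == k:
            return mid                  # found k
        elif a[mid] < k:
            lo = mid + 1                # discard the left half
        else:
            hi = mid - 1                # discard the right half
    return -1                           # k is not in the array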
Describe the bottom-up approach of Dynamic Programming
The bottom-up approach starts from the base case and builds up to the larger solution. Tabulation is often used for this approach.
Describe the Change Making Problem
The change-making problem relies on Dynamic Programming and optimal substructure to run efficiently. You can use top-down memoization or bottom-up tabulation for this problem. Time complexity is O(A*n), where A is the target amount and n is the number of coin denominations.
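A bottom-up sketch for the minimum-coins variant, where A is the target amount (the names coins and amount are illustrative):

def min_coins(coins, amount):
    # dp[x] = fewest coins needed to make amount x (inf if impossible)
    dp = [0] + [float('inf')] * amount
    for x in range(1, amount + 1):
        for c in coins:
            if c <= x:
                dp[x] = min(dp[x], dp[x - c] + 1)
    return dp[amount]                   # O(A * n) time for n denominations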
We use reduction to prove the NP-Completeness of a problem X from A. As part of the reduction we must prove which of the following statements? Assume A is an NP-Hard problem. Statement P: A can be transformed to X in polynomial time. Statement Q: We can obtain a solution to A from X in polynomial time. A) P alone B) Neither P nor Q C) Q alone D) Both P and Q
The correct answer is D.
Which of the following are properties of a dynamic programming problem? A) Optimal Substructure B) Has a greedy solution C) Overlapping Subproblems D) Both optimal substructure and overlapping subproblems
The correct answer is D. Both optimal substructure and overlapping subproblems are characteristics of a dynamic programming problem.
In the exploration to show that the independent set problem is NP-Complete we have used which of the following NP-Hard problems? A) None of the options B) 2SAT C) Circuit SAT D) 3SAT
The correct answer is D. The 3SAT problem was reduced to the independent set problem in polynomial time, thereby proving that the independent set problem is NP-Hard.
Define the Greedy Approach of programming
The idea of a greedy algorithm is to make the best (Greedy) choice at each step of the problem with the hope that this will ultimately lead to the optimal solution.
Describe the Longest Common Subsequence Problem, its recurrence formula, and its pseudocode
The longest common subsequence (LCS) problem is solved with a Dynamic Programming approach based on whether the current characters of str1 and str2 match. 1) If str1[i] and str2[j] match, extend the common subsequence: LCS[i, j] = 1 + LCS[i-1, j-1]. 2) If they don't match, take the better of skipping a character from either string: LCS[i, j] = max(LCS[i-1, j], LCS[i, j-1]). Commonly solved using bottom-up tabulation.
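A bottom-up tabulation sketch of that recurrence:

def lcs_length(s1, s2):
    m, n = len(s1), len(s2)
    # dp[i][j] = length of the LCS of s1[:i] and s2[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:               # characters match
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:                                    # skip a character from one string
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]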
How do you calculate the number of spanning trees possible for a graph M?
For a complete graph M with V vertices, the number of possible spanning trees is V^(V-2) (Cayley's formula).
Describe the tabulation method of Dynamic Programming
The practice of iteratively filling in a table of subproblem results, building up to the solution of the larger problem. Commonly used in the bottom-up approach of Dynamic Programming.
What is Memoization?
The practice of memoization is to store the values of unique subproblems in a top-down DP solution. By storing the solutions of subproblems, you save the algorithm from having to repeat the same calculations over and over.
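A classic illustration, memoizing Fibonacci with functools.lru_cache so each subproblem is computed only once:

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)      # repeated calls are served from the cache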
Describe the Top-Down approach of Dynamic Programming
The top-down approach to Dynamic Programming breaks the problem up into overlapping subproblems. Memoization is often used to store the results of subproblems, making the algorithm much more efficient.
The following pseudocode is for which of the algorithms?

def mystery(G):
    s <- pick a source vertex from V
    for v in V:
        dist[v] = infinity
        prev[v] = Empty
    # initialize source
    dist[s] = 0
    prev[s] = s
    # update neighbouring nodes of s
    for node in s.neighbours:
        dist[node] = w(s, node)
        prev[node] = s
    while len(visited) < len(V):
        CurrentNode = unvisited vertex v with smallest dist[v]
        MST.add((prev[CurrentNode], CurrentNode))
        for node in CurrentNode.neighbours:
            dist[node] = min(w(CurrentNode, node), dist[node])
            if dist[node] updated:
                prev[node] = CurrentNode
        visited.add(CurrentNode)
    return MST
This is Prim's algorithm. The clues are that it builds an MST and selects the next edge from the neighbours of the tree built so far, rather than from the whole graph (as Kruskal does).
Describe the Circuit SAT problem
Given a Boolean circuit, is there a set of inputs that makes the circuit output TRUE? If there is, we say the circuit is satisfiable (SAT).
A graph can have many spanning trees. (T/F)
True
Dijkstra is used to solve for a shortest path in a weighted graph with non negative weights. (T/F)
True
Is the asymptotic order of growth of n! > 2^n? (T/F) How do you solve this type of question?
True. For n ≥ 4, every additional factor in n! is larger than 2, so n! eventually outgrows 2^n. In general, compare two growth rates by examining the ratio (or the logarithms) of the functions as n → ∞.
Removing the maximum weighted edge from a Hamiltonian cycle will result in a Spanning Tree. (T/F)
True.
If problem A can be solved in polynomial time then A is in NP. (T/F)
True. Explanation: If Problem A can be solved in polynomial time then it is a part of the 'P' subset of 'NP'. All items that are a part of 'P' are also a part of 'NP'.
Is the asymptotic order of growth of log n < sqrt(n)? (T/F) How do you solve this type of question?
True. log n grows more slowly than any positive power of n, including n^(1/2) = sqrt(n), so log n / sqrt(n) → 0 as n → ∞.
Is the asymptotic order of growth of n log n < 2^n? (T/F) How do you solve this type of question?
True. n log n is polynomially bounded, and any polynomial grows more slowly than an exponential such as 2^n, so n log n / 2^n → 0 as n → ∞.
Is the asymptotic order of growth of n^2 > n log n? (T/F) How do you solve this type of question?
True. Divide both sides by n: the comparison becomes n versus log n, and n grows faster than log n, so n^2 grows faster than n log n.
A graph can be represented as a tuple containing two sets. For example: A= ({...},{...}) (T/F)
True. A graph can be represented as a tuple containing two sets, one set for vertices and one set for edges.
P=NP (Polynomial = Non-Deterministic Polynomial) (T/F)
Unknown. Explanation: this is an open research question. Trying to find polynomial-time algorithms for NP-complete problems (or proving that none exist) is an active area of research; so far, nothing has been proven either way.
Given an array A of size n, we want to find if an element k belongs to this array. What will be the time complexity of this search operation? Assume that we don't know anything about the order of elements in the array. A) O(n) B) O(n^2) C) O(log n) D) O(1)
With no information about the order of elements, a linear scan is required, so the worst-case time complexity is O(n). The correct answer is A.
What does it mean for a problem to have overlapping subproblems?
When a problem requires the same calculation many times, it is said to have overlapping subproblems. Techniques such as memoization and tabulation are used to avoid recomputing them.
Describe the Activity Selection Problem
You are given a list of activities {a1, a2, ..., an} with their start times [s1, s2, ..., sn] and end times [e1, e2, ..., en]. Example: activities: {Play Golf, Paint, Cook, Sleep, Jog, Write Code, Eat}, start times: [1, 3, 1, 3, 4, 6, 8], end times: [3, 4, 4, 6, 6, 9, 9]. The solution lies in the greedy approach: sort the activities by end time and take the one with the earliest end time, then repeatedly pick the next activity with the earliest end time that starts after the previously chosen activity ends, and so on.
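A greedy sketch matching that description (the names starts and ends are illustrative):

def select_activities(starts, ends):
    # sort activities by end time: the greedy choice is the earliest finisher
    activities = sorted(zip(starts, ends), key=lambda a: a[1])
    chosen, last_end = [], float('-inf')
    for s, e in activities:
        if s >= last_end:               # compatible with everything already chosen
            chosen.append((s, e))
            last_end = e
    return chosen

With the example above, select_activities([1, 3, 1, 3, 4, 6, 8], [3, 4, 4, 6, 6, 9, 9]) picks four activities.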