Algorithms Test 2


Give a dynamic-programming solution to the 0-1 knapsack problem that runs in O(nW) time, where n is the number of items and W is the maximum weight of items that the thief can put in his knapsack.

DYNAMIC-0-1-KNAPSACK(v, w, n, W)
    for w <- 0 to W do
        c[0, w] <- 0
    for i <- 1 to n do
        c[i, 0] <- 0
        for w <- 1 to W do
            if w[i] <= w then
                if v[i] + c[i-1, w - w[i]] > c[i-1, w] then
                    c[i, w] <- v[i] + c[i-1, w - w[i]]
                else
                    c[i, w] <- c[i-1, w]
            else
                c[i, w] <- c[i-1, w]
    return c[n, W]
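
For concreteness, here is a minimal runnable C++ translation of this pseudocode (a sketch; the items in main are hypothetical example data, with index 0 unused so indexing matches the recurrence):

#include <iostream>
#include <vector>
using namespace std;

// Returns the maximum value achievable with capacity W.
// v[i], w[i] are the value and weight of item i (1-based).
int knapsack01(const vector<int>& v, const vector<int>& w, int n, int W)
{
    vector<vector<int>> c(n + 1, vector<int>(W + 1, 0));
    for (int i = 1; i <= n; ++i)
        for (int cap = 1; cap <= W; ++cap) {
            if (w[i] <= cap && v[i] + c[i - 1][cap - w[i]] > c[i - 1][cap])
                c[i][cap] = v[i] + c[i - 1][cap - w[i]];   // take item i
            else
                c[i][cap] = c[i - 1][cap];                 // skip item i
        }
    return c[n][W];
}

int main()
{
    // Example items (entry 0 is a placeholder for 1-based indexing)
    vector<int> v = {0, 60, 100, 120};
    vector<int> w = {0, 10, 20, 30};
    cout << knapsack01(v, w, 3, 50) << endl;   // prints 220
}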

Suppose that we have a set of activities to schedule among a large number of lecture halls. We wish to schedule all the activities using as few lecture halls as possible. Give an efficient greedy algorithm to determine which activity should use which lecture hall. (This is also known as the interval-graph coloring problem. We can create an interval graph whose vertices are the given activities and whose edges connect incompatible activities. The smallest number of colors required to color every vertex so that no two adjacent vertices are given the same color corresponds to finding the fewest lecture halls needed to schedule all of the given activities.)

Find the smallest number of lecture halls in which to schedule a set of activities S. To do this efficiently, move through the activities in order of their start and finish times. Maintain two lists of lecture halls: halls that are busy at time t and halls that are free at time t. When t is the start time of some activity, schedule that activity in a free lecture hall and move the hall to the busy list; similarly, move the hall back to the free list when the activity finishes. Initially start with zero halls; if there are no halls in the free list, create a new hall.

The algorithm uses the fewest halls possible: assume the algorithm used m halls, and consider some activity a that was the first activity scheduled in hall m. Activity a was put in the m-th hall because all of the other m - 1 halls were busy, that is, at the time a is scheduled there are m activities occurring simultaneously. Any algorithm must therefore use at least m halls, and the algorithm is thus optimal.

The algorithm can be implemented by sorting the activities. At each start or finish time we can schedule an activity and move a hall between the lists in constant time. The total time is thus dominated by sorting and is therefore O(n lg n).
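
A C++ sketch of this sweep (the function name and interface are my own; a min-heap of busy halls keyed by finish time plays the role of the busy/free lists):

#include <algorithm>
#include <functional>
#include <iostream>
#include <queue>
#include <vector>
using namespace std;

// Assigns each activity a hall number and returns the number of halls used.
// Activities are (start, finish) pairs over half-open intervals [start, finish).
int scheduleHalls(const vector<pair<int,int>>& acts, vector<int>& hall)
{
    int n = acts.size();
    hall.assign(n, -1);
    vector<int> order(n);
    for (int i = 0; i < n; ++i) order[i] = i;
    // Sweep the activities in order of start time
    sort(order.begin(), order.end(),
         [&](int a, int b) { return acts[a].first < acts[b].first; });

    // Busy halls as a min-heap of (finish time, hall index)
    priority_queue<pair<int,int>, vector<pair<int,int>>,
                   greater<pair<int,int>>> busy;
    int halls = 0;
    for (int i : order) {
        if (!busy.empty() && busy.top().first <= acts[i].first) {
            hall[i] = busy.top().second;   // a hall has freed up: reuse it
            busy.pop();
        } else {
            hall[i] = halls++;             // every hall is busy: open a new one
        }
        busy.push({acts[i].second, hall[i]});
    }
    return halls;
}

int main()
{
    vector<pair<int,int>> acts = {{1, 4}, {2, 5}, {4, 6}};
    vector<int> hall;
    cout << scheduleHalls(acts, hall) << " halls" << endl;   // 2 halls
    for (int i = 0; i < (int)acts.size(); ++i)
        cout << "activity " << i << " -> hall " << hall[i] << endl;
}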

Show how to compute the length of an LCS using only 2 · min(m, n) entries in the c table plus O(1) additional space. Then show how to do this using min(m, n) entries plus O(1) additional space.

First case, 2 · min(m, n) entries: when computing c[i, j], the recurrence reads only row i - 1 and the already-computed prefix of row i, so two rows suffice. Arrange the input so that the shorter sequence (length min(m, n)) indexes the columns, keep a previous row and a current row, and swap them after each row of the table is filled; the final entry of the last row is the LCS length.
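
A two-row C++ sketch (the function name is mine; the strings in main are the binary sequences from the LCS question later in this set):

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// LCS length using two rows of min(m, n) + 1 entries each.
int lcsLength(string X, string Y)
{
    if (X.size() < Y.size())
        swap(X, Y);                        // make Y the shorter sequence
    int m = X.size(), n = Y.size();
    vector<int> prev(n + 1, 0), cur(n + 1, 0);
    for (int i = 1; i <= m; ++i) {
        for (int j = 1; j <= n; ++j) {
            if (X[i - 1] == Y[j - 1])
                cur[j] = prev[j - 1] + 1;  // needs only the previous row
            else
                cur[j] = max(prev[j], cur[j - 1]);
        }
        swap(prev, cur);                   // current row becomes previous row
    }
    return prev[n];
}

int main()
{
    cout << lcsLength("10010101", "010110110") << endl;   // prints 6
}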

Write pseudocode for the procedure CONSTRUCT-OPTIMAL-BST(root) which, given the table root, outputs the structure of an optimal binary search tree. For the example in Figure 15.8, your procedure should print out the structure:

k2 is the root
k1 is the left child of k2
d0 is the left child of k1
d1 is the right child of k1
k5 is the right child of k2
k4 is the left child of k5
k3 is the left child of k4
d2 is the left child of k3
d3 is the right child of k3
d4 is the right child of k4
d5 is the right child of k5

#include <iostream>
using namespace std;

const int MaxVal = 9999;
const int n = 5;

// Search probabilities of the real keys (p) and dummy keys (q); p[0] is unused
double p[n + 1] = {-1, 0.15, 0.10, 0.05, 0.10, 0.20};
double q[n + 1] = {0.05, 0.10, 0.05, 0.05, 0.05, 0.10};

int root[n + 1][n + 1];   // root[i][j]: root of the optimal subtree for keys i..j
double w[n + 2][n + 2];   // w[i][j]: probability sum of the subtree
double e[n + 2][n + 2];   // e[i][j]: expected search cost of the subtree

void optimalBST(double *p, double *q, int n)
{
    // Base case: empty ranges containing only the dummy key d_{i-1}
    for (int i = 1; i <= n + 1; ++i) {
        w[i][i - 1] = q[i - 1];
        e[i][i - 1] = q[i - 1];
    }
    // Build subtrees bottom-up, by increasing range length
    for (int len = 1; len <= n; ++len) {
        for (int i = 1; i <= n - len + 1; ++i) {
            int j = i + len - 1;
            e[i][j] = MaxVal;
            w[i][j] = w[i][j - 1] + p[j] + q[j];
            // Find the root k that gives the subtree the lowest cost
            for (int k = i; k <= j; ++k) {
                double temp = e[i][k - 1] + e[k + 1][j] + w[i][j];
                if (temp < e[i][j]) {
                    e[i][j] = temp;
                    root[i][j] = k;
                }
            }
        }
    }
}

// Output the root of every subtree
void printRoot()
{
    cout << "The root of each subtree:" << endl;
    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= n; ++j)
            cout << root[i][j] << " ";
        cout << endl;
    }
    cout << endl;
}

// Print the structure of the optimal BST: the subtree for keys i..j,
// hanging below the key with index r (r = -1 for the whole tree)
void printOptimalBST(int i, int j, int r)
{
    int rootChild = root[i][j];
    if (r == -1) {
        // Root of the entire tree
        cout << "k" << rootChild << " is the root" << endl;
        printOptimalBST(i, rootChild - 1, rootChild);
        printOptimalBST(rootChild + 1, j, rootChild);
        return;
    }
    if (j < i - 1)
        return;
    if (j == i - 1) {
        // Empty key range: a dummy key
        if (j < r)
            cout << "d" << j << " is the left child of k" << r << endl;
        else
            cout << "d" << j << " is the right child of k" << r << endl;
        return;
    }
    // Internal node
    if (rootChild < r)
        cout << "k" << rootChild << " is the left child of k" << r << endl;
    else
        cout << "k" << rootChild << " is the right child of k" << r << endl;
    printOptimalBST(i, rootChild - 1, rootChild);
    printOptimalBST(rootChild + 1, j, rootChild);
}

int main()
{
    optimalBST(p, q, n);
    printRoot();
    cout << "Optimal binary tree structure:" << endl;
    printOptimalBST(1, n, -1);
}

Give an O(n lg n)-time algorithm to find the longest monotonically increasing subsequence of a sequence of n numbers. (Hint: Observe that the last element of a candidate subsequence of length i is at least as large as the last element of a candidate subsequence of length i - 1. Maintain candidate subsequences by linking them through the input sequence.)

#include <iostream>
using namespace std;

// Binary search: position of key in c[0..len], or the index where it
// would be inserted (c is sorted increasing; c[0] = -1 is a sentinel).
int find(int *c, int len, int key)
{
    int left = 0, right = len, mid = (left + right) / 2;
    while (left <= right) {
        if (key > c[mid])
            left = mid + 1;
        else if (key < c[mid])
            right = mid - 1;
        else
            return mid;
        mid = (left + right) / 2;
    }
    return left;
}

int main()
{
    int n, a[100], c[100], j, len;
    cin >> n;
    for (int i = 0; i < n; i++)
        cin >> a[i];
    // c[j] holds the smallest possible last element of an increasing
    // subsequence of length j
    c[0] = -1;
    c[1] = a[0];
    len = 1;
    for (int i = 1; i < n; i++) {
        j = find(c, len, a[i]);
        c[j] = a[i];
        if (j > len)
            len = j;
    }
    cout << len << endl;   // length of the longest increasing subsequence
    return 0;
}

What are the steps to develop a greedy algorithm?

1. Determine the optimal substructure of the problem.
2. Develop a recursive solution.
3. Show that if we make the greedy choice, then only one subproblem remains.
4. Prove that it is always safe to make the greedy choice.
5. Develop a recursive algorithm that implements the greedy strategy.
6. Convert the recursive algorithm to an iterative algorithm.

What is the activity-selection problem?

A problem in which we wish to select a maximum-size subset of mutually compatible activities.

Professor Gekko has always dreamed of inline skating across North Dakota. He plans to cross the state on highway U.S. 2, which runs from Grand Forks, on the eastern border with Minnesota, to Williston, near the western border with Montana. The professor can carry two liters of water, and he can skate m miles before running out of water. (Because North Dakota is relatively flat, the professor does not have to worry about drinking water at a greater rate on uphill sections than on flat or downhill sections.) The professor will start in Grand Forks with two full liters of water. His official North Dakota state map shows all the places along U.S. 2 at which he can refill his water and the distances between these locations. The professor's goal is to minimize the number of water stops along his route across the state. Give an efficient method by which he can determine which water stops he should make. Prove that your strategy yields an optimal solution, and give its running time.

The optimal strategy is the obvious greedy one. Starting with both bottles full, Professor Gekko should go to the westernmost place at which he can refill his bottles within m miles of Grand Forks. Fill up there. Then go to the westernmost refilling location he can get to within m miles of where he filled up, fill up there, and so on. Looked at another way, at each refilling location, Professor Gekko should check whether he can make it to the next refilling location without stopping at this one. If he can, skip this one; if he cannot, then fill up. Professor Gekko doesn't need to know how much water he has or how far the next refilling location is to implement this approach, since at each fill-up he can determine which is the next location at which he'll need to stop.

This problem has optimal substructure. Suppose there are n possible refilling locations. Consider an optimal solution with s refilling locations whose first stop is at the kth location. Then the rest of the optimal solution must be an optimal solution to the subproblem of the remaining n - k stations. Otherwise, if there were a better solution to the subproblem, i.e., one with fewer than s - 1 stops, we could use it to come up with a solution with fewer than s stops for the full problem, contradicting our supposition of optimality.

This problem also has the greedy-choice property. Suppose there are k refilling locations beyond the start that are within m miles of the start. The greedy solution chooses the kth location as its first stop. No station beyond the kth works as a first stop, since Professor Gekko would run out of water first. If a solution chooses a location j < k as its first stop, then Professor Gekko could choose the kth location instead, having at least as much water when he leaves the kth location as if he'd chosen the jth location. Therefore, he would get at least as far without filling up again if he had chosen the kth location.
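
A C++ sketch of the skip-ahead rule (the names and interface are mine: d holds the refill locations in miles from the start in increasing order, total is the route length, m is the range on full bottles, and the trip is assumed feasible):

#include <iostream>
#include <vector>
using namespace std;

// Returns the indices of the refill locations where the professor stops.
vector<int> waterStops(const vector<double>& d, double total, double m)
{
    vector<int> stops;
    double last = 0.0;                     // position of the last fill-up
    int n = d.size();
    for (int i = 0; i < n; ++i) {
        // Can he reach the next location (or the finish) without this stop?
        bool nextReachable = (i + 1 < n) ? (d[i + 1] - last <= m)
                                         : (total - last <= m);
        if (!nextReachable) {              // no: he must fill up here
            stops.push_back(i);
            last = d[i];
        }
    }
    return stops;
}

int main()
{
    vector<double> d = {60, 120, 180};     // a 200-mile route, m = 100
    for (int i : waterStops(d, 200, 100))
        cout << "fill up at mile " << d[i] << endl;   // miles 60 and 120
}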

Describe a greedy algorithm that, given a set {x1, x2, ..., xn} of points on the real line, determines the smallest set of unit-length closed intervals that contains all of the given points. Prove that your greedy choice is correct and that the problem has optimal substructure.

First we sort the set of n points {x1, x2, ..., xn} to get the set Y = {y1, y2, ..., yn} such that y1 ≤ y2 ≤ ... ≤ yn. Next, we do a linear scan on {y1, y2, ..., yn} starting from y1. Every time we encounter a point yi, for some i ∈ {1, ..., n}, that is not yet covered, we put the closed interval [yi, yi + 1] in our solution set S and remove all the points in Y covered by [yi, yi + 1]. Repeat the above procedure, and output S once Y becomes empty.

We next show that S is an optimal solution. We claim that there is an optimal solution which contains the unit-length interval [y1, y1 + 1]. Suppose that there exists an optimal solution S* such that y1 is covered by [x', x' + 1] ∈ S* where x' < y1. Since y1 is the leftmost element of the given set, there is no other point lying in [x', y1). Therefore, if we replace [x', x' + 1] in S* by [y1, y1 + 1], we get another optimal solution. This proves the claim and thus establishes the greedy-choice property. Therefore, by solving the remaining subproblem after removing all the points lying in [y1, y1 + 1], that is, finding an optimal set of intervals S' covering the points to the right of y1 + 1, we get an optimal solution to the original problem by taking the union of {[y1, y1 + 1]} and S'; this is the optimal-substructure property.

The running time of our algorithm is O(n log n + n) = O(n log n), where O(n log n) is the time for sorting and O(n) is the time for the linear scan.
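
A C++ sketch of the scan (the function name is mine; it returns the left endpoints of the chosen unit intervals):

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

vector<double> coverPoints(vector<double> x)
{
    sort(x.begin(), x.end());
    vector<double> left;
    size_t i = 0;
    while (i < x.size()) {
        double start = x[i];               // greedy choice: [x_i, x_i + 1]
        left.push_back(start);
        while (i < x.size() && x[i] <= start + 1.0)
            ++i;                           // skip every point it covers
    }
    return left;
}

int main()
{
    for (double l : coverPoints({0.5, 1.2, 1.4, 3.0, 3.9, 4.2}))
        cout << "[" << l << ", " << l + 1 << "] ";
    cout << endl;                          // [0.5, 1.5] [3, 4] [4.2, 5.2]
}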

What is a good example of greedy algorithms?

For this algorithm, a simple example is coin-changing: to minimize the number of U.S. coins needed to make change for a given amount, we can repeatedly select the largest-denomination coin that is not larger than the amount that remains.
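
As a minimal C++ sketch of this coin-changing greedy:

#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<int> coins = {25, 10, 5, 1};    // U.S. denominations in cents
    int amount = 67, count = 0;
    for (int c : coins)
        while (amount >= c) {              // take the largest coin that fits
            amount -= c;
            ++count;
        }
    cout << count << endl;                 // 67 = 25+25+10+5+1+1: 6 coins
}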

Give an O(n^2)-time algorithm to find the longest monotonically increasing subsequence of a sequence of n numbers.

Given a sequence X = <x1, x2, ..., xn>, we wish to find the longest monotonically increasing subsequence.

Approach 1 (via LCS): First sort X, producing a sequence X'. Finding the longest common subsequence of X and X' yields the longest monotonically increasing subsequence of X. The running time is O(n^2), since sorting can be done in O(n lg n) and the call to LCS-LENGTH is O(n^2).

Approach 2 (direct DP): Let A[0..n-1] be the input array and let LIS[0..n-1] store the length of the longest increasing subsequence ending at A[i] for each i.

LIS[i] = 1 for all 0 <= i < n
for i = 1 to n-1:
    for j = 0 to i-1:
        if A[i] > A[j] and LIS[j] + 1 > LIS[i]:
            LIS[i] = LIS[j] + 1
Required solution = max(LIS[i], 0 <= i < n)

The time complexity of this solution is O(n^2). Example: for the array {10, 22, 29, 33, 21, 50, 41, 60, 70}, the length of the LIS is 7.
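
The same DP as a small runnable C++ program, using the example array above:

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<int> a = {10, 22, 29, 33, 21, 50, 41, 60, 70};
    int n = a.size();
    vector<int> lis(n, 1);                 // lis[i]: LIS length ending at a[i]
    for (int i = 1; i < n; ++i)
        for (int j = 0; j < i; ++j)
            if (a[i] > a[j])
                lis[i] = max(lis[i], lis[j] + 1);
    cout << *max_element(lis.begin(), lis.end()) << endl;   // prints 7
}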

How are greedy algorithms better than dynamic programming?

Greedy algorithms provide an optimal solution for many such problems much more quickly than would a dynamic-programming approach.

Huffman Coding

Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters, where the lengths of the assigned codes are based on the frequencies of the corresponding characters: the most frequent character gets the smallest code and the least frequent character gets the largest code. The variable-length codes assigned to input characters are prefix codes, meaning the codes (bit sequences) are assigned in such a way that the code for one character is never a prefix of the code for any other character. This is how Huffman coding ensures there is no ambiguity when decoding the generated bit stream.

There are two major parts to Huffman coding: 1) build a Huffman tree from the input characters, and 2) traverse the Huffman tree and assign codes to characters.

Steps to build a Huffman tree (input is an array of unique characters along with their frequencies of occurrence; output is the Huffman tree):
1. Create a leaf node for each unique character and build a min-heap of all leaf nodes. (The min-heap is used as a priority queue; the frequency field is used to compare two nodes, so initially the least frequent character is at the root.)
2. Extract the two nodes with the minimum frequency from the min-heap.
3. Create a new internal node with frequency equal to the sum of the two nodes' frequencies. Make the first extracted node its left child and the other extracted node its right child. Add this node to the min-heap.
4. Repeat steps 2 and 3 until the heap contains only one node. The remaining node is the root node and the tree is complete.
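
A C++ sketch of steps 1-4 plus the code-assignment traversal (the node layout and the sample frequencies in main are illustrative):

#include <iostream>
#include <queue>
#include <string>
#include <vector>
using namespace std;

struct Node {
    char ch;                 // '\0' for internal nodes
    int freq;
    Node *left, *right;
};

struct ByFreq {              // min-heap ordering on frequency
    bool operator()(const Node* a, const Node* b) const {
        return a->freq > b->freq;
    }
};

// Steps 1-4: repeatedly merge the two least frequent nodes.
Node* buildHuffman(const vector<pair<char,int>>& freqs)
{
    priority_queue<Node*, vector<Node*>, ByFreq> pq;
    for (auto& f : freqs)
        pq.push(new Node{f.first, f.second, nullptr, nullptr});
    while (pq.size() > 1) {
        Node* l = pq.top(); pq.pop();
        Node* r = pq.top(); pq.pop();
        pq.push(new Node{'\0', l->freq + r->freq, l, r});
    }
    return pq.top();         // the remaining node is the root
}

// Part 2: traverse the tree, appending 0 for left and 1 for right.
void printCodes(const Node* t, const string& code)
{
    if (!t->left && !t->right) {
        cout << t->ch << ": " << code << endl;
        return;
    }
    printCodes(t->left, code + "0");
    printCodes(t->right, code + "1");
}

int main()
{
    Node* root = buildHuffman({{'a',45},{'b',13},{'c',12},
                               {'d',16},{'e',9},{'f',5}});
    printCodes(root, "");    // prints one optimal prefix code per character
}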

dynamic programming

In mathematics and computer science, dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems. It is applicable to problems exhibiting overlapping subproblems which are only slightly smaller, and optimal substructure. When applicable, the method takes far less time than naive methods.

Suppose that instead of always selecting the first activity to finish, we instead select the last activity to start that is compatible with all previously selected activities. Describe how this approach is a greedy algorithm, and prove that it yields an optimal solution.

This is the mirror image of selecting the activity that finishes first: we scan the activities backward, greedily keeping at each step the compatible activity that starts last.

GREEDY-ACTIVITY-SELECTOR(s, f)
    n <- length[s]
    A <- {a_n}
    i <- n
    for m <- n - 1 downto 1 do
        if f_m <= s_i then
            A <- A ∪ {a_m}
            i <- m
    return A

What is an optimal binary search tree?

It's a binary search tree, built for a set of keys with known search probabilities, whose expected search cost is minimized; the most frequently searched keys sit nearest the root.

Give a memoized version of LCS-LENGTH that runs in O(mn) time.

LCS-LENGTH(X, Y)
    m <- length[X]
    n <- length[Y]
    for i <- 0 to m do
        for j <- 0 to n do
            c[i, j] <- -1
    return LOOKUP-LENGTH(X, Y, m, n)

LOOKUP-LENGTH(X, Y, i, j)
    if c[i, j] > -1 then
        return c[i, j]
    if i = 0 or j = 0 then
        c[i, j] <- 0
    else if X[i] = Y[j] then
        c[i, j] <- LOOKUP-LENGTH(X, Y, i-1, j-1) + 1
    else
        c[i, j] <- max(LOOKUP-LENGTH(X, Y, i, j-1), LOOKUP-LENGTH(X, Y, i-1, j))
    return c[i, j]

Knuth [184] has shown that there are always roots of optimal subtrees such that root[i, j - 1] ≤ root[i, j] ≤ root[i + 1, j] for all 1 ≤ i < j ≤ n. Use this fact to modify the OPTIMAL-BST procedure to run in Θ(n^2) time.

Line 9 is replaced with:

if i = j then
    r <- j
else
    for r <- root[i, j-1] to root[i+1, j]

Show that no compression scheme can expect to compress a file of randomly chosen 8-bit characters by even a single bit.

Notice that a file of n randomly chosen 8-bit characters contains 8n random bits, so there are 2^(8n) equally likely source files in S, while the number of possible encoded files in E shorter than 8n bits is 2^0 + 2^1 + ... + 2^(8n-1) = 2^(8n) - 1. Since any lossless compression algorithm must assign each element s ∈ S to a distinct element e ∈ E, there are not enough shorter files to go around, so the algorithm cannot hope to compress every source file, and on a random file it cannot expect to save even a single bit.

Prove that the fractional knapsack problem has the greedy-choice property.

Proof: We need to show that we can construct a globally optimal solution from locally optimal choices, i.e., that if every time we pick the commodity with the greatest value per pound, we finally obtain the optimal value of all commodities under the weight limit W.

Consider a non-empty set S of commodities, and let Cm be the commodity with the greatest value per pound (v[m]/w[m]) in S. Let A be a subset of S that has the greatest value under limit W, and let Cj be the commodity with the greatest value per pound (v[j]/w[j]) in A. To prove the claim, we need to show that Cm must be in the subset A. Picking Cm leads to two cases:

Case 1: w[m] >= W. In this case, the knapsack is filled entirely with Cm. Because Cm is the commodity with the greatest value per pound, W · (v[m]/w[m]) is the greatest value among all W · (v[i]/w[i]), where 1 <= i <= n, and the claim is proved.

Case 2: w[m] < W. In this case, we know that for weight w[m], v[m] is the greatest value that amount of commodities can bring. If Cm = Cj, the claim is proved. If Cm ≠ Cj, then to reach value v[m], the weight needed from Cj must be more than w[m], because v[m] is, as shown, the greatest value obtainable from weight w[m]. That is, if we replace Cm with Cj in A, the value of the knapsack actually decreases, contradicting the assumption that the original solution was optimal.
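
A C++ sketch of the fractional-knapsack greedy this property justifies (the interface is mine; items are (value, weight) pairs):

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

double fractionalKnapsack(vector<pair<double,double>> items, double W)
{
    // Sort by decreasing value per pound
    sort(items.begin(), items.end(),
         [](const pair<double,double>& a, const pair<double,double>& b) {
             return a.first / a.second > b.first / b.second;
         });
    double total = 0.0;
    for (auto& it : items) {
        if (W <= 0) break;
        double take = min(it.second, W);        // whole item, or what still fits
        total += it.first * (take / it.second);
        W -= take;
    }
    return total;
}

int main()
{
    cout << fractionalKnapsack({{60, 10}, {100, 20}, {120, 30}}, 50) << endl;
    // takes all of the first two items and 20/30 of the third: prints 240
}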

Suppose that in a 0-1 knapsack problem, the order of the items when sorted by increasing weight is the same as their order when sorted by decreasing value. Give an efficient algorithm to find an optimal solution to this variant of the knapsack problem, and argue that your algorithm is correct.

Solution: Let i1, ..., in be the items, with values v1, ..., vn and weights w1, ..., wn, and let W be the maximum knapsack weight. We have:

w1 ≤ w2 ≤ ... ≤ wn    (1)
v1 ≥ v2 ≥ ... ≥ vn    (2)

The following linear-time greedy algorithm does the job:

KNAPSACK
    w = 0          // knapsack weight
    S = ∅          // knapsack contents
    for i = 1 to n
        if w + wi ≤ W
            w = w + wi
            S = S ∪ {i}

Proof of correctness: A proof of correctness of a greedy algorithm usually consists of two steps.

The greedy-choice property: Let S be an optimal knapsack load. We show that without loss of generality one can assume i1 ∈ S. Indeed, if i1 ∉ S, let k be the smallest index of an item in S, and consider the packing S' = (S \ {ik}) ∪ {i1}. Since w1 ≤ wk, we have w(S') ≤ w(S) ≤ W, so S' is a legal packing. On the other hand, v1 ≥ vk implies v(S') ≥ v(S), so S' is also optimal.

The optimal-substructure property: For an optimal packing S with i1 ∈ S, the packing S'' = S \ {i1} is optimal for the items i2, ..., in with capacity W'' = W - w1. Indeed, if S'' were not optimal, then one could improve the original packing S by improving S''.
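
A direct C++ sketch of the algorithm (items are (weight, value) pairs already in the sorted order of (1)-(2)):

#include <iostream>
#include <vector>
using namespace std;

vector<int> specialKnapsack(const vector<pair<int,int>>& items, int W)
{
    vector<int> S;                          // indices of chosen items
    int w = 0;                              // current knapsack weight
    for (int i = 0; i < (int)items.size(); ++i)
        if (w + items[i].first <= W) {      // item fits: take it
            w += items[i].first;
            S.push_back(i);
        }
    return S;
}

int main()
{
    // Increasing weights, decreasing values
    vector<pair<int,int>> items = {{2, 10}, {3, 8}, {5, 6}, {7, 4}};
    for (int i : specialKnapsack(items, 10))
        cout << "take item " << i << endl;  // items 0, 1, 2
}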

Suppose that instead of always selecting the first activity to finish, we instead select the last activity to start that is compatible with all previously selected activities. Describe how this approach is a greedy algorithm, and prove that it yields an optimal solution.

Solution: "A greedy algorithm always makes the choice that looks best at the moment. That is, it makes a locally optimal choice in the hope that this choice will lead to a globally optimal solution." (p. 414 of text)

Description of how this is a greedy algorithm: The approach described in the problem is indeed a greedy algorithm; it simply works from the end of the schedule rather than the beginning. Sort the activities in order of decreasing start time, and repeatedly select the activity that starts last among those compatible with everything already selected, discarding any activity that overlaps a selected one. Each choice is locally optimal: it leaves as much room as possible before it for the remaining activities.

Proof that it yields an optimal solution: This is the mirror image of Theorem 16.1. Consider any nonempty subproblem, let am be the activity in it with the latest start time, and let A be a maximum-size subset of mutually compatible activities for this subproblem, with ak the activity in A with the latest start time. If ak = am, we are done. If ak ≠ am, then the set A' = (A \ {ak}) ∪ {am} has the same size as A and is still mutually compatible: every other activity in A finishes before ak starts, hence before am starts, since sk ≤ sm. So some maximum-size subset contains the greedy choice am, and the same argument applies to the subproblem that remains after making it.

Give a dynamic-programming algorithm for the activity-selection problem, based on recurrence (16.2). Have your algorithm compute the sizes c[i, j] as defined above and also produce the maximum-size subset of mutually compatible activities. Assume that the inputs have been sorted as in equation (16.1). Compare the running time of your solution to the running time of GREEDY-ACTIVITYSELECTOR.

Solution: Recurrence (16.2):

c[i, j] = 0                                          if S_ij = ∅
c[i, j] = max { c[i, k] + c[k, j] + 1 : a_k ∈ S_ij } if S_ij ≠ ∅

Equation (16.1): f1 ≤ f2 ≤ f3 ≤ ... ≤ f(n-1) ≤ fn

DYNAMIC-ACTIVITY-SELECTOR(s, f)
    n = length[s]
    for i = 0 to n
        c[i, i] = 0
    for m = 2 to n
        for i = 1 to n - m + 1
            j = i + m - 1
            c[i, j] = 0
            for k = i + 1 to j - 1
                if a_k is compatible with a_i and a_j    // f_i ≤ s_k and f_k ≤ s_j
                    q = c[i, k] + c[k, j] + 1
                    if q > c[i, j]
                        c[i, j] = q

The run time of this algorithm is O(n^3), versus the greedy algorithm's Θ(n) run time on input already sorted by finish time (O(n lg n) if it must sort first).

Determine an LCS of <1,0,0,1,0,1,0,1> and <0,1,0,1,1,0,1,1,0>

The LCS is <1, 0, 0, 1, 1, 0>, or it can be <1, 0, 1, 0, 1, 0>; both have length 6.

Give a dynamic-programming algorithm for the activity-selection problem, based on the recurrence (16.3). Have your algorithm compute the sizes c[i, j] as defined above and also produce the maximum-size subset A of activities. Assume that the inputs have been sorted as in equation (16.1). Compare the running time of your solution to the running time of GREEDY-ACTIVITY-SELECTOR.

The dynamic-programming solution runs in O(n^3) time, while the greedy algorithm runs in O(n) time.

DYNAMIC-ACTIVITY-SELECTOR(S)
    initialize c[i, j] <- 0 for all i, j
    for len <- 2 to n                      // subproblems by increasing length
        for i <- 1 to n - len + 1
            j <- i + len - 1
            for k <- i + 1 to j - 1        // consider each a_k in S_ij
                if a_k is compatible with a_i and a_j
                   and c[i, j] < c[i, k] + c[k, j] + 1
                    c[i, j] <- c[i, k] + c[k, j] + 1
                    s[i, j] <- k           // record the choice to rebuild A
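
A runnable C++ version of this DP (a sketch: it adds fictitious boundary activities a_0 and a_{n+1}, makes the compatibility test f_i <= s_k and f_k <= s_j explicit, and computes only the sizes c[i, j]; the sample data in main is the usual 11-activity example):

#include <algorithm>
#include <climits>
#include <iostream>
#include <vector>
using namespace std;

// Size of a maximum set of mutually compatible activities.
// s and f are start/finish times sorted by finish time (equation 16.1).
int maxActivities(vector<int> s, vector<int> f)
{
    int n = s.size();
    // Fictitious activities: a_0 finishes at 0, a_{n+1} starts "at infinity"
    s.insert(s.begin(), 0);  f.insert(f.begin(), 0);
    s.push_back(INT_MAX);    f.push_back(INT_MAX);
    vector<vector<int>> c(n + 2, vector<int>(n + 2, 0));
    for (int len = 2; len <= n + 1; ++len)        // subproblems by length
        for (int i = 0; i + len <= n + 1; ++i) {
            int j = i + len;
            for (int k = i + 1; k < j; ++k)       // try a_k as the chosen activity
                if (f[i] <= s[k] && f[k] <= s[j]) // a_k lies in S_ij
                    c[i][j] = max(c[i][j], c[i][k] + c[k][j] + 1);
        }
    return c[0][n + 1];
}

int main()
{
    vector<int> s = {1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12};
    vector<int> f = {4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16};
    cout << maxActivities(s, f) << endl;          // prints 4
}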

Suppose we have an optimal prefix code on a set C = {0, 1, ..., n-1} of characters and we wish to transmit this code using as few bits as possible. Show how to represent any optimal prefix code on C using only 2n - 1 + n lg n bits.

Given a set of messages with probabilities p1, p2, ..., pn, the Huffman code tree is constructed by recursively combining subtrees: start with n trees, each containing a single node corresponding to one message word, with weight pi. Repeat until there is only one tree: take the two subtrees with the smallest weights and combine them by adding a new node as root and making the two trees its children; the weight of the new tree is the sum of the weights of the two subtrees. With a heap, each combining step takes O(log n) time, and the total time is O(n log n).

To transmit the resulting code in 2n - 1 + n lg n bits, note that the tree of an optimal prefix code is a full binary tree with n leaves and n - 1 internal nodes, 2n - 1 nodes in all. Record its shape as a preorder walk writing one bit per node (say 0 for an internal node, 1 for a leaf), which takes 2n - 1 bits, and then list the n characters in the order their leaves appear, at lg n bits each, for n lg n more bits.

What is the key idea of a greedy algorithm?

The key idea in this algorithm is to make each choice in a locally optimal manner.

longest common subsequence problem

The longest common subsequence (LCS) problem is to find the longest subsequence common to all sequences in a set of sequences (often just two). Note that a subsequence is different from a substring: a substring's characters must be consecutive in the original string, while a subsequence's need not be. It is a classic computer science problem, the basis of file comparison programs such as diff, and has applications in bioinformatics.

Suppose that instead of maintaining the table w[i, j], we computed the value of w(i, j) directly from equation (15.17) in line 8 of OPTIMAL-BST and used this computed value in line 10. How would this change affect the asymptotic running time of OPTIMAL-BST?

The asymptotic running time is still Θ(n^3). Computing w(i, j) directly from equation (15.17) takes Θ(j - i) time instead of O(1), but the algorithm already spends Θ(j - i) time on entry [i, j] in its innermost loop over r, so the triple loop still dominates and the total remains Θ(n^3).

What are greedy algorithms?

These algorithms always make the choice that looks best at the moment. That is, they make a locally optimal choice in the hope that this choice will lead to a globally optimal solution.

Huffman Analysis

Time complexity: O(n log n), where n is the number of unique characters. If there are n nodes, extractMin() is called 2(n - 1) times, and each extractMin() call takes O(log n) time because it calls minHeapify(). So the overall complexity is O(n log n).

How can we determine if a greedy algorithm is effective?

Use matroid theory, which provides a mathematical basis that can help show that a greedy algorithm yields an optimal solution.

What is a downside to greedy algorithms?

We cannot always easily tell whether a greedy approach will be effective, i.e., whether its locally optimal choices actually lead to a globally optimal solution.

What assumption is made in the activity-selection problem?

We assume that the activities are sorted in monotonically increasing order of finish time.

Show how to compute the length of an LCS using only 2 · min(m, n) entries in the c table plus O(1) additional space. Then show how to do this using min(m, n) entries plus O(1) additional space.

Second case, min(m, n) entries: a single row suffices. Again let the shorter sequence index the columns and overwrite the row in place from left to right. The only value the recurrence needs that overwriting destroys is c[i-1, j-1], so before overwriting entry j, remember its old value in a temporary variable; the old value of entry j - 1, saved one step earlier, serves as the diagonal value c[i-1, j-1]. This is O(1) additional space.

