Analysis of Algorithms
Bubble Sort Algorithm
- Compare pairs of adjacent elements
- Swap the adjacent elements if they are not in order
- Repeat until no swaps are needed
What is the worst-case scenario for insertion sort running time?
Everything is already sorted in the opposite of the desired order.
Example: [9, 8, 7, 6, 5, 4, 3, 2, 1]
- 8 will shift 1 spot > 1 comparison
- 7 will shift 2 spots > 2 comparisons
- 6 will shift 3 spots > 3 comparisons
- 5 will shift 4 spots > 4 comparisons
- ...
How many comparisons?
$1 + 2 + 3 + 4 + 5 + \dots + 8 = \sum_{i=1}^{n-1} i = -n + \sum_{i=1}^{n} i = -n + \frac{n(n+1)}{2} = \frac{n(n-1)}{2}$
For n = 9 this is 36 comparisons. T(n) is O(n^2).
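The count above can be checked by instrumenting insertion sort. The following is a minimal sketch, not part of the original notes; the function name `insertion_sort_comparisons` is made up for illustration, and it counts only comparisons between array elements.

```python
def insertion_sort_comparisons(values):
    """Sort a copy of values with insertion sort; return the number of element comparisons."""
    a = list(values)
    comparisons = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1      # one comparison of a[j] against x
            if a[j] <= x:
                break             # x is already in the right place
            a[j + 1] = a[j]       # shift the larger element one spot right
            j -= 1
        a[j + 1] = x
    return comparisons


n = 9
reverse_sorted = list(range(n, 0, -1))   # [9, 8, ..., 1], the worst case
already_sorted = list(range(1, n + 1))   # [1, 2, ..., 9], the best case

print(insertion_sort_comparisons(reverse_sorted))  # n(n-1)/2 comparisons
print(insertion_sort_comparisons(already_sorted))  # n-1 comparisons
```

The reverse-sorted input gives the quadratic count, while the already-sorted input needs only one comparison per element, which is the O(n) best case.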
REVIEW
- Theoretical analysis: assume the simple RAM model
- Counting operations
- Sorting algorithms
  > Comparison-based
  > Non-comparison-based
- Not all runtimes can be stated in terms of a single variable!
Counting Sort Review Performance:
- Worst case? O(n + k)
  • O(n) to find the min/max
  • O(n) to count elements
  • O(k) to accumulate frequencies
  • O(n) to place elements in sorted order
Insertion Sort Review Performance
- Worst case? O(n^2)
- More efficient if data is almost sorted already
  > Fewer "swaps" toward the front of the array
Selection Sort Review Performance
- Worst case? O(n^2)
  > When values are in reverse order
- Best case? O(n^2)
  > The inner loop always scans the whole unsorted part, even if the data is already sorted
For given input size, running time may depend on the nature of input
1. Best case: not usually interesting/meaningful
2. Average case: often difficult to derive
3. Worst case: the focus of this course
   - Easier to analyze
   - Crucial to applications requiring an upper bound on performance:
     • Airplane software
     • Medical systems
     • Weather forecasting
     • Search engine results
     • Intrusion detection
Counting Sort Example 4 5 1 3 0 3 1 5 4 1 2
1. Compute the range. Range = max - min + 1 = 5 - 0 + 1 = 6
2. Create an array C to maintain frequencies.
3. Process each value and increment the associated count.
   index: 0 1 2 3 4 5
   data:  1 3 1 2 2 2
4. Accumulate frequencies.
   index: 0 1 2 3 4 5
   data:  1 4 5 7 9 11
5. Use the accumulated frequencies to produce the sorted output.
   4: 9 - 1 = index 8, update table
   5: 11 - 1 = index 10, update table
   1: 4 - 1 = index 3, update table
   3: 7 - 1 = index 6, update table
   etc.
Once done, double check. The table now holds the start index of each value:
   index: 0 1 2 3 4 5
   data:  0 1 4 5 7 9
   0: start at 0
   1: start at 1
   2: start at 4
   3: start at 5
   4: start at 7
   5: start at 9
Sorted output:
   data:  0 1 1 1 2 3 3 4 4 5 5
   index: 0 1 2 3 4 5 6 7 8 9 10
How many total multiplication (*) operations are performed in the entire mystery algorithm?
Algorithm mystery(A)
  Input an array A of n integers
  Output ???
1 value ← A[0] * A[n-1]
2 for i ← 1 to n-1 do
3     A[i] ← A[i] * 2
4 for i ← 0 up to 316 do
5     A[i] ← 3 * A[i]
6 return A[0]
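317 + n
- Line 1: 1 multiplication
- Line 3: n - 1 multiplications
- Line 5: 317 multiplications (i runs from 0 to 316)
Total: 1 + (n - 1) + 317 = 317 + n
T(n) = 317 + n, which is O(n)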
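A quick way to check the count is to run the mystery algorithm with a counter. This is a sketch only; the function name `mystery_multiplications` and the counter are additions for illustration, and it assumes the array has at least 317 elements so that the second loop stays in bounds.

```python
def mystery_multiplications(values):
    """Run the mystery algorithm on a copy of values.

    Returns (result, number of * operations). Assumes len(values) >= 317.
    """
    a = list(values)
    n = len(a)
    mults = 0

    value = a[0] * a[n - 1]       # line 1: one multiplication
    mults += 1

    for i in range(1, n):         # lines 2-3: i = 1 .. n-1
        a[i] = a[i] * 2
        mults += 1

    for i in range(0, 317):       # lines 4-5: i = 0 .. 316
        a[i] = 3 * a[i]
        mults += 1

    return a[0], mults


result, count = mystery_multiplications([1] * 400)
print(result, count)   # count equals 317 + n for n = 400
```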
Example: 24 16 87 316 126 3 246 51 188 213
1. Insert values into buckets based on least significant digit (ONES place)
2. Retrieve values from buckets
3. Insert values into buckets based on next least significant digit (TENS place)
4. Retrieve values from buckets
5. Insert values into buckets based on next least significant digit (HUNDREDS place)
6. Retrieve values from buckets
After ONES pass:     51 3 213 24 16 316 126 246 87 188
After TENS pass:     3 213 16 316 24 126 246 51 87 188
After HUNDREDS pass: 3 16 24 51 87 126 188 213 246 316
SORTED!!
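The three passes can be reproduced with a short bucket-based sketch. The function name `radix_sort_passes` is made up for illustration, and the code assumes non-negative base-10 integers.

```python
def radix_sort_passes(values):
    """LSD radix sort using 10 buckets; returns the list contents after each pass."""
    a = list(values)
    passes = []
    place = 1                                  # 1 = ones, 10 = tens, 100 = hundreds
    while max(a, default=0) // place > 0:
        buckets = [[] for _ in range(10)]
        for v in a:
            buckets[(v // place) % 10].append(v)          # insert by current digit
        a = [v for bucket in buckets for v in bucket]     # retrieve in bucket order
        passes.append(a)
        place *= 10
    return passes


for state in radix_sort_passes([24, 16, 87, 316, 126, 3, 246, 51, 188, 213]):
    print(state)
```

Appending to each bucket and reading buckets from 0 to 9 keeps values with equal digits in their earlier relative order, which is what makes the later passes correct.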
Insertion Sort
Algorithm • Process each element one-by-one • Shift left until in correct sorted order
Bogosort
Algorithm bogoSort(A[])
  Input an array of n elements
  Output the sorted array
  while !isInOrder(A)
      shuffle(A)
Algorithm shuffle(A[])
  Input an array of n elements
  Output random shuffled array
  for x <- 0 to length(A) - 1
      index1 <- floor( random() * length(A) )
      index2 <- floor( random() * length(A) )
      value <- A[index1]
      A[index1] <- A[index2]
      A[index2] <- value

Algorithm isInOrder(A[])
  Input an array of n elements
  Output true if the array is sorted
  for x <- 0 to length(A) - 2
      if A[x] > A[x+1]   // comparison
          return false
  return true

- isInOrder: O(n), because the loop starts at 0 and ends at n-2
- shuffle: O(n)
- while loop: no upper bound on the number of iterations
Runtime for bogosort is unbounded.
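A minimal Python sketch of bogosort follows. It is an illustration, not the notes' exact procedure: it uses the standard library's `random.shuffle` in place of the random-swap shuffle in the pseudocode, and the names `is_in_order` and `bogo_sort` are chosen here.

```python
import random


def is_in_order(a):
    """Return True if a is sorted in ascending order (O(n) comparisons)."""
    return all(a[x] <= a[x + 1] for x in range(len(a) - 1))


def bogo_sort(values):
    """Shuffle a copy of values until it happens to be sorted."""
    a = list(values)
    while not is_in_order(a):     # no upper bound on the number of iterations
        random.shuffle(a)         # assumption: library shuffle instead of random swaps
    return a


print(bogo_sort([3, 1, 2]))
```

Only try this on very small inputs; the expected number of shuffles grows with n!.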
Algorithm insertionSort(A) Input an array of n elements Output the array with elements sorted in ascending order
for i <- 1 to n-1 do
    x <- A[i]
    j <- i - 1
    while j >= 0 and A[j] > x do
        A[j+1] <- A[j]
        j <- j - 1
    A[j+1] <- x

- for loop: O(n)
- while loop: O(n)
- total: O(n) * O(n) = O(n^2)
Counting Operations
Algorithm arrayMax(A)
  Input an array A of n integers
  Output the max value in array
1 currentMax <- A[0]
2 for i <- 0 to n-1 do
3     if A[i] > currentMax then
4         currentMax <- A[i]
5 return currentMax
Line  Cost  Description
1     1     Array access: A[0]
1     1     Variable assignment: currentMax <- A[0]
2     1     Variable assignment: i <- 0
2     n+1   Subtraction: n-1
2     n+1   Comparison: i <= n-1
3     n     Array access: A[i]
3     n     Comparison: A[i] > currentMax
4     n     Array access: A[i]
4     n     Variable assignment: currentMax <- A[i]
(4)   n     Addition: i + 1
(4)   n     Variable assignment: i <- i + 1
5     1     return
TOTAL: 8n + 6, which is O(n)
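Adding up the table row by row gives the total. The grouping below is only a worked restatement of the costs listed above; note that the table charges line 4 on every iteration, so the result is an upper bound.

```latex
\begin{align*}
T(n) &= \underbrace{(1 + 1)}_{\text{line 1}}
      + \underbrace{\bigl(1 + (n+1) + (n+1)\bigr)}_{\text{line 2}}
      + \underbrace{(n + n)}_{\text{line 3}}
      + \underbrace{(n + n)}_{\text{line 4}}
      + \underbrace{(n + n)}_{\text{increment}}
      + \underbrace{1}_{\text{line 5}} \\
     &= 2 + (2n + 3) + 2n + 2n + 2n + 1 \\
     &= 8n + 6
\end{align*}
```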
Slow to Fast Runtimes
log_2(n), n, n^2, n^3, n^5, 2^n, 3^n (growth rate increases from left to right, so the fastest-running algorithms come first)
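The ordering above describes growth for large n, not every small input. The sketch below (names such as `growth_values` are made up for illustration) evaluates each function at two sizes to show that the order only settles once n is large enough.

```python
import math

# (label, function) pairs in the same order as the list above.
GROWTH_FUNCTIONS = [
    ("log2(n)", math.log2),
    ("n", lambda n: n),
    ("n^2", lambda n: n ** 2),
    ("n^3", lambda n: n ** 3),
    ("n^5", lambda n: n ** 5),
    ("2^n", lambda n: 2 ** n),
    ("3^n", lambda n: 3 ** n),
]


def growth_values(n):
    """Evaluate every growth function at input size n."""
    return [f(n) for _, f in GROWTH_FUNCTIONS]


def is_strictly_increasing(values):
    """True if each value is smaller than the one after it."""
    return all(a < b for a, b in zip(values, values[1:]))


for n in (10, 30):
    values = growth_values(n)
    print(n, values, is_strictly_increasing(values))
```

At n = 10 the polynomial n^5 is still larger than 2^n, so the values are not yet in order; by n = 30 the exponentials have overtaken it.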
Insertion sort best case scenario
O(n) Where elements in the input array are already sorted in the desired order.
What is the worst-case number of comparisons in radix sort?
O(wn), where w is the number of digits in the "longest" value
• Always going to repeat once per digit of the "longest" value
• Radix sort HAS to process the digits from right to left (least significant digit first)
Bubble Sort Review
Performance:
- Worst case? O(n^2)
- More efficient if data is almost sorted already
  > Fewer "repeats"
What does the actual running time of software depend on?
- Software: language, compiler
- Hardware: machine used
- Programmer: experience, coding details
- Algorithm: approach to solving the problem
Focus on growth rate analysis of the number of times an ...
Algorithm arrayMax(A)
  Input an array A of n integers
  Output the max value in array
1 currentMax <- A[0]
2 for i <- 0 to n-1 do
3     if A[i] > currentMax then
4         currentMax <- A[i] X 2
5 return currentMax
... essential operation is performed! O(n)
Algorithm selectionSort(A) Input an array of n elements Output the array with elements sorted in increasing order
for i <- 0 up to n-1 do
    min <- i
    for j <- i+1 up to n-1 do
        if A[j] < A[min] then   // comparison
            min <- j
    if NOT i = min then
        x <- A[i]
        A[i] <- A[min]
        A[min] <- x

Inner for loop: $(n-1) + \dots + 1 = \sum_{i=1}^{n-1} i = \frac{(n-1)n}{2}$ = O(n^2)
Or, at a high level: both for loops are O(n) and they are nested, so O(n) * O(n) = O(n^2)
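As a sketch of the pseudocode above in Python, with a comparison counter added for illustration (the name `selection_sort` and the returned tuple are choices made here):

```python
def selection_sort(values):
    """Return (sorted copy of values, number of element comparisons)."""
    a = list(values)
    n = len(a)
    comparisons = 0
    for i in range(n):
        smallest = i                      # index of the minimum unsorted element
        for j in range(i + 1, n):
            comparisons += 1              # a[j] < a[smallest]
            if a[j] < a[smallest]:
                smallest = j
        if smallest != i:
            a[i], a[smallest] = a[smallest], a[i]
    return a, comparisons


print(selection_sort([5, 4, 3, 2, 1]))   # reverse order
print(selection_sort([1, 2, 3, 4, 5]))   # already sorted
```

Both inputs produce the same number of comparisons, n(n-1)/2, which is why the best and worst cases are both O(n^2).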
Running time of algorithm typically grows with
input size
Algorithm radixSort(A) Input an array A of n integers Output the array sorted in ascending order
max <- 0
for i <- 0 to n-1 do
    max <- max( max, A[i] )

// Calculate number of digits in largest value
w <- ceiling( log10(max+1) )
p <- 1   // tracks "place" (ones, tens, hundreds, etc.)

for j <- 1 to w do
    C <- new array with length 10
    for i <- 0 to n-1 do
        C[ (A[i]/p) % 10 ] <- C[ (A[i]/p) % 10 ] + 1
    for i <- 1 to 9 do
        C[i] <- C[i-1] + C[i]
    F <- new array with length n
    for i <- n-1 down to 0 do
        F[ C[ (A[i]/p) % 10 ] - 1 ] <- A[i]
        C[ (A[i]/p) % 10 ] <- C[ (A[i]/p) % 10 ] - 1
    A <- F
    p <- p * 10
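A Python sketch of the same idea follows, using one counting pass per digit. It assumes non-negative integers, and it replaces the `ceiling(log10(max+1))` digit count with an integer loop condition to avoid floating-point rounding; that substitution and the name `radix_sort` are choices made here, not part of the pseudocode.

```python
def radix_sort(values):
    """LSD radix sort of non-negative integers, one counting pass per decimal digit."""
    a = list(values)
    n = len(a)
    largest = max(a, default=0)
    p = 1                                    # tracks the place: 1, 10, 100, ...
    while largest // p > 0:                  # one iteration per digit of the largest value
        counts = [0] * 10
        for v in a:                          # count how often each digit occurs
            counts[(v // p) % 10] += 1
        for d in range(1, 10):               # accumulate frequencies
            counts[d] += counts[d - 1]
        out = [0] * n
        for i in range(n - 1, -1, -1):       # walk backwards to keep the pass stable
            d = (a[i] // p) % 10
            counts[d] -= 1
            out[counts[d]] = a[i]
        a = out
        p *= 10
    return a


print(radix_sort([24, 16, 87, 316, 126, 3, 246, 51, 188, 213]))
```

Walking the input from the last element to the first matters: it keeps values with equal digits in their previous order, so earlier passes are not undone.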
Algorithm countingSort(A) Input an array of n elements Output the array with elements sorted in increasing order
min <- A[0]   // Find the min value
max <- A[0]   // Find the max value
for i <- 0 to n-1 do
    min <- min( A[i], min )
    max <- max( A[i], max )

// Calculate the range of the elements
k <- (max - min + 1)

// Create array to hold counts
C <- new array with length k

// Count frequency of values
for i <- 0 to n-1 do
    C[ A[i] - min ] <- C[ A[i] - min ] + 1

// Accumulate frequencies
for i <- 1 to k-1 do
    C[i] <- C[i-1] + C[i]

// Build final output array
F <- new array with length n
for i <- 0 up to n-1 do
    F[ C[ A[i] - min ] - 1 ] <- A[i]
    C[ A[i] - min ] <- C[ A[i] - min ] - 1
A <- F
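The pseudocode above translates fairly directly to Python. This is a sketch under the assumption that the input holds integers; the name `counting_sort` and the empty-input guard are additions for illustration.

```python
def counting_sort(values):
    """Return a sorted copy of a list of integers using counting sort."""
    if not values:
        return []
    lo, hi = min(values), max(values)       # O(n) to find the min/max
    k = hi - lo + 1                         # range of the elements
    counts = [0] * k
    for v in values:                        # O(n) to count elements
        counts[v - lo] += 1
    for i in range(1, k):                   # O(k) to accumulate frequencies
        counts[i] += counts[i - 1]
    out = [0] * len(values)
    for v in values:                        # O(n) to place elements
        counts[v - lo] -= 1
        out[counts[v - lo]] = v
    return out


print(counting_sort([4, 5, 1, 3, 0, 3, 1, 5, 4, 1, 2]))
```

Subtracting the minimum before indexing lets the same code handle negative values and ranges that do not start at 0.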
Algorithm bubbleSort(A) Input an array A of n integers Output the array sorted in ascending order
repeat <- true
while repeat is TRUE do
    repeat <- false
    for i <- 1 up to n-1 do
        if A[i] < A[i-1] then
            x <- A[i-1]
            A[i-1] <- A[i]
            A[i] <- x
            repeat <- true

- for loop: n - 1 iterations = O(n)
- while loop: in the worst case the last value needs to be shifted all the way to the beginning = O(n)
- total: O(n) * O(n) = O(n^2)
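A Python sketch of this pseudocode, with a counter for the number of passes of the while loop (the counter and the name `bubble_sort` are illustrative additions):

```python
def bubble_sort(values):
    """Return (sorted copy of values, number of passes of the outer while loop)."""
    a = list(values)
    passes = 0
    repeat = True
    while repeat:
        repeat = False
        passes += 1
        for i in range(1, len(a)):
            if a[i] < a[i - 1]:                  # adjacent pair out of order
                a[i - 1], a[i] = a[i], a[i - 1]  # swap
                repeat = True                    # another pass is needed
    return a, passes


print(bubble_sort([1, 2, 3, 4]))   # already sorted: a single pass
print(bubble_sort([4, 3, 2, 1]))   # reverse order: n passes
```

Each pass moves the smallest out-of-place value only one position toward the front, which is why a reversed input needs about n passes of n - 1 comparisons each.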
Number of repetitions depends on
the # of digits in the largest value being sorted
Growth Rate of Running Time
• Changing hardware/software environment - Affects T(n) by a constant factor - Does not alter the growth rate of T(n)
Selection Sort Algorithm
• Find minimum unsorted element, swap with first unsorted element • Continue until data is completely sorted
Counting Sort
• Input array of n integer elements in a range k
• Counting array C
Algorithm
• Process each integer input value, incrementing the counter at C[value] (counting the number of times each value occurs in input)
• Create the sorted output by using C
Radix Sort
• N = radix (number of unique digits used to represent numbers in a numeral system; base 10 radix = 10)
• Auxiliary array B of N buckets
• w is word size of integers (max # of digits in input integers)
Algorithm
• Starting with least significant digit of each element in S, add elements to buckets
• Move entries out of B[i] back to S
• Repeat with the next more significant digit
• We can use counting sort to help us!!!
Non-Comparison Sorting
• Often "distributes" values to other intermediate data structures to help with sorting
• Not comparison-based: do not make comparisons of elements being sorted
• Examples:
  - Counting Sort (Pigeonhole Sort)
  - Radix Sort
Comparison-based Sorting
• Sort by making comparisons between pairs of elements
• We often analyze comparison-based sorting algorithms in terms of the number of comparison operations
• Efficient (on average) for small data sets
• More efficient for data that is almost sorted
• Examples:
  - Bogosort
  - Bubble sort
  - Insertion sort
  - Selection sort
Theoretical Analysis
• Uses pseudocode (high-level description of the algorithm) instead of an implementation
• Characterizes running time as a function of the input size n: T(n)
• Allows us to evaluate the speed of an algorithm independent of the hardware/software environment
The Random Access Machine (RAM) Model
• Very simple model of how a computer performs
• A computer consists of:
  - A CPU
  - A bank with an unlimited number of memory cells
• Assumptions
  - Accessing any cell in memory takes constant time
  - Each "simple" operation takes 1 time step
    > Simple operation examples: +, -, *, /, comparison, variable assignment, method call, accessing an array index
  - Loops and subroutines are not simple
    > depend on number of loop iterations or complexity of subroutine