CS 610 Data Structures & Algorithms Chapter 1 Algorithm Analysis
Purpose of random variable
In algorithm analysis, we characterize the running time of a randomized algorithm with a random variable X whose discrete set of possible outcomes comes from a sample space S, the set of all possible outcomes of the random sources used by the algorithm.
Importance of Asymptotic Notation
It allows us to quickly see the long-term performance of an algorithm as its input size grows
Iteration
Repetition of a list of steps by an algorithm; a loop
Random Access Machine(RAM) Model
A computational model that views a computer as a CPU connected to a bank of memory cells. The CPU is able to access any arbitrary memory cell at any time with one primitive operation.
Clearable Table(Simple Array Implementation)
A data structure that stores a table of elements which can be accessed by their indices in the table
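A minimal Python sketch of a simple-array clearable table (class and method names here are illustrative, not taken from the textbook). Note that clear visits every stored cell, so it takes time proportional to the number of elements, which is exactly the kind of varying cost that amortized analysis studies.

```python
class ClearableTable:
    """Clearable table backed by a fixed-size array (simple array implementation)."""

    def __init__(self, capacity):
        self._data = [None] * capacity  # fixed-size backing array
        self._n = 0                     # number of elements currently stored

    def add(self, value):
        # Append at the next free index; fails when the array is full.
        if self._n == len(self._data):
            raise IndexError("table is full")
        self._data[self._n] = value
        self._n += 1

    def get(self, index):
        # Elements are accessed by their index in the table.
        if not 0 <= index < self._n:
            raise IndexError("index out of range")
        return self._data[index]

    def __len__(self):
        return self._n

    def clear(self):
        # Erase every stored cell: time proportional to the current size.
        for i in range(self._n):
            self._data[i] = None
        self._n = 0
```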
Worst case analysis
A determination of the maximum amount of time that an algorithm requires to solve problems of size n. This is the approach most often used to characterize the running time of an algorithm.
Components of Generic Analysis Methodology
A language for describing algorithms, a computational model that algorithms execute within, a metric for measuring algorithm running time, and an approach for characterizing running times, including those for recursive algorithms.
Primitive operation
Low level instruction with an execution time that depends on the hardware and software environment but is nevertheless constant.
Exponential(Big O)
O(a^n)(a>1)
Logarithmic(Big O)
O(log n)
Linear(Big O)
O(n)
Quadratic(Big O)
O(n^2)
Polynomial(Big O)
O(n^k)(k>=1)
Independent(Probability)
Two events A & B are independent if Pr(A∩B)=Pr(A)*Pr(B)
Independent (Random Variables)
Two random variables X and Y are independent if Pr(X=x | Y=y) = Pr(X=x) for all real numbers x and y
Accounting Method(Amortization Technique)
Use a scheme of credits and debits to keep track of the running time of the different operations in the series.
Summation
an accumulation of terms; summations arise in data structure and algorithm analysis because the running times of loops naturally give rise to them
Counterexample(Justification Techniques)
an example that shows a claim or statement is false by exhibiting a case where it does not hold. The claims disproved this way are usually of a generic ("for all") form, so a single counterexample suffices.
DeMorgan's Law(Justification Techniques)
A method for rewriting the negation of a complex conditional: for example, the negation of the statement "p or q" is "not p and not q", and the negation of the statement "p and q" is "not p or not q"
Running time of an algorithm or data structure
A natural measure for the purposes of scalability, since time is a precious resource
Indicator Random Variable
A random variable that maps outcomes to the set {0,1}
Probability Space(Probability)
A sample space S together with a probability function, Pr, that maps subsets of S to real numbers in the interval [0,1]. It mathematically describes the probability of certain events occurring.
True or false: we don't need to prove our claims about an algorithm or data structure
False. We always need to prove our claims about the correctness of an algorithm or data structure.
asymptotic
refers to a curve that approaches a baseline ever more closely but never touches it; in algorithm analysis, it describes behavior as the input size grows without bound
Conditional Probability
the probability that one event happens given that another event is already known to have happened: Pr(A | B) = Pr(A∩B) / Pr(B), provided Pr(B) > 0
Induction(Justification Techniques)
the process of proving a general statement by establishing a base case and then showing that the statement's truth for smaller instances implies its truth for the next instance
Contrapositive(Justification Techniques)
the statement formed by negating both the hypothesis and conclusion of the converse of a conditional statement
Rules of Exponents
x^m * x^n = x^(m+n); x^m * y^m = (xy)^m; (x^m)^n = x^(mn); x^0 = 1; x^(-m) = 1/x^m; (x^m)/(x^n) = x^(m-n); (x^m)/(y^m) = (x/y)^m
Closed Form(Recurrence Equation)
A characterization of a recurrence equation that does not include references to the function T on the right-hand side
Accumulator pattern
A common programming pattern in which a final answer is built a piece at a time in a loop.
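A small illustration of the pattern (the function name is just for this example):

```python
def sum_of_squares(n):
    # Accumulator pattern: build the final answer a piece at a time in a loop.
    total = 0                 # the accumulator, initialized before the loop
    for i in range(1, n + 1):
        total += i * i        # fold one more piece into the running answer
    return total
```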
Big Omega Notation
An asymptotic way to say a function is "greater than or equal to" another function
Data Structure
A systematic way of organizing and accessing data
Big O Notation
A way of expressing the worst-case run-time or space usage of an algorithm, useful for comparing the speed of two algorithms.
Advantage of amortization
A way to do a robust average case analysis without using any probability
Clearable Table(Extendable Array Implementation)
Addresses the overflow issue of the simple-array implementation by instead backing the clearable table with an extendable array that grows when it fills.
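A Python sketch of the extendable-array idea using capacity doubling (a common growth strategy, assumed here rather than taken from the textbook). Doubling makes each add run in O(1) amortized time: a resize that copies n elements is paid for by the n cheap adds that preceded it.

```python
class ExtendableClearableTable:
    """Clearable table whose backing array doubles in size on overflow."""

    def __init__(self):
        self._data = [None]  # start with capacity 1
        self._n = 0          # number of elements currently stored

    def add(self, value):
        if self._n == len(self._data):
            self._resize(2 * len(self._data))  # double instead of failing
        self._data[self._n] = value
        self._n += 1

    def _resize(self, capacity):
        # Allocate a larger array and copy the existing elements over.
        new_data = [None] * capacity
        for i in range(self._n):
            new_data[i] = self._data[i]
        self._data = new_data

    def get(self, index):
        if not 0 <= index < self._n:
            raise IndexError("index out of range")
        return self._data[index]

    def __len__(self):
        return self._n

    def clear(self):
        for i in range(self._n):
            self._data[i] = None
        self._n = 0
```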
Generic Analysis on Algorithms
An analytic framework that takes into account all possible inputs, allows comparison of the efficiencies of two algorithms independently of the hardware and software environment, and can be performed by studying a high-level description of the algorithm without implementing it or running experiments on it.
Loop Invariant(Justification Techniques)
An assertion that expresses a relationship between variables that remains constant throughout all iterations of the loop.
Little-omega notation
An asymptotic way of saying one function is strictly greater than another function
Little-Oh Notation
An asymptotic way of saying one function is strictly less than another function
Big Theta Notation
An asymptotic way of saying two functions are asymptotically equal, up to a constant factor.
Pseudocode
An outline of the basic ideas behind how algorithms will work. Meant to be easy to read and understand.
Gauss Summation
Another common summation in data structure and algorithm analysis: s = 1+2+3+...+(n-2)+(n-1)+n = n(n+1)/2
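The closed form n(n+1)/2 can be checked directly against the term-by-term sum:

```python
def gauss_sum(n):
    # Closed form for 1 + 2 + ... + n, attributed to Gauss.
    return n * (n + 1) // 2
```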
Space usage of an algorithm or data structure
Another important measure for purposes of scalability
Asymptotic Notation
A language that allows us to analyze an algorithm's running time by identifying its behavior as the input size for the algorithm increases. Also known as the algorithm's growth rate.
Potential Function Method(Amortization Technique)
Based on an energy model, we associate with our structure a value Φ which represents the current energy state of our system.
Three Limitations of Experimental Analysis on Algorithms
Experiments can only be done on a limited set of test inputs; it is difficult to compare the efficiencies of two algorithms unless both were run in the same hardware and software environment; and an algorithm needs to be implemented and executed for an experimental analysis to take place.
Dynamic Programming
Ch. 12 in textbook
Maximum Subarray Problem
Common interview question. We are given an array of integers and asked to find the subarray whose elements have the largest sum.
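One standard linear-time solution is Kadane's algorithm (a common approach, not necessarily the one the chapter develops); a sketch:

```python
def max_subarray_sum(a):
    # Kadane's algorithm: track the best sum of a subarray ending at the
    # current index, and the best sum seen anywhere so far.
    best_ending_here = a[0]
    best_overall = a[0]
    for x in a[1:]:
        # Either extend the previous subarray or start fresh at x.
        best_ending_here = max(x, best_ending_here + x)
        best_overall = max(best_overall, best_ending_here)
    return best_overall
```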
Geometric Summation
Common summation in data structure and algorithm analysis. Each term is geometrically larger than the previous one if the ratio r > 1. s = a + ar + ar^2 + ar^3 + ... + ar^(n-1)
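Using the ratio r from the terms above, the sum has the well-known closed form a(r^n - 1)/(r - 1) for r ≠ 1, which a quick sketch can verify against the explicit sum:

```python
def geometric_sum(a, r, n):
    # Closed form for s = a + a*r + a*r**2 + ... + a*r**(n - 1), r != 1.
    return a * (r**n - 1) / (r - 1)
```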
Recurrence Equation
Definition of mathematical statements that the running time of a recursive algorithm must satisfy
Expected Value of discrete random variable X
E(X) = Σ_x ( x Pr(X=x) ), where the summation is over the range of X. The expected value is one of the most useful single-number summaries of a random variable.
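For a discrete random variable with finitely many outcomes, the definition translates directly into code (the dict-based representation here is an illustrative choice):

```python
def expected_value(distribution):
    # E(X) = sum over x of x * Pr(X = x), for a discrete distribution
    # given as a dict mapping each value x to its probability.
    return sum(x * p for x, p in distribution.items())
```

For example, a fair six-sided die has expected value (1+2+...+6)/6 = 3.5.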
Event(Probability)
Each subset A of sample space S.
Contradiction(Justification Techniques)
Establish that a statement q is true by first supposing that q is false and then showing that this assumption leads to a contradiction
Scalability
Refers to how well a system can adapt to increased demands
Chernoff Bounds
Section 19.5 in textbook
Experimental Analysis of an algorithm
Study of an algorithm's running time by executing it on various test inputs and recording the actual time spent in each execution
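A minimal sketch of one such measurement in Python, using the standard library's high-resolution clock (the function name is illustrative):

```python
import time

def measure_running_time(algorithm, test_input):
    # Run the algorithm on one test input and record the wall-clock
    # time actually spent, as in an experimental analysis.
    start = time.perf_counter()
    algorithm(test_input)
    return time.perf_counter() - start
```

In practice this would be repeated over many inputs of increasing size to observe the growth trend.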
Telescoping Sum
Summation in which all terms other than the first and the last cancel each other out
Prefix sums
Sums of first t integers in array A for t = 1, 2, 3, ... , n
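All n prefix sums can be computed in a single pass with a running total, rather than re-summing from scratch for each t:

```python
def prefix_sums(a):
    # sums[t - 1] holds A[0] + ... + A[t - 1]; one pass, running total.
    sums = []
    total = 0
    for x in a:
        total += x
        sums.append(total)
    return sums
```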
Amortized Running Time
The amortized running time of an operation within a series of operations is the worst case running time of the series of operations divided by the number of operations
Base case (Recursion)
The condition under which a recursive function returns without calling itself, thereby ending the recursive calls.
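The classic factorial function illustrates this: the n <= 1 branch is the base case that stops the chain of recursive calls.

```python
def factorial(n):
    if n <= 1:                    # base case: return without calling itself
        return 1
    return n * factorial(n - 1)   # recursive case: smaller subproblem
```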
Floor Function; Ceiling Function
The floor and ceiling functions are defined for all real numbers x by floor(x) = ⌊x⌋ = the largest integer less than or equal to x (the unique integer in the interval (x - 1, x]) and ceiling(x) = ⌈x⌉ = the smallest integer greater than or equal to x (the unique integer in the interval [x, x + 1))
Maximum suffix sum
The maximum, over j = 1, ..., t, of the suffix sums S_{j,t} (each the sum of the entries from index j through t).
Mutually Independent(Probability)
Events A_1, ..., A_n are mutually independent if the probability of the intersection of any subset of them equals the product of the probabilities of the events in that subset.
Array-maximum problem
The problem of finding the maximum element in an array A storing n integers.
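A single scan solves this with n - 1 comparisons, keeping the largest element seen so far (a minimal sketch; variable names are illustrative):

```python
def array_max(a):
    # Scan once, remembering the largest element encountered so far.
    current_max = a[0]
    for x in a[1:]:
        if x > current_max:
            current_max = x
    return current_max
```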
Recursion
The process of a method calling itself in order to solve a problem.
Sample Space (Probability)
The set of all possible outcomes of an experiment
Amortization
Tool used to understand running times of algorithms that have steps with widely varying performance. Instead of focusing on the worst case running time of each method in an algorithm or data structure, it considers interactions between methods by studying the running time of a series of these operations.
Random Variable(Probability)
Variables whose values depend upon the outcome of some experiment. Formally, a random variable is a function X that maps outcomes from the sample space S to real numbers.
Count of primitive operations in Algorithm
We count the number of primitive operations executed, and use this number t as a high level estimate of the running time of the algorithm.
What do we need if we want to analyze an algorithm that uses randomization or we wish to analyze the average case performance of an algorithm?
We need basic facts from probability theory
Justification Techniques
We use justification techniques to justify or prove our statements about a data structure or algorithm
How to order functions by their growth rates?
We use little-oh notation to see which algorithms are asymptotically better than other algorithms.
Probability function
a function, denoted by f(x), that provides the probability that x assumes a particular value for a discrete random variable
Algorithm
a step-by-step procedure for solving a problem in a finite amount of time
Average case analysis
evaluates the time complexity of an algorithm by determining the average running time of the algorithm on a random input
Rules of Logarithms
log_a(1) = 0; log_a(a) = 1; log(A x B) = log A + log B; log(A / B) = log A - log B; log(A^B) = B * log A; B^(log_c A) = A^(log_c B); log_b(A) = (log_c A) / (log_c b); log(1/A) = -log A