Numerical Analysis MT1
Equivalence of norms
||x||1 >= ||x||2 >= ||x||inf; we also have ||x||1 <= sqrt(n)*||x||2, ||x||2 <= sqrt(n)*||x||inf, and ||x||1 <= n*||x||inf. More generally, for any two vector norms ||.||m and ||.||M there exist constants c and C s.t. c*||x||m <= ||x||M <= C*||x||m for all x.
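A quick numpy check of these inequalities (the vector x here is an arbitrary example):
    import numpy as np
    x = np.array([3.0, -4.0, 1.0, 2.0])          # arbitrary example vector, n = 4
    n = x.size
    n1, n2, ninf = np.linalg.norm(x, 1), np.linalg.norm(x, 2), np.linalg.norm(x, np.inf)
    assert ninf <= n2 <= n1                      # ||x||inf <= ||x||2 <= ||x||1
    assert n1 <= np.sqrt(n) * n2 and n2 <= np.sqrt(n) * ninf and n1 <= n * ninf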
Matrix norm 2 norm
The 2-norm of a symmetric matrix is its maximum absolute eigenvalue.
Evaluating Function Condition Number
Absolute forward error: f(x + delta(x)) - f(x) ~= f'(x)*delta(x). Relative forward error: (f(x + delta(x)) - f(x))/f(x) ~= f'(x)*delta(x)/f(x). Thus condition number = |(f'(x)*delta(x)/f(x)) / (delta(x)/x)| = |x*f'(x)/f(x)|.
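A minimal sketch of evaluating |x*f'(x)/f(x)|; f = exp is just an illustrative choice (its condition number is |x|):
    import numpy as np
    def eval_cond(f, fprime, x):
        return abs(x * fprime(x) / f(x))         # |x f'(x) / f(x)|
    print(eval_cond(np.exp, np.exp, 20.0))       # 20.0: relative input error amplified ~20x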
An iteration scheme reduces the relative error by ½ with each iteration. What is the maximum number of iterations you would need if you are working in 64-bit IEEE floating point (i.e., IEEE double precision)?
Another scheme reduces the relative error by a factor of 10 with each iteration. What is the maximum number of iterations required? Another scheme starts with an initial relative error of 0.1, and its error is squared with each iteration. How many iterations are required to get to machine precision?
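One way to work these out (a sketch assuming an initial relative error of 1 for the first two schemes and emach = 2^-53 for IEEE double):
    import math
    eps = 2.0**-53                                    # unit roundoff, IEEE double, round to nearest
    print(math.ceil(math.log2(1 / eps)))              # halve the error each step: 53 iterations
    print(math.ceil(math.log10(1 / eps)))             # reduce by 10x each step: 16 iterations
    print(math.ceil(math.log2(math.log10(1 / eps))))  # error 0.1**(2**k) after k squarings: 4 iterations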
Floating point numbers
B = base, p = precision, [L,U] = exponent range. A number x = +/- (d0 + d1/B + d2/B^2 + ... + d(p-1)/B^(p-1)) * B^E, where 0 <= di <= B-1 for i = 0,...,p-1 and E is in [L,U]. Mantissa: d0 d1 ... d(p-1); fraction: d1 d2 ... d(p-1). B = 2 on most computers; IEEE fp systems are now almost universal in digital computers.
Sources of approximation
Before computation: modeling, empirical measurements, previous computations. During computation: truncation, discretization, rounding.
GS: Classical V Modified
Classical: v = aj; for k = 1,...,j-1 { v = v - qk*(qk^T*aj) }. Modified: v = aj; for k = 1,...,j-1 { v = v - qk*(qk^T*v) }. Lack of associativity in fp arithmetic drives the difference between the methods. Conceptually MGS repeatedly projects the current residual rj = aj - P(j-1)*aj, so each subtraction cleans up errors left by earlier ones; MGS is an example of "small corrections are preferred to large ones". CGS requires separate storage for A, Q, R, but MGS can overwrite A with Q.
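A minimal sketch of both variants in one routine (the Vandermonde test matrix is an arbitrary ill-conditioned example):
    import numpy as np

    def gram_schmidt(A, modified=True):
        m, n = A.shape
        Q, R = np.zeros((m, n)), np.zeros((n, n))
        for j in range(n):
            v = A[:, j].copy()
            for k in range(j):
                # CGS projects the original column aj; MGS projects the current residual v
                R[k, j] = Q[:, k] @ (v if modified else A[:, j])
                v -= R[k, j] * Q[:, k]
            R[j, j] = np.linalg.norm(v)
            Q[:, j] = v / R[j, j]
        return Q, R

    A = np.vander(np.linspace(0.0, 1.0, 50), 10)       # ill-conditioned example matrix
    for flag in (False, True):
        Q, _ = gram_schmidt(A, modified=flag)
        print(np.linalg.norm(Q.T @ Q - np.eye(10)))    # loss of orthogonality: CGS typically worse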
1D projection
Consider the subspace spanned by a vector a. The projection of a point b in R^2 onto span(a) is the point on the line y = alpha*a that is closest to b. To find the projection we minimize ||alpha*a - b|| in the 2-norm over alpha; setting the derivative to zero gives alpha = (a^T*b)/(a^T*a).
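A small numerical illustration (a and b are arbitrary example vectors):
    import numpy as np
    a = np.array([2.0, 1.0])           # spanning vector (example)
    b = np.array([1.0, 3.0])           # point to project (example)
    alpha = (a @ b) / (a @ a)          # minimizer of ||alpha*a - b|| in the 2-norm
    p = alpha * a                      # projection of b onto span(a)
    print(p, a @ (b - p))              # residual b - p is orthogonal to a (dot product ~ 0)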
Matrix norm infinity norm
Corresponds to the maximum absolute row sum. Matrix norms agree with the corresponding vector norms for an nx1 matrix.
Matrix norm 1-norm
Corresponds to the maximum absolute column sum.
Pivoting not required for these matrices
Diagonally dominant: the magnitude of each diagonal entry exceeds the sum of the magnitudes of the other entries in its column (or row). Symmetric positive definite: A = A^T and x^T*A*x > 0 for all x not equal to 0.
BLAS
Encapsulate basic operations on vectors and matrices so they can be optimized for a given computer architecture while the high-level routines that call them remain portable. Higher-level BLAS encapsulate matrix-vector and matrix-matrix operations for better memory-hierarchy utilization (cache and virtual memory with paging).
Complete Pivoting
Exhaustive strategy in which the largest entry in the entire remaining unreduced submatrix is permuted into the diagonal pivot position. Requires interchanging columns as well as rows, leading to PAQ = LU, with L unit lower triangular, U upper triangular, and P and Q permutations. Theoretically more stable numerically, but the pivot search is more expensive than for partial pivoting.
Condition Number
For any matrix A, cond(A) >= 1. cond(I) = 1. For any nonzero scalar c, cond(cA) = cond(A). For a nonsingular diagonal matrix D, cond(D) = max|di| / min|di|.
Induced/subordinate matrix norm
For any vector norm ||x||*, ||A||*= max||Ax||*/||x||* s.t x!=0 = max ||Ax||* s.t ||x||*=1
Vector Norm Properties
For any vector norm: ||x|| > 0 if x != 0; ||cx|| = |c|*||x|| for any scalar c; ||x+y|| <= ||x|| + ||y|| (triangle inequality); | ||x|| - ||y|| | <= ||x - y||.
Condition number
For a nonsingular square matrix, cond(A) = ||A||*||A^-1||; if A is singular, cond(A) = infinity. Also equal to (max ||Ax||/||x|| over x != 0) * (min ||Ax||/||x|| over x != 0)^-1, so the condition number measures the ratio of the maximum stretching to the maximum shrinking the matrix does to nonzero vectors. Large cond(A) means A is nearly singular.
Tridiagonal Matrices
Gaussian elimination without pivoting costs only O(n).
Matrix norm properties
||A|| > 0 if A != 0. Scalar: ||cA|| = |c|*||A||. Triangle: ||A+B|| <= ||A|| + ||B||. Submultiplicative: ||AB|| <= ||A||*||B||, and ||Ax|| <= ||A||*||x|| for any vector x.
Householder Transformation
H = I - 2*(v*v^T)/(v^T*v) for nonzero v. H is orthogonal and symmetric (H equals both its transpose and its inverse). Given a vector a, we want to choose v s.t. Ha = [alpha 0 ... 0]^T = alpha*[1 0 ... 0]^T = alpha*e1. Substituting into the formula for H, we can take v = a - alpha*e1 with alpha = -sign(a1)*||a||, the sign chosen to avoid cancellation. I - (v*v^T)/(v^T*v) is the projector onto span(vectors orthogonal to v); the Householder transformation reflects the vector through that subspace.
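A small sketch constructing H explicitly for an example vector a (in practice H is applied without forming it):
    import numpy as np
    a = np.array([3.0, 1.0, 2.0])                     # example vector to be mapped to alpha*e1
    alpha = -np.sign(a[0]) * np.linalg.norm(a)        # sign chosen to avoid cancellation
    v = a - alpha * np.array([1.0, 0.0, 0.0])         # v = a - alpha*e1
    H = np.eye(3) - 2.0 * np.outer(v, v) / (v @ v)
    print(H @ a)                                      # ~ [alpha, 0, 0]
    print(np.allclose(H, H.T), np.allclose(H @ H, np.eye(3)))   # symmetric and orthogonal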
Orthogonalization methods overview
Householder: very stable. Givens rotations: stable, and cheaper than Householder when many entries are already zero (though more work for dense matrices). Gram-Schmidt: better than the normal equations but not great. Modified Gram-Schmidt: better than "classical" GS.
Exceptional Values
IEEE fp special values: Inf (infinity), e.g. 1/0; NaN (not a number), e.g. 0/0, Inf/Inf, or 0*Inf.
Error bounds: relative change in solution x for relative changes in matrix A
If (A+E)x' = b, then ||delta(x)||/||x'|| <= cond(A) * ||E||/||A||. If the input data are accurate to machine precision, the bound becomes ||delta(x)||/||x'|| <= cond(A) * emach.
Cholesky Factorization
If A is symmetric and positive definite, then the LU factorization can be arranged so that U = L^T, which gives the Cholesky factorization A = L*L^T, where L is lower triangular with positive diagonal entries.
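A quick numpy illustration on an arbitrary SPD example:
    import numpy as np
    A = np.array([[4.0, 2.0], [2.0, 3.0]])    # symmetric positive definite example
    L = np.linalg.cholesky(A)                  # lower triangular, positive diagonal
    print(np.allclose(L @ L.T, A))             # A = L L^T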
Ax=b existence/uniqueness
If A is nonsingular, there is a unique solution. If A is singular and b is in span(A), there are infinitely many solutions. If A is singular and b is not in span(A), there is no solution.
Symmetric matrix
a_ij = a_ji for all i, j; i.e., A = A^T.
Orthonormal Bases
If the columns are orthogonal and normalized, then ai^T*aj = 1 if i = j and 0 if i != j (the Kronecker delta dij), so A^T*A = I.
Orthogonal bases
If the columns of A are orthogonal, i.e. ai^T*aj = 0 for j != i, then A^T*A is a diagonal matrix and the system is easily solved.
Subnormal and Gradual underflow
If leading digits are allowed to be zero, but only when the exponent is at its minimum, then the gap around zero left by normalization is filled in by additional subnormal (denormalized) fp numbers. They carry less precision and the unit roundoff is no smaller, but they extend the range of representable magnitudes. The augmented system exhibits gradual underflow.
Normal Equations Method
If the mxn matrix A has rank n, the symmetric nxn matrix A^T*A is positive definite, so its Cholesky factorization L*L^T can be used to obtain the solution x of the system of normal equations A^T*A*x = A^T*b, which has the same solution as the linear least squares problem Ax ~ b. Rectangular -> square -> triangular. In general, orthogonalization methods are more accurate.
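A sketch of this method (A and b are random example data; np.linalg.lstsq is used only as a reference check):
    import numpy as np
    A, b = np.random.rand(20, 3), np.random.rand(20)   # example overdetermined system
    L = np.linalg.cholesky(A.T @ A)                     # A^T A is SPD when rank(A) = n
    y = np.linalg.solve(L, A.T @ b)                     # solve L y = A^T b
    x = np.linalg.solve(L.T, y)                         # solve L^T x = y
    print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))   # matches the LLS solution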
FP Rounding Rules
If a real number is not a machine number, it must be rounded. Chop: truncate the base-B expansion of x after the (p-1)st digit; also called round toward zero. Round to nearest: fl(x) is the nearest floating-point number to x, using the fp number whose last stored digit is even in case of a tie (round to even).
Reduced QR
If we partition the mxm Q = [Q1 Q2], where Q1 is mxn, then A = Q*[R 0]^T = [Q1 Q2]*[R 0]^T = Q1*R. The columns of Q1 are an orthonormal basis for span(A); the columns of Q2 are an orthonormal basis for the orthogonal complement of span(A). Q1*Q1^T is the orthogonal projector onto span(A). The least squares solution is given by Rx = c1 = Q1^T*b.
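A sketch of solving least squares via the reduced QR factorization (random example data):
    import numpy as np
    A, b = np.random.rand(20, 3), np.random.rand(20)   # example overdetermined system
    Q1, R = np.linalg.qr(A, mode='reduced')            # Q1 is m x n with orthonormal columns
    x = np.linalg.solve(R, Q1.T @ b)                   # R x = Q1^T b
    print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))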
LU factorization costs
In general the cost is O(n^3); the dominant cost comes from the update step. The total cost is about 2n^3/3 operations.
Scaling linear systems
In principle, no effect. In practice, scaling affects both conditioning of matrix and selection of pivots, which affects numerical accuracy in fp arithmetic. Best if all entries of matrix have about same size. Scaling can introduce rounding errors.
Shortcomings of Normal Equations
Information can be lost in forming A^T*A and A^T*b due to fp arithmetic. Also cond(A^T*A) = cond(A)^2.
Givens Rotations
Introduce zeros one at a time. Given a vector [a1 a2]^T, choose scalars c and s so that [[c s], [-s c]] * [a1 a2]^T = [alpha 0]^T with alpha = sqrt(a1^2 + a2^2); c and s can be found directly as c = a1/alpha, s = a2/alpha (or by solving the equivalent 2x2 system by Gaussian elimination).
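A minimal sketch using the direct formulas c = a1/alpha, s = a2/alpha:
    import numpy as np
    a1, a2 = 3.0, 4.0                          # example components to rotate
    alpha = np.hypot(a1, a2)                   # sqrt(a1^2 + a2^2) without overflow
    c, s = a1 / alpha, a2 / alpha
    G = np.array([[c, s], [-s, c]])
    print(G @ np.array([a1, a2]))              # ~ [5, 0]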
Computing condition number
The inverse is nontrivial to compute, so cond(A) is estimated in practice. If Az = y, then ||z||/||y|| <= ||A^-1||. Efficient condition estimators pick y so that the ratio is large, approaching the worst case.
Uniqueness Of LU
LU is unique up to diagonal scaling of the factors. If we have two factorizations LU = L'U' = PA, then L'^-1*L = U'*U^-1 = D is a diagonal matrix. Uniqueness is made explicit in the LDU factorization PA = LDU, with L unit lower triangular, U unit upper triangular, and D diagonal.
Errors Bounds:For possible relative change in solution x due to relative change in right-hand side b.
Let x be the solution of Ax = b and x' be the solution of Ax' = b + delta(b). Then ||delta(x)||/||x|| <= cond(A) * ||delta(b)||/||b||.
LINPACK and LAPACK
LINPACK is a software package for solving a wide variety of systems of linear equations; LAPACK is its more recent replacement. Both are based on the lower-level Basic Linear Algebra Subprograms (BLAS).
Least squares existence/uniqueness
The linear least squares problem ALWAYS has a solution. The solution is unique if and only if the columns of A are linearly independent, i.e., rank(A) = n, where A is mxn. If rank(A) < n, then A is rank-deficient and the solution is not unique.
Triangular
Lower triangular: all entries above the main diagonal are zero. Upper triangular: all entries below the main diagonal are zero.
LU storage management
The elimination matrices M, their inverses (which form L), and the permutations P are not formed explicitly. U overwrites the upper triangle of A, the multipliers in L overwrite the strict lower triangle of A, and the unit diagonal of L need not be stored.
LU factorization with Partial Pivoting
MA = U with M = M(n-1)*P(n-1)*...*M1*P1, and L = M^-1 is not necessarily lower triangular. Alternatively PA = LU, where L is lower triangular.
Triangular Least Square QR
For an mxn matrix A with m > n, seek an mxm orthogonal Q s.t. A = Q*[R 0]^T. Then Q^T*A*x = [R 0]^T*x ~ [c1 c2]^T = Q^T*b.
Matrix norm
The matrix norm corresponding to a given vector norm is defined by ||A|| = max(||Ax||/||x||) s.t. x != 0. The norm measures the maximum stretching the matrix does to any vector in the given vector norm.
Orthogonal Projectors
A matrix P is an orthogonal projector if it is idempotent (P^2 = P) and symmetric (P^T = P). The orthogonal projector onto the orthogonal complement of span(P) is given by P_perp = I - P. For any vector v, v = P*v + P_perp*v. For the least squares problem, if rank(A) = n then P = A*(A^T*A)^-1*A^T.
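A quick numpy check of these properties (A is a random example with full column rank):
    import numpy as np
    A = np.random.rand(6, 3)                              # example with full column rank
    P = A @ np.linalg.inv(A.T @ A) @ A.T                  # orthogonal projector onto span(A)
    print(np.allclose(P @ P, P), np.allclose(P, P.T))     # idempotent and symmetric
    print(np.allclose(P @ A, A))                          # P leaves vectors in span(A) unchanged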
Pivot properties
The need for pivoting has nothing to do with singularity. Using a small pivot, and the correspondingly large multipliers, can cause loss of information in the transformed matrix, but this is corrected by permuting a larger entry into the pivot position. Keeping the pivots as large as possible limits roundoff as the matrix is reduced to upper triangular form.
Pseudo Inverse
A nonsquare mxn matrix A has no inverse; its pseudoinverse is defined by A+ = (A^T*A)^-1*A^T. cond(A) = ||A||*||A+||. If rank(A) < n, cond(A) = infinity. The least squares solution is x = A+*b.
Error bounds Caveats (condition number)
Norm-wise analysis bounds the relative error in the largest components of the solution; the relative error in smaller components can be much larger. The conditioning of the system is affected by the relative scaling of rows or columns.
Avoid normal equations
Normal equations: A^T*A*x = A^T*b. Instead use an orthogonal factorization of A: Ax = QRx ~ b, then solve Rx = Q^T*b. The columns of Q are orthonormal, R is upper triangular.
Singularity
An nxn matrix A is nonsingular if it has any of the following equivalent properties: the inverse of A exists; det(A) != 0; rank(A) = n; for any vector z != 0, Az != 0.
Linear Least Squares
Overdetermined linear system with an mxn matrix A, m > n, written Ax ~ b. Minimize the residual b - Ax. A is tall and thin.
Permutation Matrix
P has exactly one 1 in each row and column and zeros elsewhere; it is the identity with rows or columns permuted. P^T = P^-1.
Partial Pivoting Costs
Procedure: for each step k, pick the row k' with the largest-magnitude entry in column k on or below the diagonal, swap rows k and k', then proceed with the usual update step. The search at step k is O(n-k), about n^2/2 comparisons in total; the row swaps are also about n^2/2 in total. So the extra cost of partial pivoting is O(n^2) << 2n^3/3. For complete pivoting the pivot search is O((n-k)^2) per step, O(n^3) in total, while the row and column exchanges still cost only O(n^2).
Residual Vector
r = b - Ax'. In theory, if A is nonsingular, ||x - x'|| = 0 iff ||r|| = 0. A small relative residual implies a small relative error in the approximate solution only if A is well conditioned. If the computed solution x' exactly satisfies (A+E)x' = b, then ||r||/(||A||*||x'||) <= ||E||/||A||, so a large relative residual implies a large backward error in the matrix and the algorithm used to compute the solution is unstable. A stable algorithm yields a small relative residual regardless of conditioning. A small residual is easy to obtain, but does not imply the solution is accurate.
QR Factorization
Reduced QR and full QR. Note A = [Q1 Q2] * [R 0]^T = Q1*R. The conditions span(a1) = span(q1), ..., span(a1,a2,...,an) = span(q1,q2,...,qn) give rise to the equations a1 = q1*r11, a2 = q1*r12 + q2*r22, etc. Let Q(j-1) := [q1 q2 ... q(j-1)]; the projector onto span(q1,...,q(j-1)) is P(j-1) = Q(j-1)*Q(j-1)^T and its complement is P'(j-1) = I - P(j-1). Then for j = 2,...,n: vj = aj - P(j-1)*aj = P'(j-1)*aj and qj = vj/||vj||.
Sherman-Morrison Formula
Refactorization can be avoided even when the matrix changes, if the new matrix results from a rank-one change: (A - u*v^T)^-1, where u and v are n-vectors. Evaluation requires O(n^2) work instead of the O(n^3) required for refactorization. To solve (A - u*v^T)x = b with the new matrix: solve Az = u for z, solve Ay = b for y, then compute x = y + ((v^T*y)/(1 - v^T*z))*z.
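A sketch of the update formula (example data; in practice the solves with A would reuse an existing factorization):
    import numpy as np
    n = 5
    A = np.random.rand(n, n) + n * np.eye(n)          # example matrix, kept well conditioned
    u, v, b = np.random.rand(n), np.random.rand(n), np.random.rand(n)
    z = np.linalg.solve(A, u)                          # would reuse A's existing LU in practice
    y = np.linalg.solve(A, b)
    x = y + ((v @ y) / (1.0 - v @ z)) * z
    print(np.allclose((A - np.outer(u, v)) @ x, b))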
Diagonal Scaling
Row scaling: premultiplying both sides of a system by nonsingular diagonal matrix D, DAx=Db multiplies each row of matrix and right hand side by corresponding diagonal entry of D. Column scaling: post multiplying A by D multiplies each column of matrix by corresponding diagonal entry of D
QR w/ column pivoting
At each step, select the remaining column with the maximum Euclidean norm as the next to reduce. If rank(A) = k < n, then after k steps the norms of the remaining unreduced columns will be zero below row k. This yields an orthogonal factorization of the form Q^T*A*P = [[R, S], [0, 0]], where R is kxk, upper triangular, and nonsingular. A basic solution can then be computed by solving Rz = c1, where c1 = (Q^T*b)[1:k], and taking x = P*[z 0]^T. In practice the rank is unknown, so it is determined by monitoring the norms of the remaining unreduced columns and terminating the factorization when the maximum falls below a chosen tolerance.
Least square conditioning and sensitivity
Sensitivity of the least squares solution depends on both b and A. Define the angle theta between b and y = Ax by cos(theta) = ||y||/||b|| = ||Ax||/||b||. For a perturbation delta(b) of b: ||delta(x)||/||x|| <= cond(A)*(1/cos(theta))*||delta(b)||/||b||. For a perturbation E of A: ||delta(x)||/||x|| <= (cond(A)^2*tan(theta) + cond(A))*||E||/||A||. The conditioning of the least squares solution is about cond(A) if the residual is small, but can be arbitrarily worse for a large residual.
Condition Number Arithmetic Operations
Subtraction of two positive (or two negative) values of nearly the same magnitude is ill conditioned. Multiplication and division are benign. The culprit is the subtraction, more than division by a small number.
Computational Error
Sum of truncation and rounding error, but one of these usually dominates
Gram Matrix
Suppose Wb = z, where W is a sparse mxm matrix with m > 10^6. If A = (a1 a2 ... an), we can form the n vectors WA = (Wa1 Wa2 ... Wan) and the Gram matrix W' = A^T*W*A, then solve the much smaller nxn system W'x = A^T*z.
Impact of Finite Precision on Approximation and Iteration
The presence of round-off error implies there is no point in approximating a function to a relative error smaller than emach. Similarly, with iterative methods there is no point in driving the relative residual below emach.
Givens QR factorization
To annihilate a selected component of a vector in n dimensions, rotate the target component against another component. Straightforward Givens QR takes about 50% more work than Householder and more storage. It is advantageous when many entries of the matrix are already zero. Example: upper Hessenberg, a_ij = 0 if i > j+1; the Givens row operation is applied only O(n) times instead of O(n^2) times, so the work is O(n^2).
Normal equation
To minimize the squared Euclidean norm of the residual vector, ||r||^2 = r^T*r = (b - Ax)^T*(b - Ax) = b^T*b - 2x^T*A^T*b + x^T*A^T*A*x, take the derivative with respect to x and set it to 0: 2A^T*A*x - 2A^T*b = 0, which reduces to the nxn linear system of normal equations A^T*A*x = A^T*b.
Triangular Least Squares
An upper triangular overdetermined (m > n) least squares problem has the form [R 0]^T * x ~ [b1 b2]^T, with residual ||r||^2 = ||b1 - Rx||^2 + ||b2||^2. We have no control over b2, but we can zero the first term if x satisfies Rx = b1. The resulting x is the least squares solution, and the minimum sum of squares is ||r||^2 = ||b2||^2.
Householder QR Factorization
Use Householder transformations to annihilate the subdiagonal entries of each successive column. Each H is applied to the entire remaining matrix but does not affect prior columns. With Q^T = Hn*...*H1 (equivalently Q = H1*...*Hn) we get A = Q*[R 0]^T. The right-hand side is also multiplied by the Householder transformations, and we then solve Rx = c1, the first n entries of Q^T*b; the product Q need not be formed explicitly. R is stored in the upper triangle of the array that initially contained A, and the Householder vectors v are stored in its lower triangular portion. Work is O(n^3).
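A compact sketch of the factorization loop (illustrative only; R is validated via A^T*A = R^T*R since Q is not formed):
    import numpy as np

    def householder_qr(A):
        A = A.astype(float)                          # work on a copy; R accumulates in its upper triangle
        m, n = A.shape
        vs = []                                      # Householder vectors (the lower-triangle storage)
        for k in range(n):
            a = A[k:, k]
            sign = np.sign(a[0]) if a[0] != 0 else 1.0
            alpha = -sign * np.linalg.norm(a)        # sign chosen to avoid cancellation
            v = a.copy()
            v[0] -= alpha                            # v = a - alpha*e1
            if v @ v > 0:
                A[k:, k:] -= 2.0 * np.outer(v, v @ A[k:, k:]) / (v @ v)   # apply H_k to trailing block
            vs.append(v)
        return np.triu(A[:n, :]), vs

    A = np.random.rand(6, 4)                         # example tall matrix
    R, _ = householder_qr(A)
    print(np.allclose(A.T @ A, R.T @ R))             # A = Q[R;0] with Q orthogonal implies A^T A = R^T R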
Orthogonality
Vectors v1 and v2 are orthogonal if their inner product is zero. The space spanned by the columns of an mxn matrix A, {Ax : x in R^n}, has dimension at most n. If m > n, b generally does NOT lie in span(A), so there is no exact solution. The vector y = Ax in span(A) closest to b in the 2-norm occurs when the residual is orthogonal to span(A), again giving the system of normal equations A^T*A*x = A^T*b.
Projection in higher dimensions
We have basis coefficients xi written as the vector x = [x1 ... xn]. Minimize the square of the 2-norm of the residual ||Ax - b||. The projection is y = Ax = A*(A^T*A)^-1*A^T*b; it is the projection of b onto range(A).
Weighted least squares
Weighted least squares solution: x = (A^T*W*A)^-1*A^T*W*b.
Insensitive
Well-conditioned: a relative change in the input causes a similar relative change in the solution.
Partial pivoting
When performing Gaussian elimination, at stage k we pick the largest-magnitude entry on or below the diagonal in column k and interchange its row with row k (this in particular avoids a zero pivot). If there is no nonzero entry on or below the diagonal, there is nothing to eliminate in that column, so move on to the next column.
What does it mean for x=Vy?
x lies in the column space of V; x lies in the range of V; x is in span(V); x is a linear combination of the columns of V. These are all equivalent.
Relative Error
absolute error/true value
machine precision
The accuracy of an fp system is characterized by the unit roundoff (machine precision, machine epsilon). With chopping, emach = B^(1-p); with round to nearest, emach = (1/2)*B^(1-p). An alternative definition is the smallest number e s.t. fl(1+e) > 1. The maximum relative error in representing a real number x within the range of the fp system is |(fl(x) - x)/x| <= emach, which can be written fl(x) = x(1+e) with |e| <= emach. For IEEE single precision emach = 2^-24; for double precision emach = 2^-53 (round to nearest).
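A sketch of the alternative "smallest e with fl(1+e) > 1" characterization, searching over powers of 2:
    import numpy as np
    e = 1.0
    while 1.0 + e / 2.0 > 1.0:
        e /= 2.0
    print(e, np.finfo(float).eps)   # both 2^-52, the spacing of doubles near 1; round-to-nearest emach is half this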
FP arithmetic
Addition/subtraction: shifting of the mantissa to make exponents match may cause loss of some digits of the smaller number. Multiplication: the product of two p-digit mantissas contains up to 2p digits, so may not be representable. Division: the quotient of two p-digit mantissas may contain more than p digits. Overflow is more serious than underflow: there is no good approximation to arbitrarily large magnitudes, but 0 is a good approximation to arbitrarily small magnitudes. Fp addition and multiplication are commutative but not associative.
Band Matrix
a_ij = 0 for all |i-j| > B, where B is the bandwidth of A. The matrix is stored as an array of diagonals to avoid storing zero entries. If pivoting is required for numerical stability, the bandwidth can grow (by no more than 2x). For a fixed small bandwidth, a band solver can be extremely simple. In general it requires O(Bn) storage and its factorization requires O(B^2*n) work; ideal if B << n, giving significant savings in storage and work. The LU factors preserve the nonzero structure of A (unless there is pivoting); storage costs about 2nB, factorization about nB^2. DO NOT invert A, L, or U for banded systems (the inverses are generally dense).
stability
An algorithm is stable if the result produced is relatively insensitive to perturbations during the computation. Stability is analogous to the conditioning of problems. From backward error analysis: an algorithm is stable if the result produced is the exact solution to a nearby problem. For a stable algorithm, the effect of computational error is no worse than the effect of a small data error in the input.
Absolute Error
approximate value - true value
FP advantage
By sacrificing a few bits to the exponent, fp allows us to represent a huge range of numbers. All numbers have roughly the same relative precision. The numbers are not uniformly spaced.
Accuracy
closeness of computed solution to true solution of problem. Stability does not imply accuracy. Accuracy depends on conditioning of problem and stability. Applying stable algorithm to well conditioned problem yields accurate solution
Condition number
cond = |relative change in solution| / |relative change in input data| = |(f(x') - f(x))/f(x)| / |(x' - x)/x| = |delta(y)/y| / |delta(x)/x|. The problem is sensitive if cond >> 1.
amplification factor
The condition number is the amplification factor relating the relative forward error to the relative backward error, but since it is not exact: |rel. forward error| <= cond x |rel. backward error|.
Scientific Computing
Design and analysis of algorithms for numerically solving mathematical problems; deals with continuous quantities; considers the effect of approximations.
Rounding Error
difference between result produced by given algorithm using exact arithmetic and result produced by same algorithm using limited precision arithmetic.
Truncation Error
difference between true result (for actual input) and result produced by given algorithm using exact arithmetic. Due to approximations such as truncating infinite series or terminating iterative sequence before convergence.
finite difference approximation
f'(x) ~ (f(x+h) - f(x))/h exhibits a tradeoff between rounding error and truncation error. Truncation error is bounded by Mh/2, where M bounds |f''(t)| for t near x. Rounding error is bounded by 2e/h, where the error in the function values is bounded by e. Total error is minimized when h = 2*sqrt(e/M). Smaller h increases rounding error; larger h increases truncation error.
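A quick illustration of the tradeoff with f = sin at x = 1 (arbitrary example), where the true derivative is cos(1):
    import numpy as np
    x = 1.0
    for h in [1e-2, 1e-4, 1e-6, 1e-8, 1e-10, 1e-12]:
        fd = (np.sin(x + h) - np.sin(x)) / h
        print(h, abs(fd - np.cos(x)))   # error smallest near h ~ sqrt(emach) ~ 1e-8, grows on either side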
Floating Point system Properiteis
An fp number system is finite and discrete. The total number of normalized fp numbers is 2(B-1)B^(p-1)(U-L+1) + 1. UFL (underflow level) = B^L. OFL (overflow level) = B^(U+1)(1 - B^-p) ~ B^(U+1).
Normalization
An fp system is normalized if the leading digit d0 is always nonzero unless the number itself is 0. In a normalized system, the mantissa m of a nonzero floating-point number always satisfies 1 <= m < B. Reasons to normalize: unique representation, no digits wasted on leading zeros, and in a binary system the leading bit need not be stored.
Taylor series truncation error
If f^(k) exists (and is bounded) on [x, x+h] for k = 0,...,m, then there exists a point epsilon in the interval s.t. f(x+h) = f(x) + h*f'(x) + (h^2/2)*f''(x) + ... + (h^m/m!)*f^(m)(epsilon). Take m = 2; computable = desired + truncation: (f(x+h) - f(x))/h = f'(x) + (h/2)*f''(epsilon), so the truncation error ~ (h/2)*f''(x) as h -> 0.
sensitive
ill conditioned if relative change in solution can be much larger than that in input data
machine numbers
real numbers exactly representable
Fp Cancellation
Subtraction between two p-digit numbers having the same sign and similar magnitudes yields a result with fewer than p significant digits, so it is exactly representable: the leading digits of the two numbers cancel. This is a serious loss of information. The operands are often uncertain due to rounding or other previous errors, so the relative uncertainty in the difference may be large. Subtraction itself is not at fault: it merely signals a loss of information that had already occurred, e.g. (1+e) - (1-e) evaluates as 1 - 1 = 0 when e is below emach, but should be 2e. Of the basic operations +, -, *, /, subtraction is the only one with a terrible condition number; the rest are OK.
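A quick double-precision illustration of the (1+e) - (1-e) example:
    e = 1e-17                      # smaller than the double-precision unit roundoff
    print((1.0 + e) - (1.0 - e))   # prints 0.0, but the true value is 2e
    print(2 * e)                   # 2e-17, what exact arithmetic would give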
Forward and Backward Error
suppose we want y=f(x) where f:R->R but we obtain approx value y'. forward error=y'-y, backward error=x'-x where f(x')=y'
emach vs UFL
The unit roundoff emach is determined by the number of digits in the mantissa of the fp system, whereas UFL is determined by the number of digits in the exponent field. 0 < UFL < emach < OFL, i.e. 0 < B^L < emach < B^U.
Well posed
Well posed if a solution exists, is unique, and depends continuously on the problem data; ill-posed otherwise. A problem can be well posed AND sensitive. The computational algorithm should not affect the sensitivity.
Vector norms
||x|| measures the magnitude of a vector x and the closeness of approximate solutions. p-norm: ||x||p = (sum over i=1..n of |xi|^p)^(1/p). 1-norm: sum of absolute values. 2-norm: square root of the sum of squares. Infinity-norm: max of absolute values.