ADM II - Section 1 and 2

Name 4 ways an equality constraint can be modeled

1. = sign 2. using two opposing inequality constraints 3. (if possible) by eliminating dependence among design variables 4. using Lagrange Multipliers

Major categories in the complexity of optimization (5)

1. No of dimensions 2. Constraints (quantity) 3. Linearity vs. Non Linearity 4. Local vs. Global 5. Discrete vs. Continuous Variables

What is the Hessian?

A matrix of partial second order derivatives

How does Dr. Volovoi recommend solving conceptual problems/proofs

Assume something is true or not true and prove it otherwise

What is the gradient?

A vector of partial first order derivatives of an equation

If the function is linear where will your optimum be?

At the edge (boundary) of the feasible region, at a vertex, so you can jump from one vertex to another without worrying about any interior points

The advantages of a diagonal Hessian matrix are the motivation for?

Conjugate Direction Method

What is unique about conjugate directions?

Conjugate directions are orthogonal with respect to H (p_i^T*H*p_j = 0), and orthogonality does not imply uniqueness, so a set of conjugate directions is not unique. You are free to choose the first direction when you start; the remaining directions are then constructed to be conjugate to it.

What is the simplest line search method

Coordinate (Univariate Search)

When no derivatives are known, what search methods can be used?

Direct Search Methods: Compass Search, CPS, Powell, Bi-Section, Golden Section, Coordinate, Finite Difference

What are the advantages of the exterior penalty method?

Easy to incorporate, squaring violated constraints ensures continuous space, does not require initial design to be feasible, good for soft constraints

What is the sufficient condition for a strict minimum

Gradient = 0 and the Hessian is positive definite (all eigenvalues > 0)

What is the necessary condition for a local minimum?

Gradient = 0.

Do you converge in terms of f(x) or x?

Either; it depends on what you want to do. Is there an independent variable you are trying to find, or an overall objective function value that you want to meet or minimize? In design, most of the time we are looking for a variable value.

What is the Trust Region?

A neighborhood (the "trust region", usually a ball) of x^k within which a local approximation m(x) of the function f(x) is evaluated and trusted. m(x) is the approximating function, usually a quadratic. (I'm in a certain area: select a small multi-dimensional ball and solve the problem locally, then iterate from the new point until you converge to the minimum.)

In unconstrained optimization what are the non-iterative methods?

Grid Search and Random Search

What are the fundamentals of line search?

Find the best point in one direction before changing directions. You are finding the direction (where to go) and the minimum along that direction (how far to go) before changing directions.

What is the definition of a convex function?

The objective and constraints are convex if f_i(a*x + (1-a)*y) <= a*f_i(x) + (1-a)*f_i(y) for 0 < a < 1, where f_i is the objective or the i-th constraint function, x and y are vectors, and a is a scalar.
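
The inequality can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the helper name is_convex_on_samples, the one-variable test functions, and the sampling range are all hypothetical, and passing the check is evidence of convexity, not a proof.

```python
import random

def is_convex_on_samples(f, lo, hi, trials=1000, seed=0, tol=1e-9):
    """Test f(a*x + (1-a)*y) <= a*f(x) + (1-a)*f(y) on random samples in [lo, hi]."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        a = rng.random()
        if f(a * x + (1 - a) * y) > a * f(x) + (1 - a) * f(y) + tol:
            return False  # found a counterexample, so f is not convex
    return True  # no violation found on these samples (not a proof)

convex_result = is_convex_on_samples(lambda x: x * x, -5.0, 5.0)
nonconvex_result = is_convex_on_samples(lambda x: -x * x, -5.0, 5.0)
```

Here x^2 passes every sample, while -x^2 fails as soon as two distinct points are drawn.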

What is the greatest benefit of convexity?

If you find the local optimum it is always the global optimum, if the problem is convex

What is the benefit of localization in modeling?

If you get close enough, any smooth non-linear function will look like a quadratic. You want to model a function as simply as possible, but not simpler than that (constant, linear, quadratic).

What is the benefit of a quadratic function?

In optimization we really like to approximate everything as a quadratic b/c a quadratic has a nice ability to find a minimum. Derivative of a quadratic is linear and will cross zero. But a quadratic is not going to fit the entire obj func well so you create local quadratic models

What does KKT stand for

Karush-Kuhn-Tucker

What are the advantages of the interior penalty method?

Keeps design feasible, but space continuous, can be used with first order optimization techniques

Describe how the exterior penalty method works

Most of the time your solutions converge from outside the feasible space. Start with a small r and then gradually increase it as you get closer to the solution b/c you want to get into the feasible space but also minimize f(x) so you will be close to where the constraints meet. You only get into the feasible space if the "r" goes to infinity and that is the disadvantage of this method b/c if optimization is stopped early, no feasible design is found.
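
A minimal sketch of this behavior, assuming a hypothetical one-variable problem (minimize x^2 subject to x >= 1) and a simple ternary search as the unconstrained minimizer. Neither choice comes from the course notes.

```python
def minimize_1d(f, lo, hi, iters=200):
    """Ternary search for the minimum of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def exterior_penalty_path(r_values):
    f = lambda x: x * x
    g = lambda x: 1.0 - x  # constraint written as g(x) <= 0, i.e. x >= 1
    path = []
    for r in r_values:
        # Only violated constraints are penalized, and the violation is squared.
        phi = lambda x, r=r: f(x) + r * max(0.0, g(x)) ** 2
        path.append(minimize_1d(phi, -5.0, 5.0))
    return path

path = exterior_penalty_path([1, 10, 100, 1000])
```

For this problem the penalized minimizer is x = r/(1 + r), so the iterates move toward x = 1 from the infeasible side and never actually reach it for finite r.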

Define Design Space

N-dimensional space within which all candidate designs will lie (can be bounded or unbounded)

When the 2nd derivative is known, what search methods can be used?

Newton's method. It is computationally more intensive, but locally, assuming that your function is quadratic, you get the minimum right away. 2nd derivative = need to know the Hessian.
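
To illustrate the "minimum right away" claim, here is a sketch of a single Newton step x - H^(-1)*g on a hypothetical two-variable quadratic, f(x, y) = 2x^2 + x*y + y^2 - 4x - 3y. The function, the starting point, and the helper name newton_step are assumptions chosen for the example.

```python
def newton_step(grad, hess, x):
    """One Newton step x - H^{-1} g for a two-variable problem."""
    g1, g2 = grad(x)
    (a, b), (c, d) = hess(x)
    det = a * d - b * c
    # Apply the closed-form inverse of the 2x2 Hessian to the gradient.
    s1 = (d * g1 - b * g2) / det
    s2 = (-c * g1 + a * g2) / det
    return (x[0] - s1, x[1] - s2)

# Gradient and Hessian of f(x, y) = 2x^2 + x*y + y^2 - 4x - 3y.
grad = lambda p: (4 * p[0] + p[1] - 4, p[0] + 2 * p[1] - 3)
hess = lambda p: ((4.0, 1.0), (1.0, 2.0))

x_new = newton_step(grad, hess, (10.0, -7.0))
```

Because the function is exactly quadratic, one step from any starting point lands on the minimizer (5/7, 8/7).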

Is a convex problem always linear

No

Using CPS, does it matter which direction is chosen next?

No because all directions are evaluated at each point

What is key about a diagonal matrix?

No cross coupling, no interactions, so what happens in one direction is independent of what happens in another direction

Can you use the interior penalty method if you have an equality constraint?

No, b/c it stays on the inside of the constraint line. You must use the exterior method if you have an equality constraint.

What is Satisficing?

Not always looking for the optimum solution, looking for a solution that is good enough

Basic idea of Simplex Algorithm (what conditions do we need for the constraints?)

Only nonnegative variables are allowed (add a constant, or introduce two variables in place of each variable that can be negative). Only equality constraints are allowed (use slack and surplus variables).

What does heuristics pertain to and when is it used?

Pertaining to trial and error method of problem solving. Used when an algorithm approach is impractical. Heuristic Search is another term for Stochastic Search

If you don't know the gradient and hessian what conjugate direction method do you use?

Powell Method

What penalty function method would you use with 2nd order optimization techniques?

Quadratic Extended Penalty Function

If you know the gradient and the hessian what conjugate direction method do you use?

Quasi-Newton Methods (Variable Metric)

Define "Side" Constraints

Range limits of design variables

What is the variable penalty function used for?

Used to improve numerical conditioning and avoid extremely large penalty values associated with large constraint violations, in contrast to the quadratic penalty function, which continues to grow. The variable penalty function uses a cubic function, which attempts to overcome machine overflow.

Is a linear problem always convex?

Yes

Explain Powell's Method

You don't know the derivatives so you use line search and approximate the hessian. You have to do more iterations so if you have 2 variables it will take 6 steps to find the optimum, versus 2 steps

Write the eqn2 for lagrange multiplier

You have a Lagrange Multiplier for every constraint

Define iteration

a movement from one point to a new point

An approximate function is usually

a quadratic one

CPS is primitive in what two ways?

directions are coordinates not conjugates, and step size is randomly chosen and then just halved

Define Coordinate Pattern Search (CPS) method

Given a step size r, explore all coordinate axis directions, exhausting all combinations. Halve the step if no improvements are found.

Using CPS, if we halve the step, at the next iteration do we resume with the half step or with a full one?

half

Define Noise Parameters

inputs that cannot be controlled by the user, but whose impact can be quantified

An active constraint means that the solution...

is on the constraint line

If a constraint is inactive, how does its perturbation affect the solution?

it does not

Is it better to have local or global lower degree polynomials?

local

If f'(x)=0 and the first non-zero term is odd power do you have a local minimum or maximum?

neither

Orthogonal implies unique or non-unique

non-unique

Is Steepest Descent a conjugate gradient method?

not by itself. have to correct to get orthogonality and conjugate direction

if f '(x0) does not equal zero then the first term is dominating, and by selecting (x-x0) to be either positive or negative what can one assure?

one can always assure that in the arbitrary small neighborhood of x0 we can find points with the value of f(x0)-f(x) both positive and negative, assuring that there is no local max or min at that point. (If the first term is linear you won't have an optimum local min or max)

What is the eqn for conjugate directions (vectors pi) wrt positive definite matrix H?

p_i^T*H*p_j=0, for all i not equal to j

Write the classical form eqn of the sequential unconstrained minimization techniques?

phi(X, r_p) = F(X) + r_p*P(X)

Why are iterative methods more efficient than non-iterative methods?

they take advantage of previous knowledge about the function values

When the 1st derivative is known but not the 2nd, what search methods can be used?

Steepest Descent. 1st derivative = need to know the gradient.

Why is the inverse of the hessian approximated in the Quasi-Newton Method?

The actual Hessian is expensive to compute and invert, but the Hessian or its inverse is needed to find the minimum

If a design is feasible, what does that mean in regards to constraints?

The design satisfies all the constraints

What does it mean if the lagrange multiplier is zero?

The function does not care about the constraint. Do this when you are on the interior or so far away from the constraint that you can deactivate it and make the problems simpler

The gradient is in what direction relative to the function

The gradient is in the direction of the function improvement (function increasing)

Explain Conjugate Gradient Method

The Hessian is unknown, but the gradient is known and we have line search. Starting from steepest descent (moving opposite the gradient), we minimize by correcting each descent direction to make it conjugate to the previous direction.

Which terms are most important in a Taylor Series?

The left most non-zero term. It determines the sign of the whole expression. This is accomplished by taking x close enough to x0 you can always assure that the left most non-zero term is dominating.

In 2 dimensions, if the gradient is zero what does the graph look like?

There is a stationary point (the tangent plane is flat): it may be a minimum, a maximum, or a saddle point

Why are conjugate directions good?

They are the gold mine for multi-dimensional optimization b/c it lets you find the "right" coefficient in each direction ensuring one line search for each direction so it is the most efficient way to reach the optimum. 2 variables = 2 steps, 3 variables = 3 steps. Each direction you know how far to go from a_k = sigma_k

Explain the Quasi-Newton Method

This method approximates the inverse of the hessian using the secant property. It is just a little more efficient/exact but requires more memory storage

Line Search has 2 parts what are they?

(1) A direction is selected in accordance with some criteria. (2) A one-dimensional problem is approximately solved along the chosen direction: min over a of f(x^k + a*p^k)

How do you get a maximum from the minimum?

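Minimize the negative of the function: max f(x) = -min(-f(x)). (Minimizing 1/f also works, but only when f is strictly positive.)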

If you know the gradient but not the hessian what conjugate direction method do you use?

Conjugate Gradient Method

Explain the difference between a constraint and objective function

Constraint: you can meet all the constraints, and lie anywhere in the feasible space Obj Func: you want to meet all the constraints and get closer to some desired value

Define Bi-Section search method

The derivative is not needed; the function is assumed continuous. Start with an interval [l, u] on which f changes sign: f(l) < 0 and f(u) > 0. Halve the interval repeatedly, always keeping the half whose endpoints have opposite signs. With y = (l + u)/2: if f(y) > 0 then u(k+1) = y and l(k+1) = l; else l(k+1) = y and u(k+1) = u.
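
The update rule can be sketched as follows. The function name bisection, the tolerance, and the test function x^3 - 2 are assumptions made for illustration. For optimization, the same loop would be applied to a quantity that changes sign at the optimum, such as a slope estimate.

```python
def bisection(f, lo, hi, tol=1e-10):
    """Find a zero of a continuous f on [lo, hi], assuming f(lo) < 0 < f(hi)."""
    if not (f(lo) < 0 < f(hi)):
        raise ValueError("need f(lo) < 0 < f(hi)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid  # the sign change is in the lower half
        else:
            lo = mid  # the sign change is in the upper half
    return (lo + hi) / 2

root = bisection(lambda x: x ** 3 - 2, 0.0, 2.0)
```

Each pass halves the bracket, so about 35 iterations take the interval [0, 2] down to the 1e-10 tolerance.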

Define Golden Section search method

The derivative is not needed; the function is assumed continuous. Similar to bi-section, but it needs three points (lower, mid, upper). Each iteration cuts about 38% off the range (the interior points sit about 38% and 62% of the way across it), compared to bi-section, which cuts 50% each time.
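
A minimal sketch of golden section search for a unimodal function of one variable. The function name, tolerance, and test function are assumptions for illustration. The interval keeps about 61.8% of its length at each iteration, and only one new function call is needed per iteration because one interior point is reused.

```python
import math

def golden_section(f, lo, hi, tol=1e-8):
    """Minimize a unimodal f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    c = hi - inv_phi * (hi - lo)      # lower interior point
    d = lo + inv_phi * (hi - lo)      # upper interior point
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

x_min = golden_section(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```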

Why is it better to have lower degree polynomials than higher degree polynomials?

Don't over fit. It is better to have many local splines than to have higher order polynomials which normally results in over fitting.

Name three types of penalty methods

Exterior, Interior and Extended Penalty Method

What are the key differences between the bi-section and golden search methods?

GS is less efficient per iteration than the bi-section method (the range shrinks by about 38% vs. 50%). GS finds a minimum, but it may not be the global minimum.

What is the difference between grid search and random search?

GS: evaluates every combination of design variables within a given range and resolution (effectively an exhaustive search over a discretized design space) RS: evaluates large number of randomly selected points
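
The two non-iterative strategies can be sketched side by side. The objective, bounds, grid resolution, and sample count below are hypothetical values chosen only to make the example concrete.

```python
import itertools
import random

def grid_search(f, bounds, points_per_axis):
    """Evaluate f at every combination of evenly spaced values per variable."""
    axes = [
        [lo + (hi - lo) * k / (points_per_axis - 1) for k in range(points_per_axis)]
        for lo, hi in bounds
    ]
    return min(itertools.product(*axes), key=f)

def random_search(f, bounds, samples, seed=0):
    """Evaluate f at uniformly random points inside the bounds."""
    rng = random.Random(seed)
    candidates = (
        tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(samples)
    )
    return min(candidates, key=f)

f = lambda p: (p[0] - 1.0) ** 2 + (p[1] + 0.5) ** 2
bounds = [(-2.0, 2.0), (-2.0, 2.0)]
best_grid = grid_search(f, bounds, 41)       # 41 * 41 = 1681 evaluations
best_random = random_search(f, bounds, 2000)  # 2000 evaluations
```

Neither method uses earlier function values to decide where to evaluate next, which is what separates them from the iterative methods.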

Explain the idea of penalty functions

It augments the objective function, f(x). If the constraints are not being met, then put a penalty on the objective function. f(x) + r*p(x): f(x) is the objective function, p(x) is the penalty function, and r gets changed as you iterate. When r is small, f(x) dominates and the penalty won't be enough to meet the constraints and get into the feasible space. If r is large, p(x) dominates and you won't be minimizing the objective function. You have to balance them.

What does lagrange multiplier really mean physically?

It corresponds to the sensitivity of the objective to a perturbation of a constraint. Assuming all other constraints are kept the same, if you change one particular constraint by an amount "x", then the objective function will change by about "x*lambda" (e.g. if a 50 lb allocation is decreased to 45 lb, how does that affect the overall objective?)

What are the KKT necessary conditions for a minimum?

L(x, lam) = f(x) + sum_i(lam_i*g_i(x)) + sum_j(lam_j*h_j(x)). Conditions: (1) the vector x must be feasible; (2) lam_i*g_i(x) = 0 for every inequality constraint; (3) grad L(x, lam) = grad f(x) + sum_i(lam_i*grad g_i(x)) + sum_j(lam_j*grad h_j(x)) = 0. If x is in the interior of an inequality constraint, lam_i = 0; on the constraint surface, g_i(x) = 0.

What is the difference between the three extended penalty methods?

LEIP: the resulting augmented objective has continuous first, but not second, derivatives. QEP: provides continuous second derivatives by changing the outer portion of the penalty function (more non-linearity). VP: a further modification of the extended penalty function.

What is the eqn for finding eigen values?

The matrix must be square. det(A - lambda*I) = 0, where I is the identity matrix; the eigenvectors x then satisfy (A - lambda*I)*x = 0.

Define Design Variables (x)

N variables that can be changed by optimizer; considered to be independent (e.g. engine thrust, rudder surface area,...)

Where must the solution lie for an equality constraint? for an inequality constraint?

Equality constraint: on the line. Inequality constraint: to one side of the line (or on it).

Developing strategies for multiple-steps decision making is a subject of?

Operational Research (OR) and Artificial Intelligence (AI) fields (e.g. Markov decision processes, dynamic programming etc)

"Analysis" itself usually implies

Optimization

What are the fundamental differences between the conjugate gradient method and quasi-newton method?

QNM keeps an approximate inverse of the Hessian, which means you have to store that matrix, so if you have 1000 variables you have to store 1 million entries (about 0.5 million b/c H is symmetric). CGM only requires you to store the current and previous gradient, so at any step you only have two vectors stored as opposed to an entire matrix. CGM is more economical in terms of computer memory and efficiency.

What are the three methods for rates of convergence?

Quotient Linear or Q-Linear (Steepest Descent) - Good, Q-Super Linear (typically Quasi-Newton) - Better, Q-Quadratic (Newton) - Best

How many function evaluations are needed for n dimensions for right difference and central difference?

RD: 1 per dimension in addition to the regular eval at x. CD: 2 per dimension in addition to the regular eval at x. CD gives a better/more exact slope, but more evaluations are required.
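
The evaluation counts can be checked with a small sketch. The step size h, the test function, and the call counter are assumptions for illustration. Note that the central-difference slope itself does not use the value at x, so the count reported for it below is just the 2 calls per dimension.

```python
def forward_gradient(f, x, h=1e-6):
    """Right (forward) difference: f(x) plus one extra call per dimension."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        grad.append((f(xp) - fx) / h)
    return grad

def central_gradient(f, x, h=1e-6):
    """Central difference: two calls per dimension."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

calls = {"n": 0}

def f(x):
    calls["n"] += 1  # count every function call
    return x[0] ** 3 + 2.0 * x[1]

point = [1.0, 2.0]
g_forward = forward_gradient(f, point)
forward_calls = calls["n"]   # 1 + n calls for n = 2 variables
calls["n"] = 0
g_central = central_gradient(f, point)
central_calls = calls["n"]   # 2 * n calls
```

The exact gradient at (1, 2) is [3, 2]. The forward estimate carries an error proportional to h, while the central estimate's error is proportional to h^2.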

What is the simplest inequality constraints?

Range limits of design variables which are sometimes called "Side" constraints

What are the disadvantages of the interior penalty method?

Slightly more complex. Your first point could be in the infeasible space so you have to check it and then select a new point to begin optimization.

What are the finite difference methods?

Taylor Series, Right (Forward) Difference, Central Difference

Explain this eqn: p_k+1 = -g_k+1 + gamma_k*p_k

This eqn gives the new search direction in the conjugate gradient method. gamma_k is selected in such a way that p_k+1 is conjugate (H-orthogonal, p_k+1^T*H*p_k = 0) to p_k, and for a quadratic it is actually conjugate to all the previous p_k's; g_k+1 is the new gradient; p_k+1 is the new direction to move. How far to move along it comes from the line search.
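
A sketch of this update on a quadratic f(x) = 0.5*x^T*A*x - b^T*x, using the Fletcher-Reeves choice gamma_k = (g_k+1 . g_k+1)/(g_k . g_k) and an exact line search. The matrix A, vector b, and starting point are hypothetical test values, and the exact step formula is valid only because the function is quadratic.

```python
def conjugate_gradient(A, b, x, steps):
    """Fletcher-Reeves CG on f(x) = 0.5*x^T A x - b^T x with exact line search."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    matvec = lambda M, v: [dot(row, v) for row in M]
    g = [ai - bi for ai, bi in zip(matvec(A, x), b)]  # gradient A x - b
    p = [-gi for gi in g]                             # first direction: steepest descent
    directions = []
    for _ in range(steps):
        Ap = matvec(A, p)
        alpha = -dot(g, p) / dot(p, Ap)               # exact step for a quadratic
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = [gi + alpha * api for gi, api in zip(g, Ap)]
        gamma = dot(g_new, g_new) / dot(g, g)         # Fletcher-Reeves
        directions.append(p)
        p = [-gn + gamma * pi for gn, pi in zip(g_new, p)]  # p_k+1 = -g_k+1 + gamma_k*p_k
        g = g_new
    return x, directions

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_opt, dirs = conjugate_gradient(A, b, [2.0, 1.0], steps=2)

# Conjugacy check: p_0^T A p_1 should be zero up to rounding.
conjugacy = sum(
    dirs[0][i] * A[i][j] * dirs[1][j] for i in range(2) for j in range(2)
)
```

With 2 variables the method reaches the minimizer (1/11, 7/11) in 2 line searches, and the two directions are conjugate with respect to A rather than perpendicular.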

What is finite difference used for?

To calculate the gradient. Then use a gradient based method such as conjugate gradient method to find the optimum

What is the advantage of Extended Penalty method and name the three types?

Tries to combine exterior and interior penalty methods. Linear extended of interior penalty functions, quadratic extended penalty functions, variable penalty function

What are the disadvantages of the exterior penalty method?

Typically approach optimum from outside of the constraint boundaries, so if optimization is stopped early, no feasible design is found

Describe how the interior penalty method works

Your solution converges from inside the feasible space (it never goes to the infeasible space), so you will always get a feasible solution. As you get closer to the constraints this method pushes away, so you start with a large r and have to decrease r; otherwise it will blow up (division by a very small number).
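
A minimal sketch of this behavior, assuming a hypothetical one-variable problem (minimize x^2 subject to x >= 1), the barrier term -1/g(x), and a simple ternary search restricted to the feasible side. None of these choices comes from the course notes.

```python
def minimize_1d(f, lo, hi, iters=200):
    """Ternary search for the minimum of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def interior_penalty_path(r_values):
    f = lambda x: x * x
    g = lambda x: 1.0 - x  # feasible when g(x) < 0, i.e. x > 1
    path = []
    for r in r_values:
        # The barrier -1/g(x) grows without bound as x approaches the constraint.
        phi = lambda x, r=r: f(x) + r * (-1.0 / g(x))
        # Search only strictly inside the feasible region.
        path.append(minimize_1d(phi, 1.0 + 1e-9, 5.0))
    return path

path = interior_penalty_path([1.0, 0.1, 0.01, 0.001])
```

Every iterate is feasible, and as r is decreased the iterates approach the constrained optimum x = 1 from the inside.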

The lagrange multiplier can only be positive if the corresponding constraint is?

active

Generally speaking do you have more iterations with Compass Search or CPS and why?

compass search b/c you are moving way more

Generally speaking do you have more function calls with Compass Search or CPS and why?

CPS, b/c you are evaluating the function in every direction even if you have already found a better direction to move

if f'(x0) = 0, and f''(x0) does not equal zero then the second term is dominant, and the power (x-x0) is_____, so the sign of f(x0)- f(x) will be________, assuring _________.

even; the same for all points in the close neighborhood of x0; a local minimum if f''(x0) > 0, and a local maximum if f''(x0) < 0.

Define Compass Search method

Given a step size r, explore directions along the coordinate axes until the first improvement. Halve the step if no improvements are found. x^(k+1) = x^k + r*e_i, where e_i = {0, ..., 1, ..., 0}. The next step direction is selected to be the same as the previously successful one; otherwise a new direction is selected in accordance with some rule (e.g. clockwise, counterclockwise).
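
A simplified sketch of compass search. As an assumption made to keep the example short, it always tries the directions in a fixed order (+e_1, -e_1, +e_2, ...) instead of retrying the previously successful direction first. The stopping step r_min and the test function are also hypothetical.

```python
def compass_search(f, x, r=1.0, r_min=1e-6, max_iter=10000):
    """Move along the first improving coordinate direction; halve r if none helps."""
    x = list(x)
    fx = f(x)
    # Candidate directions: +e_i and -e_i for every coordinate axis.
    directions = [(i, s) for i in range(len(x)) for s in (1.0, -1.0)]
    for _ in range(max_iter):
        if r < r_min:
            break
        improved = False
        for i, s in directions:
            trial = list(x)
            trial[i] += s * r
            ft = f(trial)
            if ft < fx:  # accept the first improvement found
                x, fx, improved = trial, ft, True
                break
        if not improved:
            r /= 2.0     # halve the step when no direction improves
    return x, fx

x_best, f_best = compass_search(
    lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2, [0.3, 0.3]
)
```

No derivatives are used anywhere: the method only compares function values.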

Find the gradient and hessian, and eigen values of the hessian for: f(x,y) = x^2 - y^2

Gradient: [2x, -2y]. Hessian: [[2, 0], [0, -2]]. Eigenvalues: 2 and -2.
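
The eigenvalues can be checked with the closed-form solution of det(A - lambda*I) = 0 for a symmetric 2x2 matrix. The helper names eig_sym_2x2 and classify are hypothetical, and the classification applies at a point where the gradient is zero (here the origin).

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] from det(A - lambda*I) = 0."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return mean - radius, mean + radius

def classify(eigs):
    """Second-order test at a stationary point."""
    if all(e > 0 for e in eigs):
        return "strict local minimum"  # Hessian positive definite
    if all(e < 0 for e in eigs):
        return "strict local maximum"  # Hessian negative definite
    if any(e > 0 for e in eigs) and any(e < 0 for e in eigs):
        return "saddle point"          # Hessian indefinite
    return "inconclusive"              # at least one zero eigenvalue

eigs = eig_sym_2x2(2.0, 0.0, -2.0)     # Hessian of x^2 - y^2
kind = classify(eigs)
```

One positive and one negative eigenvalue means the origin is a saddle point of f(x, y) = x^2 - y^2.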

Define Parameters (p)

inputs that are changed by the user, not the optimizer

If f'(x)=0 and the first non-zero term is even power do you have a local minimum or maximum?

A local minimum or maximum, depending on the sign of that derivative: f^(n)(x0) > 0 gives a minimum and f^(n)(x0) < 0 gives a maximum. Here "n" is the order of the first non-zero derivative in the Taylor series expansion.

The coefficient "r" that is with the penalty function is adjusted to

make sure the contributing parts of the objective function are relatively of the same order. In any kind of engineering formula it is a good thing to have terms of the same order.

How many modes does a convex problem have?

one mode / one hump on the graph and it looks like a quadratic

If you want to minimize the function, how do you move with respect to the gradient

opposite the direction of the gradient

Define the Random Walk search method

Pick a random direction s and select a fixed (relatively large) step r. If f(x^k + r*s) < f(x^k), set x^(k+1) = x^k + r*s; else select a new direction. If this still does not work after a number of selected directions, halve the step. A "hybrid" method selects the direction randomly, but uses an optimized step obtained from line optimization.

How is the convergence speed of the exterior penalty method?

slow

What are the two key things needed for iterative method evaluations?

step size and direction

Define rates of convergence

tells you how many steps I need to make to get to the solution in terms of the value of the variables, not the value of the function.

Define function call

the evaluation of the function with a new value then comparing that result to the current obj func value

What do we care more about, iterations or function calls and why?

the number of func calls b/c this is the time consuming part.

Describe Fletcher-Reeves and Polak-Ribiere

these are methods to finding gamma_k without knowing the hessian so they are used with the conjugate gradient method. Depending on the situation one is more stable and better to use than the other.

For gradient based optimization how likely is it that we will use these penalty methods?

Unlikely, but there are some fundamental concepts that are used in practical algorithms

Define Objective Function (f)

value to be optimized (can be a combination of several factors), minimize, maximize, multi-discipline optimization, local minimum, global minimum

Solve: min(x1+x2), O = {2-x1^2-x2^2 >= 0, x2 >= 0}

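x1 = -sqrt(2), x2 = 0 (both constraints are active). Writing the constraints as g1 = x1^2 + x2^2 - 2 <= 0 and g2 = -x2 <= 0 with L = f + lambda1*g1 + lambda2*g2: lambda1 = 1/(2*sqrt(2)), lambda2 = 1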
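
A numerical check of this problem, under the assumption that the constraints are written as g1 = x1^2 + x2^2 - 2 <= 0 and g2 = -x2 <= 0 with L = f + lambda1*g1 + lambda2*g2. It tests the candidate point x1 = -sqrt(2), x2 = 0 with lambda1 = 1/(2*sqrt(2)), lambda2 = 1, and compares it with a brute-force search. The 0.01 grid is an arbitrary choice.

```python
import math

def f(x1, x2):
    return x1 + x2

def g1(x1, x2):
    return x1 * x1 + x2 * x2 - 2.0  # g1 <= 0 is the disk of radius sqrt(2)

def g2(x1, x2):
    return -x2                       # g2 <= 0 is the half-plane x2 >= 0

x1, x2 = -math.sqrt(2.0), 0.0
lam1, lam2 = 1.0 / (2.0 * math.sqrt(2.0)), 1.0

# Stationarity: grad f + lam1 * grad g1 + lam2 * grad g2 = 0.
stationarity = (
    1.0 + lam1 * 2.0 * x1 + lam2 * 0.0,
    1.0 + lam1 * 2.0 * x2 + lam2 * -1.0,
)

# Brute-force search over feasible points on a 0.01 grid.
grid_best = min(
    (f(i / 100, j / 100), i / 100, j / 100)
    for i in range(-150, 151)
    for j in range(0, 151)
    if g1(i / 100, j / 100) <= 0.0
)
```

Both multipliers are positive and both constraints are active at this point. The point x1 = +sqrt(2), x2 = 0 also lies on both constraints, but there the objective takes its largest value along x2 = 0 rather than its smallest.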

Define coordinate search

You find the minimum in one direction (variable), then move to another direction (variable), but you may end up overshooting in one of the directions (that is why conjugate direction methods are better). This turns a multi-dimensional problem into a sequence of 1-D problems.

