M1 : Principal Component Analysis (PCA)
Why is high variation important?
More variation in a predictor gives a model more information to work with when building an explanatory or predictive model
Example of the importance of decorrelating data
x1 = radio advertising (predictor), x2 = TV advertising (predictor), y = sales of product (response). If we have multicollinearity, it is hard to determine what effect radio advertising is having on sales of the product compared to what effect TV advertising is having.
How does PCA remove correlation within the data?
by changing the coordinate system
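A minimal sketch of that coordinate change in Python, on made-up 2D data (numpy/scikit-learn; the data and variable names are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=200)  # correlated with x1
X = np.column_stack([x1, x2])

pca = PCA(n_components=2).fit(X)
print(pca.components_)  # rows are the new axis directions (unit vectors)

# Transforming the data = projecting the centered data onto those new axes
T = pca.transform(X)
Xc = X - X.mean(axis=0)
print(np.allclose(T, Xc @ pca.components_.T))  # True
```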
What is the significance of PCR?
by using PCR you can perform dimensionality reduction on a high-dimensional dataset and THEN fit a linear regression model to a smaller set of variables, while at the same time keeping most of the variability of the original predictors
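A sketch of PCR along those lines, chaining PCA with linear regression in a scikit-learn pipeline (the toy data and the choice of 3 components are assumptions for illustration):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))                      # 10 original predictors
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=100)

# PCR: scale, reduce 10 predictors to 3 components, then regress on those
pcr = make_pipeline(StandardScaler(), PCA(n_components=3), LinearRegression())
pcr.fit(X, y)

# Fraction of the original predictors' variability the 3 components keep
print(pcr.named_steps["pca"].explained_variance_ratio_.sum())
```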
What is the purpose of feature extraction?
dealing with: 1. high-dimensional data 2. highly correlated data
Multicollinearity
when multiple variables (factors) that influence a dependent variable (response) are themselves highly correlated with one another
What are the benefits of ranking coordinates?
reduces the effect of randomness - the earlier principal components are likely to be driven by a higher ratio of actual effects to random effects
Eigenvalue (lambda)
the factor by which a linear transformation scales its eigenvector, making it longer or shorter: Av = lambda v
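A tiny numeric illustration of that scaling, with a hand-picked matrix whose eigenvector and eigenvalue are known (illustrative values):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 0.5]])
v = np.array([1.0, 0.0])  # eigenvector of A with eigenvalue lambda = 2

print(A @ v)  # [2. 0.]: same direction as v, stretched by the factor lambda
```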
How does finding implied coefficients for original factors help us?
- gives an intuitive explanation for the model - in other words, the PCA model can be expressed over the original factor space, so an intuitive explanation in terms of those variables can be given
How do we interpret the new model in terms of the original factors?
- if we plug the transformation formula into the model for each t vector, we can find the implied coefficient aj for each of our original factors j
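A sketch of recovering those implied coefficients, assuming a PCA-plus-regression fit as above; with scikit-learn's layout (rows of components_ are the eigenvectors), the implied coefficients come out as components_.T @ b (the data here is illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=100)

pca = PCA(n_components=3).fit(X)
T = pca.transform(X)                    # the new t vectors
b = LinearRegression().fit(T, y).coef_  # coefficients on the components

# Plug t = V^T (x - mean) back into the regression: implied a = V b
a = pca.components_.T @ b
print(a)  # one implied coefficient a_j per original factor j
```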
What are the goals of Principal Component Analysis?
1. Reduce the number of predictors we use 2. Eliminate the need for large sets of data 3. Remove correlation within the data 4. Rank coordinates by importance
How does PCA rank coordinates by importance?
By ranking the coordinate dimensions in order of the amount of variance in each
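In scikit-learn this ranking shows up directly as explained_variance_ratio_, which is sorted in decreasing order (toy data chosen to have unequal spread per dimension):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4)) * np.array([5.0, 2.0, 1.0, 0.1])

pca = PCA().fit(X)
# Each component captures at least as much variance as the next one
print(pca.explained_variance_ratio_)
```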
Eigenvalue Math Theory
Every value of lambda for which det(A - lambda I) = 0 is an eigenvalue of A - once we have an eigenvalue we can solve (A - lambda I)v = 0 for the corresponding eigenvector v
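A numeric check of that recipe, assuming a small symmetric matrix A (numpy's eig solves the same characteristic equation):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, V = np.linalg.eig(A)  # eigenvalues of A and their eigenvectors (columns)
for l, v in zip(lam, V.T):
    print(np.linalg.det(A - l * np.eye(2)))  # det(A - lambda I) ~ 0
    print(np.allclose(A @ v, l * v))         # and A v = lambda v holds
```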
What is a term for the first ranked coordinates?
The first n principal components
Eigenvector
If we start with some vector v and apply a linear transformation A to it, we end up with a vector that points along the exact same line, just scaled: Av = lambda v
is PCA only linear?
No - you can use kernels (kernel PCA) to capture nonlinear structure
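A sketch with scikit-learn's KernelPCA, using an RBF kernel on data whose structure is circular rather than linear (the dataset and gamma value are illustrative choices):

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

# An RBF kernel lets PCA pick up the nonlinear (circular) structure
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10)
T = kpca.fit_transform(X)
print(T[:5])  # components in the kernel-induced feature space
```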
Perfect Collinearity
one variable exactly determines the value of another - e.g., if you knew x1, you would know exactly what x2 would be
PCA Change of Coordinate System
PCA changes the coordinate system so that the new axes point along uncorrelated directions of the data, creating an uncorrelated scenario
Why is it important to uncorrelate data?
Performing PCA on the raw data produces linear combinations of the predictors that are uncorrelated - therefore it allows you to disentangle the effects different predictors have on the response
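A quick check of that decorrelation, reusing the radio/TV setup from the earlier example (toy data, names illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
radio = rng.normal(size=300)
tv = 0.8 * radio + 0.2 * rng.normal(size=300)  # correlated with radio
X = np.column_stack([radio, tv])

T = PCA().fit_transform(X)
print(np.corrcoef(X, rowvar=False))  # large off-diagonal: entangled effects
print(np.corrcoef(T, rowvar=False))  # off-diagonal ~0: disentangled
```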
What is PCR?
Principal Component Regression
What are Eigenvalues used for? (Math Theory)
The first step of PCA is to find all the eigenvectors of the matrix X^T X - we then multiply X by each eigenvector to find each of the principal components
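A from-scratch sketch of those two steps with numpy, assuming the data is centered first (as PCA requires):

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 3))
Xc = X - X.mean(axis=0)            # center each column

# Step 1: eigenvectors of X^T X (eigh, since X^T X is symmetric)
lam, V = np.linalg.eigh(Xc.T @ Xc)
order = np.argsort(lam)[::-1]      # largest eigenvalue first
V = V[:, order]

# Step 2: multiply X by each eigenvector to get the principal components
T = Xc @ V
print(T.shape)  # (100, 3): one column per principal component
```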
What are Eigenvalues used for? (Principle)
PCA uses the properties of eigenvectors and eigenvalues to obtain the transformed set of coordinate directions and to ensure those directions are all orthogonal to each other