Machine Learning
What are the steps in a backward elimination model method?
1. Select a significance level (SL) to stay in the model. 2. Fit the full model with all possible predictors. 3. Consider the predictor with the highest p-value. If P > SL, go to step 4; otherwise go to FIN (if all p-values are below the significance level, the model is finished). 4. Remove the predictor with the highest p-value. 5. Fit the model without this variable, then return to step 3.
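The loop in these steps can be sketched in R as follows. This is a minimal sketch, not the course's code: the function name backward_eliminate, the argument sl, and the use of the built-in mtcars data are assumptions made for illustration.

```r
# Backward elimination driven by p-values (illustrative sketch).
# `backward_eliminate` and `sl` are hypothetical names.
backward_eliminate <- function(data, response, sl = 0.05) {
  predictors <- setdiff(names(data), response)
  repeat {
    # Fit the model with the predictors that are still in play
    fit <- lm(reformulate(predictors, response = response), data = data)
    # p-values of the predictors (first row is the intercept, so drop it)
    pvals <- summary(fit)$coefficients[-1, "Pr(>|t|)", drop = FALSE]
    worst <- which.max(pvals[, 1])
    # FIN: the highest p-value is within SL, or only one predictor is left
    if (pvals[worst, 1] <= sl || length(predictors) == 1) break
    # Remove the predictor with the highest p-value and refit
    predictors <- setdiff(predictors, rownames(pvals)[worst])
  }
  fit
}

cars <- mtcars[, c("mpg", "wt", "hp", "qsec", "drat")]
final_fit <- backward_eliminate(cars, "mpg")
print(summary(final_fit)$coefficients)
```

The sketch assumes numeric predictors, so that coefficient names match column names; factor predictors would need extra handling.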
What are the steps in an All Possible model method?
1. Select a criterion of goodness of fit (e.g. the Akaike criterion). 2. Construct all possible regression models: 2^n - 1 total combinations for n predictors. 3. Select the one with the best criterion.
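As a rough illustration of these three steps, the sketch below enumerates every non-empty subset of three predictors and keeps the model with the lowest AIC. The mtcars data and the choice of candidate predictors are assumptions for the example only.

```r
# All-possible-models sketch with AIC as the criterion.
candidates <- c("wt", "hp", "qsec")   # n = 3 predictors (illustrative choice)

# Every non-empty subset of the candidates: 2^3 - 1 = 7 models
subsets <- unlist(
  lapply(seq_along(candidates),
         function(k) combn(candidates, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit one linear model per subset and score each with AIC (lower is better)
fits <- lapply(subsets, function(vars) {
  lm(reformulate(vars, response = "mpg"), data = mtcars)
})
aics <- sapply(fits, AIC)

best <- fits[[which.min(aics)]]
print(length(subsets))
print(formula(best))
```

The number of models doubles with each extra predictor, which is why this method is only practical for a small number of candidates.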
True and False/ Yes and No is coded as what when encoding for Machine Learning? Why?
0 and 1 because they are binary
What are the five methods of building models?
1. All in 2. Backward Elimination 3. Forward Selection 4. Bidirectional Elimination 5. Score Comparison
What are the five assumptions of linear regression?
1. Linearity 2. Homoscedasticity 3. Multivariate normality 4. Independence of errors 5. Lack of multicollinearity
What are the steps in a bidirectional elimination model method?
1. Select a significance level to enter and to stay in the model, e.g. SL ENTER = 0.05, SL STAY = 0.05. 2. Perform the next step of Forward Selection (new variables must have P < SL ENTER to enter). 3. Perform ALL steps of Backward Elimination (old variables must have P < SL STAY to stay), then return to step 2. 4. When no new variables can enter and no old variables can exit, go to FIN.
What are the steps in a Forward Selection model method?
1. Select a significance level (SL) to enter the model. 2. Fit all simple regression models y ~ xn (one per variable). Select the one with the lowest p-value. 3. Keep this variable and fit all possible models with one extra predictor added to the one(s) you already have. 4. Consider the predictor with the lowest p-value. If P < SL, go to step 3; otherwise go to FIN and keep the previous model (the one without the last, non-significant predictor).
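A minimal R sketch of these steps is shown below. The function name forward_select, the argument sl, and the mtcars columns are hypothetical choices for illustration, not part of the course material.

```r
# Forward selection driven by p-values (illustrative sketch).
# `forward_select` and `sl` are hypothetical names.
forward_select <- function(data, response, sl = 0.05) {
  remaining <- setdiff(names(data), response)
  selected <- character(0)
  while (length(remaining) > 0) {
    # p-value of each candidate when added to the predictors already kept
    pvals <- sapply(remaining, function(v) {
      fit <- lm(reformulate(c(selected, v), response = response), data = data)
      summary(fit)$coefficients[v, "Pr(>|t|)"]
    })
    best <- names(which.min(pvals))
    # FIN: the best candidate is not significant, keep the previous model
    if (pvals[best] >= sl) break
    selected <- c(selected, best)
    remaining <- setdiff(remaining, best)
  }
  selected
}

cars <- mtcars[, c("mpg", "wt", "hp", "qsec", "drat")]
chosen <- forward_select(cars, "mpg")
print(chosen)
```

The function returns the names of the selected predictors in the order they entered; the final model can then be fitted with lm on those names.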
Reinforcement Learning
Algorithms that take actions to maximize cumulative reward
Why do we call it a linear regression?
As the independent variable increases or decreases, the dependent variable increases or decreases in a linear fashion.
What are the 10 applications of Machine Learning discussed in the course?
Face recognition, Facebook ads, voice recognition, Amazon, Audible, Netflix, physical video games, medical imaging, space imaging, robots.
What is the dummy variable trap?
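If a categorical variable with n categories is encoded as n dummy variables and each observation falls into one and only one category, the dummies always sum to 1 and duplicate the constant term, so the regression fails because of perfect multicollinearity. Always omit one dummy variable per categorical variable.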
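The trap can be seen directly in the design matrix. The sketch below uses a small made-up data frame (the state and profit columns are assumptions for illustration) and shows that adding the omitted dummy column does not increase the rank of the matrix.

```r
# A categorical variable with three categories (hypothetical data)
df <- data.frame(
  state  = factor(c("NY", "CA", "FL", "NY", "CA", "FL")),
  profit = c(10, 12, 9, 11, 13, 8)
)

# model.matrix shows the design matrix lm() would use: an intercept plus
# 2 dummy columns for 3 categories (the first level, CA, is the baseline)
X <- model.matrix(profit ~ state, data = df)
print(colnames(X))

# Adding the omitted dummy makes the columns linearly dependent:
# stateCA + stateFL + stateNY equals the intercept column
X_all <- cbind(X, stateCA = as.numeric(df$state == "CA"))
print(ncol(X_all))      # 4 columns
print(qr(X_all)$rank)   # but the rank is still 3
```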
In a regression y is the
dependent variable
Not all Artificial Intelligence could count as Machine Learning since some basic Rule-based engines could be classified as AI but they don't learn from experience therefore they do not belong to the machine learning category
True
The simple linear regression package takes care of feature scaling(standardizing and normalizing the data)
True
Deep Learning
A specialized field of Machine Learning that relies on training of Deep Artificial Neural Networks (ANNs) using a large dataset such as images or texts.
Machine Learning
A subfield of Artificial Intelligence that enables machines to improve at a given task with experience. It is important to note that all machine learning techniques are classified as Artificial Intelligence ones.
Why is a simple linear regression simple?
Because it examines the relationship between two variables only.
We should use Simple Linear Regression to predict the winner of a football game
False. Simple linear regression is used to predict continuous data. The football question is a yes/no which means it is discrete (or categorical data), you will need a classification model for that like Logistic Regression
The split ratio parameter in the sample.split function can only take decimal values, e.g. 0.8
False. You can also use fractions, e.g. 2/3
You should include all of your dummy variables in your multiple linear regression models
False. You should never include all of your dummy variables in your model.
What package is the stepAIC function in?
MASS
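A short usage sketch follows, assuming the MASS package is installed (it ships with standard R distributions as a recommended package). Note that stepAIC selects variables by AIC rather than by p-values; the mtcars model is an illustrative choice.

```r
library(MASS)  # provides stepAIC

# Start from a full model and let stepAIC drop terms while AIC improves
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
reduced <- stepAIC(full, direction = "backward", trace = FALSE)

print(formula(reduced))
```

Setting direction = "both" instead would allow terms to be added back as well as removed.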
Brackets instead of parentheses in R represent
Indexes
Artificial Neural Networks
Information processing models inspired by the human brain
Why is a polynomial linear regression still linear when the visualization is curved?
It is still linear because "linear" refers to the coefficients, not the x variables: the equation y = b0 + b1*x1 + b2*x1^2 + ... + bn*x1^n is a linear combination of the coefficients b0, b1, ..., bn.
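To make this concrete, a polynomial model can be fitted with the same lm function used for ordinary linear regression. The data below is simulated for illustration, with assumed true coefficients 2, 0.5 and 1.5.

```r
set.seed(1)
x <- seq(1, 10, length.out = 30)
y <- 2 + 0.5 * x + 1.5 * x^2 + rnorm(30)   # simulated, curved relationship

# I(x^2) adds the squared term as just another column;
# the model is still linear in b0, b1 and b2
poly_fit <- lm(y ~ x + I(x^2))
print(coef(poly_fit))
```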
What does this function do? split = sample.split(dataset$Purchased, SplitRatio = 0.8)
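It returns a logical vector with one entry per observation: about 80% are marked TRUE (training set) and 20% FALSE (test set), preserving the class proportions of dataset$Purchased. It does not split the dataset by itself; you subset the dataset with the result.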
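The sketch below imitates that behavior in base R so it runs without caTools. It is an approximation for illustration, not the caTools implementation; the dataset and the name in_train are made up.

```r
set.seed(123)
# Hypothetical dataset: 30 rows of class 0 and 20 rows of class 1
dataset <- data.frame(
  Age = sample(18:60, 50, replace = TRUE),
  Purchased = factor(rep(c(0, 1), times = c(30, 20)))
)

# For each class, mark 80% of its rows TRUE so class proportions are kept
in_train <- logical(nrow(dataset))
for (rows in split(seq_len(nrow(dataset)), dataset$Purchased)) {
  in_train[sample(rows, round(0.8 * length(rows)))] <- TRUE
}

# The logical vector is then used to subset the data
training_set <- dataset[in_train, ]
test_set <- dataset[!in_train, ]
print(table(training_set$Purchased))
```

With caTools loaded, the logical vector would come from sample.split(dataset$Purchased, SplitRatio = 0.8) and be used in the same way.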
Why would you use the all in method to build a model?
1. Prior knowledge 2. You have to (company requested, etc.) 3. Preparing for Backward Elimination
All in model method
Putting all your variables in your dataset in your model.
Artificial Intelligence
Science that empowers computers to mimic human intelligence such as decision making, text processing, and visual perception. AI is a broader field (i.e. the big umbrella) that contains several subfields such as machine learning, robotics, and computer vision.
what does this function mean? dataset[2:3]
Since R's columns start with an index of 1, this expression tells R to take columns (indexes) 2 through 3 from dataset.
Why do you have to encode your categorical variables to factors when modeling?
Machine learning models are built on mathematical equations, so categorical values such as text labels must be converted into factors, which R encodes numerically, before they can enter an equation.
What do we know about Ordinary Least Squares?
That yi represents the observed value and yi hat (ŷi) represents the fitted value on the trend line. Ordinary least squares picks the line that minimizes the sum of squared residuals, SUM (yi - ŷi)^2; the line with the smallest sum is the 'best fit'.
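As a worked check, the least-squares coefficients for a simple regression can be computed by hand and compared with lm. The choice of mtcars (mpg against wt) is an assumption for illustration.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Closed-form least-squares estimates for y = b0 + b1 * x
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Fitted values and the quantity OLS minimizes: SUM (yi - yi_hat)^2
y_hat <- b0 + b1 * x
ssr <- sum((y - y_hat)^2)

# lm arrives at the same coefficients
fit <- lm(mpg ~ wt, data = mtcars)
print(c(b0 = b0, b1 = b1))
print(all.equal(unname(coef(fit)), c(b0, b1)))
```

Any other line, for example one with a slightly different slope, gives a larger sum of squared residuals than ssr.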
What is this part of the function doing? dataset$Age = ifelse(is.na(dataset$Age),
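is.na(dataset$Age) returns TRUE for each missing (NA) value in the Age column and FALSE otherwise. ifelse uses that result as its test: where it is TRUE it takes the second argument, and where it is FALSE it takes the third.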
Supervised Learning
Training algorithms using labeled input/output data. Classification and regression are supervised learning techniques.
Unsupervised Learning
Training algorithms with no labeled data. It attempts at discovering hidden patterns on its own. Clustering is considered unsupervised learning.
When looking at your data, what should you distinguish between the variables?
What are the dependent and independent variables?
What should you do if you need to put categorical variables in your linear regression model?
You have to transform the categorical variable into dummy variables with 0/1 values that indicate whether an observation belongs to the category of that dummy variable. Create as many dummy variables as there are categories in the original categorical variable, but leave one out of the model to avoid the dummy variable trap.
Why do you have to split the data into a training and test set when modeling?
Your model could overfit the data it was trained on, and to test for that you need a separate dataset the model has not seen.
Statistical Significance
a statistical statement of how likely it is that an obtained result occurred by chance
What package in R takes care of the dummy variable trap for you?
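stats (built into R): the lm function leaves one dummy variable out automatically when the categorical variable is a factor. caTools supplies sample.split, not dummy-variable handling.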
In a regression b1 is the
coefficient
In a regression b0 is the
constant
The lower the p-value the higher the
statistical significance of the predictor, i.e. the stronger the evidence that it has an effect on the dependent variable
Multiple Linear Regression
examines the relationship between more than two variables
How can you fit a linear regression model in R with all variables except one predictor value?
formula = y ~ . - x, where y is the dependent variable and x is the predictor we want to remove from the model
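For example, with a few columns of the built-in mtcars data (an illustrative choice), the dot expands to every column except the response and the minus sign drops one predictor:

```r
cars <- mtcars[, c("mpg", "wt", "hp", "qsec")]

# "." stands for all other columns; "- qsec" removes that one predictor
fit <- lm(mpg ~ . - qsec, data = cars)
print(names(coef(fit)))
```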
In a regression x1 is the
independent variable
Before building a linear regression model you need to make sure what is true?
That the linear regression assumptions hold
What is the function used in R to create a simple linear regressor ?
lm
Polynomial Regression
models the relationship between the independent variable and the dependent variable as an nth degree of a polynomial in x
what does the alt + n shortcut do in R?
selects the tilde symbol
Write down the function used to split a dataset
split = sample.split(dataset$Purchased, SplitRatio = 0.8)
Feature Scaling
standardizing or normalizing your numeric variables to ensure the range is similar and no variable is dominated by the other.
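Both variants can be sketched in base R as below. The Age and Salary values are made up, and the names standardized and normalized are illustrative.

```r
# Hypothetical data with very different ranges
dataset <- data.frame(
  Age    = c(25, 35, 45, 55),
  Salary = c(30000, 50000, 70000, 90000)
)

# Standardization: (x - mean(x)) / sd(x), done column by column by scale()
standardized <- as.data.frame(scale(dataset))

# Normalization: (x - min(x)) / (max(x) - min(x)), giving values in [0, 1]
normalized <- as.data.frame(
  lapply(dataset, function(x) (x - min(x)) / (max(x) - min(x)))
)

print(standardized)
print(normalized)
```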
What is this part of the function doing? ave(dataset$Age, FUN = function(x) mean(x, na.rm = TRUE)), dataset$Age)
This part computes the average of the Age column, with FUN specifying the function used (mean). Including na.rm = TRUE tells R to remove the missing values before it calculates the age mean; without it the mean would be NA. The last part, dataset$Age, is what ifelse returns when the first condition is not TRUE, so the original age is kept.
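Put together with the ifelse and is.na part, the whole expression replaces missing ages with the column mean. The small data frame below is made up for illustration.

```r
dataset <- data.frame(Age = c(25, NA, 35, 40, NA))   # hypothetical data

# Where Age is NA, use the mean of the non-missing ages;
# otherwise keep the original value
dataset$Age <- ifelse(is.na(dataset$Age),
                      ave(dataset$Age, FUN = function(x) mean(x, na.rm = TRUE)),
                      dataset$Age)

print(dataset$Age)
```

Here the two NA entries become the mean of 25, 35 and 40, while the other three values are unchanged.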
When would you NOT split the dataset into a training and test dataset when creating a model?
when there are too few observations (n < 10)
What are the first three arguments of the factor function?
x (the character variable), levels (the current values), labels (the new names for those values)
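A small sketch of the three arguments in use, with a made-up country vector:

```r
country <- c("France", "Spain", "Germany", "Spain")   # hypothetical data

# x = the character variable, levels = its current values,
# labels = the new names given to those values (stored as text)
country_f <- factor(country,
                    levels = c("France", "Spain", "Germany"),
                    labels = c(1, 2, 3))

print(country_f)
print(levels(country_f))
```

The numeric labels are converted to the character level names "1", "2" and "3"; the result is a factor, not a numeric vector.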
What is the regression equation?
y = b0 + b1 * x1
What is the correct way of writing a simple linear regression equation in the formula parameter in R ?
y ~ x