Machine Learning


T/F Ridge and Lasso Regression are both designed for regularizing the weights. They differ just in terms of the regularization terms they use.

False

T/F Separability in the training set ensures separability in the feature space of the entire data spectrum.

False

T/F Overfitting is more likely when you have a huge amount of data to train on.

False - With a small training dataset, it is easier to find a hypothesis that fits the training data exactly, i.e., to overfit; more data makes overfitting less likely.

Describe the difference between Linear Regression and Logistic Regression

In linear regression, the outcome (dependent variable) is continuous. It can have any one of an infinite number of possible values. In logistic regression, the outcome (dependent variable) has only a limited number of possible values. Logistic regression is used when the response variable is categorical in nature.
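The distinction can be sketched in a few lines; the weights and inputs below are made-up illustrative values, not from the text:

```python
import numpy as np

# Toy 1-D inputs and hypothetical weights.
x = np.array([-2.0, 0.0, 3.0])
w, b = 1.5, 0.5

# Linear regression: the prediction is the raw affine output, any real number.
y_linear = w * x + b

# Logistic regression: the same affine output is squashed by a sigmoid,
# giving a probability in (0, 1) for a categorical (binary) outcome.
y_logistic = 1.0 / (1.0 + np.exp(-(w * x + b)))

print(y_linear)    # unbounded continuous values
print(y_logistic)  # values strictly between 0 and 1
```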

Which of the following methods do we use to find the best fit line for data in Linear Regression? -Least Square Error -Maximum Likelihood -Logarithmic Loss -Both A and B

Least Square Error - In linear regression, we try to minimize the least square errors of the model to identify the line of best fit.
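A minimal least-squares fit, assuming synthetic noisy points around y = 2x + 1 (values chosen here for illustration):

```python
import numpy as np

# Synthetic data: y = 2x + 1 plus small Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Design matrix with a constant column; lstsq returns the slope and
# intercept that minimize the sum of squared errors.
X = np.column_stack([x, np.ones_like(x)])
slope, intercept = np.linalg.lstsq(X, y, rcond=None)[0]
print(slope, intercept)  # close to 2 and 1
```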

What is Regression?

Predicting continuous outputs is called Regression

Suppose that we have N independent variables (X1, X2, ..., Xn) and the dependent variable is Y. Now imagine that you are applying linear regression by fitting the best fit line using least square error on this data. You found that the correlation coefficient for one of its variables (say X1) with Y is 0.97. Which of the following is true for X1? -Relation between the X1 and Y is weak -Relation between the X1 and Y is strong -Relation between the X1 and Y is neutral -Correlation can't judge the relationship

Relation between the X1 and Y is strong - The absolute value of the correlation coefficient denotes the strength of the relationship. Since absolute correlation is very high it means that the relationship is strong between X1 and Y.

Describe the steps of Gradient Descent in learning a Linear Regression model

Repeat until convergence: (1) Calculate the loss and the gradient for each parameter. (2) Given the gradient and the learning rate, calculate the change in the parameters. (3) Recalculate the gradient with the new parameter values.
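The loop above can be sketched for 1-D linear regression on synthetic data; the learning rate and iteration count below are hypothetical choices:

```python
import numpy as np

# Synthetic data around y = 3x - 0.5.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100)
y = 3.0 * x - 0.5 + rng.normal(scale=0.05, size=x.size)

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    err = w * x + b - y
    loss = np.mean(err ** 2)          # (1) compute the loss...
    grad_w = 2 * np.mean(err * x)     #     ...and the gradient per parameter
    grad_b = 2 * np.mean(err)
    w -= lr * grad_w                  # (2) step each parameter by lr * gradient
    b -= lr * grad_b                  # (3) next pass recomputes gradients at new w, b
print(w, b)  # close to 3 and -0.5
```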

Explain underfitting

Simple models may not be able to capture all insights on the underlying data patterns and thus may underfit

Which of the following steps in Linear Regression impacts the trade-off between under-fitting and overfitting the most: -The polynomial degree -Whether we learn the weights by matrix inversion or gradient descent -The use of a constant-term

The polynomial degree - Choosing the right degree of polynomial plays a critical role in fit of regression. If we choose a higher degree of polynomial, chances of overfit increase significantly.

T/F A loss (or cost, or objective) function tells us how well our model approximates the training examples under consideration.

True

T/F The ROC curve shows the TPR and FPR at varying thresholds.

True
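Each threshold gives one (FPR, TPR) point on the curve; the scores and labels below are made up for illustration:

```python
import numpy as np

# Hypothetical classifier scores and true binary labels.
scores = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.1])
labels = np.array([1,   1,   0,   1,   0,   0])

points = []
for t in [0.0, 0.5, 1.0]:
    pred = scores >= t                                     # predict positive above threshold
    tpr = np.sum(pred & (labels == 1)) / np.sum(labels == 1)
    fpr = np.sum(pred & (labels == 0)) / np.sum(labels == 0)
    points.append((round(fpr, 2), round(tpr, 2)))
print(points)  # sweeps from (1.0, 1.0) down to (0.0, 0.0)
```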

We have a dataset with R records in which the i-th record has one real-valued input attribute xi and one real-valued output attribute yi. In order to test our linear regression model, we choose at random some data for training, and choose a random subset of the remaining samples to be a test set. Now, if we grow the training set size gradually, what do you expect will happen with the mean training and mean testing errors?

-The training error tends to increase. As more examples have to be fitted, it becomes harder to 'hit', or even come close, to all of them -The test error tends to decrease. As we take into account more examples when training, we have more information, and can come up with a model that better resembles the true behavior. More training examples lead to better generalization.

You trained a binary classifier model which gives very high accuracy on the training data, but much lower accuracy on validation data. The following may be true: 1. This is an instance of overfitting. 2. This is an instance of underfitting. 3. The training was not well regularized. 4. The training and testing examples are sampled from different distributions.

1 and 4

What's L2 regularization?

Also known as ridge regression: the penalty term is the sum of the squared weights, and an analytical (closed-form) solution exists.
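The closed-form solution w = (XᵀX + λI)⁻¹Xᵀy can be sketched directly; the data and regularization strength lam below are hypothetical:

```python
import numpy as np

# Synthetic regression problem with known weights.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

# Ridge analytical solution: solve (X^T X + lam*I) w = X^T y.
lam = 1.0
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
print(w_ridge)  # shrunk slightly toward zero relative to true_w
```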

What's L1 regularization?

Also known as lasso: it eliminates the weights of the least important features, so it produces sparsity. However, there is no longer an analytical solution.
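Since no closed form exists, lasso is typically solved iteratively. A minimal sketch using proximal gradient descent (ISTA) with soft-thresholding, on synthetic data with a sparse ground truth; lam and the step size are hypothetical choices:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrinks values, zeroing small ones exactly."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Synthetic data: only features 0 and 3 matter.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
true_w = np.array([2.0, 0.0, 0.0, -1.5, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=200)

# ISTA: gradient step on the squared loss, then soft-threshold.
lam = 10.0
step = 1.0 / (np.linalg.norm(X, 2) ** 2)   # 1 / Lipschitz constant of the gradient
w = np.zeros(5)
for _ in range(1000):
    w = soft_threshold(w - step * X.T @ (X @ w - y), step * lam)
print(w)  # irrelevant coefficients driven exactly to zero
```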

What is Cross Validation?

Divide the dataset into training, validation, and testing sets. Cross-validation tests the model's ability to predict new data that was not used in estimating it, in order to flag problems like overfitting or selection bias, and gives insight into how the model will generalize to an independent dataset.

T/F Averaged Perceptron just ensures a faster performance.

False

T/F Cross validation can be useful, only when you have large training set.

False

T/F L1 Loss Function is used to minimize the error which is the sum of the all the squared differences between the true value and the predicted value

False

T/F L2 Loss Function is used to minimize the error which is the sum of the all the absolute differences between the true value and the predicted value.

False
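Both statements above swap the two definitions: L1 sums absolute differences, L2 sums squared differences. A quick check with illustrative values:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5,  0.0, 2.0])

l1_loss = np.sum(np.abs(y_true - y_pred))   # sum of absolute differences
l2_loss = np.sum((y_true - y_pred) ** 2)    # sum of squared differences
print(l1_loss, l2_loss)
```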

T/F Margin is the min sine of angle between w and x, for any x in the training collection.

False

T/F Mini-batch and Batch Optimization schemes are essentially same.

False

Provide and explain ways to prevent over-fitting.

Regularization - keeps the weights small, which prevents any single input dimension from over-influencing the prediction. Regularized least squares - minimizes the loss function with an added term that controls sparsity (i.e., the size of the weight components).

What is regularization?

Regularization is a technique used to tackle over-fitting. A very complex model implemented on the training data will over-fit, while a model that is too simple may fail to generalize; regularization addresses this trade-off by penalizing model complexity.

T/F Averaged Perceptron is a method to avoid overfitting, wherein the final weight vector is the mean of all weight vector values at each step (iteration) of the algorithm.

True
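A sketch of the averaged perceptron on synthetic separable data: standard perceptron updates, but the returned weights are the mean over every step, which damps over-fitting to late examples. The data and epoch count are illustrative:

```python
import numpy as np

def averaged_perceptron(X, y, epochs=10):
    w = np.zeros(X.shape[1])
    w_sum = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):       # yi in {-1, +1}
            if yi * (w @ xi) <= 0:     # mistake: nudge w toward yi * xi
                w = w + yi * xi
            w_sum += w                 # accumulate the weight vector at every step
    return w_sum / (epochs * len(X))   # averaged weight vector

# Linearly separable toy data: label is the sign of x1 + x2.
rng = np.random.default_rng(5)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

w_avg = averaged_perceptron(X, y)
acc = np.mean(np.sign(X @ w_avg) == y)
print(acc)  # high accuracy on the training data
```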

T/F Batch Update splits the training dataset into small batches that are used to calculate model error and update model coefficients.

False - Batch update computes the error and gradient over the entire training dataset for each update; splitting the data into small batches describes Mini-Batch update.

T/F Generalizability is about estimating the learned classifier's performance at both training and test time.

True

T/F L1 penalizes the coefficients more than L2, which leads to sparsity.

True

T/F Mini-Batch Update splits the training dataset into small batches that are used to calculate model error and update model coefficients.

True

T/F Optimization is a way of finding the parameters of our model that minimize the loss function.

True

T/F Perceptron is sensitive to the order in which the samples are presented to the system during learning.

True

T/F The regularization component in an objective function works by keeping the weights small, which prevents any single input dimension from over-influencing the prediction.

True

T/F Ridge Regression uses L2 norm for regularization, while Lasso uses L1.

True

T/F Separability is evaluated by the margin: the larger the margin, the better discriminatory power the classifier is expected to have.

True

T/F Stochastic gradient descent performs less computation per update than batch gradient descent.

True

T/F Linear Regression is a supervised machine learning algorithm

True - Yes, linear regression is a supervised learning algorithm because it uses true labels for training. Supervised learning algorithms require an input variable (x) and an output variable (Y) for each example.

Explain overfitting

Weights grow larger to compensate for noise/probable outliers in the training data.

