Gradient Descent for Multiple Regression Week 2 Part 2
Of the circumstances below, for which one is feature scaling particularly helpful?
Feature scaling is helpful when one feature has a much larger (or smaller) range of values than another feature.
You are helping a grocery store predict its revenue, and you have data on items sold per week and price per item. What could be a useful engineered feature?
For each product, calculate the number of items sold times price per item.
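A minimal sketch of that engineered feature, using made-up product data (the variable names and numbers are illustrative, not from the course):

```python
# Hypothetical data: per-product items sold per week and price per item
items_sold = [120, 45, 300]
price_per_item = [2.5, 10.0, 1.2]

# Engineered feature: estimated weekly revenue per product
weekly_revenue = [n * p for n, p in zip(items_sold, price_per_item)]
print(weekly_revenue)  # [300.0, 450.0, 360.0]
```

Revenue is what the store actually wants to predict, so the product of the two raw features is more directly informative than either one alone.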
What is a debugging step when choosing a learning rate?
Set the learning rate alpha to a very small value and check that the cost decreases on every iteration of gradient descent.
If you plot the cost function value after each iteration to see how well gradient descent is running, and the cost keeps going up and down, is this a good sign?
No, it is a bad sign: it tells you that gradient descent is not working properly, either because of a bug in the code or because the learning rate is too large.
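A sketch of this debugging step on a toy one-feature problem (the setup and data are assumptions for illustration): record the cost J each iteration so it can be plotted, then check that it decreases every iteration.

```python
# Gradient descent for one-feature linear regression f(x) = w*x + b,
# recording the cost J(w, b) each iteration so a learning curve can be plotted.
def gradient_descent(x, y, alpha, iters):
    w, b = 0.0, 0.0
    m = len(x)
    history = []
    for _ in range(iters):
        preds = [w * xi + b for xi in x]
        # Squared-error cost J(w, b)
        cost = sum((p - yi) ** 2 for p, yi in zip(preds, y)) / (2 * m)
        history.append(cost)
        # Gradients of J with respect to w and b
        dj_dw = sum((p - yi) * xi for p, yi, xi in zip(preds, y, x)) / m
        dj_db = sum(p - yi for p, yi in zip(preds, y)) / m
        w -= alpha * dj_dw
        b -= alpha * dj_db
    return w, b, history

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]  # y = 2x, so w should approach 2
w, b, history = gradient_descent(x, y, alpha=0.1, iters=200)
# With a well-chosen alpha, the cost decreases on every iteration
assert all(h2 <= h1 for h1, h2 in zip(history, history[1:]))
```

If the same check is run with a too-large alpha, the recorded costs bounce up and down or grow, which is exactly the bad sign described above.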
What kind of regression would you use to fit curved data sets to get a model?
Polynomial regression
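A short sketch of polynomial regression, assuming toy data: add powers of x as engineered features, then fit the resulting linear model by least squares (here via NumPy's `lstsq`; the data is illustrative).

```python
import numpy as np

# Curved toy data: y = x^2 + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = x ** 2 + 1.0

# Design matrix with engineered polynomial features [1, x, x^2]
X = np.column_stack([np.ones_like(x), x, x ** 2])

# Solve ordinary least squares for the coefficients
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coeffs)  # approximately [1, 0, 1], recovering y = 1 + 0*x + 1*x^2
```

The model is still linear in its parameters, which is why ordinary linear-regression machinery (including gradient descent) fits it.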
Which of the following is a valid step used during feature scaling?
Subtract the mean (average) from each value and then divide by the (max - min).
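The step above (mean normalization) can be sketched in a few lines; the example feature values are made up:

```python
# Mean normalization: subtract the mean, then divide by the range (max - min)
def mean_normalize(values):
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    return [(v - mean) / value_range for v in values]

sizes = [1000.0, 1500.0, 2000.0]  # e.g. house sizes in square feet
print(mean_normalize(sizes))  # [-0.5, 0.0, 0.5]
```

After this step every feature is centered near zero and spans roughly the same range, which helps gradient descent converge faster.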
True/False? With polynomial regression, the predicted value f_w,b(x) does not necessarily have to be a straight-line (linear) function of the input feature x.
True
You run gradient descent for 15 iterations with alpha = 0.3 and compute J(w) after each iteration. You find that the value of J(w) increases over time. How do you think you should adjust the learning rate alpha?
Try a smaller value of alpha, say alpha = 0.1.
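A tiny illustration of why a smaller alpha helps, on an assumed one-parameter cost J(w) = (w - 3)^2 (not the course's actual exercise): the same gradient step with a large alpha overshoots and J grows, while a smaller alpha makes J shrink.

```python
# Run gradient descent on J(w) = (w - 3)^2, which is minimized at w = 3,
# and record J(w) at each of 15 iterations.
def run(alpha, iters=15):
    w = 0.0
    costs = []
    for _ in range(iters):
        costs.append((w - 3.0) ** 2)
        w -= alpha * 2.0 * (w - 3.0)  # dJ/dw = 2(w - 3)
    return costs

diverging = run(alpha=1.2)   # step overshoots the minimum: J(w) increases
converging = run(alpha=0.1)  # J(w) decreases toward zero
```

Plotting both cost lists side by side reproduces the symptom in the question (J increasing over time) and confirms that lowering alpha fixes it.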
What is feature engineering?
Using intuition to design new features by transforming or combining original features into new ones for your model
If you have measurements for the dimensions of a swimming pool (length, width, height), which of the following two would be a more useful engineered feature: length x width x height, or length + width + height?
length x width x height