Quiz 6: Completion of Logistic Regression (Jupyter notebook lectures)
In the cost function formula, all the h values are between 0 and 1, so the logs will be negative. That is the reason for the factor of ________ applied to the sum of the two loss terms: to convert it to a positive loss value.
-1
How do you implement predict_tweet, which predicts whether a tweet is positive or negative?
- Given a tweet, process it, then extract the features.
- Apply the model's learned weights to the features to get the logits.
- Apply the sigmoid to the logits to get the prediction (a value between 0 and 1).
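The steps above can be sketched as follows. The feature layout (a bias term plus positive/negative word counts) and the toy weights here are illustrative assumptions, not the assignment's trained values:

```python
import numpy as np

def sigmoid(z):
    # Map logits to a value in (0, 1)
    return 1 / (1 + np.exp(-z))

def predict_tweet(x, theta):
    """Sketch: x is a 1x3 feature row [bias, pos_count, neg_count],
    theta is a 3x1 weight column. Returns a probability in (0, 1)."""
    return sigmoid(np.dot(x, theta))  # logits -> sigmoid -> prediction

x = np.array([[1.0, 3.0, 1.0]])           # bias term + positive/negative counts
theta = np.array([[0.0], [0.5], [-0.5]])  # toy weights, not learned
print(predict_tweet(x, theta))            # probability that the tweet is positive
```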
The build_freqs function in the utils.py file builds a specific dictionary based on the corpus of tweets. Which of the following statements correctly describes this function?
- It counts how often a word in the corpus was associated with a positive label '1' or a negative label '0'.
- It builds a dictionary where each key is a (word, label) tuple, and the value is the count of its frequency within the corpus of tweets.
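A minimal sketch of that counting idea follows; the real build_freqs in utils.py runs each tweet through the course's process_tweet first, while this version just splits on whitespace for illustration:

```python
def build_freqs(tweets, labels):
    """Sketch: keys are (word, label) pairs, values are how often
    that word appears in tweets with that label."""
    freqs = {}
    for tweet, y in zip(tweets, labels):
        for word in tweet.split():  # the real function tokenizes/stems first
            pair = (word, y)
            freqs[pair] = freqs.get(pair, 0) + 1
    return freqs

freqs = build_freqs(["happy happy day", "sad day"], [1.0, 0.0])
print(freqs[("happy", 1.0)])  # 2
print(freqs[("day", 0.0)])    # 1
```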
When performing logistic regression for sentiment analysis using the method taught in the Jupyter lab files, you have to:
1. Create a dictionary that maps each (word, class) pair to the number of times that word is found in that class.
2. Perform data processing.
3. For each tweet, create a positive feature with the sum of the positive counts of each word in that tweet, and a negative feature with the sum of the negative counts of each word in that tweet.
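Step 3 above can be sketched like this; the (1, 3) feature shape with a leading bias term follows the course's convention, but the toy freqs dictionary and whitespace tokenization are illustrative assumptions:

```python
import numpy as np

def extract_features(tweet, freqs):
    """Sketch: build [bias, sum of positive counts, sum of negative counts]
    for the words in one tweet. `freqs` maps (word, label) -> count."""
    x = np.zeros((1, 3))
    x[0, 0] = 1.0  # bias term
    for word in tweet.split():  # the real code processes the tweet first
        x[0, 1] += freqs.get((word, 1.0), 0)  # positive feature
        x[0, 2] += freqs.get((word, 0.0), 0)  # negative feature
    return x

freqs = {("happy", 1.0): 12, ("happy", 0.0): 1, ("sad", 0.0): 9}
print(extract_features("happy sad", freqs))  # [[ 1. 12. 10.]]
```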
In the cost function formula, when the model prediction is close to 1, i.e. h(z(θ)) = 0.9999, and the label is 0, the second term of the log loss becomes a ___________ number, which is then multiplied by the overall factor of -1 to convert it to a positive loss value.
Large negative
In the cost function formula, when the model predicts 0 (h(z(θ)) = 0) and the label 'y' is 1, the loss for that training example is a __________ number.
Large positive
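These edge cases can be checked numerically with the per-example log loss, −[y·log(h) + (1−y)·log(1−h)]:

```python
import numpy as np

def log_loss(y, h):
    # Per-example binary cross-entropy, with the overall factor of -1
    return -(y * np.log(h) + (1 - y) * np.log(1 - h))

# Confident and wrong: h close to 1 while y = 0 -> large positive loss
print(log_loss(0, 0.9999))   # ~9.21
# Confident and wrong the other way: h close to 0 while y = 1 -> large positive loss
print(log_loss(1, 0.0001))   # ~9.21
# Confident and right: h close to 1 while y = 1 -> loss near 0
print(log_loss(1, 0.9999))
```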
In the cost function formula defined below, what does 'h(z(θ))^(i)' represent?
The predicted output of the logistic regression model
How do you implement test_logistic_regression? Please select all the correct statements from the options below.
S1: Use your 'predict_tweet' function to make predictions on each tweet in the test set.
S2: Given the test data and the weights of your trained model, calculate the accuracy of your logistic regression model.
S3: If the prediction is > 0.5, set the model's classification 'y_hat' to 1; otherwise set it to 0.
S4: A prediction is accurate when y_hat equals test_y. Sum up all the instances when they are equal and divide by m.
In the context of logistic regression, we use gradient descent to iteratively improve our model's predictions. Based on the following formulas and descriptions, answer the question below: The gradient of the cost function J with respect to one of the weights θj is: (refer to the image) To update the weight θj, we adjust it by subtracting a fraction of the gradient determined by α: θj=θj−α×∇θjJ(θ) What does 'h' represent in the formula for the gradient of the cost function?
The predicted output of the logistic regression model
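One update step can be sketched with the standard vectorized gradient for logistic regression, (1/m)·Xᵀ(h − y), where h is the predicted output described above; the toy data is an illustrative assumption:

```python
import numpy as np

def gradient_step(X, y, theta, alpha):
    """One gradient-descent update: h = sigmoid(X theta) is the model's
    prediction, and the gradient of J w.r.t. theta is (1/m) * X^T (h - y)."""
    m = X.shape[0]
    h = 1 / (1 + np.exp(-np.dot(X, theta)))  # predicted outputs
    grad = np.dot(X.T, h - y) / m
    return theta - alpha * grad              # theta_j = theta_j - alpha * gradient

X = np.array([[1.0, 2.0], [1.0, -1.0]])  # toy design matrix with bias column
y = np.array([[1.0], [0.0]])
theta = np.zeros((2, 1))
print(gradient_step(X, y, theta, alpha=0.1))
```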
The cost function used for logistic regression is the _______ of the log loss across all training examples
average
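That averaging can be written directly as J(θ) = −(1/m)·Σ[y·log(h) + (1−y)·log(1−h)]; the predictions below are toy values, not from a trained model:

```python
import numpy as np

def cost(y, h):
    """The cost J: the average over all training examples of the log loss,
    with the overall factor of -1 applied."""
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

y = np.array([1.0, 0.0, 1.0])
h = np.array([0.9, 0.1, 0.8])  # toy predictions
print(cost(y, h))
```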
In Python, with z as a scalar variable and h as the sigmoid of z, what is a correct definition of the sigmoid function in Python code?
h = 1/(1+np.exp(-z))
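Because np.exp is elementwise, the same one-liner also works on a whole NumPy array of z values, which is how it is typically used on a batch of logits:

```python
import numpy as np

def sigmoid(z):
    """Works for a scalar z or elementwise on a NumPy array."""
    return 1 / (1 + np.exp(-z))

print(sigmoid(0))                           # 0.5
print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # each value squashed into (0, 1)
```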
The sigmoid function maps the input 'z' to a value between 0 and 1, and so its output can be treated as a _________.
probability
The dot product performs matrix multiplication of a ______ vector with a column vector.
row
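A quick illustration of the shapes involved, with toy values:

```python
import numpy as np

row = np.array([[1.0, 2.0, 3.0]])      # shape (1, 3): a row vector
col = np.array([[4.0], [5.0], [6.0]])  # shape (3, 1): a column vector
print(np.dot(row, col))                # [[32.]] -> 1*4 + 2*5 + 3*6
```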
The extract_features function takes in a _______ tweet.
single
The learning rate α is a value that we choose to control how big a single ______ will be.
update
When training logistic regression, you have to perform the following operations in the desired order.
Initialize parameters, classify/predict, get gradient, update, get loss, repeat
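That loop can be sketched end to end; the toy dataset and hyperparameters are illustrative assumptions:

```python
import numpy as np

def train(X, y, alpha=0.1, num_iters=100):
    """Sketch of the training loop: initialize parameters, then repeatedly
    classify/predict, get the gradient, update, and get the loss."""
    m = X.shape[0]
    theta = np.zeros((X.shape[1], 1))                          # initialize parameters
    for _ in range(num_iters):
        h = 1 / (1 + np.exp(-np.dot(X, theta)))                # classify/predict
        grad = np.dot(X.T, h - y) / m                          # get gradient
        theta = theta - alpha * grad                           # update
        J = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))  # get loss
    return J, theta

X = np.array([[1.0, 2.0], [1.0, -2.0], [1.0, 1.5], [1.0, -1.0]])
y = np.array([[1.0], [0.0], [1.0], [0.0]])
J, theta = train(X, y)
print(J)  # the cost falls well below the initial log(2) on this toy set
```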
What is log loss in the context of logistic regression?
It's a measure of how well the logistic regression model fits the training data.
In the cost function formula, when the model prediction is close to 0, i.e. h(z(θ)) = 0.0001, and the label is 1, the first term of the log loss becomes a ___________ number, which is then multiplied by the overall factor of -1 to convert it to a positive loss value.
Large negative
When performing logistic regression on sentiment analysis, you represented each tweet as a vector of ones and zeros. However your model did not work well. Your training cost was reasonable, but your testing cost was just not acceptable. What could be a possible reason?
The vector representations are sparse and therefore it is much harder for your model to learn anything that could generalize well to the test set.
In the context of logistic regression, what is the purpose of the cost function?
To measure how well the model fits the training data
The sigmoid function is used in _________ regression for text classification.
logistic
What are examples of text preprocessing?
- Removing stopwords, punctuation, handles and URLs.
- Lowercasing, the process of changing all capital letters to lower case.
- Stemming, the process of reducing a word to its word stem.
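A minimal sketch of the lowercasing and removal steps follows. This is illustration only: the course's process_tweet uses NLTK's tokenizer, stopword list and PorterStemmer, which are omitted here:

```python
import re

def process_tweet_sketch(tweet):
    """Illustrative preprocessing: lowercase, then strip URLs,
    handles and punctuation (stopwords/stemming omitted)."""
    tweet = tweet.lower()                        # lowercasing
    tweet = re.sub(r"https?://\S+", "", tweet)   # remove URLs
    tweet = re.sub(r"@\w+", "", tweet)           # remove handles
    tweet = re.sub(r"[^\w\s]", "", tweet)        # remove punctuation
    return tweet.split()

print(process_tweet_sketch("@NLP Great course!! https://example.com"))
# ['great', 'course']
```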
What is a good metric that allows you to decide when to stop training/trying to get a good model?
- When your accuracy is good enough on the test set.
- When you plot the cost versus the number of iterations and see that the loss is converging (i.e. no longer changes much).
In the cost function formula, when the model predicts 1 (h(z(θ)) = 1) and the label 'y' is also 1, the loss for that training example is __________.
0
How do you train the model in logistic regression, in the correct order?
1. Stack the features for all training examples into a matrix X.
2. Call `gradientDescent`, which you've implemented.
In the cost function formula defined below, what does 'y^(i)' represent?
Actual label
What role does gradient descent play with regards to the weight in these algorithms?
Gradient descent iteratively adjusts the weights in order to minimize the cost function.