CIS 375 Quiz 2
FALSE (overfitting models details of the training data that are NOT characteristics of the population in general)
Overfitting occurs when a model models details of the training data that are characteristics of the population in general
-Performing multiple splits -Systematic swapping -Computing the mean of multiple estimated performances
What are the three parts that cross-validation includes?
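The three parts above (multiple splits, systematic swapping of the held-out fold, averaging the estimates) can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the helper names `k_fold_indices`, `cross_validate`, `fit_majority`, and `accuracy` are hypothetical, and the toy "model" simply predicts the majority class.

```python
from statistics import mean

def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(xs, ys, k, fit, score):
    """Hold out each fold once, train on the rest, and average the scores."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):      # performing multiple splits
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]  # systematic swapping
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model,
                            [xs[i] for i in held_out],
                            [ys[i] for i in held_out]))
    return mean(scores), scores                      # mean of the estimates

# Hypothetical toy model: always predict the majority class seen in training.
def fit_majority(xs, ys):
    return max(set(ys), key=ys.count)

def accuracy(model, xs, ys):
    return sum(1 for y in ys if y == model) / len(ys)

xs = list(range(10))
ys = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
avg, per_fold = cross_validate(xs, ys, 5, fit_majority, accuracy)
print(avg, per_fold)
```

The spread of the per-fold scores is the reason a single train/test split should not be trusted on its own.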
Generalization performance vs model complexity
A fitting graph plots
TRUE
A learning curve shows the generalization performance plotted against the amount of training data used.
FALSE
A linear perceptron is a nonlinear model
True
A linear regression model partitions the instance space into similar regions by a 'straight line with a slope' decision boundary. True/False
Functional Form
Nonlinear models can fail because it is very difficult to specify their ______________(equation).
What is an ROC curve used for
A receiver operating characteristic (ROC) curve is used to evaluate the properties of a diagnostic test. It is a plot of the true-positive rate (sensitivity) on the y-axis against the false-positive rate (1 - specificity) on the x-axis. The area under the ROC curve (AUC) is calculated and compared to 1. Values closest to 1 indicate a good diagnostic test.
false
According to the following profit curves, Classifier 2 is the best when you have to target 60% of the total customer base.
False. Support-vector machines are more robust to outlying examples than logistic regression. The logistic regression line moves considerably more in response to a new outlying example, so logistic regression appears more prone to overfitting than support-vector machines.
After we have added one new, hypothetical instance (star in right chart), we can conclude that support vector machines appear to be more prone to over-fitting than logistic regression.
Sigmoids
By combining (adding) the weighted _____________ from the hidden layer neurons, any shape can be modeled.
Sparseness
Nonparametric regression models can fail because of the relative ____________ of data in higher dimensions.
15 Nodes. Pick where you see the lines begin to diverge.
Choosing Node Size Tree Classification Model, Accuracy (Y) vs Node #(X)
Obtaining New Data
Cross Validation DOES NOT INCLUDE
-can estimate class membership probability. -are based on supervised learning. -can be applied when the data are not linearly separable.
Which statements describe the support vector machine? (3 of them)
TRUE
Each point in a ROC curve is based on a DIFFERENT threshold which produces a DIFFERENT classifier and corresponds to a DIFFERENT confusion matrix.
True
Each unit of a given layer in a multi-layer neural network model applies a simple model (e.g., logistic regression) to the outputs of the previous layer.
true
For logistic regression, the model can produce a numeric estimate such as probability, odds or log-odds estimates. True/False
-Receives inputs from other sources. -Combines them in some way. -Performs a generally nonlinear operation on the result. -Outputs the final result.
The fundamental processing element of a neural network is a neuron, which:
Scoring
Neural networks are one of the fastest _______ models.
True
Given enough neurons and time, a neural network can model any I/O relationship, to any degree of precision
We penalize a training point for being on the wrong side of the decision boundary (i.e., heavily penalizing false positives and false negatives).
How would you handle points falling on the "wrong side" of the discrimination boundary?
True
In a neural network model, the function describing the input-output relationship does not need to be specified or even fully understood.
TRUE
In the confusion matrix, a false negative occurs when a classifier predicts an instance as negative when it is a positive
true
In the diagram below, we added a 'regression' node prior to running a 'neural network' node to select inputs for the neural network.
False
Instead of simply computing the frequency, we often use a "smoothed" version of the frequency based estimate and its goal is to moderate the influences of leaves with only a LARGE number of instances. True/False
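A small sketch of why smoothing matters for leaves with only a FEW instances. The card does not name a specific smoothing method; the Laplace correction used here, (n + 1) / (n + m + 2) for a binary class, is one common choice and is an assumption, as are the function names.

```python
def frequency_estimate(n_pos, n_neg):
    """Raw frequency-based class probability at a leaf."""
    return n_pos / (n_pos + n_neg)

def laplace_estimate(n_pos, n_neg):
    """Laplace-corrected estimate for a binary class (assumed smoothing)."""
    return (n_pos + 1) / (n_pos + n_neg + 2)

# A leaf with only 2 instances, both positive: the raw estimate is an
# overconfident 1.0, while the smoothed estimate is pulled toward 0.5.
small_raw = frequency_estimate(2, 0)
small_smooth = laplace_estimate(2, 0)

# A leaf with 20 positive instances: smoothing barely changes the estimate.
large_raw = frequency_estimate(20, 0)
large_smooth = laplace_estimate(20, 0)

print(small_raw, small_smooth, large_raw, large_smooth)
```

The correction shrinks as the leaf grows, which is why the statement is false as worded: the goal is to moderate leaves with a small number of instances.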
GiftCnt36 = 1.059. This means that for each additional donation over the last 3 years, the odds of someone donating change by a factor of 1.059, i.e., increase by 5.9%.
Interpret odds ratio estimate of GiftCnt36
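The arithmetic behind that interpretation, as a short sketch. Only the 1.059 value comes from the card; the baseline odds of 1.0 and the helper names are hypothetical, chosen purely for illustration.

```python
import math

odds_ratio = 1.059                      # odds ratio estimate for GiftCnt36 (from the card)
coefficient = math.log(odds_ratio)      # implied logistic regression coefficient (log-odds)
pct_change = (odds_ratio - 1) * 100     # percent change in odds per additional gift

def odds_after(base_odds, extra_gifts):
    """Odds after some number of additional gifts; the effect is multiplicative."""
    return base_odds * odds_ratio ** extra_gifts

def odds_to_probability(odds):
    return odds / (1 + odds)

# Hypothetical baseline odds of 1.0 (a 50% probability of donating).
two_more = odds_after(1.0, 2)
print(round(pct_change, 1), round(two_more, 6), round(odds_to_probability(two_more), 4))
```

Note that the 5.9% applies to the odds, not to the probability, and it compounds: two additional gifts multiply the odds by 1.059 squared.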
FALSE
Linear regression, logistic regression, and support vector machines are all very similar instances of the basic fundamental data mining techniques because they use the same objective function.
true
Logistic regression is a class probability estimation model and NOT a linear regression model.
How much penalty should be assigned to an instance based on its error (i.e., its distance from the separation boundary). Support-vector machines use "hinge loss".
Loss function for "error penalty" determines what?
False
Neural network is a set of connected input/output units (nodes), where each "unit (or node)" has a weight associated with it.
False
Pruning is a technique for increasing the complexity of a model. True/False
numerical target variable
Regression is distinguished from classification by?
We used logistic regression to select inputs for the neural network
SAS Enterprise Miner: Adding Regression nodes BEFORE running a neural network because:
False (not a validation data point but a training data point)
Support vector machines penalize a validation data point for being on the wrong side of the decision boundary and beyond the margin.
True
Support-Vector Machines (SVMs) approach classification problems by finding the widest possible bar that fits between points of two different classes. True/False
Perceptron
The _____________ is a pattern-recognition device, initially designed for optical character recognition.
true
The following figure presents the logistic regression output using donation data. For GiftCnt36, the odds ratio estimate equals 1.059. This means that for each additional donation in the past 36 months, the odds of donation during the 97NK campaign change by a factor of 1.059, a 5.9% increase.
False
The hinge loss used in support vector machines only becomes POSITIVE when an instance is on the WRONG SIDE of the boundary and within the margin. True/False
FALSE
The lift of a classifier means the relative advantage the classifier provides over a logistic regression model.
Hardly. A single estimate is likely to be susceptible to chance occurrences.
Then should we be confident about this single estimate of model accuracy?
Inputs
Traditional nonlinear methods are limited with respect to the number of _______ that they can consider.
+ Start with a cutoff (threshold value) at which we never predict the positive class (e.g., 0.999) + Sort the test set by the model predictions + Decrease the cutoff; after each step, count the number of true positives TP (positives with a prediction above the cutoff) and false positives FP (negatives with a prediction above the cutoff) + Calculate the TP rate (TP/P) and the FP rate (FP/N)
What are the steps in generating a ROC curve? (not in order)
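The steps above, put in order, can be sketched as follows. This is a minimal illustration that assumes all prediction scores are distinct; the function names and the six-instance test set are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FP rate, TP rate) points as the cutoff is lowered past each instance.

    labels are 1 (positive) or 0 (negative); scores are assumed to be distinct.
    """
    P = sum(labels)                  # total positives
    N = len(labels) - P              # total negatives
    # Sort the test set by model prediction, highest score first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]            # cutoff so high that nothing is predicted positive
    for _, label in ranked:          # decrease the cutoff one instance at a time
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / N, tp / P))   # FP rate = FP/N, TP rate = TP/P
    return points

def auc(points):
    """Area under the curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]   # hypothetical model predictions
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
print(points, auc(points))
```

Each point corresponds to one cutoff, and therefore to one classifier and one confusion matrix.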
Pruning is a technique in machine learning that reduces the size of decision trees by removing sections of the tree that provide little power to classify instances. Pruning reduces the complexity of the final classifier, and hence improves predictive accuracy by the reduction of overfitting.
What is pruning?
1. Propagate the input activation forward through the network and calculate the output error. 2. Back propagate the error, and adjust the weights as you go. If not converged, then go to Step 1
What is the Generalized Delta Rule?
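A minimal sketch of the two steps (forward pass, then backpropagating the error and adjusting weights), repeated until convergence. Everything specific here is an assumption for illustration: a 2-2-1 network of sigmoid units, squared error, a learning rate of 0.5, fixed starting weights, and the OR function as training data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Step 1: propagate the input forward. Each weight row is [bias, w1, w2]."""
    hidden = [sigmoid(row[0] + sum(w * xi for w, xi in zip(row[1:], x)))
              for row in w_hidden]
    output = sigmoid(w_out[0] + sum(v * h for v, h in zip(w_out[1:], hidden)))
    return hidden, output

def train_step(x, target, w_hidden, w_out, lr):
    """Step 2: backpropagate the output error and adjust the weights in place."""
    hidden, output = forward(x, w_hidden, w_out)
    delta_out = (target - output) * output * (1.0 - output)
    # Hidden deltas use the output weights from BEFORE this step's update.
    delta_hidden = [delta_out * w_out[j + 1] * h * (1.0 - h)
                    for j, h in enumerate(hidden)]
    w_out[0] += lr * delta_out
    for j, h in enumerate(hidden):
        w_out[j + 1] += lr * delta_out * h
    for j, row in enumerate(w_hidden):
        row[0] += lr * delta_hidden[j]
        for i, xi in enumerate(x):
            row[i + 1] += lr * delta_hidden[j] * xi

def total_error(data, w_hidden, w_out):
    return sum(0.5 * (t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in data)

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]   # OR function
w_hidden = [[0.1, 0.2, -0.1], [-0.1, 0.1, 0.3]]               # arbitrary start
w_out = [0.0, 0.2, -0.2]

before = total_error(data, w_hidden, w_out)
for epoch in range(500):
    for x, t in data:
        train_step(x, t, w_hidden, w_out, lr=0.5)
    if total_error(data, w_hidden, w_out) < 0.01:   # "if not converged, go to Step 1"
        break
after = total_error(data, w_hidden, w_out)
print(before, after)
```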
Choosing the line to minimize the margin between two classes
Which of the following does NOT describe support vector machine?
Calculate the TP rate (TP/P) and the FP rate (FP/N)
Which of the following is the LAST step in generating a ROC curve?
The hinge loss used by Support-Vector Machines gives zero weight to these points while the log-loss used by logistic regression gives a little bit of weight to these points
Why would a support-vector machine's decision boundary be unaffected by a point that is correctly classified and distant from the decision boundary, while the one learned by logistic regression is affected?
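A quick numeric comparison of the two loss functions. This sketch assumes the common textbook formulations, hinge loss max(0, 1 - m) and logistic loss log(1 + exp(-m)), where m = y * f(x) is the signed margin with y in {-1, +1}; the course may scale or define the margin differently.

```python
import math

def hinge_loss(margin):
    """Hinge loss on the signed margin m = y * f(x) (assumed formulation)."""
    return max(0.0, 1.0 - margin)

def log_loss(margin):
    """Logistic loss on the same signed margin (assumed formulation)."""
    return math.log(1.0 + math.exp(-margin))

# A correctly classified point far from the boundary (large positive margin):
far_hinge = hinge_loss(5.0)   # exactly zero, so the point does not pull on the SVM boundary
far_log = log_loss(5.0)       # small but positive, so logistic regression still feels it

# A misclassified point (negative margin) is penalized by both losses.
wrong_hinge = hinge_loss(-1.0)
wrong_log = log_loss(-1.0)

print(far_hinge, far_log, wrong_hinge, wrong_log)
```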
15 nodes
Your assistant ran tree classification models on a given data set and reported the following plot to you. Your job is to determine the number of nodes to include in the tree model. You should choose:
What is cross-validation?
Cross-validation is a model validation technique for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction and one wants to estimate how accurately a predictive model will perform in practice. In a prediction problem, a model is usually given a dataset of known data on which training is run (training dataset) and a dataset of unknown data (or first-seen data) against which the model is tested (testing dataset). The goals of cross-validation are to: - define a dataset to "test" the model in the training phase (i.e., the validation dataset); - limit problems like overfitting; - give insight into how the model will generalize to an independent dataset (i.e., an unknown dataset, for instance from a real problem).