CS 176 LEC 05 PART 2
Error rate
: proportion of errors made over the whole set of instances
ROC (Receiver Operating Characteristic)
A graphical approach for displaying the trade-off between detection rate and false alarm rate
stratification
Ensures that each class is represented with approximately equal proportions in both subsets
• First step: split the data into k subsets of equal size
• Second step: use each subset in turn for testing, the remainder for training
This means the learning algorithm is applied to k different training sets
How do you do K-fold cross validation? (2 steps)
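A minimal sketch of these two steps (a sketch only; the make_model factory, the fit/predict interface, and k = 5 are assumptions for illustration, not part of the card):

```python
import numpy as np

def k_fold_cv(X, y, make_model, k=5, seed=0):
    """Step 1: split data into k subsets of (roughly) equal size.
    Step 2: use each subset in turn for testing, the rest for training."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))        # shuffle before splitting
    folds = np.array_split(indices, k)       # k subsets of (nearly) equal size
    error_rates = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = make_model()                 # fresh model for each of the k training sets
        model.fit(X[train_idx], y[train_idx])
        error_rates.append(np.mean(model.predict(X[test_idx]) != y[test_idx]))
    # the error rates from the k iterations are averaged (see the "overall error rate" card)
    return float(np.mean(error_rates))
```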
Set the number of folds to the number of training instances
How does leave-one-out cross validation occur?
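Using the hypothetical k_fold_cv sketch above (X, y, and make_model assumed as before), leave-one-out is then a single call:

```python
# leave-one-out: number of folds = number of training instances
loo_error = k_fold_cv(X, y, make_model, k=len(X))
```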
• Split the data into a smaller "training" set and a validation set (normally, the data is shuffled first)
• Build models using different values of k on the new, smaller training set and evaluate them on the validation set
• Pick the best value of k and rebuild the model on the full original training set
How to get a useful estimate of performance for different parameter values so that we can choose a value?
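A sketch of this procedure for tuning k in k-nearest neighbors; scikit-learn's KNeighborsClassifier, the 80/20 split, and the candidate values are illustrative assumptions:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def tune_k_and_rebuild(X_train, y_train, candidates=(1, 3, 5, 7, 9), seed=0):
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X_train))   # shuffle the data first
    cut = int(0.8 * len(perm))
    tr, val = perm[:cut], perm[cut:]       # smaller "training" set + validation set
    best_k, best_acc = None, -1.0
    for k in candidates:                   # build models using different values of k
        model = KNeighborsClassifier(n_neighbors=k)
        model.fit(X_train[tr], y_train[tr])
        acc = model.score(X_train[val], y_train[val])
        if acc > best_acc:
            best_k, best_acc = k, acc
    # rebuild the model with the best k on the full original training set
    return KNeighborsClassifier(n_neighbors=best_k).fit(X_train, y_train)
```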
K-fold cross-validation
________ avoids overlapping test sets
stratification
In cross-validation, ___________ reduces the estimate's variance
error rate
Natural performance measure for classification problems
Training Set, Test Set, Validation Set
Proper procedure uses three sets. What are these three sets?
• Collect probabilities for instances in test folds
• Sort instances according to probabilities
Simple method of getting a ROC curve using cross validation:
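A minimal sketch of turning the collected, sorted probabilities into ROC points; the function name and 0/1 label encoding are assumptions, and probability ties are ignored for simplicity:

```python
import numpy as np

def roc_points(probs, labels):
    """probs: P(positive) collected for instances in the test folds;
    labels: true classes, 1 = positive, 0 = negative."""
    order = np.argsort(-np.asarray(probs))  # sort by decreasing probability
    labels = np.asarray(labels)[order]
    P = labels.sum()                        # total positives
    N = len(labels) - P                     # total negatives
    tp = fp = 0
    points = [(0.0, 0.0)]                   # (false alarm rate, detection rate)
    for lab in labels:
        if lab == 1:
            tp += 1                         # one more detection
        else:
            fp += 1                         # one more false alarm
        points.append((fp / N, tp / P))
    return points
```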
• Resampling of instances according to costs
• Weighting of instances according to costs
Simple methods for cost-sensitive learning include?
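A sketch of the resampling option, assuming a hypothetical per-instance cost array; instances with higher misclassification cost are drawn more often:

```python
import numpy as np

def resample_by_cost(X, y, cost, seed=0):
    cost = np.asarray(cost, dtype=float)
    p = cost / cost.sum()                   # sampling probability proportional to cost
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=len(X), replace=True, p=p)
    return X[idx], y[idx]                   # train on this cost-biased resample
```

The weighting alternative instead passes the costs as per-instance weights to a learner that supports them (for example, the sample_weight argument accepted by many scikit-learn estimators).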
significance tests
Statistical reliability of estimated differences in performance
FALSE. CV uses sampling without replacement.
TRUE or FALSE: CV uses sampling with replacement
True
TRUE or FALSE: Error on the training data is not a good indicator of performance on future data
True
TRUE or FALSE: Holdout estimate can be made more reliable by repeating the process with different subsamples
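A sketch of that repetition; the 10 repetitions, 30% test fraction, and make_model factory are assumptions:

```python
import numpy as np

def repeated_holdout_error(X, y, make_model, reps=10, test_frac=0.3):
    errs = []
    for seed in range(reps):
        rng = np.random.default_rng(seed)   # different random subsample each time
        perm = rng.permutation(len(X))
        cut = int(len(X) * (1 - test_frac))
        tr, te = perm[:cut], perm[cut:]
        model = make_model()
        model.fit(X[tr], y[tr])
        errs.append(np.mean(model.predict(X[te]) != y[te]))
    return float(np.mean(errs))             # averaging makes the estimate more reliable
```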
False. Rules for the rare class are given higher priority.
TRUE or FALSE: In class-based ordering, rules for the rare class have lower priority
False. Test data must not be used for parameter tuning; use separate validation data instead.
TRUE or FALSE: Test data can be used for parameter tuning.
True
TRUE or FALSE: The larger the test data, the more accurate the error estimate
False. The larger the training data, the better the classifier.
TRUE or FALSE: The smaller the training data, the better the classifier.
bootstrap
The __________ uses sampling with replacement to form the training set
• Sample a dataset of n instances n times with replacement to form a new dataset of n instances
• Use this data as the training set
• Use the instances from the original dataset that do not occur in the new training set for testing
The bootstrap uses sampling with replacement to form the training set (steps)
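A sketch of these three steps; the make_model factory and the error-rate scoring are illustrative assumptions:

```python
import numpy as np

def bootstrap_error(X, y, make_model, seed=0):
    n = len(X)
    rng = np.random.default_rng(seed)
    train_idx = rng.integers(0, n, size=n)            # sample n instances n times, with replacement
    test_idx = np.setdiff1d(np.arange(n), train_idx)  # instances never drawn into the training set
    model = make_model()
    model.fit(X[train_idx], y[train_idx])             # the new n-instance dataset is the training set
    return float(np.mean(model.predict(X[test_idx]) != y[test_idx]))
```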
overall error rate
The error rates on the different iterations are averaged to yield a/an ___________
False. Leave-one-out is computationally very expensive.
TRUE or FALSE: Leave-one-out is extremely cheap
Split data into training and test set
What is a simple solution that can be used if a large amount of (labeled) data is available?
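A one-call sketch using scikit-learn's train_test_split; X, y, and the 30% test fraction are assumptions (stratify=y also gives the stratification described earlier):

```python
from sklearn.model_selection import train_test_split

# hold out 30% for testing; stratify keeps class proportions similar in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)
```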
Stratification is not possible
What is the disadvantage of leave-one-out CV?
inner cross-validations
______ are used to choose hyperparameter values
outer cross-validation
___________ is used to estimate quality of learning process
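A sketch of nested cross-validation with scikit-learn; the k-nearest-neighbors learner, the candidate grid, and the 5-inside-10 fold counts are assumptions:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# inner cross-validations: choose the hyperparameter value on each outer training set
inner = GridSearchCV(KNeighborsClassifier(),
                     param_grid={"n_neighbors": [1, 3, 5, 7, 9]},
                     cv=5)

# outer cross-validation: estimate the quality of the whole learning process
outer_scores = cross_val_score(inner, X, y, cv=10)   # X, y assumed given
print(np.mean(outer_scores))
```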
Validation data
____________ is used to optimize parameters
resubstitution error
error rate obtained by evaluating model on training data
shuffling
The holdout method reserves a certain amount of data for testing and uses the remainder for training, after ____
Test set
independent instances that have played no part in formation of classifier
Success
instance's class is predicted correctly
Error
instance's class is predicted incorrectly
Holdout procedure
method of splitting original data into training and test set
Hyperparameter
parameter that can be tuned to optimize the performance of a learning algorithm
receiver operating characteristic
What does ROC mean?
stratified ten-fold cross-validation
What is the standard method for evaluation?
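A sketch of that standard setup with scikit-learn's StratifiedKFold; X, y, and the choice of classifier are assumptions:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
errs = []
for train_idx, test_idx in skf.split(X, y):  # folds preserve class proportions
    model = KNeighborsClassifier()           # any classifier works here
    model.fit(X[train_idx], y[train_idx])
    errs.append(np.mean(model.predict(X[test_idx]) != y[test_idx]))
print(np.mean(errs))                         # overall error rate estimate
```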