CS170 Midterm 2


What are some reasons a decision tree can overfit data?

1) The tree grows too many branches, fitting noise in the training data 2) This results in poor accuracy on unseen data

What are the disadvantages of decision trees?

1) May suffer from overfitting 2) classifies by rectangular partitioning 3) can be large 4) does not handle streaming data easily

What are the properties of a distance measure?

1) Symmetry: D(A,B) = D(B,A) 2) Constancy: D(A,A) = 0 3) Positivity: if D(A,B) = 0 then A == B 4) Triangle inequality: D(A,C) <= D(A,B) + D(B,C)
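These four properties can be checked numerically; a minimal sketch, assuming Euclidean distance as the measure (any true distance metric passes the same checks):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

A, B, C = (0.0, 0.0), (3.0, 4.0), (6.0, 8.0)

# 1) Symmetry: D(A, B) == D(B, A)
assert euclidean(A, B) == euclidean(B, A)
# 2) Constancy: D(A, A) == 0
assert euclidean(A, A) == 0
# 3) Positivity: D(A, B) > 0 for distinct points, so D(A, B) = 0 forces A == B
assert euclidean(A, B) > 0
# 4) Triangle inequality: D(A, C) <= D(A, B) + D(B, C)
assert euclidean(A, C) <= euclidean(A, B) + euclidean(B, C)
```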

What are the 4 major ways that we compare algorithms?

1) Accuracy 2) Speed and scalability 3) Robustness 4) Interpretability

How do we improve accuracy when we are given X features?

1) Remove features so that we are left with only the most informative ones 2) Create new features by combining old features

Describe K Fold Cross Validation

1) Split the data into k groups 2) For each unique group: take the group as the hold-out (test) data set 3) Take the remaining groups as the training data set 4) Fit a model on the training set and evaluate it on the test set 5) Retain the evaluation score and discard the model 6) Repeat steps 2-5 until every group has served as the test set 7) Accuracy = (# of correct classifications) / (# of instances in our dataset)
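The steps above can be sketched from scratch; `fit` and `predict` here are hypothetical placeholders for whatever classifier you plug in:

```python
import random

def k_fold_accuracy(data, labels, k, fit, predict):
    """Estimate accuracy via k-fold cross-validation.
    `fit(X, y)` returns a model; `predict(model, x)` returns a label.
    Both callables are placeholders for any classifier."""
    indices = list(range(len(data)))
    random.seed(0)            # fixed seed so the split is reproducible
    random.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k roughly equal groups
    correct = 0
    for fold in folds:                          # each fold is the hold-out set once
        train = [i for i in indices if i not in fold]
        model = fit([data[i] for i in train], [labels[i] for i in train])
        for i in fold:                          # evaluate on the held-out group
            if predict(model, data[i]) == labels[i]:
                correct += 1
    return correct / len(data)  # correct classifications / instances in dataset
```

Note the model is refit and discarded on every fold; only the counts of correct classifications are retained.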

Nearest neighbor is sensitive to irrelevant features; how can we fix this?

1) Use more training data 2) Ask a domain expert 3) Use statistical tests to determine which features are important and which are not 4) Search over feature subsets (forward selection, backward selection, bidirectional search)
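Forward selection (option 4) can be sketched as a greedy loop; the `score` callable is a hypothetical evaluator supplied by the caller (e.g. cross-validated accuracy of the classifier restricted to that subset):

```python
def forward_selection(features, score):
    """Greedy forward search over feature subsets.
    `features` is a list of feature names; `score(subset)` is a
    placeholder for any subset-quality measure."""
    selected = []
    best_score = score(selected)
    improved = True
    while improved:
        improved = False
        # try adding each remaining feature and keep the best improvement
        candidates = [f for f in features if f not in selected]
        scored = [(score(selected + [f]), f) for f in candidates]
        if scored:
            s, f = max(scored)
            if s > best_score:
                selected.append(f)
                best_score = s
                improved = True
    return selected, best_score
```

Backward selection is the mirror image: start from the full set and greedily drop the feature whose removal helps most.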

How can we estimate the accuracy of a machine learning algo A) By having it guess the label of several instances for which we know the correct label B) Using k-nearest neighbors C) Using cross-fitting D) By folding the data

A, By having it guess the label of several instances for which we know the correct label

Which of the following statements is true about decision tree classifiers? A) Decision trees are prone to overfitting, especially when a tree is particularly deep. B) Decision trees cannot handle categorical features and require all features to be numerical. C) Decision tree classifiers always result in higher prediction accuracy compared to other classification algorithms. D) Decision tree classifiers are not affected by the order or arrangement of features in the dataset.

A, Decision trees are prone to overfitting, especially when a tree is particularly deep

Which of the following statements is TRUE about naive bayes classifier A) Naive Bayes assumes that all features are independent of each other B) Naive Bayes is a non-parametric classification algo C) Naive Bayes is not suitable for text classification task D) Naive Bayes requires a large amount of training data to perform well

A, Naive Bayes assumes that all features are independent of each other

Which statements are TRUE about nearest neighbor classifier (may be more than one) A) It is sensitive to irrelevant features and this can be remedied by using more training data B) It is sensitive to noise, but this can be remedied by using the k-nearest neighbors instead of 1 nearest neighbor C) It is sensitive to irrelevant features and this can be remedied by searching over feature subset D) It is NOT sensitive to noise

A,B,C

Which of the following statements are TRUE about overfitting (may be more than one) A) It happens when we try to reduce the classifier's error too much B) It happens when we use complex models to exactly fit the training data C) It happens when we use huge amounts of training data D) Although overfitting achieves lower error on the training data, it will have higher error on future unseen data

A, B, D

Which statement is FALSE about Naive Bayes A) It is called naive because of the feature independence assumption B) It is sensitive to irrelevant features C) A new instance will be classified based on the probability of it belonging to each class D) It is fast and space efficient

B, it is sensitive to irrelevant features

If a split has an information gain of 0, is this a good question or a bad question?

A bad question: it produces no reduction in entropy, so it tells us nothing about the class

Which of the following is true about linear classifiers? A) Linear classifiers can only separate data that is linearly separable. B) Linear classifiers are not suitable for high-dimensional datasets. C) Linear classifiers can handle nonlinear relationships in the data. D) Linear classifiers always result in overfitting.

C, Linear classifiers can handle nonlinear relationships in the data

Which statement is TRUE about choosing the right classifier for a dataset A) If classifiers X and Y yield the same accuracy on the dataset, we should choose the more complex classifier B) If on a dataset, classifier X yields 99.9% accuracy and classifier Y yields 99% accuracy, we should choose classifier X. C) If classifiers X and Y yield the same accuracy on the dataset, we should choose the simpler classifier. D) The choice of classifier is not important. The only thing that affects the accuracy is feature selection and generation.

C, If classifiers X and Y yield the same accuracy on the dataset, we should choose the simpler classifier

for the nearest neighbor algorithm what function does it use to choose the nearest neighbor?

It can use any distance function (e.g. Euclidean distance) to measure which neighbor is nearest.

How do we select the best attribute to split a mixed node when constructing a decision tree? A) We select the attribute that results in more homogeneous clusters/nodes B) We select the attribute that results in a higher information gain C) We select the attribute that results in a greater reduction of entropy D) All of the above

D, All of the above

When solving a problem using machine learning, what should we mostly focus on? A) Finding the most informative features B) Finding meaningful ways to combine features to get a new feature C) Finding the right classifier D) Both finding the most informative features and finding meaningful ways to combine features to get a new feature

D, Both finding the most informative features and finding meaningful ways to combine features to get a new feature

How can we increase the accuracy of a machine learning algo for solving a problem? A) By choosing the right set of features B) by choosing the right classifier C) by using more training data D) by generating new features out of the existing features E) All of the above

E, All of the above

What are the advantages of decision trees?

Easy to understand and easy to generate

What is information gain?

Expected reduction in entropy
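This definition translates directly into code: gain = entropy of the parent node minus the size-weighted entropy of the children a split produces. A minimal sketch using Shannon entropy in bits:

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def information_gain(parent_labels, child_label_groups):
    """Entropy of the parent minus the size-weighted entropy
    of the children produced by a candidate split."""
    n = len(parent_labels)
    weighted = sum(len(g) / n * entropy(g) for g in child_label_groups)
    return entropy(parent_labels) - weighted

# a perfect split of a 50/50 node gains a full bit
print(information_gain(["+", "+", "-", "-"], [["+", "+"], ["-", "-"]]))  # 1.0
```

A useless split (children as mixed as the parent) gives a gain of 0, which is why a gain of 0 marks a bad question.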

What is pre-pruning?

Halts tree construction early: do not split a node if the split results in a bad goodness measure (below some threshold). The difficulty is choosing this threshold.

How long does it take to construct a linear classifier model

Linear time: O(n), where n is the number of instances in our dataset

What is the time to construct a nearest neighbor classifier model

None; nearest neighbor builds no model ahead of time and does all its work at classification time (a lazy learner)

How long does it take to run a greedy feature selection search?

O(2^n), since there are 2^n - 1 non-empty feature subsets an exhaustive search would have to consider (a greedy search such as forward selection evaluates far fewer subsets in practice)

What is the runtime of K-Means

O(KTN), where N = # of objects, T = # of iterations, K = # of clusters
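A sketch of Lloyd's k-means on 1-D points makes the O(KTN) structure visible: T passes, each assigning all N points to the nearest of K centroids (loop bounds labelled in comments):

```python
import random

def k_means(points, k, iters=10):
    """Lloyd's k-means on a list of 1-D points. Each of `iters` (T)
    passes assigns all N points to the nearest of K centroids,
    giving O(K * T * N) work overall."""
    random.seed(0)                       # reproducible initialization
    centroids = random.sample(points, k)
    for _ in range(iters):                                  # T iterations
        clusters = [[] for _ in range(k)]
        for p in points:                                    # N objects
            nearest = min(range(k),                         # K comparisons
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # recompute each centroid as the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids
```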

How long does it take to use a nearest neighbor classifier

O(n) per query, since it must compare the query against every training instance; not good for large datasets

How do we correct overfitting on a decision tree

Pre-pruning and post-pruning

What is Naive Bayes trying to predict?

The probability of class C given our observation O, i.e. P(C | O)
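A minimal count-based sketch for categorical features: the predicted class is the one maximizing P(C) times the product of P(feature | C), using the independence assumption. The add-one smoothing denominator `nc + 2` assumes binary-valued features, an illustrative simplification:

```python
from collections import Counter, defaultdict

def train_nb(X, y):
    """Count-based Naive Bayes for categorical features.
    Returns class priors and per-(class, position) feature counts."""
    priors = Counter(y)
    counts = defaultdict(Counter)          # counts[(class, position)][value]
    for features, label in zip(X, y):
        for i, v in enumerate(features):
            counts[(label, i)][v] += 1
    return priors, counts

def predict_nb(model, features):
    """Pick the class C maximizing P(C) * prod_i P(feature_i | C),
    with add-one smoothing (assumes 2 possible values per feature)."""
    priors, counts = model
    n = sum(priors.values())
    best, best_p = None, -1.0
    for c, nc in priors.items():
        p = nc / n                                   # prior P(C)
        for i, v in enumerate(features):
            p *= (counts[(c, i)][v] + 1) / (nc + 2)  # smoothed P(v | C)
        if p > best_p:
            best, best_p = c, p
    return best
```

The independence assumption is what lets the likelihood factor into a simple product over features.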

What does an entropy of 1 mean?

The set is maximally mixed (for two classes, a 50/50 split), so it takes the maximum effort to label; this is bad

Nearest neighbor is sensitive to outliers; how do we fix this?

Use the k-NN algorithm: measure the k nearest neighbors instead of the single closest neighbor, and classify our point as the majority class among them. Note: this also makes the algorithm more robust
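The majority-vote idea can be sketched in a few lines (1-D points for simplicity; any distance function works):

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points. `train` is a list of (value, label) pairs."""
    neighbors = sorted(train, key=lambda vl: abs(vl[0] - query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# one mislabelled outlier among the "A"s near 1.1
train = [(1.0, "A"), (1.2, "A"), (1.1, "B"), (5.0, "B"), (5.2, "B")]
```

With k=1 the query 1.1 lands exactly on the outlier and gets "B"; with k=3 the two surrounding "A"s outvote it.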

What is k-fold cross-validation used for?

Used to estimate the accuracy of our classifier

How long does it take to use/test a linear classifier model?

Constant time: O(1) per instance

What is data editing?

Data editing is removing data points from a nearest neighbor classifier's training set; this is done to speed up the algorithm

What is entropy?

The minimum effort needed to label a set of objects (a measure of how mixed the set's classes are)

What does it mean if a dataset is linearly separable?

The classes can be perfectly separated by a linear classifier (a line in 2-D, a hyperplane in general)

What is post-pruning?

Removes branches from a fully grown tree, producing a sequence of progressively pruned trees; a test data set is then used to decide which tree performs best.

