Decision Trees and Tree-Based Methods


How do Random Forests differ from bagging, and why do they help reduce variance?

- Random Forests build on bagging by introducing randomness in feature selection: instead of considering all predictors at each split, only a random subset is considered.
- This decorrelates the trees, further lowering variance compared to standard bagging.
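
A minimal sketch of this difference in scikit-learn, assuming a synthetic regression dataset; the max_features argument is what restricts each split to a random subset of predictors:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=20, noise=10.0, random_state=0)

# Bagging: every split may consider all 20 predictors.
bag = BaggingRegressor(DecisionTreeRegressor(), n_estimators=200, random_state=0)

# Random forest: each split considers only a random subset (~p/3 for regression).
rf = RandomForestRegressor(n_estimators=200, max_features=20 // 3, random_state=0)

for name, model in [("bagging", bag), ("random forest", rf)]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV R^2 = {score:.3f}")
```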

What is cost complexity pruning and how does it improve tree models?

Cost complexity pruning reduces the size of a fully grown tree by removing branches that contribute little to predictive accuracy. This is done by introducing a penalty term for tree complexity (a tuning parameter alpha times the number of terminal nodes) and selecting the subtree that minimizes the penalized cost. Pruning helps prevent overfitting by balancing model complexity and performance.
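
A sketch using scikit-learn, which exposes cost complexity pruning through the ccp_alpha parameter; the dataset and the step of 5 through the alpha sequence are illustrative choices:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compute the sequence of effective alphas for the fully grown tree.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)

# Refit with each alpha; larger alphas prune more aggressively (fewer leaves).
for alpha in path.ccp_alphas[::5]:
    tree = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(X_train, y_train)
    print(f"alpha={alpha:.4f}  leaves={tree.get_n_leaves()}  "
          f"test acc={tree.score(X_test, y_test):.3f}")
```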

Can you explain how decision trees partition the predictor space and how this impacts model performance?

Decision trees recursively split the predictor space into distinct, non-overlapping regions. Each split is based on a condition that maximizes homogeneity within each resulting subset. This process yields a hierarchical structure where each terminal node represents a predicted outcome. The granularity of these partitions affects model performance: too many splits can lead to overfitting, while too few may result in underfitting.
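
A small sketch that makes the partitioning visible by printing a fitted tree's split rules (the depth limit of 2 is an illustrative choice):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# A shallow tree: each printed rule is one axis-aligned cut of the predictor
# space, and each leaf is one rectangular region with a single prediction.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=load_iris().feature_names))
```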

What is a Maximal Margin Classifier (MMC), and how is it constructed?

The MMC is a linear classifier that finds the separating hyperplane maximizing the distance (margin) to the closest training observations of either class. It works well for linearly separable data but fails when the classes overlap.
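
scikit-learn has no dedicated MMC estimator, but a linear SVC with a very large penalty approximates the hard margin; a sketch on separable synthetic data:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated blobs, so a maximal margin hyperplane exists.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=0.5, random_state=0)

# A very large C leaves essentially no budget for margin violations,
# approximating the maximal margin (hard margin) classifier.
mmc = SVC(kernel="linear", C=1e6).fit(X, y)
print("support vectors per class:", mmc.n_support_)  # few points define the margin
print("hyperplane:", mmc.coef_, mmc.intercept_)
```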

How are splits determined in decision trees?

Splits are chosen by evaluating each possible division of a predictor variable and selecting the one that minimizes an impurity measure (e.g., variance/RSS for regression trees; Gini index or entropy for classification trees).
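
A from-scratch sketch of this search for one numeric predictor in a classification setting, using the Gini index (function names and the toy data are illustrative):

```python
import numpy as np

def gini(labels):
    """Gini index of a set of class labels: 1 - sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Scan candidate thresholds and return the one minimizing weighted Gini."""
    best_t, best_impurity = None, np.inf
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        # Weight each child's impurity by its share of the observations.
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if impurity < best_impurity:
            best_t, best_impurity = t, impurity
    return best_t, best_impurity

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0, 0, 0, 1, 1, 1])
print(best_split(x, y))  # splits cleanly at 3.0 with impurity 0.0
```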

Can you describe how bagging improves tree performance? What trade-offs does it introduce?

Bagging (bootstrap aggregating) generates multiple decision trees using different bootstrap samples from the training data. The final prediction is made by averaging (for regression) or majority voting (for classification).
- Advantages: reduces variance and improves stability.
- Trade-offs: the model becomes less interpretable, since multiple trees are used instead of a single tree.
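
A minimal sketch with scikit-learn's BaggingClassifier; the dataset choice is illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# A single deep tree (high variance) vs. 200 bagged trees (variance averaged away).
single = DecisionTreeClassifier(random_state=0)
bagged = BaggingClassifier(DecisionTreeClassifier(), n_estimators=200, random_state=0)

print("single tree:", cross_val_score(single, X, y, cv=5).mean())
print("bagged trees:", cross_val_score(bagged, X, y, cv=5).mean())
```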

How does the penalty parameter (budget parameter) affect the margin width, bias, and variance in SVC?

- A high penalty (C) results in a narrower margin, leading to low bias but high variance (risk of overfitting).
- A low penalty results in a wider margin, increasing bias but reducing variance.
- Note the convention: here C is a penalty on margin violations, as in most software. In the budget formulation, C is instead a budget for violations, so the direction of the relationship is reversed.
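
A sketch of this trade-off, sweeping C on overlapping synthetic data and counting support vectors (a rough proxy for margin width, since a wider margin captures more points):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Overlapping classes, so the margin must tolerate some violations.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=0)

for C in [0.01, 1.0, 100.0]:
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Small C -> wide soft margin -> many support vectors (high bias, low variance).
    # Large C -> narrow margin -> few support vectors (low bias, high variance).
    print(f"C={C:<6} support vectors={clf.n_support_.sum()}")
```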

What are BART trees, and how do they combine elements of averaging and boosting?

- Bayesian Additive Regression Trees (BART) combine aspects of boosting and averaging. They fit multiple shallow trees that collectively approximate the true function.
- Unlike traditional boosting, BART incorporates Bayesian priors to regularize predictions and prevent overfitting.
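
BART is not implemented in scikit-learn; one option is the third-party pymc-bart package. The sketch below assumes that package's BART random variable; the exact API is an assumption and may differ across versions:

```python
import numpy as np
import pymc as pm
import pymc_bart as pmb  # assumed third-party package: pip install pymc-bart

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)

with pm.Model() as model:
    # A sum of 50 shallow trees with regularizing priors over tree structure.
    mu = pmb.BART("mu", X=X, Y=y, m=50)
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)
    idata = pm.sample(random_seed=0)  # posterior over the sum-of-trees function
```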

What is boosting, and how is it different from Random Forests and bagging?

- Boosting builds trees sequentially, where each tree corrects the errors of the previous ones by focusing more on misclassified observations. The final model is a weighted combination of weak learners.
- Difference from bagging: instead of independent trees, boosting sequentially adjusts weak learners to improve performance.
- Difference from Random Forests: boosting focuses on reducing bias, while Random Forests aim to reduce variance.
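
A minimal sketch of the sequential idea with scikit-learn's gradient boosting, using shallow trees as weak learners (hyperparameters are illustrative):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 300 depth-2 trees is fit to the residual errors of the current
# model, then added with a small learning rate: the sequential correction above.
boost = GradientBoostingRegressor(n_estimators=300, max_depth=2,
                                  learning_rate=0.1, random_state=0)
boost.fit(X_train, y_train)
print("test R^2:", boost.score(X_test, y_test))
```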

What are the advantages and disadvantages of different tree-based approaches? (Decision trees, bagging, random forests, boosting, BART)

- Decision Trees: simple and interpretable but prone to overfitting.
- Bagging: reduces variance but loses interpretability.
- Random Forests: further reduce variance but are computationally expensive.
- Boosting: improves accuracy but can be sensitive to noise.
- BART: combines strengths of boosting and Bayesian regularization but requires more computational resources.
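
A quick side-by-side sketch comparing these approaches under cross-validation (minus BART, which has no scikit-learn implementation); the dataset and settings are illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(n_estimators=100, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "boosting": GradientBoostingClassifier(random_state=0),
}
for name, model in models.items():
    print(f"{name:>14}: {cross_val_score(model, X, y, cv=5).mean():.3f}")
```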

How can Support Vector approaches be extended to handle multi-class classification problems?

- One-vs-One (OvO): trains a separate classifier for each pair of classes.
- One-vs-All (OvA): trains one classifier per class, distinguishing it from all others.
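
Both schemes are available as scikit-learn meta-estimators; a sketch on a three-class dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # 3 classes

# OvO fits one SVC per pair of classes: 3 choose 2 = 3 classifiers here.
ovo = OneVsOneClassifier(SVC(kernel="linear"))
# OvA (called one-vs-rest in scikit-learn) fits one SVC per class: 3 classifiers.
ova = OneVsRestClassifier(SVC(kernel="linear"))

print("OvO:", cross_val_score(ovo, X, y, cv=5).mean())
print("OvA:", cross_val_score(ova, X, y, cv=5).mean())
```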

How do Support Vector Machines (SVMs) use kernels to create non-linear decision boundaries?

- SVMs use kernel functions to map data into higher-dimensional spaces where a linear separator exists.
- Common kernels include polynomial, radial basis function (RBF), and sigmoid kernels.
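
A sketch on data that is not linearly separable in its original two dimensions (concentric circles), where a non-linear kernel succeeds:

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# One class inside the other: no straight line can separate them.
X, y = make_circles(n_samples=300, factor=0.4, noise=0.1, random_state=0)

for kernel in ["linear", "poly", "rbf"]:
    score = cross_val_score(SVC(kernel=kernel), X, y, cv=5).mean()
    print(f"{kernel:>6}: {score:.3f}")
```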

For classification trees, what are the Gini index and entropy, and how do they help increase node purity?

- The Gini index measures the probability of incorrectly classifying a randomly chosen observation from a node. A lower Gini value means a purer node.
- Entropy measures the level of disorder in a node. Lower entropy indicates higher purity.
- Both metrics guide the tree toward splits that result in more homogeneous (pure) nodes, improving classification accuracy.
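
A small sketch computing both measures from a node's class proportions:

```python
import numpy as np

def gini(p):
    """Gini index: sum over classes of p_k * (1 - p_k)."""
    return float(np.sum(p * (1 - p)))

def entropy(p):
    """Entropy: -sum over classes of p_k * log2(p_k), with 0*log(0) taken as 0."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

pure = np.array([1.0, 0.0])    # all observations in one class
mixed = np.array([0.5, 0.5])   # worst case for two classes

print(gini(pure), entropy(pure))    # 0.0 0.0 -> perfectly pure node
print(gini(mixed), entropy(mixed))  # 0.5 1.0 -> maximally impure node
```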

What are some advantages and disadvantages of using decision trees compared to other models?

Advantages:
- Easy to interpret and visualize
- Handles both numerical and categorical data
- Requires little preprocessing
- Can model non-linear relationships

Disadvantages:
- Prone to overfitting without pruning
- Sensitive to small changes in the data (high variance)
- Less accurate compared to ensemble methods like random forests and boosting

What is Out-of-Bag (OOB) error, and how does it provide an estimate of test error in bagged trees and Random Forests?

OOB error is calculated using the observations that were not included in a given bootstrap sample (on average, about one-third of the training data for each tree). Since each observation is out-of-bag for a subset of the trees, its prediction can be formed from only those trees, and aggregating these predictions gives a valid estimate of the model's test error without needing a separate validation set.
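
scikit-learn exposes this via oob_score=True; a sketch on a synthetic regression problem:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)

# Each tree ignores roughly 1/3 of the rows; those rows score it "for free".
rf = RandomForestRegressor(n_estimators=300, oob_score=True, random_state=0)
rf.fit(X, y)
print("OOB R^2 estimate:", rf.oob_score_)  # no held-out validation set needed
```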

How is the Support Vector Classifier (SVC) an extension of MMC for non-separable cases?

The SVC introduces a soft margin, allowing some observations to violate the margin (or even be misclassified) by incorporating a penalty term. This makes it suitable for datasets that are not perfectly separable.
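
A sketch on overlapping data, where a hard margin would have no solution but the soft margin SVC fits without trouble:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# cluster_std=2.5 makes the classes overlap: no separating hyperplane exists.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

svc = SVC(kernel="linear", C=1.0).fit(X, y)
errors = np.sum(svc.predict(X) != y)
print(f"training points misclassified under the soft margin: {errors}")
print(f"training accuracy: {svc.score(X, y):.3f}")
```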

