Ensemble Methods Overview
Random Forest
A bagging ensemble of decision trees that trains every tree with the same algorithm but on a different bootstrap subset of the training data, additionally searching only a random subset of features at each split.
AdaBoost
A boosting ensemble method that builds base estimators sequentially, reweighting the training instances so that each new estimator focuses on the instances its predecessors misclassified.
Gradient Boosting
A boosting ensemble method that builds base estimators sequentially, fitting each new predictor to the residual errors (the negative gradient of the loss) left by its predecessor.
RandomForestClassifier
A classifier using a random forest ensemble with a specified number of decision trees.
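A minimal sketch of training a RandomForestClassifier in Scikit-Learn; the make_moons dataset and the parameter values (n_estimators=500, max_leaf_nodes=16) are illustrative assumptions, not from these notes.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative toy dataset (an assumption, not from the notes)
X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A forest of 500 trees, each limited to 16 leaf nodes
rnd_clf = RandomForestClassifier(n_estimators=500, max_leaf_nodes=16, random_state=42)
rnd_clf.fit(X_train, y_train)
print(rnd_clf.score(X_test, y_test))  # test accuracy
```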
Soft Voting
An aggregation method where the class with the highest average probability from base classifiers is predicted.
Hard Voting
An aggregation method where the class with the most votes from base classifiers is predicted.
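A sketch contrasting the two voting schemes with Scikit-Learn's VotingClassifier; the three base classifiers are illustrative assumptions, and swapping voting="soft" for voting="hard" switches from averaging probabilities to a majority vote.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

# Soft voting predicts the class with the highest average probability;
# hard voting (voting="hard") predicts the class with the most votes.
voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression()),
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(probability=True)),  # probability=True is required for soft voting
    ],
    voting="soft",
)
voting_clf.fit(X, y)
```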
Random Forest Model
An ensemble learning method where decision trees are built from bootstrap subsets and the final decision is based on majority voting.
BaggingClassifier
An ensemble method using bootstrap sampling to train multiple base predictors, typically decision trees.
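A sketch of BaggingClassifier in Scikit-Learn; the dataset and the choice of 500 trees trained on 100-instance bootstrap samples are assumptions for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

# 500 decision trees, each trained on a bootstrap sample of 100 instances
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=500,
    max_samples=100,
    bootstrap=True,  # sampling with replacement, i.e. bagging
    random_state=42,
)
bag_clf.fit(X, y)
```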
Weak classifiers
Base learners in AdaBoost, often decision stumps, which are short decision trees with a single split.
Weak Learners
Base predictors in boosting, slightly better than random guessing.
Base Predictors
Base predictors in gradient boosting, added stage by stage.
Boosting
Building base estimators sequentially, each one trained to correct the errors of its predecessor, and combining their predictions by weighted voting or summation.
Meta Classifier
Classifier combining outputs of base-level classifiers to make final predictions.
Meta-Level Classifier
Classifier combining outputs of base-level classifiers to make final predictions.
AdaBoostClassifier
Classifier in Scikit-Learn using AdaBoost with decision stumps as default base estimator.
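A sketch of AdaBoostClassifier in Scikit-Learn with the decision-stump base estimator written out explicitly; the dataset and the n_estimators/learning_rate values are illustrative assumptions.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

# Decision stumps (max_depth=1) are trained sequentially; each round
# reweights the training instances so the next stump focuses on mistakes.
ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # the default base estimator, made explicit
    n_estimators=200,
    learning_rate=0.5,
    random_state=42,
)
ada_clf.fit(X, y)
```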
GradientBoostingClassifier
Classifier in Scikit-Learn using Gradient Boosting with decision trees as base predictors.
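A sketch of GradientBoostingClassifier in Scikit-Learn; the dataset and parameter values are illustrative assumptions. The learning_rate argument implements the shrinkage described under Learning Rate later in this glossary.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

# Shallow trees added stage by stage; learning_rate shrinks each tree's
# contribution (shrinkage), trading more trees for better generalization.
gb_clf = GradientBoostingClassifier(
    n_estimators=200, max_depth=2, learning_rate=0.1, random_state=42
)
gb_clf.fit(X, y)
```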
Ensemble
A group of predictors whose individual predictions are combined into a single prediction.
Ensemble Learning
Combining predictions from multiple models to leverage their strengths for improved accuracy.
Gradient Tree Boosting (GTB)
Common gradient boosting model using decision trees as base predictors.
Ensemble Method
Constructing a set of base predictors from the training data and making predictions by aggregating the predictions of the individual base predictors.
Ensemble Construction
Constructing an ensemble from a sequence of classifiers trained over successive boosting rounds.
Bootstrap Aggregating (Bagging)
Creating separate sample sets from the training dataset by sampling with replacement, training a classifier on each set, and aggregating their predictions.
Decision Stump
A decision tree with a single split (max_depth=1); the default base estimator for AdaBoostClassifier in Scikit-Learn.
Voting Classifier
Ensemble classifier combining predictions by voting or averaging.
Random Forests
Ensemble method that builds base estimators in parallel, each trained on a bootstrap subset of the data.
Homogeneous Ensemble
Ensemble methods like Bagging and Boosting that use the same learning algorithm to produce base learners of the same type.
Heterogeneous Ensemble
Ensemble methods like Stacking that use base learners with different learning algorithms.
Variance
The part of a predictor's generalization error caused by excessive sensitivity to small variations in the training data; high variance leads to overfitting.
Bias
The part of a predictor's generalization error caused by wrong assumptions made during training; high bias leads to underfitting.
Residual Errors
Errors made by the previous predictor in gradient boosting, used to fit the new predictor.
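A hand-rolled sketch of the residual-fitting idea using regression trees; the toy quadratic dataset is an assumption, chosen only to show each tree fitting its predecessor's residual errors.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy 1-D regression data (illustrative assumption)
rng = np.random.RandomState(42)
X = rng.rand(100, 1) - 0.5
y = 3 * X[:, 0] ** 2 + 0.05 * rng.randn(100)

# First tree fits the raw targets
tree1 = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y)

# Each subsequent tree fits the residual errors left by the ensemble so far
residuals1 = y - tree1.predict(X)
tree2 = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, residuals1)

residuals2 = residuals1 - tree2.predict(X)
tree3 = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, residuals2)

# The ensemble's prediction is the sum of all trees' predictions
X_new = np.array([[0.2]])
y_pred = sum(tree.predict(X_new) for tree in (tree1, tree2, tree3))
```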
Underfitting
When a model is too simple to capture the underlying pattern of the training data; gradient boosting counters underfitting by adding more weak estimators and benefits from a large number of them.
ROC Graphs
Graphical representation of a classification model's performance, plotting the true positive rate against the false positive rate across decision thresholds.
Base Predictors
Individual predictors of an ensemble that are combined to make predictions.
Training Instances
Instances used for training classifiers in ensemble methods.
Additive Models
Models built by gradient boosting algorithms, constructed stage by stage.
Boosting Rounds
The number of sequential rounds in boosting, in each of which a classifier is trained; the final prediction combines their weighted votes.
Gradient Descent
Optimization algorithm used in machine learning to minimize a function by iteratively moving in the direction of steepest descent.
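A minimal sketch of gradient descent on a one-dimensional function; f(w) = (w - 3)^2 and the step count are assumptions chosen purely for illustration.

```python
# Minimize f(w) = (w - 3)^2, whose gradient is f'(w) = 2 * (w - 3)
w = 0.0
learning_rate = 0.1
for _ in range(100):
    gradient = 2 * (w - 3)
    w -= learning_rate * gradient  # step in the direction of steepest descent
print(w)  # converges toward the minimizer w = 3
```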
Grid Search
Hyperparameter tuning approach that evaluates and improves models by exhaustively searching through a grid of parameter values, typically with cross-validation.
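A sketch of grid search over random forest hyperparameters with Scikit-Learn's GridSearchCV; the grid values and dataset are illustrative assumptions.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

# Every combination in the grid is evaluated with 3-fold cross-validation
param_grid = {"n_estimators": [100, 300, 500], "max_leaf_nodes": [8, 16, 32]}
grid = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=3)
grid.fit(X, y)
print(grid.best_params_)
```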
Learning Rate
A hyperparameter that scales the step length in gradient descent; in gradient boosting it shrinks each tree's contribution, a regularization technique known as shrinkage.
Aggregating
The process of combining predictions made by base predictors through methods like voting or averaging.
Stacking
Training a meta-level classifier to combine the outputs of base-level classifiers.
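A sketch of stacking with Scikit-Learn's StackingClassifier; the choice of base-level classifiers and of logistic regression as the meta-level classifier is an assumption for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

# Out-of-fold predictions from the base-level classifiers become the
# training features for the meta-level classifier (final_estimator).
stack_clf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC()),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)
stack_clf.fit(X, y)
```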
Out-of-bag (oob) instances
Training instances not sampled during the training process in bagging, used for evaluating the model.
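A sketch of out-of-bag evaluation with BaggingClassifier; the dataset and parameter values are illustrative assumptions.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

# oob_score=True evaluates each predictor on the training instances it
# never saw during bootstrap sampling, giving a free validation estimate.
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=500,
    bootstrap=True,
    oob_score=True,
    random_state=42,
)
bag_clf.fit(X, y)
print(bag_clf.oob_score_)  # accuracy estimated on out-of-bag instances
```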