Python New Package Introduction (NPI)


Adaboost Regressor

Boosting is an ensemble learning strategy that combines a group of weak learners into a strong learner in order to reduce training error. Boosting involves selecting a random sample of data, fitting it with a model, and then training models sequentially; that is, each model attempts to compensate for the shortcomings of its predecessor. Each cycle combines the weak rules of the individual learners into one strong prediction rule. Scikit-learn:

from sklearn.ensemble import AdaBoostRegressor
adaboost_model = AdaBoostRegressor()  # numerous parameter options
adaboost_model.fit(x_train, y_train)
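A minimal sketch of a full run, assuming synthetic data in place of a real training set (the dataset, parameter values, and variable names here are illustrative, not from the original):

from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor

# toy regression problem standing in for real data
x_train, y_train = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=42)
model = AdaBoostRegressor(n_estimators=50, learning_rate=1.0, random_state=42)
model.fit(x_train, y_train)
print(model.predict(x_train[:3]))  # predictions from the boosted ensemble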

Random Forest Regressor

Random forest is a supervised machine learning algorithm that is commonly used for classification and regression problems. It constructs decision trees from several samples and uses their majority vote for classification and their average for regression. Scikit-learn:

from sklearn.ensemble import RandomForestRegressor
rf_regressor = RandomForestRegressor()  # numerous parameter options
rf_regressor.fit(x_train, y_train)
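A minimal sketch on synthetic data (the dataset and settings are illustrative assumptions):

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

x_train, y_train = make_regression(n_samples=200, n_features=5, random_state=0)
rf = RandomForestRegressor(n_estimators=100, random_state=0)  # 100 trees, predictions averaged
rf.fit(x_train, y_train)
print(rf.feature_importances_)  # per-feature importance derived from the trees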

Seaborn

Seaborn is a Python library that specializes in creating visually appealing and informative statistical graphics. It is built on top of matplotlib and offers seamless integration with pandas data structures, allowing for efficient visualization and analysis of data stored in DataFrames.

import seaborn as sns
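A small sketch using seaborn's built-in "tips" demo dataset (load_dataset fetches it from the seaborn-data repository, so the first call needs network access):

import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset("tips")  # a pandas DataFrame of restaurant bills
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="time")
plt.show()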

Random Forest Classifier

Random forest is a supervised machine learning algorithm that is commonly used for classification and regression problems. It constructs decision trees from several samples and uses their majority vote for classification and their average for regression. Scikit-learn:

from sklearn.ensemble import RandomForestClassifier
rf_classifier = RandomForestClassifier()
rf_classifier.fit(x_train, y_train)
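A minimal sketch on a synthetic classification problem (all names and values are illustrative):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

x_train, y_train = make_classification(n_samples=200, n_features=5, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(x_train, y_train)
print(clf.predict_proba(x_train[:3]))  # class probabilities from the tree vote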

Bagging Regressor

A Bagging regressor is an ensemble meta-estimator that fits base regressors each on random subsets of the original dataset and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used as a way to reduce the variance of a black-box estimator (e.g., a decision tree) by introducing randomization into its construction procedure and then making an ensemble out of it. Scikit-learn:

from sklearn.ensemble import BaggingRegressor
model = BaggingRegressor()  # numerous parameter options
model.fit(x_train, y_train)
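A minimal sketch with the base estimator spelled out, on synthetic data (note: the keyword is estimator in recent scikit-learn releases, base_estimator in older ones):

from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

x_train, y_train = make_regression(n_samples=200, n_features=5, random_state=0)
# a decision tree is the default base regressor; shown explicitly for clarity
model = BaggingRegressor(estimator=DecisionTreeRegressor(), n_estimators=10, random_state=0)
model.fit(x_train, y_train)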

Gradient Boosting Regressor

Boosting is an ensemble learning strategy that combines a group of weak learners into a strong learner in order to reduce training error. Boosting involves selecting a random sample of data, fitting it with a model, and then training models sequentially; that is, each model attempts to compensate for the shortcomings of its predecessor. Each cycle combines the weak rules of the individual learners into one strong prediction rule. Scikit-learn:

from sklearn.ensemble import GradientBoostingRegressor
gradientboost_model = GradientBoostingRegressor()  # numerous parameter options
gradientboost_model.fit(x_train, y_train)
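A minimal sketch on synthetic data (the hyperparameter values are illustrative, not tuned recommendations):

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

x_train, y_train = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0)
model.fit(x_train, y_train)
print(model.score(x_train, y_train))  # R^2 on the training data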

Decision Tree Classification

A decision tree is a non-parametric supervised learning approach that can be used for classification or regression problems. It has a hierarchical tree structure consisting of a root node, branches, internal nodes, and leaf nodes. Scikit-learn:

from sklearn.tree import DecisionTreeClassifier
dtree_classifier = DecisionTreeClassifier()  # numerous parameter options
dtree_classifier.fit(x_train, y_train)
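A minimal sketch on the built-in iris dataset; export_text prints the learned root/branch/leaf structure described above (max_depth=3 is an illustrative choice):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(iris.data, iris.target)
print(export_text(clf, feature_names=list(iris.feature_names)))  # readable if/else rules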

Decision Tree Regression

A decision tree is a non-parametric supervised learning approach that can be used for classification or regression problems. It has a hierarchical tree structure consisting of a root node, branches, internal nodes, and leaf nodes. Scikit-learn:

from sklearn.tree import DecisionTreeRegressor
dtree_regressor = DecisionTreeRegressor()  # numerous parameter options
dtree_regressor.fit(x_train, y_train)
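A minimal sketch on synthetic data (the names and depth limit are illustrative):

from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

x_train, y_train = make_regression(n_samples=200, n_features=5, random_state=0)
reg = DecisionTreeRegressor(max_depth=4, random_state=0)
reg.fit(x_train, y_train)
print(reg.get_depth(), reg.get_n_leaves())  # size of the fitted tree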

Label Encoding

Introduced here in reference to decision trees, but useful elsewhere. Sklearn (scikit-learn) provides a very efficient tool for encoding the levels of categorical features into numeric values. LabelEncoder encodes labels with a value between 0 and n_classes-1, where n_classes is the number of distinct labels. If a label repeats, it is assigned the same value as before.

from sklearn.preprocessing import LabelEncoder
ln = LabelEncoder()
ln.fit_transform(data['column'])  # column should be any categorical variable
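A minimal sketch with a made-up categorical column (the DataFrame is a stand-in; LabelEncoder assigns codes in sorted order of the distinct labels):

import pandas as pd
from sklearn.preprocessing import LabelEncoder

data = pd.DataFrame({'column': ['red', 'green', 'blue', 'green']})
ln = LabelEncoder()
print(ln.fit_transform(data['column']))  # [2 1 0 1] -- repeated 'green' gets the same code
print(ln.classes_)                       # ['blue' 'green' 'red']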

Regression Evaluation Metrics

It is easy to obtain accuracy on the training data, but it is also important to get a genuine estimate of performance on unseen data; otherwise the model is of no use. Below is an example of how to check a regression model's quality through the metrics scikit-learn provides.

from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error
# making predictions on the test data using the trained model
y_pred = model.predict(x_test)
print("mse = ", mean_squared_error(y_test, y_pred))
print("R^2 score = ", r2_score(y_test, y_pred))
print("mae = ", mean_absolute_error(y_test, y_pred))
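A fuller end-to-end sketch, assuming a synthetic dataset and a plain linear model as stand-ins for the trained model above:

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

x, y = make_regression(n_samples=200, n_features=5, noise=0.2, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)
model = LinearRegression().fit(x_train, y_train)
y_pred = model.predict(x_test)  # evaluate only on held-out data
print("mse =", mean_squared_error(y_test, y_pred))
print("R^2 =", r2_score(y_test, y_pred))
print("mae =", mean_absolute_error(y_test, y_pred))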

NumPy

NumPy is used for working with numerical values, as it makes it easy to apply mathematical functions.

import numpy as np
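A quick illustrative example of the vectorized math NumPy enables:

import numpy as np

a = np.array([1.0, 2.0, 3.0])
print(a.mean())    # 2.0
print(np.sqrt(a))  # element-wise square root, no loop needed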

Pandas

Pandas is used for data analysis tasks in Python.

import pandas as pd
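A quick illustrative example with a made-up DataFrame:

import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3], 'y': [4.0, 5.5, 6.1]})
print(df.describe())    # summary statistics per column
print(df[df['x'] > 1])  # boolean filtering of rows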

Xgboost Classification

XGBoost is an open-source software library that implements optimized distributed gradient boosting machine learning algorithms under the Gradient Boosting framework. It stands for Extreme Gradient Boosting, which is a scalable, distributed gradient-boosted decision tree (GBDT) machine learning library. It provides parallel tree boosting and is a leading machine learning library for regression, classification, and ranking problems.

!pip install xgboost
from xgboost import XGBClassifier
xgb_classifier = XGBClassifier()
xgb_classifier.fit(x_train, y_train)
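A minimal sketch on synthetic data via XGBoost's scikit-learn-compatible wrapper (labels must be integers starting at 0, which make_classification provides; hyperparameters are illustrative):

from sklearn.datasets import make_classification
from xgboost import XGBClassifier

x_train, y_train = make_classification(n_samples=200, n_features=5, random_state=0)
xgb_classifier = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
xgb_classifier.fit(x_train, y_train)
print(xgb_classifier.predict(x_train[:3]))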

Xgboost Regression

XGBoost is an open-source software library that implements optimized distributed gradient boosting machine learning algorithms under the Gradient Boosting framework. It stands for Extreme Gradient Boosting, which is a scalable, distributed gradient-boosted decision tree (GBDT) machine learning library. It provides parallel tree boosting and is a leading machine learning library for regression, classification, and ranking problems.

!pip install xgboost
from xgboost import XGBRegressor
xgb_regressor = XGBRegressor()
xgb_regressor.fit(x_train, y_train)
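A minimal regression sketch (hyperparameters are illustrative; score comes from the scikit-learn-compatible interface):

from sklearn.datasets import make_regression
from xgboost import XGBRegressor

x_train, y_train = make_regression(n_samples=200, n_features=5, random_state=0)
xgb_regressor = XGBRegressor(n_estimators=100, max_depth=3, learning_rate=0.1)
xgb_regressor.fit(x_train, y_train)
print(xgb_regressor.score(x_train, y_train))  # R^2 on the training data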

