ML Algorithms Interview Questions


What are the benefits of LightGBM?

1. Histogram-Based Splitting: It buckets continuous feature values into discrete bins and evaluates candidate splits on these histograms, reducing computational load. 2. Leaf-Wise Tree Growth: Unlike methods that grow trees level-wise, LightGBM expands the leaf that yields the largest decrease in the loss function. 3. Exclusive Feature Bundling: It bundles mutually exclusive sparse features (features that are rarely non-zero at the same time) into single features, reducing the search space for optimal splits. 4. Gradient-Based One-Side Sampling: It keeps the instances with large gradients and randomly samples from those with small gradients, building trees from the most informative subset of the data. 5. Efficient Cache Usage: LightGBM optimizes internal data access patterns, minimizing memory I/O.
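
Below is a minimal sketch (synthetic data, illustrative hyperparameter values) of how some of these mechanisms surface in LightGBM's scikit-learn API; GOSS and Exclusive Feature Bundling are applied internally by the library and need no extra wiring here.

import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=50, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

model = lgb.LGBMClassifier(
    num_leaves=31,     # leaf-wise growth is bounded by leaf count, not depth
    max_bin=255,       # histogram-based splitting: feature values bucketed into bins
    n_estimators=200,
)
model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)])
print(model.score(X_valid, y_valid))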

How does LightGBM achieve faster training and lower memory usage?

1. Leaf-wise tree growth 2. Gradient-Based One-Side Sampling - keeping the data points with the largest gradients (the most information-gain potential) and subsampling the rest during tree building, which pays off especially on huge datasets 3. Exclusive feature bundling 4. Enhanced feature parallelism

What are the core parameters in XGBoost that you often consider tuning?

1. Number of estimators - number of boosting rounds. Higher values can lead to overfitting 2. Max depth - the maximum depth of each tree; deeper trees can lead to overfitting 3. Subsample - the fraction of data randomly sampled for each boosting round 4. Learning rate - scales the contribution of each tree, trading off training speed against accuracy 5. Gamma - the minimum loss reduction required to make a further partition 6. Alpha and lambda - the L1 and L2 regularization terms, which help with high-dimensional or highly correlated features and guard against overfitting 7. Objective - the loss function to be optimized 8. Evaluation metric - the metric used for cross-validation and model evaluation, such as RMSE for regression or AUC for binary classification
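
A hedged starting configuration touching each of these knobs through the scikit-learn wrapper (the values are illustrative, not recommendations):

import xgboost as xgb

model = xgb.XGBClassifier(
    n_estimators=500,        # number of boosting rounds
    max_depth=6,             # maximum tree depth
    subsample=0.8,           # fraction of rows sampled per boosting round
    learning_rate=0.05,      # shrinkage applied to each tree's contribution
    gamma=0.1,               # minimum loss reduction required to split
    reg_alpha=0.0,           # L1 regularization
    reg_lambda=1.0,          # L2 regularization
    objective="binary:logistic",
    eval_metric="auc",
)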

Key features of XGBoost

1. Parallel processing 2. Feature importance 3. Handling missing data 4. Flexibility - can do regression, classification, or ranking 5. GPU support

What are the advantages of tree pruning?

1. Reduced overfitting 2. Faster computations 3. Enhanced feature evaluation - without excessive tree depth, it becomes easier to discern feature importance based on their splits 4. Improved model understanding and interpretability - simpler trees are easier to visualize and interpret

How does gradient boosting work?

1. Start with a single weak learner (often just a constant prediction). 2. Sequential improvement - construct additional models that correct the errors of the current ensemble. 3. Add additional trees. Each new tree is fitted to the negative gradient of the loss with respect to the current predictions (for squared error, simply the residuals between the target and the current estimate), shifting the model's focus to the remaining errors; this is what makes it a gradient descent method in function space.
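
A from-scratch sketch of this loop for squared-error loss, where the negative gradient is simply the residual y - F(x) (illustrative only; X and y are assumed NumPy arrays):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_rounds=100, learning_rate=0.1, max_depth=3):
    prediction = np.full(len(y), y.mean())   # step 1: trivial initial learner
    trees = []
    for _ in range(n_rounds):                # step 2: sequential improvement
        residuals = y - prediction           # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        prediction += learning_rate * tree.predict(X)   # step 3: add the new tree
        trees.append(tree)
    return trees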

What are the differences between a gradient boosting machine (GBM) and XGBoost?

1. Tree construction - XGBoost builds trees level-wise by default (leaf-wise growth is the hallmark of LightGBM, not of classic GBM) and uses sparsity-aware, approximate split finding 2. Regularization - XGBoost adds explicit L1 and L2 penalties on leaf weights, which plain GBM implementations typically lack 3. Parallelism - XGBoost parallelizes tree construction, node splitting, and column selection. Some GBM libraries support multi-threading, but tree building is mostly sequential 4. Missing data handling - XGBoost automatically learns the best default direction for missing values during training 5. Algorithm complexity - XGBoost uses optimization techniques like column block structures, sparsity-aware data representation, and out-of-core computing 6. Custom loss/evaluation metrics - XGBoost supports user-defined objectives and metrics; plain GBM implementations generally do not

How does XGBoost handle highly imbalanced datasets?

1. Weighted loss function - assign different weights to positive and negative examples within the loss function (e.g., via scale_pos_weight) 2. Stratified sampling - splitting or resampling the data so that class proportions are preserved in every fold, or deliberately rebalancing the classes before training 3. Focused evaluation metrics - using scores other than accuracy, such as F1, precision, recall, or PR-AUC 4. Hyperparameter optimization - max_delta_step, gamma, subsample
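
A short sketch of the weighting approach using scale_pos_weight, a real XGBoost parameter (the data here is synthetic and heavily imbalanced):

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))            # hypothetical features
y = np.array([0] * 950 + [1] * 50)        # 19 negatives per positive

ratio = (y == 0).sum() / (y == 1).sum()   # weight applied to positive examples
model = xgb.XGBClassifier(scale_pos_weight=ratio, eval_metric="aucpr")
model.fit(X, y)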

What are common loss functions in classification?

1. logistic loss (log loss) - commonly used for binary classification; the raw score is mapped to a probability with the sigmoid function and the negative log-likelihood of the true labels is minimized 2. softmax (cross-entropy) loss - used for multi-class classification; the raw scores are turned into a probability distribution over the classes and the log-probability of the correct class is maximized 3. hinge loss - penalizes predictions that fall on the wrong side of the margin; available in XGBoost as the binary:hinge objective
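
Minimal NumPy versions of the first two losses, for intuition (illustrative, not the library implementations):

import numpy as np

def logistic_loss(y_true, raw_score):
    p = 1.0 / (1.0 + np.exp(-raw_score))          # sigmoid -> probability
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def softmax_loss(y_true, raw_scores):
    # raw_scores: (n_samples, n_classes); y_true: integer class labels
    z = raw_scores - raw_scores.max(axis=1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return -np.mean(np.log(p[np.arange(len(y_true)), y_true]))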

What are common loss functions in regression?

1. squared loss (L2 loss) - sum of the squared differences between actual and predicted 2. absolute loss (L1 loss) - sum of the absolute differences 3. huber loss - a combination of L1 and L2 loss that is less sensitive to outliers. It switches to L1 loss if the absolute difference is above some threshold delta
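
The same three losses written out in NumPy (illustrative):

import numpy as np

def squared_loss(y, y_hat):
    return np.mean((y - y_hat) ** 2)

def absolute_loss(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def huber_loss(y, y_hat, delta=1.0):
    err = np.abs(y - y_hat)
    quadratic = 0.5 * err ** 2                   # L2 behavior near zero
    linear = delta * err - 0.5 * delta ** 2      # L1 behavior beyond delta
    return np.mean(np.where(err <= delta, quadratic, linear))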

What are the differences between boosted trees and a plain random forest model?

Boosted trees build weak learners iteratively, with each tree learning from the mistakes of the previous ones. A vanilla random forest builds its trees independently (and in parallel) on bootstrapped samples and averages their predictions.

Explain the concept of gradient boosting

GBMs are ensemble methods that build models in a forward, stage-wise manner, using decision trees as base learners. XGBoost is a variant of GBM optimized for speed and performance.

What is the difference between level-wise trees and leaf-wise trees?

Growing trees leaf-wise (best-first) or level-wise (depth-wise) can ultimately produce the same tree; the difference is the ORDER in which the tree is expanded. Leaf-wise chooses the next split by its contribution to the global loss rather than the loss along a particular branch, so it often reaches lower-error trees faster than level-wise growth. The distinction only matters under early stopping or pruning: if you grow the tree to its full size, without stopping or pruning, both strategies build literally the same tree and converge to the same performance.
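
A sketch of how each library exposes this choice; in XGBoost the grow_policy parameter (with the hist tree method) switches between depth-wise and leaf-wise ("lossguide") growth, while LightGBM is leaf-wise by design. Values are illustrative.

import lightgbm as lgb
import xgboost as xgb

xgb_level_wise = xgb.XGBRegressor(tree_method="hist", grow_policy="depthwise", max_depth=6)
xgb_leaf_wise = xgb.XGBRegressor(tree_method="hist", grow_policy="lossguide", max_leaves=31)
lgb_leaf_wise = lgb.LGBMRegressor(num_leaves=31)  # leaf count, not depth, is the main cap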

What are some of the advantages of LightGBM over XGBoost or CatBoost?

Its key strengths lie in handling large datasets, providing faster training speeds, optimizing parameter defaults, and being lightweight for distribution.

Describe the role of shrinkage (learning rate) in XGBoost

Learning rate (shrinkage) controls how much each tree contributes to the final prediction, balancing learning speed against generalization. Each prediction from an XGBoost model is the sum of the predictions of all its trees; the learning rate scales each tree's contribution, so a smaller learning rate damps the influence of any single tree, typically requires more boosting rounds, and usually yields a model that generalizes better.
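
A sketch of the trade-off with illustrative values; the smaller learning rate needs more rounds, but each individual tree's influence is damped:

import xgboost as xgb

# Larger steps, fewer trees: trains fast, but each tree has a big influence.
fast_coarse = xgb.XGBRegressor(learning_rate=0.3, n_estimators=100)

# Smaller steps, more trees: slower to train, usually generalizes better.
slow_fine = xgb.XGBRegressor(learning_rate=0.03, n_estimators=1000)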

What is LightGBM?

LightGBM (Light Gradient Boosting Machine) is a distributed, high-performance gradient boosting framework designed for speed and efficiency.

How does LightGBM handle categorical features differently from other gradient boosting models?

LightGBM can consume categorical features natively (declared via the categorical_feature parameter), whereas traditional gradient boosting implementations such as XGBoost have historically relied on approaches like one-hot encoding. - Binning of categories: categorical values are mapped to integer bins rather than expanded into dummy columns. - Splits: candidate splits partition the set of categories, ordered by their gradient statistics, instead of testing each category as a separate binary feature. - Efficiency: this avoids the feature blow-up of one-hot encoding and speeds up learning on high-cardinality features.
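
A small sketch of the native categorical handling via the scikit-learn API; the DataFrame, column names, and labels are hypothetical:

import lightgbm as lgb
import pandas as pd

df = pd.DataFrame({
    "city": pd.Categorical(["NY", "SF", "NY", "LA", "SF", "LA"] * 10),
    "sqft": [700, 450, 820, 900, 500, 760] * 10,
})
y = [1, 0, 1, 1, 0, 1] * 10

model = lgb.LGBMClassifier(min_child_samples=5)
model.fit(df, y, categorical_feature=["city"])   # no one-hot encoding required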

What are the benefits to random forest over a boosted tree-based model?

Random forest is more interpretable, and often runs more quickly (due to the ability to create weak learners in parallel)

What are the benefits of XGBoost over traditional gradient boosting?

1. Regularization - explicit L1/L2 penalties control model complexity and prevent overfitting, making XGBoost more robust 2. Shrinkage - each tree's contribution is scaled down by the learning rate, so no single tree dominates the ensemble 3. Built-in cross-validation - XGBoost ships a cross-validation routine (xgb.cv) for tuning hyperparameters such as the number of boosting rounds
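
A sketch of the built-in cross-validation routine, xgb.cv, used to pick the number of boosting rounds (synthetic data, illustrative parameter values):

import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=5_000, random_state=0)
dtrain = xgb.DMatrix(X, label=y)

params = {"objective": "binary:logistic", "eta": 0.1, "max_depth": 4, "lambda": 1.0}
cv_results = xgb.cv(params, dtrain, num_boost_round=500, nfold=5,
                    metrics="auc", early_stopping_rounds=20)
print(len(cv_results))   # number of boosting rounds kept after early stopping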

Compare and contrast XGBoost and LightGBM

Similarities: 1. Both use gradient boosting 2. Both perform well out of the box 3. Both can be used for classification and regression 4. Both can handle large datasets. Differences: 1. LightGBM is often faster 2. LightGBM uses less memory than XGBoost 3. XGBoost grows trees level-wise by default, while LightGBM grows them leaf-wise 4. LightGBM can be more prone to overfitting because leaf-wise growth produces deeper, more specialized trees 5. XGBoost exposes more parameters that can be tuned to squeeze out further performance. In practice XGBoost is often the default choice for real-world use cases, with LightGBM preferred when dataset size or training speed dominates.

What is the histogram-based approach used by LightGBM?

LightGBM buckets continuous feature values into a fixed number of discrete bins and builds one histogram per feature; candidate splits are then evaluated at bin boundaries instead of at every sorted feature value, which makes training faster and more memory-efficient. (Gradient-Based One-Side Sampling, GOSS, is a separate, complementary technique that keeps high-gradient instances and subsamples low-gradient ones to reduce computation further.)

When should you use XGBoost versus random forest?

Use XGBoost when your dataset is large or imbalanced, or when you want to spend time optimizing performance through parameter tuning Use random forest when you want to interpret your model's results, and when you're looking to create a simple and quick baseline model

How does XGBoost use tree pruning, and why?

Why: decision trees can grow too large and complex, leading to overfitting. How: 1. Pre-pruning: stops tree growth early based on user-defined hyperparameters such as max_depth and min_child_weight 2. Post-pruning: a backward, bottom-up pass that removes or collapses splits whose gain does not exceed the gamma penalty, minimizing the regularized cost function. These measures ensure that each component tree, or weak learner, stays appropriately controlled in size and predictive behavior.
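
A sketch of the pruning-related knobs (illustrative values): max_depth and min_child_weight act as pre-pruning limits, while gamma drives the backward post-pruning pass that removes splits whose gain does not cover the penalty.

import xgboost as xgb

model = xgb.XGBRegressor(
    max_depth=4,          # pre-pruning: hard cap on tree depth
    min_child_weight=5,   # pre-pruning: minimum hessian weight required in a child
    gamma=1.0,            # post-pruning: minimum loss reduction to keep a split
)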

What is XGBoost?

eXtreme Gradient Boosting. XGBoost builds a series of trees to make predictions, and each tree corrects errors made by the previous ones. The algorithm minimizes a loss function, often the MSE for regression and log loss for classification

