C3.ai Glossary Terms
What is R squared?
1 - (unexplained variation / total variation)
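As a minimal sketch of the formula above, assuming "unexplained variation" means the sum of squared residuals and "total variation" means the sum of squared deviations from the mean (the function name `r_squared` and the sample data are illustrative only):

```python
def r_squared(y_true, y_pred):
    """Return 1 - (unexplained variation / total variation)."""
    mean_y = sum(y_true) / len(y_true)
    # Unexplained variation: sum of squared residuals.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total variation: sum of squared deviations from the mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(r_squared(y_true, y_pred))  # approximately 0.8
```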
What is a hyperparameter?
A hyperparameter is a parameter whose value is set before training begins, rather than learned from the data during training
What is Stochastic Optimization?
A method that generates and uses random variables, either to represent the optimization problem or to guide the search, in order to find suitable solutions more consistently
What is a Gaussian Mixture Model?
A probabilistic model that assumes all data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters
What is XGBoost?
A supervised learning algorithm that attempts to accurately predict a target variable by combining an ensemble of estimates from a set of simpler and weaker models.
What is a reason to do dimensionality reduction?
Algorithms have a hard time learning patterns when there are many input features (dimensions) relative to the number of training examples
What is a Generalized Linear Model?
An extension of linear regression that allows the response to follow distributions other than the normal (for example binomial or Poisson) and uses a "link" function to relate the linear predictor to the expected value of the response
What is the C3 AI Platform?
An open, extensible, multi-cloud platform for a wide range of skill sets to take advantage of the latest innovations in AI/ML
What are three benefits of Gaussian Mixture Models?
They are available in open-source libraries, are easy to implement, and converge to a minimum faster and more stably than alternatives such as gradient descent
How are Shapley values useful to ML?
By interpreting a model trained on a set of features as a value function on a coalition of players, Shapley values provide a natural way to compute which features contribute to a prediction
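The coalition idea can be sketched by brute force for a tiny example. The code below averages each player's marginal contribution over every ordering of the players. The value function `toy_value` is a hypothetical stand-in for "model output when only these features are known", and enumerating all orderings is only feasible for a handful of features; practical tools rely on approximations.

```python
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            after = value(frozenset(coalition))
            totals[p] += after - before
    return {p: t / len(orderings) for p, t in totals.items()}


def toy_value(coalition):
    # Hypothetical value function: "a" and "b" each add to the score,
    # with an extra interaction bonus when both are present; "c" adds nothing.
    score = 0.0
    if "a" in coalition:
        score += 10.0
    if "b" in coalition:
        score += 20.0
    if "a" in coalition and "b" in coalition:
        score += 6.0
    return score


values = shapley_values(["a", "b", "c"], toy_value)
print(values)
```

In this toy game the interaction bonus is split evenly between "a" and "b", and the values sum to the value of the full coalition.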
What are Gradient-Boosted Decision Trees?
An ensemble technique that builds decision trees sequentially: each iteration fits a new tree to the gradient of the loss (for squared error, the residuals) of the current ensemble and adds its predictions to the ensemble to minimize a loss function
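A rough sketch of gradient boosting, under simplifying assumptions that are not from the glossary: a single input feature, depth-1 trees (stumps), and squared-error loss, for which the negative gradient is simply the residual. The names `fit_stump` and `gradient_boost` are illustrative and do not reflect how any particular library implements the technique.

```python
def fit_stump(x, residuals):
    """Find the single split on x that best fits the residuals (squared error)."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def gradient_boost(x, y, n_trees=20, learning_rate=0.5):
    """Fit stumps sequentially, each one to the current residuals."""
    base = sum(y) / len(y)  # initial prediction: the mean of y
    stumps = []

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)

    losses = []
    for _ in range(n_trees):
        preds = [predict(xi) for xi in x]
        losses.append(sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y))
        # For squared error, the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stumps.append(fit_stump(x, residuals))
    return predict, losses


x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.2, 3.1, 2.9, 5.2, 5.0]
predict, losses = gradient_boost(x, y)
print(losses[0], losses[-1])  # training loss before the first and last tree
```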
What are the benefits of Random Forest models?
Easy to tune with intuitive hyperparameters, easy to view the relative importance of input features, and generalize well without overfitting given a sufficient number of trees
What is the False Positive Rate?
FP / (FP + TN)
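A small sketch of this ratio, assuming binary labels where 1 is the positive class and 0 the negative class (the function name and data are illustrative):

```python
def false_positive_rate(y_true, y_pred):
    """FP / (FP + TN) for binary labels where 1 is the positive class."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn)


y_true = [0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 1, 0]
# One of the four actual negatives is predicted positive.
print(false_positive_rate(y_true, y_pred))  # 0.25
```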
What is the x-axis in the receiver operating characteristic curve?
FPR
When should F1 score be used?
For (binary) classification when the classes are imbalanced
What is the False Positive Rate in words?
What fraction of the actual negatives did the model incorrectly predict as positive?
What is LIME?
Local interpretable model-agnostic explanations is a technique that approximates any black box machine learning model with a local, interpretable model to explain each individual prediction
How can overfitting be handled?
Use more data (including through augmentation), use cross-validation, select fewer but better features, apply regularization
How do you handle underfitting?
More or better input features, more model parameters, more model complexity, train longer, decrease regularization
What correlation coefficient measures the linear relationship between two variables?
Pearson's
How can overfitting be handled in Deep Learning models?
Reduce layers or nodes in hidden layers, apply regularization, dropout layers, early stopping
What correlation coefficient measures the monotonic (possibly non-linear) relationship between two variables?
Spearman's
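To illustrate the difference between the two coefficients, the sketch below computes Spearman's coefficient as Pearson's coefficient applied to ranks. It assumes there are no tied values (real implementations average the ranks of ties), and the helper names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    # Simplification: assumes no tied values.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but not linear
print(pearson(x, y), spearman(x, y))
```

For this cubic relationship Spearman's coefficient is 1 (up to floating-point rounding), while Pearson's is below 1 because the relationship is not a straight line.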
What is recall?
TP / (TP + FN)
What is precision?
TP / (TP + FP)
What is the y-axis in the receiver operating characteristic curve?
TPR
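A sketch of how the points of an ROC curve could be computed, assuming TPR is recall, TP / (TP + FN), and that an example is predicted positive when its score is at or above the threshold. The function name `roc_points` and the scores are made up for illustration.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct threshold, starting at (0, 0)."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Sweep thresholds from high to low.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(fpr, tpr)  # x-axis value, y-axis value
```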
What is the F1 score formula?
The harmonic mean of precision and recall: 2 * (precision * recall) / (precision + recall)
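Precision, recall, and the F1 score can be sketched together from confusion-matrix counts. The function name and the counts below are illustrative, and the sketch assumes the denominators are non-zero.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and their harmonic mean (F1)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(precision, recall, f1)
```

Because the harmonic mean is pulled toward the smaller of the two values, the F1 score here (about 0.62) sits below the arithmetic mean of precision (0.8) and recall (0.5).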
What is Mean Absolute Percent Error (MAPE)?
The mean of absolute relative errors (expressed as percentage)
What is the Mean Absolute Error?
The sum of the absolute differences between ground-truth and predicted values, divided by the sample size
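Both error measures can be sketched in a few lines. The function names and data are illustrative; note that MAPE is undefined when any ground-truth value is zero.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |truth - prediction|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percent error; undefined if any ground-truth value is 0."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100 * total / len(y_true)


y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
print(mae(y_true, y_pred))   # 10.0
print(mape(y_true, y_pred))  # about 6.67 (percent)
```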
What is Deep Learning?
The use of multi-layered neural networks to transform input data into successively higher-level values to produce results similar to human experts
True or False: Clustering algorithms are a type of classification technique
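False. Clustering is an unsupervised technique that groups unlabeled data, whereas classification is a supervised technique that predicts predefined labels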
What is information leakage?
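When information that would not be available at prediction time (for example, the target itself or future data) is present in the training data, inflating the model's apparent performance during training and validation and causing poor performance in production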
What are the 10 fundamental capabilities of an enterprise AI Platform?
data aggregation, multi-cloud, edge, data virtualization, enterprise semantic model, microservices, data governance, system simulation, open platform, cross-collaboration
What are two phrases to describe an overfit model?
high variance, low bias
What are three loss functions used in classification?
hinge loss, cross-entropy loss, and KL (Kullback-Leibler) divergence loss
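The three classification losses can be sketched for a single example. The conventions here are assumptions: hinge loss takes a label of -1 or +1 and a raw score, while cross-entropy and KL divergence take probability distributions and use the natural logarithm.

```python
import math


def hinge_loss(y, score):
    # y is -1 or +1; score is the raw (unthresholded) model output.
    return max(0.0, 1.0 - y * score)


def cross_entropy(p_true, p_pred):
    # Terms where the true probability is 0 contribute nothing.
    return -sum(t * math.log(q) for t, q in zip(p_true, p_pred) if t > 0)


def kl_divergence(p, q):
    # KL(p || q); terms where p is 0 contribute nothing.
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


print(hinge_loss(1, 2.0))    # 0.0: correct side of the margin
print(hinge_loss(-1, 0.5))   # 1.5: wrong side of the margin
one_hot = [0.0, 1.0, 0.0]
predicted = [0.2, 0.7, 0.1]
print(cross_entropy(one_hot, predicted))
print(kl_divergence(one_hot, predicted))
```

With a one-hot target the two printed values coincide, since cross-entropy equals KL divergence plus the entropy of the target, which is zero for a one-hot distribution.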
What are two phrases to describe an underfit model?
low variance, high bias
What are three loss functions used in regression?
mean square error loss (MSE), mean absolute error loss (MAE), and quantile loss
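A sketch of the three regression losses. The quantile loss is written in its usual "pinball" form, where `q` is the target quantile between 0 and 1; the function names and data are illustrative.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def quantile_loss(y_true, y_pred, q):
    """Pinball loss: under-prediction costs q, over-prediction costs 1 - q."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        error = t - p
        total += max(q * error, (q - 1) * error)
    return total / len(y_true)


y_true = [10.0, 20.0]
y_pred = [12.0, 15.0]
print(mse(y_true, y_pred))                 # 14.5
print(mae(y_true, y_pred))                 # 3.5
print(quantile_loss(y_true, y_pred, 0.9))  # penalizes the under-prediction heavily
print(quantile_loss(y_true, y_pred, 0.5))  # 1.75, half of the MAE
```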