Machine Learning Engineer Terms

Multinomial NB

Naive Bayes variant that models features with a multinomial distribution, a discrete distribution suited to features that are whole-number counts (for example, in natural language processing, the frequency of a term).
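
To make the count-based idea concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. The function names, the toy documents, and the add-one smoothing value `alpha` are illustrative assumptions, not part of the original card.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit log priors and smoothed per-class term log likelihoods from tokenized docs."""
    vocab = {tok for doc in docs for tok in doc}
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        term_counts[label].update(doc)  # term frequencies per class
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(term_counts[label].values())
        log_prior = math.log(n_docs / len(docs))
        log_lik = {
            tok: math.log((term_counts[label][tok] + alpha) / (total + alpha * len(vocab)))
            for tok in vocab
        }
        model[label] = (log_prior, log_lik)
    return model

def predict(model, doc):
    """Return the class with the highest log posterior; unseen tokens are skipped."""
    def score(label):
        log_prior, log_lik = model[label]
        return log_prior + sum(log_lik[tok] for tok in doc if tok in log_lik)
    return max(model, key=score)

# Toy data (assumed for illustration only).
docs = [["cheap", "pills", "cheap"], ["meeting", "agenda"],
        ["cheap", "offer"], ["project", "meeting"]]
labels = ["spam", "ham", "spam", "ham"]
model = train_multinomial_nb(docs, labels)
print(predict(model, ["cheap", "cheap", "offer"]))
```

Because each repeated token adds its log likelihood again, a term that appears twice counts twice, which is what separates this variant from one that only records presence or absence.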

semi-supervised learning

A model that is trained on data in which some examples are labeled and some are not.

Stemming

A process of reducing words to their respective root forms in order to better represent them in a text mining project.
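
As a rough illustration of reducing words to a root form, the sketch below strips a few common English suffixes. It is a toy, not the Porter algorithm or any library's stemmer; the suffix list and the minimum stem length of 3 are assumptions chosen for the example.

```python
# Hypothetical suffix list, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    """Strip the first matching suffix, keeping at least 3 characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

for w in ["jumping", "jumped", "jumps", "boxes", "running"]:
    print(w, "->", naive_stem(w))
```

Note that "running" becomes "runn" here: stems produced by rule-based stripping are not always real words, which is an accepted tradeoff of this kind of approach.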

Self-Training

A way to tackle semi-supervised learning. A procedure in which you take any supervised method for classification or regression and modify it to work in a semi-supervised manner, taking advantage of both labeled and unlabeled data.
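
One common form of self-training fits the supervised model on the labeled data, predicts labels for the unlabeled data, adds the most confident predictions to the training set as pseudo-labels, and repeats. The sketch below assumes a deliberately simple base learner (a 1-D nearest-centroid classifier) and a hypothetical confidence measure (`min_margin`); both are stand-ins for whatever supervised method is being wrapped.

```python
def fit_centroids(points, labels):
    """Supervised base learner: one mean per class (1-D nearest centroid)."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict_with_margin(centroids, x):
    """Return (label, margin); assumes at least two classes.

    The margin is how much closer x is to the best centroid than to the runner-up.
    """
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(x - centroids[runner_up]) - abs(x - centroids[best])
    return best, margin

def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=2.0, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled points and refit."""
    xs, ys, pool = list(labeled_x), list(labeled_y), list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(xs, ys)
        confident, remaining = [], []
        for x in pool:
            label, margin = predict_with_margin(centroids, x)
            (confident if margin >= min_margin else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = [x for x, _ in remaining]
    return fit_centroids(xs, ys), pool

centroids, leftover = self_train([0.0, 10.0], ["a", "b"], [1.0, 2.0, 9.0, 8.0, 5.0])
print(centroids, leftover)
```

The ambiguous point 5.0 never clears the margin, so it stays unlabeled; leaving low-confidence points out is the usual guard against reinforcing the model's own mistakes.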

Bernoulli NB

Naive Bayes variant based on a binary (Bernoulli) distribution, useful when each feature is either present or absent.

How to Ensure you're not overfitting

Collect more data; reduce the number of features; use ensemble methods; apply early stopping; use cross-validation.
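
Of the techniques listed, cross-validation is the easiest to show in a few lines. The sketch below generates k-fold train/validation index splits; the helper name `k_fold_indices` is hypothetical, and it assumes the data has already been shuffled if ordering matters.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample lands in exactly one validation fold."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

A model that scores well on its training folds but poorly on the held-out folds is showing the gap that signals overfitting.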

CNN (Convolutional Neural Network)

Convolutional neural networks are a specialized type of artificial neural network that uses a mathematical operation called convolution in place of general matrix multiplication in at least one layer. Mainly used to analyze visual imagery.
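
The convolution step itself can be sketched in plain Python. This example slides a small kernel over a 2-D grid with no padding and a stride of 1; like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes cross-correlation. The image and the edge-detecting kernel are made up for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# A dark-to-bright vertical boundary between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge = [[-1, 1], [-1, 1]]
feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The output is nonzero only where the kernel straddles the boundary, which is how a convolutional layer responds to a local visual pattern wherever it appears.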

Recall

True Positive Rate (sensitivity). TP / (TP + FN). Ratio of true positives to all actual positives (true positives plus false negatives).

Precision

Positive Predictive Value. TP / (TP + FP). Ratio of true positives to all predicted positives (true positives plus false positives).
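
Precision and recall both come from the same confusion-matrix counts: precision is TP / (TP + FP) and recall is TP / (TP + FN). A small sketch, assuming binary labels where 1 is the positive class and returning 0.0 when a denominator is zero (that fallback is a convention chosen here, not a universal rule):

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute (precision, recall) for one positive class from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels: 2 true positives, 1 false positive, 2 false negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Here precision is 2/3 and recall is 1/2: the model is right about most of what it flags but misses half of the actual positives.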

Type 1 Error

False Positive. You predicted presence when it wasn't there.

Reinforcement Learning

The computer (agent) receives favorable or unfavorable outputs, called rewards, as a result of the actions it takes. It aims to maximize the rewards it collects.

Supervised Learning

Machine learning model that is trained on labeled data.

Gini Impurity

Quantifies the amount of uncertainty at a single node: the probability that a randomly chosen sample would be classified incorrectly if you randomly picked a label according to the distribution of the branch. Higher is worse; 0 means the node is pure.

Information Gain

Quantifies how much a split at a node reduces uncertainty: the impurity of the parent minus the weighted average impurity of its children. Higher is better.
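
The two quantities can be computed together. The sketch below uses Gini impurity as the uncertainty measure and defines the gain of a split as parent impurity minus the size-weighted impurity of the children. Entropy is the other common choice of measure; using Gini here is an assumption made to match the neighbouring card. Child groups are assumed to be non-empty.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def information_gain(parent, children):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * gini(child) for child in children)
    return gini(parent) - weighted

parent = ["a", "a", "b", "b"]
print(gini(parent))                                          # 0.5
print(information_gain(parent, [["a", "a"], ["b", "b"]]))    # 0.5 (perfect split)
print(information_gain(parent, [["a", "b"], ["a", "b"]]))    # 0.0 (useless split)
```

A decision tree learner would evaluate candidate splits this way and keep the one with the largest gain.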

ROC curve

Receiver Operating Characteristic. It shows the model's tradeoff between the true positive rate and the false positive rate across classification thresholds.
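
The points on an ROC curve can be produced by sweeping a decision threshold over the model's scores and recording the false positive rate and true positive rate at each step. A minimal sketch, assuming binary labels (1 = positive), at least one example of each class, and made-up scores:

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / neg, tp / pos))
    return points

y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(round(fpr, 2), round(tpr, 2))
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1): more true positives are caught, at the cost of more false positives.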

Bootstrap sample

Sample with replacement from the original sample, using the same sample size. Used to estimate population statistics from a small data sample.
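
A short sketch of the idea: resample the data with replacement many times, compute the statistic on each resample, and read an interval off the resulting distribution. The data values, the number of resamples, and the seed are arbitrary choices for illustration; the percentile interval shown is one simple way to summarize the result.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Mean of each bootstrap resample (same size as the original, with replacement)."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8]
means = sorted(bootstrap_means(sample))
# Approximate 95% percentile interval for the population mean.
lower, upper = means[25], means[974]
print(lower, upper)
```

`random.choices` draws with replacement, so a given observation can appear several times in one resample and not at all in another.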

RNN (Recurrent Neural Networks)

Type of Neural Network that uses sequential data or time series data. They are distinguished by their "memory" as they take information from prior inputs to influence the current input and output.
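
The "memory" can be seen in a single recurrent unit, where each hidden state depends on the current input and on the previous hidden state. The sketch below assumes a one-unit Elman-style update, h_t = tanh(w_x * x_t + w_h * h_(t-1) + b), with hand-picked weights rather than trained ones.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit recurrent cell over a sequence and return every hidden state."""
    h = h0
    states = []
    for x in inputs:
        # The previous state h feeds back into the new state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

with_pulse = rnn_forward([1.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
no_pulse = rnn_forward([0.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
print(with_pulse)
print(no_pulse)
```

At the last two steps both sequences receive the same input (0.0), yet the outputs differ because the first sequence still carries a fading trace of the earlier 1.0.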

Unsupervised Learning

Machine learning model that is trained on unlabeled data.

Lemmatization

Grouping the inflected forms of a word together under their basic dictionary form (the lemma), so they can be analyzed as a single item.
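
In its simplest form this is a dictionary lookup from an inflected form to its lemma. The table below is a tiny hypothetical stand-in; real lemmatizers rely on a full vocabulary and usually the word's part of speech.

```python
from collections import defaultdict

# Hypothetical lookup table for illustration only.
LEMMAS = {
    "ran": "run",
    "running": "run",
    "runs": "run",
    "mice": "mouse",
    "better": "good",
}

def lemmatize(word):
    """Return the dictionary form if known, otherwise the lowercased word."""
    word = word.lower()
    return LEMMAS.get(word, word)

groups = defaultdict(list)
for w in ["ran", "running", "runs", "mice", "better", "good"]:
    groups[lemmatize(w)].append(w)
print(dict(groups))
```

Irregular forms such as "ran" and "mice" end up under real dictionary words, which suffix stripping alone could not achieve.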

