AI

Advantages and Disadvantages of YOLO

-orders of magnitude faster than the (Fast/Faster) R-CNN methods
-struggles with small objects within the image, e.g. a flock of birds

Semantic Segmentation

A process where each pixel in an image is linked to a class label

Random Forests

A model that trains many decision trees on random subsets of the training data (considering random subsets of the features at each split), then aggregates their predictions by voting or averaging
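A minimal sketch with scikit-learn (assumed installed); the dataset here is a toy stand-in:

# Hedged sketch: a random forest that averages many decision trees
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)  # toy data
# Each tree sees a bootstrap sample and a random subset of features per split
clf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
clf.fit(X, y)
print(clf.predict(X[:5]))  # majority vote across the 100 trees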

FLOPS vs FPS

FLOPS = floating point operations per second: how fast a processor can perform arithmetic operations
FPS = frames per second: how fast a graphics processor can update the screen
Two processors could have the same FLOPS, but if one achieves a higher FPS, then the FLOPS rating is irrelevant

Fallout

FP/(TN + FP), fraction of negative examples that were misclassified, False Positive Rate

Accuracy

(TP + TN)/(P+N), fraction of examples correctly classified

Interpolated AP vs AP AUC (AP Area Under Curve) vs COCO mAP

Interpolated AP: calculated by dividing the recall axis into a set of discrete points (PASCAL VOC 2008 used 11 points: 0, 0.1, ..., 1.0), taking the maximum precision at any recall greater than or equal to each point (the "smoothing" step described in the Average Precision card), then averaging these maximum precision values. The 11 points are used to reduce the impact of wiggles/variations introduced by the ranking of examples. Drawback: less precise than integrating the full curve. AP AUC: calculates the rectangular integral of the (smoothed) precision-recall curve, i.e. computes the AP as the sum, over each point where the recall changes, of the precision times the change in recall. COCO mAP: averages AP over multiple IoU thresholds (0.50 to 0.95 in steps of 0.05) and over all categories, sampling the curve at 101 recall points.
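A minimal sketch of the 11-point interpolation described above, assuming numpy and a toy PR curve:

# Hedged sketch: 11-point interpolated AP (PASCAL VOC 2008 style)
import numpy as np

def interpolated_ap_11pt(recall, precision):
    # recall/precision trace the PR curve, one point per ranked prediction
    ap = 0.0
    for r in np.linspace(0.0, 1.0, 11):                       # 0.0, 0.1, ..., 1.0
        mask = recall >= r
        p_max = precision[mask].max() if mask.any() else 0.0  # max precision at recall >= r
        ap += p_max / 11.0
    return ap

recall = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 1.0])       # made-up example values
precision = np.array([1.0, 0.9, 0.8, 0.7, 0.55, 0.45])
print(interpolated_ap_11pt(recall, precision))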

Pooling layers

-A layer added after a nonlinearity (e.g. ReLU) has been applied to the feature maps output by a convolutional layer
-preserves the number of feature maps
-a downsampling operation such as averaging or keeping the maximum value over each patch of the feature map (average pooling, max pooling)
-downsampling helps keep the model translation invariant, preventing it from being too sensitive to small changes in a feature's location in the input image
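A minimal sketch, assuming PyTorch (not mentioned in this set):

# Hedged sketch: 2x2 max pooling halves the spatial size, keeps the map count
import torch
import torch.nn as nn

fmap = torch.randn(1, 16, 32, 32)        # (batch, feature maps, height, width)
pool = nn.MaxPool2d(kernel_size=2, stride=2)
out = pool(fmap)
print(out.shape)                          # torch.Size([1, 16, 16, 16]): 16 maps preserved, 32x32 -> 16x16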

ReLU

-Rectified Linear Unit
-a piecewise linear function that outputs the input directly if it is positive, and zero otherwise
-it has become the default activation function for many types of neural networks because a model that uses it is easier to train and often achieves better performance
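A minimal sketch, assuming numpy:

# Hedged sketch: ReLU passes positives through and zeroes out negatives
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0. 0. 0. 1.5 3. ]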

Advantages of Fast R-CNN

-faster than R-CNN because you don't have to feed 2000 region proposals to the CNN every time (the CNN runs only once per image)

Gini impurity

-gini = 0 means a node is "pure": all training instances in that node belong to the same class
G_i (impurity of the ith node) = 1 - sum_k (p_i,k)^2, where p_i,k is the ratio of class-k instances among all the instances in the ith node
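A minimal sketch of the formula above, assuming numpy:

# Hedged sketch: Gini impurity of a node from its per-class instance counts
import numpy as np

def gini(class_counts):
    p = np.asarray(class_counts, dtype=float)
    p = p / p.sum()               # p_i,k: ratio of each class in the node
    return 1.0 - np.sum(p ** 2)

print(gini([50, 0, 0]))   # 0.0: pure node
print(gini([25, 25]))     # 0.5: maximally mixed two-class node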

Problems with R-CNN

-huge amount of training time
-too slow for real-time implementation (~47 s per test image)
-no learning happens in the selective search stage, which could lead to the generation of bad candidate region proposals

F1 Score

F1 = 2 * Precision * Recall / (Precision + Recall), a way to characterize accuracy and optimize the tradeoff between precision and recall. The harmonic mean is more sensitive to low values, so the F1 score will only be high if both recall and precision are high. Used to compare two classifiers.
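A minimal sketch computing this card and the other confusion-matrix metrics in this set (accuracy, precision, recall, fallout) from raw counts:

# Hedged sketch: metrics from confusion-matrix counts
def metrics(tp, fp, tn, fn):
    accuracy  = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)          # sensitivity / true positive rate
    fallout   = fp / (fp + tn)          # false positive rate
    f1        = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, fallout, f1

print(metrics(tp=80, fp=20, tn=90, fn=10))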

Activation function

A function in a neural network that transforms the summed weighted input of a node into the node's activation or output. (E.g. in Andrew Ng's class: after computing the weighted sum, you apply a sigmoid activation function.)

Transfer Learning

A powerful technique that lets people with smaller datasets or less computational power achieve state-of-the-art results, by taking advantage of pre-trained models that have been trained on similar, larger data sets. Because the model learned via transfer learning doesn't have to learn from scratch, it can generally reach higher accuracy with much less data and computation time than models that don't use transfer learning.
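A minimal fine-tuning sketch, assuming a recent PyTorch/torchvision install; the 10-class head is a hypothetical target task:

# Hedged sketch: transfer learning by freezing a pretrained backbone
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretrained on ImageNet
for param in model.parameters():
    param.requires_grad = False                    # freeze the pretrained features
model.fc = nn.Linear(model.fc.in_features, 10)     # new head for a hypothetical 10-class task
# ...then train only model.fc on the smaller dataset...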

One-versus-all/One-versus-rest (OvA)

A strategy for applying a binary classifier to multi-class classification. One binary classifier is trained per class to distinguish that class from all the other classes, and the class whose classifier outputs the highest score wins. So a problem with K classes needs K classifiers.
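A minimal sketch with scikit-learn (assumed installed):

# Hedged sketch: one-versus-rest -> one binary classifier per class
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)        # 3 classes
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
print(len(ovr.estimators_))               # 3 classifiers, one per class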

One-versus-one (OvO)

A strategy for applying a binary classifier to multi-class classification. One binary classifier is trained for each pair of classes, so each class appears in K-1 classifiers and there are K * (K-1)/2 classifiers in total. This strategy is useful for algorithms that scale poorly with the size of the training set, such as SVMs, where it is faster to train many classifiers on small training sets than a few classifiers on large training sets.
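A minimal sketch with scikit-learn, paired with an SVM as the card suggests:

# Hedged sketch: one-versus-one -> one binary classifier per pair of classes
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)         # K = 3 classes
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)
print(len(ovo.estimators_))                # K*(K-1)/2 = 3 pairwise classifiers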

PCA (principal component analysis)

A technique to project data to lower dimensions. Used for data visualization (EDA) and compression.
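A minimal sketch with scikit-learn (assumed installed):

# Hedged sketch: projecting 64-dimensional data down to 2 dimensions with PCA
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)        # 64-dimensional inputs
pca = PCA(n_components=2)
X2 = pca.fit_transform(X)                   # project onto the top 2 principal components
print(X2.shape, pca.explained_variance_ratio_)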

Cross-Validation

A training method where you divide the data into k-folds, and train k times, holding out a different fold each time. Each time you train on k-1 folds and evaluate on the remaining fold. You then get k evaluation scores.
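A minimal sketch with scikit-learn (assumed installed), using k = 5:

# Hedged sketch: 5-fold cross-validation -> 5 evaluation scores
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores)          # one score per held-out fold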

Machine Learning

A way to automate the improvement of prediction algorithms using statistics: rather than being explicitly programmed, a model learns from example data so that its performance on a task improves with experience.

Decision Tree

Algorithm that can be used for both regression (leaves take on continuous values) and classification (leaves take on discrete values). Leaf nodes hold the final values/classes (they have no child nodes).
-decision trees don't require feature scaling
-decision trees are often considered "white box" models, as opposed to neural networks or Random Forests
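A minimal sketch with scikit-learn showing the "white box" quality: the learned rules can be printed.

# Hedged sketch: a shallow decision tree with human-readable splits
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)            # no feature scaling needed
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree))                      # the learned split thresholds and leaf classes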

Autoencoders

An unsupervised learning technique that uses neural nets for representation learning. It can be thought of as a non-linear generalization of PCA: where PCA projects data to a lower dimensional hyperplane, autoencoders are able to learn non-linear manifolds. The neural net is designed to impose a bottleneck in the network which forces a compressed knowledge representation of the original input.
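A minimal bottleneck sketch, assuming PyTorch; the 784-dim input is a stand-in for flattened 28x28 images:

# Hedged sketch: an autoencoder whose bottleneck forces a compressed representation
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, dim_in=784, dim_bottleneck=32):
        super().__init__()
        # the nonlinearities are what generalize this beyond PCA's linear projection
        self.encoder = nn.Sequential(nn.Linear(dim_in, 128), nn.ReLU(),
                                     nn.Linear(128, dim_bottleneck))
        self.decoder = nn.Sequential(nn.Linear(dim_bottleneck, 128), nn.ReLU(),
                                     nn.Linear(128, dim_in))

    def forward(self, x):
        return self.decoder(self.encoder(x))

x = torch.randn(8, 784)                      # toy batch
model = Autoencoder()
loss = nn.functional.mse_loss(model(x), x)   # reconstruction loss drives the compression
print(loss.item())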

Feature Learning/Representation Learning

An unsupervised method of feature extraction on unlabeled data in order to learn "representations" for the unlabeled data that make the main ML task easier. For example, Word2Vec learned vector representations for words in order to be used in NLP tasks.

AUC

Area Under (ROC) Curve
-method for comparing classifiers
-a perfect classifier has an ROC AUC of 1; a random classifier has an ROC AUC of 0.5
-generally preferable to the PR curve unless the positive class is rare
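A minimal sketch with scikit-learn, on made-up scores:

# Hedged sketch: ROC AUC from classifier confidence scores
from sklearn.metrics import roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]    # confidence for the positive class
print(roc_auc_score(y_true, y_score))         # 1.0 is perfect, 0.5 is chance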

Anchor

A predefined bounding box with a fixed scale and aspect ratio, used as a reference for predicting object bounding boxes (e.g. Faster R-CNN uses 9 anchors per location: 3 scales x 3 aspect ratios)

Ensemble Learning

Building a model on top of many other models: combining their predictions (e.g. by voting or averaging) usually performs better than any single model
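A minimal sketch with scikit-learn: a majority-vote ensemble over three different model types.

# Hedged sketch: hard-voting ensemble of three classifiers
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
ensemble = VotingClassifier([("lr", LogisticRegression(max_iter=1000)),
                             ("dt", DecisionTreeClassifier()),
                             ("nb", GaussianNB())], voting="hard")  # majority vote
print(ensemble.fit(X, y).score(X, y))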

CART

Classification and Regression Tree Algorithm: an algorithm used to train Decision Trees (used by sklearn)
-produces only binary trees

CVPR

Conference on Computer Vision and Pattern Recognition

NIPS/NeurIPS

Conference on Neural Information Processing Systems

ECCV

European Conference on Computer Vision

FPN

Feature Pyramid Network

Fast R-CNN

ICCV 2015
1) Feed the input image to the CNN instead of region proposals
2) From the convolutional feature map, identify the region proposals, then use further layers to produce an ROI feature vector that a softmax layer then turns into a class prediction and offset values for the bounding box

ICCV

International Conference on Computer Vision

IoU

Intersection over Union: a metric that measures the overlap between two boundaries, e.g. how much a bounding box prediction overlaps with the ground-truth bounding box. IoU = area of overlap / area of union. A chosen IoU threshold, e.g. 0.5, can be used to decide whether to classify a prediction as a true positive or a false positive.
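A minimal sketch for axis-aligned boxes given as (x1, y1, x2, y2):

# Hedged sketch: IoU = intersection area / union area
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])    # intersection rectangle
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))          # 25 / 175 ≈ 0.143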

How is the kNN algorithm different from kMeans?

K-Nearest Neighbors is a supervised classification algorithm, while k-means clustering is an unsupervised clustering algorithm. While the mechanisms may seem similar at first, what this really means is that in order for K-Nearest Neighbors to work, you need labeled data to classify an unlabeled point into (thus the nearest-neighbor part). K-means clustering requires only a set of unlabeled points and a chosen number of clusters k: the algorithm gradually learns to cluster the points into groups by repeatedly assigning each point to its nearest centroid and recomputing each centroid as the mean of its assigned points.
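A minimal side-by-side sketch with scikit-learn:

# Hedged sketch: supervised kNN (needs labels) vs unsupervised k-means (needs only k)
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)           # uses labels y
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)   # uses only X and k
print(knn.predict(X[:3]), km.labels_[:3])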

Embedding

Low-dimensional, learned continuous vector representations of discrete variables. Neural network embeddings are useful because they can reduce the dimensionality of categorical variables and meaningfully represent categories in the transformed space. Neural network embeddings have 3 primary purposes:
1) Finding nearest neighbors in the embedding space; these can be used to make recommendations based on user interests or to cluster categories
2) As input to a machine learning model for a supervised task
3) For visualization of concepts and relations between categories
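A minimal sketch, assuming PyTorch; the 10k vocabulary is a stand-in:

# Hedged sketch: an embedding table mapping discrete IDs to learned vectors
import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=10000, embedding_dim=64)  # e.g. a 10k-word vocabulary
word_ids = torch.tensor([3, 41, 3])
vectors = emb(word_ids)                # a trainable 64-d vector per ID
print(vectors.shape)                    # torch.Size([3, 64])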

Faster R-CNN

NIPS 2015
-after the image is fed through the CNN, a separate network (the Region Proposal Network) is used to predict the region proposals instead of a selective search algorithm
-uses 9 anchor boxes that cover 3 different scales and 3 different aspect ratios

Neural Architecture Search

Neural nets that learn to make new neural nets (usually using RL or evolutionary algorithms).
-tries to learn a layer/cell that can be stacked to form an NN
-computationally intensive

ROC

Receiver Operating Characteristic: a plot of the True Positive Rate vs the False Positive Rate as you vary the classification threshold. (If the score distributions of the two classes overlap perfectly, the ROC plot is the straight diagonal line of a random classifier.) If the positive class is rare, or you care more about false positives, the Precision vs Recall curve is more useful.

R-CNN

Region-CNN (CVPR 2014) is a CNN-based object detection approach
-replaces exhaustive sliding-window search over every position in the image with a few thousand candidate regions
-very slow
Steps:
1) Uses selective search to generate about 2000 region proposals, i.e. bounding boxes for image classification
2) For each bounding box, image classification is done through a CNN
3) Each bounding box is refined using regression

SSD

Single Shot MultiBox Detector (ECCV 2016)

Specificity

TN/(TN + FP), fraction of negative examples correctly classified
-not affected by class frequency

Precision

TP/(FP + TP), fraction of positive predictions that are correct
-affected by relative prevalence of positive and negative cases (class imbalance)

Recall

TP/(TP + FN), fraction of all positives that were predicted to be positive. Also called sensitivity or True Positive Rate (TPR).
-not affected by class frequency

Average Precision (AP)

The area (integral) under the precision-recall curve. The precision-recall curve shows how precision fluctuates as you move down the ranked predictions (decreasing confidence). The recall never decreases: it rises with each true positive and stays flat on each false positive, while the precision drops with each false positive and climbs again with each true positive. In practice, we smooth out the fluctuations of the curve before integrating by replacing each precision value with the maximum precision value at any greater recall. This makes the calculated AP less susceptible to small variations in the ranking.
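A minimal sketch of the smoothing-then-integrating step described above, assuming numpy and a made-up toy PR curve:

# Hedged sketch: AP as the area under the smoothed PR curve
import numpy as np

def average_precision(recall, precision):
    # recall is non-decreasing, one point per ranked prediction
    r = np.concatenate(([0.0], recall))
    p = np.concatenate(([1.0], precision))
    # smoothing: each precision becomes the max precision at any greater recall
    p = np.maximum.accumulate(p[::-1])[::-1]
    return np.sum(np.diff(r) * p[1:])   # rectangular integral over recall steps

recall = np.array([0.2, 0.4, 0.4, 0.6, 0.8, 1.0])         # toy ranked predictions
precision = np.array([1.0, 1.0, 0.67, 0.75, 0.8, 0.83])
print(average_precision(recall, precision))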

Feature maps

The output activations of a layer given a filter. A CNN applies a convolutional filter to transform the input features into a hidden layer that acts as a new set of features to feed into the next layer. One filter/kernel results in one feature map.
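A minimal sketch, assuming PyTorch: eight filters produce eight feature maps.

# Hedged sketch: one feature map per convolutional filter
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)  # 8 filters
image = torch.randn(1, 3, 32, 32)        # one RGB image
fmaps = conv(image)
print(fmaps.shape)                        # torch.Size([1, 8, 32, 32]): 8 feature maps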

AutoML

The process of automating model selection and/or hyperparameter optimization. It can also be useful in getting a baseline to know what level of performance is possible for a problem.

YOLO

You Only Look Once (CVPR 2016)
-divides the image into an SxS grid and predicts m bounding boxes in each grid cell
-for each bounding box the network outputs class probabilities and offset values for the box
-boxes with confidence above a chosen threshold are kept

