Machine Vision Exam 1


What is the machine learning algorithm design? Support your explanation with diagrams

1. input X (features) 2. hidden layers 3. output layer 4. Loss function 5. Optimization

List the three machine learning methodologies with examples of each.

1) Model a classification rule directly, e.g. KNN, linear classifier 2) Model the probability of class membership given the input data, e.g. logistic regression, P-NN 3) Build a probabilistic model of the data within each class, e.g. naive Bayes

What is the Data-driven approach? List all phases.

1) Collect a database of images with labels 2) Use ML to train an image classifier 3) Evaluate the classifier on test images

What are the main phases of training classifiers using the Viola-Jones-based Face Detector?

1) Data preparation: integral image + Haar-like features 2) Detection 3) Training

What are the steps of K-means clustering? Support your answer with diagrams.

1) Select initial centroids at random 2) Assign each object to the cluster with the nearest centroid 3) Recompute each centroid as the mean of the objects assigned to it 4) Repeat steps 2 and 3 until the assignments no longer change
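The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the toy data, and the choice to keep an old centroid when a cluster ends up empty are all assumptions made for the example.

```python
import numpy as np

def kmeans(points, k, max_iters=100, seed=0):
    """Plain k-means on an (n, d) float array; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points at random as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(max_iters):
        # Step 2: assign each point to the cluster with the nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # Step 4: stop once no assignment changes.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: recompute each centroid as the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two well-separated toy blobs (made-up data).
data = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                 [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
centroids, labels = kmeans(data, k=2)
```

Which blob receives label 0 depends on the random initialization, so only the grouping is meaningful, not the label values.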

List the steps to train GANs with a diagram.

1. Define the problem 2. Define the architecture of the GAN 3. Train the discriminator on real data for n epochs 4. Generate fake inputs for the generator and train the discriminator on the fake data 5. Train the generator with the output of the discriminator 6. Repeat steps 3 to 5 for a few epochs 7. Manually check whether the fake data looks legitimate

List the six steps required to implement the Viola-Jones-based Face Detector.

1. Viola-Jones Face Detector 2. Haar-like features 3. Integral Image 4. Training classifier 5. Adaptive boosting algorithm (Adaboost) 6. Cascading

List and explain the convolutional neural networks' steps with a diagram for each step.

1: Convolution - Applying a filter to an input to produce an activation map. 2: Max pooling - Non-linear downsampling of the activation maps: take the maximum value in each n x n window, with the windows determined by the padding and the stride (shift). 3: Flattening - Converting a matrix into a vector so it can be passed into the neural network connected to the CNN. 4: Full connection - The flattened output of the CNN is passed into a fully connected neural network to be processed.
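A small NumPy sketch of the four steps may help make them concrete. The helper names, the toy 6x6 image, the vertical-edge kernel, and the dense-layer weights are all hypothetical values chosen for illustration; like most deep learning libraries, the "convolution" here does not flip the kernel.

```python
import numpy as np

def convolve_valid(image, kernel):
    """Step 1: slide the kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2, stride=2):
    """Step 2: keep the maximum of each size x size window."""
    out_h = (fmap.shape[0] - size) // stride + 1
    out_w = (fmap.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = fmap[r:r + size, c:c + size].max()
    return out

# Toy image: dark left half, bright right half (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # responds to vertical edges

activation = convolve_valid(image, kernel)  # 4x4 activation map
pooled = max_pool(activation)               # 2x2 after pooling
flat = pooled.flatten()                     # Step 3: vector of length 4

# Step 4: one fully connected layer with made-up weights and zero bias.
weights = np.full((4, 2), 0.5)
bias = np.zeros(2)
scores = flat @ weights + bias
```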

How would you represent the following example in a matrix multiplication format? House sizes: 2104, 1416, 1534, 852. Hypotheses: hθ(x) = −40 + 0.25x, hθ(x) = 200 + 0.1x, hθ(x) = −150 + 0.4x

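| 1  2104 |       |  -40    200   -150 |
| 1  1416 |   *   | 0.25    0.1    0.4 |
| 1  1534 |
| 1   852 |
4x2 * 2x3 = 4x3 (one column of predictions per hypothesis). The column of ones comes first so that it lines up with the row of intercepts.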
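The same product can be checked numerically. This sketch assumes the column of ones is placed first in the data matrix so that it multiplies the intercepts; the variable names are chosen for the example.

```python
import numpy as np

sizes = np.array([2104, 1416, 1534, 852])

# Data matrix (4x2): a column of ones for the intercept, then the house sizes.
X = np.column_stack([np.ones_like(sizes), sizes])

# Parameter matrix (2x3): row 0 holds the intercepts, row 1 the slopes,
# with one column per hypothesis.
Theta = np.array([[-40.0, 200.0, -150.0],
                  [0.25, 0.1, 0.4]])

# 4x2 times 2x3 gives 4x3: row i holds all three predictions for house i.
predictions = X @ Theta
```

For the first house, the row of `predictions` works out to −40 + 0.25·2104 = 486, 200 + 0.1·2104 = 410.4, and −150 + 0.4·2104 = 691.6.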

Define the image, then list types of image transformations

A color image is a 3D tensor of numbers. Filtering - changes pixel values (changes the range). Warping - changes pixel locations (changes the domain).

Define Machine Vision

A field of AI that trains computers to interpret and understand the visual world. Using cameras and sensors, computers can accurately identify and classify objects.

What are the advantages of CNNs? List the common architectures of CNNs.

A natural choice for processing images. Fewer parameters than fully connected layers. Uses the same parameters to process every block of the image. Architectures: LeNet, ResNet, GoogLeNet, AlexNet

Define the Support Vector Machines (SVMs), then explain with a diagram how to use SVMs' tricks to convert the non-separable to the separable problem

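A supervised ML algorithm that can be used for both classification and regression, though it is mostly used for classification. Kernel trick: map the data through a non-linear function into a higher-dimensional space where the classes become linearly separable, so a linear boundary in that space corresponds to a non-linear boundary in the original one. Ex: adding the feature z = x^2 + y^2 separates points inside a circle from points outside it.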
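The z = x^2 + y^2 example can be illustrated with an explicit feature map. This is a sketch under stated assumptions: the two rings of points and the threshold are made up, and a real kernel SVM would compute inner products in the lifted space implicitly rather than building the extra coordinate as done here.

```python
import numpy as np

# Two classes that no straight line in the (x, y) plane can separate:
# an inner ring of radius 1 and an outer ring of radius 3.
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
inner = np.column_stack([np.cos(angles), np.sin(angles)])
outer = 3.0 * inner

def lift(points):
    """Append z = x^2 + y^2 as a third coordinate."""
    z = points[:, 0] ** 2 + points[:, 1] ** 2
    return np.column_stack([points, z])

lifted_inner = lift(inner)  # z is about 1 for every point
lifted_outer = lift(outer)  # z is about 9 for every point

# In the lifted space the flat plane z = 5 separates the two classes.
threshold = 5.0
inner_below = bool(np.all(lifted_inner[:, 2] < threshold))
outer_above = bool(np.all(lifted_outer[:, 2] > threshold))
```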

For K Nearest Neighbors (KNNs), what is the best distance metric between data points? How many K? Write the common two formulas of distance metrics

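Euclidean distance is the usual choice, and K = 1 often works reasonably well, but both are hyperparameters that should be checked on validation data. Manhattan (L1): d(I1, I2) = SUM over p of |I1^p − I2^p|. Euclidean (L2): d(I1, I2) = SQRT[SUM over p of (I1^p − I2^p)^2]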
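Both distance formulas, and a bare-bones KNN vote built on them, can be sketched as follows. The function names and the four-point training set are hypothetical, and ties in the vote are simply resolved in favor of the lowest class index.

```python
import numpy as np

def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return np.sum(np.abs(a - b))

def euclidean(a, b):
    """L2 distance: square root of the summed squared differences."""
    return np.sqrt(np.sum((a - b) ** 2))

def knn_predict(train_x, train_y, query, k=1, metric=euclidean):
    """Label the query by majority vote among its k nearest training points."""
    dists = np.array([metric(x, query) for x in train_x])
    nearest = np.argsort(dists)[:k]
    votes = np.bincount(train_y[nearest])
    return int(votes.argmax())

# Made-up training data: class 0 near the origin, class 1 near (5, 5).
train_x = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
train_y = np.array([0, 0, 1, 1])
query = np.array([0.5, 0.2])

label_k1 = knn_predict(train_x, train_y, query, k=1)
label_k3 = knn_predict(train_x, train_y, query, k=3, metric=manhattan)
```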

What are the SVM Parameters?

C (= 1/lambda). Large C (small lambda): low bias, high variance. Small C (large lambda): high bias, low variance.

Define the Adaptive boosting algorithm (Adaboost) and Cascading phases of the Viola-Jones-based Face Detector

AdaBoost - Builds a strong classifier as a weighted sum of weak classifiers, each based on a Haar-like feature: F(x) = a1 f1(x) + a2 f2(x) + a3 f3(x) + ... Cascading - Sets the order in which the features are checked. It takes a sub-window and checks whether a feature is detected; if it is, the sub-window is kept and passed on to the next check. If a feature is not detected, the sub-window is thrown out and a new sub-window is taken. If all features are found, the sub-window is marked as containing a face.

How do neural networks learn? Support your discussion with a diagram.

ANNs learn by taking in feature values, multiplying them by weights, and applying an activation function (most of the time) to produce an output. That output is then compared to the actual label, and the resulting error is used to adjust the weights. In networks with more than two layers, each layer's outputs (its values multiplied by the weights) are summed into the next layer's value, which is in turn multiplied by another set of weights, and the cycle continues.

List the common activation, loss, and optimization functions for neural networks.

Activation functions: 1. Threshold 2. Sigmoid 3. Rectifier (ReLU) 4. Hyperbolic tangent (tanh) Loss: 1. Cross-entropy Optimization: 1. Gradient descent
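For reference, the listed activation functions and the cross-entropy loss can be written out in NumPy. This is a minimal sketch: the threshold at zero and the small `eps` used to avoid log(0) are assumptions of the example.

```python
import numpy as np

def threshold(z):
    """Step function: 1 where z >= 0, otherwise 0."""
    return np.where(z >= 0, 1.0, 0.0)

def sigmoid(z):
    """Squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Rectifier: passes positive inputs, zeroes out negative ones."""
    return np.maximum(0.0, z)

def tanh(z):
    """Hyperbolic tangent: squashes input into (-1, 1)."""
    return np.tanh(z)

def cross_entropy(y_true, y_pred, eps=1e-12):
    """-sum(y * log(y_hat)); predictions are clipped to avoid log(0)."""
    return -np.sum(y_true * np.log(np.clip(y_pred, eps, 1.0)))

z = np.array([-2.0, 0.0, 3.0])
loss = cross_entropy(np.array([0.0, 1.0]), np.array([0.5, 0.5]))
```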

What are the advantages and disadvantages of Naïve Bayes?

Advantages: fast to train; not sensitive to irrelevant features; handles real and discrete data; handles streaming data well. Disadvantages: assumes independence of features.

What are the three main loss functions that can be used for GANs? List, then explain them with equations.

Basic: ŷ = F(x). Classification (cross-entropy): L(ŷ, y) = −SUM over i of (y_i log(ŷ_i)). Least-squares regression: L(ŷ, y) = ||ŷ − y||^2

Define convolution neural networks, then list and explain the two primary layers of CNNs?

CNN - A class of deep ANNs applied to analyzing visual imagery. 1. Convolution layer: applies a filter to an input to produce an activation map. 2. Pooling layer: reduces the size of the feature map.

List challenges of object recognition.

Camera position, lighting, shapes, scaling, deformation, occlusion, background clutter, intra-class variations

What are two neural network tasks? Define them, then list examples for each task

Classification: Given an input x, map it to one of k classes or categories. Used for image classification or semantic segmentation. Regression: Given an input x, map it to a real number. Used in depth prediction and bounding box estimation.

What is the cost function of SVMs? Draw the two diagrams that show the decision boundary of the binary classification?

Cost(hθ(x), y) = max(0, 1 − θ^T x) if y = 1, or max(0, 1 + θ^T x) if y = 0. J(θ) = C * SUM from i = 1 to m of [y^(i) cost1(θ^T x^(i)) + (1 − y^(i)) cost0(θ^T x^(i))] + (1/2) SUM from j = 1 to n of (θ_j)^2, where m = # of samples and n = # of features. Decision boundary graphs: for y^(i) = 1, a negatively sloped line that reaches zero at θ^T x = 1; for y^(i) = 0, a positively sloped line that starts rising at θ^T x = −1.
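The two cost pieces and the full objective translate directly into code. This sketch assumes the first column of `X` is a column of ones, so that `theta[0]` is the intercept and is left out of the regularization sum (which runs from j = 1); the tiny data set is made up.

```python
import numpy as np

def cost1(z):
    """Cost for y = 1: zero once z >= 1, linear below that."""
    return np.maximum(0.0, 1.0 - z)

def cost0(z):
    """Cost for y = 0: zero once z <= -1, linear above that."""
    return np.maximum(0.0, 1.0 + z)

def svm_objective(theta, X, y, C=1.0):
    """J(theta) = C * sum of per-sample costs + 0.5 * sum_{j>=1} theta_j^2."""
    z = X @ theta
    data_term = np.sum(y * cost1(z) + (1 - y) * cost0(z))
    reg_term = 0.5 * np.sum(theta[1:] ** 2)  # intercept theta[0] is not penalized
    return C * data_term + reg_term

# Two samples, each with a leading 1 for the intercept (made-up values).
X = np.array([[1.0, 2.0],
              [1.0, -2.0]])
y = np.array([1.0, 0.0])
theta = np.array([0.0, 1.0])

# Both samples lie outside the margin, so only the penalty term remains.
J = svm_objective(theta, X, y, C=1.0)
```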

List and define the data splits, then explain how to reduce the effect of underfitting and overfitting issues?

Data splitting divides the data into training, validation, and test sets. Training set: usually comprises most of the data set; used to minimize the loss/cost/error function. Validation set: used to choose the best hyperparameters. Test set: used to test the model after training. Reduce underfitting by training longer, including more layers or more parameters per layer, or changing the architecture. Reduce overfitting by including more training data, applying regularization, or changing the architecture.

What are the Haar-like Features? List the main five Features

Digital image features used in object detection. They are adjacent rectangular regions at specific locations in the detection window. (x = black, o = white) Edge features: oo xx, or xx oo. Line features: oxo, or o x o. Four-rectangle feature: xo ox.

What are the three neural network regularization strategies? Explain them

Dropout - Ignoring neurons or outputs during training. Early stopping - Stopping training early, usually to prevent overfitting in large datasets. Parameter norm penalties - J(θ)reg = J(θ) + alpha * Omega(θ), where alpha is a hyperparameter that weights the relative contribution of the norm penalty to the value of the loss function. Changing alpha changes how strongly the model is regularized; alpha = 0 means no regularization.

What is the solution to the issue of Naïve Bayes classifier that assumes features are independent?

Naive Bayes estimates p(d | cj) = p(d1 | cj) p(d2 | cj) p(d3 | cj) ... p(dn | cj), i.e. it treats the features as independent given the class. (Unsure whether this is the expected answer.)

List the three common approaches of object recognition, then define one of them

Feature matching, spatial reasoning, window classification. Spatial reasoning - The position of every part depends on the positions of all the other parts.

What are the main types of CV?

Image classification, object detection, object tracking, semantic segmentation, instance segmentation

Explain the working mechanism of one CV type, then list all the class topics we have covered so far.

Image classification: Given a set of images that are all labeled with a single category, the computer should be able to predict the categories of a set of test images and measure the accuracy of the predictions.

What is the goal of using the Integral image? Explain the working mechanism of how to generate the Integral image from the source image

It speeds up processing: each pixel of the integral image stores the sum of all source pixels above and to the left of it (inclusive), so the sum over any rectangular region takes only four lookups. To build the integral image, take each pixel of the original image as the bottom-right corner of a rectangle that starts at the top-left of the image, and add up everything inside that rectangle, including its edges. To get the sum over an ROI, take the integral-image value at the ROI's bottom-right corner, the values one step outside its top-right and bottom-left corners, and the value diagonally outside its top-left corner; subtract the two side values from the bottom-right value and add back the diagonal one. Ex: for a 4x3 ROI on a 10x8 image with corners at (6,3), (8,3), (6,6), (8,6), we read the values at (8,6), (8,2), (5,6), (5,2) and compute (8,6) − (8,2) − (5,6) + (5,2).
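The ROI example can be reproduced in NumPy. This is a sketch that assumes (x, y) coordinates with x as the column index and y as the row index, so the corner (8, 6) is `ii[6, 8]`; the function names and the test image are made up.

```python
import numpy as np

def integral_image(img):
    """Each entry holds the sum of all pixels above and to the left, inclusive."""
    return img.cumsum(axis=0).cumsum(axis=1)

def region_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] using at most four lookups."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]      # strip above the ROI
    if left > 0:
        total -= ii[bottom, left - 1]    # strip to the left of the ROI
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]   # corner that was subtracted twice
    return total

# A 10-wide, 8-tall test image with made-up pixel values.
img = np.arange(80).reshape(8, 10)
ii = integral_image(img)

# ROI with x from 6 to 8 and y from 3 to 6:
# ii(8,6) - ii(8,2) - ii(5,6) + ii(5,2) in (x, y) notation.
roi = region_sum(ii, top=3, left=6, bottom=6, right=8)
direct = img[3:7, 6:9].sum()  # brute-force sum for comparison
```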

What are the Laplacian of Gaussian and Derivative of Gaussian? What is the difference between them? Which one is better than the other?

Laplacian of Gaussian - A combination of Laplace filtering with Gaussian filtering. Derivative of Gaussian - A combination of derivative filtering with Gaussian filtering. Difference: the Laplacian of Gaussian marks edges with zero crossings, which localize edges more accurately but are not very convenient to work with. The derivative of Gaussian marks edges with peaks, so there are no zero crossings to find, which makes it the more convenient of the two.

What are the differences between logistic regression and SVMs?

Logistic regression is meant only for classification, while SVMs can do both classification and regression. An SVM finds a margin to reduce error and increase confidence, while logistic regression does not.

What is the rule of matrix-vector multiplication? Support your answer with a visual example.

The shapes must be MxN * Nx1. The result will be Mx1.

What is the rule of matrix-matrix multiplication? Support your answer with a visual example.

The shapes must be MxN * NxE. The result will be MxE.

List and explain the matters and types of recognition

Matters: Learning techniques - choice of classifier. Representation - low level: SIFT; mid level: bag of words, sliding window; high level: contextual dependence, deep features. Data - more is always better; annotation is the hard part. Types: Instance recognition - recognizing a known object from a new viewpoint, with clutter and occlusion. Category recognition - a harder problem, even for humans.

What are the main differences between probability and statistics?

Probability: Predicting the likelihood of future events Statistics: The analysis of the frequency of past events

What is the linear shift-invariant image filtering? List the two primary filters, then explain the working mechanism of one of them

Replace each pixel by a linear combination of its neighbors (and possibly itself). The combination is determined by the filter's kernel. Box filter - Replaces each pixel with the local average; has a smoothing effect. Gaussian filter - Kernel values are sampled from the 2D Gaussian function, so the weight falls off with distance from the center pixel. Commonly used for smoothing before edge detection.

Define the bag-of-words/features (BOW), then list the standard BOW pipeline for the image classification

Represent a data item (document, texture, image) as a histogram over features. 1) Dictionary learning - Learn visual words using clustering 2) Encode - Build BOW vectors for each image 3) Classify - Train and test on the data using the BOW vectors

Draw and then explain the support vector machine's main elements, such as support vectors, hyperplanes, and margin.

Support vectors: The data points closest to the hyperplane, which influence its position and orientation. Hyperplanes: Decision boundaries that help classify data points. Margin: The distance between the hyperplane and the nearest data points of each class. A larger margin gives better confidence.

What does the separable filter mean? What is the main difference between the separable and non-separable filters?

A separable filter can be written as the product of a column and a row. Under the same computational constraints, this turns a filter that is theoretical and too expensive into a practical one. For an MxM image and an NxN filter, the cost of convolution with a non-separable filter is M^2 x N^2, while the cost with a separable filter is 2 x N x M^2.
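The column-times-row property can be checked numerically with a 3x3 box filter. This is a sketch with a hypothetical helper `filter_valid` (no padding, no kernel flip) and a random test image; it shows that two cheap 1D passes give the same result as one 2D pass.

```python
import numpy as np

def filter_valid(image, kernel):
    """Slide a 2D kernel over the image with no padding and stride 1."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 box filter is the product of a 3x1 column and a 1x3 row.
col = np.ones((3, 1)) / 3.0
row = np.ones((1, 3)) / 3.0
box = col @ row  # every entry equals 1/9

rng = np.random.default_rng(0)
image = rng.random((6, 6))  # made-up test image

full = filter_valid(image, box)                         # one 2D pass: 9 multiplies per output
two_pass = filter_valid(filter_valid(image, col), row)  # two 1D passes: 3 + 3 multiplies
same = bool(np.allclose(full, two_pass))
```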

What do we mean by 'semantic vision'?

The meaning and interpretation of details from an image. Being able to detect and understand objects present in the image.

Define the image synthesis, then list the types of synthesized images.

The process of creating new images from some form of image descriptions. Types: 1. Test patterns (scenes with two-dimensional geometric shapes) 2. image noise (images containing random pixel values) 3. Computer Graphics (scenes of images based on geometric shapes)

What is the Bayes theorem? Write the main formula, then explain each term of that formula.

The theorem for determining the probability of an event given that another event has occurred. p(a | b) = p(b | a) p(a) / p(b). p(a | b) - the probability of event a occurring given that event b has occurred. p(a) - the probability of event a occurring. p(b) - the probability of event b occurring. p(b | a) - the probability of event b occurring given that event a has occurred.

How to pick hyperparameters of KNNs? Draw an illustration diagram for the cross-validation.

Split the data into train, validation, and test sets. Train: fit the original model. Validate: find the hyperparameters. Test: understand generalizability.

What are image edges? How do you differentiate a discrete image from other discrete signals?

Very sharp discontinuities in intensity. To differentiate a discrete image, use finite differences.

How to initialize the neural network parameters? What are the three possible optimization stopping conditions of neural networks?

Initialize parameters: 1. Weights are initialized by randomly sampling from a standard normal distribution 2. Biases are initialized to 0. Stopping conditions: 1. Number of iterations - stop after the NN has performed a set number of training iterations 2. Change in θ - stop if |θnew − θold| < threshold 3. Change in J(θ) - stop if |J(θnew) − J(θold)| < threshold
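The initialization rule and the three stopping conditions can be sketched with plain gradient descent on a toy cost. Everything specific here is an assumption for illustration: the quadratic cost with its minimum at 3, the learning rate, and the threshold values.

```python
import numpy as np

def gradient_descent(grad, cost, theta0, lr=0.1, max_iters=1000,
                     tol_theta=1e-8, tol_cost=1e-12):
    """Returns (theta, iterations used, reason for stopping)."""
    theta = np.asarray(theta0, dtype=float)
    for it in range(1, max_iters + 1):
        new_theta = theta - lr * grad(theta)
        small_step = np.linalg.norm(new_theta - theta) < tol_theta
        small_gain = abs(cost(new_theta) - cost(theta)) < tol_cost
        theta = new_theta
        if small_step:   # condition 2: theta barely changed
            return theta, it, "theta change below threshold"
        if small_gain:   # condition 3: J(theta) barely changed
            return theta, it, "cost change below threshold"
    # condition 1: the iteration budget ran out
    return theta, max_iters, "iteration limit reached"

# Toy cost J(theta) = sum((theta - 3)^2), minimized at theta = 3.
def cost(theta):
    return np.sum((theta - 3.0) ** 2)

def grad(theta):
    return 2.0 * (theta - 3.0)

rng = np.random.default_rng(0)
weights = rng.standard_normal(2)  # weights: sampled from a standard normal
bias = 0.0                        # biases start at zero (unused by this toy cost)

theta, iterations, reason = gradient_descent(grad, cost, weights)
```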

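How would you represent the following example in a matrix multiplication format? House sizes: 2104, 1416, 1534, 852. Hypothesis: hθ(x) = −40 + 0.25x

| 1  2104 |
| 1  1416 |   *   |  -40 |
| 1  1534 |       | 0.25 |
| 1   852 |
4x2 * 2x1 = 4x1 (one prediction per house). The column of ones multiplies the intercept.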

