Data Science - Deep Learning


Backpropagation

A common method of training a neural net in which the initial system output is compared to the desired output, and the system is adjusted until the difference between the two is minimized.
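
A minimal sketch of the idea, assuming a single sigmoid neuron with a squared-error loss; the inputs, target, weights, and learning rate here are all made up for illustration:

```python
import numpy as np

# One training step: the output is compared to the desired output, the error
# is propagated backwards with the chain rule, and the weights are adjusted
# to shrink the difference between the two.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0])          # one input example
target = 1.0                       # desired output
w, b = np.array([0.1, 0.2]), 0.0   # weights and bias
lr = 0.5                           # learning rate

out = sigmoid(w @ x + b)           # forward pass
d_out = 2 * (out - target)         # d(loss)/d(output)
d_z = d_out * out * (1 - out)      # chain rule through the sigmoid
w -= lr * d_z * x                  # gradient step on the weights
b -= lr * d_z                      # ... and on the bias
```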

Stochastic Gradient Descent (SGD)

A gradient descent algorithm in which the batch size is one. In other words, SGD relies on a single example chosen uniformly at random from a data set to calculate an estimate of the gradient at each step.
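
A hedged sketch of one SGD step on a toy linear-regression problem; the data and parameter names (X, y, w, lr) are made up for illustration:

```python
import numpy as np

# Toy data: 100 examples, 3 features, for a linear model y = X @ w_true + noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)

w = np.zeros(3)        # model parameters
lr = 0.1               # learning rate

# One SGD step: pick a single example uniformly at random, estimate the
# gradient of the squared error from that example alone, and step against it.
i = rng.integers(len(X))
pred = X[i] @ w
grad = 2 * (pred - y[i]) * X[i]   # d/dw of (pred - y_i)^2
w -= lr * grad
```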

natural language processing (NLP)

A branch of artificial intelligence that converts human language (structured or unstructured) into data that computer systems can interpret and manipulate.

Convolutional neural networks (CNNs)

In machine learning, a convolutional neural network (CNN, or ConvNet) is a class of deep, feed-forward artificial neural networks that has been successfully applied to analyzing visual imagery.

Forward propagation

In neural networks, the process of computing the subsequent layers of the network from the input forward. Each layer depends on the calculations done in the layer before it.
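
A small illustrative sketch: a two-layer network in which each layer's activations are computed from the previous layer's output (the weights, sizes, and input here are arbitrary):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Weights and biases for a 3-input -> 4-hidden -> 2-output network.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

x = np.array([0.5, -1.0, 2.0])   # one input example

# Forward propagation: each layer uses only the previous layer's output.
h = relu(W1 @ x + b1)            # hidden layer
out = W2 @ h + b2                # output layer
print(out)
```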

Pooling

It is common to periodically insert pooling layers between the convolution layers. This is done mainly to reduce the number of parameters and prevent over-fitting. The most common type is a pooling layer with filter size (2, 2) using the MAX operation: it takes the maximum of each 2x2 block of the input.
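
A minimal sketch of 2x2 max pooling on a single-channel image in plain NumPy; the helper name max_pool_2x2 is just illustrative:

```python
import numpy as np

def max_pool_2x2(img):
    """2x2 max pooling with stride 2 on a single-channel image (H and W even)."""
    h, w = img.shape
    # View the image as a grid of non-overlapping 2x2 blocks, then take each block's max.
    blocks = img.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

img = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(img))   # 2x2 output, each entry the max of one 2x2 block
```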

Padding

Padding refers to adding an extra border of zeros around the image so that the output of a convolution has the same size as the input. This is known as same padding.
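
An illustrative sketch of same padding with NumPy: for a 3x3 filter, one ring of zeros keeps the convolution output the same size as the input (the sizes here are arbitrary):

```python
import numpy as np

img = np.ones((5, 5))          # 5x5 input image
k = 3                          # 3x3 convolution filter
pad = (k - 1) // 2             # zeros needed on each side for "same" output size

# Zero padding: a 3x3 convolution over the padded 7x7 image
# produces a 5x5 output, matching the input size.
padded = np.pad(img, pad_width=pad, mode="constant", constant_values=0)
print(padded.shape)            # (7, 7)
```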

Exploding Gradient Problem

The Exploding Gradient Problem is the opposite of the Vanishing Gradient Problem. In deep neural networks, gradients may explode during backpropagation, resulting in numeric overflow. A common technique for dealing with exploding gradients is gradient clipping.
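
A minimal sketch of gradient clipping by L2 norm; clip_by_norm and the threshold of 5.0 are illustrative choices, not a reference implementation:

```python
import numpy as np

def clip_by_norm(grad, max_norm=5.0):
    """Rescale a gradient so its L2 norm never exceeds max_norm."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([30.0, -40.0])    # an "exploded" gradient with norm 50
print(clip_by_norm(g))         # rescaled to norm 5, same direction
```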

Batches

While training a neural network, instead of feeding the entire input in one go, we randomly divide the input into several chunks of equal size. Training on batches makes the model generalize better than a model built by feeding the entire data set to the network in one go.
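
A small sketch of splitting a shuffled data set into mini-batches; the data, batch_size, and variable names are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))      # 1000 examples, 10 features
y = rng.integers(0, 2, size=1000)    # binary labels

batch_size = 32
perm = rng.permutation(len(X))       # shuffle so each batch is a random chunk

# Split the shuffled data into equally sized mini-batches
# (the last batch may be smaller if the sizes don't divide evenly).
for start in range(0, len(X), batch_size):
    idx = perm[start:start + batch_size]
    X_batch, y_batch = X[idx], y[idx]
    # ... one training step on (X_batch, y_batch) would go here ...
```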

Multilayer Perceptron (MLP)

a Feedforward Neural Network with multiple fully connected layers that use nonlinear activation functions to deal with data that is not linearly separable. An MLP is the most basic form of a multilayer Neural Network, or a deep Neural Network if it has more than 2 layers.

Activation Function

a function that maps a neuron's total input to its output signal
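
Two common examples, sigmoid and ReLU, sketched in NumPy; the test values are arbitrary:

```python
import numpy as np

# Each function maps a neuron's total input (weighted sum plus bias) to its output signal.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z))   # squashes into (0, 1)
print(relu(z))      # zero for negative input, identity for positive
```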

Dropout

a regularization technique which prevents over-fitting of the network. As the name suggests, during training a certain number of neurons in the hidden layer are randomly dropped. This means that training happens on several architectures of the neural network, built from different combinations of the neurons. You can think of dropout as an ensemble technique, where the output of multiple networks is then used to produce the final output.
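
A minimal sketch of (inverted) dropout applied to a vector of hidden activations; keep_prob = 0.8 is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, keep_prob=0.8):
    """Inverted dropout: randomly zero some neurons during training and
    rescale the survivors so the expected activation stays the same."""
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

h = np.ones(10)            # pretend hidden-layer activations
print(dropout(h))          # roughly 20% of entries zeroed, the rest scaled to 1.25
```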

Cost Function

a mathematical measure of how far the network's outputs are from the desired outputs; training adjusts the parameters to minimize it
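
A small sketch using mean squared error, one common cost function for regression; the prediction and target values are made up:

```python
import numpy as np

def mse(predictions, targets):
    """Mean squared error: the average squared difference between
    the network's outputs and the desired outputs."""
    return np.mean((predictions - targets) ** 2)

preds = np.array([2.5, 0.0, 2.1])
targets = np.array([3.0, -0.5, 2.0])
print(mse(preds, targets))   # a single number that training tries to minimize
```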

Deep Neural Networks

artificial neural networks with many layers of nodes between input and output, often with millions of connections

Vanishing Gradient Problem

as one keeps adding layers to a network, the gradients propagated back to the early layers become vanishingly small, so those layers learn extremely slowly and the network eventually becomes untrainable

Epochs

defined as a single training pass over all batches, in both forward and back propagation. This means 1 epoch is a single forward and backward pass over the entire input data. The number of epochs used to train the network is up to you. More epochs will likely yield higher accuracy, but the network will also take longer to converge; and if the number of epochs is too high, the network might over-fit.

Learning Rate

a hyperparameter that controls the size of each gradient descent step: the parameters are adjusted by the gradient multiplied by the learning rate. Too large a value can make training diverge; too small a value makes convergence slow.

Long Short Term Memory Network (LSTM)

is a recurrent neural network optimized for learning from, and acting on, time-related data that may have undefined or unknown lengths of time between relevant events

Gradient Descent

is an algorithm that improves a hypothesis function with respect to some cost function by repeatedly adjusting its parameters in the direction that decreases the cost.
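
A minimal sketch: gradient descent on the one-dimensional cost f(w) = (w - 3)^2, whose gradient is 2(w - 3); the starting point and learning rate are arbitrary:

```python
# Minimize the cost f(w) = (w - 3)^2 by repeatedly stepping against its gradient.
w = 0.0            # initial guess
lr = 0.1           # learning rate (step size)

for step in range(50):
    grad = 2 * (w - 3.0)   # f'(w)
    w -= lr * grad         # move against the gradient

print(w)           # close to the minimizer w = 3
```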

Recurrent Neuron

one in which the output of the neuron is fed back to it as input for t time steps, so the unrolled neuron looks like t different neurons connected together. The basic advantage of this neuron is that it gives a more generalized output.

Data Augmentation

refers to the addition of new data derived from the given data, which might prove beneficial for prediction. For example, a cat may be easier to see in a dark image if the image is brightened, and a 9 in digit recognition might be slightly tilted or rotated; in that case, rotation would solve the problem and increase the accuracy of the model. By rotating or brightening, we improve the quality of our data.
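
An illustrative sketch of two simple augmentations, flipping and brightening, applied to a random grayscale image; the image and the 0.2 brightness shift are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((28, 28))                 # one grayscale image, values in [0, 1]

# Two simple augmentations: a horizontal flip and a brightness increase.
# Each produces a new, slightly different training example from the same image.
flipped = img[:, ::-1]
brightened = np.clip(img + 0.2, 0.0, 1.0)

augmented_batch = np.stack([img, flipped, brightened])
print(augmented_batch.shape)               # (3, 28, 28): three examples from one
```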

Perceptron

the simplest neural network possible: a computational model of a single neuron
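
A minimal sketch of a single perceptron with a step threshold; the hand-picked weights make it compute logical AND (all values here are illustrative):

```python
import numpy as np

def perceptron(x, w, b):
    """A single neuron: weighted sum of the inputs, then a step threshold."""
    return 1 if np.dot(w, x) + b > 0 else 0

# Weights chosen by hand so the perceptron computes logical AND.
w = np.array([1.0, 1.0])
b = -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x), w, b))
```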

Recurrent Neural Network (RNN)

a neural network with recurrent connections, used for natural language processing and other sequential data

