Artificial Intelligence Chapter 12


Limitations

1. Local minima 2. Overfitting 3. Vanishing gradient problem 4. Learning from all of the data, including unnecessary information, slows training (speed ↓) and raises cost (cost ↑)

Recurrent Neural Network(RNN)

A neural network in which the previous output is fed back as input

Convolutional Structure

Multiple nodes share a weight; max pooling down-samples ("zooms out") the feature map; CNN

feature/representation learning

The hidden layer transforms the feature vector into a new feature space that is more advantageous for classification

The training set

used for generating the model that represents a hypothesis

Long Short-term memory(LSTM)

used primarily to handle long-term dependencies in very long sequential data (LSTM has a selective memory capability)

Weaknesses of Deep Learning

• It takes a lot of time and money to acquire learning data. • It does not properly interpret patterns that fall outside the range of learning data. • Since the generated model is a black box, it is difficult for humans to interpret or improve the content.

Strengths of Deep Learning

• You do not have to worry about hand-crafting features to describe the data to the computer • It generally produces better results than models built from human-designed features • It does not require advanced mathematical knowledge or programming skills • The abundance of open-source algorithms makes development cheaper and faster

communicate

Interconnected neurons use electrical pulses to "____________" with each other.

Deep Learning Algorithms

• Deep Neural Network (DNN) • Convolutional Neural Network(CNN) • Recurrent Neural Network(RNN)

Characteristics of ANN

- A large number of very simple, neuron-like processing elements - A large number of weighted connections between the elements - Distributed representation of knowledge over the connections - Knowledge is acquired by the network through a learning process

Neural Signal Processing

1. Signals from connected neurons are collected by the dendrites. 2. The cell body sums the incoming signals. 3. When sufficient input is received, the neuron generates an action potential. 4. That action potential is transmitted along the axon to other neurons. 5. If sufficient input is not received, the inputs quickly decay and no action potential is generated. 6. Timing is clearly important.

Fully Connected Structure

All nodes between two adjacent layers are connected; DNN

Imitate Human Brain

Builds learning algorithms that mimic the brain

the training phase

During ______________, the algorithm selects the nodes from the deep neural network to be dropped (activation value set to 0)
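
The dropped-node idea above can be sketched in plain Python. This sketch assumes the common "inverted dropout" variant, in which surviving activations are scaled by 1/(1-p) so the expected activation is unchanged at test time:

```python
import random

def dropout(activations, p, training=True, rng=None):
    """Inverted dropout: zero each activation with probability p,
    scale survivors by 1/(1-p) so the expected value is unchanged."""
    if not training or p == 0.0:
        return list(activations)
    rng = rng or random.Random(0)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0
            for a in activations]

acts = [0.5, 1.2, -0.3, 0.8]
dropped = dropout(acts, p=0.5)  # some activations zeroed, rest doubled
```

At inference time (`training=False`) the activations pass through unchanged, which is why the scaling is done during training.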

Multi-Layer Perceptron Characteristics

Hidden layer, Activation Function, Back propagation method for error correction

Vanishing Gradient problem

In gradient descent and backpropagation, each MLP weight receives an update proportional to the partial derivative of the error function with respect to that weight in each iteration of training; when many layers are stacked, these gradients can become vanishingly small in the early layers, which stalls their learning
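
A quick numerical illustration of why the gradient vanishes: backpropagation multiplies one local activation gradient per layer, and the sigmoid's derivative is at most 0.25, so the product shrinks geometrically with depth. A minimal sketch:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # derivative of the sigmoid: s * (1 - s), maximal (0.25) at x = 0
    s = sigmoid(x)
    return s * (1.0 - s)

# Simulate backprop through 10 sigmoid layers at the *best* case x = 0:
# even then the gradient shrinks by a factor of 0.25 per layer.
grad = 1.0
for _ in range(10):
    grad *= sigmoid_grad(0.0)
# grad is now 0.25 ** 10, under one millionth of its original size
```

This is one reason ReLU (whose derivative is 1 for positive input) displaced the sigmoid in deep networks.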

Minsky's Perceptrons

Pointed out the limitations of the perceptron and suggested overcoming them by using a multi-layered structure

Recurrent Structure

A node's output and state values are fed back as input at the next step; this acts as a simple storage device; RNN
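
A minimal sketch of that feedback loop, using a single-unit recurrent cell with tanh activation (the weights here are illustrative assumptions, not learned values):

```python
import math

def rnn_step(x, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One step of a single-unit recurrent cell: the previous state
    h_prev is fed back in alongside the current input x."""
    return math.tanh(w_x * x + w_h * h_prev + b)

# The state h carries information forward across the sequence,
# acting as the "simple storage device" described above.
h = 0.0
for x in [1.0, 0.0, -1.0]:
    h = rnn_step(x, h)
```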

Deep Learning Characteristics

Separate training per layer, stacking up trained hidden layers; strong against local minima; overfitting can be avoided even with less data; hidden layers can learn to have desired characteristics

the hidden layer

The first layer is called the input layer, the last layer is the output layer, and the middle layers are called ____________

Feed-forward Networks

The input signals are propagated to the output layer in the forward pass, and the weights are optimized iteratively in order to train a model that generalizes to new input data based on the training set provided as input
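
The forward pass above can be sketched for a tiny fully connected network. The 2-2-1 architecture and weight values here are illustrative assumptions; each layer computes a weighted sum plus bias and applies a sigmoid activation:

```python
import math

def forward(x, layers):
    """Forward pass: propagate the input through each layer's weights,
    applying a sigmoid activation at every layer."""
    a = x
    for weights, biases in layers:
        a = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, a)) + b)))
             for row, b in zip(weights, biases)]
    return a

# Tiny 2-2-1 network with assumed illustrative weights
net = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2]),  # hidden layer: 2 units
    ([[1.0, -1.0]], [0.0]),                    # output layer: 1 unit
]
out = forward([1.0, 0.5], net)
```

Training would then adjust `net`'s weights by backpropagating the error from `out`; only the forward direction is shown here.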

Recurrent-feedback Networks

The output of one forward propagation is fed as input for the next iteration of training for sequences of data. Ex) text, speech, or any other form of audio input

Sigmoid function

The output value is between 0 and 1. The function takes the geometric shape of an S, hence the name sigmoid
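
The standard logistic sigmoid, written out directly:

```python
import math

def sigmoid(x):
    # S-shaped curve; output is always strictly between 0 and 1
    return 1.0 / (1.0 + math.exp(-x))

# sigmoid(0) is exactly 0.5, the midpoint of the S
```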

filter

The weight used in the process of creating the feature map

sequential data

While static data is acquired at any one moment and is of fixed length, __________________ is dynamic and usually variable length

The pooling layer

________________ plays a role of reducing the dimension size of the image data by the down-sampling method
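The down-sampling it performs can be sketched as 2x2 max pooling, the most common choice (stride 2 and even dimensions are assumptions of this sketch):

```python
def max_pool_2x2(image):
    """Down-sample a 2-D grid by keeping the max of each 2x2 block,
    halving both dimensions."""
    h, w = len(image), len(image[0])
    return [[max(image[i][j], image[i][j + 1],
                 image[i + 1][j], image[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

img = [[1, 2, 5, 6],
       [3, 4, 7, 8],
       [9, 1, 2, 3],
       [0, 2, 4, 1]]
pooled = max_pool_2x2(img)  # [[4, 8], [9, 4]]
```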

Artificial Neural Networks

a computational system inspired by the structure, processing method, and learning ability of a biological brain

A binary classifier

a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class

DBN(deep belief network)

a generative graphical model, composed of multiple layers of latent variables ("hidden units"), with connections between the layers

Deep Learning

a set of algorithms in machine learning that attempt to learn in multiple levels, corresponding to different levels of abstraction. It typically uses artificial neural networks

The Gradient Descent method

a technique to observe the descending direction of the slope in order to find the position of the minimum value of the function and repeat the examination while moving little by little
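That "move little by little downhill" procedure, sketched for a one-variable function whose minimum is known:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step in the descending direction of the slope,
    moving little by little toward the function's minimum."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# f(x) = (x - 3)^2 has its minimum at x = 3, and f'(x) = 2(x - 3)
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Each step moves against the gradient; the learning rate `lr` controls how little-by-little the movement is.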

Perceptron

an algorithm for supervised learning of binary classifiers; introduces concepts such as nodes, weights, and layers and learning
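A minimal sketch of the perceptron learning rule on a linearly separable problem (the AND gate); the learning rate and epoch count are illustrative choices:

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Perceptron learning rule for a binary classifier on 2-D inputs:
    nudge the weights by the error after each sample."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# AND is linearly separable, so a single perceptron can learn it
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)

def predict(x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
```

As Minsky showed, a single perceptron cannot learn problems that are not linearly separable (e.g. XOR); that is what motivates the multi-layered structures above.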

Stochastic gradient descent(SGD)

an iterative technique that can distribute the work units and get us to the global minimum in a computationally optimal way
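
A minimal sketch of what makes it *stochastic*: weights are updated after each randomly ordered sample rather than after a full pass over the dataset. Fitting a line y ≈ w·x + b here is an illustrative choice:

```python
import random

def sgd_linear(samples, lr=0.05, epochs=200, seed=0):
    """Stochastic gradient descent for y ≈ w*x + b: shuffle the data
    each epoch and update after every single sample."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            err = (w * x + b) - y          # per-sample error
            w -= lr * err * x              # step on this one sample only
            b -= lr * err
    return w, b

# data generated from the line y = 2x + 1
data = [(x, 2 * x + 1) for x in [-2, -1, 0, 1, 2]]
w, b = sgd_linear(data)
```

The noisy per-sample steps are what help SGD escape the shallow local minima discussed above.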

Human brain

its anatomical connections can be modeled with graph theory

The receptors

collect information from the environment

Deep Neural Network(DNN)

commonly consists of at least two hidden layers between input and output - Multi-Layer Perceptron(MLP) - The number of layers can be very high

Over Fitting

excessive learning of the training data (usually occurs when data is insufficient) - dropout or more data is required

The effectors

generate interactions with the environment - e.g. activate muscles

Convolution Neural Network(CNN)

networks for data that have a known grid-like topology, such as image data, which is essentially a 2-dimensional grid of pixels

The circulating edge

responsible for transmitting the information generated at time t-1 to the network at time t

Local Minima

learning stops even though it has not reached the global minimum - the Stochastic Gradient Descent (SGD) method is required

Convolution

the process of obtaining a specific feature in the local domain
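A minimal sketch of that local-feature extraction: slide a small weight kernel (the "filter" defined above) over the image, producing one feature-map value per local region. The vertical-edge kernel used here is an illustrative choice:

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as used in CNNs):
    each output value is the weighted sum over one local region."""
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    return [[sum(image[i + u][j + v] * kernel[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(ow)]
            for i in range(oh)]

# A vertical-edge filter responds where intensity changes left-to-right
edge = [[1, -1],
        [1, -1]]
img = [[0, 0, 5, 5],
       [0, 0, 5, 5],
       [0, 0, 5, 5]]
fmap = conv2d(img, edge)  # nonzero only at the 0-to-5 boundary
```

Because the same kernel is reused at every position, this is also the weight sharing mentioned under "Convolutional Structure".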

Rectified Linear Unit

the simplest, most computationally efficient, and hence most popular activation function for ANNs. The output value is 0 for all negative input and equal to the input for positive input
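
The definition translates directly into code:

```python
def relu(x):
    # 0 for all negative input, the input itself for positive input
    return x if x > 0 else 0.0
```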

Neuron

the smallest information processing unit in the brain

The validation set

used to test the efficiency of the hypothesis function or the trained model

