Neural Networks and Deep Learning
Perceptron Model
-Coined by Rosenblatt in 1958
-Perceptron = an artificial neuron: an algorithm and a simple model of a biological neuron in an artificial neural network
-The basis of neural networks
Deep Learning (disadvantages)
-Computationally intensive and expensive (e.g., requires GPUs for parallel processing)
-Considered "opaque" or a "black box": difficult to deduce/understand how an output was reached; less explainable
-Requires large amounts of data to train (e.g., at least 100 examples for image tasks)
Deep Learning (advantages)
-Features are typically selected automatically
-Considered robust; can handle natural variation in data
-Applicable to different data types (images, text, numeric, etc.)
-Scalable: can perform computation over large volumes of data in parallel (requires GPUs), saving time
Anatomy of Perceptron model
-Input signals (x1, x2, ... xn): sourced from the input data
-Weights (w1, w2, ... wn): multiplied by the input signals to give each input a different 'importance level'
-Bias (b): offsets the summed input signals (added to the total sum, not to individual inputs)
-Activation function (f, or a): decides whether the neuron is activated
-Output (y): y > 0 means on, y = 0 means off
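A minimal Python/NumPy sketch of this anatomy (the numbers and the step activation are illustrative assumptions, not from the source):

```python
import numpy as np

def perceptron(x, w, b):
    total = np.dot(w, x) + b      # weighted sum of input signals, plus bias on the TOTAL
    return 1 if total > 0 else 0  # step activation: total > 0 -> on, otherwise off

x = np.array([2.0, 1.0])          # input signals x1, x2
w = np.array([0.5, -0.3])         # weights: the 'importance level' of each input
b = 0.1                           # bias: offsets the summed inputs
print(perceptron(x, w, b))        # 0.5*2 + (-0.3)*1 + 0.1 = 0.8 > 0 -> 1 (on)
```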
Biological Neurons
A nerve cell that transmits electrical signals or impulses
-Dendrites: receive input signals from nearby neurons
-If the signal detected at the axon hillock is large enough to pass a certain threshold, the neuron will "fire" (generating an action potential), indicating the neuron is activated
-Neural network = a collection of neurons
Deep Learning (Summary)
ANN:
-Tabular or text data as input
-Inputs and outputs assumed independent
-No parameter/weight sharing
-No recurrent connections
CNN:
-Image data as input
-Inputs and outputs assumed independent
-Parameter/weight sharing
-No recurrent connections
RNN:
-Sequence data as input
-Outputs are dependent on prior elements within the sequence
-Parameter/weight sharing
-Recurrent connections
ANN (definition)
An interconnected group of nodes, inspired by a simplification of neurons in a brain
AI Hierarchy
Artificial Intelligence -> Machine Learning -> Neural Networks -> Deep Learning
Deep Neural Networks
Artificial neural networks with multiple hidden layers between the input layer and the output layer
-Implement "back-propagation" = a procedure to adjust/fine-tune the weights in a neural network based on the error rate from the previous iteration
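A compact sketch of that idea, assuming a tiny 2-2-1 network with sigmoid activations and squared error (the XOR data, layer sizes, and learning rate are illustrative, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # inputs
t = np.array([[0.], [1.], [1.], [0.]])                  # targets (XOR)

W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)           # input -> hidden
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)           # hidden -> output
lr = 0.5                                                # learning rate

for _ in range(10_000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)              # hidden activations
    y = sigmoid(h @ W2 + b2)              # network output
    # Backward pass: error at the output, propagated back to earlier weights
    d_out = (y - t) * y * (1 - y)         # error signal at the output layer
    d_hid = (d_out @ W2.T) * h * (1 - h)  # error signal at the hidden layer
    # Adjust/fine-tune each weight against its error gradient
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_hid); b1 -= lr * d_hid.sum(axis=0)
```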
Convolutional Neural Networks (definition)
A CNN is a deep neural network designed to process structured array data (such as images) and able to learn complex objects and patterns
CNN (examples of application)
Computer vision
Image classification
Object recognition
Face recognition
Types of deep neural networks:
Convolutional neural networks (CNN)
Recurrent neural networks (RNN)
ANN is also known as:
Feedforward neural networks (FFNN)
Multilayer perceptrons (MLP)
'Vanilla' neural networks
Artificial Neurons
First introduced by McCulloch and Pitts in 1943
-Each neuron is characterized as being 'on' or 'off' (mimicking action potential firing)
-Regulated by the activation function (analogous to the axon hillock)
Organization of Layers (ANN)
Input layer: the first layer
Hidden layers: in between, neither input nor output
Output layer: the final layer
-The activation function is computed in each and every node
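A minimal sketch of this layer organization (the layer sizes and the ReLU activation are illustrative assumptions):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

rng = np.random.default_rng(1)
layer_sizes = [4, 3, 3, 2]  # input layer, two hidden layers, output layer
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes, layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

a = rng.normal(size=4)      # input layer: raw features
for W, b in zip(weights, biases):
    a = relu(a @ W + b)     # activation computed at every node of every layer
print(a)                    # output layer activations
```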
CNN (anatomy)
Input layer: usually an image
Stacked hidden layers, typically consisting of:
-Convolution layer
-Activation layer (e.g., ReLU)
-Pooling layer
-Fully connected layer
-Final activation layer (softmax)
Output layer: a probability distribution
-Softmax outputs probabilities
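A minimal sketch of this stack using PyTorch (the image size, channel counts, and class count are illustrative assumptions, not from the source):

```python
import torch
from torch import nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # convolution layer
    nn.ReLU(),                                   # activation layer
    nn.MaxPool2d(2),                             # pooling layer
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),                 # fully connected layer
    nn.Softmax(dim=1),                           # final activation layer
)

image = torch.randn(1, 3, 32, 32)  # input layer: one 32x32 RGB image
probs = cnn(image)                 # output: probability distribution over 10 classes
print(probs.sum())                 # softmax probabilities sum to 1
```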
Examples of RNN application:
Language translation
Natural language processing
Speech recognition (e.g., Siri, Alexa, Google Translate, voice search)
Image captioning
Deep Learning
Learning algorithms to train deep neural networks
ANN learning algorithms use principles of:
Linear regression:
-Each neuron/node computes a linear combination of inputs (x), weights (w), and a bias (b)
Logistic regression:
-The activation function determines whether the output is on or off
Gradient descent:
-The cost function acts as a barometer, gauging the accuracy on each iteration
-Goal: adjust the weights to yield the smallest possible error (minimize the cost function)
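A short sketch tying the three principles together on a single neuron (the data, learning rate, and iteration count are illustrative assumptions):

```python
import numpy as np

X = np.array([[0.5, 1.2], [1.0, -0.7], [-1.5, 0.3], [2.0, 2.0]])
y = np.array([1., 0., 0., 1.])

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(1000):
    z = X @ w + b                    # linear regression: weighted sum of inputs plus bias
    p = 1 / (1 + np.exp(-z))         # logistic activation: squashes z into (0, 1)
    grad_w = X.T @ (p - y) / len(y)  # gradient of the (cross-entropy) cost function
    grad_b = (p - y).mean()
    w -= lr * grad_w                 # gradient descent: step toward the smallest error
    b -= lr * grad_b
```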
CNN (architecture)
A multi-layered feed-forward neural network
-Stacks MANY hidden layers on top of each other, in sequence
-The sequential design allows a CNN to learn hierarchical features
Hebbian principle
Neurons that fire together wire together
ANN hyperparameters
Number of layers
Number of iterations
CNN (important milestone: LeNet-5)
Published by Yann LeCun in 1998 (now VP & Chief AI Scientist at Meta)
-Used a CNN trained with back-propagation to recognize handwritten numbers
-Successfully applied it to identifying handwritten zip code numbers provided by the US Postal Service
-Became known as "LeNet"
Artificial Neural Network (ANN)
Putting together multiple artificial neurons to form a network that acts as a learning algorithm for making decisions (mimicking how the brain works)
-Artificial neurons are modeled after biological neurons
Recurrent Neural Networks (definition)
An RNN is a deep neural network that uses sequential or time-series data to make predictions
-RNNs have loops that persist information
-Recurrent = performs the same task for every element in the sequence, and output elements are dependent on previous elements or states
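A minimal sketch of that recurrence (the sizes are illustrative): the same weights are reused at every time step, and each hidden state depends on the previous one:

```python
import numpy as np

rng = np.random.default_rng(2)
Wx = rng.normal(size=(3, 5))         # input -> hidden (shared across all time steps)
Wh = rng.normal(size=(5, 5))         # hidden -> hidden: the recurrent connection
b = np.zeros(5)

sequence = rng.normal(size=(10, 3))  # 10 time steps, 3 features each
h = np.zeros(5)                      # hidden state: the persisted "memory"
for x_t in sequence:
    h = np.tanh(x_t @ Wx + h @ Wh + b)  # h_t depends on x_t AND h_{t-1}
```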
Goal of ANN
Simulate a multi-layered approach to processing various inputs and make decisions based on those inputs
ANN can be applied to:
Supervised, unsupervised, or reinforcement learning tasks
Classification or regression problems
RNN (anatomy)
The same as a vanilla neural network (ANN), with each node acting as a "memory cell"
-Recurrence makes RNNs prone to the vanishing gradient problem
-Gradients diminish as they are propagated back through time, causing the learning process to degenerate
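A toy illustration of the effect (the per-step factor of 0.9 is an assumption for illustration): back-propagating through T time steps multiplies roughly T per-step gradient factors, so factors below 1 shrink the gradient exponentially:

```python
# Product of per-step gradient factors over sequences of increasing length
per_step = 0.9
for T in (10, 50, 100):
    print(T, per_step ** T)  # ~0.35, ~0.0052, ~0.000027 -> the gradient vanishes
```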
Neural networks are represented as a connected combination of:
Units or nodes
-Each unit or node is a neuron
-Logically organized into one or more layers
The number of hidden layers determine the _____________
depth of neural networks
Equation of Perceptron Model
y = f(w1x1 + w2x2 + ... + wnxn + b)
-The activation function f is applied to the weighted sum
-Add the bias (b) to the TOTAL sum, not to each individual input
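Worked example (illustrative numbers): with x = (2, 1), w = (0.5, -0.3), and b = 0.1, the total sum is 0.5·2 + (-0.3)·1 + 0.1 = 0.8; since 0.8 > 0, the neuron is on.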