AI-12


Describe feature characteristics for DL in terms of feature engineering

Each layer can be trained separately, and trained hidden layers can be stacked. The approach is robust against local minima, and overfitting can be avoided even with less data. Hidden layers can learn to have desired characteristics. Layers can be inserted to absorb deformation. Layers can be connected with weight sharing to reduce the number of weights.

What are the solutions for the Vanishing Gradient problem?

1. Layer-wise pre-training: provides initial weights closer to the global optimum and tightens the coupling between layers so that the gradient propagates better. 2. Rectified Linear Unit (ReLU). 3. Long Short-Term Memory (LSTM).

What are the three major limitations of MLP?

1. Local minima: learning stops even though the global minimum has not been reached. Stochastic Gradient Descent (SGD) mitigates this problem: it is an iterative technique whose work units can be distributed, and the noise in its updates can help the search move out of poor local minima, although it does not guarantee reaching the global minimum. 2. Overfitting: excessive learning, which usually occurs when data is insufficient; more data is required. 3. Vanishing gradient problem.

Describe DL model

A common ANN architecture is a network of interconnected layers in which each neuron i in layer k is connected to each neuron j in the next layer k+1 with a weight w_ij. The activations of one layer are transmitted to the next layer by multiplying them with the weight matrix W that contains the weights w_ij. A DNN commonly has at least two hidden layers between input and output. In a Multi-Layer Perceptron (MLP), the number of layers can be very high (hundreds).
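The layer-to-layer transmission described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the 2-3-2-1 layer sizes, the weight values, and the choice of ReLU between layers are all assumptions made for the example. Each matrix row holds the weights arriving at one neuron of the next layer.

```python
def matvec(W, x):
    """Multiply weight matrix W by activation vector x.

    Row j of W holds the weights w_ij arriving at neuron j of the next layer.
    """
    return [sum(w * a for w, a in zip(row, x)) for row in W]


def relu(v):
    """Element-wise ReLU (assumed activation for this sketch)."""
    return [max(0.0, a) for a in v]


def forward(layers, x):
    """Propagate x through the weight matrices, with ReLU after each hidden layer."""
    for index, W in enumerate(layers):
        x = matvec(W, x)
        if index < len(layers) - 1:  # no activation on the output layer
            x = relu(x)
    return x


# Hypothetical network with two hidden layers between input and output.
layers = [
    [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]],  # input (2) -> hidden 1 (3)
    [[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]],     # hidden 1 (3) -> hidden 2 (2)
    [[1.0, 1.0]],                            # hidden 2 (2) -> output (1)
]
output = forward(layers, [1.0, 2.0])
print(output)  # [3.0]
```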

Are ANNs exactly the same as biological neurons in terms of information storage and processing?

It cannot be stated with certainty that ANNs are an exact replica of the brain in terms of memory and processing logic. There is, however, evidence in medical science that the basic building block of the brain is the neuron, and that neurons are interconnected. When an external stimulus is received, or when one is generated by involuntary processes, the neurons react by communicating with each other through the transmission of neural signals. Although the functioning of the brain is very complex and far from fully understood, the theory of ANNs has been evolving, and it has had a great deal of success in modeling very complex problems that were not tractable with traditional programming models. Building machines that possess the cognitive abilities of the human brain will require more research and a much better understanding of biological neural networks.

What is the Backpropagation Method?

Backpropagation is the method used in ANNs to calculate the gradient of the error with respect to each weight; this gradient is then used to update the weights of the network.
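As a hedged illustration of that gradient calculation, the sketch below applies the chain rule to a deliberately tiny network: one input, one sigmoid hidden neuron, and one linear output, with a squared-error loss. The network shape, the loss, and all numeric values are assumptions chosen for the example. The analytic gradients are compared against finite-difference estimates as a sanity check.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w1, w2, x, t):
    """Squared error of the toy network y = w2 * sigmoid(w1 * x)."""
    y = w2 * sigmoid(w1 * x)
    return 0.5 * (y - t) ** 2


def backprop(w1, w2, x, t):
    """Return (dL/dw1, dL/dw2) using the chain rule, output to input."""
    # Forward pass: keep intermediate values for the backward pass.
    h = sigmoid(w1 * x)
    y = w2 * h
    # Backward pass: propagate the error signal layer by layer.
    dy = y - t               # dL/dy
    dw2 = dy * h             # dL/dw2
    dh = dy * w2             # dL/dh
    dz = dh * h * (1.0 - h)  # sigmoid'(z) = h * (1 - h)
    dw1 = dz * x             # dL/dw1
    return dw1, dw2


def numeric_grad(f, w, eps=1e-6):
    """Central finite-difference estimate of df/dw."""
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)


w1, w2, x, t = 0.5, -1.5, 2.0, 1.0  # illustrative values
dw1, dw2 = backprop(w1, w2, x, t)
dw1_check = numeric_grad(lambda w: loss(w, w2, x, t), w1)
dw2_check = numeric_grad(lambda w: loss(w1, w, x, t), w2)
print(dw1, dw1_check)
print(dw2, dw2_check)
```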

Formulate the learning problem of the perceptron

A binary classifier is a function that decides whether or not an input, represented by a vector of numbers, belongs to a specific class. The perceptron is an algorithm for supervised learning of binary classifiers. It introduces concepts such as nodes, weights, layers, and learning. It is a primitive neural network, but it is an important component of modern neural networks, including DL. It has an input layer and an output layer. Since the input layer performs no computation, the perceptron is considered a single-layer structure.
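A minimal sketch of perceptron learning, assuming the classic error-driven update rule (add the error times the input to the weights), zero initial weights, a learning rate of 1, and the logical AND function as a linearly separable training set. None of these choices come from the card itself; they are only there to make the example concrete.

```python
def predict(weights, bias, x):
    """Binary decision: 1 if the weighted sum is positive, otherwise 0."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron learning rule: nudge the weights by lr * error * input."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Logical AND: linearly separable, so the rule converges.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
predictions = [predict(weights, bias, x) for x, _ in and_samples]
print(predictions)  # [0, 0, 0, 1]
```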

Which activation functions are most commonly used in building the ANNs?

Commonly used activation functions in ANNs are: Sigmoid function: the output value is between 0 and 1. The curve has the shape of the letter S, hence the name sigmoid. Tanh function: the hyperbolic tangent function (tanh) is a variation of the sigmoid function that is zero-centered, with output between -1 and 1. Rectified Linear Unit (ReLU): the simplest and computationally cheapest of the three, and hence the most popular activation function for ANNs. The output is 0 for all negative inputs and equal to the input for positive inputs.
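The three functions can be written directly from their definitions. This is a plain-Python sketch using only the standard `math` module; the sample inputs are arbitrary.

```python
import math


def sigmoid(z):
    """S-shaped curve with output strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Zero-centered S-shaped curve with output between -1 and 1."""
    return math.tanh(z)


def relu(z):
    """0 for negative input, the input itself for positive input."""
    return max(0.0, z)


for z in (-2.0, 0.0, 3.0):
    print(z, sigmoid(z), tanh(z), relu(z))
```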

Describe how Convolutional Neural Networks (CNNs) work

Convolutional neural networks are networks specialized for data that has a known grid-like topology, such as image data, which is essentially a 2-dimensional grid of pixels. The name "convolutional" indicates that the network employs a mathematical operation called convolution, which is a specialized kind of linear operation. CNN:
• Automatically extract high-level and relevant spatial/temporal features
• Are state of the art for image classification, and are applied successfully to many domains (text mining, time series analysis, genomics, ...)
• Main types of layers: convolution, pooling, and fully connected
• Procedure of Learning Process
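To make the convolution and pooling layers concrete, here is a small plain-Python sketch. It assumes the convention common in deep learning libraries, where "convolution" slides the kernel without flipping it (strictly, cross-correlation), uses no padding and a stride of 1, and follows with 2x2 max pooling. The image and the vertical-edge kernel are made-up values for illustration.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1, kernel not flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, kernel)  # 4x4 map, strongest where the edge is
pooled = max_pool2d(feature_map)     # 2x2 summary of the feature map
print(feature_map[0])  # [0, 2, 0, 0]
print(pooled)          # [[2, 0], [2, 0]]
```

In a full CNN the pooled maps would be flattened and passed to fully connected layers, and the kernel values would be learned rather than set by hand.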

What are some of the implementation areas of deep neural networks?

Deep learning can be applied in a variety of fields, such as automatic speech recognition, image recognition, natural language processing, medical

What is the difference between machine learning and deep learning?

Deep learning is a specialized implementation of the more general concept of machine learning. Deep learning algorithms model real-world data with a layered approach. Each layer in the network specializes in a specific part of the input signal, starting from low-level, generic features in the initial layers and moving to more abstract, task-specific features in the layers toward the output. These networks train themselves with algorithms such as backpropagation. A key difference between deep learning and traditional machine learning is how performance responds to additional data: deep learning algorithms typically keep improving as training data is added. Deep learning algorithms also typically need more time and computation power to train than traditional machine learning models.

Make a comparison of Human neuron and Artificial neuron

Human: 10 billion neurons, 60 trillion synapses; distributed, nonlinear, parallel processing. Artificial: central processing, arithmetic operations (linearity), sequential processing.

What is the meaning of model overfitting?

Model overfitting occurs when the model memorizes the training data and cannot generalize to new input data. Once this happens, the model is virtually unusable for real-world problems. Overfitting can be identified by a large gap between model accuracy on the training dataset and on the validation dataset.
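The training/validation gap can be shown with a deliberately extreme toy model. Everything here is hypothetical: the "memorizer" simply stores the training pairs in a lookup table and answers 0 for anything it has not seen, and the task (predicting the parity of an integer) is chosen only because the numbers are easy to check.

```python
def accuracy(model, samples):
    """Fraction of samples for which the model returns the correct label."""
    correct = sum(1 for x, y in samples if model(x) == y)
    return correct / len(samples)


def make_memorizer(train_samples):
    """Toy overfit model: remembers training pairs, guesses 0 otherwise."""
    table = dict(train_samples)
    return lambda x: table.get(x, 0)


# Illustrative task: label is the parity of x.
train = [(x, x % 2) for x in range(10)]
validation = [(x, x % 2) for x in range(10, 20)]

model = make_memorizer(train)
train_acc = accuracy(model, train)
val_acc = accuracy(model, validation)
gap = train_acc - val_acc
print(train_acc, val_acc, gap)  # 1.0 0.5 0.5
```

A perfect training score next to a chance-level validation score is the signature described above; in practice the gap is tracked across training runs or epochs rather than computed once.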

What is the major limitation of perceptron?

The perceptron cannot handle problems that are not linearly separable. Minsky and Papert's "Perceptrons" pointed out this limitation and suggested overcoming it with a multi-layered structure.

What are RNNs and where are they used?

RNNs are recurrent neural networks, which use the output of one forward pass through the network as an input for the next iteration. RNNs are used when the inputs are not independent of each other. For example, a language translation model needs to predict the next possible word based on the previous sequence of words. ANNs have great significance

How is RNN used as generative model of machine learning?

A Recurrent Neural Network is a class of ANN in which the connections between nodes form a directed graph along a sequence. This allows it to exhibit temporal dynamic behavior over a time sequence. RNNs can use their internal state (memory) to process sequences of inputs, which makes them applicable to tasks such as speech recognition.
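A minimal sketch of the internal state at work, assuming a single scalar hidden unit with the common update h_t = tanh(w_x * x_t + w_h * h_(t-1) + b). The weights and the input sequences are invented for the example. Feeding the same values in a different order leaves the network in a different final state, which is what lets an RNN distinguish sequences.

```python
import math


def rnn_step(x_t, h_prev, w_x=1.0, w_h=0.5, b=0.0):
    """One recurrent update: the new state depends on the input and the old state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(sequence, h0=0.0):
    """Process a sequence one element at a time, carrying the hidden state forward."""
    h = h0
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h)
        states.append(h)
    return states


early = run_rnn([1.0, 0.0, 0.0])  # signal at the start, then fading memory
late = run_rnn([0.0, 0.0, 1.0])   # signal only at the last step
print(early)
print(late)
```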

What are the basic building blocks of an ANN?

An ANN consists of several layers. Input from the environment (the independent variables) is consumed by the input layer. A final layer, called the output layer, emits the output of the model based on its generalization of the training data. Between the input and output layers there can be one or many layers that process the signals; these are called hidden layers. The nodes in adjacent layers are connected by synapses, or connectors. Training assigns each connector a weight chosen to reduce the value of the cost function, which measures the error of the neural network.

Formulate the learning problem of the MLP

The MLP learning algorithm computes the weights U1 and U2 from the given training samples; that is, it estimates the parameter set θ = {U1, U2} in the case of a two-layer perceptron. Optimization target: the error function.

What is the need for nonlinearity within an ANN?

Neural networks are mathematical models in which the inputs are multiplied by the synapse weights, and the sum of these products over all incoming connections gives the value at a node. If the activation functions introduce no nonlinearity, a multi-layer network gains nothing from its depth: a composition of linear layers is itself linear, so the whole model can be represented by a single layer. Such a model can only handle very simple problems that linear modeling can solve. To model more complex, real-world problems, we need multiple layers, and hence nonlinearity in the activation functions.
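The collapse of stacked linear layers can be checked numerically. In this sketch the matrices W1 and W2 and the input x are arbitrary integers chosen so that the comparison is exact; applying the two layers in turn gives the same result as applying the single merged matrix W2·W1.

```python
def matvec(W, x):
    """Matrix-vector product."""
    return [sum(w * a for w, a in zip(row, x)) for row in W]


def matmul(A, B):
    """Matrix-matrix product A @ B."""
    columns = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in A]


W1 = [[1, 2], [3, 4], [5, 6]]   # layer 1: 2 inputs -> 3 units, no activation
W2 = [[1, 0, -1], [2, 1, 0]]    # layer 2: 3 units -> 2 outputs, no activation
x = [1, -1]

two_layers = matvec(W2, matvec(W1, x))  # pass through both linear layers
merged = matmul(W2, W1)                 # a single equivalent 2x2 layer
one_layer = matvec(merged, x)

print(two_layers, one_layer)  # [0, -3] [0, -3]
```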

What is the Feature Spatial Transformation in the Multi-Layer Perceptron (MLP)?

The single-layer perceptron is limited to linearly separable problems, and it cannot learn even a simple XOR operation because XOR is not linearly separable. As a result, the theory of the Multi-Layer Perceptron emerged. It is a model that expresses more complex functions by stacking two or more perceptron layers. The first layer is called the input layer, the last layer is the output layer, and the middle layers are called hidden layers. Activation functions: a linear equation is the simplest form of a mathematical model and is not representative of real-world scenarios. Without an activation function, a NN has very limited capability to learn and model unstructured datasets such as images and videos. Hidden layer: performs the feature spatial transformation, mapping the inputs into a new feature space in which the classes become linearly separable. The backpropagation method is used for error correction.
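The XOR case makes the feature space transformation easy to see. The sketch below uses hand-picked weights rather than learned ones, which is an assumption made to keep the example deterministic: one hidden unit computes OR, the other computes AND, and the output unit separates the transformed points with a single line.

```python
def step(z):
    """Threshold activation: 1 if z is positive, otherwise 0."""
    return 1 if z > 0 else 0


def hidden_layer(x1, x2):
    """Map the input into a new feature space (h1 = OR, h2 = AND)."""
    h1 = step(x1 + x2 - 0.5)
    h2 = step(x1 + x2 - 1.5)
    return h1, h2


def xor_mlp(x1, x2):
    """Two-layer perceptron: XOR is 'OR and not AND' in the hidden space."""
    h1, h2 = hidden_layer(x1, x2)
    return step(h1 - h2 - 0.5)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
hidden = [hidden_layer(*x) for x in inputs]
outputs = [xor_mlp(*x) for x in inputs]
print(hidden)   # [(0, 0), (1, 0), (1, 0), (1, 1)]
print(outputs)  # [0, 1, 1, 0]
```

In the hidden space the two positive examples land on the same point (1, 0), which a single line can separate from (0, 0) and (1, 1).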

Discuss learning process on smartphones

Training and inference are two vital components of AI on smartphones. Learning process on ARM: Automatic Speech Recognition (ASR), an active research area for over 50 years. Applications: advanced ASR, keyword spotting, Apple Face ID.

What is Vanishing Gradient problem?

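Vanishing gradient problem: in gradient descent with backpropagation, each MLP weight receives, in each training iteration, an update proportional to the partial derivative of the error function with respect to that weight. In deep networks this derivative is a product of many per-layer factors, and when those factors are small the gradient reaching the early layers becomes vanishingly small, so their weights effectively stop changing.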
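A rough numeric sketch of why the gradient shrinks with depth. It assumes a simplified chain of layers in which every weight is 1 and every pre-activation has the same value, so the backpropagated signal is just the activation derivative multiplied once per layer. The sigmoid derivative never exceeds 0.25, so the product decays geometrically; the ReLU derivative is 1 for positive inputs, so the signal is preserved.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum value is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def relu_grad(z):
    """Derivative of ReLU: 1 for positive input, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def gradient_scale(depth, grad_fn, z=0.0, weight=1.0):
    """Factor by which the error signal is scaled after passing back through `depth` layers."""
    scale = 1.0
    for _ in range(depth):
        scale *= weight * grad_fn(z)
    return scale


sigmoid_scales = {d: gradient_scale(d, sigmoid_grad) for d in (1, 5, 10)}
relu_scale = gradient_scale(10, relu_grad, z=1.0)
print(sigmoid_scales)  # shrinks as 0.25 ** depth
print(relu_scale)      # 1.0
```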

Why do we need non-linear activation functions in deep neural networks?

In real-world, stochastic environments and feature spaces, nonlinear relationships are more common than linear ones. Neural networks learn features through a layered structure in which each layer captures a specific set of features from the training data. If a linear activation function is applied at all the nodes in every layer, the layers can be merged into a single linear layer, and there is no point in having a multilayered network. Without an effectively multilayered network, it is not possible to model stochastic input and generalize from it.

