RNN and LSTM
To connect the encoder with the decoder in many-to-many sequence problems, which layer is used:
· Repeat vector
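A minimal sketch of how this looks in practice (assuming tf.keras; layer sizes and sequence lengths are illustrative): the RepeatVector layer copies the encoder's final state vector once per output timestep, so the decoder LSTM receives a sequence of the required length.

```python
# Minimal encoder-decoder sketch using RepeatVector (tf.keras; sizes are illustrative).
from tensorflow.keras import layers, models

n_timesteps_in, n_features = 10, 8      # assumed input sequence shape
n_timesteps_out = 5                     # assumed output sequence length

model = models.Sequential([
    # Encoder: compresses the input sequence into a single state vector.
    layers.LSTM(64, input_shape=(n_timesteps_in, n_features)),
    # RepeatVector: repeats that vector once per decoder timestep,
    # connecting the encoder to the decoder in a many-to-many setup.
    layers.RepeatVector(n_timesteps_out),
    # Decoder: unrolls over the repeated vector to produce the output sequence.
    layers.LSTM(64, return_sequences=True),
    layers.TimeDistributed(layers.Dense(n_features)),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```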
What is TRUE about LSTM gates?
· The Forget Gate in an LSTM maintains or deletes data from the information cell.
The forget gate
· The forget gate 𝑓𝑡 is responsible for deciding what information should be removed from the cell state (memory). · The forget gate is controlled by a sigmoid function.
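In the standard LSTM formulation (the weights W_f, U_f and bias b_f are assumed notation, not from the card above), the forget gate is computed as
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)
where \sigma is the sigmoid, so each component of f_t lies in (0, 1) and scales how much of the previous cell state c_{t-1} is kept.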
The Input Gate
· The input gate is responsible for deciding what information should be stored in the cell state. · The input gate is controlled by a sigmoid function.
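In the same standard notation (assumed weights W_i, U_i and bias b_i), the input gate is
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)
and its (0, 1) values decide how much of the candidate state is written into the cell state.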
The Output Gate
· The output gate is responsible for deciding what information should be taken from the cell state to give as an output.
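In the same standard notation (assumed weights W_o, U_o and bias b_o), the output gate and the resulting hidden state are
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)
h_t = o_t \odot \tanh(c_t)
so the output is a gated read-out of the cell state c_t.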
What contains information from the past that is not contained in the current input?
· The hidden layer (hidden state) of an RNN
An LSTM cell consists of three special gates called:
· input gate · output gate · forget gate
What is NOT TRUE about RNNs?
· RNNs are very robust against the vanishing gradient problem.
Long Short-Term Memory Architecture
The Long Short-Term Memory is composed of a linear unit surrounded by three logistic gates. The name for these gates varies from place to place, but the most usual names for them are: · the "Input" or "Write" Gate, which handles the writing of data into the information cell · the "Output" or "Read" Gate, which handles the sending of data back onto the Recurrent Network · the "Keep" or "Forget" Gate, which handles the maintaining and modification of the data stored in the information cell
Long Short-Term Memory Model
The Long Short-Term Memory is an abstraction of how computer memory works. It is "bundled" with whatever processing unit is implemented in the Recurrent Network, although outside of its flow, and is responsible for keeping, reading, and outputting information for the model.
What is difference between CNN and RNN?
The main difference between a CNN and an RNN is the ability to process temporal information, i.e., data that comes in sequences, such as a sentence. Recurrent neural networks are designed for this very purpose, while convolutional neural networks cannot effectively interpret temporal information.
How does Long Short-Term Memory work?
The way it works is simple: you have a linear unit, which is the information cell itself, surrounded by three logistic gates responsible for maintaining the data. One gate is for inputting data into the information cell, one is for outputting data from the information cell, and the last one is to keep or forget data depending on the needs of the network.
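A minimal sketch of a single LSTM step in NumPy (parameter names and shapes are assumed for illustration, not taken from the cards above), showing the three gates acting on the information cell:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step. p is a dict of weight matrices and biases (assumed shapes)."""
    # Forget gate: decides what to erase from the cell state.
    f_t = sigmoid(p["Wf"] @ x_t + p["Uf"] @ h_prev + p["bf"])
    # Input gate: decides what new information to write.
    i_t = sigmoid(p["Wi"] @ x_t + p["Ui"] @ h_prev + p["bi"])
    # Candidate state: the new content that may be written (tanh-regulated).
    g_t = np.tanh(p["Wg"] @ x_t + p["Ug"] @ h_prev + p["bg"])
    # Output gate: decides what to read out of the cell state.
    o_t = sigmoid(p["Wo"] @ x_t + p["Uo"] @ h_prev + p["bo"])
    # Cell state update: keep part of the old memory, add part of the new candidate.
    c_t = f_t * c_prev + i_t * g_t
    # Hidden state: gated read-out of the cell state.
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```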
The LSTM gates are responsible for:
These three gates are responsible for deciding what information to add to, output from, and forget from the memory. With these gates, the LSTM effectively keeps information in memory only as long as it is required.
What is a Recurrent Neural Network?
· A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes form a directed or undirected graph along a temporal sequence. This allows it to exhibit temporal dynamic behavior. Derived from feedforward neural networks, RNNs can use their internal state (memory) to process variable-length sequences of inputs. · It is a Neural Network that loops back on (recurs to) itself, and is well suited for handling sequential data
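The recurrence behind "looping back on itself" is usually written as (weights W, U and bias b are assumed standard notation)
h_t = \tanh(W x_t + U h_{t-1} + b)
so the hidden state h_t combines the current input x_t with the previous hidden state h_{t-1}, which is how the network carries information along the sequence.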
Examples of time series
· Dow-Jones Industrial Average · Electricity demand for a city · Air temperature in a building · Speech, text, video
Applications of RNN
· Estimating temperatures from weather data · Natural Language Processing · Video context retriever · Speech Recognition
What application(s) is(are) suitable for RNNs?
· Estimating temperatures from weather data · Natural Language Processing · Video context retriever · Speech Recognition
The candidate state or internal state vector
· It is a vector, usually called 𝑔𝑡, that controls what data to write to the cell state. · It is regulated by the tanh function.
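In the same standard notation as the gates above (assumed weights W_g, U_g and bias b_g), the candidate state and the resulting cell-state update are
g_t = \tanh(W_g x_t + U_g h_{t-1} + b_g)
c_t = f_t \odot c_{t-1} + i_t \odot g_t
The tanh keeps candidate values in (-1, 1), and the input gate i_t decides how much of g_t is actually written to memory.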
The shape of the feature set passed to the LSTM's input layer should be:
· Number of Records, Timesteps, Features
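A minimal sketch (assuming NumPy and tf.keras; array sizes are illustrative) of reshaping a flat feature matrix into the 3-D (records, timesteps, features) array that the LSTM input layer expects:

```python
import numpy as np
from tensorflow.keras import layers, models

n_records, n_timesteps, n_features = 100, 10, 3   # assumed sizes

# Flat data with one row per (record, timestep) pair, reshaped to 3-D.
flat = np.random.rand(n_records * n_timesteps, n_features)
X = flat.reshape(n_records, n_timesteps, n_features)

model = models.Sequential([
    layers.LSTM(32, input_shape=(n_timesteps, n_features)),  # shape excludes the record axis
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
print(X.shape)   # (100, 10, 3) -> (Number of Records, Timesteps, Features)
```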
Why are RNNs susceptible to issues with their gradients?
· Numerical computation of gradients can run into instabilities · Gradients can grow or shrink exponentially over long sequences · Errors propagate through every time step due to the recurrent structure
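A toy illustration of the last two points (a hypothetical scalar recurrence, not a real training run): backpropagation through time multiplies one per-step factor per timestep, so the gradient scale shrinks or blows up exponentially with sequence length.

```python
import numpy as np

# Backpropagation through time multiplies one Jacobian factor per time step.
# With a scalar recurrent weight w, that factor is roughly w * tanh'(.) <= w.
T = 100                                   # sequence length
for w in (0.9, 1.1):                      # assumed per-step factors
    grad_scale = np.prod(np.full(T, w))   # product of T identical factors
    print(f"w = {w}: gradient scale after {T} steps ~ {grad_scale:.3e}")
# w = 0.9 -> ~2.7e-05 (vanishing); w = 1.1 -> ~1.4e+04 (exploding)
```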
Image to text description is an example of:
· One to Many Sequence Problems
What is TRUE about RNNs?
· RNNs are VERY suitable for sequential data. · RNNs need to keep track of states, which is computationally expensive.