DL: Convolutional Neural Networks "CNNs"
What are the working steps of the convolutional layer?
1. convolutional operation 2. activation function 3. pooling operation 4. strides
What are the working steps for CNNs?
1. input image 2. convolutional layers 3. activation function 4. pooling layers 5. fully connected layers 6. output
How do fully connected layers work?
1. the input to a fully connected layer is a flattened array of the output from the previous layer. For example, if the previous layer is a convolutional layer with 64 feature maps, and each feature map is 28x28, then the input to the fully connected layer would be a flattened array of size 64 x 28 x 28 = 50,176 2. the fully connected layer then applies a weight matrix and a bias term to the input, producing an output: output = activation_function(weight_matrix * input + bias_term)
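The two steps above (flatten, then weights plus bias plus activation) can be sketched in plain Python. This is a minimal illustration, not a framework implementation: the helper names `flatten`, `relu`, and `fully_connected`, the tiny 2x2 feature maps, and the weight values are all assumptions chosen to keep the arithmetic easy to follow.

```python
def flatten(feature_maps):
    """Flatten a list of 2D feature maps into one 1D list."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def relu(z):
    """ReLU on a single number (assumed activation for this sketch)."""
    return max(0.0, z)


def fully_connected(inputs, weight_matrix, bias_terms, activation_function):
    """Compute activation_function(weight_matrix * input + bias_term) per neuron."""
    outputs = []
    for weights, bias in zip(weight_matrix, bias_terms):
        # Each output neuron sees every input value (hence "fully connected").
        z = sum(w * x for w, x in zip(weights, inputs)) + bias
        outputs.append(activation_function(z))
    return outputs


# Two hypothetical 2x2 feature maps -> flattened input of size 2 * 2 * 2 = 8.
feature_maps = [
    [[1, 2], [3, 4]],
    [[0, 1], [0, 1]],
]
flat_input = flatten(feature_maps)

# One row of weights per output neuron (values are made up for illustration).
weight_matrix = [[0.5] * 8, [-1.0] * 8]
bias_terms = [0.0, 1.0]

fc_output = fully_connected(flat_input, weight_matrix, bias_terms, relu)
print(len(flat_input))  # 8
print(fc_output)        # [6.0, 0.0]
```

With 64 feature maps of 28x28, the same `flatten` would return 64 x 28 x 28 = 50,176 values, and each row of the weight matrix would need that many entries.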
What is the most common activation function used in this step?
Rectified Linear Unit (ReLU), which sets negative values to zero and leaves positive values unchanged; this introduces non-linearity into the network and helps it capture more complex features
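As a quick illustration of ReLU, here is a minimal sketch applied element-wise to a list of numbers. The function name `relu` and the sample values are assumptions made for this example.

```python
def relu(values):
    """Apply ReLU element-wise: negatives become 0, positives pass through."""
    return [max(0.0, v) for v in values]


activated = relu([-2.0, -0.5, 0.0, 1.5, 3.0])
print(activated)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```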
What happens in the activation function step?
a non-linear activation function is applied element-wise to the output of each filter
What are CNNs?
a type of deep neural network that is commonly used for image/video recognition, classification, and processing
What happens in the strides step?
a stride greater than 1 makes the filter skip positions as it slides, which is another way (besides pooling) to reduce the spatial dimensions of the feature map
How do CNNs work?
by using a series of convolutional layers to extract features from the input image, followed by one or more fully connected layers to perform classification or regression
What are fully connected layers used for?
in fully connected layers, every neuron in the previous layer is connected to every neuron in the current layer; they are typically used at the end of the network to perform classification or regression tasks based on the features learned by the preceding convolutional and pooling layers
What is the typical input to a CNN?
the input to a CNN is typically a 2D image represented as a matrix of pixel values (color images add a channel dimension, e.g. 3 channels for RGB)
What are convolutions and convolutional layers?
a convolution is a mathematical operation that combines input data with a set of learnable filters or kernels to produce a new set of features; convolutional layers are neural network layers that perform this operation
What is the most common pooling operation used in this step?
max pooling, which takes the maximum value of each local region in the feature map
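Max pooling can be sketched in a few lines of plain Python. This is an illustrative version only: the function name `max_pool2d`, the 2x2 window with stride 2, and the sample feature map are assumptions, not values from the notes.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Take the maximum of each size x size local region of a 2D feature map."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 4, 6],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[4, 5], [3, 7]]
```

The 4x4 map shrinks to 2x2, and each output value is the strongest response in its region.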
What is an outcome of the convolutional operation?
operation produces a set of feature maps, which highlight different patterns or features in the input image
How do you calculate the convolution output size?
output_size = floor((input_size - kernel_size + 2*padding) / stride) + 1
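The formula can be checked with a small helper. This is a sketch along one spatial dimension; the function name `conv_output_size` and the sample sizes are assumptions, and integer (floor) division is used so that a stride that does not divide evenly still gives a whole number of positions.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output size along one dimension, using floor division for the stride."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


print(conv_output_size(28, 3))                       # 26
print(conv_output_size(28, 3, padding=1))            # 28
print(conv_output_size(28, 3, padding=1, stride=2))  # 14
```

For example, a 28x28 input with a 3x3 kernel, no padding, and stride 1 gives a 26x26 feature map; padding of 1 keeps the size at 28x28.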
Why is pooling needed?
the feature maps produced by the convolutional layer can be quite large, which can lead to high computational costs and overfitting; pooling shrinks them
What happens in the convolutional operation step?
the layer applies a set of learnable filters (also known as kernels or weights) to the input image; each filter slides over the image and computes a dot product between its values and the pixel values in the local region it currently covers
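The sliding dot product can be sketched for a single filter in plain Python. This is a minimal illustration under stated assumptions: the function name `conv2d`, the 4x4 image, and the 2x2 kernel are made up, there is no padding, and the kernel is not flipped (the same convention deep learning libraries commonly use, which is strictly a cross-correlation).

```python
def conv2d(image, kernel, stride=1):
    """Slide a square kernel over a 2D image and take a dot product at each step."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            # Dot product between the kernel and the local image region.
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


image = [
    [1, 2, 0, 1],
    [0, 1, 3, 2],
    [2, 0, 1, 1],
    [1, 2, 0, 3],
]
# Hypothetical filter: top-left pixel minus bottom-right pixel of each region.
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d(image, kernel)
print(feature_map)  # [[0.0, -1.0, -2.0], [0.0, 0.0, 2.0], [0.0, 0.0, -2.0]]
```

A real convolutional layer would run many such filters, each producing its own feature map, and would learn the kernel values during training.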
What are strides?
the number of pixels that the filter moves horizontally and vertically at each step
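To see the effect of the stride, the sketch below lists where a filter's left edge lands along one dimension of the input. The helper name `window_starts` and the sizes (input of 7, kernel of 3, no padding) are assumptions chosen for illustration.

```python
def window_starts(input_size, kernel_size, stride):
    """Positions of the filter's leading edge along one dimension (no padding)."""
    return list(range(0, input_size - kernel_size + 1, stride))


print(window_starts(7, 3, stride=1))  # [0, 1, 2, 3, 4]
print(window_starts(7, 3, stride=2))  # [0, 2, 4]
print(window_starts(7, 3, stride=3))  # [0, 3]
```

A larger stride means fewer positions, so the resulting feature map is smaller.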
What happens in the pooling step?
the output of the activation function is downsampled using a pooling operation, which reduces the spatial dimension of the output while preserving the most important features
What occurs in the output step in CNNs?
the output of the fully connected layer is often passed through a softmax function to obtain a probability distribution over the classes; the class with the highest probability is then selected as the final prediction
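The output step can be sketched as a softmax followed by picking the most probable class. This is an illustrative version: the function name `softmax` and the three raw scores are assumptions, and the maximum score is subtracted before exponentiating only for numerical stability (it does not change the result).

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of the last fully connected layer for 3 classes.
scores = [2.0, 1.0, 0.1]
probabilities = softmax(scores)

# The predicted class is the index with the highest probability.
predicted_class = probabilities.index(max(probabilities))
print(predicted_class)  # 0
```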
What is the purpose of the fully connected layer?
to combine the features learned by the convolutional layers and make a final prediction about the input image