Lecture 6 - Gradient Descent Learning
Error Surface
A graphical or mathematical representation of the relationship between model parameters and the associated loss or error. It visualizes how changes in parameter values affect the performance of a machine learning model
Stochastic Gradient Descent
A variant of the gradient descent algorithm. Instead of using the entire training dataset in each iteration, SGD randomly selects a single data point or a small batch of data points to compute the gradient and update the parameters
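A minimal sketch of the idea, assuming a 1-D linear model y = w * x with squared error (the function and data names are illustrative, not from the lecture):

```python
import random

def sgd_fit(data, lr=0.01, epochs=50):
    """Fit w in y = w * x by stochastic gradient descent."""
    w = 0.0
    for _ in range(epochs):
        random.shuffle(data)               # visit samples in random order
        for x, y in data:                  # one sample per parameter update
            grad = 2 * (w * x - y) * x     # gradient of this sample's squared error
            w -= lr * grad                 # step against the gradient
    return w

# Usage: data generated from y = 3x should recover w close to 3
data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
w = sgd_fit(data)
```

Because each update uses only one sample, the steps are noisy, but they are cheap and the parameter still drifts toward the minimum on average.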
Loss Functions
Also known as cost functions, loss functions quantify the error between predicted values and actual target values. They serve as a measure of how well a model is performing
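As one concrete example (mean squared error, a common choice; the values below are purely illustrative):

```python
def mse(predictions, targets):
    """Mean squared error: average of squared prediction errors."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Usage: a perfect prediction contributes zero; errors are penalized quadratically
loss = mse([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
```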
ReLU
An activation function commonly used in artificial neural networks. It replaces all negative input values with zero and passes positive values unchanged
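The definition translates directly into code; a minimal scalar version:

```python
def relu(x):
    """ReLU: clamp negative inputs to zero, pass positive inputs unchanged."""
    return max(0.0, x)

# Usage: relu(-2.0) returns 0.0, relu(3.0) returns 3.0
outputs = [relu(x) for x in [-2.0, -0.5, 0.0, 1.5]]
```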
Least Mean Square (LMS) Algorithm
An adaptive filtering technique used to minimize the mean squared error between predicted and actual values by iteratively adjusting model parameters based on the gradient of the error
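A minimal sketch of the LMS update for a single adaptive weight, assuming the standard rule w ← w + μ·e·x with instantaneous error e = d − w·x (the data here is illustrative):

```python
def lms(samples, mu=0.1):
    """Adapt weight w so that w * x tracks the desired signal d."""
    w = 0.0
    for x, d in samples:
        e = d - w * x          # instantaneous error for this sample
        w += mu * e * x        # LMS update: proportional to error and input
    return w

# Usage: desired signal d = 2x, so w should adapt toward 2
samples = [(x, 2.0 * x) for x in [1.0, 0.5, 1.5, 1.0]] * 20
w = lms(samples)
```

The step size μ trades off adaptation speed against stability, much like the learning rate in gradient descent.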
Gradient Descent
An optimization algorithm used to minimize a cost function or loss function. It iteratively adjusts model parameters in the direction opposite the gradient, i.e. the direction of steepest decrease of the function
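A minimal sketch on a toy objective, f(w) = (w − 4)², whose gradient is 2(w − 4) and whose minimum is at w = 4 (the function is illustrative):

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient of the objective."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)      # move opposite the gradient (steepest descent)
    return w

# Usage: starting from w = 0, the iterates approach the minimizer w = 4
w = gradient_descent(lambda w: 2 * (w - 4), w0=0.0)
```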
Momentum
An optimization technique used in gradient descent algorithms. It adds a fraction of the previous update (a "velocity" term) to the current gradient step, which helps smooth out oscillations and accelerates convergence
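A minimal sketch of the momentum variant on the same toy objective f(w) = (w − 4)² (illustrative setup; β is the momentum coefficient):

```python
def momentum_descent(grad, w0, lr=0.1, beta=0.9, steps=200):
    """Gradient descent with a velocity that accumulates past gradients."""
    w, v = w0, 0.0
    for _ in range(steps):
        v = beta * v + grad(w)   # decaying sum of past gradients (velocity)
        w -= lr * v              # step along the smoothed direction
    return w

# Usage: converges to the minimizer w = 4, with the velocity damping oscillations
w = momentum_descent(lambda w: 2 * (w - 4), w0=0.0)
```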
Adaptive Moment Estimation (Adam) Algorithm
Combines the concepts of momentum and adaptive learning rates to efficiently update model parameters during training. It adjusts learning rates individually for each parameter
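A minimal sketch of the standard Adam update on the toy objective f(w) = (w − 4)², with bias-corrected first-moment (m) and second-moment (s) estimates (hyperparameter values are the common defaults; the objective is illustrative):

```python
import math

def adam(grad, w0, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=1000):
    """Adam: momentum plus a per-parameter adaptive step size."""
    w, m, s = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g           # decaying average of gradients
        s = b2 * s + (1 - b2) * g * g       # decaying average of squared gradients
        m_hat = m / (1 - b1 ** t)           # bias correction for early steps
        s_hat = s / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(s_hat) + eps)  # adaptive step
    return w

# Usage: approaches the minimizer w = 4
w = adam(lambda w: 2 * (w - 4), w0=0.0)
```

Dividing by the root of the squared-gradient average is what makes the effective learning rate different for each parameter.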