RE: What does learning rate refer to in deep learning?

I only ever see values like 0.1 or 0.01.

The learning rate, in deep learning and in most gradient-based learning algorithms, is a hyperparameter that determines the step size at each iteration while moving toward a minimum of the loss function. Essentially, it controls how much the weights are adjusted with respect to the loss gradient. With values like the 0.1 and 0.01 you mentioned, the model adjusts its weights by 10% and 1% of the gradient, respectively. The learning rate is therefore directly related to how fast or slowly a network trains.

However, it's important to strike a careful balance. If the learning rate is too large, the algorithm can overshoot the minimum and diverge; if it's too small, the algorithm needs many more iterations to converge, so training is slow. The toy example below demonstrates both failure modes.

A common practice is to gradually decrease the learning rate during training, allowing the model to settle more precisely into the global (or a local) minimum. A procedure that changes the learning rate during training is called a learning rate schedule (or learning schedule); examples include time-based decay and drop-based (step) decay, sketched in the second snippet below.

To find a good learning rate, consider a grid search or a learning rate finder tool, which tries multiple learning rates and reports which one yields the lowest loss.
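To make the trade-off concrete, here is a minimal, self-contained sketch of plain gradient descent on the toy loss f(w) = w**2 (my own example, not from any particular library), comparing a too-small, a reasonable, and a too-large learning rate:

```python
def gradient_descent(lr, steps=20, w0=5.0):
    """Minimize the toy loss f(w) = w**2 with plain gradient descent.

    The gradient of f is 2*w, so each update is w <- w - lr * 2*w.
    """
    w = w0
    for _ in range(steps):
        grad = 2 * w       # gradient of f(w) = w**2 at the current w
        w = w - lr * grad  # step size is scaled by the learning rate
    return w

# The minimum of f(w) = w**2 is at w = 0.
for lr in (0.01, 0.1, 1.1):
    print(f"lr={lr}: w after 20 steps = {gradient_descent(lr):.4f}")
# lr=0.01 -> w ~ 3.34   (too small: still far from 0, converging slowly)
# lr=0.1  -> w ~ 0.06   (reasonable: close to the minimum)
# lr=1.1  -> w ~ 191.7  (too large: overshoots; |w| grows every step)
```

Real loss surfaces are far messier than this quadratic, but the same behavior shows up: a too-large rate makes the loss bounce or explode, and a too-small one makes training crawl.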
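The schedules mentioned above can be written in a few lines. This sketch uses the common textbook formulations; the parameter names (lr0, decay_rate, drop_factor, epochs_per_drop) are my own, not any framework's API:

```python
def time_based_decay(lr0, decay_rate, step):
    """Time-based decay: lr_t = lr0 / (1 + decay_rate * t)."""
    return lr0 / (1 + decay_rate * step)

def step_decay(lr0, drop_factor, epochs_per_drop, epoch):
    """Drop-based (step) decay: scale the learning rate by
    drop_factor once every epochs_per_drop epochs."""
    return lr0 * drop_factor ** (epoch // epochs_per_drop)

for epoch in (0, 10, 20):
    print(epoch,
          round(time_based_decay(0.1, 0.1, epoch), 4),
          round(step_decay(0.1, 0.5, 10, epoch), 4))
# 0  0.1     0.1
# 10 0.05    0.05
# 20 0.0333  0.025
```

Time-based decay shrinks the rate smoothly every step, while step decay holds it constant for a while and then drops it sharply; both start fast and finish with small, careful steps.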
Answered on July 17, 2023.
