Scheduler with gradient clipping #54

JinraeKim · 2023-12-05T02:19:51Z

Motivation and description

I think most implementations would require gradient clipping.
For example, Step (exponential decay applied every fixed number of steps) with gradient clipping that sets a lower bound on the learning rate.

These were common in previous versions of Flux.
So I think it would be very useful if such functionality were provided by default through keyword arguments.
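
To make the request concrete, here is a minimal sketch of the behaviour described above: a step-wise exponential decay whose output is clamped at a lower bound. It is written in plain Python purely for illustration and is not the Flux or ParameterSchedulers.jl API; the function name `clipped_step_decay` and the parameters `start`, `decay`, `step_size`, and `min_lr` are all hypothetical.

```python
def clipped_step_decay(start, decay, step_size, min_lr, t):
    """Learning rate at iteration t (0-based).

    Hypothetical helper, not a library function: the rate is multiplied
    by `decay` once every `step_size` iterations and is never allowed
    to fall below `min_lr`.
    """
    if step_size <= 0:
        raise ValueError("step_size must be a positive integer")
    # Number of decay steps applied so far.
    n_decays = t // step_size
    raw = start * decay ** n_decays
    # Clamp at the lower bound, as the post describes.
    return max(raw, min_lr)


if __name__ == "__main__":
    # Print the rate at a few iterations to show the decay and the clamp.
    for t in (0, 10, 20, 30, 40):
        print(t, clipped_step_decay(0.1, 0.5, 10, 0.02, t))
```

With these example values the rate halves every 10 iterations until the unclamped value would drop below `min_lr`, after which it stays at `min_lr`. The keyword arguments asked for above would presumably play the role of `min_lr` here.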

If it is already implemented, I don't think the current documentation describes it well, as I couldn't find it in the docs.

Possible Implementation

No response

ToucheSir · 2023-12-05T15:18:39Z

https://discourse.julialang.org/t/learning-rate-scheduler-with-the-new-interface-of-flux/107142 is a cross-post of this, so closing the GH issue in favour of the more appropriate platform for questions.

ToucheSir closed this as completed Dec 5, 2023
