
Example of how to use with Optimisers.jl #34

Closed
theabhirath opened this issue May 31, 2022 · 13 comments · Fixed by #55
Comments

@theabhirath
Member

Hi, I've been trying to use this package with Optimisers.jl (specifically, I've been trying to use a Step schedule with a Scheduler), but I'm getting errors suggesting that this setup works with the Flux optimisers and not with Optimisers.jl for now. Is there a way to write code that works with Optimisers.jl?

@darsnack
Member

Constructing optimizers from Optimisers.jl is cheap and simple, since the state is de-coupled. Something like this would work:

using Optimisers, ParameterSchedulers

# Wrap a schedule together with a function that builds the rule for a given LR
struct Scheduler{T, F}
    constructor::F
    schedule::T
end

# Build a fresh rule with the learning rate the schedule gives at iteration t
_get_opt(scheduler::Scheduler, t) = scheduler.constructor(scheduler.schedule(t))

Optimisers.init(o::Scheduler, x::AbstractArray) =
    (t = 1, opt = Optimisers.init(_get_opt(o, 1), x))

function Optimisers.apply!(o::Scheduler, state, x, dx)
    opt = _get_opt(o, state.t)
    new_state, new_dx = Optimisers.apply!(opt, state.opt, x, dx)

    return (t = state.t + 1, opt = new_state), new_dx
end

# Step(start, decay, step_sizes): decay the LR by `decay` every `step_size` iterations
opt = Scheduler(Step(init_lr, decay, step_size)) do lr
    Momentum(lr)
end
st = Optimisers.setup(opt, model)
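
For context, a minimal sketch of how that state could be driven in a training loop (hypothetical model, data, and loss; Flux.gradient and Optimisers.update! are the usual explicit-style calls):

using Flux, Optimisers

for (x, y) in data
    # gradient w.r.t. the model; update! also advances the Scheduler's iteration counter
    grads = Flux.gradient(m -> loss(m(x), y), model)
    st, model = Optimisers.update!(st, model, grads[1])
end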

@theabhirath
Member Author

I tried this and my model stopped training 😬 It's stuck after one epoch

@darsnack
Member

darsnack commented May 31, 2022

That's weird. Which rule are you using? And can you post the value of st after setup for a small model? I would also print opt inside the definition for apply!.

@darsnack
Member

I would also confirm that training doesn't stall with a reasonable fixed LR first.

@theabhirath
Member Author

theabhirath commented Jun 1, 2022

Whoops, nevermind, figured it out. I set step_sizes = 25 for Step, but naturally since Step is being called every step and not every epoch step_sizes has to be (25 * dataset_size) / batch_size. Reducing it by 0.1 every step meant that the learning rate was on the order of 10^-40 before even starting the second epoch 😅 It's training now, thank you!
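
For concreteness, that conversion looks something like this (hypothetical dataset_size and batch_size values):

nbatches   = cld(dataset_size, batch_size)  # mini-batch iterations per epoch
step_iters = 25 * nbatches                  # "every 25 epochs" expressed in iterations
schedule   = Step(init_lr, 0.1, step_iters)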

@darsnack
Member

darsnack commented Jun 1, 2022

Ah, good to know. You should check out Interpolator and the docs for it (under complex schedules). It's just a slightly less error-prone/cleaner way of specifying schedules in epochs that will be iterated per mini-batch.
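
For reference, a minimal sketch of that approach, assuming ParameterSchedulers.Interpolator(schedule, rate) rescales the iteration count by rate as described in those docs (hypothetical dataset_size and batch_size):

using ParameterSchedulers

nbatches       = cld(dataset_size, batch_size)  # iterations per epoch
epoch_schedule = Step(init_lr, 0.1, 25)         # written in epochs: decay every 25 epochs
iter_schedule  = ParameterSchedulers.Interpolator(epoch_schedule, nbatches)
opt            = Scheduler(lr -> Momentum(lr), iter_schedule)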

@ToucheSir
Member

Now that we have Optimisers.adjust!, should Scheduler be modernized and adopted as the recommended (easy) way to schedule parameters with Flux + Optimisers.jl?
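
For illustration, a minimal sketch of what that could look like, assuming Optimisers.adjust!(state, lr) is used to set the learning rate once per epoch (hypothetical model, nepochs, and loop body):

using Optimisers, ParameterSchedulers

sched = Step(1e-2, 0.1, 25)                    # decay every 25 epochs
state = Optimisers.setup(Momentum(1e-2), model)
for epoch in 1:nepochs
    Optimisers.adjust!(state, sched(epoch))    # overwrite eta for every rule in the state tree
    # ... run one epoch of Optimisers.update! steps with `state` ...
end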

@Qiyu-Zh

Qiyu-Zh commented Jan 2, 2024

I'm running into a problem where the scheduler cannot be set up. Could you share more details?

@darsnack
Member

darsnack commented Jan 2, 2024

The details for calling setup are in the original code above. Can you share the error?

@Qiyu-Zh

Qiyu-Zh commented Jan 2, 2024

[screenshot of the setup error]

@darsnack
Member

darsnack commented Jan 2, 2024

Modify the original code so that Scheduler subtypes Optimisers.AbstractRule:

struct Scheduler{T, F} <: Optimisers.AbstractRule
    constructor::F
    schedule::T
end
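
With that change, Optimisers.setup(opt, model) should recognise the wrapper as an optimisation rule and recurse through the model, attaching the (t, opt) state defined in Optimisers.init above to every trainable array.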

@Qiyu-Zh

Qiyu-Zh commented Jan 2, 2024

Great, thanks!

@Qiyu-Zh

Qiyu-Zh commented Feb 1, 2024

Why do I get an error saying that Exp is not compatible with float?
[screenshot of the error]
