
Docs on SGHMC / SGLD? #2270

Open
Janssena opened this issue Jun 20, 2024 · 4 comments

Comments

@Janssena

Hey,

There are stochastic gradient samplers in the Turing codebase, but there is no information on how to actually use them.

Do they actually work? How do you define mini-batches?

I could add something to the BNN tutorial about it if someone could explain to me how these samplers work.

Thanks

@Red-Portal
Member

No, there is currently no support for mini-batching out of the box. Also, it is generally not a great idea to train BNNs with Turing, since we currently don't support GPUs.
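
For anyone wondering what mini-batching would actually mean here: SGLD replaces the full-data gradient in each Langevin step with an unbiased minibatch estimate, rescaled by N/n. Since Turing doesn't expose this, below is a minimal hand-rolled sketch on a toy Gaussian-mean model in plain Julia. Nothing in it is Turing API; all names are made up for illustration.

using Random, Statistics

# Toy model: x_i ~ Normal(θ, 1), prior θ ~ Normal(0, 10).
rng = Xoshiro(1)
N = 1_000                       # full dataset size
data = randn(rng, N) .+ 2.0     # true θ = 2

grad_logprior(θ) = -θ / 100     # ∇ log Normal(0, 10) is -θ/σ² with σ = 10
grad_loglik(θ, x) = x - θ       # ∇_θ log Normal(x | θ, 1)

function sgld(rng, data; n_steps = 5_000, batchsize = 32, ϵ = 1e-3)
	N = length(data)
	θ = 0.0
	samples = Vector{Float64}(undef, n_steps)
	for t in 1:n_steps
		batch = rand(rng, 1:N, batchsize)  # random minibatch indices
		# Rescale by N / batchsize so the minibatch gradient is unbiased
		# for the full-data gradient.
		g = grad_logprior(θ) + (N / batchsize) * sum(x -> grad_loglik(θ, x), data[batch])
		θ += (ϵ / 2) * g + sqrt(ϵ) * randn(rng)  # Langevin drift + injected noise
		samples[t] = θ
	end
	return samples
end

samples = sgld(rng, data)
println(mean(samples[1_000:end]))  # ≈ 2, close to the posterior mean

The N/n rescaling is the key design point: it keeps the stochastic gradient an unbiased estimate of the full-data gradient, which is what makes SGLD valid as the step size decays.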

@patrickm663

On very small datasets (~100 observations), I've managed to get okay results in a fraction of the time NUTS takes. The NNs I've tested aren't very large either (e.g. 4 layers of 8 neurons each). In my experience it is very sensitive to random variation, so I've made a point of setting a random number seed when tweaking parameters and resampling. The step size also needed to be around 1e-3 to avoid NaNs, so try setting it smaller if you hit NaNs with the default config.

Big caveat though: convergence of individual weights isn't very strong, but I've found that doesn't make a huge impact on results. NUTS is the clear winner here in my experience, though it takes much, much longer to sample.

Invoking SGLD looks something like this:

using Turing, Random  # SGLD and PolynomialStepsize are exported by Turing; Xoshiro by Random

N_sims = 10_000
ch_SGLD = sample(
	Xoshiro(323),       # fixed RNG seed; results are quite seed-sensitive
	bayes_nn(Xs, ys),   # the BNN model (as in the Turing BNN tutorial)
	SGLD(; stepsize=PolynomialStepsize(1e-3), adtype=AutoTracker()),  # 1e-3 avoids NaNs for me
	N_sims;
	discard_adapt=false)

(takes about 8 seconds per 10,000 samples on my laptop)

Agreed though that mini-batching, along with GPU support, would be great one day to tackle larger datasets.

@Red-Portal
Member

Red-Portal commented Jul 14, 2024

Hi @patrickm663, just to be clear: what you just did is technically not SGLD but unadjusted Langevin / Langevin Monte Carlo. SGLD is the special case where mini-batching is used. It is very unlikely for unadjusted Langevin to be competitive against NUTS on such high-dimensional problems.
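
For readers landing here, the distinction in standard notation (this is the textbook formulation from Welling and Teh, 2011, not anything Turing-specific): unadjusted Langevin takes steps using the full-data gradient,

$$\theta_{t+1} = \theta_t + \frac{\epsilon}{2}\,\nabla_\theta \log p(\theta_t \mid x_{1:N}) + \sqrt{\epsilon}\,\xi_t, \qquad \xi_t \sim \mathcal{N}(0, I),$$

while SGLD swaps in a rescaled minibatch gradient over a random index set $S_t$ of size $n \ll N$:

$$\theta_{t+1} = \theta_t + \frac{\epsilon_t}{2}\left(\nabla_\theta \log p(\theta_t) + \frac{N}{n}\sum_{i \in S_t} \nabla_\theta \log p(x_i \mid \theta_t)\right) + \sqrt{\epsilon_t}\,\xi_t.$$

Without minibatching the two updates coincide, which is why the call above is really unadjusted Langevin.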

@patrickm663

patrickm663 commented Jul 14, 2024 via email
