
Docs on SGHMC / SGLD? #2270

Open
Janssena opened this issue Jun 20, 2024 · 4 comments

Comments

@Janssena

Hey,

There are stochastic gradient samplers in the Turing codebase, but there is no information on how to actually use them.

Do they actually work? How do you define mini-batches?

I could add something to the BNN tutorial about it if someone could explain to me how these samplers work.

Thanks

@Red-Portal
Member

No, there is currently no support for mini-batching out of the box. Also, it is generally not a great idea to train BNNs with Turing, since we currently don't support GPUs.
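
For anyone wondering what mini-batching would actually mean here: SGLD replaces the full-data gradient in each Langevin step with an unbiased minibatch estimate, rescaled by N/n. Since Turing doesn't expose this, below is a minimal hand-rolled sketch on a toy Gaussian-mean model in plain Julia. Nothing in it is Turing API; all names are made up for illustration.

using Random, Statistics

# Toy model: x_i ~ Normal(θ, 1), prior θ ~ Normal(0, 10).
rng = Xoshiro(1)
N = 1_000                       # full dataset size
data = randn(rng, N) .+ 2.0     # true θ = 2

grad_logprior(θ) = -θ / 100     # ∇ log Normal(0, 10) is -θ/σ² with σ = 10
grad_loglik(θ, x) = x - θ       # ∇_θ log Normal(x | θ, 1)

function sgld(rng, data; n_steps = 5_000, batchsize = 32, ϵ = 1e-3)
	N = length(data)
	θ = 0.0
	samples = Vector{Float64}(undef, n_steps)
	for t in 1:n_steps
		batch = rand(rng, 1:N, batchsize)  # random minibatch indices
		# Rescale by N / batchsize so the minibatch gradient is unbiased
		# for the full-data gradient.
		g = grad_logprior(θ) + (N / batchsize) * sum(x -> grad_loglik(θ, x), data[batch])
		θ += (ϵ / 2) * g + sqrt(ϵ) * randn(rng)  # Langevin drift + injected noise
		samples[t] = θ
	end
	return samples
end

samples = sgld(rng, data)
println(mean(samples[1_000:end]))  # ≈ 2, close to the posterior mean

The N/n rescaling is the key design point: it keeps the stochastic gradient an unbiased estimate of the full-data gradient, which is what makes SGLD valid as the step size decays.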

@patrickm663

On very small datasets (~100 observations), I've managed to get okay results in a fraction of the time NUTS takes. The NNs I've tested aren't very large either (e.g. 4 layers of 8 neurons each). In my experience it is very sensitive to random variation, so I've made a point of setting a random number seed when tweaking parameters and resampling. The step size also needed to be around 1e-3 to avoid NaNs, so try setting it smaller if you hit NaNs with the default config.

Big caveat though: convergence of individual weights isn't very strong, but I've found that doesn't make a huge impact on results. NUTS is the clear winner here in my experience, though it takes much, much longer to sample.

Invoking SGLD looks something like this:

using Turing, Random  # SGLD and PolynomialStepsize are exported by Turing; Xoshiro by Random

N_sims = 10_000
ch_SGLD = sample(
	Xoshiro(323),       # fixed RNG seed; results are quite seed-sensitive
	bayes_nn(Xs, ys),   # the BNN model (as in the Turing BNN tutorial)
	SGLD(; stepsize=PolynomialStepsize(1e-3), adtype=AutoTracker()),  # 1e-3 avoids NaNs for me
	N_sims;
	discard_adapt=false)

(takes about 8 seconds per 10,000 samples on my laptop)

Agreed though that mini-batching, along with GPU support, would be great one day to tackle larger datasets.

@Red-Portal
Member

Red-Portal commented Jul 14, 2024

Hi @patrickm663, just to be clear: what you just did is technically not SGLD but unadjusted Langevin / Langevin Monte Carlo. SGLD is the special case where mini-batching is used. It is very unlikely for unadjusted Langevin to be competitive against NUTS on such high-dimensional problems.
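
For readers landing here, the distinction in standard notation (this is the textbook formulation from Welling and Teh, 2011, not anything Turing-specific): unadjusted Langevin takes steps using the full-data gradient,

$$\theta_{t+1} = \theta_t + \frac{\epsilon}{2}\,\nabla_\theta \log p(\theta_t \mid x_{1:N}) + \sqrt{\epsilon}\,\xi_t, \qquad \xi_t \sim \mathcal{N}(0, I),$$

while SGLD swaps in a rescaled minibatch gradient over a random index set $S_t$ of size $n \ll N$:

$$\theta_{t+1} = \theta_t + \frac{\epsilon_t}{2}\left(\nabla_\theta \log p(\theta_t) + \frac{N}{n}\sum_{i \in S_t} \nabla_\theta \log p(x_i \mid \theta_t)\right) + \sqrt{\epsilon_t}\,\xi_t.$$

Without minibatching the two updates coincide, which is why the call above is really unadjusted Langevin.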

@patrickm663

patrickm663 commented Jul 14, 2024 via email
