Question on speed of fitting #66

Open
viktor2 opened this issue Oct 25, 2023 · 2 comments

viktor2 commented Oct 25, 2023

This is my first experience with falkon.
I installed falkon version 0.8.4 with torch 2.1, cuda 12.1, keops 2.1.2.
I have 2 GPUs: an A6000 and an RTX 3090. I can see both GPUs being utilized during training, with memory used to full capacity.
Everything seems to work, but it's kind of slow.
My dataset has about 10M (1e7) samples and 70 features. I use M=10,000 for the number of Nyström centers (random points). Fitting takes about 600 seconds.
Is this about how long it should take? I want to make sure I am not off by one or two orders of magnitude because of some weird issue.
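
For reference, a minimal sketch of the kind of fit being timed (the kernel bandwidth and penalty below are placeholders, not the actual hyperparameters used):

```python
import time
import torch
import falkon

# Roughly the problem size described above: ~1e7 samples, 70 features, M=10,000.
# Random data and placeholder hyperparameters, just to illustrate the timed call.
n, d, M = 10_000_000, 70, 10_000
X = torch.randn(n, d, dtype=torch.float32)
Y = torch.randn(n, 1, dtype=torch.float32)

kernel = falkon.kernels.GaussianKernel(sigma=1.0)   # placeholder bandwidth
options = falkon.FalkonOptions()
model = falkon.Falkon(kernel=kernel, penalty=1e-6, M=M, options=options)

t0 = time.time()
model.fit(X, Y)
print(f"fit time: {time.time() - t0:.1f}s")
```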

Contributor

Giodiro commented Oct 26, 2023

Hi @viktor2
I'm not 100% sure; two things you could try to check (a rough sketch follows below):

  1. Only use the better GPU (I guess the A6000) via CUDA_VISIBLE_DEVICES.
  2. Try setting debug=True in the FalkonOptions class - it will print timings for each part of the model. The preconditioner should be very fast to compute (as 10k points are few), and the iterations could be a bit slower.

Another thing to keep in mind is to make sure the data is in float32 precision, not float64!
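
A minimal sketch of both checks, assuming GPU index 0 corresponds to the A6000 (the device index, kernel bandwidth, and penalty here are assumptions):

```python
import os
# Restrict PyTorch/Falkon to the faster GPU only. This must be set before
# CUDA is initialized (i.e. before importing torch / touching the GPU).
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
import falkon

# debug=True makes Falkon print timings for each stage of the fit,
# which helps locate where the ~600 seconds are actually spent.
options = falkon.FalkonOptions(debug=True)

kernel = falkon.kernels.GaussianKernel(sigma=1.0)   # placeholder bandwidth
model = falkon.Falkon(kernel=kernel, penalty=1e-6, M=10_000, options=options)

# Make sure the data is float32, not float64, before fitting, e.g.:
# X = X.to(dtype=torch.float32); Y = Y.to(dtype=torch.float32)
# model.fit(X, Y)
```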

Author

viktor2 commented Oct 26, 2023

Thank you for the suggestions. What timing would you expect for a dataset of this size? I assume you benchmark performance for various dataset sizes.
