-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Could Lucene's default Directory (FSDirectory.open
) somehow preload .vec
files?
#13551
Comments
I'm trying to understand this sentence:
|
It wouldn't solve the issue, only mitigate it, but hopefully cold start performance gets better when we start leveraging
FWIW, it's not that the read-ahead is less agressive for mmap than traditional I/O, it's that since recently |
Oh sorry I used the wrong term (thank you @rmuir for clarifying!): it's not a swap storm I'm seeing, it's a page storm. The OS has plenty of free ram (reported by
+1 -- it would be a simple change. But I worry if it would do more harm than good in some cases, e.g. if there are truly cold HNSW cases where the application plans to suffer through paging to identify the subset of hot vectors? I don't know if that is even a thing -- a bunch of dark matter vectors that never get visited? I guess we do know that HNSW can and does sometimes produce disconnected graphs, but I think those dark matter islands are "typically" smallish.
+1! Especially because indexing them is quite costly!
+1 -- that would be a big help especially when paging in from fast SSDs since these devices have high IO concurrency.
Ahhh, that makes sense! And I think it is still correct to |
I don't think we should change anything here in MMapDirectory. It is all available and easy to do for one that wants to do this. Elasticserach is doing this for some files, but we have no default on preloading anything. For preloading you can pass MMapDirectory#setPreload and give it a lamda expression to select on which file extensions you want preloading: https://lucene.apache.org/core/9_11_0/core/org/apache/lucene/store/MMapDirectory.html#setPreload(java.util.function.BiPredicate) The default is: https://lucene.apache.org/core/9_11_0/core/org/apache/lucene/store/MMapDirectory.html#NO_FILES |
It is trivial: mmapDir.setPreload((name,ctx) -> name.endsWith(".vec")); So no change needed in the API or behaviour of Lucene. |
@uschindler At Amazon, we implemented the One approach would be to somehow limit vector files from being folded into compound files by instructing Lucene not to do it.. I was wondering if you had thoughts here/have thought about this approach before? The slightly tricky bit here is we only want to preload the vector data from the compound files. |
Tracing
So it looks like it applies the @gautamworah96 -- maybe open a spinoff issue exploring whether we could offer preloading of just slices within a CFS file? |
Opened #13967 |
Late to the party!!! I want to dig a little further on the distinction between the implementation of preload and prefetch, at least with the mmap implementation. The former touches every page (effectively forcing into the cache), while the latter simply advises of the access pattern. For sequential access, a combination of sequential advice and prefetch works really well, but not so well for random access. Random access, at least for the HNSW graph and the related vector data (be it quantised or not), really needs to be resident in memory. Preload gives us that, at least at the time of opening. Prefect does not.
For optimized vector search use cases, assuming enough RAM, I was thinking that a separate |
Description
This is really a "discussion" issue. I'm not sure at all that the idea is feasible:
I've been testing
luceneutil
with heavy KNN indexing (Cohere wikipediaen
768 dimension vectors) and one dismal part of the experience is the swap storm caused by HNSW's random access to the raw vectors stored in the.vec
files for each segment on "cold start" searching.Even when the box has plenty of RAM to hold all
.vec
files, the swap storm takes minutes on the nightly benchy box, even with a fast NVMe SSD holding the index. Even if the index is freshly built, the OS doesn't seem to cache the.vec
files since they appear to be "write once", until the searching starts up, and then the swap storm begins. This was withFLOAT32
vectors ... I suspect the problem is less severe withint4
orint8
compressed vectors (haven't tested).At Amazon (customer facing product search) we also see this swap storm when cold starting the searching process even after NRT segment replication has just copied the files locally: they don't stay "hot" in the OS in that case either (looks like "write once" to the OS).
Lucene already has an awesome feature to "fix" this:
MMapDirectory.setPreload
. It will pre-touch all pages associated with that file on open, so the OS caches them "once" on startup, much more efficiently than the random access HNSW. But this only makes sense for applications/users that know they have enough RAM (we will test this at Amazon to see if it helps our cold start problems). For myluceneutil
tests, simplycat /path/to/index/*.vec >> /dev/null
(basically the same as.setPreload
I think) fixed/sidestepped the swap storm.(The Linux kernel's less aggressive default readahead for memory-mapped IO vs traditional NIO is likely not helping either? Is this really a thing? I have not tested
NIOFSDirectory
to see if the swap storm is lessened).Longer term we have discussed using AKNN algorithms that are more disk-friendly (e.g. DiskANN), but shorter t erm, I'm wondering if we could somehow help users with a default
Directory
(fromFSDirectory.open
) that somehow / sometimes preloads.vec
files? It's not easy to do -- you wouldn't know up front that the application will do KNN searching at all. And, maybe only certain vectors in the.vec
will ever be accessed and so you need to pay that long random access price to find and cache just those ones.The text was updated successfully, but these errors were encountered: