I'd like this too: same workflow as you and a similar solution, but having it built in would be great. Maybe making it compatible with `overwrite_index` for cache invalidation would also be a good idea?
This is coming as part of the overhaul I semi-announced on twitter (just on twitter, to stay lowkey...)
I have no exact ETA, but these features will be available on the overhaul branch (which isn't installable right now as it'll crash, but will be very soon) within the next couple of weeks.
If you have just ~2k documents and want to improve latency, the best way forward will most likely be to use the HNSW index that'll ship as the native indexing mechanism for any collection under ~5k documents. It gets performance more or less matching exact search while being quite a bit quicker. Otherwise, something pretty similar to your mechanism will be added for loading/saving in-memory encodings.
I have a use case where I run ColBERT on CPU on a couple thousand documents. For this I don't use PLAID but the `encode` and `search_encoded_docs` methods, and the search works fast enough. The problem is that encoding all these documents on CPU takes time, and I don't want to encode everything every time I deploy the model, so I developed a way of saving and loading these encodings: https://github.com/ChatFAQ/ChatFAQ/blob/cc19e4b85198062888d6320e59276db31461f4e9/chat_rag/chat_rag/retrievers/colbert_retriever.py#L163

If interested, I could improve and integrate this into the `RAGPretrainedModel` or `ColBERT` classes and make a PR.
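For anyone wanting the general shape of such a cache in the meantime, here is a minimal sketch of the save/load idea with hash-based invalidation. It is not the linked implementation and not a library API: `collection_fingerprint`, `load_or_encode`, `encode_fn`, `cache_path` and the `overwrite` flag are all hypothetical names chosen for illustration, and `encode_fn` stands in for whatever call produces your encodings.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Any, Callable, Sequence


def collection_fingerprint(documents: Sequence[str]) -> str:
    """Hash the documents so that a changed collection invalidates the cache."""
    digest = hashlib.sha256()
    for doc in documents:
        data = doc.encode("utf-8")
        # Length-prefix each document so ["ab", "c"] and ["a", "bc"] differ.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def load_or_encode(
    documents: Sequence[str],
    encode_fn: Callable[[Sequence[str]], Any],  # hypothetical: your encoder
    cache_path: Path,
    overwrite: bool = False,  # hypothetical flag, loosely like overwrite_index
) -> Any:
    """Return cached encodings if they match `documents`, else encode and save."""
    fingerprint = collection_fingerprint(documents)

    if cache_path.exists() and not overwrite:
        with cache_path.open("rb") as f:
            cached = pickle.load(f)
        # Reuse the cache only if it was built from exactly these documents.
        if cached.get("fingerprint") == fingerprint:
            return cached["encodings"]

    encodings = encode_fn(documents)  # the slow step on CPU

    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as f:
        pickle.dump({"fingerprint": fingerprint, "encodings": encodings}, f)
    return encodings
```

With something like this, a deploy only pays the encoding cost when the documents actually change. Note that `pickle` should only be used to load files you wrote yourself; for tensor encodings, a format such as `torch.save` or `numpy.save` would probably be a better fit than pickling Python objects.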