Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] Change vector input from IndexInput to RandomAccessInput #13981

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

dungba88
Copy link
Contributor

@dungba88 dungba88 commented Nov 8, 2024

Description

This PR changes the input of vector reader from IndexInput to RandomAccessInput, as we mostly random-access the vectors with seek() followed by readFloats().

One thing I'm uncertain is the need for cloning RandomAccessInput.

TODO:

  • Add tests

fixes #13938

@dungba88 dungba88 changed the title Change vector input from IndexInput to RandomAccessInput [DRAFT] Change vector input from IndexInput to RandomAccessInput Nov 8, 2024
assert values instanceof ByteVectorValues;
if (!(slice instanceof IndexInput input)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit : input is not used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in the next block though

input = FilterIndexInput.unwrapOnlyTest(input);

@@ -77,4 +85,6 @@ default void readBytes(long pos, byte[] bytes, int offset, int length) throws IO
* @see IndexInput#prefetch
*/
default void prefetch(long offset, long length) throws IOException {}

Object clone();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this to RandomAccessInput clone(); so you don't have to cast in all places?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This baffles me a lot! I can't let the return type to be RandomAccessInput because it will conflict with the DataInput.clone() which returns DataInput.

Adding the clone method seems a bit weird to me, but I'm not sure if there is a better way to go around it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure we should add a clone() here to random access input. This is a wide ranging public interface in lucene. Adding something here should be carefully considered.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's not ideal, but clone is needed for copy() method. If not, maybe we need to have a CloneableRandomAccessInput?

Copy link

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Nov 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move vector search from IndexInput to RandomAccessInput
3 participants