Specifically, the TFDS version indexes specific portions of the wiki articles (and in some cases two different parts of an article are linked to the same query id), whereas MTEB/BEIR takes each wiki article as a whole. More importantly, the corpus text for an article does not necessarily contain the text of the original target sentences/passages/subsections; it instead seems to be just the first x chars/tokens or something similar.
It is also worth noting that all of the qrels are scored as 1 in the MTEB version, regardless of the original rater annotations.
Since MTEB derived its preprocessing from BEIR, we are guessing that the discrepancy started in BEIR.
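To make the two concerns above concrete, here is a minimal sketch of how they could be checked. Everything in it is an assumption for illustration: the helper names (`containment_rate`, `qrel_score_counts`), the BEIR-style dict layout, and the toy data are hypothetical and are not taken from MTEB, BEIR, or TFDS.

```python
from collections import Counter


def _normalize(text):
    # Collapse whitespace and lowercase so trivial formatting differences
    # do not count as a mismatch.
    return " ".join(text.split()).lower()


def containment_rate(corpus, targets):
    """Fraction of (doc_id, passage) pairs whose passage appears in corpus[doc_id].

    corpus:  dict mapping doc_id -> corpus text (assumed layout)
    targets: list of (doc_id, original target passage) pairs (assumed layout)
    """
    if not targets:
        return 0.0
    hits = sum(
        1
        for doc_id, passage in targets
        if _normalize(passage) in _normalize(corpus.get(doc_id, ""))
    )
    return hits / len(targets)


def qrel_score_counts(qrels):
    """Count how often each relevance score occurs in query_id -> {doc_id: score}."""
    return Counter(score for docs in qrels.values() for score in docs.values())


# Toy data, made up for this example only.
corpus = {
    "d1": "Alpha is the first letter.",
    "d2": "Beta comes second.",
}
targets = [
    ("d1", "first letter"),     # present in the corpus text
    ("d2", "used in finance"),  # missing, e.g. cut off by truncation
    ("d2", "Beta  comes"),      # present after whitespace normalization
]
qrels = {"q1": {"d1": 1}, "q2": {"d2": 1}}

rate = containment_rate(corpus, targets)
counts = qrel_score_counts(qrels)
print(f"containment rate: {rate:.2f}")
print(f"qrel score counts: {dict(counts)}")
```

On the real data, a containment rate well below 1.0 would support the truncation concern, and a score distribution with a single key `1` would confirm that the graded annotations were flattened.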
From @jhyuklee:
I think it would be great to investigate this and, if it is indeed an issue, create an updated version of the task to supersede it, similar to Touchev3:
mteb/mteb/tasks/Retrieval/eng/Touche2020Retrieval.py
Line 54 in 3ff38ec