add Clustering dataset for Indic languages #532

jaygala24 · 2024-04-23T19:11:20Z

Checklist for adding MMTEB dataset

Reason for dataset addition:

asparius

Looks fine!

mteb/tasks/Clustering/multilingual/IndicReviewsClusteringP2P.py

jaygala24 · 2024-04-24T02:52:52Z

This is the first Indic dataset in the Clustering task category, and all 13 Indic languages are new to it. The total is therefore 2 + 4 × 13 = 54 points. I have added the points for the dataset contribution and the PR review.
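The arithmetic in the comment above can be checked with a short sketch. The values (2 points for the dataset, 4 points per new language, 13 languages) come from the comment itself; the function name `contribution_points` and its parameter names are hypothetical and are not part of the mteb codebase.

```python
def contribution_points(
    new_languages: int,
    dataset_points: int = 2,       # assumed from the comment: points for the dataset
    points_per_language: int = 4,  # assumed from the comment: points per new language
) -> int:
    """Return the dataset points plus the per-language points."""
    return dataset_points + points_per_language * new_languages


# 13 new Indic languages, as stated in the comment
total = contribution_points(13)
print(total)
```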

add Indic clustering dataset

1533003

jaygala24 changed the title ~~add Indic clustering dataset~~ add Clustering dataset for Indic languages Apr 23, 2024

asparius reviewed Apr 23, 2024


jaygala24 added 2 commits April 24, 2024 08:10

update module import statement

492642c

add points for the contribution

45c4c61

KennethEnevoldsen merged commit dc9ba24 into embeddings-benchmark:main Apr 24, 2024
6 of 7 checks passed
