MTEB Evaluation Running Time #140
Comments
Hi @stefanhgm, yes, unfortunately evaluating 7B models on MTEB is an extremely long and arduous process. The only thing that can speed up the evaluation is a multi-GPU setup, if one is available. The library supports multi-GPU evaluation without any code changes.
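The multi-GPU speedup comes from sharding the corpus across devices so each GPU encodes its own slice. As a toy illustration of that idea only (not llm2vec's actual implementation), contiguous near-equal shards can be built like this:

```python
def shard(corpus, n_workers):
    """Split a corpus into n_workers contiguous shards of near-equal size,
    the way a data-parallel encoder distributes work across GPUs."""
    k, r = divmod(len(corpus), n_workers)
    shards, start = [], 0
    for i in range(n_workers):
        # The first r shards get one extra item so sizes differ by at most 1.
        end = start + k + (1 if i < r else 0)
        shards.append(corpus[start:end])
        start = end
    return shards

print(shard(list(range(10)), 4))  # [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
```

With near-equal shards, wall-clock time is bounded by the largest shard, which is why the speedup is roughly linear in the number of GPUs until per-device overhead dominates.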
Hi @vaibhavad, thanks for coming back on this! My experience on 4 GPUs is that it only gets ~2.5x faster. Can you maybe give me an estimate of the overall running time, or the time you needed for DBPedia, if that's available? Otherwise I will just try again with a longer time limit or more GPUs. Thank you!
Unfortunately, I don't remember the running time of DBPedia and I don't have the log files anymore. However, I do remember that out of all tasks, MSMARCO took the longest, at 7 hours on 8 A100 GPUs. So DBPedia should take less than that.
Hi @stefanhgm, I just ran the DBPedia evaluation for the Llama 3.1 8B model; it took 2.5 hours on 8x H100 80GB GPUs.
Thank you! That was helpful.
Hi @vaibhavad, sorry, I stumbled across another issue: do we actually have to run the tasks on the dev/validation splits as well, or is the test split enough? I use the following code snippet to run the MTEB benchmark. I looked for alternatives to filter only for the test split. Thank you!
I am now trying it with the following code, only using the test split.
Hi @stefanhgm,
Just the test split suffices. I believe you already figured out a way to run on just the dev/test sets with the MTEB package. Let me know if you need anything else.
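Restricting evaluation to one split boils down to keeping only that split for each task. A minimal sketch of the idea, using a hypothetical task-to-splits mapping rather than the real mteb task objects (which expose their available splits in task metadata):

```python
# Hypothetical mapping of task names to available evaluation splits.
TASK_SPLITS = {
    "DBPedia": ["dev", "test"],
    "MSMARCO": ["dev"],
    "NFCorpus": ["test"],
}

def restrict_to_split(task_splits, wanted="test"):
    """Keep only tasks that provide the wanted split, and schedule
    just that split for each of them."""
    return {name: [wanted]
            for name, splits in task_splits.items()
            if wanted in splits}

print(restrict_to_split(TASK_SPLITS))  # {'DBPedia': ['test'], 'NFCorpus': ['test']}
```

Note that a task without the wanted split (MSMARCO above, in this made-up mapping) is dropped entirely rather than silently run on another split.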
Hi @nasosger, sorry for the very late reply. I basically changed the things I pointed out earlier. Here is my script:
@stefanhgm Hi there, I hope you're doing well. Would you be able to share your multi-GPU evaluation code? I'd really appreciate it.
Hi @BtlWolf, I think the above script ran on multiple GPUs automatically for me. I checked my commands and there is no step where I explicitly enabled a multi-GPU setting.
@stefanhgm I ran into the following error:
Traceback (most recent call last):
@stefanhgm I have solved this problem: the mteb call needs to be wrapped in an `if __name__ == "__main__":` guard.
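For context, the guard matters because the evaluation can spawn worker processes; without it, each worker re-imports the script and re-launches the whole run. A minimal sketch of the pattern, with a stand-in function in place of the actual mteb call:

```python
import multiprocessing as mp

def encode_batch(batch):
    # Stand-in for model.encode(batch): just report the batch size.
    return len(batch)

if __name__ == "__main__":
    # Everything that kicks off work lives under the guard, so workers
    # that import this module do not re-trigger the evaluation itself.
    with mp.Pool(2) as pool:
        sizes = pool.map(encode_batch, [["a", "b"], ["c"]])
    print(sizes)  # [2, 1]
```

On platforms that use the "spawn" start method (Windows, and macOS by default), omitting the guard makes each child process re-execute the module top level, which is the failure mode reported above.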
Hi everyone!
Thanks for developing LLM2Vec and making the source code available.
I was trying to reproduce LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised and train a model based on Llama 3.1 8B. I trained both models and now want to obtain results on the MTEB benchmark for comparison. Unfortunately, it seems to take very long to run the benchmark with the LLM2Vec models. I am currently done with the tasks CQADupstackWordpressRetrieval and ClimateFever (also see #135), and the next task (I think it is DBPedia) has taken over 48h on a single A100 80GB. Is this the expected behavior? Can you share some insights about the running times of LLM2Vec on MTEB, or advice on how to speed it up? I use the below snippet to run MTEB, based on the script you provided:
Thanks for any help!