Passing a limit doesn't randomly sample, but rather takes dataset[:limit], introducing dataset bias #2598

Open
aalpat1 opened this issue Dec 27, 2024 · 0 comments


Hi,

I've noticed that the limit passed to the evaluate method doesn't randomly sample from the dataset, as I would expect, but instead just takes the first n samples (i.e. dataset[:limit]), where n is the limit.

I understand that this works better with cached results, but it does introduce a sampling bias into the computed results. Although the guidance is to use limit only for testing, it is very useful for datasets with a large number of samples (e.g. MMLU).

Would it be possible to pass an additional param that allows for random sampling instead of taking the first N? Alternatively, could a HuggingFace dataset be passed directly to the evaluate method?
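
For illustration, here is a minimal sketch of what I mean by random sampling; the `subsample` helper and its `seed` parameter are hypothetical and not part of the current API:

```python
import random

def subsample(dataset, limit, seed=1234):
    """Hypothetical helper: draw `limit` examples uniformly at random
    instead of slicing off the first `limit` entries.

    A fixed seed keeps the selection reproducible, so results (and any
    caching keyed on the selected examples) stay stable across runs.
    """
    data = list(dataset)
    if limit is None or limit >= len(data):
        return data
    rng = random.Random(seed)
    # random.sample draws without replacement, avoiding the ordering
    # bias that dataset[:limit] introduces.
    return rng.sample(data, limit)
```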
