Webdataset updates #75
Note: I've only tested the benchmark code with a single model, so I haven't run the complete experiments. There are some minor differences in numbers from what @rom1504 gave me, but no differences from what I get with the original datasets when I run them myself.
Nice, will check it out
Really cool, thanks @djghosh13! The differences in numbers might be related to issue #59
I see, yeah, I think the differences I saw were also in the 0.001s range.
Are datasets getting cached locally? If so, where, and is it tweakable?
I forgot to add this to the readme. By default, no, but there is a new
I hadn't actually tested it before, but it looks like it will save the .tar files inside the specified cache directory in a subdirectory that's named like
ok, if it doesn't by default, it's good
I'll test this
ok so one thing here
#76 a minor point, but quite nice for UX
yeah, this is much faster than the file-based option. Will run to the end and compare numbers; if all good, will merge
ok, it does work, let's go
(Mostly) addresses issues #52 and #67
clip_benchmark_export_wds --retrieval
import clip_benchmark.webdataset_builder
benchmark/README.md
is to use webdataset
Not completed:
voc2007_multilabel
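For readers unfamiliar with the format discussed in this PR, here is a minimal sketch of how a webdataset-style shard is laid out: a plain .tar file in which files sharing a key (the name before the first dot) form one sample. It uses only the Python standard library and is not the code in `clip_benchmark.webdataset_builder`; the helper names `write_shard` and `read_shard` and the field extensions `webp` and `cls` are hypothetical choices for illustration.

```python
import io
import tarfile
from collections import defaultdict


def write_shard(samples):
    """Pack (key, {ext: bytes}) samples into an in-memory tar shard."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, fields in samples:
            for ext, payload in fields.items():
                # Each field is stored as "<key>.<ext>"; a shared key marks one sample.
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_shard(data):
    """Group the files of a tar shard back into {key: {ext: bytes}}."""
    grouped = defaultdict(dict)
    with tarfile.open(fileobj=io.BytesIO(data), mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumes flat names without directories: split at the first dot.
            key, _, ext = member.name.partition(".")
            grouped[key][ext] = tar.extractfile(member).read()
    return dict(grouped)


# Hypothetical two-sample shard: an image payload plus a class label per key.
samples = [
    ("000000", {"webp": b"fake-image-0", "cls": b"3"}),
    ("000001", {"webp": b"fake-image-1", "cls": b"7"}),
]
shard = write_shard(samples)
restored = read_shard(shard)
```

Because a shard is an ordinary tar archive, caching it locally amounts to saving the file unchanged, which is consistent with the `.tar` files in the cache directory mentioned in the conversation.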