This repo aims to find and update the missing model cards for Hugging Face datasets.
If you find this a worthwhile pursuit, feel free to reach out, and let's try to make the Hugging Face datasets complete 😉
# install poetry
git clone --recurse-submodules --remote-submodules [email protected]:Hugging-Face-Supporter/datacards.git
cd datacards
git submodule update
poetry install
poetry shell
python datacards/main.py
- Look into how to provide multiple answers in a model card (e.g. the GLUE dataset)
- Find the datasets that are missing information by parsing the README
- Find ways to know what categories are valid answers
- Create method to filter for missing datasets
- Incorporate argparse to filter for certain things
- Toggle between datasets to annotate.
- Save modified files to the README again
- Once done, find ways to create automatic PRs to Hugging Face datasets
- Incorporate the Hugging Face Hub API
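
As a rough illustration of the README-parsing and argparse items above, here is a minimal sketch, not the code in `datacards/main.py`. It assumes the card metadata lives in a YAML front-matter block at the top of each README and only looks at top-level keys. The names `EXPECTED_FIELDS`, `missing_fields`, `build_parser`, `main` and the `--field` flag are hypothetical, and the field list is just an illustrative subset.

```python
import argparse
import re

# Illustrative subset of metadata keys; the set this repo checks is an assumption.
EXPECTED_FIELDS = ("license", "language", "task_categories", "size_categories")

# Front matter: a block delimited by '---' lines at the very top of the README.
FRONT_MATTER = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)
# Top-level YAML keys only: an unindented name followed by a colon.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*)\s*:", re.MULTILINE)


def missing_fields(readme_text, expected=EXPECTED_FIELDS):
    """Return the expected top-level keys absent from the README front matter."""
    match = FRONT_MATTER.match(readme_text)
    present = set(TOP_LEVEL_KEY.findall(match.group(1))) if match else set()
    return [field for field in expected if field not in present]


def build_parser():
    """Hypothetical CLI: README paths plus an optional, repeatable field filter."""
    parser = argparse.ArgumentParser(description="List READMEs with missing metadata.")
    parser.add_argument("readmes", nargs="*", help="paths to README.md files")
    parser.add_argument("--field", action="append", help="only check this field (repeatable)")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    expected = tuple(args.field) if args.field else EXPECTED_FIELDS
    for path in args.readmes:
        with open(path, encoding="utf-8") as handle:
            missing = missing_fields(handle.read(), expected)
        if missing:
            print(f"{path}: missing {', '.join(missing)}")
```

Calling `main(["path/to/README.md", "--field", "license"])` would report only READMEs without a `license` key. A real YAML parser would be needed to check whether a key is present but empty, which this sketch does not attempt.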