This document describes how to prepare the ImageNet1k and flowers102 datasets.
| Dataset | Train set size | Validation set size | Number of categories |
|---|---|---|---|
| flowers102 | 1k | 6k | 102 |
| ImageNet1k | 1.2M | 50k | 1000 |
- Data format
Please organize the data as described below, including the train_list.txt and val_list.txt files. Each line of a list file holds an image path and its label, separated by a space.
# delimiter: "space"
# the following is the content of train_list.txt
train/n01440764/n01440764_10026.JPEG 0
...
# the following is the content of val_list.txt
val/ILSVRC2012_val_00000001.JPEG 65
...
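
The list-file format above can be read with a few lines of Python. This is a minimal sketch, not part of PaddleClas: the function name `parse_list_lines` is hypothetical, and it assumes the label is always the last space-separated field on the line.

```python
from typing import List, Tuple


def parse_list_lines(lines: List[str]) -> List[Tuple[str, int]]:
    """Parse '<relative_path> <label>' lines into (path, label) pairs."""
    samples = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the last space so the label is always the final field.
        path, label = line.rsplit(" ", 1)
        samples.append((path, int(label)))
    return samples


samples = parse_list_lines([
    "train/n01440764/n01440764_10026.JPEG 0",
    "val/ILSVRC2012_val_00000001.JPEG 65",
])
print(samples)
```

Splitting from the right keeps the parser working even if an image path happens to contain a space.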
After downloading ImageNet1k, please organize the data directory as below:
PaddleClas/dataset/ILSVRC2012/
|_ train/
| |_ n01440764
| | |_ n01440764_10026.JPEG
| | |_ ...
| |_ ...
| |
| |_ n15075141
| | |_ ...
| | |_ n15075141_9993.JPEG
|_ val/
| |_ ILSVRC2012_val_00000001.JPEG
| |_ ...
| |_ ILSVRC2012_val_00050000.JPEG
|_ train_list.txt
|_ val_list.txt
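
Once the directory is laid out as above, it can help to confirm that every image named in a list file exists. The following is a rough sketch under stated assumptions: `find_missing_images` is a hypothetical helper, not a PaddleClas API, and it assumes the paths in the list file are relative to the dataset root.

```python
import os
from typing import List


def find_missing_images(dataset_root: str, list_name: str) -> List[str]:
    """Return the relative paths in a list file that do not exist under dataset_root."""
    missing = []
    with open(os.path.join(dataset_root, list_name), encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comment lines.
            if not line or line.startswith("#"):
                continue
            # The path is everything before the final space; the label follows it.
            rel_path = line.rsplit(" ", 1)[0]
            if not os.path.isfile(os.path.join(dataset_root, rel_path)):
                missing.append(rel_path)
    return missing
```

Calling `find_missing_images("PaddleClas/dataset/ILSVRC2012", "val_list.txt")` should return an empty list when the layout matches the tree shown above.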
For flowers102, download the data and decompress it to obtain the following:
jpg/
setid.mat
imagelabels.mat
Please put all of these files under PaddleClas/dataset/flowers102.
Run generate_flowers102_list.py to generate train_list.txt and val_list.txt:
python generate_flowers102_list.py jpg train > train_list.txt
python generate_flowers102_list.py jpg valid > val_list.txt
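
The script itself is not reproduced here, so the following is only an illustrative sketch of how list lines in the expected format could be produced. It assumes image ids are 1-based and map to file names zero-padded to five digits (as in image_03601.jpg), and that the labels stored in imagelabels.mat are 1-based and shifted to start at 0. The function name `format_list_lines` and the toy values are hypothetical, and loading the .mat files is left out.

```python
from typing import Iterable, List, Sequence


def format_list_lines(image_dir: str, image_ids: Iterable[int],
                      labels: Sequence[int]) -> List[str]:
    """Build '<image_dir>/image_XXXXX.jpg <label>' lines for the given image ids."""
    lines = []
    for image_id in image_ids:
        # Assumption: ids are 1-based and file names are zero-padded to 5 digits.
        file_name = "image_{:05d}.jpg".format(image_id)
        # Assumption: stored labels are 1-based; shift them to start at 0.
        label = labels[image_id - 1] - 1
        lines.append("{}/{} {}".format(image_dir, file_name, label))
    return lines


# Toy values stand in for the arrays stored in setid.mat and imagelabels.mat.
toy_labels = [77, 77, 3]
toy_lines = format_list_lines("jpg", [1, 3], toy_labels)
print("\n".join(toy_lines))
```

Each printed line uses the same space-delimited format as the ImageNet1k list files, so one loader can read both datasets.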
Please organize the data directory as below:
PaddleClas/dataset/flowers102/
|_ jpg/
| |_ image_03601.jpg
| |_ ...
| |_ image_02355.jpg
|_ train_list.txt
|_ val_list.txt