Use this bibtex to cite this repository:
@INPROCEEDINGS{9663762,
author={Patrício, Cristiano and Neves, João C.},
booktitle={Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)},
title={ZSpeedL - Evaluating the Performance of Zero-Shot Learning Methods using Low-Power Devices},
year={2021},
pages={1-8},
doi={10.1109/AVSS52988.2021.9663762}}
- Datasets
- Methods
- Special Case: LAD dataset
- Extracting Custom Features
- Optimizing TensorFlow Models using TensorRT
- Evaluating the Computational Performance of ZSL methods
1. Datasets ⬆️
The datasets used to evaluate the ZSL methods can be downloaded here. (MD5)
Dataset | No. Classes | No. Instances | No. Attributes | Download Images |
---|---|---|---|---|
AWA1 | 50 | 30,475 | 85 | Not available |
AWA2 | 50 | 37,322 | 85 | AWA2 Images |
CUB | 200 | 11,788 | 312 | CUB Images |
SUN | 717 | 14,340 | 102 | SUN Images |
APY | 32 | 15,339 | 64 | a-Yahoo Images / a-Pascal Images |
LAD | 230 | 78,017 | 359 | LAD Images |
2. Methods ⬆️
We select six state-of-the-art ZSL methods: three projection-based methods (ESZSL, SAE, and DEM) and three generative methods (f-CLSWGAN, TF-VAEGAN, and CE-GZSL).
📄 Paper: http://proceedings.mlr.press/v37/romera-paredes15.pdf
Class: ESZSL(args)
Arguments:
<dataset> : {AWA1, AWA2, CUB, SUN, APY, LAD}
<dataset_path> : {'./datasets/'}
<filename> : name of the features file (without the file extension)
<alpha> : int value [-3,3]
<gamma> : int value [-3,3]
<att_split> : for the LAD dataset, specify which split is to be evaluated, in the following format "_{i}", i = {0,1,2,3,4}
How to Run:
In the main scope of main.py, insert the following code:
ESZSL(dataset="AWA1", filename="MobileNetV2")
Hyperparameters:
Dataset | Hyperparameter |
---|---|
AWA1 | Alpha=3, Gamma=0 |
AWA2 | Alpha=3, Gamma=0 |
CUB | Alpha=2, Gamma=0 |
SUN | Alpha=2, Gamma=2 |
APY | Alpha=3, Gamma=-1 |
LAD | Alpha=3, Gamma=1 |
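For example, the tuned CUB values from the table would presumably be passed as keyword arguments matching the argument names listed above (the exact signature is an assumption):
ESZSL(dataset="CUB", filename="MobileNetV2", alpha=2, gamma=0)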
📄 Paper: https://arxiv.org/pdf/1704.08345.pdf
Class: SAE(args)
Arguments:
<dataset> : {AWA1, AWA2, CUB, SUN, APY, LAD}
<dataset_path> : {'./datasets/'}
<filename> : name of the features file (without the file extension)
<lamb_ZSL> : float value, default=2
<lamb_GZSL> : float value, default=2
<setting> : Type of evaluation {V2S, S2V}
<att_split> : for the LAD dataset, specify which split is to be evaluated, in the following format "_{i}", i = {0,1,2,3,4}
How to Run:
In the main scope of main.py, insert the following code:
SAE(dataset="AWA1", filename="MobileNetV2")
Hyperparameters:
Dataset | Setting | Lambda (ZSL) | Lambda (GZSL) |
---|---|---|---|
AWA1 | V2S | 3.0 | 3.2 |
AWA2 | V2S | 0.6 | 0.8 |
CUB | V2S | 100 | 80 |
SUN | V2S | 0.32 | 0.32 |
aPY | V2S | 2.0 | 0.16 |
LAD | V2S | 51.2 | 51.2 |
Dataset | Setting | Lambda (ZSL) | Lambda (GZSL) |
---|---|---|---|
AWA1 | S2V | 0.8 | 0.8 |
AWA2 | S2V | 0.2 | 0.2 |
CUB | S2V | 0.2 | 0.2 |
SUN | S2V | 0.16 | 0.08 |
aPY | S2V | 4.0 | 2.56 |
LAD | S2V | 6.4 | 6.4 |
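For example, an S2V run on CUB with the tuned values from the tables would presumably look like this (keyword names follow the argument list above and are an assumption):
SAE(dataset="CUB", filename="MobileNetV2", setting="S2V", lamb_ZSL=0.2, lamb_GZSL=0.2)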
📄 Paper: https://arxiv.org/pdf/1611.05088.pdf
Class: DEM(args)
Arguments:
<dataset> : {AWA1, AWA2, CUB, SUN, APY, LAD}
<dataset_path> : {'./datasets/'}
<filename> : name of the features file (without the file extension)
<lamb> : float value, default=1e-3
<lr> : float value, default=1e-4
<batch_size> : batch size, default=64
<hidden_dim> : Dimension of the hidden layer, default=1600
<att_split> : for the LAD dataset, specify which split is to be evaluated, in the following format "_{i}", i = {0,1,2,3,4}
How to Run:
In the main scope of main.py, insert the following code:
DEM(dataset="AWA1", filename="MobileNetV2")
Hyperparameters:
Dataset | Hidden Dim | Lambda | Learning Rate |
---|---|---|---|
AWA1 | 1600 | 1e-3 | 1e-4 |
AWA2 | 1600 | 1e-3 | 1e-4 |
CUB | 1600 | 1e-2 | 1e-4 |
SUN | 1600 | 1e-5 | 1e-4 |
aPY | 1600 | 1e-4 | 1e-4 |
LAD | 1600 | 1e-4 | 1e-4 |
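To reproduce a table entry, the hyperparameters can presumably be passed as keyword arguments, e.g. for SUN (keyword names assumed from the argument list above):
DEM(dataset="SUN", filename="MobileNetV2", hidden_dim=1600, lamb=1e-5, lr=1e-4)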
📄 Paper: https://arxiv.org/pdf/1712.00981.pdf
- Original version (f-CLSWGAN/orig/)
Run instructions:
python original_f-CLSWGAN.py --download_mode
python original_f-CLSWGAN.py --train_classifier
python original_f-CLSWGAN.py --train_WGAN
- Modified version 🆕
Class: f_CLSWGAN(args)
Arguments:
<dataset> : {AWA1, AWA2, CUB, SUN, APY, LAD}
<dataroot> : {'./datasets/'}
<image_embedding> : name of the features file (without the file extension)
<class_embedding> : name of the class embedding ("att" by default)
<split_no> : for the LAD dataset, specify which split is to be evaluated, in the following format "_{i}", i = {0,1,2,3,4}
<attSize> : size of the attribute annotations
<resSize> : size of the features
<nepoch> : number of epochs
<lr> : learning rate
<beta1> : beta1 for Adam optimizer
<batch_size> : input batch size
<cls_weight> : weight of the classification loss
<syn_num> : number of features to generate per class
<ngh> : size of the hidden units in generator
<ndh> : size of the hidden units in discriminator
<lambda1> : gradient penalty regularizer, following WGAN-GP
<classifier_checkpoint> : specifies which checkpoint (.ckpt) file of the TensorFlow classifier model to load
<nz> : size of the latent z vector
How to Run:
In the main scope of main.py, insert the following code:
f_CLSWGAN(dataset="AWA1", filename="MobileNetV2")
Hyperparameters:
Dataset | lambda1 | cls_weight |
---|---|---|
AWA1 | 10 | 0.01 |
AWA2 | 10 | 0.01 |
CUB | 10 | 0.01 |
SUN | 10 | 0.01 |
APY | 10 | 0.01 |
LAD | 10 | 0.01 |
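As an illustrative call with the tuned values from the table (keyword names taken from the argument list above and therefore an assumption):
f_CLSWGAN(dataset="CUB", filename="MobileNetV2", lambda1=10, cls_weight=0.01)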
📄 Paper: https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123670477.pdf
- Python 3.6
- PyTorch 0.3.1
- torchvision 0.2.0
- h5py 2.10
- scikit-learn 0.22.1
- scipy 1.4.1
- numpy 1.18.1
- numpy-base 1.18.1
- pillow 5.1.0
- Original version
Note: The requirements to run the script are listed in the scope of the main function of the original_tf-vaegan.py script.
Run instructions:
python original_tf-vaegan.py --download_mode
python original_tf-vaegan.py --train
- Modified version 🆕
Class: TF_VAEGAN(args)
Arguments:
<dataset> : {AWA1, AWA2, CUB, SUN, APY, LAD}
<dataroot> : {'./datasets/'}
<gammaD> : weight on the W-GAN loss
<gammaG> : weight on the W-GAN loss
<image_embedding> : name of the features file (without the file extension)
<class_embedding> : name of the class embedding ("att" by default)
<syn_num> : number of features to generate per class
<ngh> : size of the hidden units in generator
<ndh> : size of the hidden units in discriminator
<lambda1> : gradient penalty regularizer, following WGAN-GP
<nclass_all> : number of all classes
<split> : for the LAD dataset, specify which split is to be evaluated, in the following format "_{i}", i = {0,1,2,3,4}
<batch_size> : input batch size
<nz> : size of the latent z vector
<latent_size> : size of the latent units in discriminator
<attSize> : size of semantic features
<resSize> : size of visual features
<lr> : learning rate to train GANs
<classifier_lr> : learning rate to train softmax classifier
<recons_weight> : weight of the decoder reconstruction loss
<feed_lr> : learning rate for the feedback module
<dec_lr> : learning rate for the decoder
<feedback_loop> : iterations on feedback loop
<a1> : weight of the feedback layers
<a2> : weight of the feedback layers
How to Run:
In the main scope of main.py, insert the following code:
TF_VAEGAN(dataset="AWA1", filename="MobileNetV2")
Hyperparameters:
Dataset | lr | syn_num | gammaD | gammaG | classifier_lr | recons_weight | feed_lr | dec_lr | a1 | a2 |
---|---|---|---|---|---|---|---|---|---|---|
AWA1 | 1e-5 | 1800 | 10 | 10 | 1e-3 | 0.1 | 1e-4 | 1e-4 | 0.01 | 0.01 |
AWA2 | 1e-5 | 2400 | 10 | 10 | 1e-3 | 0.1 | 1e-4 | 1e-4 | 0.01 | 0.01 |
CUB | 1e-4 | 2400 | 10 | 10 | 1e-3 | 0.01 | 1e-5 | 1e-4 | 1 | 1 |
SUN | 1e-5 | 400 | 1 | 10 | 5e-4 | 0.01 | 1e-4 | 1e-4 | 0.1 | 0.01 |
APY | 1e-5 | 300 | 10 | 10 | 1e-3 | 0.1 | 1e-4 | 1e-4 | 0.01 | 0.01 |
LAD | 1e-5 | 1800 | 10 | 10 | 1e-3 | 0.1 | 1e-4 | 1e-4 | 0.01 | 0.01 |
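For example, the SUN configuration from the table would presumably translate into a call along these lines (keyword names assumed from the argument list above):
TF_VAEGAN(dataset="SUN", filename="MobileNetV2", gammaD=1, gammaG=10, syn_num=400, lr=1e-5, classifier_lr=5e-4, recons_weight=0.01, feed_lr=1e-4, dec_lr=1e-4, a1=0.1, a2=0.01)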
📄 Paper: https://arxiv.org/pdf/2103.16173.pdf
- Python 3.6
- PyTorch 1.2.0
- scikit-learn
- Original version:
Note: The requirements to run the script are listed in the scope of the main function of the original_ce-gzsl.py script.
Run instructions:
python original_ce-gzsl.py --download_mode
python original_ce-gzsl.py --train
- Modified version 🆕
Class: CE_GZSL(args)
Arguments:
<dataset> : {AWA1, AWA2, CUB, SUN, APY, LAD}
<dataroot> : {'./datasets/'}
<image_embedding> : name of the features file (without the file extension)
<class_embedding> : name of the class embedding ("att" by default)
<split> : for the LAD dataset, specify which split is to be evaluated, in the following format "_{i}", i = {0,1,2,3,4}
<batch_size> : the number of the instances in a mini-batch
<nepoch> : number of epochs
<attSize> : size of semantic features
<resSize> : size of visual features
<nz> : size of the Gaussian noise
<lr> : learning rate to train GANs
<embedSize> : size of embedding h
<syn_num> : number of synthetic features for each class
<outzSize> : size of the non-linear projection z
<nhF> : size of the hidden units in the comparator network F
<ins_weight> : weight of the classification loss when learning G
<cls_weight> : weight of the score function when learning G
<ins_temp> : temperature in instance-level supervision
<cls_temp> : temperature in class-level supervision
<nclass_all> : number of all classes
<nclass_seen> : number of seen classes
How to Run:
In the main scope of main.py, insert the following code:
CE_GZSL(dataset="AWA1", filename="MobileNetV2")
Hyperparameters:
Dataset | syn_num | ins_temp | cls_temp | batch_size |
---|---|---|---|---|
AWA1 | 1800 | 0.1 | 0.1 | 4096 |
AWA2 | 2400 | 10 | 1 | 4096 |
CUB | 300 | 0.1 | 0.1 | 2048 |
SUN | 400 | 0.1 | 0.1 | 1024 |
APY | 300 | 0.1 | 0.1 | 1024 |
LAD | 1800 | 0.1 | 0.1 | 1024 |
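For example, the CUB configuration from the table would presumably be passed as follows (keyword names assumed from the argument list above):
CE_GZSL(dataset="CUB", filename="MobileNetV2", syn_num=300, ins_temp=0.1, cls_temp=0.1, batch_size=2048)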
3. Special Case: LAD dataset ⬆️
The LAD dataset must be evaluated on each of the five available splits (att_splits_0.mat, att_splits_1.mat, att_splits_2.mat, att_splits_3.mat, att_splits_4.mat). This means that each ZSL method has to be executed once for each of the provided splits.
For example, to evaluate the LAD dataset with the SAE method, run the following code:
# Run SAE algorithm for LAD dataset
SAE(dataset="LAD", filename="MobileNetV2", att_split="_0") # att_splits_0
SAE(dataset="LAD", filename="MobileNetV2", att_split="_1") # att_splits_1
SAE(dataset="LAD", filename="MobileNetV2", att_split="_2") # att_splits_2
SAE(dataset="LAD", filename="MobileNetV2", att_split="_3") # att_splits_3
SAE(dataset="LAD", filename="MobileNetV2", att_split="_4") # att_splits_4
After executing the above code, we end up with ten .txt files containing the predictions for each evaluated split.
# ZSL
preds_SAE_ZSL_att_0.txt
preds_SAE_ZSL_att_1.txt
preds_SAE_ZSL_att_2.txt
preds_SAE_ZSL_att_3.txt
preds_SAE_ZSL_att_4.txt
# GZSL
preds_SAE_GZSL_att_0.txt
preds_SAE_GZSL_att_1.txt
preds_SAE_GZSL_att_2.txt
preds_SAE_GZSL_att_3.txt
preds_SAE_GZSL_att_4.txt
After evaluating each of the five splits with the chosen ZSL algorithm, the final accuracy is obtained by averaging the results of the five super-classes.
First, compute the Top-1 accuracy for LAD:
# Evaluate LAD
compute_zsl_acc_lad(split=0, att_split="att_splits_0", preds_file="preds_SAE_att_0.txt")
compute_zsl_acc_lad(split=1, att_split="att_splits_1", preds_file="preds_SAE_att_1.txt")
compute_zsl_acc_lad(split=2, att_split="att_splits_2", preds_file="preds_SAE_att_2.txt")
compute_zsl_acc_lad(split=3, att_split="att_splits_3", preds_file="preds_SAE_att_3.txt")
compute_zsl_acc_lad(split=4, att_split="att_splits_4", preds_file="preds_SAE_att_4.txt")
# The above code returns a file named results_ZSL.txt, containing the accuracy for each of the 5 super-classes.
evaluate_LAD_ZSL("results_ZSL.txt")
Then, the harmonic mean is computed:
compute_harmonic_SAE_LAD(split=0, att_split="att_splits_0", preds="preds_SAE_GZSL_att_0.txt")
compute_harmonic_SAE_LAD(split=1, att_split="att_splits_1", preds="preds_SAE_GZSL_att_1.txt")
compute_harmonic_SAE_LAD(split=2, att_split="att_splits_2", preds="preds_SAE_GZSL_att_2.txt")
compute_harmonic_SAE_LAD(split=3, att_split="att_splits_3", preds="preds_SAE_GZSL_att_3.txt")
compute_harmonic_SAE_LAD(split=4, att_split="att_splits_4", preds="preds_SAE_GZSL_att_4.txt")
# The above code returns two files named results_seen.txt and results_unseen.txt, containing the accuracy for seen classes and the accuracy for unseen classes, respectively.
evaluate_LAD_GZSL(filename_seen="results_seen.txt", filename_unseen="results_unseen.txt")
However, for the remaining ZSL methods, the results in the generalized setting are obtained with compute_harmonic_acc_lad(split, att_split, preds_seen_file, preds_unseen_file, seen) instead of compute_harmonic_SAE_LAD(split, att_split, preds).
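A hypothetical call for split 0 following the signature above (the prediction file names and the value of the seen flag are illustrative, not the repository's exact usage):
compute_harmonic_acc_lad(split=0, att_split="att_splits_0", preds_seen_file="preds_ESZSL_GZSL_seen_att_0.txt", preds_unseen_file="preds_ESZSL_GZSL_unseen_att_0.txt", seen=True)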
4. Extracting Custom Features ⬆️
To extract features for the datasets using a custom CNN architecture, run the following code:
python feature_extraction.py --model "MobileNet" --dataset_path "/path/to/dataset/*.jpg"
The output of the previous script is a .npy file containing the extracted features.
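For reference, a minimal sketch of how such features could be extracted with Keras is shown below. This is not the repository's exact implementation; the backbone, preprocessing, and output file name are assumptions.
import glob
import numpy as np
import tensorflow as tf

# MobileNet backbone without the classification head; global average pooling yields one vector per image
model = tf.keras.applications.MobileNet(weights="imagenet", include_top=False, pooling="avg")

features = []
for path in sorted(glob.glob("/path/to/dataset/*.jpg")):
    img = tf.keras.preprocessing.image.load_img(path, target_size=(224, 224))
    x = tf.keras.preprocessing.image.img_to_array(img)[None, ...]
    x = tf.keras.applications.mobilenet.preprocess_input(x)
    features.append(model.predict(x)[0])

# One feature vector per image, stacked into a single array
np.save("MobileNet-AWA2-features.npy", np.stack(features))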
After extracting the features, it is necessary to create a MATLAB dictionary so that they can be evaluated with the ZSL methods. To do that, run the following code:
python custom_features.py --dataset AWA2 --dataroot "/path/to/dataset/" --features "MobileNet-AWA2-features" --features_path "/path/to/npy/features/file"
The output of the previous script is a .mat file that can be passed as a parameter to evaluate the ZSL methods using custom features.
5. Optimizing TensorFlow Models with TensorRT ⬆️
When running deep learning models on low-power devices such as the Jetson Nano, it is desirable to optimize the models to take full advantage of the GPU. Thus, we provide two Python scripts to optimize TensorFlow models (1.x and 2.x) using TensorRT.
- TF 1.x models:
In the case of TensorFlow 1.x models, please refer to Convert_TF1.x_Models_to_TensorRT.py (TF_models_optimization/Convert_TF1.x_Models_to_TensorRT.py).
Note: The TF 1.x model to be optimized must be saved as follows:
saver = tf.train.Saver()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
...
# Save Model
saver.save(sess, "model_xpto_tf1")
- TF 2.x models:
In the case of TensorFlow 2.x models, please refer to Convert_TF2.x_Models_to_TensorRT.py (TF_models_optimization/Convert_TF2.x_Models_to_TensorRT.py).
Note: The TF 2.x model to be optimized must be saved using:
tf.saved_model.save(model, 'xpto_saved_model')
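For reference, a TensorFlow 2.x SavedModel is typically converted with the TF-TRT API along the following lines (the repository's script presumably wraps something similar; the output directory is an assumption):
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert the SavedModel saved above and write the TensorRT-optimized SavedModel
converter = trt.TrtGraphConverterV2(input_saved_model_dir="xpto_saved_model")
converter.convert()
converter.save("xpto_trt_saved_model")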
6. Evaluating the Computational Performance of ZSL methods ⬆️
- To measure the time consumed in the visual feature extraction, run the following command:
python feature_extraction_inference_time/main.py
- To measure the time consumed by the different ZSL methods for classifying a test image, run the following command:
Note: You need to download and decompress this file (MD5) into the zsl_methods_inference_time/ directory so that the DEM model can be evaluated.
python zsl_methods_inference_time/main.py
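For reference, inference latency is typically measured by averaging over repeated runs after a warm-up pass; a minimal sketch is shown below (the model and input shape are placeholders, not the repository's code):
import time
import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights="imagenet")  # placeholder model
x = np.random.rand(1, 224, 224, 3).astype("float32")           # placeholder input

model.predict(x)  # warm-up run, excluded from the measurement
runs = 100
start = time.perf_counter()
for _ in range(runs):
    model.predict(x)
print("avg inference time: %.2f ms" % ((time.perf_counter() - start) / runs * 1000))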