Improving neural network representations using human similarity judgments

Environment setup and dependencies

We recommend creating a virtual environment (e.g., human_alignment) that includes all dependencies, via conda:

$ conda env create --prefix /path/to/conda/envs/human_alignment --file envs/environment.yml
$ conda activate human_alignment
$ pip install git+https://github.com/openai/CLIP.git

Alternatively, dependencies can be installed via pip:

$ conda create --name human_alignment python=3.9
$ conda activate human_alignment
$ pip install --upgrade pip
$ pip install -r requirements.txt
$ pip install git+https://github.com/openai/CLIP.git

Repository structure

root
├── envs
│   └── environment.yml
├── data
│   ├── __init__.py
│   ├── cifar.py
│   └── things.py
├── utils
│   ├── __init__.py
│   ├── analyses/*.py
│   ├── evaluation/*.py
│   └── probing/*.py
├── models
│   ├── __init__.py
│   ├── custom_mode.py
│   └── utils.py
├── .gitignore
├── README.md
├── main_embedding_sim_eval.py
├── main_embedding_triplet_eval.py
├── main_model_comparison.py
├── main_model_sim_eval.py
├── main_model_triplet_eval.py
├── main_probing.py
├── requirements.txt
├── search_temp_scaling.py
├── show_triplets.py
└── visualize_embeddings.py

Usage

Run the evaluation script on the THINGS triplet odd-one-out task with a set of pretrained models.

$ python main_model_triplet_eval.py --data_root /path/to/data/name \
--dataset name \
--model_names resnet101 vgg11 clip_ViT-B/32 clip_RN50 vit_b_16 \
--module logits \
--overall_source thingsvision \
--sources torchvision torchvision custom custom torchvision  \
--model_dict_path /path/to/model_dict.json \
--batch_size 128 \
--distance cosine \
--out_path /path/to/results \
--device cpu \
--verbose \
--rnd_seed 42
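
For intuition, a model's choice on each triplet is the image that does not belong to the most similar pair, and accuracy is the fraction of triplets on which this choice matches the human odd-one-out. A minimal sketch of that decision rule (a hypothetical helper assuming cosine similarity between embeddings, not the repository's exact implementation):

import numpy as np

def odd_one_out(z: np.ndarray) -> int:
    # z: (3, d) array holding the embeddings of one triplet.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-normalize rows
    sim = z @ z.T                                     # pairwise cosine similarities
    pairs = [(0, 1), (0, 2), (1, 2)]
    i, j = max(pairs, key=lambda p: sim[p])           # most similar pair
    return ({0, 1, 2} - {i, j}).pop()                 # remaining index is the odd one out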

Run the evaluation script on multi-arrangement similarity judgments with a set of pretrained models.

$ python main_model_sim_eval.py --data_root /path/to/data/name \
--dataset name \
--model_names resnet101 vgg11 clip_ViT-B/32 clip_RN50 vit_b_16 \
--module logits \
--overall_source thingsvision \
--sources torchvision torchvision custom custom torchvision  \
--model_dict_path /path/to/model_dict.json \
--batch_size 118 \
--out_path /path/to/results \
--device cpu \
--verbose \
--rnd_seed 42
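
Agreement with multi-arrangement similarity judgments is commonly measured by correlating the model's representational similarity matrix (RSM) with the human RSM. A rough sketch of such a comparison (Spearman correlation over the upper triangle; an illustration, not necessarily the script's exact metric):

import numpy as np
from scipy.stats import spearmanr

def rsm_correlation(model_rsm: np.ndarray, human_rsm: np.ndarray) -> float:
    # Compare only the upper triangles, excluding the diagonal.
    iu = np.triu_indices_from(model_rsm, k=1)
    return spearmanr(model_rsm[iu], human_rsm[iu]).correlation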

Downstream Task Evaluations

We evaluate the transformation matrix obtained by probing on the THINGS task on various downstream tasks.
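
Conceptually, each downstream evaluation applies the learned affine transform to a model's raw features before computing the task metric. A minimal sketch (the file path and dictionary keys below are assumptions for illustration; the actual transforms are produced by main_probing.py):

import pickle
import numpy as np

def apply_transform(features: np.ndarray, transform: dict) -> np.ndarray:
    # Affine map: project the features and shift by a bias, if present.
    z = features @ transform["weights"]
    if "bias" in transform:
        z += transform["bias"]
    return z

features = np.random.randn(10, 512).astype(np.float32)      # stand-in for extracted features
with open("/path/to/transforms/transform.pkl", "rb") as f:  # hypothetical path and format
    transform = pickle.load(f)
transformed = apply_transform(features, transform)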

CLIP Retrieval

We evaluate text-to-image retrieval on the Flickr30K dataset. To compute the embeddings for all CLIP models, run:

python main_retrieval_init.py --embeddings_dir /home/space/datasets/things/downstream/clip-retrieval/retrieval_embeddings \
                              --data_root /home/space/datasets/things/downstream/clip-retrieval/flickr30k_images

(The embeddings are already computed on the TU cluster, so there is no need to run this step when working there.)

To evaluate the embeddings with and without transforms:

python main_retrieval_eval.py --out retrieval_results.csv \
                              --update_transforms \
                              --embeddings_dir /home/space/datasets/things/downstream/clip-retrieval/retrieval_embeddings \
                              --data_root /home/space/datasets/things/downstream/clip-retrieval/flickr30k_images

--concat_weight can be used to concatenate the transformed and original embeddings, weighting the transformed ones. --transform_path can be used to change the path from which the transformation matrices are loaded.
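
For reference, text-to-image retrieval ranks all image embeddings by similarity to each caption embedding and reports Recall@k. A simplified sketch of the core idea, assuming a one-to-one caption/image pairing (Flickr30K in fact has multiple captions per image, which this sketch glosses over):

import numpy as np

def recall_at_k(text_emb: np.ndarray, image_emb: np.ndarray, k: int = 1) -> float:
    # Normalize so that dot products are cosine similarities.
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    sim = text_emb @ image_emb.T                 # caption-by-image similarity matrix
    topk = np.argsort(-sim, axis=1)[:, :k]       # k most similar images per caption
    targets = np.arange(len(text_emb))[:, None]  # ground-truth image index per caption
    return float((topk == targets).any(axis=1).mean())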

Anomaly Detection

We evaluate nearest-neighbor-based anomaly detection on various datasets. The following script evaluates all models with all transforms on all datasets:

python main_ad_runner.py

For individual runs on a single model/dataset with more control, use the main_anomaly_detection.py script.
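
As a rough illustration of the nearest-neighbor approach, the anomaly score for a test embedding can be taken as its distance to the k-th nearest neighbor among nominal training embeddings (a sketch of the general technique, not the script's exact procedure):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_anomaly_scores(train_emb: np.ndarray, test_emb: np.ndarray, k: int = 5) -> np.ndarray:
    # Fit on nominal (normal-class) embeddings only.
    nn = NearestNeighbors(n_neighbors=k).fit(train_emb)
    dists, _ = nn.kneighbors(test_emb)  # shape: (n_test, k)
    return dists[:, -1]                 # distance to the k-th neighbor; larger = more anomalous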

Few-Shot Classification

We evaluate few-shot classification performance on various datasets. First, extract representations via main_extract_fs_datasets.py; then run main_fewshot.py to generate the results. For example:

python3 main_fewshot.py \
--data_root "/path/to/data/name" \
--task none \
--dataset SUN397 \
--input_dim 256 \
--module penultimate \
--model_names "OpenCLIP_ViT-L-14_laion2b_s32b_b82k" \
--sources custom \
--model_dict_path "/path/to/model_dict.json" \
--n_test 50 \
--n_reps 5 \
--n_classes 397 \
--out_dir "/path/to/output" \
--embeddings_root "/path/to/extracted_data/name" \
--transforms_root "/path/to/transforms" \
--things_embeddings_path "/path/to/things/model_features_per_source.pkl" \
--transform_type "without"

Model names and sources are obtained from the thingsvision library. For cifar100, --task may be set to coarse, while for imagenet one of living17, entity13, entity30, or nonliving26 is expected. --transforms_root only needs to be set when evaluating transforms other than without.
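
For intuition, a few-shot episode of this kind typically fits a simple linear probe (e.g., logistic regression) on a handful of training examples per class, drawn from the (optionally transformed) embeddings, and reports accuracy on held-out test samples, averaged over repetitions. A minimal sketch, not the script's exact protocol:

import numpy as np
from sklearn.linear_model import LogisticRegression

def fewshot_accuracy(train_x: np.ndarray, train_y: np.ndarray,
                     test_x: np.ndarray, test_y: np.ndarray) -> float:
    # Fit a logistic-regression probe on the episode's support set and score it.
    clf = LogisticRegression(max_iter=1000).fit(train_x, train_y)
    return clf.score(test_x, test_y)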
