Issue dumping results for sugarcrepe benchmark #128

Closed
escorciav opened this issue Oct 30, 2024 · 2 comments
Comments


escorciav commented Oct 30, 2024

There is an issue dumping results for all the tasks/subsets of sugar_crepe to the output JSON, no?

It runs over all the subsets, but the output JSON might not be generated as you intended: only the results for `sugar_crepe/swap_obj` are retained.

$ clip_benchmark eval --model ViT-B-16 --pretrained laion400m_e32 --dataset=sugar_crepe --output=vitb16_sugarcrepe.json --dataset_root ~/datasets/coco
Models: [('ViT-B-16', 'laion400m_e32')]
Datasets: ['sugar_crepe/add_att', 'sugar_crepe/add_obj', 'sugar_crepe/replace_att', 'sugar_crepe/replace_obj', 'sugar_crepe/replace_rel', 'sugar_crepe/swap_att', 'sugar_crepe/swap_obj']
Languages: ['en']
Running 'image_caption_selection' on 'sugar_crepe/add_att' with the model 'laion400m_e32' on language 'en'
Dataset size: 692
Dataset split: test
  0%|                                                                   | 0/11 [00:00<?, ?it/s]/home/SERILOCAL/v.castillo/projects/genai-research/clip-benchmark/clip_benchmark/metrics/image_caption_selection.py:55: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.no_grad(), autocast():
100%|██████████████████████████████████████████████████████████| 11/11 [00:02<00:00,  5.36it/s]
Dump results to: vitb16_sugarcrepe.json
Running 'image_caption_selection' on 'sugar_crepe/add_obj' with the model 'laion400m_e32' on language 'en'
Dataset size: 2062
Dataset split: test
100%|██████████████████████████████████████████████████████████| 33/33 [00:04<00:00,  6.94it/s]
Dump results to: vitb16_sugarcrepe.json
Running 'image_caption_selection' on 'sugar_crepe/replace_att' with the model 'laion400m_e32' on language 'en'
Dataset size: 788
Dataset split: test
100%|██████████████████████████████████████████████████████████| 13/13 [00:02<00:00,  6.06it/s]
Dump results to: vitb16_sugarcrepe.json
Running 'image_caption_selection' on 'sugar_crepe/replace_obj' with the model 'laion400m_e32' on language 'en'
Dataset size: 1652
Dataset split: test
100%|██████████████████████████████████████████████████████████| 26/26 [00:04<00:00,  6.15it/s]
Dump results to: vitb16_sugarcrepe.json
Running 'image_caption_selection' on 'sugar_crepe/replace_rel' with the model 'laion400m_e32' on language 'en'
Dataset size: 1406
Dataset split: test
100%|██████████████████████████████████████████████████████████| 22/22 [00:03<00:00,  6.07it/s]
Dump results to: vitb16_sugarcrepe.json
Running 'image_caption_selection' on 'sugar_crepe/swap_att' with the model 'laion400m_e32' on language 'en'
Dataset size: 666
Dataset split: test
100%|██████████████████████████████████████████████████████████| 11/11 [00:02<00:00,  5.28it/s]
Dump results to: vitb16_sugarcrepe.json
Running 'image_caption_selection' on 'sugar_crepe/swap_obj' with the model 'laion400m_e32' on language 'en'
Dataset size: 245
Dataset split: test
100%|████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  3.82it/s]
Dump results to: vitb16_sugarcrepe.json
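
A quick way to confirm that only the last subset survives is to inspect the file after the run. This is just a sketch: the .dataset key is an assumption about the output schema, so adjust it to whatever the JSON actually contains.

# Hypothetical check: if earlier subsets were overwritten, only the last one shows up.
# '.dataset' is an assumed field name, not confirmed against the clip_benchmark output format.
jq '.dataset' vitb16_sugarcrepe.json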

escorciav commented Oct 30, 2024

Not a big deal. One can dispatch a bash (or similar) script along these lines:

#!/bin/bash

# Define the model and pretrained settings
model="ViT-B-16"
model_name=vitb16
pretrained="laion400m_e32"
dataset_root="/home/SERILOCAL/v.castillo/datasets/coco"
# The SugarCrepe tasks/subsets
tasks=("add_att" "add_obj" "replace_att" "replace_obj" "replace_rel" "swap_att" "swap_obj")

for task in "${tasks[@]}"
do
    # Construct the dataset and output paths
    dataset="sugar_crepe/$task"
    output="${model_name}_sugarcrepe-$task.json"

    # Run the command
    clip_benchmark eval --model "$model" --pretrained "$pretrained" --dataset="$dataset" --output="$output" --dataset_root "$dataset_root"
done

then ask an LLM to merge them 😆. Perhaps clip_benchmark can even merge the JSON files; I'm still getting familiar with it 😊
Thanks for putting this together 🤟
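
If it helps, here is a minimal sketch for merging the per-task files produced by the loop above into a single JSON array; it assumes jq is installed and that the filenames follow the pattern used in the script:

# Slurp all per-task result files into one JSON array (jq -s reads every input
# file and wraps the documents in an array). Filenames follow the loop above.
jq -s '.' vitb16_sugarcrepe-*.json > vitb16_sugarcrepe_all.json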

escorciav commented

My bad, it's related to the --output argument: using a template along these lines should fix it: --output='out_{dataset}.json'
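
For completeness, the original invocation with a templated output path would look something along these lines ({dataset} presumably expands to a sanitized subset name, so each subset gets its own file instead of overwriting the previous one; the filename pattern is only illustrative):

# One output file per subset instead of a single, repeatedly overwritten file
clip_benchmark eval --model ViT-B-16 --pretrained laion400m_e32 --dataset=sugar_crepe --output='vitb16_sugarcrepe_{dataset}.json' --dataset_root ~/datasets/coco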

cc @mehdidc
