
Performs normalization of the text embedding twice #90

Closed

Adamdad opened this issue Apr 3, 2023 · 2 comments

Comments

Adamdad commented Apr 3, 2023

The zeroshot_classification.py script includes code (https://github.com/LAION-AI/CLIP_benchmark/blob/main/clip_benchmark/metrics/zeroshot_classification.py#L50) that normalizes the text embeddings twice. Specifically, PyTorch's F.normalize is called to normalize the text embeddings along the last dimension, and the resulting tensor is then averaged along the first dimension to obtain a single embedding vector. However, the code then normalizes this averaged vector again by dividing by class_embedding.norm(). This second normalization appears to be redundant and can be safely removed.

class_embeddings = model.encode_text(texts)
class_embedding = F.normalize(class_embeddings, dim=-1).mean(dim=0)
class_embedding /= class_embedding.norm() # Repeat Normalization
djghosh13 (Contributor) commented

I believe the second norm is necessary; note that class_embedding is the mean of multiple class_embeddings and isn't guaranteed to be normalized.
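
For illustration (not from the original thread), here is a minimal sketch using hypothetical random vectors in place of model.encode_text(texts); it shows that the mean of unit-normalized embeddings generally has norm below 1, which is why the division by class_embedding.norm() is kept:

import torch
import torch.nn.functional as F

# Hypothetical stand-in for model.encode_text(texts): 7 prompt embeddings of dim 512.
class_embeddings = torch.randn(7, 512)

# First normalization: each prompt embedding becomes a unit vector.
normalized = F.normalize(class_embeddings, dim=-1)
print(normalized.norm(dim=-1))          # all approximately 1.0

# Averaging unit vectors generally yields a vector with norm < 1.
class_embedding = normalized.mean(dim=0)
print(class_embedding.norm())           # typically well below 1.0

# The second normalization restores unit norm.
class_embedding = class_embedding / class_embedding.norm()
print(class_embedding.norm())           # approximately 1.0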

mehdidc (Collaborator) commented Jul 7, 2023

Indeed, after averaging, normalization is no longer guaranteed, so this step is still needed.

mehdidc closed this as completed Jul 7, 2023