SHERL

PyTorch implementation for the ECCV 2024 paper “SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning”.

It is built on top of UniPT, LST, VSE-infty, CLIP-ViL, CLIP4Clip, MDETR, and Awesome_Pretraining_Transfering.

If you have any problems, please contact me at [email protected] ([email protected] is deprecated).

Introduction

We propose SHERL, an innovative strategy for resource-limited scenarios. It decouples the entire adaptation into two successive and complementary routes. In the early route, intermediate outputs are consolidated via an anti-redundancy operation, which enhances their compatibility for subsequent interactions. In the late route, reusing a minimal number of late pre-trained layers alleviates the peak memory overhead during training and regulates these fairly flexible features into more adaptive and powerful representations for new domains.

The framework and applications of SHERL are illustrated in the figure in the repository.
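To make the two routes concrete, below is a minimal PyTorch sketch of the idea. It is an illustrative assumption based on the paragraph above, not the repository's actual implementation: the SHERLSketch name, the cosine-similarity form of the anti-redundancy weighting, and the choice of a single reused late layer are all placeholders.

    import torch
    import torch.nn as nn

    class SHERLSketch(nn.Module):
        """Illustrative two-route adaptation sketch (NOT the official SHERL code).

        Early route: consolidate intermediate backbone outputs with an
        anti-redundancy weighting, so overlapping features are down-weighted.
        Late route: pass the consolidated features through a minimal number of
        late pre-trained layers (here: one), which caps peak activation memory
        because gradients never flow through the frozen early backbone.
        """

        def __init__(self, frozen_blocks, late_block, dim):
            super().__init__()
            self.frozen_blocks = frozen_blocks        # early pre-trained layers (frozen)
            for p in self.frozen_blocks.parameters():
                p.requires_grad_(False)
            self.late_block = late_block              # minimal late pre-trained layer (tuned)
            self.proj = nn.Linear(dim, dim)           # lightweight trainable projection

        def forward(self, x):
            # ---- early route: run frozen layers without building a graph ----
            feats = []
            with torch.no_grad():
                h = x
                for blk in self.frozen_blocks:
                    h = blk(h)
                    feats.append(h)
            stack = torch.stack(feats, dim=0)         # (L, B, N, D)

            # Anti-redundancy consolidation (one plausible instantiation):
            # layers that closely match the mean feature carry redundant
            # information, so they receive smaller weights.
            mean = stack.mean(dim=0, keepdim=True)
            sim = torch.cosine_similarity(stack, mean, dim=-1)   # (L, B, N)
            weights = torch.softmax(-sim, dim=0).unsqueeze(-1)   # redundant -> small
            consolidated = (weights * stack).sum(dim=0)          # (B, N, D)

            # ---- late route: only this part is in the backward graph ----
            return self.late_block(self.proj(consolidated))

Because the early route runs under torch.no_grad(), back-propagation only traverses the projection and the single late block, so peak training memory no longer scales with the full backbone depth.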

Task & Model Details

Image-Text Retrieval: VSE-infty with the strongest combination of a BERT-base model and a ResNeXt-101(32×8d) backbone pre-trained on Instagram (WSL).

Video-Text Retrieval: CLIP4Clip with the pre-trained CLIP network, using its Text Transformer and ViT-B/32 models.

Question Answering: CLIP-ViL that utilizes the CLIP image backbone and encodes the text into the word embedding sequence, followed by a cross-modal Transformer.

Visual Grounding: MDETR with a pre-trained ResNet-101 vision encoder, a RoBERTa-base text encoder, and a query-based encoder-decoder Transformer.

Language-only Tasks: LST with pre-trained T5-Base, T5-Large, and T5-3B models, each a standard encoder-decoder Transformer.

Please refer to their respective README.md files for the detailed settings.

Guidance for Applications

We summarize the positions where SHERL is defined and invoked in each codebase (similar to UniPT); for a generic picture of this pattern, see the sketch below. We hope these pointers help you quickly realize your own ideas beyond SHERL.
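As a generic illustration of the "defined and invoked" pattern, the hypothetical snippet below wires the SHERLSketch class from the Introduction into a toy 12-layer Transformer backbone. All names and hyper-parameters here are placeholders, not the repository's actual entry points:

    import torch
    import torch.nn as nn

    # Toy stand-in for a pre-trained backbone: 12 Transformer encoder layers.
    layers = nn.ModuleList(
        nn.TransformerEncoderLayer(d_model=768, nhead=8, batch_first=True)
        for _ in range(12)
    )

    # SHERLSketch is the illustrative class from the Introduction above:
    # all but the last layer feed the frozen early route, and the last
    # layer becomes the minimal late route.
    model = SHERLSketch(frozen_blocks=layers[:-1], late_block=layers[-1], dim=768)

    x = torch.randn(2, 16, 768)        # (batch, tokens, dim)
    out = model(x)                     # gradients reach only proj + late block

    # Hand only the trainable parameters to the optimizer.
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(trainable, lr=1e-4)

The same pattern transfers to each task above: slice the early layers of the pre-trained backbone into the frozen route, and keep only the last few pre-trained layers trainable.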

Reference

If SHERL is useful for your research, please cite the following paper:

    @article{Diao2024SHERL,
        title={SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning},
        author={Diao, Haiwen and Wan, Bo and Jia, Xu and Zhuge, Yunzhi and Zhang, Ying and Lu, Huchuan and Chen, Long},
        journal={arXiv preprint arXiv:2407.07523},
        year={2024}
    }

License

Apache License 2.0.