Ultimate-Awesome-Transformer-Attention

This repo contains a comprehensive list of Vision Transformer & Attention papers, including the papers, code, and related websites.
This list is maintained by Min-Hung Chen and is actively updated.

If you find any missing papers, feel free to create pull requests, open issues, or email me.
Contributions in any form to make this list more comprehensive are welcome.

If you find this repository useful, please consider citing and ★STARing this list.
Feel free to share this list with others!

[Update: February, 2023] Added all the related papers from ICLR 2023!
[Update: December, 2022] Added attention-free papers from Networks Beyond Attention (GitHub) made by Jianwei Yang
[Update: November, 2022] Added all the related papers from NeurIPS 2022!
[Update: October, 2022] Split the 2nd half of the paper list to README_2.md
[Update: October, 2022] Added all the related papers from ECCV 2022!
[Update: September, 2022] Added the Transformer tutorial slides made by Lucas Beyer!
[Update: June, 2022] Added all the related papers from CVPR 2022!

------ (The following papers are moved to README_2.md) ------

Survey

"A Survey on Visual Transformer", TPAMI, 2022 (Huawei). [Paper]
"A Comprehensive Study of Vision Transformers on Dense Prediction Tasks", VISAP, 2022 (NavInfo Europe, Netherlands). [Paper]
"Vision-and-Language Pretrained Models: A Survey", IJCAI, 2022 (The University of Sydney). [Paper]
"Vision Transformers in Medical Imaging: A Review", arXiv, 2022 (Covenant University, Nigeria). [Paper]
"A Comprehensive Survey of Transformers for Computer Vision", arXiv, 2022 (Sejong University). [Paper]
"Vision-Language Pre-training: Basics, Recent Advances, and Future Trends", arXiv, 2022 (Microsoft). [Paper]
"Vision+X: A Survey on Multimodal Learning in the Light of Data", arXiv, 2022 (Illinois Institute of Technology, Chicago). [Paper]
"Vision Transformers for Action Recognition: A Survey", arXiv, 2022 (Charles Sturt University, Australia). [Paper]
"VLP: A Survey on Vision-Language Pre-training", arXiv, 2022 (CAS). [Paper]
"Transformers in Remote Sensing: A Survey", arXiv, 2022 (MBZUAI). [Paper][GitHub]
"Medical image analysis based on transformer: A Review", arXiv, 2022 (NUS, Singapore). [Paper]
"3D Vision with Transformers: A Survey", arXiv, 2022 (MBZUAI). [Paper][GitHub]
"Vision Transformers: State of the Art and Research Challenges", arXiv, 2022 (NYCU). [Paper]
"Transformers in Medical Imaging: A Survey", arXiv, 2022 (MBZUAI). [Paper][GitHub]
"Multimodal Learning with Transformers: A Survey", arXiv, 2022 (Oxford). [Paper]
"Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives", arXiv, 2022 (CAS). [Paper]
"Transformers in 3D Point Clouds: A Survey", arXiv, 2022 (University of Waterloo). [Paper]
"A survey on attention mechanisms for medical applications: are we moving towards better algorithms?", arXiv, 2022 (INESC TEC and University of Porto, Portugal). [Paper]
"Efficient Transformers: A Survey", arXiv, 2022 (Google). [Paper]
"Are we ready for a new paradigm shift? A Survey on Visual Deep MLP", arXiv, 2022 (Tsinghua). [Paper]
"Vision Transformers in Medical Computer Vision - A Contemplative Retrospection", arXiv, 2022 (National University of Sciences and Technology (NUST), Pakistan). [Paper]
"Video Transformers: A Survey", arXiv, 2022 (Universitat de Barcelona, Spain). [Paper]
"Transformers in Medical Image Analysis: A Review", arXiv, 2022 (Nanjing University). [Paper]
"Recent Advances in Vision Transformer: A Survey and Outlook of Recent Work", arXiv, 2022 (?). [Paper]
"Transformers Meet Visual Learning Understanding: A Comprehensive Review", arXiv, 2022 (Xidian University). [Paper]
"Image Captioning In the Transformer Age", arXiv, 2022 (Alibaba). [Paper][GitHub]
"Visual Attention Methods in Deep Learning: An In-Depth Survey", arXiv, 2022 (Fayoum University, Egypt). [Paper]
"Transformers in Vision: A Survey", ACM Computing Surveys, 2021 (MBZUAI). [Paper]
"Survey: Transformer based Video-Language Pre-training", arXiv, 2021 (Renmin University of China). [Paper]
"A Survey of Transformers", arXiv, 2021 (Fudan). [Paper]
"A Survey of Visual Transformers", arXiv, 2021 (CAS). [Paper]
"Attention mechanisms and deep learning for machine vision: A survey of the state of the art", arXiv, 2021 (University of Kashmir, India). [Paper]


Overview

Survey
Image Classification / Backbone
    Replace Conv w/ Attention
        Pure Attention
        Conv-stem + Attention
        Conv + Attention
    Vision Transformer
        General Vision Transformer
        Efficient Vision Transformer
        Conv + Transformer
        Training + Transformer
        Robustness + Transformer
        Model Compression + Transformer
    Attention-Free
        MLP-Series
        Other Attention-Free
    Analysis for Transformer
Detection
    Object Detection
    3D Object Detection
    Multi-Modal Detection
    HOI Detection
    Salient Object Detection
    Other Detection Tasks
Segmentation
    Semantic Segmentation
    Depth Estimation
    Object Segmentation
    Other Segmentation Tasks
Video (High-level)
    Action Recognition
    Action Detection/Localization
    Action Prediction/Anticipation
    Video Object Segmentation
    Video Instance Segmentation
    Other Video Tasks
Multi-Modality
    Visual Captioning
    Visual Question Answering
    Visual Grounding
    Multi-Modal Representation Learning
    Multi-Modal Retrieval
    Multi-Modal Generation
    Visual Document Understanding
    Other Multi-Modal Tasks
Citation
References
