Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignment and Prompt Tuning

Leslie Ching Ow Tiong*¹, Dick Sigmund*², Chen-Hui Chan³, Andrew Beng Jin Teoh†⁴
¹Samsung Electronics, ²AIDOT Inc., ³Korea Institute of Science and Technology, ⁴Yonsei University
*These authors contributed equally
†Corresponding author

Main YouTube Video


Introduction

This repository contains the source code for the paper Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignment and Prompt Tuning, which was accepted at CVPR 2024.

MFA-ViT architecture overview

Flexible Biometric Recognition (FBR) is designed to advance conventional face, periocular, and multimodal face-periocular biometrics across both intra- and cross-modality recognition tasks. FBR strategically utilizes the Multimodal Fusion Attention (MFA) and Multimodal Prompt Tuning (MPT) mechanisms within the Vision Transformer architecture.

MFA fuses the modalities, aligning facial and periocular embeddings while incorporating soft biometrics to strengthen the model's ability to discriminate between individuals. Fusing these three modalities is pivotal for exploring their interrelationships. MPT serves as a unifying bridge, intertwining the inputs and promoting cross-modality interactions while preserving each modality's distinctive characteristics.
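
As a rough, hedged sketch of this idea (not the official MFA/MPT implementation; the class name, token shapes, prompt count, and concatenation strategy below are all illustrative assumptions), a shared attention layer over learnable prompt tokens plus the three modality token streams could look like this:

import torch
import torch.nn as nn

class FusionAttentionSketch(nn.Module):
    """Illustrative fusion-attention block with learnable prompt tokens.

    A sketch only -- not the repository's MFA/MPT code. Face, periocular,
    and soft-biometric token sequences are concatenated with shared prompt
    tokens so one multi-head attention layer can model cross-modality
    interactions while each modality keeps its own token positions.
    """

    def __init__(self, dim=768, num_heads=12, num_prompts=4):
        super().__init__()
        self.prompts = nn.Parameter(torch.zeros(1, num_prompts, dim))
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, face_tokens, peri_tokens, soft_tokens):
        b = face_tokens.size(0)
        prompts = self.prompts.expand(b, -1, -1)
        # One fused sequence: prompts + all modality tokens, (B, P+Nf+Np+Ns, D)
        fused = torch.cat([prompts, face_tokens, peri_tokens, soft_tokens], dim=1)
        fused = self.norm(fused)
        out, _ = self.attn(fused, fused, fused)
        return out

# Example with random token embeddings (all shapes are assumptions):
block = FusionAttentionSketch()
face = torch.randn(2, 196, 768)   # face patch tokens
peri = torch.randn(2, 64, 768)    # periocular patch tokens
soft = torch.randn(2, 4, 768)     # soft-biometric tokens
fused = block(face, peri, soft)   # shape (2, 4 + 196 + 64 + 4, 768)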


Dataset

We train this model on the VGGFace2 and MAAD datasets, both of which are publicly available.

Requirements

  1. Anaconda3
  2. PyTorch
  3. RTDL
  4. Natsort
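
Assuming these map to their usual conda/PyPI package names (an assumption; check the repository for exact pinned versions), an environment could be set up as follows:

$ conda create -n fbr python=3.8
$ conda activate fbr
$ pip install torch rtdl natsort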

Usage

  • For training, run main.py with the configurations given in config.py:
    $ python main.py --training_mode --dataset_name "VGGFace2"
  • For evaluation, run main.py with the configurations given in config.py:
    $ python main.py --dataset_name "other"

For further details, please refer to the code example directory, which provides two examples: code usage(Dataset).py and code usage(MFA-ViT).py. A minimal sketch in their spirit is shown below.
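As a hedged illustration of what a paired loader in the spirit of code usage(Dataset).py might look like (the class and field names below are assumptions, not the repository's API):

import torch
from torch.utils.data import Dataset

class PairedBiometricDataset(Dataset):
    """Illustrative paired face/periocular dataset -- a sketch, not the repo's loader."""

    def __init__(self, faces, perioculars, labels):
        # faces/perioculars: tensors of image crops; labels: identity indices
        self.faces, self.perioculars, self.labels = faces, perioculars, labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Each sample pairs both modalities with one identity label
        return self.faces[idx], self.perioculars[idx], self.labels[idx]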

Compatibility

We tested the code with:

  1. PyTorch 1.13.1 with and without GPU, under Ubuntu 18.04/20.04 and Anaconda3 (Python 3.8 and above)
  2. PyTorch 1.12.0 with and without GPU, under Windows 10 and Anaconda3 (Python 3.8 and above)

Pretrained Download

Our pretrained model can be accessed here. After downloading, decompress it into a directory named pretrained to ensure proper setup. A hedged loading sketch follows.
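
A minimal loading sketch, assuming the archive contains a single PyTorch state-dict file (the file name below is hypothetical; use the actual file from the archive):

import torch

# "pretrained/model.pth" is an assumed name -- substitute the real checkpoint file
state_dict = torch.load("pretrained/model.pth", map_location="cpu")
# model.load_state_dict(state_dict)  # 'model' is the constructed MFA-ViT network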

License

This work is open-source under the MIT license.

Citing

@InProceedings{FBR_2024_CVPR,
    author    = {Tiong, Leslie Ching Ow and Sigmund, Dick and Chan, Chen-Hui and Teoh, Andrew Beng Jin},
    title     = {Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignment and Prompt Tuning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2024},
    pages     = {267--276}
}
