Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignment and Prompt Tuning
Leslie Ching Ow Tiong*,1, Dick Sigmund*,2, Chen-Hui Chan3, Andrew Beng Jin Teoh†,4
1Samsung Electronics, 2AIDOT Inc., 3Korea Institute of Science and Technology, 4Yonsei University
*These authors contributed equally
†Corresponding author
This repository contains the source code for *Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignment and Prompt Tuning*, accepted at CVPR 2024.
Flexible Biometric Recognition (FBR) is designed to advance conventional face, periocular, and multimodal face-periocular biometrics across both intra- and cross-modality recognition tasks. FBR strategically utilizes the Multimodal Fusion Attention (MFA) and Multimodal Prompt Tuning (MPT) mechanisms within the Vision Transformer architecture.
MFA facilitates the fusion of modalities, ensuring cohesive alignment between facial and periocular embeddings while incorporating soft biometrics to enhance the model's ability to discriminate between individuals. Fusing the three modalities is pivotal for exploring their interrelationships. MPT serves as a unifying bridge, intertwining inputs and promoting cross-modality interactions while preserving each modality's distinctive characteristics.
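To make the prompt-tuning idea concrete, here is a minimal illustrative sketch (not the authors' implementation; the class name, dimensions, and token layout are assumptions) of learnable prompt tokens acting as a bridge between face and periocular token sequences before they enter a shared Transformer encoder:

```python
import torch
import torch.nn as nn

class MultimodalPromptTokens(nn.Module):
    """Illustrative sketch: learnable prompt tokens prepended to the
    concatenated face/periocular token sequences. Inside a Transformer,
    these shared tokens can attend to both modalities, mediating
    cross-modality interaction while each modality keeps its own tokens."""

    def __init__(self, embed_dim=768, num_prompts=8):
        super().__init__()
        self.prompts = nn.Parameter(torch.zeros(1, num_prompts, embed_dim))
        nn.init.trunc_normal_(self.prompts, std=0.02)

    def forward(self, face_tokens, peri_tokens):
        # face_tokens: (B, N_face, D), peri_tokens: (B, N_peri, D)
        b = face_tokens.size(0)
        prompts = self.prompts.expand(b, -1, -1)
        # Concatenate: [prompts | face | periocular] -> (B, P + N_face + N_peri, D)
        return torch.cat([prompts, face_tokens, peri_tokens], dim=1)
```

The returned sequence would then be fed to a standard ViT encoder, so self-attention lets the prompt tokens exchange information with both modalities.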
We train the model on the VGGFace2 and MAAD datasets.
- For training, run `main.py` with the configurations given in `config.py`:

  ```shell
  $ python main.py --training_mode --dataset_name "VGGFace2"
  ```
- For evaluation, run `main.py` with the configurations given in `config.py`:

  ```shell
  $ python main.py --dataset_name "other"
  ```
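The two commands above differ only in their flags. As a hypothetical sketch (the actual option handling lives in `main.py` and `config.py`), the flags could be parsed like this:

```python
import argparse

# Illustrative sketch of the command-line flags shown above; not the repo's code.
parser = argparse.ArgumentParser(description="FBR training/evaluation entry point")
parser.add_argument("--training_mode", action="store_true",
                    help="run training; omit this flag to run evaluation")
parser.add_argument("--dataset_name", type=str, default="VGGFace2",
                    help='dataset to use, e.g. "VGGFace2" or "other"')

# Simulate the training command from the README:
args = parser.parse_args(["--training_mode", "--dataset_name", "VGGFace2"])
```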
For a better understanding, please refer to the code example directory for further details. It provides two examples: `code usage(Dataset).py` and `code usage(MFA-ViT).py`.
We tested the code with:
- PyTorch 1.13.1 with and without GPU, under Ubuntu 18.04/20.04 and Anaconda3 (Python 3.8 and above)
- PyTorch 1.12.0 with and without GPU, under Windows 10 and Anaconda3 (Python 3.8 and above)
Our pretrained model can be accessed here. After downloading, decompress it into a directory named `pretrained` to ensure proper setup.
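A minimal sketch of loading the decompressed checkpoint (the file name `fbr_mfa_vit.pth` is hypothetical; substitute the actual file from the download):

```python
import os
import torch

def load_pretrained(model, ckpt_dir="pretrained", ckpt_name="fbr_mfa_vit.pth"):
    """Load a decompressed checkpoint from the `pretrained` directory.

    Note: `ckpt_name` is a placeholder; use the file name of the
    checkpoint you actually downloaded.
    """
    path = os.path.join(ckpt_dir, ckpt_name)
    state_dict = torch.load(path, map_location="cpu")
    model.load_state_dict(state_dict)
    return model
```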
This work is open-sourced under the MIT license.
```bibtex
@InProceedings{FBR_2024_CVPR,
    author    = {Tiong, Leslie Ching Ow and Sigmund, Dick and Chan, Chen-Hui and Teoh, Andrew Beng Jin},
    title     = {Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignment and Prompt Tuning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2024},
    pages     = {267--276}
}
```