A collection of papers and released code for talking head generation.
For additions or corrections to this list, please open an issue or a pull request, or e-mail me at [email protected].
If you are doing research on talking head generation, you can add my Discord account, Fa-Ting Hong#6563, for better communication and cooperation.
🔥I am currently seeking a job or postdoctoral position. If you are interested in my qualifications and experience, please feel free to contact me. 🔥
- VoxCeleb1 [Download link].
- VoxCeleb2 [Download link].
- FaceForensics++ [Download link].
- CelebV [Download link].
- TalkingHead-1KH [Download link].
- LRW (Lip Reading in the Wild) [Download link].
- MEAD [Download link].
- CelebV-HQ [Download link].
- CHDTF [Download link].
- [Face2face] Face2face: Real-time face capture and reenactment of RGB videos, CVPR 2016.
- [ReenactGAN] ReenactGAN: Learning to Reenact Faces via Boundary Transfer, ECCV 2018. [Code]
- [X2Face] X2Face: A network for controlling face generation by using images, audio, and pose codes, ECCV 2018. [Code], [Project]
- [FOMM] First order motion model for image animation, NeurIPS 2019. [Code]
- [NeuralHead] Few-Shot Adversarial Learning of Realistic Neural Talking Head Models, ICCV 2019. [Code]
- [Monkey-Net] Animating Arbitrary Objects via Deep Motion Transfer, CVPR 2019 Oral. [Code], [Project]
- [fs-vid2vid] Few-shot Video-to-Video Synthesis, NeurIPS 2019. [Code], [Project]
- [MeshG] Mesh Guided One-shot Face Reenactment Using Graph Convolutional Networks, ACM Multimedia 2020. [Code]
- [MarioNETte] MarioNETte: Few-shot Face Reenactment Preserving Identity of Unseen Targets, AAAI 2020. [Project]
- [CrossID-GAN] Learning Identity-Invariant Motion Representations for Cross-ID Face Reenactment, CVPR 2020.
- [face-vid2vid] One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing, CVPR 2021 Oral. [Project]
- [S2D] Sparse to Dense Motion Transfer for Face Image Animation, ICCV 2021.
- [SAFA] SAFA: Structure Aware Face Animation, 3DV 2021. [Code]
- [SAA] Self-appearance-aided Differential Evolution for Motion Transfer, arXiv 2021.
- [PIRenderer] PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering, ICCV 2021. [Code]
- [FaceGAN] FACEGAN: Facial Attribute Controllable rEenactment GAN, WACV 2021.
- [F^3A-GAN] F3A-GAN: Facial Flow for Face Animation With Generative Adversarial Networks, IEEE TIP 2021.
- [FACIAL] FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning, ICCV 2021.
- [MRAA] Motion Representations for Articulated Animation, CVPR 2021. [Code]
- [HeadGAN] HeadGAN: One-shot Neural Head Synthesis and Editing, ICCV 2021. [Project]
- [DaGAN] Depth-Aware Generative Adversarial Network for Talking Head Video Generation, CVPR 2022. [Code], [Project]
- [TPSM] Thin-Plate Spline Motion Model for Image Animation, CVPR 2022. [Code]
- [StyleHEAT] StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pretrained StyleGAN, ECCV 2022. [Code], [Project]
- [MegaPortraits] MegaPortraits: One-shot Megapixel Neural Head Avatars, ACM MM 2022. [Project]
- [DAM] Structure-Aware Motion Transfer with Deformable Anchor Model, CVPR 2022. [Code]
- [StyleMask] StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment, FG 2023. [Code]
- [CoRF] Controllable Radiance Fields for Dynamic Face Synthesis, arXiv 2022.
- [AniFaceGAN] Animatable 3D-Aware Face Image Generation for Video Avatars, NeurIPS 2022. [Project]
- [IW] Implicit Warping for Animation with Image Sets, NeurIPS 2022. [Project]
- [HifiHead] HifiHead: One-Shot High Fidelity Neural Head Synthesis with 3D Control, IJCAI 2022.
- Face Animation with Multiple Source Images, arXiv 2022.
- [MetaPortrait] MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation, arXiv 2022.
- Compressing Video Calls using Synthetic Talking Heads, BMVC 2022. [Project]
- Finding Directions in GAN’s Latent Space for Neural Face Reenactment, BMVC 2022. [Project], [Code]
- [LIA] Latent Image Animator: Learning to Animate Images via Latent Space Navigation, ICLR 2022. [Project], [Code]
- [AVFR-GAN] Audio-Visual Face Reenactment, WACV 2023. [Code], [Project]
- [TS-Net] Cross-identity Video Motion Retargeting with Joint Transformation and Synthesis, WACV 2023. [Code]
- [MCNET] Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation, ICCV 2023. [Project], [Code]
- [X-Portrait] X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention, arXiv 2024.
- [LivePortrait] LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control. [Code], [Project]
- [EMOPortraits] EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars, CVPR 2024. [Code], [Project]
- [SMA] Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation, CVPR 2024. [Project]
- [LRW] Lip Reading in the Wild, ACCV 2016.
- [Synthesizing-Obama] Synthesizing Obama: Learning Lip Sync From Audio, SIGGRAPH 2017. [Project]
- [You-Said-That?] You Said That?: Synthesising Talking Faces From Audio, IJCV 2019. [Code]
- Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion, SIGGRAPH 2017.
- A Deep Learning Approach for Generalized Speech Animation, SIGGRAPH 2017.
- Lip Movements Generation at a Glance, ECCV 2018. [Code]
- [VisemeNet] VisemeNet: Audio-Driven Animator-Centric Speech Animation, SIGGRAPH 2018.
- [DAVS] Talking Face Generation by Adversarially Disentangled Audio-Visual Representation, AAAI 2019. [Code]
- [ATVGnet] Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss, CVPR 2019. [Code]
- [Wav2Lip] A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild, ACM Multimedia 2020. [Code], [Project]
- [RhythmicHead] Talking-head Generation with Rhythmic Head Motion, ECCV 2020. [Code]
- [MakeItTalk] MakeItTalk: Speaker-Aware Talking-Head Animation, SIGGRAPH Asia 2020. [Code], [Project]
- [Neural Voice Puppetry] Neural Voice Puppetry: Audio-driven Facial Reenactment, ECCV 2020. [Code], [Project]
- [MEAD] MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation, ECCV 2020. [Code], [Project]
- Realistic Speech-Driven Facial Animation with GANs, IJCV 2020.
- [PC-AVS] Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation, CVPR 2021. [Code], [Project]
- [IATS] Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis, ACM Multimedia 2021.
- [EVP] Audio-Driven Emotional Video Portraits, CVPR 2021. [Code]
- [FAU] Talking Head Generation with Audio and Speech Related Facial Action Units, arXiv 2021.
- [Speech2Talking-Face] Speech2Talking-Face: Inferring and Driving a Face with Synchronized Audio-Visual Representation, IJCAI 2021.
- [LSP] Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation, ACM TOG 2021. [Code]
- [Audio2head] Audio2head: Audio-driven one-shot talking-head generation with natural head motion, arXiv 2021.
- [GC-AVT] Expressive Talking Head Generation with Granular Audio-Visual Control, CVPR 2022.
- Talking Face Generation with Multilingual TTS, CVPR 2022. [Demo Track]
- [EAMM] EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model, SIGGRAPH 2022.
- [SPACEx] SPACEx 🚀: Speech-driven Portrait Animation with Controllable Expression, arXiv 2022. [Project] CVPR 2023
- [AV-CAT] Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers, SIGGRAPH Asia 2022.
- [MemFace] Memories are One-to-Many Mapping Alleviators in Talking Face Generation, arXiv 2022.
- [Diffused Heads] Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation, arXiv 2023. [Project] 🔥Diffusion🔥
- [DiffTalk] DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis, arXiv 2023. [Project], [Code] 🔥Diffusion🔥
- [READ] READ Avatars: Realistic Emotion-controllable Audio Driven Avatars, arXiv 2023.
- [DAE-Talker] DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder, arXiv 2023. 🔥Diffusion🔥
- [EmoGen] Emotionally Enhanced Talking Face Generation, arXiv 2023. [Code]
- [TalkLip] Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert, CVPR 2023. [Code]
- [StyleSync] StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator, CVPR 2023. [Project], [Code]
- [GeneFace++] GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation, arXiv 2023. [Project], [Code]
- [MODA] MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions, ICCV 2023.
- [VividTalk] VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior, arXiv 2023. [Project], [Code]
- [IP_LAP] IP_LAP: Identity-Preserving Talking Face Generation with Landmark and Appearance Priors, CVPR 2023. [Code]
- [HyperLips] HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation, CVPR 2023. [Code]
- [EAT] Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation, ICCV 2023. [Project], [Code]
- [SadTalker] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Talking Head Animation, CVPR 2023. [Project], [Code]
- [Real3DPortrait] Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis, ICLR 2024. [Project], [Code]
- [EMO] Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions, arXiv 2024. [Project], [Code]
- [Style2Talker] Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style, AAAI 2024.
- [SaaS] Say Anything with Any Style, AAAI 2024.
- [MuseTalk] Real-Time High Quality Lip Synchronization with Latent Space Inpainting. [Code]
- [VASA-1] VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time, arXiv 2024. [Project]
- [THQA] THQA: A Perceptual Quality Assessment Database for Talking Heads, arXiv 2024. [Code]
- [Talk3D] Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior, arXiv 2024. [Code], [Project]
- [EDTalk] EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis, arXiv 2024. [Code], [Project]
- [AniPortrait] AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations, arXiv 2024. [Code]
- [FlowVQTalker] FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization, arXiv 2024.
- [FaceChain-ImagineID] FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio, arXiv 2024. [Code]
- [Hallo] Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation, arXiv 2024. [Code]
- [EchoMimic] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions, arXiv 2024. [Code], [Project]
- [RealTalk] RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network, arXiv 2024.
- [Emotional Conversation] Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation, arXiv 2024.
- [Make Your Actor Talk] Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement, arXiv 2024.
- [FD2Talk] FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model, arXiv 2024.
- [ReSyncer] ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer, arXiv 2024.
- [StyleSync] Style-Preserving Lip Sync via Audio-Aware Style Reference, arXiv 2024.
- [Loopy] Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency, arXiv 2024. [Project]
- [DAWN] DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation, arXiv 2024. [Project], [Code]
- [EchoMimicV2] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation, arXiv 2024. [Code], [Project]
- [LetsTalk] Latent Diffusion Transformer for Talking Video Synthesis, arXiv 2024. [Code], [Project]
- [IF-MDM] Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation, arXiv 2024. [Project]
- [INFP] Audio-Driven Interactive Head Generation in Dyadic Conversations, arXiv 2024. [Project]
- [MEMO] Memory-Guided Diffusion for Expressive Talking Video Generation, arXiv 2024. [Project], [Code]
- [FLOAT] Generative Motion Latent Flow Matching for Audio-driven Talking Portrait, arXiv 2024. [Project]
- [Hallo3] Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks, arXiv 2024.
- [DFA-NeRF] DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering, arXiv 2021.
- [NerFACE] NerFACE: Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction, CVPR 2021 Oral. [Code], [Project]
- [AD-NeRF] AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis, ICCV 2021. [Code], [Code]
- [SSP-NeRF] Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation, arXiv 2022.
- [HeadNeRF] HeadNeRF: A Real-time NeRF-based Parametric Head Model, CVPR 2022. [Code], [Project]
- [IMavatar] I M Avatar: Implicit Morphable Head Avatars from Videos, CVPR 2022. [Code]
- [ROME] Realistic One-shot Mesh-based Head Avatars, ECCV 2022.
- [FNeVR] FNeVR: Neural Volume Rendering for Face Animation, arXiv 2022. [Code]
- [3DFaceShop] 3DFaceShop: Explicitly Controllable 3D-Aware Portrait Generation, arXiv 2022. [Code], [Project]
- [Next3D] Generative Neural Texture Rasterization for 3D-Aware Head Avatars, arXiv 2022. [Project]
- [NeRFInvertor] NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation, arXiv 2022.
- [DFRF] Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis, ECCV 2022. [Code]
- [CVTHead] CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer, WACV 2024. [Code]
- [Head3D] 3D-Aware Talking-Head Video Motion Transfer, WACV 2024.
- [DiscoFaceGAN] Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning, CVPR 2020 Oral. [Code]