awesome-talking-head-generation

A collection of papers and released code for talking head generation.

For any additions or corrections related to talking head generation, please open an issue, submit a pull request, or e-mail me at [email protected]. If you are researching talking head generation, you can add my Discord account, Fa-Ting Hong#6563, for better communication and cooperation.

🔥I am currently seeking a job or postdoctoral position. If you are interested in my qualifications and experience, please feel free to contact me. 🔥

Related Group

Datasets

VoxCeleb1 [Download link].
VoxCeleb2 [Download link].
Faceforensics++ [Download link].
CelebV [Download link].
TalkingHead-1KH [Download link].
LRW (Lip Reading in the Wild) [Download link].
MEAD [Download link].
CelebV-HQ [Download link].
CHDTF [Download link].

Image-driven

Audio-driven

2016

[LRW] Lip Reading in the Wild, ACCV 2016.

2017

[Synthesizing-Obama] Synthesizing Obama: Learning Lip Sync From Audio, SIGGRAPH 2017. [Project].
[You-Said-That?] You Said That?: Synthesising Talking Faces From Audio, IJCV 2019. [Code].
Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion, SIGGRAPH 2017.
A Deep Learning Approach for Generalized Speech Animation, SIGGRAPH 2017.

2018

Lip Movements Generation at a Glance, ECCV 2018. [Code].
[VisemeNet] VisemeNet: Audio-Driven Animator-Centric Speech Animation, SIGGRAPH 2018.

2019

[DAVS] Talking Face Generation by Adversarially Disentangled Audio-Visual Representation, AAAI 2019. [Code].
[ATVGnet] Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss, CVPR 2019. [Code]

2020

[Wav2Lip] A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild, ACM Multimedia 2020. [Code], [Project].
[RhythmicHead] Talking-head Generation with Rhythmic Head Motion, ECCV 2020. [Code].
[MakeItTalk] MakeItTalk: Speaker-Aware Talking-Head Animation, SIGGRAPH Asia 2020. [Code], [Project].
[Neural Voice Puppetry] Neural Voice Puppetry: Audio-driven Facial Reenactment, ECCV 2020. [Code], [Project].
[MEAD] MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation, ECCV 2020. [Code], [Project].
Realistic Speech-Driven Facial Animation with GANs, IJCV 2020.

2021

[PC-AVS] Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation, CVPR 2021. [Code], [Project].
[EVP] Audio-Driven Emotional Video Portraits, CVPR 2021. [Code]
[FAU] Talking Head Generation with Audio and Speech Related Facial Action Units, arXiv 2021.
[Speech2Talking-Face] Speech2Talking-Face: Inferring and Driving a Face with Synchronized Audio-Visual Representation, IJCAI 2021.
[IATS] Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis, ACM MM 2021.
[LSP] Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation, ACM TOG 2021. [Code]
[Audio2head] Audio2head: Audio-driven one-shot talking-head generation with natural head motion, arXiv 2021.

2022

[GC-AVT] Expressive Talking Head Generation with Granular Audio-Visual Control, CVPR 2022.
Talking Face Generation with Multilingual TTS, CVPR 2022. [Demo Track].
[EAMM] EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model, SIGGRAPH 2022.
[SPACEx] SPACEx 🚀: Speech-driven Portrait Animation with Controllable Expression, arXiv 2022. [Project] CVPR 2023
[AV-CAT] Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers, SIGGRAPH Asia 2022.
[MemFace] Memories are One-to-Many Mapping Alleviators in Talking Face Generation, arXiv 2022.

2023

[Diffused Heads] Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation, arXiv 2023. [Project] 🔥Diffusion🔥
[DiffTalk] DiffTalk: Crafting Diffusion Models for Generalized Talking Head Synthesis, arXiv 2023. [Project] [Code] 🔥Diffusion🔥
[READ] READ Avatars: Realistic Emotion-controllable Audio Driven Avatars, arXiv 2023.
[DAE-Talker] DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder, arXiv 2023. 🔥Diffusion🔥
[EmoGen] Emotionally Enhanced Talking Face Generation, arXiv 2023. [Code]
[TalkLip] Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert, CVPR 2023. [Code]
[StyleSync] StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator, CVPR 2023. [Project] [Code]
[GeneFace++] GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation, arXiv 2023. [Project] [Code]
[MODA] MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions, ICCV 2023.
[VividTalk] VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior, arXiv 2023. [Project] [Code]
[IP_LAP] IP_LAP: Identity-Preserving Talking Face Generation with Landmark and Appearance Priors, CVPR 2023. [Code]
[HyperLips] HyperLips: Hyper Control Lips with High Resolution Decoder for Talking Face Generation, CVPR 2023. [Code]
[EAT] Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation, ICCV 2023. [Project] [Code]
[SadTalker] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Talking Head Animation, CVPR 2023. [Project] [Code]

2024

[Real3DPortrait] Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis, ICLR 2024. [Project] [Code]
[EMO] Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions, arXiv 2024. [Project] [Code]
[Style2Talker] Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style, AAAI 2024.
[SaaS] Say Anything with Any Style, AAAI 2024.
[MuseTalk] Real-Time High Quality Lip Synchronization with Latent Space Inpainting, [Code].
[VASA-1] VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time, arXiv 2024. [Project].
[THQA] THQA: A Perceptual Quality Assessment Database for Talking Heads, arXiv 2024. [Code].
[Talk3D] Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior, arXiv 2024. [Code] [Project]
[EDTalk] EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis, arXiv 2024. [Code] [Project]
[AniPortrait] AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations, arXiv 2024. [Code]
[FlowVQTalker] FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization, arXiv 2024.
[FaceChain-ImagineID] FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio, arXiv 2024. [Code]
[Hallo] Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation, arXiv 2024. [Code]
[EchoMimic] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditions, arXiv 2024. [Code], [Project]
[RealTalk] RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network, arXiv 2024.
[Emotional Conversation] Emotional Conversation: Empowering Talking Faces with Cohesive Expression, Gaze and Pose Generation, arXiv 2024.
[Make Your Actor Talk] Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement, arXiv 2024.
[FD2Talk] FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model, arXiv 2024.
[ReSyncer] ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer, arXiv 2024.
[StyleSync] Style-Preserving Lip Sync via Audio-Aware Style Reference, arXiv 2024.
[Loopy] Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency, arXiv 2024. [Project]
[DAWN] DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation, arXiv 2024. [Project], [Code]
[EchoMimicV2] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation, arXiv 2024. [Code], [Project]
[LetsTalk] Latent Diffusion Transformer for Talking Video Synthesis, arXiv 2024. [Code], [Project]
[IF-MDM] Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation, arXiv 2024. [Project]
[INFP] Audio-Driven Interactive Head Generation in Dyadic Conversations, arXiv 2024. [Project]
[MEMO] Memory-Guided Diffusion for Expressive Talking Video Generation, arXiv 2024. [Project], [Code]
[FLOAT] Generative Motion Latent Flow Matching for Audio-driven Talking Portrait, arXiv 2024. [Project]
[Hallo3] Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks, arXiv 2024.

NeRF & 3D

2021

[DFA-NeRF] DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering, arXiv 2021.
[NerFACE] NerFACE: Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction, CVPR 2021 Oral. [Code], [Project]
[AD-NeRF] AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis, ICCV 2021. [Code], [Code]

2022

[SSP-NeRF] Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation, arXiv 2022.
[HeadNeRF] HeadNeRF: A Real-time NeRF-based Parametric Head Model, CVPR 2022. [Code], [Project]
[IMavatar] I M Avatar: Implicit Morphable Head Avatars from Videos, CVPR 2022. [Code]
[ROME] Realistic One-shot Mesh-based Head Avatars, ECCV 2022.
[FNeVR] FNeVR: Neural Volume Rendering for Face Animation, arXiv 2022. [Code]
[3DFaceShop] 3DFaceShop: Explicitly Controllable 3D-Aware Portrait Generation, arXiv 2022. [Code], [Project]
[Next3D] Generative Neural Texture Rasterization for 3D-Aware Head Avatars, arXiv 2022. [Project]
[NeRFInvertor] NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation, arXiv 2022.
[DFRF] Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis, ECCV 2022. [Code]

2024

[CVTHead] CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer, WACV 2024. [Code].
[Head3D] 3D-Aware Talking-Head Video Motion Transfer, WACV 2024.

Parameter-Based

2020

[DiscoFaceGAN] Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning, CVPR 2020 Oral. [Code].

Survey

2020

What comprises a good talking-head video generation?: A Survey and Benchmark.

2024

A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos [Code].
