Stars
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
PyTorch implementation of the paper "One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing"
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)
[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
The source code and data of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared Information".
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
Open-Sora: Democratizing Efficient Video Production for All
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
A simple, high-quality voice conversion tool focused on ease of use and performance.
Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)
Simple, Unified Repository for Retrieval-based Voice Conversion
so-vits-svc fork with realtime support, improved interface and more features.
SoftVC VITS Singing Voice Conversion
arXiv - Partial Large Kernel CNNs for Efficient Super-Resolution
Easily train a good VC model with voice data <= 10 mins!
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
🍦 ChatTTS-Forge is a project developed around a TTS generation model, implementing an API server and a Gradio-based WebUI.
The best way to write secure and reliable applications. Write nothing; deploy nowhere.
Multilingual Voice Understanding Model
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, please try out Sync Labs
This repository is the implementation of the HiPAMA architecture, introduced in the paper "Hierarchical Pronunciation Assessment with Multi-Aspect Attention" (ICASSP 2023).
A non-native English corpus for the pronunciation scoring task