Stars
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Code Repository for MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos (ECCV 2024)
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
VideoTetris: Towards Compositional Text-To-Video Generation
V-Express aims to generate a talking-head video under the control of a reference image, an audio clip, and a sequence of V-Kps images.
Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels
Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"
Code for [CVPR 2024] "Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling"
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Official implementation of "En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data"
Code and dataset for photorealistic Codec Avatars driven from audio
[CVPR 2024] Official repository for "Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians"
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
[CVPR 2023] POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo
Official code for "Mesh Density Adaptation for Template-based Shape Reconstruction". SIGGRAPH 2023
ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation (NeurIPS 2023 Spotlight)
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
A fast and flexible implementation of Rigid Body Dynamics algorithms and their analytical derivatives
Training and Evaluation Code for "Mixture of Volumetric Primitives for Efficient Neural Rendering"
Code for Text2Human (SIGGRAPH 2022). Paper: Text2Human: Text-Driven Controllable Human Image Generation
Code for "Neural 3D Reconstruction in the Wild", SIGGRAPH 2022 (Conference Proceedings)
ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions (SIGGRAPH 2022 - Journal Track)
[TPAMI 2023] Recovering 3D Human Mesh from Monocular Images: A Survey
[CVPR'22] ICON: Implicit Clothed humans Obtained from Normals
Official repository for "Band-limited Coordinate Networks for Multiscale Scene Representation" | CVPR 2022