- ByteDance
- San Jose
- http://chongyangma.com/
Stars
📹 A more flexible CogVideoX that can generate videos at any resolution and creates videos from images.
An open-source implementation of Large Reconstruction Models
High-resolution models for human tasks.
Various AI scripts. Mostly Stable Diffusion stuff.
Official code for VEnhancer: Generative Space-Time Enhancement for Video Generation
Real-time face swap and one-click video deepfake with only a single image
Text-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Understand Human Behavior to Align True Needs
I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment
Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)
🎥 Python and OpenCV-based scene cut/transition detection program & library.
Official PyTorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" (ICLR 2024)
Open-Sora: Democratizing Efficient Video Production for All
Transparent Image Layer Diffusion using Latent Transparency
[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Hosts the Multiface dataset, which is a multi-view dataset of multiple identities performing a sequence of facial expressions.
LAVIS - A One-stop Library for Language-Vision Intelligence