GitHub - megvii-research/megactor

MegActor: Harness the Power of Raw Video for Vivid Portrait Animation

Shurong Yang^*, Huadong Li^*, Juhao Wu^*, Minhao Jing^*†, Linze Li, Renhe Ji^‡, Jiajun Liang^‡, Haoqiang Fan

MEGVII Technology

^*Equal contribution ^†Lead this project ^‡Corresponding author

News & TODO List

[TODO] The code of MegActor-Sigma is coming soon.
[🔥🔥🔥 2024.08.28] The MegActor-Sigma paper is released on arXiv.
[✨✨✨ 2024.07.02] For ease of replication, we provide a 10-minute dataset on Google Drive, which should yield satisfactory performance.
[🔥🔥🔥 2024.06.25] Training setup released. Please refer to Training for details.
[🔥🔥🔥 2024.06.25] Integrated into OpenBayes, see the demo. Thanks to the OpenBayes team!
[🔥🔥🔥 2024.06.17] The online Gradio demo is released.
[🔥🔥🔥 2024.06.13] The data curation pipeline is released.
[🔥🔥🔥 2024.05.31] The MegActor paper is released on arXiv.
[🔥🔥🔥 2024.05.24] Inference settings are released.

MegActor Features:

Usability: animates a portrait with a driving video while ensuring consistent motion.

Reproducibility: fully open-source and trained on publicly available datasets.

Efficiency: ⚡200 V100 hours of training to achieve pleasant motions on portraits.

Overview

MegActor is an intermediate-representation-free portrait animator that uses the original video, rather than intermediate features, as the driving factor to generate realistic and vivid talking head videos. Specifically, we utilize two UNets: one extracts the identity and background features from the source image, while the other accurately generates and integrates motion features directly derived from the original videos. MegActor can be trained on low-quality, publicly available datasets and excels in facial expressiveness, pose diversity, subtle controllability, and visual quality.

Pre-generated results

demo.mp4

demo4.mp4

demo6.mp4

Preparation

Environments

Detailed environment settings can be found in environment.yaml.

Linux

conda env create -f environment.yaml
pip install -U openmim

mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"

conda install -c conda-forge cudatoolkit-dev -y
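
After running the installation commands above, it can help to confirm that the OpenMMLab packages are importable from the new environment. The following is a minimal sketch and not part of the repository: the helper name `missing_packages` is hypothetical, and it assumes the import names match the package names passed to `mim install`.

```python
from importlib.util import find_spec

# Import names of the packages installed via `mim install` above
# (assumed to be identical to the package names).
REQUIRED = ["mmengine", "mmcv", "mmdet", "mmpose"]


def missing_packages(names=REQUIRED):
    """Return the top-level packages that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated conda environment; an empty result suggests the `mim install` steps completed, although it does not check the version constraints.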

Dataset

- For a detailed description of the data processing procedure, please refer to the Data Process Pipeline.
- You may refer to a 10-minute dataset in this format on Google Drive.

Pretrained weights

Our pretrained weights are available at https://huggingface.co/HVSiniX/RawVideoDriven. Alternatively, clone the weights repository and link its weights directory:
```
git clone https://huggingface.co/HVSiniX/RawVideoDriven && ln -s RawVideoDriven/weights weights
```

Training

We currently support two-stage training on single-node machines.

Stage 1 (image training):

bash train.sh train.py ./configs/train/train_stage1.yaml {number of gpus on this node}

Stage 2 (video training):

bash train.sh train.py ./configs/train/train_stage2.yaml {number of gpus on this node}
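
The two stages above are run one after the other. As a rough sketch of how they might be scripted, the snippet below builds both command lines from a GPU count. The function name `training_commands` is hypothetical, and the README does not say whether the stage 2 config picks up the stage 1 checkpoint automatically, so check the YAML files before chaining the stages.

```python
import shlex

# Config paths taken from the commands above, in the order they should run.
STAGE_CONFIGS = [
    "./configs/train/train_stage1.yaml",  # Stage 1: image training
    "./configs/train/train_stage2.yaml",  # Stage 2: video training
]


def training_commands(num_gpus):
    """Build the argument list for each training stage on a single node."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        ["bash", "train.sh", "train.py", config, str(num_gpus)]
        for config in STAGE_CONFIGS
    ]


if __name__ == "__main__":
    # Only prints the commands; nothing is launched here.
    for cmd in training_commands(num_gpus=8):
        print(shlex.join(cmd))
```

To actually launch training, each list can be passed to `subprocess.run(cmd, check=True)`, which stops before stage 2 if stage 1 exits with an error.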

Inference

Currently, only single-GPU inference is supported. We highly recommend using the --contour-preserve argument to better preserve the shape of the source face.

CUDA_VISIBLE_DEVICES=0 python eval.py --config configs/inference/inference.yaml --source {source image path} --driver {driving video path} --contour-preserve
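
To animate one source image with several driving videos, the command above can be generated in a loop. This is a minimal sketch under assumptions: the helper names `inference_command` and `single_gpu_env` and the file names are placeholders, and only the flags shown in the command above are used.

```python
import os
import shlex


def inference_command(source, driver, contour_preserve=True,
                      config="configs/inference/inference.yaml"):
    """Build the eval.py argument list for one source image and one driving video."""
    cmd = ["python", "eval.py", "--config", config,
           "--source", source, "--driver", driver]
    if contour_preserve:
        cmd.append("--contour-preserve")
    return cmd


def single_gpu_env(gpu=0):
    """Copy the current environment and restrict CUDA to a single device."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu)
    return env


if __name__ == "__main__":
    # Placeholder file names; replace with real paths.
    for driver in ["drive_a.mp4", "drive_b.mp4"]:
        cmd = inference_command("portrait.png", driver)
        print("CUDA_VISIBLE_DEVICES=0", shlex.join(cmd))
```

Each command can then be executed with `subprocess.run(cmd, env=single_gpu_env(0), check=True)` from the repository root.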

Demo

To launch the Gradio interface, run:

python demo/run_gradio.py

BibTeX

@misc{yang2024megactorsigmaunlockingflexiblemixedmodal,
      title={MegActor-$\Sigma$: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer}, 
      author={Shurong Yang and Huadong Li and Juhao Wu and Minhao Jing and Linze Li and Renhe Ji and Jiajun Liang and Haoqiang Fan and Jin Wang},
      year={2024},
      eprint={2408.14975},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2408.14975}, 
}
@misc{yang2024megactor,
      title={MegActor: Harness the Power of Raw Video for Vivid Portrait Animation}, 
      author={Shurong Yang and Huadong Li and Juhao Wu and Minhao Jing and Linze Li and Renhe Ji and Jiajun Liang and Haoqiang Fan},
      year={2024},
      eprint={2405.20851},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

Many thanks to the authors of mmengine, MagicAnimate, Controlnet_aux, and Detectron2.

Contact

If you have any questions, feel free to open an issue or contact us at [email protected], [email protected] or [email protected].

If you're seeking an internship and are interested in our work, please send your resume to [email protected] or [email protected].
