
🤢 LipSick: Fast, High Quality, Low Resource Lipsync Tool 🤮


Gnome101/LipSick


Introduction

To get started with LipSick on Windows, follow these steps to set up your environment. This branch has been tested with Anaconda using Python 3.10 and CUDA 11.6 with only 4GB VRAM.

See the other branches for Linux, HuggingFace GPU / CPU, or Colab (coming soon).

Setup

Install
  1. Clone the repository:
git clone https://github.com/Inferencer/LipSick.git
cd LipSick
  1. Create and activate the Anaconda environment:
conda env create -f environment.yml
conda activate lipsick
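
Once the environment is active, a quick sanity check can confirm that it resembles the tested setup (Python 3.10 with CUDA). The following is a minimal sketch, not part of LipSick itself: the function name `check_environment` is hypothetical, and it assumes that `environment.yml` installs PyTorch, which the `.pth` checkpoint suggests but this README does not state.

```python
import sys

# Python version quoted in the introduction; adjust for other branches.
TESTED_PYTHON = (3, 10)


def check_environment(tested_python=TESTED_PYTHON):
    """Return a list of warnings; an empty list means nothing looked off."""
    warnings = []

    found = sys.version_info[:2]
    if found != tuple(tested_python):
        warnings.append(
            f"Python {found[0]}.{found[1]} is active, but this branch was "
            f"tested with {tested_python[0]}.{tested_python[1]}."
        )

    try:
        import torch  # assumption: PyTorch is provided by environment.yml
    except ImportError:
        warnings.append(
            "PyTorch could not be imported; is the 'lipsick' environment active?"
        )
    else:
        if not torch.cuda.is_available():
            warnings.append("PyTorch does not see a CUDA device.")

    return warnings


if __name__ == "__main__":
    messages = check_environment()
    for line in messages or ["Environment looks consistent with the tested setup."]:
        print(line)
```

Run it from the activated `lipsick` environment. A warning does not necessarily mean LipSick will fail, only that the setup differs from the one this branch was tested on.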

Download pre-trained models

Download Links

For the folder ./asserts

Please download pretrained_lipsick.pth using this link and place the file in the folder ./asserts

Then, download output_graph.pb using this link and place the file in the same folder.

For the folder ./models

Please download shape_predictor_68_face_landmarks.dat using this link and place the file in the folder ./models

The folder structure for manually downloaded models

.
├── ...
├── asserts
│   ├── examples                   # A place to store inputs if not using gradio UI
│   ├── inference_result           # Results will be saved to this folder
│   ├── output_graph.pb            # The DeepSpeech model you manually download and place here
│   └── pretrained_lipsick.pth     # Pre-trained model you manually download and place here
│
├── models
│   ├── Discriminator.py
│   ├── LipSick.py
│   ├── shape_predictor_68_face_landmarks.dat  # Dlib Landmark tracking model you manually download and place here
│   ├── Syncnet.py
│   └── VGG19.py
└── ...
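
Before launching the app, it can help to confirm that the three manually downloaded files ended up where the folder structure above expects them. This is a minimal sketch using only the standard library. The helper name `missing_model_files` is hypothetical and not part of LipSick. The relative paths are copied from the tree above.

```python
from pathlib import Path

# Paths of the manually downloaded files, relative to the repository root.
REQUIRED_FILES = (
    "asserts/pretrained_lipsick.pth",
    "asserts/output_graph.pb",
    "models/shape_predictor_68_face_landmarks.dat",
)


def missing_model_files(repo_root="."):
    """Return the required files that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing model files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All model files are in place.")
```

Run it from the root of the cloned repository. It only checks that the files exist, not that the downloads are complete or uncorrupted.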
  1. Run the application:
python app.py

This will launch a Gradio interface where you can upload your video and audio files to process them with LipSick.

Changelog

To-Do List

  • Add support for macOS.
  • Add upscaling of reference frames with masking.
  • Add seamless clone masking to remove the common bounding box around mouths. 🤕
  • Add alternative option for face tracking model SFD (likely best results, but slower than Dlib).
  • Add custom reference frame feature. 😷
  • Add auto persistent crop_radius to prevent mask flickering. 😷
  • Examine CPU speed upgrades.
  • Reintroduce persistent folders for frame extraction as an option with existing frame checks for faster extraction on commonly used videos. 😷
  • Provide HuggingFace space CPU (free usage but slower). 😷
  • Provide Google Colab .IPYNB. 😷
  • Add support for Linux. 🤢
  • Release tutorial on manual masking using DaVinci. 😷
  • Looped original video generated as an option for faster manual masking. 😷
  • Image to MP4 conversion so a single image can be used as input.
  • Automatic audio conversion to WAV regardless of input audio format.
  • Clean README.md & provide command line inference.
  • Remove input video 25fps requirement.
  • Upload cherry-picked input footage for user download & use.
  • Create a Discord to share results, faster help, suggestions & cherry-picked input footage.
  • Upload results footage montage to GitHub so new users can see what LipSick is capable of.
  • Provide HuggingFace space GPU. 🤮
  • Remove warning messages in command prompt that don't affect performance. 🤢
  • Move frame extraction to temp folders. 🤮
  • Results with the same input video name no longer overwrite existing results. 🤮
  • Remove OpenFace CSV requirement. 🤮
  • Detect accepted media input formats only. 🤮
  • Upgrade to Python 3.10. 🤮
  • Add UI. 🤮

Key:

  • 🤮 = Completed & published
  • 🤢 = Completed & published but requires community testing
  • 😷 = Tested & working but not published yet
  • 🤕 = Tested but not ready for public use

Simple Key:

  • Available
  • Unavailable

Acknowledgements

This project, LipSick, is heavily inspired by and based on DINet. Specific components are borrowed and adapted to enhance LipSick.

We express our gratitude to the authors and contributors of DINet for their open-source code and documentation.
