
A 28× Compressed Wav2Lip for Efficient Talking Face Generation [ICCV'23 Demo] [MLSys'23 Workshop] [NVIDIA GTC'23]

Nota-NetsPresso/nota-wav2lip

Hugging Face Spaces configuration (from the README front matter):

title: Compressed Wav2Lip
emoji: 🌟
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 4.13.0
app_file: app.py
pinned: true
license: apache-2.0

28× Compressed Wav2Lip by Nota AI

Official codebase for Accelerating Speech-Driven Talking Face Generation with 28× Compressed Wav2Lip.

Installation

Docker (recommended)

git clone https://github.com/Nota-NetsPresso/nota-wav2lip.git
cd nota-wav2lip
docker compose run --service-ports --name nota-compressed-wav2lip compressed-wav2lip bash

Conda

git clone https://github.com/Nota-NetsPresso/nota-wav2lip.git
cd nota-wav2lip
apt-get update
apt-get install ffmpeg libsm6 libxext6 tmux git -y
conda create -n nota-wav2lip python=3.9
conda activate nota-wav2lip
pip install -r requirements.txt
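
If the Conda setup succeeded, a quick sanity check of the core dependencies can save debugging time later. The snippet below is a minimal sketch: the module names are assumptions typical of Wav2Lip-based codebases, not a copy of this repository's requirements.txt, so adjust them as needed.

# sanity_check.py -- hedged sketch; module names are assumptions, adjust to requirements.txt
import importlib
import shutil

# Modules a Wav2Lip-style pipeline typically needs (assumed, not read from this repo)
for module in ("torch", "cv2", "numpy", "librosa", "gradio"):
    try:
        importlib.import_module(module)
        print(f"[ok]   {module}")
    except ImportError:
        print(f"[miss] {module} -- install it via requirements.txt")

# ffmpeg is installed system-wide via apt-get above; check it is reachable
print("[ok]   ffmpeg" if shutil.which("ffmpeg") else "[miss] ffmpeg not found on PATH")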

Gradio Demo

Use the script below to run the nota-ai/compressed-wav2lip demo. The models and sample data are downloaded automatically.

bash app.sh
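
For reference, a lip-sync demo of this kind is usually a small Gradio app that takes a face video plus a speech clip and returns the synthesized video. The sketch below is illustrative only and is not the repository's app.py; the wrapper function generate_talking_face and its body are hypothetical placeholders.

# demo_sketch.py -- hypothetical Gradio wiring, NOT the repository's app.py
import gradio as gr

def generate_talking_face(face_video: str, speech_audio: str) -> str:
    # Placeholder: a real implementation would run the (compressed) Wav2Lip model here
    # and return the path of the synthesized video.
    return face_video

demo = gr.Interface(
    fn=generate_talking_face,
    inputs=[gr.Video(label="Input face video"), gr.Audio(type="filepath", label="Driving speech")],
    outputs=gr.Video(label="Synthesized talking face"),
    title="Compressed Wav2Lip (sketch)",
)

if __name__ == "__main__":
    demo.launch()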

Inference

(1) Download the YouTube videos listed in the LRS3-TED label text file and preprocess them; a rough sketch of what this preprocessing typically involves follows the list below.

  • Download lrs3_v0.4_txt.zip from this link.
  • Unzip the file so that the folder structure is ./data/lrs3_v0.4_txt/lrs3_v0.4/test
  • Run bash download.sh
  • Run bash preprocess.sh
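
The actual preprocessing is performed by preprocess.sh. As a rough illustration of what Wav2Lip-style pipelines generally do, the sketch below extracts 16 kHz mono audio and 25 fps frames from a downloaded clip with ffmpeg; the file names and output layout here are assumptions, not the layout produced by the repository's scripts.

# preprocess_sketch.py -- illustrative only; the real steps live in preprocess.sh
import subprocess
from pathlib import Path

def extract_audio_and_frames(video_path: str, out_dir: str) -> None:
    out = Path(out_dir)
    (out / "frames").mkdir(parents=True, exist_ok=True)
    # 16 kHz mono audio, the sampling rate Wav2Lip's mel-spectrogram pipeline expects
    subprocess.run(["ffmpeg", "-y", "-i", video_path, "-ar", "16000", "-ac", "1",
                    str(out / "audio.wav")], check=True)
    # 25 fps frame dump (assumed frame rate; match whatever preprocess.sh uses)
    subprocess.run(["ffmpeg", "-y", "-i", video_path, "-vf", "fps=25",
                    str(out / "frames" / "%05d.jpg")], check=True)

extract_audio_and_frames("sample_clip.mp4", "data/preprocessed/sample_clip")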

(2) Run the script to compare the original Wav2Lip with Nota's compressed version.

bash inference.sh
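
inference.sh handles the actual side-by-side run. If you want a rough idea of how such a comparison is typically measured, the sketch below times two PyTorch generators on identical dummy inputs; the model loading helper, input shapes, and checkpoint paths are placeholders, not values taken from this repository.

# compare_sketch.py -- hypothetical latency comparison, not the repository's inference.sh
import time
import torch

def mean_latency(model: torch.nn.Module, audio_feat: torch.Tensor,
                 face_seq: torch.Tensor, runs: int = 20) -> float:
    model.eval()
    with torch.no_grad():
        model(audio_feat, face_seq)           # warm-up pass
        start = time.perf_counter()
        for _ in range(runs):
            model(audio_feat, face_seq)
        return (time.perf_counter() - start) / runs

# Placeholders: load the original and compressed generators however the codebase exposes them.
# original = load_wav2lip("checkpoints/wav2lip.pth")          # hypothetical helper
# compressed = load_wav2lip("checkpoints/nota_wav2lip.pth")   # hypothetical helper
# audio_feat = torch.randn(1, 1, 80, 16)   # assumed mel-chunk shape used by Wav2Lip
# face_seq = torch.randn(1, 6, 96, 96)     # assumed face-window shape used by Wav2Lip
# print(f"original:   {mean_latency(original, audio_feat, face_seq) * 1e3:.1f} ms")
# print(f"compressed: {mean_latency(compressed, audio_feat, face_seq) * 1e3:.1f} ms")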

License

  • All rights related to this repository and the compressed models are reserved by Nota Inc.
  • The intended use is strictly limited to research and non-commercial projects.

Contact

Acknowledgment

Citation

@article{kim2023unified,
      title={A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation}, 
      author={Kim, Bo-Kyeong and Kang, Jaemin and Seo, Daeun and Park, Hancheol and Choi, Shinkook and Song, Hyoung-Kyu and Kim, Hyungshin and Lim, Sungsu},
      journal={MLSys Workshop on On-Device Intelligence (ODIW)},
      year={2023},
      url={https://arxiv.org/abs/2304.00471}
}
