TOSS: High-quality Text-guided Novel View Synthesis from a Single Image (ICLR2024)

Yukai Shi, Jianan Wang, He Cao, Boshi Tang, Xianbiao Qi, Tianyu Yang, Yukun Huang, Shilong Liu, Lei Zhang, Heung-Yeung Shum

Official implementation for TOSS: High-quality Text-guided Novel View Synthesis from a Single Image.

TOSS introduces text as high-level semantic information to constrain the novel view synthesis (NVS) solution space, yielding more controllable and more plausible results.

Project Page | ArXiv | Weights

3d_generation_video.mp4

Install

Create environment

conda create -n toss python=3.9
conda activate toss

Install packages

pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
git clone https://github.com/openai/CLIP.git
pip install -e CLIP/
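After installation, a quick check that the key packages are visible to the interpreter can save a failed launch later. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the module list is an assumption derived from the install commands above (`clip` is the module name installed by the openai/CLIP repository). It only looks the modules up and does not import them.

```python
import importlib.util

# Assumed module names, taken from the install commands above.
# `clip` is the module provided by the openai/CLIP repository.
REQUIRED_MODULES = ["torch", "torchvision", "torchaudio", "clip"]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it inside the activated `toss` environment. Any module it reports as missing points to an install step that did not complete.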

Weights

Download the pretrained weights from this link and place them in the ./ckpt subdirectory.
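To confirm the download landed in the right place, something like the sketch below can list candidate weight files in `./ckpt`. This README does not state the checkpoint file names or extensions, so the suffix set is an assumption, and `find_weights` is a hypothetical helper rather than a function from the repository.

```python
from pathlib import Path

# Assumption: common checkpoint extensions. The actual file names are not
# given in this README, so adjust this set to match the downloaded files.
WEIGHT_SUFFIXES = {".ckpt", ".pt", ".pth", ".safetensors"}


def find_weights(ckpt_dir="./ckpt"):
    """Return a sorted list of likely weight files directly inside ckpt_dir."""
    root = Path(ckpt_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix in WEIGHT_SUFFIXES
    )


if __name__ == "__main__":
    found = find_weights()
    if found:
        for path in found:
            print(f"{path.name}: {path.stat().st_size / 1e6:.1f} MB")
    else:
        print("No weight files found in ./ckpt")
```

An empty result means either the directory is missing or the files use an extension outside the assumed set.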

Inference

We suggest using Gradio for interactive, visual inference. This demo was tested on a single RTX 3090.

python app.py

Todo List

Release inference code.
Release pretrained models.
Upload 3D generation code.
Upload training code.

Acknowledgement

Citation

@article{shi2023toss,
  title={Toss: High-quality text-guided novel view synthesis from a single image},
  author={Shi, Yukai and Wang, Jianan and Cao, He and Tang, Boshi and Qi, Xianbiao and Yang, Tianyu and Huang, Yukun and Liu, Shilong and Zhang, Lei and Shum, Heung-Yeung},
  journal={arXiv preprint arXiv:2310.10644},
  year={2023}
}

