RGDiffSR

The official PyTorch implementation of the paper "Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution".

paper

Installation

Environment preparation (Python 3.8 + PyTorch 1.7.0 + Torchvision 0.8.1 + pytorch_lightning 1.5.10 + CUDA 11.0):

conda create -n RGDiffSR python=3.8
conda activate RGDiffSR
git clone git@github.com:shercoo/RGDiffSR.git
cd RGDiffSR
pip install -r requirements.txt

You can also refer to taming-transformers for installing the taming-transformers library (needed if VQGAN is applied).
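
One common way to set it up is to install the CompVis repository in editable mode (a sketch only; check the taming-transformers repository for the authoritative instructions):

git clone https://github.com/CompVis/taming-transformers.git
pip install -e taming-transformers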

Dataset preparation

Download the TextZoom dataset from the TextZoom repository.
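
TextZoom is distributed as LMDB folders. A typical layout looks like the following (paths are illustrative; point your configs at wherever you place the data):

TextZoom/
├── train1/          # LMDB folder: data.mdb, lock.mdb
├── train2/
└── test/
    ├── easy/
    ├── medium/
    └── hard/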

Model checkpoints

Download the pre-trained recognizers: ASTER, MORAN, and CRNN.

Download the checkpoints of the pre-trained VQGAN and RGDiffSR from Baidu Netdisk (password: yws3).

Training

First, train the latent encoder (VQGAN) model:

CUDA_VISIBLE_DEVICES=<GPU_IDs> python main.py -b configs/autoencoder/vqgan_2x.yaml -t --gpus <GPU_IDs>
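
For example, to train on a single GPU (a sketch assuming a latent-diffusion-style main.py, where the --gpus list is comma-terminated; adjust to your setup):

# train the VQGAN latent encoder on GPU 0
CUDA_VISIBLE_DEVICES=0 python main.py -b configs/autoencoder/vqgan_2x.yaml -t --gpus 0,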

Next, put the pre-trained VQGAN model (from the previous step or the download above) in checkpoints/.
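
A minimal sketch of that step, assuming a latent-diffusion-style run that writes checkpoints under logs/ (the destination filename is an assumption; match whatever path sr_best.yaml expects):

# hypothetical paths: copy the trained VQGAN checkpoint into checkpoints/
mkdir -p checkpoints
cp logs/<vqgan_run_name>/checkpoints/last.ckpt checkpoints/vqgan_2x.ckpt

Then train the diffusion model: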

CUDA_VISIBLE_DEVICES=<GPU_IDs> python main.py -b configs/latent-diffusion/sr_best.yaml -t --gpus <GPU_IDs>

Testing

Put the pre-trained RGDiffSR model in checkpoints/.

CUDA_VISIBLE_DEVICES=<GPU_IDs> python test.py -b configs/latent-diffusion/sr_test.yaml --gpus <GPU_IDs>

You can modify the test dataset directory in sr_test.yaml to evaluate on the different difficulty splits (easy, medium, hard) of the TextZoom test set.
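
For example, a hypothetical loop over the three splits (this assumes sr_test.yaml contains a single data path ending in test/easy, test/medium, or test/hard; the sed pattern is an assumption about the config contents):

# evaluate on each TextZoom difficulty split in turn
for split in easy medium hard; do
    sed -i "s#test/\(easy\|medium\|hard\)#test/${split}#" configs/latent-diffusion/sr_test.yaml
    CUDA_VISIBLE_DEVICES=0 python test.py -b configs/latent-diffusion/sr_test.yaml --gpus 0,
done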

License

The model is licensed under the MIT license.

Acknowledgement

Our code builds on the latent-diffusion and TATT repositories. We thank the authors for their work!
