Visual Question Reasoning on General Dependency Tree

This is the code for the CLEVR experiments of the paper

Visual Question Reasoning on General Dependency Tree
Qingxing Cao, Xiaodan Liang, Bailin Li, Guanbin Li, Liang Lin
Presented at CVPR 2018 (Spotlight Presentation)

If you find this code useful in your research, please cite:

@InProceedings{Cao_2018_CVPR,
author = {Cao, Qingxing and Liang, Xiaodan and Li, Bailing and Li, Guanbin and Lin, Liang},
title = {Visual Question Reasoning on General Dependency Tree},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

Requirements

tensorboardX
skimage
scipy
numpy
torchvision
h5py
tqdm
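
Before running any of the scripts, it can help to confirm that the packages listed above are importable. The snippet below is a minimal sketch and not part of this repository: the helper name missing_modules is hypothetical, and it assumes each entry in the list is the module's import name (for example, skimage is the import name of scikit-image).

```python
import importlib.util

# Import names taken from the requirement list above.
REQUIRED = ["tensorboardX", "skimage", "scipy", "numpy", "torchvision", "h5py", "tqdm"]


def missing_modules(names=REQUIRED):
    """Return the entries of `names` that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks each module up without importing it, so it stays fast even for large packages such as torchvision.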

Data Preprocessing

Before you can train any models, you need to download the dataset, preprocess the questions, and extract features for the images.

Step 1: Download the data

You can download CLEVR v1.0 (18 GB) with the command below.

$ sh data/clevr/download_dataset.sh

Step 2: Preprocess Questions

The preprocessing code will be available soon. For now, you can download our preprocessed data with the following command:

$ sh data/clevr/download_preprocessed_questions.sh

Step 3: Extract Image Features

You can extract image features with the command below.

$ sh scripts/extract_image_feature.sh

The extracted features features_train.h5, features_val.h5, and features_test.h5 will be placed in ./data/clevr/clevr_res101/.
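
Feature extraction can take a while, so a quick check that all three files exist may save a failed training run. The following is a minimal sketch under the assumption that the script is run from the repository root. The helper name missing_feature_files is hypothetical, and the check only tests for the presence of the files, not their contents.

```python
from pathlib import Path

# File names and output directory as stated above.
FEATURE_FILES = ("features_train.h5", "features_val.h5", "features_test.h5")
FEATURE_DIR = Path("data/clevr/clevr_res101")


def missing_feature_files(feature_dir=FEATURE_DIR):
    """Return the expected feature files that are absent from `feature_dir`."""
    feature_dir = Path(feature_dir)
    return [name for name in FEATURE_FILES if not (feature_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_feature_files()
    print("Missing feature files:", ", ".join(missing) if missing else "none")
```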

Pretrained Models

You can download the pretrained models with the command below. The model will take about 2.6 GB on disk.

$ sh data/clevr/download_pretrained_model.sh

It is trained on CLEVR-train and can be validated on CLEVR-val.

Training on CLEVR

You can use the train_val.py script to train on CLEVR-train and validate the model on CLEVR-val.

$ python scripts/train_val.py --clevr_qa_dir=data/clevr/clevr_qa_dir/ --clevr_img_h5=data/clevr/clevr_res101/

The script below contains the hyperparameters and settings needed to reproduce the ACMN results on CLEVR.

$ sh scripts/train_val.sh

Evaluation

To evaluate a model on CLEVR-val without training, run train_val.py with the --no_train option.

$ python scripts/train_val.py \
  --no_train=True \
  --clevr_qa_dir=data/clevr/clevr_qa_dir/ \
  --clevr_img_h5=data/clevr/clevr_res101/ \
  --resume=data/clevr/clevr_pretrained_model.pth
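
If you want to evaluate several checkpoints in a loop, the command above can be assembled programmatically. The snippet below is only a sketch: the helper name build_eval_command is hypothetical, the flags are copied from the command above, and it assumes the script is launched from the repository root.

```python
import shlex
import sys


def build_eval_command(qa_dir, img_h5, checkpoint):
    """Assemble the evaluation command shown above as an argument list."""
    return [
        sys.executable, "scripts/train_val.py",
        "--no_train=True",               # skip training, evaluate only
        f"--clevr_qa_dir={qa_dir}",
        f"--clevr_img_h5={img_h5}",
        f"--resume={checkpoint}",
    ]


if __name__ == "__main__":
    cmd = build_eval_command(
        "data/clevr/clevr_qa_dir/",
        "data/clevr/clevr_res101/",
        "data/clevr/clevr_pretrained_model.pth",
    )
    # Print the command so it can be inspected before it is run.
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) to launch the evaluation.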

You can use test.py to generate CLEVR-test results in .json format, which you can then upload for official CLEVR evaluation.

$ python scripts/test.py \
  --clevr_qa_dir=data/clevr/clevr_qa_dir/ \
  --clevr_img_h5=data/clevr/clevr_res101/ \
  --resume=data/clevr/clevr_pretrained_model.pth

Visualizing Attention Maps

You can use vis.py to visualize the attention maps described in Figure 4 of our paper.

$ python scripts/vis.py \
  --clevr_qa_dir=data/clevr/clevr_qa_dir/ \
  --clevr_img_h5=data/clevr/clevr_res101/ \
  --clevr_img_png=data/clevr/CLEVR_v1.0/ \
  --clevr_load_png=True \
  --logdir=logs/attmaps \
  --resume=data/clevr/clevr_pretrained_model.pth
