Prompt Tuning for Generative Multimodal Pretrained Models

Overview

This is the code for "Prompt Tuning for Generative Multimodal Pretrained Models"; check out our paper on arXiv. The paper explores prompt tuning for generative multimodal pretrained models, rather than for contrastive learning models. We focus specifically on the unified sequence-to-sequence learning framework and implement prompt tuning on our OFA models.
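To make the idea concrete, here is a minimal PyTorch sketch of prompt tuning for an encoder: a small set of learnable prompt embeddings is prepended to the input while the pretrained weights stay frozen. This is an illustrative sketch, not the OFA implementation; the names PromptedEncoder, encoder, and prompt_length are hypothetical.

import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    """Hypothetical sketch: prepend learnable prompt embeddings to the
    encoder input and train only those prompts."""

    def __init__(self, encoder, embed_dim, prompt_length=64):
        super().__init__()
        self.encoder = encoder
        # The only trainable parameters: one embedding per prompt position.
        self.prompt = nn.Parameter(torch.randn(prompt_length, embed_dim) * 0.02)
        for p in self.encoder.parameters():
            p.requires_grad = False  # pretrained weights stay fixed

    def forward(self, token_embeds):
        # token_embeds: (batch, seq_len, embed_dim)
        batch = token_embeds.size(0)
        prompts = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        # The encoder attends over the concatenation [prompts; tokens].
        return self.encoder(torch.cat([prompts, token_embeds], dim=1))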

Requirements

  • python 3.7.4
  • pytorch 1.8.1
  • torchvision 0.9.1
  • JAVA 1.8 (for COCO evaluation)

Installation

pip install -r requirements.txt

Datasets and Checkpoints

See datasets.md and checkpoints.md.

Training

We provide a demo script (run_scripts/refcoco/train_refcoco_prefix.sh) that contains all the settings required for training.

sh ./run_scripts/refcoco/train_refcoco_prefix.sh

A few options of note:

  • --encoder-prompt :: whether to insert prompts into the encoder
  • --decoder-prompt :: whether to insert prompts into the decoder
  • --encoder-prompt-length :: encoder prompt length
  • --decoder-prompt-length :: decoder prompt length
  • --bitfit :: whether to use BitFit (bias-only tuning; see the sketch after this list)
  • --adapter :: whether to use adapters (see the sketch after this list)
  • --adapter-dim :: adapter projection dimension
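For reference, --bitfit and --adapter correspond to two common parameter-efficient tuning baselines. The PyTorch sketch below illustrates both under assumed shapes; it is not the repository's implementation, and Adapter and apply_bitfit are hypothetical names.

import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Illustrative adapter: a bottleneck MLP with a residual connection,
    typically inserted inside each Transformer layer; --adapter-dim would
    set the bottleneck width."""

    def __init__(self, embed_dim, adapter_dim=64):
        super().__init__()
        self.down = nn.Linear(embed_dim, adapter_dim)  # project down
        self.up = nn.Linear(adapter_dim, embed_dim)    # project back up

    def forward(self, x):
        # Residual keeps the pretrained representation intact.
        return x + self.up(torch.relu(self.down(x)))

def apply_bitfit(model: nn.Module):
    """Illustrative BitFit: freeze all parameters except bias terms."""
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith("bias")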

We recommend organizing your workspace directory as follows:

OFA/
├── checkpoints/
│   ├── ofa_base.pt
│   ├── ofa_large.pt
│   └── ...
├── criterions/
├── data/
├── dataset/
│   ├── caption_data/
│   ├── refcoco_data/
│   └── ...
├── fairseq/
├── models/
├── run_scripts/
├── tasks/
├── train.py
├── trainer.py
└── utils/