COCO-CN

COCO-CN is a bilingual image description dataset enriching MS-COCO with manually written Chinese sentences and tags. The new dataset can be used for multiple tasks including image tagging, captioning and retrieval, all in a cross-lingual setting.

Chinese sentences	COCO-CN train	COCO-CN val	COCO-CN test
human written	✅	✅	✅
human translation	❌	❌	✅
machine translation (baidu)	✅	✅	✅
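
When choosing a sentence source per split, the availability table above can be encoded as a small lookup. This is only a sketch: `AVAILABILITY` and `sources_for` are hypothetical names for illustration, not part of the released COCO-CN code.

```python
# Hypothetical encoding of the availability table above.
# Not part of the released COCO-CN code.
AVAILABILITY = {
    "human written": {"train": True, "val": True, "test": True},
    "human translation": {"train": False, "val": False, "test": True},
    "machine translation (baidu)": {"train": True, "val": True, "test": True},
}


def sources_for(split):
    """Return the Chinese sentence sources available for a given split."""
    return [name for name, splits in AVAILABILITY.items() if splits[split]]


if __name__ == "__main__":
    for split in ("train", "val", "test"):
        print(split, sources_for(split))
```

As the table shows, human translations exist only for the test set, so a model trained on COCO-CN train would rely on human-written or machine-translated sentences.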

Progress

version 201805: 20,341 images (training / validation / test: 18,341 / 1,000 / 1,000), associated with 22,218 manually written Chinese sentences and 5,000 manually translated sentences. Data is freely available upon request. Please submit your request via Google Form.
Precomputed image features: ResNeXt-101
COCO-CN-Results-Viewer: A lightweight tool to inspect the results of different image captioning systems on the COCO-CN test set, developed by Emiel van Miltenburg at Tilburg University.
NUS-WIDE100: An extra test set.

2018-12-16: Code for cross-lingual image tagging and captioning released.
2018-12-20: Code for cross-lingual image retrieval and our image annotation system released.
2019-01-13: The COCO-CN paper accepted as a regular paper by the T-MM journal.
2021-02-03: Release of new annotations (4,573 images and 4,712 manually written sentences) collected via our iCap interactive image captioning system. The images have no overlap with the previously released dataset.
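
Because the 2021 images do not overlap with version 201805, the counts in the notes above can simply be added. The following is a quick arithmetic check, with all figures copied from this README; the variable names are illustrative only.

```python
# Counts copied from the Progress notes; variable names are illustrative.
V201805_SPLITS = {"train": 18_341, "val": 1_000, "test": 1_000}
V201805_IMAGES = 20_341
V201805_SENTENCES = 22_218  # manually written sentences
ICAP_IMAGES = 4_573         # 2021-02-03 release, no overlap with 201805
ICAP_SENTENCES = 4_712      # manually written sentences

# The three splits should account for every image in version 201805.
assert sum(V201805_SPLITS.values()) == V201805_IMAGES

total_images = V201805_IMAGES + ICAP_IMAGES
total_sentences = V201805_SENTENCES + ICAP_SENTENCES
print(total_images, total_sentences)  # 24914 26930
```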

Citation

If you find COCO-CN useful, please consider citing the following paper:

Xirong Li, Chaoxi Xu, Xiaoxu Wang, Weiyu Lan, Zhengxiong Jia, Gang Yang, Jieping Xu, COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval, IEEE Transactions on Multimedia, Volume 21, Number 9, pages 2347-2360, 2019
