DVIS++: Improved Decoupled Framework for Universal Video Segmentation

Tao Zhang, XingYe Tian, Yikang Zhou, ShunPing Ji, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan, Zhongyuan Wang and Yu Wu

News

DVIS-DAQ achieves 57.1 AP on the OVIS dataset and sets a new state of the art on YTVIS19/21 and VIPSeg. The code will be released in this repository and in DAQ-VS. The paper is "DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries"; a project page is also available.
DVIS and DVIS++ achieved 1st place in the VPS Track of the PVUW challenge at CVPR 2023. (2023.5.25)
DVIS and DVIS++ achieved 1st place in the VIS Track of the 5th LSVOS challenge at ICCV 2023. (2023.8.15)

Features

DVIS++ is a universal video segmentation framework that supports VIS, VPS and VSS.
DVIS++ can run in both online and offline modes.
DVIS++ achieved SOTA performance on the YTVIS 2019, 2021 and 2022, OVIS, VIPSeg and VSPW datasets.
OV-DVIS++ is the first open-vocabulary universal video segmentation framework with powerful zero-shot segmentation capability.

Demos

VIS

VSS

VPS

Open-vocabulary demos

Installation

See the Installation Instructions (INSTALL.md).

Getting Started

See Preparing Datasets for DVIS++.

See Getting Started with DVIS++ (GETTING_STARTED.md).

Model Zoo

Trained models are available for download in the DVIS++ Model Zoo (MODEL_ZOO.md).

Citing DVIS and DVIS++

@article{zhang2023dvis,
  title={DVIS: Decoupled Video Instance Segmentation Framework},
  author={Zhang, Tao and Tian, Xingye and Wu, Yu and Ji, Shunping and Wang, Xuebo and Zhang, Yuan and Wan, Pengfei},
  journal={arXiv preprint arXiv:2306.03413},
  year={2023}
}

@article{zhang2023dvisplus,
  title={DVIS++: Improved Decoupled Framework for Universal Video Segmentation}, 
  author={Tao Zhang and Xingye Tian and Yikang Zhou and Shunping Ji and Xuebo Wang and Xin Tao and Yuan Zhang and Pengfei Wan and Zhongyuan Wang and Yu Wu},
  journal={arXiv preprint arXiv:2312.13305},
  year={2023},
}

@article{dvisdaq,
  title={DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries}, 
  author={Yikang Zhou and Tao Zhang and Shunping Ji and Shuicheng Yan and Xiangtai Li},
  journal={arXiv},
  year={2024},
}

Acknowledgement

This repo is largely based on Mask2Former, MinVIS, VITA, CTVIS, FC-CLIP and DVIS. Thanks to their authors for the excellent work.
