Open-World Entity Segmentation Project Website
Lu Qi*, Jason Kuen*, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia
This project provides an implementation of the paper "Open-World Entity Segmentation" based on Detectron2. Entity segmentation is a segmentation task that aims to segment everything in an image into semantically meaningful regions without considering any category labels. Our entity segmentation models perform well in a cross-dataset setting, where only COCO is used as the training dataset and the model is tested on images from other datasets at inference time. Please refer to the project website for more details and visualizations.
- 30.07.2022: We rebuilt the dataloader with RLE encoding in the EntitySegRLE directory. To use it, substitute EntitySegRLE for EntitySeg in the instructions described below. To generate the RLE format from data in the format used by our earlier version, please refer to EntitySegRLE/tools/makeRLE_COCO2017.py.
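For readers unfamiliar with RLE, the sketch below shows the uncompressed run-length format used by COCO annotations: the mask is flattened column by column, and the counts alternate between runs of 0s and runs of 1s, always starting with 0s. The helper names mask_to_rle and rle_to_mask are hypothetical and are not taken from EntitySegRLE; the repository's own conversion lives in makeRLE_COCO2017.py and may differ.

```python
from itertools import groupby


def mask_to_rle(mask):
    """Encode a binary mask (list of rows of 0/1) as COCO-style uncompressed RLE."""
    height, width = len(mask), len(mask[0])
    # COCO flattens masks in column-major order (down each column first).
    flat = [mask[r][c] for c in range(width) for r in range(height)]
    counts = []
    # Counts always start with a run of 0s, so emit a zero-length run
    # when the mask begins with a foreground pixel.
    if flat[0] == 1:
        counts.append(0)
    counts.extend(len(list(run)) for _, run in groupby(flat))
    return {"size": [height, width], "counts": counts}


def rle_to_mask(rle):
    """Decode the dictionary produced by mask_to_rle back into a list of rows."""
    height, width = rle["size"]
    flat, value = [], 0
    for run_length in rle["counts"]:
        flat.extend([value] * run_length)
        value = 1 - value  # runs alternate between background and foreground
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]
```

Storing run lengths instead of dense masks mainly saves space and loading time when an image contains many large regions.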
This project is based on Detectron2, which can be set up as follows.
- Install Detectron2 following the instructions. Note that our code was developed with detectron2 commit 28174e932c534f841195f02184dc67b941c65a67 and PyTorch 1.8.
- Set up the COCO dataset, including the instance and panoptic annotations, following the structure. The code for the entity evaluation metric is in modified_cocoapi. You can directly replace your compiled coco.py with modified_cocoapi/PythonAPI/pycocotools/coco.py.
- Copy this project to /path/to/detectron2/projects/EntitySeg
- Set find_unused_parameters=True for distributed training in your own detectron2. You can modify it in detectron2/engine/defaults.py.
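Before generating data, it can help to confirm that the COCO files are where the code expects them. The sketch below assumes the layout that Detectron2 documents for its built-in COCO instance and panoptic datasets; the list EXPECTED_COCO_PATHS and the helper missing_coco_paths are hypothetical names, and this project may require additional files.

```python
from pathlib import Path

# Assumed layout, following Detectron2's built-in COCO dataset conventions.
EXPECTED_COCO_PATHS = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "annotations/panoptic_train2017.json",
    "annotations/panoptic_val2017.json",
    "train2017",
    "val2017",
    "panoptic_train2017",
    "panoptic_val2017",
]


def missing_coco_paths(coco_root):
    """Return the expected files or directories that do not exist under coco_root."""
    root = Path(coco_root)
    return [rel for rel in EXPECTED_COCO_PATHS if not (root / rel).exists()]
```

Calling missing_coco_paths on your COCO directory (for Detectron2 this is usually the coco folder under the datasets root) returns an empty list when everything in the assumed layout is present.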
(1) Generate the entity information of each image from the instance and panoptic annotations. Please change the paths to the COCO annotation files in the following script before running it.
cd /path/to/detectron2/projects/EntitySeg/make_data
bash make_entity_mask.sh
(2) Convert the generated entity information to JSON files.
cd /path/to/detectron2/projects/EntitySeg/make_data
python3 entity_to_json.py
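Conceptually, the data preparation turns category-labelled annotations into class-agnostic entities. The sketch below illustrates that idea on COCO panoptic segments_info records only; it is not the logic of make_entity_mask.sh or entity_to_json.py. The function name, the single label value 1, and the isthing field are assumptions made for illustration.

```python
def panoptic_to_entities(segments_info, thing_category_ids):
    """Map COCO panoptic segment records to class-agnostic entity records.

    segments_info: list of dicts with "id", "category_id", "area", "iscrowd".
    thing_category_ids: set of category ids that denote countable objects.
    """
    entities = []
    for segment in segments_info:
        entities.append({
            "id": segment["id"],
            # Assumption: every region gets the same label, since entity
            # segmentation ignores categories.
            "category_id": 1,
            # Kept only so things and stuff can still be told apart later.
            "isthing": segment["category_id"] in thing_category_ids,
            "area": segment["area"],
            "iscrowd": segment.get("iscrowd", 0),
        })
    return entities
```

Each thing instance and each stuff region becomes one entity, which is why both kinds of annotation are needed as input.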
To train a model with 8 GPUs, run:
cd /path/to/detectron2
python3 projects/EntitySeg/train_net.py --config-file <projects/EntitySeg/configs/config.yaml> --num-gpus 8
For example, to launch entity segmentation training (1x schedule) with a ResNet-50 backbone on 8 GPUs and save the model to "/data/entity_model", execute:
cd /path/to/detectron2
python3 projects/EntitySeg/train_net.py --config-file projects/EntitySeg/configs/entity_default.yaml --num-gpus 8 OUTPUT_DIR /data/entity_model
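The trailing OUTPUT_DIR /data/entity_model in the command above is a config override. Detectron2's default argument parser collects everything after the named flags into a list of alternating keys and values, which training scripts typically merge into the config. The sketch below is a stand-alone imitation of that parsing using only argparse; parse_launch_args is a hypothetical helper, not Detectron2 code, and it covers only the flags used in this document.

```python
import argparse


def parse_launch_args(argv):
    """Split a train_net.py-style command line into flags and KEY VALUE overrides."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-file", default="")
    parser.add_argument("--num-gpus", type=int, default=1)
    parser.add_argument("--eval-only", action="store_true")
    # Everything after the named flags is kept verbatim as override tokens.
    parser.add_argument("opts", nargs=argparse.REMAINDER, default=None)
    args = parser.parse_args(argv)

    opts = args.opts or []
    if len(opts) % 2 != 0:
        raise ValueError("config overrides must come in KEY VALUE pairs")
    # Pair up tokens: even positions are keys, odd positions are values.
    overrides = dict(zip(opts[0::2], opts[1::2]))
    return args, overrides
```

Because overrides are read as pairs, a key without a value (or a value containing unquoted spaces) shifts every later pair, so keep each key directly followed by its value.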
To evaluate a pre-trained model with 8 GPUs, run:
cd /path/to/detectron2
python3 projects/EntitySeg/train_net.py --config-file <config.yaml> --num-gpus 8 --eval-only MODEL.WEIGHTS model_checkpoint
To visualize the results of a pre-trained model on some images, run:
cd /path/to/detectron2
python3 projects/EntitySeg/demo_result_and_vis.py --config-file <config.yaml> --input <input_path> --output <output_path> MODEL.WEIGHTS model_checkpoint MODEL.CONDINST.MASK_BRANCH.USE_MASK_RESCORE "True"
For example,
python3 projects/EntitySeg/demo_result_and_vis.py --config-file projects/EntitySeg/configs/entity_swin_lw7_1x.yaml --input /data/input/*.jpg --output /data/output MODEL.WEIGHTS /data/pretrained_model/R_50.pth MODEL.CONDINST.MASK_BRANCH.USE_MASK_RESCORE "True"
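The USE_MASK_RESCORE option enables mask rescoring at inference. This document does not describe the exact formula, so the sketch below shows only one common form of the idea: scale each prediction's confidence by the average mask probability inside its predicted foreground, so that predictions with uncertain masks rank lower. The function name, the threshold, and the formula itself are assumptions, not the repository's implementation.

```python
def rescore(score, mask_probs, threshold=0.5):
    """Scale a confidence score by the mean foreground mask probability.

    score: the prediction's original confidence in [0, 1].
    mask_probs: list of rows of per-pixel foreground probabilities.
    """
    # Pixels above the threshold form the predicted mask.
    foreground = [p for row in mask_probs for p in row if p > threshold]
    if not foreground:
        return 0.0  # an empty mask gets no confidence
    mask_quality = sum(foreground) / len(foreground)
    return score * mask_quality
```

With this form, a crisp mask (probabilities near 1 inside the region) leaves the score almost unchanged, while a blurry mask reduces it.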
Pretrained weights of Swin Transformers
Use tools/convert_swin_to_d2.py to convert the pretrained weights of Swin Transformers to the detectron2 format. For example,
pip install timm
wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth
python tools/convert_swin_to_d2.py swin_tiny_patch4_window7_224.pth swin_tiny_patch4_window7_224_trans.pth
Pretrained weights of SegFormer Backbone
Use tools/convert_mit_to_d2.py to convert the pretrained weights of the SegFormer backbone to the detectron2 format. For example,
pip install timm
python tools/convert_mit_to_d2.py mit_b0.pth mit_b0_trans.pth
We provide the results of several pretrained models on the COCO val set. The approach is easy to extend to other backbones. We first report the results with CNN backbones.
Method | Backbone | Schedule | Entity AP | Download |
---|---|---|---|---|
Baseline | R50 | 1x | 28.3 | model / metrics |
Ours | R50 | 1x | 29.8 | model / metrics |
Ours | R50 | 3x | 31.8 | model / metrics |
Ours | R101 | 1x | 31.0 | model / metrics |
Ours | R101 | 3x | 33.2 | model / metrics |
Ours | R101-DCNv2 | 3x | 35.5 | model / metrics |
The results with transformer backbones are as follows. The Mask Rescore column reports the Entity AP obtained when mask rescoring is used at inference, which is enabled by setting MODEL.CONDINST.MASK_BRANCH.USE_MASK_RESCORE to True.
Method | Backbone | Schedule | Entity AP | Entity AP (Mask Rescore) | Download |
---|---|---|---|---|---|
Ours | Swin-T | 1x | 33.0 | 34.6 | model / metrics |
Ours | Swin-L-W7 | 1x | 37.8 | 39.3 | model / metrics |
Ours | Swin-L-W7 | 3x | 38.6 | 40.0 | model / metrics |
Ours | Swin-L-W12 | 3x | 38.7 | 40.1 | model / metrics |
Ours | MiT-b0 | 1x | 28.8 | 30.4 | model / metrics |
Ours | MiT-b2 | 1x | 35.1 | 36.6 | model / metrics |
Ours | MiT-b3 | 1x | 36.9 | 38.5 | model / metrics |
Ours | MiT-b5 | 1x | 37.2 | 38.7 | model / metrics |
Ours | MiT-b5 | 3x | 37.4 | 38.7 | model / metrics |
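To make the effect of mask rescoring easier to read off, the short script below recomputes the per-model gain from the two Entity AP values in the table above. The numbers are copied from the table; the variable names are illustrative.

```python
# (Entity AP, Entity AP with mask rescoring), copied from the table above.
results = {
    ("Swin-T", "1x"): (33.0, 34.6),
    ("Swin-L-W7", "1x"): (37.8, 39.3),
    ("Swin-L-W7", "3x"): (38.6, 40.0),
    ("Swin-L-W12", "3x"): (38.7, 40.1),
    ("MiT-b0", "1x"): (28.8, 30.4),
    ("MiT-b2", "1x"): (35.1, 36.6),
    ("MiT-b3", "1x"): (36.9, 38.5),
    ("MiT-b5", "1x"): (37.2, 38.7),
    ("MiT-b5", "3x"): (37.4, 38.7),
}

# Round to one decimal to match the table's precision and hide float noise.
gains = {key: round(after - before, 1) for key, (before, after) in results.items()}

for (backbone, schedule), gain in gains.items():
    print(f"{backbone} ({schedule}): +{gain} AP")
```

For the models listed, rescoring adds between 1.3 and 1.6 Entity AP.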
TBD
Dataset | TEST-COCO | TEST-ADE20K | TEST-CITY | Download |
---|---|---|---|---|
TRAIN-COCO | TBD | TBD | TBD | model / metrics |
TRAIN-ADE20K | TBD | TBD | TBD | model / metrics |
TRAIN-CITY | TBD | TBD | TBD | model / metrics |
TRAIN-ALL | TBD | TBD | TBD | model / metrics |
Dataset | TEST-COCO | TEST-ADE20K | TEST-CITY | Download |
---|---|---|---|---|
TRAIN-ALL | 38.9 | 37.0 | 33.0 | model / metrics |
Please consider citing Open-World Entity Segmentation if it helps your research.
@article{qi2021open,
title={Open-World Entity Segmentation},
author={Qi, Lu and Kuen, Jason and Wang, Yi and Gu, Jiuxiang and Zhao, Hengshuang and Lin, Zhe and Torr, Philip and Jia, Jiaya},
journal={arXiv preprint arXiv:2107.14228},
year={2021}
}