Update README.md
README.md
CHANGED
|
@@ -1,3 +1,34 @@
---
license: apache-2.0
---
# PolyFormer: Referring Image Segmentation as Sequential Polygon Generation

[Project](https://polyformer.github.io/) | [GitHub](https://github.com/amazon-science/polygon-transformer) | [Demo](https://huggingface.co/spaces/koajoel/PolyFormer)

## Model description

PolyFormer is a unified framework for referring image segmentation (RIS) and referring expression comprehension (REC) by formulating them as a sequence-to-sequence (seq2seq) prediction problem. For more details, please refer to our paper:

[PolyFormer: Referring Image Segmentation as Sequential Polygon Generation](https://arxiv.org/abs/2302.07387)
Jiang Liu*, Hui Ding*, Zhaowei Cai, Yuting Zhang, Ravi Kumar Satzoda, Vijay Mahadevan, R. Manmatha, [CVPR 2023](https://cvpr2023.thecvf.com/Conferences/2023/AcceptedPapers)
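
As a rough illustration of the seq2seq formulation described above, the sketch below flattens a bounding box and one or more polygons into a single coordinate sequence of the kind a decoder could emit one element at a time. The function name `to_target_sequence`, the `SEP` and `EOS` marker strings, and the box-then-vertices ordering are assumptions made for this example only. They are not PolyFormer's actual API or tokenization; see the paper and the GitHub repository for the real implementation.

```python
from typing import List, Sequence, Tuple, Union

Point = Tuple[float, float]
Token = Union[Point, str]

SEP = "<SEP>"  # hypothetical marker placed between polygons
EOS = "<EOS>"  # hypothetical end-of-sequence marker


def to_target_sequence(
    box: Tuple[float, float, float, float],
    polygons: Sequence[Sequence[Point]],
    width: int,
    height: int,
) -> List[Token]:
    """Flatten a box (x1, y1, x2, y2) and polygons into one sequence.

    Coordinates are normalized to [0, 1] by the image size. The layout
    (two box corners first, then polygon vertices) is an assumption.
    """

    def norm(x: float, y: float) -> Point:
        return (x / width, y / height)

    x1, y1, x2, y2 = box
    seq: List[Token] = [norm(x1, y1), norm(x2, y2)]
    for i, poly in enumerate(polygons):
        if i > 0:
            seq.append(SEP)  # separate consecutive polygons
        seq.extend(norm(x, y) for x, y in poly)
    seq.append(EOS)
    return seq


if __name__ == "__main__":
    triangle = [(10, 10), (50, 10), (30, 40)]
    print(to_target_sequence((10, 10, 50, 40), [triangle], width=100, height=80))
```

With this layout, REC corresponds to reading off the first two points (the box) and RIS to rasterizing the remaining polygon vertices into a mask, which is one way a single output sequence can serve both tasks.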

## Training data

We pre-train PolyFormer on the REC task using Visual Genome, RefCOCO, RefCOCO+, RefCOCOg, and Flickr30k-entities, and then fine-tune on the REC + RIS tasks using RefCOCO, RefCOCO+, and RefCOCOg.

* PolyFormer-B: Swin-B as the visual encoder, BERT-base as the text encoder, 6 transformer encoder layers and 6 decoder layers.
* PolyFormer-L: Swin-L as the visual encoder, BERT-base as the text encoder, 12 transformer encoder layers and 12 decoder layers.
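
The two variants and the two-stage training schedule listed above can be summarized as plain configuration data. This is a minimal sketch: the `PolyFormerVariant` class, the `VARIANTS` and `TRAINING_STAGES` names, and the field names are hypothetical and do not come from the PolyFormer codebase. Only the values themselves are taken from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolyFormerVariant:
    """Hypothetical container for the hyperparameters listed in the README."""

    visual_encoder: str
    text_encoder: str
    encoder_layers: int
    decoder_layers: int


# Values copied from the variant list; the key names are illustrative.
VARIANTS = {
    "PolyFormer-B": PolyFormerVariant("Swin-B", "BERT-base", 6, 6),
    "PolyFormer-L": PolyFormerVariant("Swin-L", "BERT-base", 12, 12),
}

# Two-stage schedule as described in the training data paragraph.
TRAINING_STAGES = [
    {
        "stage": "pre-train",
        "tasks": ["REC"],
        "datasets": [
            "Visual Genome",
            "RefCOCO",
            "RefCOCO+",
            "RefCOCOg",
            "Flickr30k-entities",
        ],
    },
    {
        "stage": "fine-tune",
        "tasks": ["REC", "RIS"],
        "datasets": ["RefCOCO", "RefCOCO+", "RefCOCOg"],
    },
]

if __name__ == "__main__":
    for name, cfg in VARIANTS.items():
        print(f"{name}: {cfg.visual_encoder} + {cfg.text_encoder}, "
              f"{cfg.encoder_layers} encoder / {cfg.decoder_layers} decoder layers")
```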

## Citation

If you find PolyFormer useful in your research, please cite the following paper:

``` latex
@article{liu2023polyformer,
  title={PolyFormer: Referring Image Segmentation as Sequential Polygon Generation},
  author={Liu, Jiang and Ding, Hui and Cai, Zhaowei and Zhang, Yuting and Satzoda, Ravi Kumar and Mahadevan, Vijay and Manmatha, R},
  journal={arXiv preprint arXiv:2302.07387},
  year={2023}
}
```