Yysrc/Mantis-Base


	---
	license: apache-2.0
	pipeline_tag: robotics
	library_name: transformers
	---

	# Mantis

	> This is the official checkpoint of **Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight**

	- Paper: https://arxiv.org/pdf/2511.16175
	- Code: https://github.com/zhijie-group/Mantis

	### 🔥 Highlights
	- Disentangled Visual Foresight augments action learning without overburdening the backbone.
	- Progressive Training preserves the understanding capabilities of the backbone.
	- Adaptive Temporal Ensemble reduces inference cost while maintaining stable control.
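
This card does not specify how the adaptive variant works, so the sketch below is not Mantis's implementation. As an assumption for illustration, it shows plain (non-adaptive) temporal ensembling as commonly used with action-chunking policies: several predictions made for the same timestep are blended with exponentially decaying weights. The function name `temporal_ensemble`, the `decay` parameter, and the oldest-first weighting are all hypothetical choices.

```python
import math


def temporal_ensemble(predictions, decay=0.01):
    """Blend several action predictions made for the same timestep.

    predictions: list of equal-length action vectors, oldest prediction first.
    decay: non-negative rate; weight of the i-th prediction is exp(-decay * i),
           so the oldest prediction gets the largest weight (an assumption,
           not taken from the Mantis code).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    weights = [math.exp(-decay * i) for i in range(len(predictions))]
    total = sum(weights)
    dim = len(predictions[0])
    # Weighted average, computed independently for each action dimension.
    return [
        sum(w * p[d] for w, p in zip(weights, predictions)) / total
        for d in range(dim)
    ]
```

With `decay=0.0` this reduces to a simple mean of the overlapping predictions; larger values lean more heavily on the earliest prediction. For the actual adaptive scheme, consult the paper and repository linked above.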

	### How to use
	This is the base Mantis model. For detailed usage, please refer to [our repository](https://github.com/zhijie-group/Mantis).
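
As a minimal sketch of fetching the checkpoint files before following the repository's instructions, the snippet below uses `snapshot_download` from the `huggingface_hub` package. It assumes the repo ID is `Yysrc/Mantis-Base`, as shown in this page's header, and that `huggingface_hub` is installed. The helper name `download_checkpoint` is hypothetical. Loading the model and running inference are not covered here, since those steps are defined by the Mantis codebase.

```python
# Assumed repo ID, taken from this page's header rather than the README body.
REPO_ID = "Yysrc/Mantis-Base"


def download_checkpoint(repo_id=REPO_ID, local_dir=None):
    """Download all files of the model repo and return the local folder path.

    local_dir: optional target directory; when None, the default
               Hugging Face cache location is used.
    """
    # Imported lazily so that this module can be imported without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Example (requires network access):
# checkpoint_path = download_checkpoint(local_dir="./mantis-base")
```

The returned path can then be passed to whatever loading entry point the repository documents.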

	### 📝 Citation
	If you find our code or models useful in your work, please cite [our paper](https://arxiv.org/pdf/2511.16175):
	```
	@article{yang2025mantis,
	title={Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight},
	author={Yang, Yi and Li, Xueqi and Chen, Yiyang and Song, Jin and Wang, Yihan and Xiao, Zipeng and Su, Jiadi and Qiaoben, You and Liu, Pengfei and Deng, Zhijie},
	journal={arXiv preprint arXiv:2511.16175},
	year={2025}
	}
	```