Improve model card: Add pipeline tag, project page metadata, update paper and code links
#1 opened by nielsr (HF Staff)
README.md CHANGED
@@ -1,11 +1,13 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen2.5-7B
 - google/siglip2-so400m-patch14-384
+language:
+- en
 library_name: transformers
+license: apache-2.0
+pipeline_tag: robotics
+project_page: https://allenai.org/blog/molmoact
 tags:
 - molmoact
 - molmo
@@ -21,16 +23,17 @@ tags:
 # MolmoAct 7B-D Pretrain
 
 MolmoAct is a fully open-source action reasoning model for robotic manipulation developed by the Allen Institute for AI. MolmoAct is trained on a subset of OXE and MolmoAct Dataset, a dataset with 10k high-quality trajectories of a single-arm Franka robot performing 93 unique manipulation tasks in both home and tabletop environments. It has state-of-the-art performance among vision-language-action models on multiple benchmarks while being fully open-source. You can find all models in the MolmoAct family [here](https://huggingface.co/collections/allenai/molmoact-689697591a3936fba38174d7).
-**Learn more about MolmoAct** in our announcement [blog post](https://allenai.org/blog/molmoact) or the [paper](https://huggingface.co/
+**Learn more about MolmoAct** in our announcement [blog post](https://allenai.org/blog/molmoact) or the [paper](https://huggingface.co/papers/2508.07917).
 
-**MolmoAct 7B-D Pretrain** is based on [Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) and uses [SigLip2](https://huggingface.co/google/siglip2-so400m-patch14-384) as the vision backbone, which is initialized using Molmo's pre-training approach. It is pre-trained on MolmoAct's [Pre-training Mixture](https://huggingface.co/datasets/allenai/MolmoAct-Pretraining-Mixture). This model is intended to be used for downstream mid-training, or for replicating our zero-shot results on SimplerEnv (Google Robot).
+**MolmoAct 7B-D Pretrain** is based on [Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) and uses [SigLip2](https://huggingface.co/google/siglip2-so400m-patch14-384) as the vision backbone, which is initialized using Molmo's pre-training approach. It is pre-trained on MolmoAct's [Pre-training Mixture](https://huggingface.co/datasets/allenai/MolmoAct-Pretraining-Mixture). This model is intended to be used for downstream mid-training, or for replicating our zero-shot results on SimplerEnv (Google Robot).
 
 This checkpoint is a **preview** of the MolmoAct release. All artifacts used in creating MolmoAct (data, training code, evaluations, intermediate checkpoints) will be made available at a later date, furthering our commitment to open-source AI development and reproducibility.
 
 Quick links:
 - 📂 [All Models](https://huggingface.co/collections/allenai/molmoact-689697591a3936fba38174d7)
 - 📂 [All Data](https://huggingface.co/collections/allenai/molmoact-data-mixture-6897e583e13b6c2cf3ea2b80)
-- 📃 [Paper](https://huggingface.co/
+- 📃 [Paper](https://huggingface.co/papers/2508.07917)
+- 💻 [Code](https://github.com/allenai/MolmoAct)
 - 🎥 [Blog Post](https://allenai.org/blog/molmoact)
 - 🎥 [Video](https://youtu.be/-_wag1X25OE?si=Xi_kUaJTmcQBx1f6)
 
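To check the metadata this change introduces, the front matter can be read back and inspected. The following is a minimal sketch, not part of the PR: `parse_front_matter` is a hypothetical helper that handles only the flat subset of YAML seen in this diff (top-level `key: value` pairs and `key:` followed by `- item` lines). The sample README text is an assumption too: the diff does not show the full tag list or the closing `---`, so both are filled in for illustration only.

```python
# Hypothetical helper (not from the PR): parses only the flat front-matter
# subset visible in this diff, not general YAML.
def parse_front_matter(readme_text):
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    current_key = None  # key whose "- item" list is currently being filled
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter ends the metadata block
        if not stripped:
            continue
        if stripped.startswith("- "):
            if current_key is not None:
                meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        value = value.strip()
        if value:
            meta[key.strip()] = value
            current_key = None
        else:
            meta[key.strip()] = []
            current_key = key.strip()
    return meta


# Assumed sample: the new-side metadata from the diff, with the tag list cut
# off after the two visible entries and a closing "---" added for illustration.
README = """---
base_model:
- Qwen/Qwen2.5-7B
- google/siglip2-so400m-patch14-384
language:
- en
library_name: transformers
license: apache-2.0
pipeline_tag: robotics
project_page: https://allenai.org/blog/molmoact
tags:
- molmoact
- molmo
---
# MolmoAct 7B-D Pretrain
"""

meta = parse_front_matter(README)
# Fields this PR adds or reorders; report any that are absent.
missing = [k for k in ("license", "language", "pipeline_tag", "project_page") if k not in meta]
print(meta["pipeline_tag"], missing)
```

For real model cards, a full YAML parser or the Hub's own tooling is the safer choice. The hand-rolled parser here only keeps the sketch free of dependencies.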