Add library_name, link to code, and update pipeline tag
#1
by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,15 +1,23 @@
 ---
-
+base_model:
+- internlm/internlm2-7b
 datasets:
 - cerebras/SlimPajama-627B
 language:
 - en
-
-
+license: mit
+metrics:
+- accuracy
+pipeline_tag: text-generation
+library_name: transformers
 ---
 
 # Meta-rater Language Model (7.2B Parameters, 150B Tokens)
 
+This repository contains the model described in the paper [Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models](https://huggingface.co/papers/2504.14194).
+
+Code: https://github.com/opendatalab/Meta-rater
+
 ## Model Description
 
 This is a 7.2B parameter transformer-based decoder-only language model trained from scratch on 150B tokens selected from SlimPajama dataset using the **Meta-rater** framework with all 25 quality scores. This represents the largest and most capable model in the Meta-rater research, demonstrating maximal benefits of quality-driven data selection at scale.
@@ -224,4 +232,4 @@ Please refer to the license terms of the original SlimPajama dataset and follow
 
 ## Contact
 
-For questions or issues, please contact the authors or open an issue in the repository.
+For questions or issues, please contact the authors or open an issue in the repository.
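The two metadata keys this PR adds, `library_name: transformers` and `pipeline_tag: text-generation`, determine which loading snippet and task filter the Hub shows for the model, and they signal that the checkpoint loads through the standard `transformers` text-generation pipeline. Below is a minimal sketch under that assumption; the repo id `your-org/meta-rater-7b-150B` is a placeholder (the PR does not name the Hub path), and `trust_remote_code=True` is included only because the declared base model, `internlm/internlm2-7b`, ships custom modeling code.

```python
# Minimal usage sketch implied by pipeline_tag: text-generation and
# library_name: transformers. The repo id below is a PLACEHOLDER, not the
# confirmed Hub path of this model.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="your-org/meta-rater-7b-150B",  # placeholder repo id (assumption)
    trust_remote_code=True,   # internlm2-based checkpoints may ship custom modeling code
    device_map="auto",        # spread the 7.2B-parameter weights across available devices
)

out = generator(
    "Data selection for language model pre-training matters because",
    max_new_tokens=64,
    do_sample=False,
)
print(out[0]["generated_text"])
```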