# Weaver Distilled - All Datasets (gte-Qwen2-1.5B-instruct)

This is a distilled cross-encoder model based on gte-Qwen2-1.5B-instruct, trained to predict the correctness of answers across multiple domains. This general-purpose verifier was trained on Weaver scores aggregated over 35 different verifiers and reward models.

## Model Details

- **Base Model**: [Alibaba-NLP/gte-Qwen2-1.5B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct)
- **Architecture**: Cross-encoder with MLP head (1536 → 768 → 384 → 1); a sketch of this head follows the list below
- **Max Sequence Length**: 4096
- **Training Data**: Combined dataset of [MATH500](https://huggingface.co/datasets/HuggingFaceH4/MATH-500), [GPQA](https://huggingface.co/datasets/Idavidrein/gpqa), and [MMLU Pro](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro), scored by 35 different LM judges and reward models and aggregated with Weaver
- **Training Objective**: Binary classification (correct/incorrect answer prediction)
## Usage
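Below is a minimal usage sketch. Two details are assumptions rather than facts from this README: the checkpoint is loaded as a single-label sequence-classification model through `transformers` (the actual checkpoint may ship custom loading code), and the repo id is a placeholder to be replaced with the real model id.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder repo id -- substitute the actual Hugging Face model id.
MODEL_ID = "your-org/weaver-distilled-all-datasets-gte-qwen2-1.5b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# Assumption: the cross-encoder + MLP head is exposed as a 1-logit
# sequence-classification model (trust_remote_code for custom layers).
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID, num_labels=1, trust_remote_code=True
)
model.eval()

question = "What is 7 * 8?"
candidate_answer = "7 * 8 = 56."

# Cross-encoder: the question and candidate answer are encoded jointly
# as a single sequence pair.
inputs = tokenizer(
    question,
    candidate_answer,
    truncation=True,
    max_length=4096,  # the model's stated max sequence length
    return_tensors="pt",
)

with torch.no_grad():
    logit = model(**inputs).logits.squeeze(-1)

# Binary classification head: sigmoid maps the logit to P(correct).
print(f"P(correct) = {torch.sigmoid(logit).item():.3f}")
```

Scoring every candidate answer to a question this way and keeping the highest-probability one is the typical verifier-style selection loop.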