jonsaadfalcon committed (verified)
Commit c58de00 · Parent(s): 2c068c7

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -1,13 +1,13 @@
 # Weaver Distilled - All Datasets (gte-Qwen2-1.5B-instruct)
 
-This is a distilled cross-encoder model based on gte-Qwen2-1.5B-instruct, trained to predict the correctness of answers across multiple domains. This general-purpose verifier was trained on a combined dataset of 35 different verifiers and reward models aggregated using Weaver.
+This is a distilled cross-encoder model based on gte-Qwen2-1.5B-instruct, trained to predict the correctness of answers across multiple domains. This general-purpose verifier was trained on Weaver scores aggregated over 35 different verifiers and reward models.
 
 ## Model Details
 
 - **Base Model**: [Alibaba-NLP/gte-Qwen2-1.5B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct)
 - **Architecture**: Cross-encoder with MLP head (1536 → 768 → 384 → 1)
 - **Max Sequence Length**: 4096
-- **Training Data**: Combined dataset from 35 different LM Judges and reward models aggregated with Weaver
+- **Training Data**: Combined dataset of [MATH500](https://huggingface.co/datasets/HuggingFaceH4/MATH-500), [GPQA](https://huggingface.co/datasets/Idavidrein/gpqa), and [MMLU Pro](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro) from 35 different LM Judges and reward models aggregated with Weaver
 - **Training Objective**: Binary classification (correct/incorrect answer prediction)
 
 ## Usage
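
The README's stated head architecture (1536 → 768 → 384 → 1, scoring a question–answer pair with one correctness logit) can be sketched as follows. This is a minimal illustrative sketch assuming PyTorch; the class name `VerifierHead`, the ReLU activations, and the pooling convention are assumptions, not taken from the released model code.

```python
import torch
import torch.nn as nn


class VerifierHead(nn.Module):
    """Hypothetical sketch of the MLP head described in the README:
    1536 -> 768 -> 384 -> 1, applied to the cross-encoder's pooled output."""

    def __init__(self, hidden_size: int = 1536):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, 768),
            nn.ReLU(),
            nn.Linear(768, 384),
            nn.ReLU(),
            nn.Linear(384, 1),
        )

    def forward(self, pooled: torch.Tensor) -> torch.Tensor:
        # One logit per (question, answer) pair; sigmoid gives the
        # predicted probability that the answer is correct.
        return self.mlp(pooled)


head = VerifierHead()
logits = head(torch.randn(2, 1536))  # batch of 2 pooled embeddings
probs = torch.sigmoid(logits)
print(logits.shape)  # torch.Size([2, 1])
```

Under the binary-classification objective the README names, a logit above 0 (probability above 0.5) would mark an answer as predicted correct.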