jonsaadfalcon committed (verified)
Commit c58de00 · Parent(s): 2c068c7

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -1,13 +1,13 @@
 # Weaver Distilled - All Datasets (gte-Qwen2-1.5B-instruct)
 
-This is a distilled cross-encoder model based on gte-Qwen2-1.5B-instruct, trained to predict the correctness of answers across multiple domains. This general-purpose verifier was trained on a combined dataset of 35 different verifiers and reward models aggregated using Weaver.
+This is a distilled cross-encoder model based on gte-Qwen2-1.5B-instruct, trained to predict the correctness of answers across multiple domains. This general-purpose verifier was trained on Weaver scores aggregated over 35 different verifiers and reward models.
 
 ## Model Details
 
 - **Base Model**: [Alibaba-NLP/gte-Qwen2-1.5B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct)
 - **Architecture**: Cross-encoder with MLP head (1536 → 768 → 384 → 1)
 - **Max Sequence Length**: 4096
-- **Training Data**: Combined dataset from 35 different LM Judges and reward models aggregated with Weaver
+- **Training Data**: Combined dataset of [MATH500](https://huggingface.co/datasets/HuggingFaceH4/MATH-500), [GPQA](https://huggingface.co/datasets/Idavidrein/gpqa), and [MMLU Pro](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro) from 35 different LM Judges and reward models aggregated with Weaver
 - **Training Objective**: Binary classification (correct/incorrect answer prediction)
 
 ## Usage
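
The README's stated head architecture (1536 → 768 → 384 → 1, scoring a question–answer pair with one correctness logit) can be sketched as follows. This is a minimal illustrative sketch assuming PyTorch; the class name `VerifierHead`, the ReLU activations, and the pooling convention are assumptions, not taken from the released model code.

```python
import torch
import torch.nn as nn


class VerifierHead(nn.Module):
    """Hypothetical sketch of the MLP head described in the README:
    1536 -> 768 -> 384 -> 1, applied to the cross-encoder's pooled output."""

    def __init__(self, hidden_size: int = 1536):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, 768),
            nn.ReLU(),
            nn.Linear(768, 384),
            nn.ReLU(),
            nn.Linear(384, 1),
        )

    def forward(self, pooled: torch.Tensor) -> torch.Tensor:
        # One logit per (question, answer) pair; sigmoid gives the
        # predicted probability that the answer is correct.
        return self.mlp(pooled)


head = VerifierHead()
logits = head(torch.randn(2, 1536))  # batch of 2 pooled embeddings
probs = torch.sigmoid(logits)
print(logits.shape)  # torch.Size([2, 1])
```

Under the binary-classification objective the README names, a logit above 0 (probability above 0.5) would mark an answer as predicted correct.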