---
license: mit
pipeline_tag: text-generation
---

# On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

Charlie Zhang, Graham Neubig, Xiang Yue

Carnegie Mellon University, Language Technologies Institute


This repository contains post-training checkpoints for the extrapolation tasks.

Code: GitHub Repository

## 📚 Citation

If you find this work or code useful, please consider citing:

```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
      title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
      author={Charlie Zhang and Graham Neubig and Xiang Yue},
      year={2025},
      eprint={2512.07783},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.07783},
}
```