---
license: mit
pipeline_tag: text-generation
---
<h1 align="center">
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
</h1>
<div align="center">
<a href="https://chenlong-clock.github.io">Charlie Zhang</a>, <a href="https://www.phontron.com">Graham Neubig</a>,
<a href="https://xiangyue9607.github.io">Xiang Yue</a>
Carnegie Mellon University, Language Technologies Institute
</div>
<div align="center">
[![arXiv](https://img.shields.io/badge/arXiv-2512.07783-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2512.07783)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
![Python](https://img.shields.io/badge/python-3.9%2B-blue)
</div>
This repository contains post-training (RL) checkpoints for the extrapolation tasks studied in the paper.
**Code:** [GitHub Repository](https://github.com/Interplay-LM-Reasoning/Interplay-LM-Reasoning)
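
A minimal sketch for loading one of these checkpoints with Hugging Face Transformers; the repository id below is a placeholder, so substitute the id of the specific checkpoint you want to use:

```python
# Minimal sketch: load a checkpoint and generate text with Transformers.
# NOTE: the model id below is a placeholder; replace it with the actual
# checkpoint repository id from this collection.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Interplay-LM-Reasoning/extrapolation_rl"  # placeholder id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Question: ..."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```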
## 📚 Citation
If you find this work or code useful, please consider citing:
```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
      title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
      author={Charlie Zhang and Graham Neubig and Xiang Yue},
      year={2025},
      eprint={2512.07783},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.07783},
}
```