None defined yet.
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning