xx18's Collections

TFPI

Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners