Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
published a model about 14 hours ago: mehuldamani/qwen25noInstruct_rlvr_multi_veryHardDataset_moreThinking
published a model about 19 hours ago: mehuldamani/qwen25noInstruct_SFTed_rlvr_multi_veryHardDataset_moreThinking
published a model about 19 hours ago: mehuldamani/qwen25_rlvr_single_veryHardDataset
Organizations
None yet