Bingxiang He

hbx

https://hbx-hbx.github.io/

AI & ML interests

NLP

Recent Activity

upvoted a paper 21 days ago

P1: Mastering Physics Olympiads with Reinforcement Learning

updated a model 26 days ago

hbx/JustRL-DeepSeek-1.5B

liked a model 26 days ago

hbx/JustRL-DeepSeek-1.5B

authored 6 papers 2 months ago

Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents

Paper • 2402.09205 • Published Feb 14, 2024

AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset

Paper • 2504.03612 • Published Apr 4 • 2

MiniCPM4: Ultra-Efficient LLMs on End Devices

Paper • 2506.07900 • Published Jun 9 • 92

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10 • 189

EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents

Paper • 2412.13549 • Published Dec 18, 2024

The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning

Paper • 2406.11721 • Published Jun 17, 2024

authored a paper 3 months ago

MiniCPM-V 4.5: Cooking Efficient MLLMs via Architecture, Data, and Training Recipe

Paper • 2509.18154 • Published Sep 16 • 51

authored a paper 10 months ago

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published Feb 3 • 61