Models used in CHARM: Calibrating Reward Models With Chatbot Arena Scores.
shawnxzhu
AI & ML interests
None yet
Recent Activity
authored a paper about 12 hours ago
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
submitted a paper 1 day ago
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL