Junkang Wu

junkang0909
https://junkangwu.github.io/

AI & ML interests

LLM alignment

Organizations

None yet

upvoted 2 papers 3 months ago

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Paper • 2509.22576 • Published Sep 26 • 134

Quantile Advantage Estimation for Entropy-Safe Reasoning

Paper • 2509.22611 • Published Sep 26 • 118
upvoted 3 papers 10 months ago

Aligning Multimodal LLM with Human Preference: A Survey

Paper • 2503.14504 • Published Mar 18 • 26

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Paper • 2503.07572 • Published Mar 10 • 47

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Paper • 2502.10391 • Published Feb 14 • 34