GR-RL: Going Dexterous and Precise for Long-Horizon Robotic Manipulation
Abstract
GR-RL enhances a vision-language-action policy for long-horizon dexterous manipulation through a multi-stage training pipeline that filters, augments, and refines demonstrations using reinforcement learning.
We present GR-RL, a robotic learning framework that turns a generalist vision-language-action (VLA) policy into a highly capable specialist for long-horizon dexterous manipulation. Existing VLA policies rest on the assumption that human demonstrations are optimal. However, we claim that in highly dexterous and precise manipulation tasks, human demonstrations are noisy and suboptimal. GR-RL proposes a multi-stage training pipeline that filters, augments, and reinforces the demonstrations with reinforcement learning. First, GR-RL learns a vision-language-conditioned task progress function, filters the demonstration trajectories, and keeps only the transitions that contribute positively to progress. Specifically, we show that by directly applying offline RL with sparse reward, the resulting Q-values can be treated as a robust progress function. Next, we introduce morphological symmetry augmentation, which greatly improves the generalization and performance of GR-RL. Lastly, to better align the VLA policy with its deployment behaviors for high-precision control, we perform online RL by learning a latent space noise predictor. With this pipeline, GR-RL is, to our knowledge, the first learning-based policy that can autonomously lace up a shoe by threading shoelaces through multiple eyelets, achieving an 83.3% success rate on a task that requires long-horizon reasoning, millimeter-level precision, and compliant soft-body interaction. We hope GR-RL provides a step toward enabling generalist robot foundation models to specialize into reliable real-world experts.
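The filtering step in the abstract, which keeps only transitions that contribute positively to task progress, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Transition` class, the `filter_transitions` helper, the `min_gain` threshold, and the toy progress values are all hypothetical, and the progress numbers stand in for the Q-values that the abstract says an offline-RL critic would produce.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Transition:
    """One demonstration step with progress estimates (hypothetical structure)."""
    step: int
    progress: float       # assumed critic estimate for the current state
    next_progress: float  # assumed critic estimate for the following state


def transitions_from_progress(progress: Sequence[float]) -> List[Transition]:
    """Pair consecutive progress estimates into transitions."""
    return [
        Transition(i, progress[i], progress[i + 1])
        for i in range(len(progress) - 1)
    ]


def filter_transitions(
    transitions: Sequence[Transition], min_gain: float = 0.0
) -> List[Transition]:
    """Keep transitions whose progress gain strictly exceeds min_gain.

    min_gain is an assumed knob; the abstract only states that transitions
    contributing positively to progress are kept.
    """
    return [t for t in transitions if t.next_progress - t.progress > min_gain]


# Toy per-step progress values for one demonstration (made up for illustration).
demo_progress = [0.10, 0.25, 0.20, 0.20, 0.45, 0.40, 0.80]
kept = filter_transitions(transitions_from_progress(demo_progress))
kept_steps = [t.step for t in kept]
print(kept_steps)
```

In this toy trajectory, the steps where the estimate drops or stalls (for example a fumbled grasp or a hesitation) are discarded, and only the remaining transitions would be passed on to policy training.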