Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2603.03276

Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published 12 days ago • 173
Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 11 days ago • 89
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

Paper • 2602.05400 • Published Feb 5 • 347

about 11 hours ago

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 11 days ago • 89
Qwen3-Coder-Next Technical Report

Paper • 2603.00729 • Published 14 days ago • 47
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use

Paper • 2603.03205 • Published 11 days ago • 11
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

Paper • 2602.23166 • Published 16 days ago • 40

Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published Jan 23 • 18
Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published Jan 28 • 21
Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 102
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published Jan 26 • 41

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3, 2025 • 75
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 108
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 509
Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6, 2025 • 31

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 63
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7, 2025 • 46
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

Understanding of Multimodal

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 11 days ago • 89

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 11 days ago • 89

GARDO: Reinforcing Diffusion Models without Reward Hacking

Paper • 2512.24138 • Published Dec 30, 2025 • 30
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Paper • 2512.24165 • Published Dec 30, 2025 • 51
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Paper • 2512.24617 • Published Dec 31, 2025 • 65
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Paper • 2512.23447 • Published Dec 29, 2025 • 98

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 170
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published May 28, 2025 • 46
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning

Paper • 2505.23380 • Published May 29, 2025 • 22
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models

Paper • 2505.21523 • Published May 23, 2025 • 13

Image-Video MultiModal Understanding

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 147
SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization

Paper • 2501.01245 • Published Jan 2, 2025 • 5
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published Dec 31, 2024 • 46
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Paper • 2501.08326 • Published Jan 14, 2025 • 34

Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published 12 days ago • 173
Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 11 days ago • 89
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

Paper • 2602.05400 • Published Feb 5 • 347

Understanding of Multimodal

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 11 days ago • 89

about 11 hours ago

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 11 days ago • 89
Qwen3-Coder-Next Technical Report

Paper • 2603.00729 • Published 14 days ago • 47
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use

Paper • 2603.03205 • Published 11 days ago • 11
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

Paper • 2602.23166 • Published 16 days ago • 40

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 11 days ago • 89

Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published Jan 23 • 18
Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published Jan 28 • 21
Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 102
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published Jan 26 • 41

GARDO: Reinforcing Diffusion Models without Reward Hacking

Paper • 2512.24138 • Published Dec 30, 2025 • 30
DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Paper • 2512.24165 • Published Dec 30, 2025 • 51
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Paper • 2512.24617 • Published Dec 31, 2025 • 65
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Paper • 2512.23447 • Published Dec 29, 2025 • 98

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3, 2025 • 75
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 108
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 509
Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6, 2025 • 31

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 170
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published May 28, 2025 • 46
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning

Paper • 2505.23380 • Published May 29, 2025 • 22
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models

Paper • 2505.21523 • Published May 23, 2025 • 13

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 63
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7, 2025 • 46
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

Image-Video MultiModal Understanding

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 147
SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization

Paper • 2501.01245 • Published Jan 2, 2025 • 5
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published Dec 31, 2024 • 46
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Paper • 2501.08326 • Published Jan 14, 2025 • 34

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs