AI & ML interests
LLM
- OpenMOSS-Team/MOSS-TTS
  Text-to-Speech • 8B • Updated • 99.8k • 350
- OpenMOSS-Team/MOSS-TTS-Realtime
  Text-to-Speech • 2B • Updated • 83.6k • 67
- OpenMOSS-Team/MOSS-TTS-Local-Transformer
  Text-to-Speech • 3B • Updated • 56.9k • 21
- OpenMOSS-Team/MOSS-Audio-Tokenizer
  Feature Extraction • 2B • Updated • 77.6k • 37
True Speech-to-Speech Language Model
First Omni-modal Future Forecasting Benchmark
https://github.com/OpenMOSS/FRoM-W1
Proactive Robot Manipulation in Omni-modal Context
Open source weights of Lorsa modules introduced in "Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition".
The MHA2MLA model published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs"
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
  Paper • 2502.14837 • Published • 3
- OpenMOSS-Team/Llama-2-7B-MLA-d_kv_16
  Text Generation • 6B • Updated • 2
- OpenMOSS-Team/Llama-2-7B-MLA-d_kv_32
  Text Generation • 6B • Updated
- OpenMOSS-Team/Llama-2-7B-MLA-d_kv_64
  Text Generation • 7B • Updated • 20
Open-source Lorsas and Transcoders
A unified multimodal large language model for end-to-end speaker-attributed, time-stamped transcription.
Evaluating Agentic Backend Coding Capabilities in Real-World Development Scenarios
[ICLR 2026] Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning
An Efficient Training Framework for Diffusion Language Models
- World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning
  Paper • 2503.10480 • Published • 56
- Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning
  Paper • 2506.23127 • Published • 1
- World-aware Planning Narratives Enhance Large Vision-Language Model Planner
  Paper • 2506.21230 • Published
- OpenMOSS-Team/Embodied_R1-ScienceWorld
  8B • Updated • 3
The MHA2MLA model published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs"
- OpenMOSS-Team/SmolLM-135M-MLA-d_kv_8-refactor
  Text Generation • 0.1B • Updated • 2
- OpenMOSS-Team/SmolLM-135M-MLA-d_kv_32-refactor
  Text Generation • 0.1B • Updated
- OpenMOSS-Team/SmolLM-135M-MLA-d_kv_16-refactor
  Text Generation • 0.1B • Updated • 3
- OpenMOSS-Team/SmolLM-360M-MLA-d_kv_8-refactor
  Text Generation • 0.3B • Updated • 2