1. Introduction
GRM-2.6-Plus is a 27B-parameter reasoning model built for general-purpose AI and optimized for difficult, high-complexity tasks. It is designed to deliver stronger performance for its size while remaining practical, efficient, and accessible for advanced local and research-oriented use.
The model focuses on structured reasoning, helping it produce more accurate, coherent, and reliable responses across demanding problems. GRM-2.6-Plus brings elite-level reasoning to complex workloads, making it suitable for users who need a capable model for advanced problem-solving, coding, agents, and everyday intelligence.
2. Key Capabilities
- Elite-Level Reasoning for Hard Tasks: GRM-2.6-Plus is optimized to handle difficult reasoning workloads with clarity, consistency, and strong step-by-step problem-solving ability.
- High Performance for Its Size: With 27B parameters, the model is designed to deliver excellent capability relative to its scale, balancing strong intelligence with practical deployment.
- Advanced Coding and Agentic Use: GRM-2.6-Plus is well suited for code generation, structured problem-solving, tool-style workflows, and local agentic applications.
- Optimized for Practical Deployment: The model aims to remain efficient and usable across capable consumer and workstation hardware while offering strong performance for advanced tasks.
3. Performance
GRM-2.6-Plus is designed to be a highly capable 27B local AI model for complex reasoning, coding, everyday chat, and agentic workflows. It focuses on delivering better performance for its size, making it a strong option for users who want powerful reasoning without relying only on massive-scale models.
Its core strength is practical intelligence: elite-level reasoning, strong task understanding, stable responses, and the ability to handle difficult problems across multiple domains.
| Benchmark | GRM-2.6-Plus | Qwen3.6-27B | google/gemma-4-31B-it | GPT-5.4-Mini | Claude-4.5-Haiku |
|---|---|---|---|---|---|
| Knowledge & STEM | | | | | |
| MMLU-Pro | 86.8 | 86.2 | 85.2 | -- | 80.0 |
| MMLU-Redux | 94.2 | 93.5 | 93.7 | -- | -- |
| C-Eval | 92.0 | 91.4 | 82.6 | -- | -- |
| GPQA Diamond | 88.3 | 87.8 | 84.3 | 88.0 | 73.0 |
| SuperGPQA | 66.4 | 66.0 | 65.7 | -- | -- |
| Reasoning & Coding | | | | | |
| LiveCodeBench v6 | 84.8 | 83.9 | 80.0 | -- | 51.1 |
| HMMT Feb 2026 | 84.8 | 84.3 | 77.2 | -- | -- |
| AIME 2026 | 95.1 | 94.1 | 89.2 | -- | -- |
| General Agent | | | | | |
| SWE-bench Verified | 77.7 | 77.2 | 52.0 | -- | 73.3 |
| SWE-bench Pro | 54.0 | 53.5 | 35.7 | 54.4 | -- |
| Terminal-Bench 2.0 | 59.8 | 59.3 | 42.9 | 60.0 | 41.0 |
4. Family
The GRM-2.6 family is available in multiple sizes to suit different deployment needs.
| Model | Size | Domain |
|---|---|---|
| GRM-2.6-Plus | 27B | Powerful model for extremely difficult tasks |
| GRM-2.6 | 9B | Powerful on-device deployment for difficult tasks |
| GRM-2.6-Air | 2B | Any-device deployment for everyday chat |
5. Architecture
GRM-2.6 is built on the Qwen3.6 architecture and is optimized for complex tasks, agent environments, and everyday chat.
GRM-2.6-Plus applies the same design to a stronger, larger foundation, resulting in a model that punches above its weight class on structured reasoning tasks while remaining deployable on consumer hardware.
6. Quick start
Before starting, make sure the OpenAI Python SDK is installed and that the API key and API base URL are configured, e.g.:

```shell
pip install -U openai

# Set the following to match your deployment
export OPENAI_BASE_URL="http://localhost:8000/v1"
export OPENAI_API_KEY="EMPTY"
```
Text-Only Input

```python
from openai import OpenAI

# Reads OPENAI_BASE_URL and OPENAI_API_KEY from the environment
client = OpenAI()

messages = [
    {"role": "user", "content": "Create a calculator in a single HTML file"},
]

chat_response = client.chat.completions.create(
    model="OrionLLM/GRM-2.6-Plus",
    messages=messages,
    max_tokens=81920,
    temperature=1.0,
    top_p=0.95,
    presence_penalty=0.0,
    extra_body={
        "top_k": 20,
    },
)
print("Chat response:", chat_response)
```
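For long generations like the example above, streaming tokens as they arrive is often more practical than waiting for the full completion. The sketch below reuses the same sampling settings; the `stream_chat` helper is illustrative (not part of any SDK), and it assumes the same `OPENAI_BASE_URL` / `OPENAI_API_KEY` environment variables as the setup step.

```python
# Sampling settings matching the non-streaming example above,
# collected once so both call styles stay consistent.
SAMPLING = {
    "max_tokens": 81920,
    "temperature": 1.0,
    "top_p": 0.95,
    "presence_penalty": 0.0,
    "extra_body": {"top_k": 20},
}

def stream_chat(prompt: str, model: str = "OrionLLM/GRM-2.6-Plus") -> str:
    """Stream a reply token-by-token and return the full text.

    Illustrative helper, not part of the OpenAI SDK.
    """
    # Imported here so the sketch can be loaded without the SDK configured.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_BASE_URL / OPENAI_API_KEY from the environment
    pieces = []
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
        **SAMPLING,
    )
    for chunk in stream:
        # Each chunk carries an incremental delta; content may be None.
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        pieces.append(delta)
    print()
    return "".join(pieces)
```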
GRM-2.6-Plus is developed by OrionLLM and released under the Apache 2.0 License.
Model tree for OrionLLM/GRM-2.6-Plus
Base model: Qwen/Qwen3.6-27B
Evaluation results
- AIME 2026 (MathArena/aime_2026): 95.1
- GPQA Diamond (Idavidrein/gpqa): 88.3
- HMMT Feb 2026 (MathArena/hmmt_feb_2026): 84.8
- MMLU-Pro (TIGER-Lab/MMLU-Pro): 86.8
- SWE-bench Pro (ScaleAI/SWE-bench_Pro): 54.0
- SWE-bench Verified (SWE-bench/SWE-bench_Verified): 77.7
- Terminal-Bench 2.0 (harborframework/terminal-bench-2.0): 59.8