Inference Providers
Active filters: vLLM
| Model | Task | Params | Downloads | Likes |
|---|---|---|---|---|
| QuantTrio/Qwen3.5-122B-A10B-AWQ | Image-Text-to-Text | 125B | 16.2k | 8 |
| QuantTrio/Qwen3.5-35B-A3B-AWQ | Image-Text-to-Text | 36B | 14.8k | 5 |
| QuantTrio/Qwen3.5-397B-A17B-AWQ | Image-Text-to-Text | — | 3.11k | 4 |
| QuantTrio/Qwen3.5-27B-AWQ | Image-Text-to-Text | 28B | 15.2k | 4 |
| QuantTrio/MiniMax-M2.5-AWQ | Text Generation | 229B | 43.9k | 10 |
| — | Text Generation | 586B | 50 | 2 |
| QuantTrio/Qwen3-30B-A3B-Thinking-2507-AWQ | Text Generation | 31B | 5.04k | 4 |
| JunHowie/Qwen3-4B-Instruct-2507-GPTQ-Int4 | Text Generation | 4B | 21.9k | 2 |
| QuantTrio/Qwen3-VL-30B-A3B-Thinking-AWQ | Text Generation | 31B | 3.47k | 12 |
| QuantTrio/Qwen3-VL-32B-Thinking-AWQ | Image-Text-to-Text | 33B | 1.41k | 7 |
| QuantTrio/GLM-4.7-Flash-AWQ | Text Generation | 31B | 120k | 7 |
| QuantTrio/Qwen3-Coder-Next-E336 | Text Generation | 53B | 107 | 1 |
| QuantTrio/Qwen3-Coder-Next-E400 | Text Generation | 63B | 1.22k | 2 |
| model-scope/glm-4-9b-chat-GPTQ-Int4 | Text Generation | 9B | 79 | 6 |
| model-scope/glm-4-9b-chat-GPTQ-Int8 | Text Generation | 9B | 6 | 2 |
| tclf90/qwen2.5-72b-instruct-gptq-int4 | Text Generation | 73B | 45 | 2 |
| tclf90/qwen2.5-72b-instruct-gptq-int3 | Text Generation | 69B | 73 | — |
| prithivMLmods/Nu2-Lupi-Qwen-14B | Text Generation | 15B | 2 | 2 |
| mradermacher/Nu2-Lupi-Qwen-14B-GGUF | — | 15B | 230 | 1 |
| mradermacher/Nu2-Lupi-Qwen-14B-i1-GGUF | — | 15B | 215 | 1 |
| JunHowie/Qwen3-0.6B-GPTQ-Int4 | Text Generation | 0.6B | 219 | 1 |
| JunHowie/Qwen3-0.6B-GPTQ-Int8 | Text Generation | 0.6B | 35 | — |
| JunHowie/Qwen3-1.7B-GPTQ-Int4 | Text Generation | 2B | 600 | 1 |
| JunHowie/Qwen3-1.7B-GPTQ-Int8 | Text Generation | 2B | 5 | — |
| JunHowie/Qwen3-32B-GPTQ-Int4 | Text Generation | 33B | 11.7k | 4 |
| JunHowie/Qwen3-32B-GPTQ-Int8 | Text Generation | 33B | 1.7k | 4 |
| JunHowie/Qwen3-30B-A3B-GPTQ-Int4 | Text Generation | 5B | 8 | 1 |
| JunHowie/Qwen3-14B-GPTQ-Int8 | Text Generation | 15B | 417 | 1 |
| JunHowie/Qwen3-14B-GPTQ-Int4 | Text Generation | 15B | 1.7k | 4 |
| JunHowie/Qwen3-8B-GPTQ-Int8 | Text Generation | 8B | 106 | — |