Mukul
mtcl
AI & ML interests
None yet
Recent Activity
new activity 2 days ago
nvidia/nemotron-3.5-asr-streaming-0.6b:vllm support ?
new activity 4 days ago
DevQuasar/MiniMaxAI.MiniMax-M3-GGUF:thank you for ggufs
new activity 5 days ago
unsloth/MiniMax-M3-GGUF:tool call parser and reasoning parser
Organizations
None yet
vllm support ?
👍 9
4
#6 opened 12 days ago
by
sdd5125
thank you for ggufs
🤝 1
#1 opened 4 days ago
by
mtcl
tool call parser and reasoning parser
👍 1
2
#4 opened 5 days ago
by
mtcl
Vllm and SgLang command please
👍 1
5
#1 opened 10 days ago
by
mtcl
nvidia/DeepSeek-V4-flash-NVFP4
6
#1 opened 21 days ago
by
mtcl
Docker Image
8
#1 opened 21 days ago
by
mtcl
Worse than (smaller) MiniMax M2.7??
17
#2 opened about 2 months ago
by deleted
Unable to run on 2x RTX Pro 6000 (DEEP_GEMM problem)
➕ 10
17
#15 opened about 2 months ago
by
stev236
Running on 2 RTX Pro 6000 Blackwell GPUs at ~30 tps (Instructions that worked for me)
👍❤️ 7
10
#17 opened about 2 months ago
by
CarouselAether
2x Nvidia 6000 Pros
3
#2 opened about 2 months ago
by
mtcl
Will it work on 2X6000 Pros
6
#1 opened about 2 months ago
by
mtcl
Can I deploy it with sglang at my 8*4090 ubuntu sever?
10
#1 opened about 2 months ago
by
marshal007
Context Length for 2X6000 Pros (2x96 = 192GB VRAM)
3
#2 opened about 2 months ago
by
mtcl
really awesome speeds! running at 256k context.
🔥 1
5
#11 opened about 2 months ago
by
mtcl
MOE 122b and 397b please!
🚀 24
14
#7 opened about 2 months ago
by
jesleocizi
How to disable thinking?
4
#9 opened about 2 months ago
by
Hansi2024
These are NOT actual AWQ-quantized models.
2
#1 opened 2 months ago
by
cai-cai
max context
#2 opened about 2 months ago
by
mtcl
No think tags.
10
#4 opened about 2 months ago
by
DrRos
Minimax M2.7 NVFP4
👀🔥 5
4
#4 opened 2 months ago
by
mtcl