nightmedia

metadata

license: apache-2.0
datasets:
  - microsoft/rStar-Coder
language:
  - en
base_model: prithivMLmods/rStar-Coder-Qwen3-0.6B
pipeline_tag: text-generation
library_name: mlx
tags:
  - text-generation-inference
  - chain-of-thought
  - trl
  - vllm
  - coder
  - code
  - core
  - python
  - math
  - gspo
  - mlx

rStar-Coder-Qwen3-0.6B-q8-hi-mlx

Performance evaluation

21194/21194 [16:09<00:00, 21.85it/s]

| task | acc | norm | stderr |
|---|---|---|---|
| arc_challenge | 0.280 | 0.300 | 0.013 |
| arc_easy | 0.416 | 0.377 | 0.009 |
| boolq | 0.378 | 0.378 | 0.008 |
| hellaswag | 0.366 | 0.434 | 0.004 |
| openbookqa | 0.184 | 0.342 | 0.021 |
| piqa | 0.651 | 0.652 | 0.011 |
| winogrande | 0.532 | 0.532 | 0.014 |

Performance evaluation of the parent model at BF16

21194/21194 [23:40<00:00, 14.92it/s]

| task | acc | norm | stderr |
|---|---|---|---|
| arc_challenge | 0.279 | 0.299 | 0.013 |
| arc_easy | 0.420 | 0.379 | 0.009 |
| boolq | 0.378 | 0.378 | 0.008 |
| hellaswag | 0.366 | 0.434 | 0.004 |
| openbookqa | 0.186 | 0.344 | 0.021 |
| piqa | 0.656 | 0.655 | 0.014 |
| winogrande | 0.524 | 0.524 | 0.014 |
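
The two result sets can be compared directly. The sketch below copies the scores listed above into plain dictionaries, computes the accuracy difference per task, and checks whether each difference is smaller than the larger of the two reported standard errors. That threshold is an informal rule of thumb chosen for this example, not a significance test, and the helper names (`compare`, `seconds`) are hypothetical. The wall-clock ratio is derived from the elapsed times in the two progress lines and assumes both runs used the same hardware, which the card does not state.

```python
# Scores copied from the results above: (acc, norm, stderr) per task.
q8 = {
    "arc_challenge": (0.280, 0.300, 0.013),
    "arc_easy": (0.416, 0.377, 0.009),
    "boolq": (0.378, 0.378, 0.008),
    "hellaswag": (0.366, 0.434, 0.004),
    "openbookqa": (0.184, 0.342, 0.021),
    "piqa": (0.651, 0.652, 0.011),
    "winogrande": (0.532, 0.532, 0.014),
}
bf16 = {
    "arc_challenge": (0.279, 0.299, 0.013),
    "arc_easy": (0.420, 0.379, 0.009),
    "boolq": (0.378, 0.378, 0.008),
    "hellaswag": (0.366, 0.434, 0.004),
    "openbookqa": (0.186, 0.344, 0.021),
    "piqa": (0.656, 0.655, 0.014),
    "winogrande": (0.524, 0.524, 0.014),
}


def compare(quant, parent):
    """Map each task to (acc delta, whether |delta| <= larger stderr)."""
    rows = {}
    for task, (acc_q, _, err_q) in quant.items():
        acc_p, _, err_p = parent[task]
        delta = round(acc_q - acc_p, 3)
        # Informal threshold (an assumption of this sketch, not a formal test).
        rows[task] = (delta, abs(delta) <= max(err_q, err_p))
    return rows


def seconds(mmss):
    """Convert an 'MM:SS' elapsed time from the progress line to seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


results = compare(q8, bf16)
for task, (delta, within) in results.items():
    print(f"{task:14s} delta_acc={delta:+.3f} within_stderr={within}")

# Elapsed times taken from the two progress lines (q8: 16:09, BF16: 23:40).
speedup = seconds("23:40") / seconds("16:09")
print(f"wall-clock ratio (BF16 / q8): {speedup:.2f}x")
```

On these numbers every per-task accuracy difference falls inside the reported standard error, and the q8 evaluation run finished in roughly two thirds of the BF16 wall-clock time.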

This model rStar-Coder-Qwen3-0.6B-q8-hi-mlx was converted to MLX format from prithivMLmods/rStar-Coder-Qwen3-0.6B using mlx-lm version 0.26.3.

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

# Full Hugging Face repo id (owner/name); a bare name only resolves as a local path.
model, tokenizer = load("nightmedia/rStar-Coder-Qwen3-0.6B-q8-hi-mlx")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
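
The card tags this model as chain-of-thought, so the generated text may contain reasoning ahead of the final answer. A minimal sketch for separating the two is shown below. It assumes the reasoning is wrapped in `<think>...</think>` tags, as Qwen3-style chat templates commonly produce; that delimiter is an assumption and is not confirmed by this card. `split_reasoning` is a hypothetical helper, not part of mlx-lm, and the sample string stands in for a real `response`.

```python
import re

# Assumption: reasoning is delimited by <think>...</think> (Qwen3-style output).
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer); reasoning is '' when no think block exists."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Keep everything outside the first think block as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Stand-in for the string returned by generate(); not real model output.
sample = "<think>\nAdd the two numbers.\n</think>\n\ndef add(a, b):\n    return a + b"
reasoning, answer = split_reasoning(sample)
print(answer)
```

In practice the helper would be called as `split_reasoning(response)` after the `generate` call above. If the model emits a different delimiter, only the regular expression needs to change.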