Motit committed
Commit 1fa6d4c · verified · Parent: 0ea9444

added graphs and link to the blog post

Files changed (1): README.md (+7 -5)
README.md CHANGED
@@ -8,6 +8,8 @@ library_name: transformers
 ## Introduction
 
 AI21’s Jamba Reasoning 3B is a top-performing reasoning model that packs leading scores on intelligence benchmarks and highly efficient processing into a compact 3B build.
+<br> Read the full blog post [here](https://www.ai21.com/blog/introducing-jamba-reasoning-3B).
+
 
 ### Key Advantages
 
@@ -15,19 +17,19 @@ AI21’s Jamba Reasoning 3B is a top-performing reasoning model that packs leadi
 
 The hybrid design combines Transformer attention with Mamba (a state-space model). Mamba layers are more efficient for sequence processing, while attention layers capture complex dependencies. This mix reduces memory overhead, improves throughput, and makes the model run smoothly on laptops, GPUs, and even mobile devices, while maintaining impressive quality.
 
-*Placeholder graph: Intelligence vs Speed*
+<img src="https://huggingface.co/ai21labs/AI21-Jamba-Reasoning-3B-GGUF/resolve/main/assets/Intelligence%20vs%20Speed%20Jamba%20Reasoning%203B.png" width="900"/>
 
 
 **Smart: Leading intelligence scores**
 The model outperforms competitors such as Gemma 3 4B, Llama 3.2 3B, and Granite 4.0 Micro on a combined intelligence score that averages 6 standard benchmarks.
 
-*Placeholder graph: MMLU + HLE + IFBench*
+<img src="https://huggingface.co/ai21labs/AI21-Jamba-Reasoning-3B-GGUF/resolve/main/assets/Jamba%20Reasoning%203B%20Quality%20Benchmarks.png" width="900"/>
 
 
 **Scalable: Handles very long contexts**
 
 Unlike most compact models, Jamba Reasoning 3B supports extremely long contexts. Mamba layers allow the model to process inputs without storing massive attention caches, so it scales to **256K tokens** while keeping inference practical. This makes it suitable for edge deployment as well as datacenter workloads.
-*Placeholder graph: On-Device Speed as Context Scales*
+<img src="https://huggingface.co/ai21labs/AI21-Jamba-Reasoning-3B-GGUF/resolve/main/assets/Speed%20vs%20Context%20Length.png" width="900"/>
 
 
 ## Model Details
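The scalability claim in the hunk above is at bottom a memory argument: attention layers must cache keys and values for every past token, while Mamba layers carry a fixed-size state. A back-of-envelope sketch makes the gap concrete; every number below (layer count, head sizes, the hybrid ratio) is an illustrative assumption, not a measured Jamba Reasoning 3B figure:

```python
# Rough KV-cache arithmetic for a 256K-token context.
# All configuration numbers are hypothetical, for illustration only.
layers, kv_heads, head_dim = 32, 8, 128   # assumed 3B-scale config
bytes_per_elem = 2                        # fp16 / bf16
seq_len = 256_000                         # the 256K-token context above

# A pure-attention stack caches keys AND values at every layer:
full_cache = 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem
print(f"all-attention KV cache: {full_cache / 1e9:.1f} GB")   # ~33.6 GB

# A hybrid stack pays this cost only for its few attention layers;
# Mamba layers keep a small fixed-size state regardless of seq_len:
attention_layers = 4                      # assumed hybrid ratio
hybrid_cache = 2 * attention_layers * kv_heads * head_dim * seq_len * bytes_per_elem
print(f"hybrid KV cache:        {hybrid_cache / 1e9:.1f} GB")  # ~4.2 GB
```

That fixed-size Mamba state is also what keeps the on-device speed curve in the "Speed vs Context Length" graph flat as context grows.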
@@ -50,7 +52,7 @@ Unlike most compact models, Jamba Reasoning 3B supports extremely long contexts.
 | Llama 3.2 3B | 35.0% | 5.2% | 26.0% |
 | Gemma 3 4B | 42.0% | 5.2% | 28.0% |
 | Qwen 3 1.7B | 57.0% | 4.8% | 27.0% |
-| Qwen 3 4B | 70% | 5.1% | 0.33 |
+| Qwen 3 4B | 70.0% | 5.1% | 33.0% |
 | **Jamba Reasoning 3B** | **61.0%** | **6.0%** | **52.0%** |
 
 ## Quickstart
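The hunk closes at the README's `## Quickstart` heading. For orientation, here is a minimal sketch of what such a quickstart typically looks like with `transformers`; the repo id `ai21labs/AI21-Jamba-Reasoning-3B` is assumed (the non-GGUF counterpart of the repo hosting the images above), and the prompt is illustrative:

```python
# Minimal generation sketch; the model repo id below is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/AI21-Jamba-Reasoning-3B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Summarize the Jamba hybrid design."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Decode only the newly generated tokens, not the prompt.
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```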
@@ -238,4 +240,4 @@ Full support for training Jamba through VeRL will be available soon. AI21 has in
 
 ## Citation
 
-- Blog post- Placeholder
+- Blog post: Read the full blog post [here](https://www.ai21.com/blog/introducing-jamba-reasoning-3B)
 