The reason is, the FFN (feed forward networks) of gpt-oss do not behave nicely w…
The rest of these are provided for your own interest in case you feel like experimenting, but the size savings are basically non-existent, so I would not recommend running them; they are provided simply for show:

| Filename | Quant type | File Size | Split | Description |
| -------- | ---------- | --------- | ----- | ----------- |
| [gpt-oss-120b-bf16.gguf](https://huggingface.co/bartowski/openai_gpt-oss-120b-GGUF/tree/main/openai_gpt-oss-120b-bf16) | bf16 | 65.37GB | true | Full BF16 weights. |
| [gpt-oss-120b-Q6_K.gguf](https://huggingface.co/bartowski/openai_gpt-oss-120b-GGUF/tree/main/openai_gpt-oss-120b-Q6_K) | Q6_K | 63.28GB | true | Q6_K with all FFN kept at MXFP4_MOE. |
| [gpt-oss-120b-Q4_K_L.gguf](https://huggingface.co/bartowski/openai_gpt-oss-120b-GGUF/tree/main/openai_gpt-oss-120b-Q4_K_L) | Q4_K_L | 63.06GB | true | Uses Q8_0 for embed and output weights. Q4_K_M with all FFN kept at MXFP4_MOE. |
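Because the Split column is `true` for these quants, each one lives in its own subfolder of the repo as multiple GGUF shards. As a minimal sketch (assuming the `huggingface_hub` CLI is installed; the include pattern below just matches the folder names in the table), you can fetch a single quant like this:

```shell
# Install the CLI if needed: pip install -U "huggingface_hub[cli]"
# Download only the Q4_K_L shards into the current directory,
# preserving the repo's folder layout
huggingface-cli download bartowski/openai_gpt-oss-120b-GGUF \
  --include "openai_gpt-oss-120b-Q4_K_L/*" \
  --local-dir ./
```

With llama.cpp, point `-m` at the first shard (the file whose name ends in `-00001-of-XXXXX.gguf`) and the remaining shards are loaded automatically.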