removing ARM quants
- DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00001-of-00004.gguf +0 -3
- DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00002-of-00004.gguf +0 -3
- DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00003-of-00004.gguf +0 -3
- DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00004-of-00004.gguf +0 -3
- README.md +0 -7
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00001-of-00004.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:5eebd6e6dfa8f5770fe170397854f090a6196f127a8568c0ae6848c743a39868
-size 39658925312
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00002-of-00004.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:648bef998337f9868e04cd99414037370bd180d5f2f5beb840f80e691c4dd3cf
-size 39675124704
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00003-of-00004.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:53ca9a75c289504ca24cb1af67bd1a4c5b16ba79d7ba09b0cea89f8ca94db18b
-size 39447549440
DeepSeek-V2.5-Q4_0_8_8/DeepSeek-V2.5-Q4_0_8_8-00004-of-00004.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:58cf99d2f5488a43247bc3ef5e1f543fabc304c9e15f61cc5bec616e7fce7971
-size 14130858304
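As a quick sanity check on the four deleted pointers, the sketch below parses each Git LFS pointer and sums the recorded sizes. It assumes the README's size column uses decimal gigabytes (10^9 bytes); the helper name `parse_lfs_pointer` is hypothetical and only for illustration.

```python
GB = 10**9  # assumption: the README reports decimal gigabytes

# Pointer contents copied from the four deleted shard files above.
POINTERS = [
    """version https://git-lfs.github.com/spec/v1
oid sha256:5eebd6e6dfa8f5770fe170397854f090a6196f127a8568c0ae6848c743a39868
size 39658925312
""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:648bef998337f9868e04cd99414037370bd180d5f2f5beb840f80e691c4dd3cf
size 39675124704
""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:53ca9a75c289504ca24cb1af67bd1a4c5b16ba79d7ba09b0cea89f8ca94db18b
size 39447549440
""",
    """version https://git-lfs.github.com/spec/v1
oid sha256:58cf99d2f5488a43247bc3ef5e1f543fabc304c9e15f61cc5bec616e7fce7971
size 14130858304
""",
]


def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is recorded in bytes
    return fields


parsed = [parse_lfs_pointer(p) for p in POINTERS]
total_bytes = sum(p["size"] for p in parsed)
print(f"{total_bytes} bytes = {total_bytes / GB:.2f} GB")
```

Under that assumption the total comes to 132.91 GB, which matches the size listed for the Q4_0_8_8 row removed from README.md.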
README.md
CHANGED
@@ -32,7 +32,6 @@ Run them in [LM Studio](https://lmstudio.ai/)
 | [DeepSeek-V2.5-Q5_K_M.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q5_K_M) | Q5_K_M | 167.22GB | true | High quality, *recommended*. |
 | [DeepSeek-V2.5-Q4_K_M.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q4_K_M) | Q4_K_M | 142.45GB | true | Good quality, default size for must use cases, *recommended*. |
 | [DeepSeek-V2.5-Q4_0.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q4_0) | Q4_0 | 133.39GB | true | Legacy format, generally not worth using over similarly sized formats |
-| [DeepSeek-V2.5-Q4_0_8_8.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q4_0_8_8) | Q4_0_8_8 | 132.91GB | true | Optimized for ARM inference. Requires 'sve' support (see link below). |
 | [DeepSeek-V2.5-IQ4_XS.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-IQ4_XS) | IQ4_XS | 125.56GB | true | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
 | [DeepSeek-V2.5-Q3_K_XL.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q3_K_XL) | Q3_K_XL | 122.83GB | true | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
 | [DeepSeek-V2.5-Q3_K_L.gguf](https://huggingface.co/bartowski/DeepSeek-V2.5-GGUF/tree/main/DeepSeek-V2.5-Q3_K_L) | Q3_K_L | 122.37GB | true | Lower quality but usable, good for low RAM availability. |
@@ -76,12 +75,6 @@ huggingface-cli download bartowski/DeepSeek-V2.5-GGUF --include "DeepSeek-V2.5-Q
 
 You can either specify a new local-dir (DeepSeek-V2.5-Q8_0) or download them all in place (./)
 
-## Q4_0_X_X
-
-If you're using an ARM chip, the Q4_0_X_X quants will have a substantial speedup. Check out Q4_0_4_4 speed comparisons [on the original pull request](https://github.com/ggerganov/llama.cpp/pull/5780#pullrequestreview-21657544660)
-
-To check which one would work best for your ARM chip, you can check [AArch64 SoC features](https://gpages.juszkiewicz.com.pl/arm-socs-table/arm-socs.html) (thanks EloyOn!).
-
 ## Which file should I choose?
 
 A great write up with charts showing various performances is provided by Artefact2 [here](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9)