add note about flash attn2 compat and quant
README.md CHANGED

@@ -8,6 +8,11 @@ Special thanks to https://huggingface.co/fahadh4ilyas
 # use this to convert original dbrx models
 convert_v2.py
 ```
+
+Known Issues:
+
+1. [TRAINING] Padding incompatibility with flash attention 2: if you use padding, training may fail.
+2. [QUANT GPTQ] testing...
 ---
 inference: false
 license: other
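
For the first known issue, a common workaround is to train with a non-flash attention backend whenever batches are padded. The snippet below is only a rough sketch using the standard `transformers` `attn_implementation` argument; the model path is a hypothetical placeholder, not something defined in this repo.

```python
# Minimal sketch (not from this repo): avoid the flash attention 2 padding
# issue by selecting a different attention backend for padded batches.
# MODEL_PATH is a hypothetical placeholder for a converted DBRX checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "path/to/converted-dbrx"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

# "sdpa" (or "eager") accepts padded inputs; reserve "flash_attention_2"
# for runs where every sequence is unpadded, e.g. packed to full length.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    attn_implementation="sdpa",
)
```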
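
For the GPTQ item, which is still marked as testing, a typical quantization pass with `transformers` looks roughly like the sketch below. It assumes `optimum` and `auto-gptq` are installed; the paths, bit width, and calibration dataset are placeholders, not settings verified for this model.

```python
# Minimal sketch (not from this repo): a typical transformers GPTQ pass.
# Bit width, dataset, and paths are illustrative placeholders only.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

MODEL_PATH = "path/to/converted-dbrx"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

# Quantization happens during loading; the calibration dataset is run
# through the model to compute the quantized weights.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    device_map="auto",
    quantization_config=gptq_config,
)

# Save the quantized checkpoint for later use (hypothetical output path).
model.save_pretrained("dbrx-gptq-4bit")
tokenizer.save_pretrained("dbrx-gptq-4bit")
```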