Update README.md
README.md
CHANGED
@@ -14,6 +14,9 @@ license: artistic-2.0
 # juanako-7b-v1 (UNA: Uniform Neural Alignment)
 
 This model uses uniform neural alignment (UNA) for the DPO training phases and is a fine-tuned version of [fblgit/zephyr-lora-dpo-b1](https://huggingface.co/fblgit/zephyr-lora-dpo-b1) on the HuggingFaceH4/ultrafeedback_binarized dataset.
+
+**It is recommended to use the latest [Juanako Version](https://huggingface.co/fblgit/juanako-7b-UNA) which highly outperforms the v1**
+
 It achieves the following results on the evaluation set:
 - Loss: 0.4594
 - Rewards/chosen: -1.1095
@@ -27,7 +30,7 @@ It achieves the following results on the evaluation set:
 
 Followed [alignment-handbook](https://github.com/huggingface/alignment-handbook) to perform DPO (Phase 2) over Zephyr-SFT model.
 
-**Please feel free to run more tests and commit the results. Also if you are interested to participate in [UNA's paper research or GPU sponsorship](mailto:
+**Please feel free to run more tests and commit the results. Also if you are interested to participate in [UNA's paper research or GPU sponsorship](mailto:xavi@juanako.ai) to support UNA research, feel free to contact.**
 
 Special thanks to [TheBloke](https://huggingface.co/TheBloke) for converting the model into multiple formats and overall his enormous contribution to the community.
 Here are the models: