Update README.md
README.md
CHANGED
@@ -29,7 +29,7 @@ This repository contains a robust, general-domain generative reward model presen
 
 - **Paper**: [One Token to Fool LLM-as-a-Judge](https://huggingface.co/papers/2507.08794)
 - **Training Data**: [https://huggingface.co/datasets/sarosavo/Master-RM](https://huggingface.co/datasets/sarosavo/Master-RM)
-- **Code/GitHub Repository**: [https://github.com/Yulai-Zhao/Robust-Reward-Model](https://github.com/Yulai-Zhao/Robust-Reward-Model)
+<!-- - **Code/GitHub Repository**: [https://github.com/Yulai-Zhao/Robust-Reward-Model](https://github.com/Yulai-Zhao/Robust-Reward-Model) -->
 - **Training algorithm**: Standard supervised fine-tuning, see Appendix A.2 for more details.
 
 ## Model Description