ulab-ai/sotopia-rl-qwen2.5-7B-rm
Tags: Feature Extraction, Transformers, Safetensors, PEFT, reward-model, social-intelligence, reinforcement-learning, llm, qwen
arxiv: 2508.03905
License: apache-2.0
skyyyyks committed on Jul 21
Commit 003f5f4 (verified) · 1 parent: ad15141
Update README.md
Files changed (1)
README.md (+5 -3)
@@ -1,3 +1,5 @@
----
-license: apache-2.0
----
+---
+license: apache-2.0
+base_model:
+- Qwen/Qwen2.5-7B-Instruct
+---
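The change above adds a `base_model` list to the YAML front matter of README.md. As a minimal sketch of how that metadata could be read back, the snippet below parses the resulting front matter using only the standard library. The names `read_front_matter` and `README_HEAD` are hypothetical and introduced for illustration only; the parser handles just flat scalars and flat lists, so real tooling would more likely rely on a full YAML parser.

```python
def read_front_matter(text):
    """Parse a simple YAML front matter block (flat scalars and flat lists only).

    Hypothetical helper for illustration; not a general YAML parser.
    """
    lines = text.splitlines()
    # Front matter must open with a '---' delimiter on the first line.
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            # Closing delimiter ends the front matter.
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            # List item belonging to the most recently seen key.
            if not isinstance(meta[current_key], list):
                meta[current_key] = []
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            meta[current_key] = value.strip()
    return meta


# Front matter as it appears after this commit (copied from the diff).
README_HEAD = """---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-7B-Instruct
---
"""

meta = read_front_matter(README_HEAD)
print(meta["license"])
print(meta["base_model"])
```

With the front matter from this commit, `meta["base_model"]` is a one-element list naming `Qwen/Qwen2.5-7B-Instruct`, which is the value a consumer would use to locate the base checkpoint.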