Siliang Zeng

SiliangZ

https://siliangzeng.github.io/index.html

AI & ML interests

Alignment, RLHF, LLM

Recent Activity

upvoted a paper 29 days ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 271

upvoted a paper 8 months ago

Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment

Paper • 2505.11821 • Published May 17, 2025 • 14

updated a model 10 months ago

SiliangZ/zephyr-7b-dpo-full

7B • Updated Mar 14, 2025 • 1

published a model 10 months ago

SiliangZ/zephyr-7b-dpo-full

7B • Updated Mar 14, 2025 • 1

updated a Space 10 months ago

Wiki Tool Use

📈

published a Space 10 months ago

Wiki Tool Use

📈

updated a dataset 12 months ago

SiliangZ/mistral_irl3_rm_data_idpo

Viewer • Updated Jan 21, 2025 • 208k • 8

published a dataset 12 months ago

SiliangZ/mistral_irl3_rm_data_idpo

Viewer • Updated Jan 21, 2025 • 208k • 8

updated a dataset 12 months ago

SiliangZ/mistral_irl3_rm_data_combined_idpo

Viewer • Updated Jan 20, 2025 • 624k • 14

published a dataset 12 months ago

SiliangZ/mistral_irl3_rm_data_combined_idpo

Viewer • Updated Jan 20, 2025 • 624k • 14

updated a model 12 months ago

SiliangZ/mistral-irl-iter2-iterative-dpo

Text Generation • 7B • Updated Jan 20, 2025 • 1

published a model 12 months ago

SiliangZ/mistral-irl-iter2-iterative-dpo

Text Generation • 7B • Updated Jan 20, 2025 • 1

updated a model 12 months ago

SiliangZ/RM_Zephyr_dpo_init_ultrafeedbck_lr_5e7

Text Classification • 7B • Updated Jan 19, 2025 • 1

published a model 12 months ago

SiliangZ/RM_Zephyr_dpo_init_ultrafeedbck_lr_5e7

Text Classification • 7B • Updated Jan 19, 2025 • 1

updated a model 12 months ago

SiliangZ/RM_Zephyr_dpo_init_ultrafeedbck_lr_5e6

Text Classification • 7B • Updated Jan 19, 2025 • 5

published a model 12 months ago

SiliangZ/RM_Zephyr_dpo_init_ultrafeedbck_lr_5e6

Text Classification • 7B • Updated Jan 19, 2025 • 5

updated 2 models 12 months ago

SiliangZ/RM_Mistral_sft_init_ultrafeedbck_lr_5e7

Text Classification • 7B • Updated Jan 19, 2025 • 3

SiliangZ/RM_Mistral_sft_init_ultrafeedbck_lr_5e6

Text Classification • 7B • Updated Jan 19, 2025 • 9

published 2 models 12 months ago

SiliangZ/RM_Mistral_sft_init_ultrafeedbck_lr_5e7

Text Classification • 7B • Updated Jan 19, 2025 • 3

SiliangZ/RM_Mistral_sft_init_ultrafeedbck_lr_5e6

Text Classification • 7B • Updated Jan 19, 2025 • 9
