TAUR-dev/M-test_scratch-sft • 1B • Updated • 8
TAUR-dev/M-0921__0epoch_alltask2_grpo-rl • 2B • Updated • 2
TAUR-dev/M-bolt_gpt4o_baseline-rl • 2B • Updated • 5
TAUR-dev/M-0921__pv2_CT3and4arg_grpo-rl • 2B • Updated • 5
TAUR-dev/M-0921__0epoch_CT3and4arg_grpo-rl • 2B • Updated • 4
TAUR-dev/M-BASELINE_gtp4o_distillation-sft • 2B • Updated • 4
TAUR-dev/M-BASELINE_gtp4o_BOLT-sft • 2B • Updated • 7
TAUR-dev/M-multitask_sftdata_cd3_lm3_ac4_lc4-sft • 2B • Updated • 4
TAUR-dev/M-multitask_sftdata_cd34_lm3_ac4_lc4-sft • 2B • Updated • 5
TAUR-dev/M-0921__0epoch_alltask1_grpo-rl • Updated
TAUR-dev/M-0918__bon_tuning_correct_samples_3args_grpo-rl • 2B • Updated • 3
TAUR-dev/M-0918__bon_tuning_all_samples_3args_grpo-rl • 2B • Updated • 4
TAUR-dev/M-0918__orig_only_prompts_3args_grpo-rl • 2B • Updated • 5
TAUR-dev/M-ablations__rl_ab_no_reflects-rl • 2B • Updated • 5
TAUR-dev/M-0918__random_3args_grpo-rl • 2B • Updated • 7
TAUR-dev/M-0918__1_sample_only_corrects_3args_grpo-rl • 2B • Updated • 4
TAUR-dev/M-sft_on_pv_v2__rl_on_cd34_gsm_csqa_lm34-rl • Updated
TAUR-dev/M-sft_basemodel__rl_on_cd34_gsm_csqa_lm34-rl • Updated
TAUR-dev/M-0918__low_quality_reflections_3args_grpo-rl • Updated
TAUR-dev/M-RC-ab_sft_bon_all_samples-sft • 2B • Updated • 9
TAUR-dev/M-skillfactory-ablations__random_reflections5_formatsrandom-sft • 2B • Updated • 6
TAUR-dev/M-skillfactory-ablations__no_reflections_reflections5_formatsno_reflection-sft • 2B • Updated • 6
TAUR-dev/M-skillfactory-ablations__orig_only_reflections5_formats-C_full-sft • 2B • Updated • 6
TAUR-dev/M-RC-ab_sft_bon_corr_samples-sft • 2B • Updated • 6
TAUR-dev/M-RC-ab_sft_our_structure_single_sample-sft • 2B • Updated • 8
TAUR-dev/M-rl_1e_v2__pv_v3-rl • 2B • Updated • 4
TAUR-dev/M-0918__0epoch_3and4args_grpo-rl • Updated
TAUR-dev/M-sft_exp_1e_zayneprompts_v3-sft • 2B • Updated • 3
TAUR-dev/M-rl_1e_v2__pv_v2-rl • 2B • Updated • 3
TAUR-dev/M-rl_1e_v2__pv_v2_origonly2e-rl • 2B • Updated • 4