SFT, DPO and self-SFT LoRA adapters for the writingprompts dataset.
- dementor-research/dpo_writingprompts_gpt-oss-20b_as_llama-3.1-8b_seed1
- dementor-research/dpo_writingprompts_gpt-oss-20b_as_llama-3.1-8b_seed2
- dementor-research/dpo_writingprompts_gpt-oss-20b_as_llama-3.1-8b_seed3
- dementor-research/dpo_writingprompts_gpt-oss-20b_as_nemotron-nano-30b-a3b_seed1