mirror of https://github.com/huggingface/open-r1.git synced 2026-06-24 01:54:06 +00:00

lewtun 5ac5971ea5

Add OpenR1-Distill recipe (#661)

2025-05-26 17:57:44 +02:00

Post-training recipes

OpenR1 Distill 7B

To train the OpenR1 Distill 7B model, run:

sbatch --nodes=1 slurm/train.slurm --model OpenR1-Distill-7B --task sft --config distill --accelerator zero3

OlympicCoder

To train the OlympicCoder models, run:

# 7B
sbatch --nodes=1 slurm/train.slurm --model OlympicCoder-7B --task sft --config v00.00 --accelerator zero3

# 32B
sbatch --nodes=16 slurm/train.slurm --model OlympicCoder-32B --task sft --config v00.00 --accelerator fsdp

Note: for the 32B model, we had to switch to FSDP1 and the paged 8-bit AdamW optimizer to fit the largest possible context size.
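
The commands in this file share one shape and differ only in node count, model name, config, and accelerator. As a minimal sketch, assuming a hypothetical helper named build_train_cmd (it is not part of open-r1), the following prints each command for review instead of submitting it to Slurm:

```shell
# Hypothetical helper, not part of open-r1: prints the sbatch command for a
# recipe so the arguments can be checked before the job is submitted.
# Arguments: $1 = node count, $2 = model, $3 = config, $4 = accelerator.
build_train_cmd() {
  printf 'sbatch --nodes=%s slurm/train.slurm --model %s --task sft --config %s --accelerator %s\n' \
    "$1" "$2" "$3" "$4"
}

# Dry run: print the commands from the recipes above without submitting them.
build_train_cmd 1 OpenR1-Distill-7B distill zero3
build_train_cmd 1 OlympicCoder-7B v00.00 zero3
build_train_cmd 16 OlympicCoder-32B v00.00 fsdp
```

Each printed line can be pasted into a shell on the cluster once it looks right. The helper fixes --task to sft because every recipe here uses it.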