open-r1

mirror of https://github.com/huggingface/open-r1.git synced 2026-06-24 01:54:06 +00:00

Serving DeepSeek-R1 on 2x8 H100 SLURM nodes with SGLang

Set up the environment (adjust the `cu124` tags below to match your CUDA version):

conda create -n sglang124 python=3.11
conda activate sglang124

pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu124

pip install sgl-kernel --force-reinstall --no-deps
pip install "sglang[all]>=0.4.2.post4" --find-links https://flashinfer.ai/whl/cu124/torch2.5/flashinfer/
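
If your CUDA version differs, the `cu124` tag in the commands above needs to change with it. The helper below is a minimal sketch of that mapping; `cuda_wheel_tag` is a hypothetical name, not part of this repository, and it assumes the tag is simply `cu` followed by the major and minor version digits. Wheels are not necessarily published for every CUDA version, so check that the resulting index URL exists before relying on it.

```shell
# Hypothetical helper: turn a CUDA version such as "12.4" or "12.4.1"
# into the wheel tag used in the install commands (e.g. "cu124").
cuda_wheel_tag() {
    local version="$1"

    # Require at least MAJOR.MINOR; a bare major version is ambiguous.
    case "$version" in
        *.*) ;;
        *)
            echo "expected a version like 12.4, got: $version" >&2
            return 1
            ;;
    esac

    local major="${version%%.*}"   # text before the first dot
    local rest="${version#*.}"     # text after the first dot
    local minor="${rest%%.*}"      # drop any patch component

    printf 'cu%s%s\n' "$major" "$minor"
}
```

For example, `pip install torch==2.5.1 --index-url "https://download.pytorch.org/whl/$(cuda_wheel_tag 12.1)"` would target the CUDA 12.1 wheels. The same tag appears in the `--find-links` URL for SGLang.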

Run the server and wait for the model to load:

sbatch slurm/serve_r1.slurm -m "/fsx/deepseek-r1-checkpoint" -e "sglang124"
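
Loading the checkpoint can take a while, so it may help to poll the server instead of checking by hand. The function below is a sketch, not part of this repository: `wait_for_server` is a hypothetical name, and it assumes the server answers HTTP requests on a `/health` route at the same port used by the generation command (39877). Adjust the URL if your deployment differs.

```shell
# Hypothetical helper: poll a URL until it returns a successful HTTP
# status or the timeout expires.
# Usage: wait_for_server URL [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_server() {
    local url="$1"
    local timeout="${2:-3600}"   # give up after one hour by default
    local interval="${3:-30}"    # seconds to sleep between attempts
    local deadline
    deadline=$(( $(date +%s) + timeout ))

    while [ "$(date +%s)" -lt "$deadline" ]; do
        # --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
        if curl --silent --fail --max-time 10 --output /dev/null "$url"; then
            echo "Server is up: $url"
            return 0
        fi
        sleep "$interval"
    done

    echo "Timed out after ${timeout}s waiting for $url" >&2
    return 1
}
```

A typical call would be `wait_for_server "http://<SGLANG_SERVER_ADDRESS>:39877/health"`, run before starting the generation script.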

Run the data generation script:

python scripts/generate_reasoning.py \
    --dataset-name "AI-MO/NuminaMath-1.5" \
    --output-file "numinamath_r1_generations.jsonl" \
    --prompt-column "problem" \
    --uuid-column "problem" \
    --api-addr "<SGLANG_SERVER_ADDRESS>:39877" \
    --num-generations 2 \
    --max-tokens 16384 \
    --max-concurrent 200
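
To keep an eye on progress during a long run, you can count the records written so far. This sketch assumes only that the output file holds one JSON object per line, as the `.jsonl` extension suggests; it makes no assumption about the fields inside each record. `count_generations` is a hypothetical helper name.

```shell
# Hypothetical helper: count the non-empty lines in a JSONL file.
count_generations() {
    local file="$1"
    # NF is zero for blank lines, so only lines with content are counted.
    # "n + 0" prints 0 instead of an empty string for an empty file.
    awk 'NF { n++ } END { print n + 0 }' "$file"
}
```

For example, `count_generations numinamath_r1_generations.jsonl` prints the number of records in the output file from the command above.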