mirror of
https://github.com/huggingface/open-r1.git
synced 2026-06-24 01:54:06 +00:00
Serving DeepSeek-R1 on 2x8 H100 SLURM nodes with SGLang
- Set up the environment (the commands below target CUDA 12.4; adjust the cu124 URLs for your CUDA version):
conda create -n sglang124 python=3.11
conda activate sglang124
pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install sgl-kernel --force-reinstall --no-deps
pip install "sglang[all]>=0.4.2.post4" --find-links https://flashinfer.ai/whl/cu124/torch2.5/flashinfer/
- Submit the server job (-m is the path to the model checkpoint, -e is the conda environment created above) and wait for the model to load:
sbatch slurm/serve_r1.slurm -m "/fsx/deepseek-r1-checkpoint" -e "sglang124"
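Loading the model can take a while, so it may help to poll the server before starting generation. The following is a minimal sketch, not part of the repository: it assumes the SGLang server answers `GET /health` with HTTP 200 once it is ready and that it listens on port 39877 as in the generation command. The helper names `server_is_up` and `wait_for_server` are hypothetical, and the base URL must include the `http://` scheme.

```python
import http.client
import time
import urllib.request


def server_is_up(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200.

    Assumption: the server exposes a /health endpoint; adjust the path
    if your SGLang version differs.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, non-200 status, or a bad response:
        # treat all of them as "not ready yet".
        return False


def wait_for_server(base_url, max_wait=3600.0, interval=30.0,
                    probe=server_is_up, sleep=time.sleep):
    """Poll `probe(base_url)` until it succeeds or `max_wait` seconds pass.

    Returns True as soon as the probe succeeds, False on timeout.
    `probe` and `sleep` are injectable so the loop can be tested offline.
    """
    deadline = time.monotonic() + max_wait
    while True:
        if probe(base_url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Example (replace the placeholder with the node running the server):
# ready = wait_for_server("http://<SGLANG_SERVER_ADDRESS>:39877")
```

The probe always runs at least once, so a `max_wait` of 0 performs a single check and returns immediately.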
- Run the data generation script, replacing <SGLANG_SERVER_ADDRESS> with the address of the node running the server:
python scripts/generate_reasoning.py \
--dataset-name "AI-MO/NuminaMath-1.5" \
--output-file "numinamath_r1_generations.jsonl" \
--prompt-column "problem" \
--uuid-column "problem" \
--api-addr "<SGLANG_SERVER_ADDRESS>:39877" \
--num-generations 2 \
--max-tokens 16384 \
--max-concurrent 200
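Once generation has run for a while, a quick sanity check of the output file can be useful. The sketch below is an illustration only: it assumes nothing about the record fields written by `scripts/generate_reasoning.py` beyond one JSON object per line, and the helper name `summarize_jsonl` is hypothetical. It counts parseable rows, counts malformed lines (for example a partially written last line), and tallies which top-level keys appear.

```python
import json
from collections import Counter


def summarize_jsonl(path):
    """Count valid rows, malformed lines, and top-level keys in a JSONL file."""
    n_rows = 0
    n_bad = 0
    key_counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                n_bad += 1  # e.g. a line cut off mid-write
                continue
            n_rows += 1
            if isinstance(record, dict):
                key_counts.update(record.keys())
    return {"rows": n_rows, "malformed": n_bad, "keys": dict(key_counts)}


# Example, using the output file name from the command above:
# print(summarize_jsonl("numinamath_r1_generations.jsonl"))
```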