Commit graph

6 commits

Author SHA1 Message Date
lewtun
6a0cd5c8ad
Fix style again :) (#636) 2025-05-08 16:29:01 +02:00
Andrei
af81114044
Code Execution using Morph Cloud (#614)
* initial commit for morphcloud sandbox support

* initial

* fixed prints in morph client for ioi

* updated import

* context manager

* removed unnecessary comments

* more intelligent instance/snapshot management

* update

* Add documentation for Morph integration

* Delete MORPH_INTEGRATION.md

* added retry and modularity to morph client

* updates to kwargs and setup.py

* Update setup.py

* added languages codepath + fixed slurm + added morph tests

* make quality formatting fixes

* conditional imports for morph

---------

Co-authored-by: arb8020 <arbeightytwenty@gmail.com>
2025-05-08 08:59:54 +02:00
Edward Beeching
c1eadaa097
E2B Router bug fixes (#592)
* fix eval system prompt

* style

* fix a rare issue where the execution is None

* fixes a bug in the e2b router
2025-04-11 14:04:59 +02:00
Edward Beeching
1b3bf043dc
Adds a E2B router server that executes batches of scripts (#561)
* adds a dedicated e2b server to handle batches of requests

* fix reward tests

* update slow reward

* style

* updates e2b router to be more generic

* refactor

* refactoring

* licence, cleanup

* update tests

* style

* fix import when e2b not present

* style

* rename sandbox file

* rename to RoutedSandbox

* update readme

* nits

* nits2

* unlimited max time

* update logs path
2025-04-07 21:01:06 +02:00
Guilherme Penedo
7835979801
adds support for running GRPO on IOI problems (#495)
* adds support for running GRPO on IOI problems

* nit

* bugfixes + recipe

* added piston info and readme changes

* readme updates

* run isort to fix checks

* Update src/open_r1/rewards.py

Co-authored-by: Edward Beeching <edbeeching@users.noreply.github.com>

* adding ioi test

* fix merge issues with python slow tests

* style

* generalize piston workers

* generalize readme

* fix extract code

* finalize slow tests

---------

Co-authored-by: Edward Beeching <edbeeching@users.noreply.github.com>
Co-authored-by: edbeeching <edbeeching@gmail.com>
2025-03-21 08:48:00 +01:00
Edward Beeching
5dcfae8979
Fixes bug with async code reward (#504)
* adds slow test for code reward

* fixes bug in setting language and the output parsing

* style

* removed redundant comment

* removed exception as e

* remove rewards

* removed whitespace

* more whitespace

* remove need for loop with asyncio.run

* nits

* fix type error with e2b AsyncSandbox
2025-03-13 22:54:15 +01:00