Commit graph

219 commits

Author SHA1 Message Date
albertan017
85b364bf09
Merge pull request #73 from BaiRiDreamer/main
Merge VERL RL training + BringUpBench evaluation pipeline
2026-02-12 11:02:03 +08:00
BaiRiDreamer
239cba2673 feat(sk2decompile): add BringUpBench evaluation pipeline and results
Integrate BringUpBench evaluation into sk2decompile/evaluation/bringupbench/,
corresponding to Section A.6 of the paper (arXiv:2509.22114).

BringUpBench is a benchmark suite of 90 self-contained C programs (505 functions,
O0-O3). SK2Decompile achieves a 42.3% compilation rate and a 27.0% re-executability
rate, compared with 23.6% and 21.7%, respectively, for IDA Pro.

Contents:
- scripts/: 5-step reproduction pipeline (compile, decompile, map, infer, eval)
- data/func_maps/: pre-built function-level mappings (source <-> pseudo <-> asm)
- data/infer_results/: SK2Decompile inference outputs for all opt levels
- reports/: per-opt-level evaluation result summaries (Markdown)
- config.env: template environment configuration
- README.md: comprehensive documentation with reproduction guide

Also updated sk2decompile/README.md to reference BringUpBench evaluation.
2026-02-12 00:02:25 +08:00
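
Note on the two metrics quoted in the commit above: the sketch below shows one way a compilation rate and a re-executability rate could be aggregated from per-function outcomes. It is an illustration only. The `FunctionResult` record, its field names, and the choice to divide both counts by the total number of functions are assumptions, not the format or definition used by the scripts in sk2decompile/evaluation/bringupbench/.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FunctionResult:
    """Hypothetical per-function outcome; not the pipeline's real schema."""
    name: str
    compiled: bool       # decompiled output compiled successfully
    reexecutable: bool   # recompiled program passed its tests


def summarize(results: List[FunctionResult]) -> Dict[str, float]:
    """Return both rates as fractions of all evaluated functions (assumed denominator)."""
    total = len(results)
    if total == 0:
        return {"compile_rate": 0.0, "reexec_rate": 0.0}
    compiled = sum(1 for r in results if r.compiled)
    # A function only counts as re-executable if it also compiled.
    reexec = sum(1 for r in results if r.compiled and r.reexecutable)
    return {"compile_rate": compiled / total, "reexec_rate": reexec / total}


if __name__ == "__main__":
    demo = [
        FunctionResult("f1", True, True),
        FunctionResult("f2", True, False),
        FunctionResult("f3", False, False),
        FunctionResult("f4", False, False),
    ]
    print(summarize(demo))
```

The reports/ directory mentioned in the commit holds the actual per-opt-level summaries; the paper (arXiv:2509.22114) is the reference for how each rate is defined.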
BaiRiDreamer
e33b3e7829 feat(sk2decompile): integrate VERL RL training pipeline into sk2decompile/verl/
Add the complete reinforcement learning (RL) training content for SK2Decompile,
based on the VERL framework v0.4.1 with the GRPO algorithm.

New files added:
- verl/SK2DECOMPILE/README.md: Detailed RL documentation including paper reward
  formulations (Eq.3 & Eq.4), step-by-step reproduction guide (VERL installation,
  reward function integration, data preparation, training launch), paper
  configurations table, hyperparameters, and troubleshooting
- verl/SK2DECOMPILE/reward_functions/: 4 reference reward implementations
  - exe_type.py: compilability + placeholder Jaccard (Structure Recovery, Eq.3)
  - sim_exe.py: compilability + word-level similarity (Structure Recovery)
  - embedding_gte.py: tree-sitter identifier extraction + GTE embedding cosine
    similarity (Identifier Naming, Eq.4)
  - embedding_qwen3.py: same as GTE variant but with Qwen3-Embedding-0.6B
- verl/SK2DECOMPILE/scripts/: 2 reference training scripts
  - run_struct_rl.sh: Structure Recovery RL training
  - run_ident_rl.sh: Identifier Naming RL training

Updated files:
- sk2decompile/README.md: Updated RL section to reference VERL v0.4.1, GRPO,
  paper Section 3.5, correct script paths, and link to detailed RL README

All reward functions and scripts are marked as reference implementations,
directing readers to the paper (arXiv:2509.22114) for precise formulations.
2026-02-11 21:47:19 +08:00
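
Note on the reward functions listed in the commit above: they are described as combining a compilability check with a similarity score (Jaccard over placeholders, or cosine similarity over identifier embeddings). The sketch below shows only the two generic similarity measures and one possible way to gate a score on compilability. The function names and the gating rule are hypothetical; the precise formulations are Eq.3 and Eq.4 of the paper (arXiv:2509.22114), and this is not the code in verl/SK2DECOMPILE/reward_functions/.

```python
import math
from typing import Iterable, Sequence


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity |A & B| / |A | B| between two token collections."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty sets are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # similarity is undefined for a zero vector; return 0
    return dot / (norm_u * norm_v)


def gated_reward(compiles: bool, similarity: float) -> float:
    """Hypothetical gating: no reward unless the output compiles."""
    return similarity if compiles else 0.0


if __name__ == "__main__":
    print(jaccard(["a", "b"], ["b", "c"]))
    print(cosine([1.0, 0.0], [0.0, 1.0]))
    print(gated_reward(False, 0.9))
```

In a real setup the embedding vectors would come from a model such as the GTE or Qwen3-Embedding-0.6B variants named in the commit; here they are plain lists of floats so the example stays self-contained.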
albertan017
1c164e21cd
fix model name error 2026-01-25 17:38:20 +08:00
albertan017
410e3d1f09
detailed usage instructions for sk2decompile models
Added detailed instructions for using sk2decompile, including installation steps, data preparation, and inference commands.
2026-01-25 17:32:29 +08:00
albertan017
2169186d5b
Update citation paper 2025-12-05 16:23:46 +08:00
albertan017
f5a6f43eae
Refactor section headings in README.md
Updated section headings in README.md for consistency and clarity.
2025-12-05 16:20:11 +08:00
albertan017
061b8e87cd
Update README.md 2025-12-05 16:17:52 +08:00
albertan017
86857f4f87
Update README with installation and usage instructions
Added installation instructions for vllm and clang-format.
2025-12-05 16:15:53 +08:00
albertan017
71b120a58e
inference 2025-10-16 23:32:08 +08:00
albertan017
b82f0e511e
Add files via upload
inference
2025-10-16 23:29:55 +08:00
albertan017
cff0c7c3fc
Update README.md 2025-10-15 20:52:04 +08:00
albertan017
357705de0d sk2decompile 2025-10-08 18:15:57 +08:00
albertan017
2098a01c8f
Update release date for SK²Decompile in README 2025-10-04 22:22:51 +08:00
albertan017
766f3ec79f
SK2Decompile 2025-10-04 22:22:24 +08:00
albertan017
cf685cdcbb
add docker support 2025-08-22 14:54:09 +08:00
albertan017
3eee3690d4
Merge pull request #61 from hanXen/main
feat: add Dockerfile for ready-to-use environment
2025-08-22 14:47:31 +08:00
hanxen
4001fbd45c build: pin dependencies in requirements-demo.txt 2025-08-22 13:08:35 +09:00
hanxen
473973cf79 build: update pytorch and cuda versions in dockerfile 2025-08-22 13:06:19 +09:00
hanxen
7ed839a8a2 feat: add Dockerfile for ready-to-use environment 2025-08-20 18:53:31 +09:00
albertan017
e0f3f65459
Merge pull request #55 from 7Sageer/main
Update README for LLaMA-Factory training example
2025-06-30 20:24:20 +08:00
7Sageer
6553713e41
docs: update README for LLaMA-Factory training example 2025-06-30 20:21:50 +08:00
albertan017
41bd69aae0
Update readability_template.txt 2025-06-20 17:36:41 +08:00
albertan017
36a2d57a3e
Merge pull request #50 from 7Sageer/main
fix: add missing training script
2025-06-12 19:52:58 +08:00
7Sageer
4c38c50505
fix: add missing training script 2025-06-12 19:51:16 +08:00
albertan017
f46ef2ae47
Update readme.md 2025-06-06 14:59:22 +08:00
albertan017
494e81735e
Update README.md 2025-05-29 15:10:07 +08:00
albertan017
717032a0b0
Merge pull request #48 from 7Sageer/main
Add LLaMA-Factory training example
2025-05-29 15:01:54 +08:00
Hanrui Qi
e30c73cb92 Add LLaMA-Factory training example with README, requirements, and dataset files 2025-05-29 12:57:28 +08:00
albertan017
99b209388a
Update readme.md 2025-05-23 16:02:31 +08:00
albertan017
cec9426c04
Update readme.md 2025-05-22 11:29:34 +08:00
albertan017
341371b124
Update readme.md 2025-05-22 11:28:02 +08:00
albertan017
6d6023770b
Update readme.md 2025-05-21 19:40:12 +08:00
albertan017
26b451f5ee
Update README.md 2025-05-21 19:36:00 +08:00
albertan017
bdabad7543
Update readme.md 2025-05-21 19:34:40 +08:00
albertan017
757b7a28fb legacy 2025-05-21 19:31:00 +08:00
albertan017
f91058ce48
Update readme.md 2025-05-21 19:26:43 +08:00
albertan017
16d6f4903d
Add files via upload 2025-05-21 19:25:04 +08:00
albertan017
54c59d79e4
Add files via upload 2025-05-21 19:24:45 +08:00
albertan017
e6551ae8c3 evaluation 2025-05-21 19:08:31 +08:00
albertan017
15c3a0785f upload mbpp 2025-05-21 17:33:59 +08:00
albertan017
9eb2fdcda2
Create readme.md 2025-05-21 17:07:54 +08:00
albertan017
0d663e528e
Add files via upload 2025-05-21 17:02:28 +08:00
albertan017
44f530d2fb
Add files via upload 2025-05-21 16:54:54 +08:00
albertan017
02f240f6bd
Add files via upload 2025-05-21 11:05:39 +08:00
albertan017
61265b5619
Create humaneval-decompile.json 2025-05-21 10:53:05 +08:00
albertan017
156dea65b5
Create cal_edit_sim.py 2025-05-21 10:52:07 +08:00
albertan017
1354ed4f6e
Create cal_execute_rate.py 2025-05-21 10:48:21 +08:00
albertan017
0a5aafca28
Update decompile-bench 2025-05-21 10:44:53 +08:00
albertan017
72d403f3d7
Update youtube link 2024-10-28 17:39:24 +08:00