mirror of
https://github.com/albertan017/LLM4Decompile.git
synced 2026-06-17 01:55:50 +00:00
feat(sk2decompile): add BringUpBench evaluation pipeline and results
Integrate BringUpBench evaluation into sk2decompile/evaluation/bringupbench/, corresponding to Section A.6 of the paper (arXiv:2509.22114). BringUpBench is a benchmark suite of 90 self-contained C programs (505 functions, O0-O3). SK2Decompile achieves 42.3% compilation rate and 27.0% re-executability rate, compared to IDA Pro's 23.6% / 21.7%. Contents: - scripts/: 5-step reproduction pipeline (compile, decompile, map, infer, eval) - data/func_maps/: pre-built function-level mappings (source <-> pseudo <-> asm) - data/infer_results/: SK2Decompile inference outputs for all opt levels - reports/: per-opt-level evaluation result summaries (Markdown) - config.env: template environment configuration - README.md: comprehensive documentation with reproduction guide Also updated sk2decompile/README.md to reference BringUpBench evaluation.
parent e33b3e7829
commit 239cba2673
22 changed files with 6313 additions and 0 deletions
@@ -41,6 +41,12 @@ SK2Decompile/
│   ├── scripts/            # Training launch scripts
│   └── README.md           # Detailed RL documentation
├── evaluation/             # Comprehensive evaluation suite
│   ├── bringupbench/       # BringUpBench evaluation (Section A.6)
│   │   ├── scripts/        # Pipeline scripts (compile, decompile, evaluate)
│   │   ├── data/           # Pre-built function maps and inference results
│   │   ├── reports/        # Evaluation result summaries
│   │   └── README.md       # Detailed BringUpBench documentation
│   └── ...                 # HumanEval, MBPP evaluation scripts
└── README.md               # This file
```

@@ -198,6 +204,12 @@ python gpt_judge.py --json_file your_json_file_path
--api_key your_openai_api_key
```

**BringUpBench Evaluation** (Section A.6 of the paper)

We also evaluate on [BringUpBench](https://github.com/toddmaustin/bringup-bench), a suite of 90 self-contained C programs with 505 functions across O0–O3. SK²Decompile achieves **42.3% compilation rate** and **27.0% re-executability rate**, compared to IDA Pro's 23.6% / 21.7%.

See [`evaluation/bringupbench/README.md`](evaluation/bringupbench/README.md) for the full reproduction pipeline, pre-built data, and detailed results.

## 📊 Results

Our approach achieves state-of-the-art performance:

249
sk2decompile/evaluation/bringupbench/README.md
Normal file
@@ -0,0 +1,249 @@
# SK²Decompile: Evaluation on BringUpBench

This directory contains the evaluation pipeline for SK²Decompile on the [BringUpBench](https://github.com/toddmaustin/bringup-bench) benchmark, as described in **Section A.6** of our paper:

> **SK²Decompile: LLM-based Two-Phase Binary Decompilation from Skeleton to Skin**
> [[arXiv:2509.22114]](https://arxiv.org/abs/2509.22114)

## Overview

[BringUpBench](https://github.com/toddmaustin/bringup-bench) (Austin, 2024) is a benchmark suite of **90 self-contained C programs** designed for bringing up newly designed CPUs, accelerators, compilers, and operating systems. It has **zero library dependencies**: all programs rely solely on a built-in `libmin` library and only 4 system calls, which makes it an ideal, standardized test bed for decompilation evaluation on complex, real-world binaries.

We compiled, decompiled, and executed all projects across optimization levels O0–O3, yielding **505 functions** in total. We compared SK²Decompile against the industry-standard rule-based decompiler, **IDA Pro** (Hex-Rays).

## Results

### SK²Decompile vs IDA Pro

| Opt Level | Functions | SK²Decompile Compilable | SK²Decompile Executable | IDA Compilable | IDA Executable |
|:---------:|:---------:|:-----------------------:|:-----------------------:|:--------------:|:--------------:|
| O0 | 382 | **50.26%** | **49.48%** | n/a | n/a |
| O1 | 379 | **40.90%** | **39.05%** | n/a | n/a |
| O2 | 368 | **37.77%** | **34.24%** | n/a | n/a |
| O3 | 359 | **31.75%** | **29.53%** | n/a | n/a |
| **Avg** | **1488** | **42.3%** | **27.0%** | **23.6%** | **21.7%** |

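> The **Avg** row reports the paper's aggregate numbers (Table 8 in Section A.6); its rates differ from the mean of the per-level rows above, and its Functions cell is the sum of the four rows. Per-opt-level IDA baselines are not separately reported in the paper. Detailed per-benchmark breakdowns are available in `reports/`.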

## Directory Structure

```
bringupbench/
├── README.md                          # This file
├── config.env                         # Environment configuration (paths)
├── scripts/
│   ├── build-host-opt-levels.sh       # Step 1: Compile benchmarks at O0-O3
│   ├── decompile-all-pseudo.sh        # Step 2: IDA Pro batch decompilation
│   ├── dump_pseudo.py                 # IDA headless decompilation helper
│   ├── disasm-all-objdump.sh          # Step 3 (optional): objdump batch disassembly
│   ├── build-func-maps.py             # Step 3: Build function-level mappings
│   ├── clean-all-benchmarks.sh        # Utility: clean all build artifacts
│   └── eval_infer_out.py              # Step 5: Automated evaluation
├── data/
│   ├── func_maps/                     # Pre-built function mappings (JSONL)
│   │   ├── merged.O0.func_map.jsonl   # O0: 493 functions
│   │   ├── merged.O1.func_map.jsonl   # O1: 449 functions
│   │   ├── merged.O2.func_map.jsonl   # O2: 441 functions
│   │   └── merged.O3.func_map.jsonl   # O3: 439 functions
│   └── infer_results/                 # SK²Decompile inference results
│       ├── merged.O0.func_map.infer.jsonl   # O0: 382 evaluated functions
│       ├── merged.O1.func_map.infer.jsonl   # O1: 379 evaluated functions
│       ├── merged.O2.func_map.infer.jsonl   # O2: 368 evaluated functions
│       └── merged.O3.func_map.infer.jsonl   # O3: 359 evaluated functions
└── reports/                           # Evaluation result summaries
    ├── O0_results.md
    ├── O1_results.md
    ├── O2_results.md
    └── O3_results.md
```

## Reproduction Pipeline

Our evaluation pipeline consists of five steps, as described in the paper:

```
Source (.c)
  │
  ▼  Step 1: Compilation
Binary (.host.O0 ~ .host.O3)
  │
  ├──▶ Step 2: Baseline Extraction (IDA Pro) ──▶ Pseudocode (.pseudo)
  │
  ├──▶ Step 3: Ground Truth Mapping ──▶ Function Maps (.func_map.jsonl)
  │
  ▼  Step 4: Decompilation (SK²Decompile)
Inferred C code (.func_map.infer.jsonl)
  │
  ▼  Step 5: Validation
Evaluation Reports (reports/)
```

### Prerequisites

| Dependency | Purpose | Installation |
|------------|---------|-------------|
| [Bringup-Bench](https://github.com/toddmaustin/bringup-bench) | Upstream benchmark suite (90 C programs) | `git clone https://github.com/toddmaustin/bringup-bench.git` |
| GCC | Compile benchmarks | `apt install gcc` |
| IDA Pro + Hex-Rays | Decompile binaries to pseudocode | Commercial software |
| objdump (binutils) | Disassemble binaries | `apt install binutils` |
| clang-format | Pseudocode normalization | `apt install clang-format` |
| Python >= 3.10 | Run evaluation scripts | `apt install python3` |

### Quick Start (Evaluation Only)

If you only want to reproduce the evaluation step (Step 5), the pre-built data is included in `data/`. You only need the Bringup-Bench source repository:

```bash
# 1. Clone Bringup-Bench
git clone https://github.com/toddmaustin/bringup-bench.git

# 2. Configure paths (in this evaluation directory, not the bringup-bench clone)
cd bringupbench
vim config.env # Set BENCH_REPO_ROOT to your bringup-bench path

# 3. Run evaluation (e.g., O0)
python3 scripts/eval_infer_out.py data/infer_results/merged.O0.func_map.infer.jsonl

# 4. Check results
cat reports/O0_results.md
```

### Full Pipeline (From Scratch)

To reproduce the entire pipeline from compilation to evaluation:

```bash
cd bringupbench
vim config.env # Set BENCH_REPO_ROOT and IDA_BIN
```

**Step 1: Compile benchmarks at O0–O3**

Build all 90 Bringup-Bench programs at four optimization levels, producing `<name>.host.O{0,1,2,3}` binaries.

```bash
scripts/build-host-opt-levels.sh
```

**Step 2: Baseline Extraction (IDA Pro)**

Use IDA Pro in headless mode to decompile all binaries, producing `.pseudo` files with Hex-Rays pseudocode.

```bash
scripts/decompile-all-pseudo.sh
```

Each function is delimited by `/* function_name @ 0xADDRESS */` in the output.
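
As a rough illustration of how that delimiter can be used, the sketch below splits a `.pseudo` dump into per-function records. The regular expression, the `split_pseudo` helper, and the sample text are assumptions made for this example; `build-func-maps.py` may parse the file differently.

```python
import re

# Delimiter format as described above: /* function_name @ 0xADDRESS */
DELIM = re.compile(r"/\*\s*(?P<name>\S+)\s*@\s*(?P<addr>0x[0-9A-Fa-f]+)\s*\*/")


def split_pseudo(text: str) -> list[dict]:
    """Split a .pseudo dump into one record per delimited function."""
    matches = list(DELIM.finditer(text))
    records = []
    for i, m in enumerate(matches):
        # A function body runs until the next delimiter or the end of the file.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        records.append({
            "function_name": m.group("name"),
            "address": m.group("addr").lower(),
            "content": text[m.end():end].strip() + "\n",
        })
    return records


# Hypothetical two-function dump, only for demonstration.
sample = (
    "/* ackermann @ 0x11E9 */\n"
    "__int64 __fastcall ackermann(int a1, int a2) { return 0; }\n"
    "/* main @ 0x13B9 */\n"
    "int main(void) { return 0; }\n"
)

for rec in split_pseudo(sample):
    print(rec["function_name"], rec["address"])
```

The record keys mirror the `pseudo` object shown under Data Format, so the output of such a parser could feed the mapping step directly.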

**Step 3: Ground Truth Mapping**

Parse source code, pseudocode, and assembly; match functions by name across all three representations; normalize pseudocode (remove IDA-specific types, hex-to-decimal conversion, clang-format).

```bash
# Disassemble (optional, for assembly mapping)
scripts/disasm-all-objdump.sh

# Build function-level mappings
python3 scripts/build-func-maps.py
```

Output: per-binary `.func_map.jsonl` files. Merge them per optimization level:

```bash
cat $BENCH_REPO_ROOT/*/*.host.O0.func_map.jsonl > data/func_maps/merged.O0.func_map.jsonl
cat $BENCH_REPO_ROOT/*/*.host.O1.func_map.jsonl > data/func_maps/merged.O1.func_map.jsonl
cat $BENCH_REPO_ROOT/*/*.host.O2.func_map.jsonl > data/func_maps/merged.O2.func_map.jsonl
cat $BENCH_REPO_ROOT/*/*.host.O3.func_map.jsonl > data/func_maps/merged.O3.func_map.jsonl
```
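
The pseudocode normalization mentioned in Step 3 (dropping IDA-specific types and converting hex literals to decimal) might look roughly like the sketch below. The token list and the type mapping are hypothetical and chosen only for illustration; the actual rules live in `build-func-maps.py` and may differ. The final `clang-format` pass is left out here.

```python
import re

# Hypothetical set of calling-convention keywords to strip.
IDA_TOKENS = re.compile(r"\b(__fastcall|__cdecl|__stdcall)\b\s*")
# Hypothetical mapping from IDA type spellings to plain C types.
TYPE_MAP = {
    "__int64": "long long",
    "_DWORD": "unsigned int",
    "_BYTE": "unsigned char",
}
# Hex literal with an optional integer suffix such as u, L, or LL.
HEX_LITERAL = re.compile(r"\b0[xX]([0-9A-Fa-f]+)(?P<suffix>[uUlL]*)\b")


def normalize_pseudo(code: str) -> str:
    """Strip IDA-specific keywords, map IDA types, and rewrite hex literals as decimal."""
    code = IDA_TOKENS.sub("", code)
    for ida_type, c_type in TYPE_MAP.items():
        code = re.sub(rf"\b{re.escape(ida_type)}\b", c_type, code)
    return HEX_LITERAL.sub(
        lambda m: str(int(m.group(1), 16)) + m.group("suffix"), code
    )


print(normalize_pseudo("__int64 __fastcall f(__int64 a1) { return a1 + 0x1F; }"))
```

Keeping the integer suffix intact avoids changing the type of a literal while its spelling changes.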

**Step 4: Decompilation (SK²Decompile Inference)**

Feed the `pseudo_normalize` field from the function maps to SK²Decompile. The two-phase inference pipeline (see `../sk2decompile_inf.py`) produces C code for each function. Results should be written into the JSONL with the `pseudo.content-fix` field containing the final decompiled function body.

```bash
# Example: use the main SK²Decompile inference pipeline
cd ../ # back to sk2decompile/evaluation/
python3 sk2decompile_inf.py \
    --dataset_path bringupbench/data/func_maps/merged.O0.func_map.jsonl \
    --model_path LLM4Binary/sk2decompile-struct-6.7b \
    --recover_model_path LLM4Binary/sk2decompile-ident-6.7b
```

**Step 5: Validation**

For each function, replace the original source with the decompiled output, rebuild in an isolated workspace, and run the project's test suite.

```bash
python3 scripts/eval_infer_out.py data/infer_results/merged.O0.func_map.infer.jsonl \
    --jobs 16 \
    --command-timeout 20
```

Common options:

```bash
--jobs N # Parallel workers (default: 96)
--command-timeout S # Timeout per make command in seconds (default: 20)
--limit N # Process only first N cases (for debugging)
--keep-workspaces # Keep temporary build directories
```
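
To make the "isolated workspace" idea in Step 5 concrete, here is a minimal sketch that copies one benchmark directory into a temporary location and runs a sequence of commands there with a per-command timeout. `run_in_workspace` and its parameters are hypothetical names; the sketch copies only a single directory, whereas a real build presumably also needs the shared Makefile and `libmin` sources, so treat it as an outline rather than what `eval_infer_out.py` does.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_in_workspace(bench_dir: Path, commands: list[list[str]],
                     timeout: float = 20.0, keep: bool = False) -> list[bool]:
    """Run commands in a throwaway copy of bench_dir; stop at the first failure."""
    workspace = Path(tempfile.mkdtemp(prefix="bringup-eval-"))
    target = workspace / bench_dir.name
    shutil.copytree(bench_dir, target)
    results = []
    try:
        for cmd in commands:
            try:
                proc = subprocess.run(cmd, cwd=target, capture_output=True,
                                      timeout=timeout)
                ok = proc.returncode == 0
            except subprocess.TimeoutExpired:
                # A command that exceeds the timeout counts as a failure.
                ok = False
            results.append(ok)
            if not ok:
                break
    finally:
        if not keep:
            shutil.rmtree(workspace, ignore_errors=True)
    return results
```

With the `make build` and `make test` commands named under Evaluation Metrics, a call could look like `run_in_workspace(bench_dir, [["make", "build"], ["make", "test"]])`; whether extra make variables are required is not stated here. The `timeout` and `keep` parameters play the roles of `--command-timeout` and `--keep-workspaces`.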

## Data Format

### func_map.jsonl (Function Mappings)

Each line is a JSON object containing the source, pseudocode, and assembly for one function:

```jsonc
{
  "source": {
    "path": "ackermann/ackermann.c", // Source file (relative to BENCH_REPO_ROOT)
    "function_name": "ackermann", // Function name
    "content": "int ackermann(int m, ...) { ... }\n" // Complete function body
  },
  "pseudo": {
    "path": "ackermann/ackermann.host.O0.pseudo",
    "function_name": "ackermann",
    "address": "0x11e9", // Function address in binary
    "label": "ackermann",
    "content": "__int64 __fastcall ackermann(...) { ... }\n" // Raw IDA pseudocode
  },
  "pseudo_normalize": "int ackermann(...) { ... }", // Normalized pseudocode
  "binary": "ackermann/ackermann.host.O0", // Binary file path
  "assembly": "<ackermann>:\npush %rbp\n..." // Cleaned objdump output
}
```
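
A short sketch of reading such a file may help. It assumes the `//` annotations above are documentation only and that each line of the real file is plain JSON; `load_func_map` and `count_by_benchmark` are hypothetical helper names.

```python
import json
from pathlib import Path


def load_func_map(path: Path) -> list[dict]:
    """Read a JSONL function map, skipping blank lines."""
    records = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def count_by_benchmark(records: list[dict]) -> dict[str, int]:
    """Count functions per benchmark, taken as the first component of source.path."""
    counts: dict[str, int] = {}
    for rec in records:
        bench = rec["source"]["path"].split("/", 1)[0]
        counts[bench] = counts.get(bench, 0) + 1
    return counts
```

For example, `count_by_benchmark(load_func_map(Path("data/func_maps/merged.O0.func_map.jsonl")))` would give a per-benchmark function count for the O0 map.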

### func_map.infer.jsonl (Inference Results)

Extends `func_map.jsonl` with SK²Decompile inference outputs:

```jsonc
{
  // ... all fields from func_map.jsonl ...
  "pseudo": {
    // ... all fields above, plus:
    "content-fix": "..." // Final decompiled function (used for source replacement)
  },
  "infer-out-model1": "...", // Phase 1 (Structure Recovery) raw output
  "infer-out-model2": "...", // Phase 2 (Identifier Naming) raw output
  "pseudo_normalize-fix": "..." // Corrected normalized pseudocode
}
```
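
One way the `content-fix` field could drive source replacement is a verbatim match on `source.content`, sketched below. This exact-match strategy and the `replace_function` helper are assumptions for illustration; the evaluation script may locate functions in another way.

```python
from typing import Optional


def replace_function(source_text: str, record: dict) -> Optional[str]:
    """Return source_text with one function swapped for its decompiled version.

    Returns None when the swap cannot be done, which would count against
    the replacement rate.
    """
    original = record["source"]["content"]
    decompiled = record["pseudo"].get("content-fix")
    # Assumption: require exactly one verbatim match so no other span is touched.
    if not decompiled or source_text.count(original) != 1:
        return None
    return source_text.replace(original, decompiled, 1)


# Hypothetical record and source file, only for demonstration.
source_text = (
    "int add(int a, int b) { return a + b; }\n"
    "\n"
    "int main(void) { return add(1, 2); }\n"
)
record = {
    "source": {"content": "int add(int a, int b) { return a + b; }\n"},
    "pseudo": {"content-fix": "int add(int a, int b) { return b + a; }\n"},
}

print(replace_function(source_text, record))
```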

## Evaluation Metrics

| Metric | Definition |
|--------|-----------|
| **Replacement Rate** | Fraction of functions where the decompiled output can be located and substituted into the original source file |
| **Compilable Rate** | Fraction of functions where the modified source compiles successfully (`make build`) |
| **Executable Rate** | Fraction of functions where the compiled program passes its test suite (`make test`, output matches reference) |

The evaluation uses BringUpBench's own build infrastructure (`Makefile`, `libmin`, `libtarg`) to compile and validate. Each function is tested in an isolated workspace to prevent cross-contamination.
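
The three rates can be computed from per-function outcomes as in the sketch below. The `Outcome` class and `rates` function are hypothetical; the denominator is the total number of cases, which matches how the O0 report states its percentages (192 of 382 compilable, 189 of 382 executable).

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    replaced: bool   # decompiled output was substituted into the source
    compiled: bool   # modified source built successfully
    executed: bool   # built program passed its test suite


def rates(outcomes: list[Outcome]) -> dict[str, float]:
    """Return replacement, compilable, and executable rates as percentages."""
    total = len(outcomes)

    def pct(count: int) -> float:
        return round(100.0 * count / total, 2) if total else 0.0

    replaced = sum(o.replaced for o in outcomes)
    compiled = sum(o.replaced and o.compiled for o in outcomes)
    executed = sum(o.replaced and o.compiled and o.executed for o in outcomes)
    return {
        "replacement": pct(replaced),
        "compilable": pct(compiled),
        "executable": pct(executed),
    }


# Outcome counts taken from the O0 report summary: 382 cases, 192 build, 189 pass.
o0 = (
    [Outcome(True, True, True)] * 189
    + [Outcome(True, True, False)] * 3
    + [Outcome(True, False, False)] * 190
)
print(rates(o0))
```

Each stage is counted only when the previous one succeeded, so the executable rate can never exceed the compilable rate.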

## Notes

- BringUpBench programs are self-contained with zero external dependencies, making them ideal for evaluating decompilation without the confounding factor of missing headers or libraries.
- The `func_maps/` data contains more functions than `infer_results/` because some functions are filtered during inference (e.g., exceeding token limits).
- All scripts load paths from `config.env`. You can also override them via environment variables or CLI arguments (priority: CLI > env > config.env).
- For the complete SK²Decompile methodology and other benchmark results (HumanEval, MBPP, ExeBench, GitHub2025), see the [main README](../../README.md).
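
The override priority described in the notes (CLI > env > config.env) can be sketched as follows. `parse_env_file` and `resolve` are hypothetical helpers written for this example and are not taken from the repository's scripts.

```python
def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve(name, cli_value, env, file_values):
    """Pick a setting with priority: CLI argument, then environment, then config file."""
    if cli_value is not None:
        return cli_value
    if name in env:
        return env[name]
    return file_values.get(name)


config_text = """\
# Absolute path to the Bringup-Bench repository
BENCH_REPO_ROOT=/path/to/bringup-bench
DEFAULT_TARGET=host
"""
cfg = parse_env_file(config_text)
print(resolve("DEFAULT_TARGET", None, {}, cfg))
```

In a real script `env` would typically be `os.environ` and `cli_value` would come from `argparse`, with `None` meaning the flag was not given.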
14
sk2decompile/evaluation/bringupbench/config.env
Normal file
@@ -0,0 +1,14 @@
# BringUpBench Evaluation: Environment Configuration
# All scripts resolve paths from this file.
# Values can be overridden by same-named environment variables or CLI arguments.
# Priority: CLI args > environment variables > config.env

# Absolute path to the Bringup-Bench repository
# Clone from: https://github.com/toddmaustin/bringup-bench.git
BENCH_REPO_ROOT=/path/to/bringup-bench

# IDA Pro command-line executable (required for Step 2: decompilation)
IDA_BIN=/path/to/idat

# Default build target (host = native x86-64 Linux)
DEFAULT_TARGET=host
File diff suppressed because one or more lines are too long (8 files)
296
sk2decompile/evaluation/bringupbench/reports/O0_results.md
Normal file
@@ -0,0 +1,296 @@
# Infer-Out Model 2 Evaluation (merged.O0.func_map.infer-host)

- Timestamp: 20251119-171008
- Source JSONL: merged.O0.func_map.infer.jsonl
- Target: host
- Total cases: 382
- Replacement success: 382 (100.00%)
- Compilable: 192 (50.26%)
- Executable: 189 (49.48%)

## Benchmark Breakdown
| Benchmark | Cases | Replacement% | Build% | Exec% |
| --- | --- | --- | --- | --- |
| ackermann | 2 | 100.00% | 50.00% | 50.00% |
| aes | 9 | 100.00% | 33.33% | 33.33% |
| anagram | 12 | 100.00% | 58.33% | 58.33% |
| audio-codec | 4 | 100.00% | 50.00% | 50.00% |
| avl-tree | 14 | 100.00% | 35.71% | 35.71% |
| banner | 1 | 100.00% | 0.00% | 0.00% |
| bit-kernels | 5 | 100.00% | 100.00% | 100.00% |
| blake2b | 6 | 100.00% | 16.67% | 16.67% |
| bloom-filter | 3 | 100.00% | 33.33% | 33.33% |
| boyer-moore-search | 3 | 100.00% | 0.00% | 0.00% |
| bubble-sort | 2 | 100.00% | 100.00% | 100.00% |
| c-interp | 10 | 100.00% | 70.00% | 70.00% |
| ccmac | 2 | 100.00% | 50.00% | 50.00% |
| checkers | 15 | 100.00% | 80.00% | 80.00% |
| cipher | 3 | 100.00% | 33.33% | 33.33% |
| congrad | 6 | 100.00% | 66.67% | 66.67% |
| connect4-minimax | 13 | 100.00% | 61.54% | 61.54% |
| convex-hull | 4 | 100.00% | 75.00% | 75.00% |
| dhrystone | 5 | 100.00% | 60.00% | 60.00% |
| distinctness | 2 | 100.00% | 0.00% | 0.00% |
| fft-int | 4 | 100.00% | 50.00% | 50.00% |
| flood-fill | 2 | 100.00% | 50.00% | 50.00% |
| frac-calc | 10 | 100.00% | 60.00% | 60.00% |
| fuzzy-match | 4 | 100.00% | 25.00% | 25.00% |
| fy-shuffle | 4 | 100.00% | 50.00% | 50.00% |
| gcd-list | 2 | 100.00% | 0.00% | 0.00% |
| grad-descent | 4 | 100.00% | 75.00% | 75.00% |
| graph-tests | 19 | 100.00% | 21.05% | 21.05% |
| hanoi | 2 | 100.00% | 50.00% | 50.00% |
| heapsort | 2 | 100.00% | 50.00% | 50.00% |
| heat-calc | 1 | 100.00% | 0.00% | 0.00% |
| huff-encode | 12 | 100.00% | 91.67% | 91.67% |
| idct-alg | 4 | 100.00% | 50.00% | 50.00% |
| indirect-test | 2 | 100.00% | 50.00% | 50.00% |
| k-means | 6 | 100.00% | 100.00% | 100.00% |
| kadane | 2 | 100.00% | 50.00% | 50.00% |
| kepler | 7 | 100.00% | 28.57% | 28.57% |
| knapsack | 3 | 100.00% | 33.33% | 33.33% |
| knights-tour | 3 | 100.00% | 66.67% | 66.67% |
| life | 14 | 100.00% | 78.57% | 71.43% |
| longdiv | 7 | 100.00% | 71.43% | 71.43% |
| lu-decomp | 3 | 100.00% | 33.33% | 33.33% |
| lz-compress | 2 | 100.00% | 100.00% | 100.00% |
| mandelbrot | 1 | 100.00% | 0.00% | 0.00% |
| matmult | 1 | 100.00% | 0.00% | 0.00% |
| max-subseq | 2 | 100.00% | 0.00% | 0.00% |
| mersenne | 3 | 100.00% | 0.00% | 0.00% |
| minspan | 8 | 100.00% | 62.50% | 62.50% |
| monte-carlo | 1 | 100.00% | 0.00% | 0.00% |
| murmur-hash | 2 | 100.00% | 0.00% | 0.00% |
| n-queens | 3 | 100.00% | 66.67% | 66.67% |
| natlog | 1 | 100.00% | 0.00% | 0.00% |
| nbody-sim | 1 | 100.00% | 0.00% | 0.00% |
| nr-solver | 1 | 100.00% | 100.00% | 100.00% |
| packet-filter | 3 | 100.00% | 33.33% | 33.33% |
| parrondo | 3 | 100.00% | 33.33% | 33.33% |
| pascal | 3 | 100.00% | 100.00% | 100.00% |
| pi-calc | 1 | 100.00% | 0.00% | 0.00% |
| primal-test | 3 | 100.00% | 0.00% | 0.00% |
| priority-queue | 5 | 100.00% | 80.00% | 80.00% |
| qsort-demo | 5 | 100.00% | 0.00% | 0.00% |
| qsort-test | 3 | 100.00% | 66.67% | 66.67% |
| quaternions | 4 | 100.00% | 0.00% | 0.00% |
| rabinkarp-search | 2 | 100.00% | 50.00% | 50.00% |
| rand-test | 2 | 100.00% | 0.00% | 0.00% |
| ransac | 2 | 100.00% | 50.00% | 50.00% |
| regex-parser | 11 | 100.00% | 72.73% | 63.64% |
| rho-factor | 4 | 100.00% | 75.00% | 75.00% |
| rle-compress | 2 | 100.00% | 50.00% | 50.00% |
| rsa-cipher | 4 | 100.00% | 0.00% | 0.00% |
| sat-solver | 5 | 100.00% | 60.00% | 60.00% |
| shortest-path | 3 | 100.00% | 66.67% | 66.67% |
| sieve | 2 | 100.00% | 50.00% | 50.00% |
| simple-grep | 1 | 100.00% | 0.00% | 0.00% |
| spelt2num | 1 | 100.00% | 0.00% | 0.00% |
| spirograph | 2 | 100.00% | 50.00% | 50.00% |
| sudoku-solver | 4 | 100.00% | 75.00% | 75.00% |
| tetris-sim | 12 | 100.00% | 75.00% | 75.00% |
| tiny-NN | 2 | 100.00% | 50.00% | 50.00% |
| topo-sort | 7 | 100.00% | 0.00% | 0.00% |
| totient | 4 | 100.00% | 75.00% | 75.00% |
| transcend | 3 | 100.00% | 66.67% | 66.67% |
| uniquify | 1 | 100.00% | 0.00% | 0.00% |
| vectors-3d | 8 | 100.00% | 12.50% | 12.50% |
| verlet | 4 | 100.00% | 25.00% | 0.00% |
| weekday | 2 | 100.00% | 0.00% | 0.00% |

## Compilation Failures
- ackermann/ackermann.c::main@0x13b9
- aes/aes.c::aes_decrypt@0x1a65
- aes/aes.c::aes_encrypt@0x1943
- aes/aes.c::inv_shift_rows@0x1396
- aes/aes.c::key_expansion@0x179a
- aes/aes.c::main@0x1b87
- aes/aes.c::shift_rows@0x12e5
- anagram/anagram.c::BuildMask@0x13e7
- anagram/anagram.c::BuildWord@0x17e5
- anagram/anagram.c::FindAnagram@0x1ba6
- anagram/anagram.c::ReadDict@0x121f
- anagram/anagram.c::main@0x1f71
- audio-codec/audio-codec.c::decode@0x12f5
- audio-codec/audio-codec.c::main@0x14b3
- avl-tree/avlcore.c::DeleteByElement@0x240f
- avl-tree/avlcore.c::DeleteByElementRecursive@0x21af
- avl-tree/avlcore.c::DeleteLeftMost@0x2086
- avl-tree/avlcore.c::FindByElement@0x1a46
- avl-tree/avlcore.c::Height@0x2475
- avl-tree/avlcore.c::Insert@0x1fc4
- avl-tree/avlcore.c::SingleLeftRotation@0x1b3a
- avl-tree/avl-tree.c::main@0x1399
- avl-tree/avl-tree.c::printTree@0x11e9
- banner/banner.c::main@0x11e9
- blake2b/blake2b.c::BLAKE2B@0x1a9b
- blake2b/blake2b.c::F@0x1502
- blake2b/blake2b.c::G@0x1258
- blake2b/blake2b.c::blake2b@0x1cd3
- blake2b/blake2b.c::test@0x2071
- bloom-filter/bloom-filter.c::bad_search@0x11e9
- bloom-filter/bloom-filter.c::main@0x123d
- boyer-moore-search/boyer-moore-search.c::badCharHeuristic@0x11e9
- boyer-moore-search/boyer-moore-search.c::main@0x146d
- boyer-moore-search/boyer-moore-search.c::search@0x126d
- c-interp/c-interp.c::eval@0x457c
- c-interp/c-interp.c::main@0x4e03
- c-interp/c-interp.c::next@0x11e9
- ccmac/ccmac.c::main@0x127e
- checkers/functions.c::fill_print_initial@0x1793
- checkers/functions.c::generate_node_children@0x29ff
- checkers/checkers.c::main@0x11e9
- cipher/cipher.c::encipher@0x11e9
- cipher/cipher.c::main@0x13cd
- congrad/congrad.c::cg_solve@0x1643
- congrad/congrad.c::main@0x199b
- connect4-minimax/connect4-minimax.c::init_board@0x11e9
- connect4-minimax/connect4-minimax.c::main@0x2299
- connect4-minimax/connect4-minimax.c::minimax@0x1d07
- connect4-minimax/connect4-minimax.c::play_game@0x20d1
- connect4-minimax/connect4-minimax.c::score_position@0x1a02
- convex-hull/convex-hull.c::main@0x13e7
- dhrystone/dhrystone.c::Proc_1@0x199f
- dhrystone/dhrystone.c::main@0x11e9
- distinctness/distinctness.c::isDistinct@0x11e9
- distinctness/distinctness.c::main@0x15d8
- fft-int/fft-int.c::db_from_ampl@0x1807
- fft-int/fft-int.c::fix_fft@0x11e9
- flood-fill/flood-fill.c::main@0x144d
- frac-calc/frac-calc.c::copyr@0x14d4
- frac-calc/frac-calc.c::divtokens@0x15b8
- frac-calc/frac-calc.c::help@0x13d9
- frac-calc/frac-calc.c::main@0x11e9
- fuzzy-match/fuzzy-match.c::compute_score@0x2379
- fuzzy-match/fuzzy-match.c::fuzzy_match_recurse@0x2283
- fuzzy-match/fuzzy-match.c::main@0x24b3
- fy-shuffle/fy-shuffle.c::main@0x1378
- fy-shuffle/fy-shuffle.c::rand_int@0x11e9
- gcd-list/gcd-list.c::gcd@0x11e9
- gcd-list/gcd-list.c::main@0x125e
- grad-descent/grad-descent.c::main@0x1413
- graph-tests/graph-tests.c::addEdge@0x12c9
- graph-tests/graph-tests.c::addVertex@0x19f6
- graph-tests/graph-tests.c::bfs@0x15ce
- graph-tests/graph-tests.c::bfs_test@0x16e9
- graph-tests/graph-tests.c::bubbleSort@0x1829
- graph-tests/graph-tests.c::createGraph@0x1221
- graph-tests/graph-tests.c::createNode@0x11e9
- graph-tests/graph-tests.c::createQueue@0x1372
- graph-tests/graph-tests.c::dequeue@0x145d
- graph-tests/graph-tests.c::enqueue@0x13d7
- graph-tests/graph-tests.c::insertAtTheBegin@0x17b1
- graph-tests/graph-tests.c::link_list@0x18b8
- graph-tests/graph-tests.c::main@0x1d6c
- graph-tests/graph-tests.c::printQueue@0x151b
- graph-tests/graph-tests.c::swap@0x17f8
- hanoi/hanoi.c::main@0x12d4
- heapsort/heapsort.c::main@0x155f
- heat-calc/heat-calc.c::main@0x11e9
- huff-encode/huff-encode.c::main@0x192d
- idct-alg/idct-alg.c::C@0x11e9
- idct-alg/idct-alg.c::main@0x1472
- indirect-test/indirect-test.c::main@0x12c9
- kadane/kadane.c::main@0x1276
- kepler/kepler.c::bin_fact@0x1b3e
- kepler/kepler.c::binary@0x12c6
- kepler/kepler.c::e_series@0x1389
- kepler/kepler.c::j_series@0x1501
- kepler/kepler.c::main@0x1608
- knapsack/knapsack.c::main@0x138e
- knapsack/knapsack.c::max@0x11e9
- knights-tour/knights-tour.c::solveKT@0x12d6
- life/life.c::getNumNeigbors@0x156f
- life/life.c::main@0x11e9
- life/life.c::process@0x1426
- longdiv/longdiv.c::main@0x18fd
- longdiv/longdiv.c::sub@0x11e9
- lu-decomp/lu-decomp.c::main@0x1520
- lu-decomp/lu-decomp.c::print_matrix@0x11e9
- mandelbrot/mandelbrot.c::main@0x1220
- matmult/matmult.c::main@0x11e9
- max-subseq/max-subseq.c::lcsAlgo@0x11e9
- max-subseq/max-subseq.c::main@0x171a
- mersenne/mersenne.c::genrand@0x12ee
- mersenne/mersenne.c::main@0x153a
- mersenne/mersenne.c::sgenrand@0x11e9
- minspan/minspan.c::displayPath@0x1af2
- minspan/minspan.c::main@0x1d8f
- minspan/minspan.c::minSpanTree@0x1297
- monte-carlo/monte-carlo.c::main@0x11e9
- murmur-hash/murmur-hash.c::main@0x13a9
- murmur-hash/murmur-hash.c::murmurhash@0x11e9
- n-queens/n-queens.c::main@0x12ec
- natlog/natlog.c::main@0x11e9
- nbody-sim/nbody-sim.c::main@0x11e9
- packet-filter/packet-filter.c::generate_packet@0x11e9
- packet-filter/packet-filter.c::main@0x14c3
- parrondo/parrondo.c::cointoss@0x11e9
- parrondo/parrondo.c::main@0x12cb
- pi-calc/pi-calc.c::main@0x11e9
- primal-test/primal-test.c::main@0x1459
- primal-test/primal-test.c::miller_rabin_int@0x12fd
- primal-test/primal-test.c::powm@0x11e9
- priority-queue/priority-queue.c::main@0x13ee
- qsort-demo/qsort-demo.c::main@0x17bf
- qsort-demo/qsort-demo.c::print_struct_array@0x155e
- qsort-demo/qsort-demo.c::sort_cstrings_example@0x1401
- qsort-demo/qsort-demo.c::sort_integers_example@0x1280
- qsort-demo/qsort-demo.c::sort_structs_example@0x1603
- qsort-test/qsort-test.c::main@0x1415
- quaternions/quaternions.c::euler_from_quat@0x1447
- quaternions/quaternions.c::quat_from_euler@0x11e9
- quaternions/quaternions.c::quaternion_multiply@0x1655
- quaternions/quaternions.c::test@0x18b2
- rabinkarp-search/rabinkarp-search.c::main@0x1341
- rand-test/rand-test.c::main@0x1913
- rand-test/rand-test.c::run_tests@0x1258
- ransac/ransac.c::main@0x1466
- regex-parser/regex-parser.c::main@0x32b9
- regex-parser/regex-parser.c::re_compile@0x22e1
- regex-parser/regex-parser.c::re_print@0x278f
- rho-factor/rho-factor.c::main@0x5c7d
- rle-compress/rle-compress.c::run_length_encode@0x11e9
- rsa-cipher/rsa-cipher.c::main@0x1634
- rsa-cipher/rsa-cipher.c::mod_inverse@0x1363
- rsa-cipher/rsa-cipher.c::mod_pow@0x11e9
- rsa-cipher/rsa-cipher.c::print_hex_int128@0x14ef
- sat-solver/sat-solver.c::main@0x1518
- sat-solver/sat-solver.c::printFormula@0x1391
- shortest-path/shortest-path.c::main@0x1469
- sieve/sieve.c::main@0x1300
- simple-grep/simple-grep.c::main@0x11e9
- spelt2num/spelt2num.c::main@0x11e9
- spirograph/spirograph.c::spirograph@0x11e9
- sudoku-solver/sudoku-solver.c::main@0x1532
- tetris-sim/tetris-sim.c::best_move@0x1810
- tetris-sim/tetris-sim.c::evaluate_board@0x1686
- tetris-sim/tetris-sim.c::main@0x1ba5
- tiny-NN/tiny-NN.c::train@0x1485
- topo-sort/topo-sort.c::addEdge@0x12cf
- topo-sort/topo-sort.c::createGraph@0x1259
- topo-sort/topo-sort.c::createListNode@0x1221
- topo-sort/topo-sort.c::createStackNode@0x11e9
- topo-sort/topo-sort.c::main@0x153d
- topo-sort/topo-sort.c::topologicalSort@0x13fd
- topo-sort/topo-sort.c::topologicalSortUtil@0x1332
- totient/totient.c::my_gcd@0x11e9
- transcend/transcend.c::init_inputs_f64@0x1235
- uniquify/uniquify.c::main@0x1228
- vectors-3d/vectors-3d.c::get_cross_matrix@0x1601
- vectors-3d/vectors-3d.c::print_vector@0x144f
- vectors-3d/vectors-3d.c::test@0x17fb
- vectors-3d/vectors-3d.c::unit_vec@0x1510
- vectors-3d/vectors-3d.c::vector_add@0x126d
- vectors-3d/vectors-3d.c::vector_prod@0x1373
- vectors-3d/vectors-3d.c::vector_sub@0x11e9
- verlet/verlet.c::main@0x170b
- verlet/verlet.c::vb_init@0x1271
- verlet/verlet.c::vb_step_avg@0x13aa
- weekday/weekday.c::dayOfWeek@0x11e9
- weekday/weekday.c::main@0x130d

## Execution Failures
- life/life.c::init@0x1237
- regex-parser/regex-parser.c::matchpattern@0x313f
- verlet/verlet.c::vb_checksum@0x160b
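
The failure entries above follow a `path::function@address` pattern, which makes them easy to post-process. The sketch below parses such entries and tallies them per benchmark; `parse_entry` and the regular expression are assumptions inferred from the entries in this report, not part of the evaluation script.

```python
import re
from collections import Counter

# Pattern inferred from entries such as "life/life.c::init@0x1237".
ENTRY = re.compile(
    r"^(?:-\s*)?(?P<path>.+?)::(?P<func>[^@:]+)@(?P<addr>0x[0-9a-fA-F]+)$"
)


def parse_entry(entry: str) -> tuple[str, str, int]:
    """Split one failure entry into (source path, function name, address)."""
    m = ENTRY.match(entry.strip())
    if m is None:
        raise ValueError(f"unrecognized entry: {entry!r}")
    return m.group("path"), m.group("func"), int(m.group("addr"), 16)


# The three execution failures listed in this report.
entries = [
    "- life/life.c::init@0x1237",
    "- regex-parser/regex-parser.c::matchpattern@0x313f",
    "- verlet/verlet.c::vb_checksum@0x160b",
]

per_benchmark = Counter(parse_entry(e)[0].split("/", 1)[0] for e in entries)
print(per_benchmark)
```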
334
sk2decompile/evaluation/bringupbench/reports/O1_results.md
Normal file
|
|
@ -0,0 +1,334 @@
|
|||
# Infer-Out Model 2 Evaluation (merged.O1.func_map.infer-host)
|
||||
|
||||
- Timestamp: 20251119-171212
|
||||
- Source JSONL: merged.O1.func_map.infer.jsonl
|
||||
- Target: host
|
||||
- Total cases: 379
|
||||
- Replacement success: 379 (100.00%)
|
||||
- Compilable: 155 (40.90%)
|
||||
- Executable: 148 (39.05%)
|
||||
|
||||

## Benchmark Breakdown
| Benchmark | Cases | Replacement% | Build% | Exec% |
| --- | --- | --- | --- | --- |
| ackermann | 2 | 100.00% | 50.00% | 50.00% |
| aes | 9 | 100.00% | 33.33% | 33.33% |
| anagram | 13 | 100.00% | 53.85% | 53.85% |
| audio-codec | 3 | 100.00% | 0.00% | 0.00% |
| avl-tree | 17 | 100.00% | 29.41% | 29.41% |
| banner | 1 | 100.00% | 0.00% | 0.00% |
| bit-kernels | 5 | 100.00% | 80.00% | 80.00% |
| blake2b | 5 | 100.00% | 20.00% | 20.00% |
| bloom-filter | 4 | 100.00% | 50.00% | 50.00% |
| boyer-moore-search | 3 | 100.00% | 0.00% | 0.00% |
| bubble-sort | 3 | 100.00% | 100.00% | 100.00% |
| c-interp | 10 | 100.00% | 60.00% | 60.00% |
| ccmac | 1 | 100.00% | 0.00% | 0.00% |
| checkers | 16 | 100.00% | 81.25% | 81.25% |
| cipher | 3 | 100.00% | 33.33% | 0.00% |
| congrad | 2 | 100.00% | 0.00% | 0.00% |
| connect4-minimax | 13 | 100.00% | 61.54% | 61.54% |
| convex-hull | 4 | 100.00% | 75.00% | 75.00% |
| dhrystone | 5 | 100.00% | 40.00% | 40.00% |
| distinctness | 2 | 100.00% | 0.00% | 0.00% |
| fft-int | 4 | 100.00% | 75.00% | 75.00% |
| flood-fill | 2 | 100.00% | 50.00% | 50.00% |
| frac-calc | 10 | 100.00% | 40.00% | 40.00% |
| fuzzy-match | 3 | 100.00% | 33.33% | 33.33% |
| fy-shuffle | 3 | 100.00% | 33.33% | 33.33% |
| gcd-list | 2 | 100.00% | 0.00% | 0.00% |
| grad-descent | 4 | 100.00% | 0.00% | 0.00% |
| graph-tests | 19 | 100.00% | 21.05% | 21.05% |
| hanoi | 2 | 100.00% | 50.00% | 50.00% |
| heapsort | 2 | 100.00% | 50.00% | 50.00% |
| heat-calc | 1 | 100.00% | 0.00% | 0.00% |
| huff-encode | 13 | 100.00% | 92.31% | 92.31% |
| idct-alg | 3 | 100.00% | 66.67% | 33.33% |
| indirect-test | 2 | 100.00% | 50.00% | 50.00% |
| k-means | 6 | 100.00% | 50.00% | 50.00% |
| kadane | 2 | 100.00% | 50.00% | 50.00% |
| kepler | 7 | 100.00% | 14.29% | 14.29% |
| knapsack | 3 | 100.00% | 33.33% | 33.33% |
| knights-tour | 3 | 100.00% | 66.67% | 66.67% |
| life | 14 | 100.00% | 21.43% | 14.29% |
| longdiv | 7 | 100.00% | 71.43% | 71.43% |
| lu-decomp | 3 | 100.00% | 33.33% | 33.33% |
| lz-compress | 2 | 100.00% | 100.00% | 100.00% |
| mandelbrot | 1 | 100.00% | 0.00% | 0.00% |
| matmult | 1 | 100.00% | 0.00% | 0.00% |
| max-subseq | 2 | 100.00% | 0.00% | 0.00% |
| mersenne | 3 | 100.00% | 0.00% | 0.00% |
| minspan | 8 | 100.00% | 37.50% | 25.00% |
| monte-carlo | 1 | 100.00% | 0.00% | 0.00% |
| murmur-hash | 2 | 100.00% | 0.00% | 0.00% |
| n-queens | 3 | 100.00% | 66.67% | 66.67% |
| natlog | 1 | 100.00% | 0.00% | 0.00% |
| nbody-sim | 1 | 100.00% | 0.00% | 0.00% |
| nr-solver | 1 | 100.00% | 100.00% | 100.00% |
| packet-filter | 4 | 100.00% | 25.00% | 25.00% |
| parrondo | 2 | 100.00% | 0.00% | 0.00% |
| pascal | 3 | 100.00% | 33.33% | 33.33% |
| pi-calc | 1 | 100.00% | 0.00% | 0.00% |
| primal-test | 3 | 100.00% | 33.33% | 33.33% |
| priority-queue | 5 | 100.00% | 80.00% | 80.00% |
| qsort-demo | 7 | 100.00% | 28.57% | 28.57% |
| qsort-test | 5 | 100.00% | 80.00% | 80.00% |
| quaternions | 4 | 100.00% | 0.00% | 0.00% |
| rabinkarp-search | 2 | 100.00% | 0.00% | 0.00% |
| rand-test | 3 | 100.00% | 0.00% | 0.00% |
| ransac | 2 | 100.00% | 0.00% | 0.00% |
| regex-parser | 8 | 100.00% | 25.00% | 12.50% |
| rho-factor | 4 | 100.00% | 75.00% | 75.00% |
| rle-compress | 2 | 100.00% | 0.00% | 0.00% |
| rsa-cipher | 4 | 100.00% | 0.00% | 0.00% |
| sat-solver | 5 | 100.00% | 60.00% | 60.00% |
| shortest-path | 3 | 100.00% | 66.67% | 66.67% |
| sieve | 1 | 100.00% | 0.00% | 0.00% |
| simple-grep | 1 | 100.00% | 0.00% | 0.00% |
| spelt2num | 1 | 100.00% | 0.00% | 0.00% |
| spirograph | 2 | 100.00% | 50.00% | 50.00% |
| sudoku-solver | 4 | 100.00% | 50.00% | 50.00% |
| tetris-sim | 12 | 100.00% | 75.00% | 66.67% |
| tiny-NN | 5 | 100.00% | 40.00% | 40.00% |
| topo-sort | 7 | 100.00% | 0.00% | 0.00% |
| totient | 4 | 100.00% | 50.00% | 50.00% |
| transcend | 1 | 100.00% | 0.00% | 0.00% |
| uniquify | 1 | 100.00% | 0.00% | 0.00% |
| vectors-3d | 8 | 100.00% | 12.50% | 0.00% |
| verlet | 1 | 100.00% | 0.00% | 0.00% |
| weekday | 2 | 100.00% | 0.00% | 0.00% |


## Compilation Failures
- ackermann/ackermann.c::main@0x131c
- aes/aes.c::aes_decrypt@0x161b
- aes/aes.c::aes_encrypt@0x1560
- aes/aes.c::inv_shift_rows@0x12cd
- aes/aes.c::key_expansion@0x14c3
- aes/aes.c::main@0x16d1
- aes/aes.c::shift_rows@0x1248
- anagram/anagram.c::BuildMask@0x1372
- anagram/anagram.c::BuildWord@0x15cd
- anagram/anagram.c::DumpWords@0x17e8
- anagram/anagram.c::FindAnagram@0x1839
- anagram/anagram.c::ReadDict@0x1233
- anagram/anagram.c::main@0x1a93
- audio-codec/audio-codec.c::decode@0x1271
- audio-codec/audio-codec.c::encode@0x11e9
- audio-codec/audio-codec.c::main@0x12d7
- avl-tree/avlcore.c::CheckTreeNodeRotation@0x186a
- avl-tree/element.c::Compare@0x1764
- avl-tree/avlcore.c::DeleteByElement@0x1d2b
- avl-tree/avlcore.c::DeleteByElementRecursive@0x1b8b
- avl-tree/avlcore.c::DoubleLeftRotation@0x1845
- avl-tree/avlcore.c::DoubleRightRotation@0x1821
- avl-tree/avlcore.c::FindByElement@0x1790
- avl-tree/avlcore.c::Height@0x1d6e
- avl-tree/avlcore.c::Insert@0x1a73
- avl-tree/avlcore.c::InsertNode@0x199b
- avl-tree/avl-tree.c::main@0x1380
- avl-tree/avl-tree.c::printTree@0x11e9
- banner/banner.c::main@0x11e9
- bit-kernels/bit-kernels.c::main@0x12e8
- blake2b/blake2b.c::F@0x1258
- blake2b/blake2b.c::G@0x11e9
- blake2b/blake2b.c::blake2b@0x1616
- blake2b/blake2b.c::test@0x1982
- bloom-filter/bloom-filter.c::bad_search@0x11e9
- bloom-filter/bloom-filter.c::main@0x1217
- boyer-moore-search/boyer-moore-search.c::badCharHeuristic@0x11e9
- boyer-moore-search/boyer-moore-search.c::main@0x1329
- boyer-moore-search/boyer-moore-search.c::search@0x1223
- c-interp/c-interp.c::eval@0x35d3
- c-interp/c-interp.c::function_body@0x310b
- c-interp/c-interp.c::main@0x3c45
- c-interp/c-interp.c::next@0x11e9
- ccmac/ccmac.c::main@0x11e9
- checkers/functions.c::fill_print_initial@0x15dd
- checkers/functions.c::link_new_node@0x204d
- checkers/checkers.c::main@0x11e9
- cipher/cipher.c::encipher@0x11e9
- cipher/cipher.c::main@0x12b3
- congrad/congrad.c::cg_spmv@0x11e9
- congrad/congrad.c::main@0x125a
- connect4-minimax/connect4-minimax.c::init_board@0x11e9
- connect4-minimax/connect4-minimax.c::main@0x1c5d
- connect4-minimax/connect4-minimax.c::minimax@0x17ed
- connect4-minimax/connect4-minimax.c::play_game@0x1b13
- connect4-minimax/connect4-minimax.c::score_position@0x158e
- convex-hull/convex-hull.c::main@0x130d
- dhrystone/dhrystone.c::PFunc_1@0x12ab
- dhrystone/dhrystone.c::PFunc_2@0x12c8
- dhrystone/dhrystone.c::main@0x1311
- distinctness/distinctness.c::isDistinct@0x11e9
- distinctness/distinctness.c::main@0x1342
- fft-int/fft-int.c::db_from_ampl@0x1513
- flood-fill/flood-fill.c::main@0x130f
- frac-calc/frac-calc.c::avaliatokens@0x1421
- frac-calc/frac-calc.c::calcula@0x172a
- frac-calc/frac-calc.c::copyr@0x12b5
- frac-calc/frac-calc.c::divtokens@0x1636
- frac-calc/frac-calc.c::help@0x11e9
- frac-calc/frac-calc.c::main@0x18c1
- fuzzy-match/fuzzy-match.c::fuzzy_match_recurse@0x21e9
- fuzzy-match/fuzzy-match.c::main@0x2391
- fy-shuffle/fy-shuffle.c::fy_shuffle@0x11e9
- fy-shuffle/fy-shuffle.c::main@0x12de
- gcd-list/gcd-list.c::gcd@0x11e9
- gcd-list/gcd-list.c::main@0x121c
- grad-descent/grad-descent.c::derivateWRTBias@0x1247
- grad-descent/grad-descent.c::derivateWRTWeight@0x11e9
- grad-descent/grad-descent.c::gradientDescent@0x129d
- grad-descent/grad-descent.c::main@0x1312
- graph-tests/graph-tests.c::addEdge@0x127b
- graph-tests/graph-tests.c::addVertex@0x1743
- graph-tests/graph-tests.c::bfs@0x144f
- graph-tests/graph-tests.c::bfs_test@0x150f
- graph-tests/graph-tests.c::bubbleSort@0x15e7
- graph-tests/graph-tests.c::createGraph@0x1206
- graph-tests/graph-tests.c::createNode@0x11e9
- graph-tests/graph-tests.c::createQueue@0x12cd
- graph-tests/graph-tests.c::dequeue@0x1357
- graph-tests/graph-tests.c::enqueue@0x130a
- graph-tests/graph-tests.c::insertAtTheBegin@0x15ae
- graph-tests/graph-tests.c::link_list@0x163c
- graph-tests/graph-tests.c::main@0x1a0e
- graph-tests/graph-tests.c::printQueue@0x13cc
- graph-tests/graph-tests.c::swap@0x15da
- hanoi/hanoi.c::main@0x1261
- heapsort/heapsort.c::main@0x13d4
- heat-calc/heat-calc.c::main@0x11e9
- huff-encode/huff-encode.c::main@0x15ef
- idct-alg/idct-alg.c::main@0x140e
- indirect-test/indirect-test.c::main@0x1257
- k-means/k-means.c::calculateNearst@0x11e9
- k-means/k-means.c::main@0x1922
- k-means/k-means.c::printEPS@0x1546
- kadane/kadane.c::main@0x123b
- kepler/kepler.c::J@0x18c0
- kepler/kepler.c::bin_fact@0x1718
- kepler/kepler.c::binary@0x121d
- kepler/kepler.c::e_series@0x17a2
- kepler/kepler.c::j_series@0x19bb
- kepler/kepler.c::main@0x131f
- knapsack/knapsack.c::main@0x128b
- knapsack/knapsack.c::max@0x11e9
- knights-tour/knights-tour.c::solveKT@0x1341
- life/life.c::getDown@0x1406
- life/life.c::getDownLeft@0x1487
- life/life.c::getDownRight@0x14b4
- life/life.c::getLeft@0x1390
- life/life.c::getNumNeigbors@0x14e2
- life/life.c::getRight@0x13b7
- life/life.c::getUp@0x13df
- life/life.c::getUpLeft@0x142e
- life/life.c::getUpRight@0x145a
- life/life.c::main@0x1664
- life/life.c::process@0x15a3
- longdiv/longdiv.c::main@0x1691
- longdiv/longdiv.c::sub@0x11e9
- lu-decomp/lu-decomp.c::main@0x13ad
- lu-decomp/lu-decomp.c::print_matrix@0x11e9
- mandelbrot/mandelbrot.c::main@0x120d
- matmult/matmult.c::main@0x11e9
- max-subseq/max-subseq.c::lcsAlgo@0x11e9
- max-subseq/max-subseq.c::main@0x14c4
- mersenne/mersenne.c::genrand@0x125b
- mersenne/mersenne.c::main@0x1398
- mersenne/mersenne.c::sgenrand@0x11e9
- minspan/minspan.c::displayGraph@0x13f5
- minspan/minspan.c::displayGraph1@0x14f3
- minspan/minspan.c::displayPath@0x15fa
- minspan/minspan.c::main@0x175b
- minspan/minspan.c::minSpanTree@0x1231
- monte-carlo/monte-carlo.c::main@0x11e9
- murmur-hash/murmur-hash.c::main@0x12a3
- murmur-hash/murmur-hash.c::murmurhash@0x11e9
- n-queens/n-queens.c::main@0x12b1
- natlog/natlog.c::main@0x11e9
- nbody-sim/nbody-sim.c::main@0x11e9
- packet-filter/packet-filter.c::check_packet_filter@0x133d
- packet-filter/packet-filter.c::generate_packet@0x11e9
- packet-filter/packet-filter.c::main@0x145c
- parrondo/parrondo.c::main@0x127d
- parrondo/parrondo.c::play_c@0x1238
- pascal/pascal.c::main@0x12d1
- pascal/pascal.c::print_centered@0x122b
- pi-calc/pi-calc.c::main@0x11e9
- primal-test/primal-test.c::main@0x13ea
- primal-test/primal-test.c::miller_rabin_int@0x1243
- priority-queue/priority-queue.c::main@0x130a
- qsort-demo/qsort-demo.c::main@0x163f
- qsort-demo/qsort-demo.c::print_struct_array@0x1470
- qsort-demo/qsort-demo.c::sort_cstrings_example@0x13b3
- qsort-demo/qsort-demo.c::sort_integers_example@0x1292
- qsort-demo/qsort-demo.c::sort_structs_example@0x14d2
- qsort-test/qsort-test.c::main@0x133f
- quaternions/quaternions.c::euler_from_quat@0x136c
- quaternions/quaternions.c::main@0x15bf
- quaternions/quaternions.c::quat_from_euler@0x11e9
- quaternions/quaternions.c::quaternion_multiply@0x1487
- rabinkarp-search/rabinkarp-search.c::main@0x1366
- rabinkarp-search/rabinkarp-search.c::search@0x11e9
- rand-test/rand-test.c::bad_rand@0x11e9
- rand-test/rand-test.c::main@0x1514
- rand-test/rand-test.c::run_tests@0x1220
- ransac/ransac.c::main@0x13cf
- ransac/ransac.c::ransac_line_fitting@0x1238
- regex-parser/regex-parser.c::main@0x2b4b
- regex-parser/regex-parser.c::matchalphanum@0x21fc
- regex-parser/regex-parser.c::matchcharclass@0x222a
- regex-parser/regex-parser.c::matchone@0x23e1
- regex-parser/regex-parser.c::re_compile@0x270b
- regex-parser/regex-parser.c::re_print@0x2964
- rho-factor/rho-factor.c::main@0x3ef0
- rle-compress/rle-compress.c::main@0x1318
- rle-compress/rle-compress.c::run_length_encode@0x11e9
- rsa-cipher/rsa-cipher.c::main@0x1527
- rsa-cipher/rsa-cipher.c::mod_inverse@0x12f3
- rsa-cipher/rsa-cipher.c::mod_pow@0x11e9
- rsa-cipher/rsa-cipher.c::print_hex_int128@0x1444
- sat-solver/sat-solver.c::main@0x141e
- sat-solver/sat-solver.c::printFormula@0x12ff
- shortest-path/shortest-path.c::main@0x1333
- sieve/sieve.c::main@0x11e9
- simple-grep/simple-grep.c::main@0x11e9
- spelt2num/spelt2num.c::main@0x11e9
- spirograph/spirograph.c::spirograph@0x11e9
- sudoku-solver/sudoku-solver.c::isSafe@0x11e9
- sudoku-solver/sudoku-solver.c::main@0x13e5
- tetris-sim/tetris-sim.c::best_move@0x157c
- tetris-sim/tetris-sim.c::evaluate_board@0x144b
- tetris-sim/tetris-sim.c::main@0x180d
- tiny-NN/tiny-NN.c::main@0x16a4
- tiny-NN/tiny-NN.c::sampleSine@0x1251
- tiny-NN/tiny-NN.c::train@0x133c
- topo-sort/topo-sort.c::addEdge@0x127d
- topo-sort/topo-sort.c::createGraph@0x1223
- topo-sort/topo-sort.c::createListNode@0x1206
- topo-sort/topo-sort.c::createStackNode@0x11e9
- topo-sort/topo-sort.c::main@0x1424
- topo-sort/topo-sort.c::topologicalSort@0x132c
- topo-sort/topo-sort.c::topologicalSortUtil@0x12b7
- totient/totient.c::main@0x12bf
- totient/totient.c::my_gcd@0x11e9
- transcend/transcend.c::main@0x11e9
- uniquify/uniquify.c::main@0x1201
- vectors-3d/vectors-3d.c::get_cross_matrix@0x13c2
- vectors-3d/vectors-3d.c::main@0x14cb
- vectors-3d/vectors-3d.c::print_vector@0x12dc
- vectors-3d/vectors-3d.c::unit_vec@0x1331
- vectors-3d/vectors-3d.c::vector_add@0x121f
- vectors-3d/vectors-3d.c::vector_prod@0x127e
- vectors-3d/vectors-3d.c::vector_sub@0x11e9
- verlet/verlet.c::main@0x11e9
- weekday/weekday.c::dayOfWeek@0x11e9
- weekday/weekday.c::main@0x12ea


## Execution Failures
- cipher/cipher.c::decipher@0x1251
- idct-alg/idct-alg.c::idct_2d@0x1216
- life/life.c::init@0x11e9
- minspan/minspan.c::displayTree@0x16b7
- regex-parser/regex-parser.c::matchpattern@0x2491
- tetris-sim/tetris-sim.c::clear_lines@0x12b6
- vectors-3d/vectors-3d.c::get_angle@0x1429

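
Each failure entry appears to follow the pattern `<benchmark>/<source file>::<function>@<hex address>`. A minimal sketch for splitting an entry into its parts, assuming that layout holds for every line and that the suffix is the function's address in the compiled binary (the report does not state this explicitly; `FailureEntry` and `parse_entry` are hypothetical names):

```python
import re
from typing import NamedTuple


class FailureEntry(NamedTuple):
    benchmark: str
    source_file: str
    function: str
    address: int


# Optional list dash, then benchmark/file::function@0xADDR.
_ENTRY = re.compile(r"^-?\s*([^/\s]+)/([^:\s]+)::(\w+)@(0x[0-9a-fA-F]+)$")


def parse_entry(line: str) -> FailureEntry:
    """Split one failure-list line into its components."""
    match = _ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    bench, src, func, addr = match.groups()
    return FailureEntry(bench, src, func, int(addr, 16))


entry = parse_entry("- cipher/cipher.c::decipher@0x1251")
print(entry.benchmark, entry.function, hex(entry.address))
```

Grouping parsed entries by `benchmark` gives a per-program failure count that can be compared against the breakdown table.
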
345
sk2decompile/evaluation/bringupbench/reports/O2_results.md
Normal file
@ -0,0 +1,345 @@
# Infer-Out Model 2 Evaluation (merged.O2.func_map.infer-host)

- Timestamp: 20251119-170633
- Source JSONL: merged.O2.func_map.infer.jsonl
- Target: host
- Total cases: 368
- Replacement success: 368 (100.00%)
- Compilable: 139 (37.77%)
- Executable: 126 (34.24%)


## Benchmark Breakdown
| Benchmark | Cases | Replacement% | Build% | Exec% |
| --- | --- | --- | --- | --- |
| ackermann | 2 | 100.00% | 50.00% | 50.00% |
| aes | 10 | 100.00% | 20.00% | 20.00% |
| anagram | 13 | 100.00% | 46.15% | 46.15% |
| audio-codec | 3 | 100.00% | 33.33% | 33.33% |
| avl-tree | 15 | 100.00% | 20.00% | 20.00% |
| banner | 1 | 100.00% | 0.00% | 0.00% |
| bit-kernels | 3 | 100.00% | 66.67% | 66.67% |
| blake2b | 4 | 100.00% | 0.00% | 0.00% |
| bloom-filter | 4 | 100.00% | 50.00% | 50.00% |
| boyer-moore-search | 3 | 100.00% | 0.00% | 0.00% |
| bubble-sort | 3 | 100.00% | 100.00% | 100.00% |
| c-interp | 10 | 100.00% | 50.00% | 50.00% |
| ccmac | 1 | 100.00% | 0.00% | 0.00% |
| checkers | 16 | 100.00% | 68.75% | 62.50% |
| cipher | 3 | 100.00% | 66.67% | 0.00% |
| congrad | 1 | 100.00% | 0.00% | 0.00% |
| connect4-minimax | 13 | 100.00% | 61.54% | 53.85% |
| convex-hull | 4 | 100.00% | 75.00% | 75.00% |
| dhrystone | 5 | 100.00% | 20.00% | 20.00% |
| distinctness | 2 | 100.00% | 0.00% | 0.00% |
| fft-int | 4 | 100.00% | 50.00% | 50.00% |
| flood-fill | 2 | 100.00% | 50.00% | 50.00% |
| frac-calc | 10 | 100.00% | 50.00% | 50.00% |
| fuzzy-match | 3 | 100.00% | 33.33% | 33.33% |
| fy-shuffle | 3 | 100.00% | 33.33% | 33.33% |
| gcd-list | 2 | 100.00% | 50.00% | 0.00% |
| grad-descent | 4 | 100.00% | 25.00% | 25.00% |
| graph-tests | 20 | 100.00% | 10.00% | 10.00% |
| hanoi | 1 | 100.00% | 0.00% | 0.00% |
| heapsort | 2 | 100.00% | 50.00% | 50.00% |
| heat-calc | 1 | 100.00% | 0.00% | 0.00% |
| huff-encode | 13 | 100.00% | 92.31% | 92.31% |
| idct-alg | 3 | 100.00% | 66.67% | 33.33% |
| indirect-test | 2 | 100.00% | 50.00% | 50.00% |
| k-means | 6 | 100.00% | 33.33% | 33.33% |
| kadane | 2 | 100.00% | 50.00% | 50.00% |
| kepler | 7 | 100.00% | 14.29% | 14.29% |
| knapsack | 3 | 100.00% | 33.33% | 33.33% |
| knights-tour | 3 | 100.00% | 33.33% | 33.33% |
| life | 14 | 100.00% | 21.43% | 14.29% |
| longdiv | 6 | 100.00% | 50.00% | 50.00% |
| lu-decomp | 3 | 100.00% | 33.33% | 33.33% |
| lz-compress | 2 | 100.00% | 100.00% | 100.00% |
| mandelbrot | 1 | 100.00% | 0.00% | 0.00% |
| matmult | 1 | 100.00% | 0.00% | 0.00% |
| max-subseq | 2 | 100.00% | 0.00% | 0.00% |
| mersenne | 3 | 100.00% | 0.00% | 0.00% |
| minspan | 8 | 100.00% | 25.00% | 25.00% |
| monte-carlo | 1 | 100.00% | 0.00% | 0.00% |
| murmur-hash | 2 | 100.00% | 0.00% | 0.00% |
| n-queens | 3 | 100.00% | 66.67% | 66.67% |
| natlog | 1 | 100.00% | 0.00% | 0.00% |
| nbody-sim | 1 | 100.00% | 0.00% | 0.00% |
| nr-solver | 1 | 100.00% | 100.00% | 100.00% |
| packet-filter | 4 | 100.00% | 0.00% | 0.00% |
| parrondo | 2 | 100.00% | 50.00% | 50.00% |
| pascal | 3 | 100.00% | 66.67% | 66.67% |
| pi-calc | 1 | 100.00% | 0.00% | 0.00% |
| primal-test | 3 | 100.00% | 33.33% | 33.33% |
| priority-queue | 5 | 100.00% | 80.00% | 80.00% |
| qsort-demo | 7 | 100.00% | 28.57% | 28.57% |
| qsort-test | 5 | 100.00% | 80.00% | 80.00% |
| quaternions | 4 | 100.00% | 0.00% | 0.00% |
| rabinkarp-search | 2 | 100.00% | 0.00% | 0.00% |
| rand-test | 3 | 100.00% | 0.00% | 0.00% |
| ransac | 2 | 100.00% | 50.00% | 0.00% |
| regex-parser | 7 | 100.00% | 28.57% | 14.29% |
| rho-factor | 3 | 100.00% | 66.67% | 66.67% |
| rle-compress | 2 | 100.00% | 0.00% | 0.00% |
| rsa-cipher | 4 | 100.00% | 0.00% | 0.00% |
| sat-solver | 5 | 100.00% | 60.00% | 60.00% |
| shortest-path | 3 | 100.00% | 66.67% | 66.67% |
| sieve | 1 | 100.00% | 0.00% | 0.00% |
| simple-grep | 1 | 100.00% | 0.00% | 0.00% |
| spelt2num | 1 | 100.00% | 0.00% | 0.00% |
| spirograph | 2 | 100.00% | 50.00% | 0.00% |
| sudoku-solver | 4 | 100.00% | 50.00% | 50.00% |
| tetris-sim | 12 | 100.00% | 75.00% | 58.33% |
| tiny-NN | 4 | 100.00% | 25.00% | 25.00% |
| topo-sort | 7 | 100.00% | 0.00% | 0.00% |
| totient | 2 | 100.00% | 50.00% | 50.00% |
| transcend | 1 | 100.00% | 0.00% | 0.00% |
| uniquify | 1 | 100.00% | 0.00% | 0.00% |
| vectors-3d | 8 | 100.00% | 12.50% | 0.00% |
| verlet | 1 | 100.00% | 0.00% | 0.00% |
| weekday | 2 | 100.00% | 0.00% | 0.00% |

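
The per-benchmark rows can be folded back into absolute counts, since each percentage is a share of that benchmark's cases. A sketch under the assumption that Build% and Exec% are both relative to the Cases column (`count_from_row` is a hypothetical helper; the three rows are copied from the O2 table):

```python
def count_from_row(row: str) -> tuple[str, int, int, int]:
    """Return (benchmark, cases, built, executed) for one table row."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    name, cases = cells[0], int(cells[1])
    build_pct = float(cells[3].rstrip("%"))
    exec_pct = float(cells[4].rstrip("%"))
    # Percentages are rounded to two decimals, so round back to whole cases.
    built = round(cases * build_pct / 100)
    executed = round(cases * exec_pct / 100)
    return name, cases, built, executed


rows = [
    "| checkers | 16 | 100.00% | 68.75% | 62.50% |",
    "| cipher | 3 | 100.00% | 66.67% | 0.00% |",
    "| tetris-sim | 12 | 100.00% | 75.00% | 58.33% |",
]
for row in rows:
    print(count_from_row(row))
```

Summing the `built` and `executed` values over every row should, under the same assumption, give back the Compilable and Executable totals in the summary.
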

## Compilation Failures
- ackermann/ackermann.c::main@0x1100
- aes/aes.c::aes_decrypt@0x18c0
- aes/aes.c::aes_encrypt@0x1780
- aes/aes.c::inv_mix_columns@0x1640
- aes/aes.c::inv_shift_rows@0x14f0
- aes/aes.c::key_expansion@0x16d0
- aes/aes.c::main@0x1100
- aes/aes.c::mix_columns@0x1580
- aes/aes.c::shift_rows@0x1480
- anagram/anagram.c::BuildMask@0x14c0
- anagram/anagram.c::BuildWord@0x17d0
- anagram/anagram.c::DumpCandidates@0x19a0
- anagram/anagram.c::DumpWords@0x1a30
- anagram/anagram.c::FindAnagram@0x1a90
- anagram/anagram.c::ReadDict@0x1360
- anagram/anagram.c::main@0x1120
- audio-codec/audio-codec.c::decode@0x1440
- audio-codec/audio-codec.c::main@0x1100
- avl-tree/avlcore.c::CheckTreeNodeRotation@0x1c30
- avl-tree/element.c::Compare@0x1ad0
- avl-tree/avlcore.c::DeleteByElement@0x2860
- avl-tree/avlcore.c::DeleteByElementRecursive@0x26d0
- avl-tree/avlcore.c::DeleteLeftMost@0x2610
- avl-tree/avlcore.c::DoubleLeftRotation@0x1c00
- avl-tree/avlcore.c::DoubleRightRotation@0x1bd0
- avl-tree/avlcore.c::FindByElement@0x1b00
- avl-tree/avlcore.c::Insert@0x1f30
- avl-tree/avlcore.c::MakeEmpty@0x1f80
- avl-tree/avl-tree.c::breadth@0x1760
- avl-tree/avl-tree.c::main@0x1120
- banner/banner.c::main@0x1120
- bit-kernels/bit-kernels.c::main@0x1120
- blake2b/blake2b.c::F@0x12a0
- blake2b/blake2b.c::G@0x1230
- blake2b/blake2b.c::blake2b@0x1620
- blake2b/blake2b.c::test@0x19d0
- bloom-filter/bloom-filter.c::bad_search@0x1430
- bloom-filter/bloom-filter.c::main@0x1120
- boyer-moore-search/boyer-moore-search.c::badCharHeuristic@0x15d0
- boyer-moore-search/boyer-moore-search.c::main@0x1140
- boyer-moore-search/boyer-moore-search.c::search@0x1630
- c-interp/c-interp.c::eval@0x3e90
- c-interp/c-interp.c::function_body@0x37f0
- c-interp/c-interp.c::function_declaration@0x3a10
- c-interp/c-interp.c::main@0x1120
- c-interp/c-interp.c::next@0x1580
- ccmac/ccmac.c::main@0x1120
- checkers/functions.c::fill_print_initial@0x1630
- checkers/functions.c::free_tree@0x2460
- checkers/functions.c::generate_node_children@0x21c0
- checkers/functions.c::link_new_node@0x20e0
- checkers/checkers.c::main@0x1150
- cipher/cipher.c::main@0x1100
- congrad/congrad.c::main@0x1100
- connect4-minimax/connect4-minimax.c::init_board@0x1230
- connect4-minimax/connect4-minimax.c::main@0x1100
- connect4-minimax/connect4-minimax.c::minimax@0x1840
- connect4-minimax/connect4-minimax.c::play_game@0x1c90
- connect4-minimax/connect4-minimax.c::score_position@0x1620
- convex-hull/convex-hull.c::main@0x1100
- dhrystone/dhrystone.c::PFunc_1@0x1970
- dhrystone/dhrystone.c::PFunc_2@0x1990
- dhrystone/dhrystone.c::PProc_8@0x1900
- dhrystone/dhrystone.c::main@0x1100
- distinctness/distinctness.c::isDistinct@0x12a0
- distinctness/distinctness.c::main@0x1100
- fft-int/fft-int.c::db_from_ampl@0x1670
- fft-int/fft-int.c::fix_fft@0x1320
- flood-fill/flood-fill.c::main@0x1100
- frac-calc/frac-calc.c::avaliatokens@0x15f0
- frac-calc/frac-calc.c::copyr@0x1460
- frac-calc/frac-calc.c::divtokens@0x1840
- frac-calc/frac-calc.c::help@0x13b0
- frac-calc/frac-calc.c::main@0x1120
- fuzzy-match/fuzzy-match.c::fuzzy_match_recurse@0x2360
- fuzzy-match/fuzzy-match.c::main@0x2100
- fy-shuffle/fy-shuffle.c::fy_shuffle@0x1440
- fy-shuffle/fy-shuffle.c::main@0x1100
- gcd-list/gcd-list.c::main@0x1120
- grad-descent/grad-descent.c::derivateWRTBias@0x12d0
- grad-descent/grad-descent.c::derivateWRTWeight@0x1270
- grad-descent/grad-descent.c::main@0x1100
- graph-tests/graph-tests.c::DFS_test@0x1c20
- graph-tests/graph-tests.c::addEdge@0x1320
- graph-tests/graph-tests.c::addVertex@0x1a50
- graph-tests/graph-tests.c::bfs@0x1540
- graph-tests/graph-tests.c::bfs_test@0x1720
- graph-tests/graph-tests.c::bubbleSort@0x1880
- graph-tests/graph-tests.c::createGraph@0x1260
- graph-tests/graph-tests.c::createNode@0x1240
- graph-tests/graph-tests.c::createQueue@0x1390
- graph-tests/graph-tests.c::depthFirstSearch@0x1b20
- graph-tests/graph-tests.c::dequeue@0x1430
- graph-tests/graph-tests.c::enqueue@0x13e0
- graph-tests/graph-tests.c::getAdjUnvisitedVertex@0x1ac0
- graph-tests/graph-tests.c::insertAtTheBegin@0x1840
- graph-tests/graph-tests.c::link_list@0x18e0
- graph-tests/graph-tests.c::main@0x1120
- graph-tests/graph-tests.c::printQueue@0x14c0
- graph-tests/graph-tests.c::swap@0x1870
- hanoi/hanoi.c::main@0x1100
- heapsort/heapsort.c::main@0x1100
- heat-calc/heat-calc.c::main@0x1100
- huff-encode/huff-encode.c::main@0x1120
- idct-alg/idct-alg.c::main@0x1100
- indirect-test/indirect-test.c::main@0x1100
- k-means/k-means.c::calculateNearst@0x1310
- k-means/k-means.c::kMeans@0x1420
- k-means/k-means.c::main@0x1120
- k-means/k-means.c::printEPS@0x16b0
- kadane/kadane.c::main@0x1100
- kepler/kepler.c::J@0x1920
- kepler/kepler.c::bin_fact@0x1740
- kepler/kepler.c::binary@0x16a0
- kepler/kepler.c::e_series@0x17e0
- kepler/kepler.c::j_series@0x1a20
- kepler/kepler.c::main@0x1100
- knapsack/knapsack.c::main@0x1100
- knapsack/knapsack.c::max@0x1310
- knights-tour/knights-tour.c::solveKT@0x1390
- knights-tour/knights-tour.c::solveKTUtil@0x14f0
- life/life.c::getDown@0x16e0
- life/life.c::getDownLeft@0x1770
- life/life.c::getDownRight@0x17a0
- life/life.c::getLeft@0x1650
- life/life.c::getNumNeigbors@0x1390
- life/life.c::getRight@0x1680
- life/life.c::getUp@0x16b0
- life/life.c::getUpLeft@0x1710
- life/life.c::getUpRight@0x1740
- life/life.c::main@0x1100
- life/life.c::process@0x1550
- longdiv/longdiv.c::main@0x1120
- longdiv/longdiv.c::sbc@0x1a20
- longdiv/longdiv.c::sub@0x19c0
- lu-decomp/lu-decomp.c::main@0x1100
- lu-decomp/lu-decomp.c::print_matrix@0x13a0
- mandelbrot/mandelbrot.c::main@0x1100
- matmult/matmult.c::main@0x1100
- max-subseq/max-subseq.c::lcsAlgo@0x1290
- max-subseq/max-subseq.c::main@0x1120
- mersenne/mersenne.c::genrand@0x1310
- mersenne/mersenne.c::main@0x1100
- mersenne/mersenne.c::sgenrand@0x1290
- minspan/minspan.c::displayGraph@0x14f0
- minspan/minspan.c::displayGraph1@0x15f0
- minspan/minspan.c::displayPath@0x1700
- minspan/minspan.c::displayTree@0x17a0
- minspan/minspan.c::main@0x1100
- minspan/minspan.c::minSpanTree@0x12f0
- monte-carlo/monte-carlo.c::main@0x1100
- murmur-hash/murmur-hash.c::main@0x1100
- murmur-hash/murmur-hash.c::murmurhash@0x1290
- n-queens/n-queens.c::main@0x1120
- natlog/natlog.c::main@0x1100
- nbody-sim/nbody-sim.c::main@0x1100
- packet-filter/packet-filter.c::check_packet_filter@0x1430
- packet-filter/packet-filter.c::generate_packet@0x12d0
- packet-filter/packet-filter.c::main@0x1100
- packet-filter/packet-filter.c::print_packet@0x1490
- parrondo/parrondo.c::main@0x1100
- pascal/pascal.c::main@0x1100
- pi-calc/pi-calc.c::main@0x1100
- primal-test/primal-test.c::main@0x1100
- primal-test/primal-test.c::miller_rabin_int@0x1510
- priority-queue/priority-queue.c::main@0x1120
- qsort-demo/qsort-demo.c::main@0x1120
- qsort-demo/qsort-demo.c::print_struct_array@0x15c0
- qsort-demo/qsort-demo.c::sort_cstrings_example@0x14a0
- qsort-demo/qsort-demo.c::sort_integers_example@0x1310
- qsort-demo/qsort-demo.c::sort_structs_example@0x1640
- qsort-test/qsort-test.c::main@0x1120
- quaternions/quaternions.c::euler_from_quat@0x1580
- quaternions/quaternions.c::main@0x1100
- quaternions/quaternions.c::quat_from_euler@0x13f0
- quaternions/quaternions.c::quaternion_multiply@0x16b0
- rabinkarp-search/rabinkarp-search.c::main@0x1120
- rabinkarp-search/rabinkarp-search.c::search@0x13a0
- rand-test/rand-test.c::bad_rand@0x1240
- rand-test/rand-test.c::main@0x1100
- rand-test/rand-test.c::run_tests@0x1280
- ransac/ransac.c::main@0x1100
- regex-parser/regex-parser.c::main@0x2100
- regex-parser/regex-parser.c::matchcharclass@0x23b0
- regex-parser/regex-parser.c::matchone@0x2560
- regex-parser/regex-parser.c::re_compile@0x2930
- regex-parser/regex-parser.c::re_print@0x2bf0
- rho-factor/rho-factor.c::main@0x1120
- rle-compress/rle-compress.c::main@0x1120
- rle-compress/rle-compress.c::run_length_encode@0x1330
- rsa-cipher/rsa-cipher.c::main@0x1100
- rsa-cipher/rsa-cipher.c::mod_inverse@0x1670
- rsa-cipher/rsa-cipher.c::mod_pow@0x1580
- rsa-cipher/rsa-cipher.c::print_hex_int128@0x1790
- sat-solver/sat-solver.c::main@0x1100
- sat-solver/sat-solver.c::printFormula@0x1390
- shortest-path/shortest-path.c::main@0x1100
- sieve/sieve.c::main@0x1100
- simple-grep/simple-grep.c::main@0x1120
- spelt2num/spelt2num.c::main@0x1100
- spirograph/spirograph.c::spirograph@0x1230
- sudoku-solver/sudoku-solver.c::isSafe@0x1250
- sudoku-solver/sudoku-solver.c::main@0x1100
- tetris-sim/tetris-sim.c::best_move@0x1860
- tetris-sim/tetris-sim.c::evaluate_board@0x1640
- tetris-sim/tetris-sim.c::main@0x1120
- tiny-NN/tiny-NN.c::main@0x1120
- tiny-NN/tiny-NN.c::sampleSine@0x12d0
- tiny-NN/tiny-NN.c::train@0x13e0
- topo-sort/topo-sort.c::addEdge@0x1370
- topo-sort/topo-sort.c::createGraph@0x1300
- topo-sort/topo-sort.c::createListNode@0x12e0
- topo-sort/topo-sort.c::createStackNode@0x12c0
- topo-sort/topo-sort.c::main@0x1120
- topo-sort/topo-sort.c::topologicalSort@0x1450
- topo-sort/topo-sort.c::topologicalSortUtil@0x13c0
- totient/totient.c::main@0x1100
- transcend/transcend.c::main@0x1120
- uniquify/uniquify.c::main@0x1120
- vectors-3d/vectors-3d.c::get_cross_matrix@0x1760
- vectors-3d/vectors-3d.c::main@0x1100
- vectors-3d/vectors-3d.c::print_vector@0x1620
- vectors-3d/vectors-3d.c::unit_vec@0x1690
- vectors-3d/vectors-3d.c::vector_add@0x1550
- vectors-3d/vectors-3d.c::vector_prod@0x15c0
- vectors-3d/vectors-3d.c::vector_sub@0x1510
- verlet/verlet.c::main@0x1100
- weekday/weekday.c::dayOfWeek@0x1350
- weekday/weekday.c::main@0x1100


## Execution Failures
- checkers/functions.c::all_possible_moves@0x1a60
- cipher/cipher.c::decipher@0x1360
- cipher/cipher.c::encipher@0x12f0
- connect4-minimax/connect4-minimax.c::terminal_score@0x1800
- gcd-list/gcd-list.c::gcd@0x1310
- idct-alg/idct-alg.c::idct_2d@0x12f0
- life/life.c::init@0x1220
- ransac/ransac.c::ransac_line_fitting@0x1410
- regex-parser/regex-parser.c::matchpattern@0x2670
- spirograph/spirograph.c::test@0x1390
- tetris-sim/tetris-sim.c::clear_lines@0x1480
- tetris-sim/tetris-sim.c::simulate_board@0x17c0
- vectors-3d/vectors-3d.c::get_angle@0x17d0

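
The summary counts and the two failure lists should agree: every case that is not compilable belongs in the compilation list, and every compilable case that is not executable belongs in the execution list. A small check, assuming that reading of the report (the function name `expected_failures` is hypothetical). For this O2 report it gives 229 and 13, and the execution failure list does contain 13 entries:

```python
def expected_failures(total: int, compilable: int, executable: int) -> tuple[int, int]:
    """Return (compilation failures, execution failures) implied by the summary."""
    if not (0 <= executable <= compilable <= total):
        raise ValueError("expected executable <= compilable <= total")
    return total - compilable, compilable - executable


# Counts copied from the O2 summary: 368 cases, 139 compilable, 126 executable.
compile_failures, exec_failures = expected_failures(368, 139, 126)
print(compile_failures, exec_failures)  # 229 13
```

The same arithmetic applied to the O1 summary (379, 155, 148) gives 224 and 7, which matches the seven O1 execution failures listed in that report.
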
355
sk2decompile/evaluation/bringupbench/reports/O3_results.md
Normal file
@ -0,0 +1,355 @@
|
|||
# Infer-Out Model 2 Evaluation (merged.O3.func_map.infer-host)
|
||||
|
||||
- Timestamp: 20251119-171533
|
||||
- Source JSONL: merged.O3.func_map.infer.jsonl
|
||||
- Target: host
|
||||
- Total cases: 359
|
||||
- Replacement success: 359 (100.00%)
|
||||
- Compilable: 114 (31.75%)
|
||||
- Executable: 106 (29.53%)
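The percentages in this summary follow directly from the raw counts. A minimal sketch of the arithmetic, assuming each rate is taken over the total number of cases (the helper name `rate` is illustrative and not part of the evaluation scripts):

```python
def rate(count: int, total: int) -> str:
    """Format count/total as a percentage with two decimals."""
    return f"{100.0 * count / total:.2f}%"


# Counts copied from the O3 summary above.
total_cases = 359
compilable = 114
executable = 106

print("Compilable:", rate(compilable, total_cases))
print("Executable:", rate(executable, total_cases))
# Functions that built but did not reproduce the expected behaviour.
print("Built but failed at run time:", compilable - executable)
```

The last figure should match the number of entries under "Execution Failures" in the same report.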
|
||||
|
||||
## Benchmark Breakdown
|
||||
| Benchmark | Cases | Replacement% | Build% | Exec% |
| --- | --- | --- | --- | --- |
| ackermann | 2 | 100.00% | 50.00% | 50.00% |
| aes | 11 | 100.00% | 27.27% | 27.27% |
| anagram | 13 | 100.00% | 38.46% | 38.46% |
| audio-codec | 3 | 100.00% | 33.33% | 33.33% |
| avl-tree | 15 | 100.00% | 13.33% | 13.33% |
| banner | 1 | 100.00% | 0.00% | 0.00% |
| bit-kernels | 3 | 100.00% | 66.67% | 66.67% |
| blake2b | 3 | 100.00% | 0.00% | 0.00% |
| bloom-filter | 4 | 100.00% | 25.00% | 25.00% |
| boyer-moore-search | 3 | 100.00% | 0.00% | 0.00% |
| bubble-sort | 3 | 100.00% | 100.00% | 100.00% |
| c-interp | 10 | 100.00% | 40.00% | 40.00% |
| ccmac | 1 | 100.00% | 0.00% | 0.00% |
| checkers | 13 | 100.00% | 61.54% | 61.54% |
| cipher | 3 | 100.00% | 33.33% | 0.00% |
| congrad | 1 | 100.00% | 0.00% | 0.00% |
| connect4-minimax | 11 | 100.00% | 45.45% | 45.45% |
| convex-hull | 4 | 100.00% | 50.00% | 50.00% |
| dhrystone | 5 | 100.00% | 40.00% | 40.00% |
| distinctness | 2 | 100.00% | 0.00% | 0.00% |
| fft-int | 4 | 100.00% | 0.00% | 0.00% |
| flood-fill | 2 | 100.00% | 50.00% | 50.00% |
| frac-calc | 9 | 100.00% | 22.22% | 22.22% |
| fuzzy-match | 3 | 100.00% | 33.33% | 33.33% |
| fy-shuffle | 3 | 100.00% | 33.33% | 33.33% |
| gcd-list | 2 | 100.00% | 0.00% | 0.00% |
| grad-descent | 4 | 100.00% | 0.00% | 0.00% |
| graph-tests | 19 | 100.00% | 5.26% | 5.26% |
| hanoi | 1 | 100.00% | 0.00% | 0.00% |
| heapsort | 2 | 100.00% | 0.00% | 0.00% |
| heat-calc | 1 | 100.00% | 0.00% | 0.00% |
| huff-encode | 12 | 100.00% | 83.33% | 83.33% |
| idct-alg | 3 | 100.00% | 66.67% | 33.33% |
| indirect-test | 2 | 100.00% | 50.00% | 50.00% |
| k-means | 5 | 100.00% | 0.00% | 0.00% |
| kadane | 2 | 100.00% | 50.00% | 50.00% |
| kepler | 7 | 100.00% | 14.29% | 14.29% |
| knapsack | 3 | 100.00% | 33.33% | 33.33% |
| knights-tour | 3 | 100.00% | 33.33% | 33.33% |
| life | 14 | 100.00% | 21.43% | 14.29% |
| longdiv | 7 | 100.00% | 71.43% | 71.43% |
| lu-decomp | 3 | 100.00% | 33.33% | 33.33% |
| lz-compress | 2 | 100.00% | 100.00% | 100.00% |
| mandelbrot | 1 | 100.00% | 0.00% | 0.00% |
| max-subseq | 2 | 100.00% | 0.00% | 0.00% |
| mersenne | 4 | 100.00% | 0.00% | 0.00% |
| minspan | 8 | 100.00% | 25.00% | 25.00% |
| monte-carlo | 1 | 100.00% | 0.00% | 0.00% |
| murmur-hash | 2 | 100.00% | 0.00% | 0.00% |
| n-queens | 3 | 100.00% | 66.67% | 66.67% |
| natlog | 1 | 100.00% | 0.00% | 0.00% |
| nbody-sim | 1 | 100.00% | 0.00% | 0.00% |
| nr-solver | 1 | 100.00% | 100.00% | 100.00% |
| packet-filter | 4 | 100.00% | 0.00% | 0.00% |
| parrondo | 2 | 100.00% | 50.00% | 50.00% |
| pascal | 3 | 100.00% | 66.67% | 66.67% |
| pi-calc | 1 | 100.00% | 0.00% | 0.00% |
| primal-test | 3 | 100.00% | 66.67% | 66.67% |
| priority-queue | 5 | 100.00% | 40.00% | 40.00% |
| qsort-demo | 7 | 100.00% | 28.57% | 28.57% |
| qsort-test | 5 | 100.00% | 80.00% | 80.00% |
| quaternions | 4 | 100.00% | 0.00% | 0.00% |
| rabinkarp-search | 2 | 100.00% | 0.00% | 0.00% |
| rand-test | 3 | 100.00% | 0.00% | 0.00% |
| ransac | 2 | 100.00% | 50.00% | 0.00% |
| regex-parser | 8 | 100.00% | 25.00% | 25.00% |
| rho-factor | 1 | 100.00% | 100.00% | 100.00% |
| rle-compress | 2 | 100.00% | 0.00% | 0.00% |
| rsa-cipher | 4 | 100.00% | 0.00% | 0.00% |
| sat-solver | 5 | 100.00% | 60.00% | 40.00% |
| shortest-path | 3 | 100.00% | 33.33% | 33.33% |
| sieve | 1 | 100.00% | 0.00% | 0.00% |
| simple-grep | 1 | 100.00% | 0.00% | 0.00% |
| spelt2num | 1 | 100.00% | 0.00% | 0.00% |
| spirograph | 2 | 100.00% | 50.00% | 0.00% |
| sudoku-solver | 4 | 100.00% | 75.00% | 75.00% |
| tetris-sim | 12 | 100.00% | 58.33% | 50.00% |
| tiny-NN | 4 | 100.00% | 25.00% | 25.00% |
| topo-sort | 7 | 100.00% | 0.00% | 0.00% |
| totient | 2 | 100.00% | 50.00% | 50.00% |
| transcend | 1 | 100.00% | 0.00% | 0.00% |
| uniquify | 1 | 100.00% | 0.00% | 0.00% |
| vectors-3d | 8 | 100.00% | 12.50% | 0.00% |
| verlet | 1 | 100.00% | 0.00% | 0.00% |
| weekday | 2 | 100.00% | 0.00% | 0.00% |
|
||||
|
||||
## Compilation Failures
|
||||
- ackermann/ackermann.c::main@0x1100
|
||||
- aes/aes.c::add_round_key@0x1810
|
||||
- aes/aes.c::aes_decrypt@0x2760
|
||||
- aes/aes.c::aes_encrypt@0x2200
|
||||
- aes/aes.c::inv_shift_rows@0x1af0
|
||||
- aes/aes.c::key_expansion@0x1ff0
|
||||
- aes/aes.c::main@0x1100
|
||||
- aes/aes.c::mix_columns@0x1bd0
|
||||
- aes/aes.c::shift_rows@0x1a30
|
||||
- anagram/anagram.c::BuildMask@0x1620
|
||||
- anagram/anagram.c::BuildWord@0x1940
|
||||
- anagram/anagram.c::DumpCandidates@0x1c10
|
||||
- anagram/anagram.c::DumpWords@0x1ca0
|
||||
- anagram/anagram.c::FindAnagram@0x1d00
|
||||
- anagram/anagram.c::ReadDict@0x14c0
|
||||
- anagram/anagram.c::SortCandidates@0x1f10
|
||||
- anagram/anagram.c::main@0x1120
|
||||
- audio-codec/audio-codec.c::decode@0x1590
|
||||
- audio-codec/audio-codec.c::main@0x1100
|
||||
- avl-tree/avlcore.c::CheckTreeNodeRotation@0x1c50
|
||||
- avl-tree/element.c::Compare@0x1af0
|
||||
- avl-tree/avlcore.c::DeleteByElement@0x2e50
|
||||
- avl-tree/avlcore.c::DeleteByElementRecursive@0x2bf0
|
||||
- avl-tree/avlcore.c::DeleteLeftMost@0x2720
|
||||
- avl-tree/avlcore.c::DoubleLeftRotation@0x1c20
|
||||
- avl-tree/avlcore.c::DoubleRightRotation@0x1bf0
|
||||
- avl-tree/avlcore.c::FindByElement@0x1b20
|
||||
- avl-tree/avlcore.c::Insert@0x1f40
|
||||
- avl-tree/avlcore.c::InsertNode@0x1e10
|
||||
- avl-tree/avlcore.c::MakeEmpty@0x2090
|
||||
- avl-tree/avl-tree.c::breadth@0x1780
|
||||
- avl-tree/avl-tree.c::main@0x1120
|
||||
- banner/banner.c::main@0x1120
|
||||
- bit-kernels/bit-kernels.c::main@0x1120
|
||||
- blake2b/blake2b.c::F@0x12e0
|
||||
- blake2b/blake2b.c::blake2b@0x17b0
|
||||
- blake2b/blake2b.c::test@0x1b50
|
||||
- bloom-filter/bloom-filter.c::bad_search@0x1450
|
||||
- bloom-filter/tinybloom.c::bfilter_intersect@0x1570
|
||||
- bloom-filter/bloom-filter.c::main@0x1120
|
||||
- boyer-moore-search/boyer-moore-search.c::badCharHeuristic@0x15d0
|
||||
- boyer-moore-search/boyer-moore-search.c::main@0x1140
|
||||
- boyer-moore-search/boyer-moore-search.c::search@0x1630
|
||||
- c-interp/c-interp.c::enum_declaration@0x34f0
|
||||
- c-interp/c-interp.c::eval@0x3ea0
|
||||
- c-interp/c-interp.c::function_body@0x37f0
|
||||
- c-interp/c-interp.c::function_declaration@0x3a10
|
||||
- c-interp/c-interp.c::main@0x1120
|
||||
- c-interp/c-interp.c::next@0x15a0
|
||||
- ccmac/ccmac.c::main@0x1120
|
||||
- checkers/functions.c::fill_print_initial@0x18e0
|
||||
- checkers/functions.c::free_tree@0x6210
|
||||
- checkers/functions.c::generate_node_children@0x35d0
|
||||
- checkers/functions.c::link_new_node@0x34c0
|
||||
- checkers/checkers.c::main@0x1130
|
||||
- cipher/cipher.c::encipher@0x12f0
|
||||
- cipher/cipher.c::main@0x1100
|
||||
- congrad/congrad.c::main@0x1100
|
||||
- connect4-minimax/connect4-minimax.c::board_full@0x1500
|
||||
- connect4-minimax/connect4-minimax.c::evaluate_window@0x2380
|
||||
- connect4-minimax/connect4-minimax.c::init_board@0x1230
|
||||
- connect4-minimax/connect4-minimax.c::main@0x1100
|
||||
- connect4-minimax/connect4-minimax.c::minimax@0x3c30
|
||||
- connect4-minimax/connect4-minimax.c::play_game@0x4260
|
||||
- convex-hull/convex-hull.c::main@0x1100
|
||||
- convex-hull/convex-hull.c::sortPoints@0x1740
|
||||
- dhrystone/dhrystone.c::PFunc_1@0x1980
|
||||
- dhrystone/dhrystone.c::PProc_8@0x1910
|
||||
- dhrystone/dhrystone.c::main@0x1100
|
||||
- distinctness/distinctness.c::isDistinct@0x12a0
|
||||
- distinctness/distinctness.c::main@0x1100
|
||||
- fft-int/fft-int.c::db_from_ampl@0x1c50
|
||||
- fft-int/fft-int.c::fix_fft@0x1370
|
||||
- fft-int/fft-int.c::fix_loud@0x1a90
|
||||
- fft-int/fft-int.c::window@0x1650
|
||||
- flood-fill/flood-fill.c::main@0x1100
|
||||
- frac-calc/frac-calc.c::avaliatokens@0x1730
|
||||
- frac-calc/frac-calc.c::copyr@0x1550
|
||||
- frac-calc/frac-calc.c::divtokens@0x1980
|
||||
- frac-calc/frac-calc.c::help@0x14a0
|
||||
- frac-calc/frac-calc.c::main@0x1120
|
||||
- frac-calc/frac-calc.c::misto@0x1610
|
||||
- frac-calc/frac-calc.c::simplifica@0x28f0
|
||||
- fuzzy-match/fuzzy-match.c::fuzzy_match_recurse@0x23e0
|
||||
- fuzzy-match/fuzzy-match.c::main@0x2100
|
||||
- fy-shuffle/fy-shuffle.c::fy_shuffle@0x1440
|
||||
- fy-shuffle/fy-shuffle.c::main@0x1100
|
||||
- gcd-list/gcd-list.c::gcd@0x1310
|
||||
- gcd-list/gcd-list.c::main@0x1120
|
||||
- grad-descent/grad-descent.c::derivateWRTBias@0x12e0
|
||||
- grad-descent/grad-descent.c::derivateWRTWeight@0x1270
|
||||
- grad-descent/grad-descent.c::gradientDescent@0x1350
|
||||
- grad-descent/grad-descent.c::main@0x1100
|
||||
- graph-tests/graph-tests.c::DFS_test@0x2340
|
||||
- graph-tests/graph-tests.c::addEdge@0x1610
|
||||
- graph-tests/graph-tests.c::addVertex@0x1f80
|
||||
- graph-tests/graph-tests.c::bfs@0x1830
|
||||
- graph-tests/graph-tests.c::bfs_test@0x1a70
|
||||
- graph-tests/graph-tests.c::bubbleSort@0x1db0
|
||||
- graph-tests/graph-tests.c::createGraph@0x1550
|
||||
- graph-tests/graph-tests.c::createNode@0x1530
|
||||
- graph-tests/graph-tests.c::createQueue@0x1680
|
||||
- graph-tests/graph-tests.c::depthFirstSearch@0x2110
|
||||
- graph-tests/graph-tests.c::dequeue@0x1720
|
||||
- graph-tests/graph-tests.c::enqueue@0x16d0
|
||||
- graph-tests/graph-tests.c::insertAtTheBegin@0x1d70
|
||||
- graph-tests/graph-tests.c::link_list@0x1e20
|
||||
- graph-tests/graph-tests.c::main@0x1180
|
||||
- graph-tests/graph-tests.c::printQueue@0x17b0
|
||||
- graph-tests/graph-tests.c::swap@0x1da0
|
||||
- graph-tests/graph-tests.c::towers@0x2490
|
||||
- hanoi/hanoi.c::main@0x1100
|
||||
- heapsort/heapsort.c::HSORT@0x12f0
|
||||
- heapsort/heapsort.c::main@0x11a0
|
||||
- heat-calc/heat-calc.c::main@0x1100
|
||||
- huff-encode/huff-encode.c::buildHuffmanTree@0x18b0
|
||||
- huff-encode/huff-encode.c::main@0x1120
|
||||
- idct-alg/idct-alg.c::main@0x1100
|
||||
- indirect-test/indirect-test.c::main@0x1100
|
||||
- k-means/k-means.c::calculateCentroid@0x1390
|
||||
- k-means/k-means.c::calculateNearst@0x1310
|
||||
- k-means/k-means.c::kMeans@0x1400
|
||||
- k-means/k-means.c::main@0x1120
|
||||
- k-means/k-means.c::printEPS@0x16c0
|
||||
- kadane/kadane.c::main@0x1100
|
||||
- kepler/kepler.c::J@0x1b80
|
||||
- kepler/kepler.c::bin_fact@0x1ad0
|
||||
- kepler/kepler.c::binary@0x16a0
|
||||
- kepler/kepler.c::e_series@0x1740
|
||||
- kepler/kepler.c::j_series@0x1920
|
||||
- kepler/kepler.c::main@0x1100
|
||||
- knapsack/knapsack.c::main@0x1100
|
||||
- knapsack/knapsack.c::max@0x1310
|
||||
- knights-tour/knights-tour.c::solveKT@0x1830
|
||||
- knights-tour/knights-tour.c::solveKTUtil@0x1980
|
||||
- life/life.c::getDown@0x1960
|
||||
- life/life.c::getDownLeft@0x19f0
|
||||
- life/life.c::getDownRight@0x1a20
|
||||
- life/life.c::getLeft@0x18d0
|
||||
- life/life.c::getNumNeigbors@0x16d0
|
||||
- life/life.c::getRight@0x1900
|
||||
- life/life.c::getUp@0x1930
|
||||
- life/life.c::getUpLeft@0x1990
|
||||
- life/life.c::getUpRight@0x19c0
|
||||
- life/life.c::main@0x1100
|
||||
- life/life.c::process@0x1430
|
||||
- longdiv/longdiv.c::main@0x1120
|
||||
- longdiv/longdiv.c::sub@0x1a80
|
||||
- lu-decomp/lu-decomp.c::main@0x1100
|
||||
- lu-decomp/lu-decomp.c::print_matrix@0x1320
|
||||
- mandelbrot/mandelbrot.c::main@0x1100
|
||||
- max-subseq/max-subseq.c::lcsAlgo@0x1290
|
||||
- max-subseq/max-subseq.c::main@0x1120
|
||||
- mersenne/mersenne.c::genrand@0x1380
|
||||
- mersenne/mersenne.c::lsgenrand@0x1320
|
||||
- mersenne/mersenne.c::main@0x1100
|
||||
- mersenne/mersenne.c::sgenrand@0x12d0
|
||||
- minspan/minspan.c::displayGraph@0x1db0
|
||||
- minspan/minspan.c::displayGraph1@0x1ee0
|
||||
- minspan/minspan.c::displayPath@0x2020
|
||||
- minspan/minspan.c::displayTree@0x20c0
|
||||
- minspan/minspan.c::main@0x1100
|
||||
- minspan/minspan.c::minSpanTree@0x1400
|
||||
- monte-carlo/monte-carlo.c::main@0x1100
|
||||
- murmur-hash/murmur-hash.c::main@0x1100
|
||||
- murmur-hash/murmur-hash.c::murmurhash@0x1290
|
||||
- n-queens/n-queens.c::main@0x1120
|
||||
- natlog/natlog.c::main@0x1100
|
||||
- nbody-sim/nbody-sim.c::main@0x1100
|
||||
- packet-filter/packet-filter.c::check_packet_filter@0x1520
|
||||
- packet-filter/packet-filter.c::generate_packet@0x13d0
|
||||
- packet-filter/packet-filter.c::main@0x1100
|
||||
- packet-filter/packet-filter.c::print_packet@0x1580
|
||||
- parrondo/parrondo.c::main@0x1100
|
||||
- pascal/pascal.c::main@0x1100
|
||||
- pi-calc/pi-calc.c::main@0x1100
|
||||
- primal-test/primal-test.c::main@0x1100
|
||||
- priority-queue/priority-queue.c::main@0x1120
|
||||
- priority-queue/priority-queue.c::newNode@0x13a0
|
||||
- priority-queue/priority-queue.c::push@0x1420
|
||||
- qsort-demo/qsort-demo.c::main@0x1120
|
||||
- qsort-demo/qsort-demo.c::print_struct_array@0x15b0
|
||||
- qsort-demo/qsort-demo.c::sort_cstrings_example@0x1480
|
||||
- qsort-demo/qsort-demo.c::sort_integers_example@0x1310
|
||||
- qsort-demo/qsort-demo.c::sort_structs_example@0x1630
|
||||
- qsort-test/qsort-test.c::main@0x1120
|
||||
- quaternions/quaternions.c::euler_from_quat@0x1550
|
||||
- quaternions/quaternions.c::main@0x1100
|
||||
- quaternions/quaternions.c::quat_from_euler@0x13e0
|
||||
- quaternions/quaternions.c::quaternion_multiply@0x1670
|
||||
- rabinkarp-search/rabinkarp-search.c::main@0x1120
|
||||
- rabinkarp-search/rabinkarp-search.c::search@0x15a0
|
||||
- rand-test/rand-test.c::bad_rand@0x1240
|
||||
- rand-test/rand-test.c::main@0x1100
|
||||
- rand-test/rand-test.c::run_tests@0x1280
|
||||
- ransac/ransac.c::main@0x1100
|
||||
- regex-parser/regex-parser.c::main@0x2100
|
||||
- regex-parser/regex-parser.c::matchcharclass@0x2420
|
||||
- regex-parser/regex-parser.c::matchone@0x25c0
|
||||
- regex-parser/regex-parser.c::matchpattern@0x26d0
|
||||
- regex-parser/regex-parser.c::re_compile@0x2ac0
|
||||
- regex-parser/regex-parser.c::re_print@0x2e30
|
||||
- rle-compress/rle-compress.c::main@0x1120
|
||||
- rle-compress/rle-compress.c::run_length_encode@0x1330
|
||||
- rsa-cipher/rsa-cipher.c::main@0x1100
|
||||
- rsa-cipher/rsa-cipher.c::mod_inverse@0x15a0
|
||||
- rsa-cipher/rsa-cipher.c::mod_pow@0x14b0
|
||||
- rsa-cipher/rsa-cipher.c::print_hex_int128@0x16c0
|
||||
- sat-solver/sat-solver.c::main@0x1100
|
||||
- sat-solver/sat-solver.c::printFormula@0x1680
|
||||
- shortest-path/shortest-path.c::floydWarshall@0x1330
|
||||
- shortest-path/shortest-path.c::main@0x1100
|
||||
- sieve/sieve.c::main@0x1100
|
||||
- simple-grep/simple-grep.c::main@0x1120
|
||||
- spelt2num/spelt2num.c::main@0x1100
|
||||
- spirograph/spirograph.c::spirograph@0x1230
|
||||
- sudoku-solver/sudoku-solver.c::main@0x1100
|
||||
- tetris-sim/tetris-sim.c::aggregate_height@0x1b20
|
||||
- tetris-sim/tetris-sim.c::best_move@0x21d0
|
||||
- tetris-sim/tetris-sim.c::count_holes@0x1b70
|
||||
- tetris-sim/tetris-sim.c::evaluate_board@0x1ca0
|
||||
- tetris-sim/tetris-sim.c::main@0x1100
|
||||
- tiny-NN/tiny-NN.c::main@0x1120
|
||||
- tiny-NN/tiny-NN.c::sampleSine@0x12d0
|
||||
- tiny-NN/tiny-NN.c::train@0x13e0
|
||||
- topo-sort/topo-sort.c::addEdge@0x13f0
|
||||
- topo-sort/topo-sort.c::createGraph@0x1380
|
||||
- topo-sort/topo-sort.c::createListNode@0x1360
|
||||
- topo-sort/topo-sort.c::createStackNode@0x1340
|
||||
- topo-sort/topo-sort.c::main@0x1120
|
||||
- topo-sort/topo-sort.c::topologicalSort@0x18b0
|
||||
- topo-sort/topo-sort.c::topologicalSortUtil@0x1440
|
||||
- totient/totient.c::main@0x1100
|
||||
- transcend/transcend.c::main@0x1120
|
||||
- uniquify/uniquify.c::main@0x1120
|
||||
- vectors-3d/vectors-3d.c::get_cross_matrix@0x1850
|
||||
- vectors-3d/vectors-3d.c::main@0x1100
|
||||
- vectors-3d/vectors-3d.c::print_vector@0x1730
|
||||
- vectors-3d/vectors-3d.c::unit_vec@0x17a0
|
||||
- vectors-3d/vectors-3d.c::vector_add@0x1650
|
||||
- vectors-3d/vectors-3d.c::vector_prod@0x16b0
|
||||
- vectors-3d/vectors-3d.c::vector_sub@0x1620
|
||||
- verlet/verlet.c::main@0x1100
|
||||
- weekday/weekday.c::dayOfWeek@0x1290
|
||||
- weekday/weekday.c::main@0x1100
|
||||
|
||||
## Execution Failures
|
||||
- cipher/cipher.c::decipher@0x1360
|
||||
- idct-alg/idct-alg.c::idct_2d@0x12f0
|
||||
- life/life.c::init@0x12c0
|
||||
- ransac/ransac.c::ransac_line_fitting@0x1410
|
||||
- sat-solver/sat-solver.c::solveSAT@0x13a0
|
||||
- spirograph/spirograph.c::test@0x1390
|
||||
- tetris-sim/tetris-sim.c::clear_lines@0x19a0
|
||||
- vectors-3d/vectors-3d.c::get_angle@0x18c0
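Each failure entry in these reports appears to follow the shape `<benchmark>/<file>::<function>@<address>`. A minimal sketch of splitting entries into their parts and counting failures per benchmark, assuming that shape holds for every line (the regex and the helper `parse_failures` are illustrative, not part of the evaluation scripts):

```python
import re
from collections import Counter

# Assumed entry shape, inferred from the lists above:
#   - <benchmark>/<file>::<function>@0x<hex address>
ENTRY = re.compile(
    r"^-\s*(?P<bench>[^/]+)/(?P<file>[^:]+)::(?P<func>\w+)@(?P<addr>0x[0-9a-fA-F]+)$"
)


def parse_failures(lines):
    """Yield (benchmark, file, function, address) for each well-formed entry."""
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            yield (
                match.group("bench"),
                match.group("file"),
                match.group("func"),
                int(match.group("addr"), 16),
            )


# A few entries copied from the O3 "Execution Failures" list.
execution_failures = [
    "- cipher/cipher.c::decipher@0x1360",
    "- idct-alg/idct-alg.c::idct_2d@0x12f0",
    "- life/life.c::init@0x12c0",
    "- tetris-sim/tetris-sim.c::clear_lines@0x19a0",
]

per_benchmark = Counter(bench for bench, _, _, _ in parse_failures(execution_failures))
print(per_benchmark)
```

Lines that do not match the assumed shape are skipped silently, so a count lower than expected is a hint that the format differs.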
|
||||
493
sk2decompile/evaluation/bringupbench/scripts/build-func-maps.py
Normal file
|
|
@ -0,0 +1,493 @@
|
|||
#!/usr/bin/env python3
|
||||
"""Generate function-level mappings across source, pseudo, and assembly outputs."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Dict, List, Optional
|
||||
import subprocess
|
||||
|
||||
FUNC_KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof", "do", "case", "else"}
|
||||
|
||||
TYPEDEF_MAP = {
|
||||
"cpu_set_t": "int",
|
||||
"nl_item": "int",
|
||||
"__time_t": "int",
|
||||
"__mode_t": "unsigned short",
|
||||
"__off64_t": "long long",
|
||||
"__blksize_t": "long",
|
||||
"__ino_t": "unsigned long",
|
||||
"__blkcnt_t": "unsigned long long",
|
||||
"__syscall_slong_t": "long",
|
||||
"__ssize_t": "long int",
|
||||
"wchar_t": "unsigned short int",
|
||||
"wctype_t": "unsigned short int",
|
||||
"__int64": "long long",
|
||||
"__int32": "int",
|
||||
"__int16": "short",
|
||||
"__int8": "char",
|
||||
"_QWORD": "uint64_t",
|
||||
"_OWORD": "long double",
|
||||
"_DWORD": "uint32_t",
|
||||
"size_t": "unsigned int",
|
||||
"_BYTE": "uint8_t",
|
||||
"_TBYTE": "uint16_t",
|
||||
"_BOOL8": "uint8_t",
|
||||
"gcc_va_list": "va_list",
|
||||
"_WORD": "unsigned short",
|
||||
"_BOOL4": "int",
|
||||
"__va_list_tag": "va_list",
|
||||
"_IO_FILE": "FILE",
|
||||
"DIR": "int",
|
||||
"__fsword_t": "long",
|
||||
"__kernel_ulong_t": "int",
|
||||
"cc_t": "int",
|
||||
"speed_t": "int",
|
||||
"fd_set": "int",
|
||||
"__suseconds_t": "int",
|
||||
"_UNKNOWN": "void",
|
||||
"__sighandler_t": "void (*)(int)",
|
||||
"__compar_fn_t": "int (*)(const void *, const void *)",
|
||||
}
|
||||
|
||||
|
||||
def _load_config_env() -> dict:
    """Load config.env from the eval project root."""
    eval_root = Path(__file__).resolve().parents[1]
    config_path = eval_root / "config.env"
    config = {}
    if config_path.exists():
        for line in config_path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            if "=" in line:
                key, _, value = line.partition("=")
                value = value.strip()
                # The shell scripts `source` this file, so drop one pair of
                # matching quotes to see the same value the shell would.
                if len(value) >= 2 and value[0] == value[-1] and value[0] in ('"', "'"):
                    value = value[1:-1]
                config[key.strip()] = value
    return config
|
||||
|
||||
|
||||
def _get_bench_root(cli_value: str | None = None) -> Path:
|
||||
"""Resolve the benchmark repo root from CLI arg, env var, or config.env."""
|
||||
if cli_value:
|
||||
return Path(cli_value).resolve()
|
||||
env_val = os.environ.get("BENCH_REPO_ROOT")
|
||||
if env_val:
|
||||
return Path(env_val).resolve()
|
||||
config = _load_config_env()
|
||||
if "BENCH_REPO_ROOT" in config:
|
||||
return Path(config["BENCH_REPO_ROOT"]).resolve()
|
||||
sys.exit("error: BENCH_REPO_ROOT not set. Use --bench-root, set the env var, or configure config.env")
|
||||
|
||||
|
||||
def _read_text(path: Path) -> str:
|
||||
return path.read_text(encoding="utf-8")
|
||||
|
||||
|
||||
def _strip_empty(code: str) -> str:
|
||||
return "\n".join(line for line in code.splitlines() if line.strip())
|
||||
|
||||
|
||||
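def _good_func(func: str) -> bool:
    """Accept functions whose body has more than 3 and fewer than 300 substantive lines."""
    # Everything after the first "{" is treated as the body.
    body = func.split("{", 1)[1] if "{" in func else func
    total = 0
    for line in body.splitlines():
        # Skip blank and near-empty lines such as a lone "}".
        if len(line.strip()) >= 3:
            total += 1
    return 3 < total < 300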
|
||||
|
||||
|
||||
def _format_with_clang(func: str, style: str = "Google") -> Optional[str]:
    """Format C code with clang-format; return None if formatting fails."""
    if not func:
        return None
    cmd = ["clang-format", f"--style={style}"]
    try:
        proc = subprocess.run(
            cmd,
            input=func,
            text=True,
            capture_output=True,
            check=True,
            timeout=15,
        )
        return proc.stdout
    except Exception as e:
        # Covers a missing clang-format binary, a non-zero exit, and a timeout.
        # Report on stderr so the per-binary status lines on stdout stay clean.
        print(f"[warn] clang-format failed: {e}", file=sys.stderr)
        return None
|
||||
|
||||
|
||||
def _hex_to_dec(text: str) -> str:
|
||||
pattern = re.compile(r"\b(0x[0-9a-fA-F]+)([uUlL]{1,3})?\b")
|
||||
|
||||
def convert(match: re.Match[str]) -> str:
|
||||
hex_part = match.group(1)
|
||||
suffix = match.group(2) or ""
|
||||
return str(int(hex_part, 16)) + suffix
|
||||
|
||||
return pattern.sub(convert, text)
|
||||
|
||||
|
||||
def _remove_keywords(text: str) -> str:
|
||||
patterns = [
|
||||
r"\b__fastcall\b",
|
||||
r"\b__cdecl\b",
|
||||
r"\b__ptr32\b",
|
||||
r"\b__noreturn\s+noreturn\b",
|
||||
]
|
||||
combined = re.compile("|".join(patterns))
|
||||
return combined.sub("", text)
|
||||
|
||||
def _replace_typedefs(text: str) -> str:
|
||||
for alias, original in TYPEDEF_MAP.items():
|
||||
pattern = re.compile(rf"\b{re.escape(alias)}\b")
|
||||
text = pattern.sub(original, text)
|
||||
return text
|
||||
|
||||
|
||||
def _remove_comments(text: str) -> str:
|
||||
text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)
|
||||
text = re.sub(r"//.*?$", "", text, flags=re.MULTILINE)
|
||||
return text
|
||||
|
||||
|
||||
def _process_code(code_str: str) -> str:
|
||||
code_str = _remove_comments(code_str)
|
||||
code_str = _hex_to_dec(code_str)
|
||||
code_str = _remove_keywords(code_str)
|
||||
code_str = _replace_typedefs(code_str)
|
||||
return code_str
|
||||
|
||||
|
||||
def _normalize_pseudo(text: str) -> str:
|
||||
processed = _process_code(text)
|
||||
if not processed.strip():
|
||||
return ""
|
||||
formatted = _format_with_clang(processed)
|
||||
if formatted is None:
|
||||
return ""
|
||||
cleaned = _strip_empty(formatted)
|
||||
if not cleaned or not _good_func(cleaned):
|
||||
return ""
|
||||
return cleaned
|
||||
|
||||
|
||||
def _strip_comments_and_strings(text: str) -> str:
    """Blank out comments and string/char literals while preserving offsets."""
    result = list(text)
    i = 0
    length = len(text)
    while i < length:
        nxt = text[i : i + 2]
        ch = text[i]
        if nxt == "//":
            # Line comment: blank up to, but not including, the newline.
            end = text.find("\n", i)
            if end == -1:
                end = length
            for j in range(i, end):
                result[j] = " "
            i = end
            continue
        if nxt == "/*":
            # Block comment: blank through the closing "*/" (or to end of text).
            end = text.find("*/", i + 2)
            if end == -1:
                end = length - 2
            for j in range(i, end + 2):
                result[j] = " "
            i = end + 2
            continue
        if ch in {'"', "'"}:
            quote = ch
            result[i] = " "
            i += 1
            while i < length:
                c = text[i]
                result[i] = " "
                if c == "\\":
                    # Blank the escaped character too (e.g. the quote in \").
                    if i + 1 < length:
                        result[i + 1] = " "
                    i += 2
                    continue
                if c == quote:
                    i += 1
                    break
                i += 1
            continue
        i += 1
    return "".join(result)
|
||||
|
||||
def _find_matching_brace(text: str, start_idx: int) -> int:
|
||||
depth = 0
|
||||
i = start_idx
|
||||
length = len(text)
|
||||
while i < length:
|
||||
nxt = text[i : i + 2]
|
||||
ch = text[i]
|
||||
if nxt == "//":
|
||||
i = text.find("\n", i)
|
||||
if i == -1:
|
||||
return length - 1
|
||||
continue
|
||||
if nxt == "/*":
|
||||
i = text.find("*/", i + 2)
|
||||
if i == -1:
|
||||
return length - 1
|
||||
i += 2
|
||||
continue
|
||||
if ch in {'"', "'"}:
|
||||
quote = ch
|
||||
i += 1
|
||||
while i < length:
|
||||
c = text[i]
|
||||
if c == "\\":
|
||||
i += 2
|
||||
continue
|
||||
if c == quote:
|
||||
i += 1
|
||||
break
|
||||
i += 1
|
||||
continue
|
||||
if ch == "{":
|
||||
depth += 1
|
||||
elif ch == "}":
|
||||
depth -= 1
|
||||
if depth == 0:
|
||||
return i
|
||||
i += 1
|
||||
return length - 1
|
||||
|
||||
|
||||
def _extract_source_functions(path: Path, repo_root: Path) -> Dict[str, Dict[str, str]]:
|
||||
text = _read_text(path)
|
||||
sanitized = _strip_comments_and_strings(text)
|
||||
pattern = re.compile(
|
||||
r"(?P<prefix>^|[;\n}])(?P<signature>[^{;}]*?)\b(?P<name>[A-Za-z_][\w]*)\s*\([^;{}]*\)\s*\{",
|
||||
re.MULTILINE,
|
||||
)
|
||||
funcs: Dict[str, Dict[str, str]] = {}
|
||||
for match in pattern.finditer(sanitized):
|
||||
name = match.group("name")
|
||||
if name in FUNC_KEYWORDS:
|
||||
continue
|
||||
brace_idx = sanitized.find("{", match.start("signature"))
|
||||
if brace_idx == -1:
|
||||
continue
|
||||
end_idx = _find_matching_brace(text, brace_idx)
|
||||
if end_idx <= brace_idx:
|
||||
continue
|
||||
start_idx = match.start("signature")
|
||||
content = text[start_idx : end_idx + 1].strip("\n") + "\n"
|
||||
funcs.setdefault(
|
||||
name,
|
||||
{
|
||||
"path": str(path.relative_to(repo_root)),
|
||||
"function_name": name,
|
||||
"content": content,
|
||||
},
|
||||
)
|
||||
return funcs
|
||||
|
||||
def _parse_makefile(makefile: Path) -> List[Path]:
    """Return the C sources implied by PROG / LOCAL_OBJS in a benchmark Makefile."""
    text = _read_text(makefile)
    prog_match = re.search(r"^PROG\s*=\s*(\S+)", text, flags=re.MULTILINE)
    if not prog_match:
        raise RuntimeError(f"PROG not found in {makefile}")
    prog = prog_match.group(1).strip()
    objs_match = re.search(r"^LOCAL_OBJS\s*=\s*(.*)$", text, flags=re.MULTILINE)
    obj_tokens: List[str] = []
    if objs_match:
        obj_tokens = [token for token in objs_match.group(1).split() if token]
    if not obj_tokens:
        obj_tokens = [f"{prog}.o"]
    src_paths: List[Path] = []
    for token in obj_tokens:
        if not token.endswith(".o"):
            continue
        # Swap only the trailing ".o"; str.replace would also rewrite an
        # earlier ".o" inside the file name.
        candidate = makefile.parent / (token[: -len(".o")] + ".c")
        if candidate.exists():
            src_paths.append(candidate)
    if not src_paths:
        fallback = makefile.parent / f"{prog}.c"
        if fallback.exists():
            src_paths.append(fallback)
    return src_paths
|
||||
|
||||
|
||||
def _collect_source_functions(bench_dir: Path, repo_root: Path) -> Dict[str, Dict[str, str]]:
|
||||
makefile = bench_dir / "Makefile"
|
||||
srcs = _parse_makefile(makefile)
|
||||
func_map: Dict[str, Dict[str, str]] = {}
|
||||
for src in srcs:
|
||||
func_map.update(_extract_source_functions(src, repo_root))
|
||||
return func_map
|
||||
|
||||
|
||||
def _parse_pseudo(pseudo_path: Path, repo_root: Path) -> Dict[str, Dict[str, str]]:
|
||||
text = _read_text(pseudo_path)
|
||||
lines = text.splitlines()
|
||||
pattern = re.compile(r"^/\*\s*(?P<name>[^@]+?)\s*@\s*(?P<addr>0x[0-9a-fA-F]+)\s*\*/$")
|
||||
current: Optional[str] = None
|
||||
current_addr: Optional[str] = None
|
||||
buffer: List[str] = []
|
||||
out: Dict[str, Dict[str, str]] = {}
|
||||
for raw_line in lines:
|
||||
line = raw_line.strip()
|
||||
match = pattern.match(line)
|
||||
if match:
|
||||
if current and buffer:
|
||||
content = "\n".join(buffer).strip("\n") + "\n"
|
||||
out.setdefault(
|
||||
current,
|
||||
{
|
||||
"path": str(pseudo_path.relative_to(repo_root)),
|
||||
"function_name": current,
|
||||
"address": current_addr,
|
||||
"label": current,
|
||||
"content": content,
|
||||
},
|
||||
)
|
||||
current = match.group("name").strip()
|
||||
current_addr = match.group("addr")
|
||||
buffer = []
|
||||
else:
|
||||
if current is not None:
|
||||
buffer.append(raw_line)
|
||||
if current and buffer:
|
||||
content = "\n".join(buffer).strip("\n") + "\n"
|
||||
out.setdefault(
|
||||
current,
|
||||
{
|
||||
"path": str(pseudo_path.relative_to(repo_root)),
|
||||
"function_name": current,
|
||||
"address": current_addr,
|
||||
"label": current,
|
||||
"content": content,
|
||||
},
|
||||
)
|
||||
return out
|
||||
|
||||
def _clean_instruction(raw: str) -> Optional[str]:
|
||||
stripped = raw.strip()
|
||||
if not stripped:
|
||||
return None
|
||||
parts = raw.split("\t")
|
||||
if len(parts) >= 3:
|
||||
relevant = parts[2:]
|
||||
elif len(parts) == 2:
|
||||
relevant = parts[1:]
|
||||
else:
|
||||
relevant = [stripped]
|
||||
instr = "\t".join(relevant)
|
||||
instr = instr.split("#")[0].strip()
|
||||
if not instr:
|
||||
return None
|
||||
if all(c in "0123456789abcdefABCDEF" for c in instr.replace(" ", "")):
|
||||
return None
|
||||
return instr
|
||||
|
||||
|
||||
def _clean_asm_block(name: str, lines: List[str]) -> str:
|
||||
cleaned = [f"<{name}>:"]
|
||||
for raw in lines[1:]:
|
||||
instr = _clean_instruction(raw)
|
||||
if instr:
|
||||
cleaned.append(instr)
|
||||
return "\n".join(cleaned) + "\n"
|
||||
|
||||
|
||||
def _parse_assembly(asm_path: Path) -> Dict[str, str]:
|
||||
lines = _read_text(asm_path).splitlines()
|
||||
header = re.compile(r"^\s*([0-9a-fA-F]+)\s+<([^>]+)>:\s*$")
|
||||
current: Optional[str] = None
|
||||
buffer: List[str] = []
|
||||
result: Dict[str, str] = {}
|
||||
for line in lines:
|
||||
match = header.match(line)
|
||||
if match:
|
||||
if current and buffer:
|
||||
result.setdefault(current, _clean_asm_block(current, buffer))
|
||||
current = match.group(2)
|
||||
buffer = [line]
|
||||
else:
|
||||
if current is not None:
|
||||
buffer.append(line)
|
||||
if current and buffer:
|
||||
result.setdefault(current, _clean_asm_block(current, buffer))
|
||||
return result
|
||||
|
||||
|
||||
def _discover_binaries(explicit: Optional[List[str]], repo_root: Path) -> List[Path]:
|
||||
if explicit:
|
||||
binaries: List[Path] = []
|
||||
for entry in explicit:
|
||||
candidate = Path(entry)
|
||||
            if not candidate.is_absolute():
                candidate = repo_root / candidate
            if candidate.exists():
                binaries.append(candidate)
        return binaries
    matches = []
    for path in repo_root.rglob("*.O*"):
        suffix = path.suffix.lower()
        if suffix in {".o0", ".o1", ".o2", ".o3"}:
            matches.append(path)
    return sorted(matches)

def _build_map(binary: Path, repo_root: Path) -> None:
    pseudo_path = Path(str(binary) + ".pseudo")
    asm_path = Path(str(binary) + ".s")
    if not pseudo_path.exists() or not asm_path.exists():
        print(f"[skip] Missing pseudo or assembly for {binary.relative_to(repo_root)}")
        return
    bench_dir = binary.parent
    source_funcs = _collect_source_functions(bench_dir, repo_root)
    pseudo_funcs = _parse_pseudo(pseudo_path, repo_root)
    asm_funcs = _parse_assembly(asm_path)
    common = sorted(set(source_funcs) & set(pseudo_funcs) & set(asm_funcs))
    if not common:
        print(f"[warn] No overlapping functions for {binary.relative_to(repo_root)}")
        return
    output_path = Path(str(binary) + ".func_map.jsonl")
    rel_binary = str(binary.relative_to(repo_root))
    with output_path.open("w", encoding="utf-8") as handle:
        for name in common:
            pseudo_entry = pseudo_funcs[name]
            pseudo_norm = _normalize_pseudo(pseudo_entry.get("content", ""))
            record = {
                "source": source_funcs[name],
                "pseudo": pseudo_entry,
                "pseudo_normalize": pseudo_norm,
                "binary": rel_binary,
                "assembly": asm_funcs[name],
            }
            handle.write(json.dumps(record, ensure_ascii=False))
            handle.write("\n")
    print(f"[ok] {output_path.relative_to(repo_root)} -> {len(common)} functions")


def main(argv: List[str]) -> int:
    parser = argparse.ArgumentParser(description="Map source/pseudo/assembly per function")
    parser.add_argument(
        "--binary",
        action="append",
        help="Specific binary path (relative to repo) to process; can be repeated.",
    )
    parser.add_argument(
        "--bench-root",
        default=None,
        help="Path to the Bringup-Bench repository root (default: from config.env).",
    )
    args = parser.parse_args(argv)
    repo_root = _get_bench_root(args.bench_root)
    binaries = _discover_binaries(args.binary, repo_root)
    if not binaries:
        print("No binaries found", file=sys.stderr)
        return 1
    for binary in binaries:
        _build_map(binary, repo_root)
    return 0


if __name__ == "__main__":
    raise SystemExit(main(sys.argv[1:]))
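The mapping step above keeps only functions whose names appear in all three views (source, Hex-Rays pseudocode, objdump assembly) and emits one JSON record per function. A minimal sketch of that intersection follows. It uses toy dictionaries in place of the real parser outputs, whose exact shapes are assumed here; `build_records` is a hypothetical helper name, and the `pseudo_normalize` field is omitted for brevity.

```python
import json


def build_records(source_funcs, pseudo_funcs, asm_funcs, rel_binary):
    """Return one record per function name present in all three inputs."""
    common = sorted(set(source_funcs) & set(pseudo_funcs) & set(asm_funcs))
    return [
        {
            "source": source_funcs[name],
            "pseudo": pseudo_funcs[name],
            "binary": rel_binary,
            "assembly": asm_funcs[name],
        }
        for name in common
    ]


# Toy inputs (illustrative only); the real parsers return richer entries.
source = {"main": "int main(void) {...}", "helper": "static int helper(int x) {...}"}
pseudo = {"main": {"content": "int main() {...}"}, "_start": {"content": "..."}}
asm = {"main": "push rbp ...", "_start": "xor ebp, ebp ...", "helper": "..."}

records = build_records(source, pseudo, asm, "demo/demo.host.O0")
# One JSON object per line, as in the .func_map.jsonl files.
jsonl = "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
print(jsonl)
```

Functions that the compiler inlined or that exist only in the runtime (such as `_start`) fall out of the intersection, which is one reason the mapped function count can be lower than the number of functions in the source.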
24
sk2decompile/evaluation/bringupbench/scripts/build-host-opt-levels.sh
Executable file
@ -0,0 +1,24 @@
#!/bin/bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
EVAL_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"

# Load config; allow environment overrides
if [[ -f "${EVAL_ROOT}/config.env" ]]; then
  set -a
  source "${EVAL_ROOT}/config.env"
  set +a
fi

BENCH_REPO_ROOT="${BENCH_REPO_ROOT:?Set BENCH_REPO_ROOT in config.env or environment}"

cd "${BENCH_REPO_ROOT}"

for opt in 0 1 2 3; do
  echo "==> Building host binaries with -O${opt}"
  make TARGET=host OPT_CFLAGS="-O${opt} -g" run-tests
  find . -maxdepth 2 -type f -name '*.host' -execdir mv {} {}.O${opt} \;
done

echo "All host optimization builds complete."
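The build loop renames each `*.host` binary by appending `.O<level>`, and the later discovery steps select files by that final suffix, case-insensitively. A small sketch of how the two conventions line up is given below. The helper names and the `demo/demo.host` path are illustrative assumptions, not part of the repository.

```python
from pathlib import PurePosixPath

OPT_SUFFIXES = {".o0", ".o1", ".o2", ".o3"}


def renamed(host_binary: str, opt: int) -> str:
    """Mirror `mv {} {}.O${opt}`: append the optimization level as a suffix."""
    return f"{host_binary}.O{opt}"


def is_opt_binary(path: str) -> bool:
    """True when the final suffix is one of .O0 to .O3 (case-insensitive)."""
    return PurePosixPath(path).suffix.lower() in OPT_SUFFIXES


names = [renamed("demo/demo.host", opt) for opt in range(4)]
print(names)
# Derived files such as the .s listing end in a different suffix and are skipped.
print([is_opt_binary(n) for n in names + ["demo/demo.host.O2.s"]])
```

Because only the last suffix is checked, the `.pseudo`, `.s`, and `.func_map.jsonl` files written next to each binary are not mistaken for binaries on later runs.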
21
sk2decompile/evaluation/bringupbench/scripts/clean-all-benchmarks.sh
Executable file
@ -0,0 +1,21 @@
#!/bin/bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
EVAL_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"

# Load config; allow environment overrides
if [[ -f "${EVAL_ROOT}/config.env" ]]; then
  set -a
  source "${EVAL_ROOT}/config.env"
  set +a
fi

BENCH_REPO_ROOT="${BENCH_REPO_ROOT:?Set BENCH_REPO_ROOT in config.env or environment}"

cd "${BENCH_REPO_ROOT}"

echo "==> Running make all-clean"
make all-clean

echo "All benchmarks cleaned."
50
sk2decompile/evaluation/bringupbench/scripts/decompile-all-pseudo.sh
Executable file
@ -0,0 +1,50 @@
#!/bin/bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
EVAL_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"

# Load config; allow environment overrides
if [[ -f "${EVAL_ROOT}/config.env" ]]; then
  set -a
  source "${EVAL_ROOT}/config.env"
  set +a
fi

BENCH_REPO_ROOT="${BENCH_REPO_ROOT:?Set BENCH_REPO_ROOT in config.env or environment}"

IDA_BIN="${IDA_BIN:-/home/bairidreamer/software/IDA-Pro/idat}"
DUMP_SCRIPT="${EVAL_ROOT}/scripts/dump_pseudo.py"

if [[ ! -x "${IDA_BIN}" ]]; then
  echo "error: IDA binary not found or not executable at ${IDA_BIN}" >&2
  exit 1
fi

if [[ ! -f "${DUMP_SCRIPT}" ]]; then
  echo "error: dump script not found at ${DUMP_SCRIPT}" >&2
  exit 1
fi

readarray -t BINARIES < <(
  find "${BENCH_REPO_ROOT}" -mindepth 2 -maxdepth 2 -type f \
    \( -iname '*.o0' -o -iname '*.o1' -o -iname '*.o2' -o -iname '*.o3' \) \
    ! -path "${BENCH_REPO_ROOT}/scripts/*" \
    ! -path "${BENCH_REPO_ROOT}/target/*" \
    ! -path "${BENCH_REPO_ROOT}/common/*" \
    ! -path "${BENCH_REPO_ROOT}/.git/*" \
    | sort
)

if [[ ${#BINARIES[@]} -eq 0 ]]; then
  echo "error: no O0/O1/O2/O3 binaries found under ${BENCH_REPO_ROOT}" >&2
  exit 1
fi

for binary_path in "${BINARIES[@]}"; do
  output_path="${binary_path}.pseudo"
  echo "==> Decompiling ${binary_path#${BENCH_REPO_ROOT}/} -> ${output_path#${BENCH_REPO_ROOT}/}"
  "${IDA_BIN}" -A "-S${DUMP_SCRIPT} ${output_path}" "${binary_path}"
done

echo "All pseudocode dumps are located alongside their binaries."
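For readers driving IDA from Python rather than bash, the per-binary invocation above can be assembled as an argument list. This is a sketch only: `idat_command` is a hypothetical helper, the paths are placeholders, and the only switches used are the `-A` and `-S` forms that already appear in the shell script. Paths containing spaces would likely need extra quoting inside the `-S` argument.

```python
from typing import List


def idat_command(ida_bin: str, dump_script: str, binary_path: str) -> List[str]:
    """Build the headless IDA command that writes <binary>.pseudo."""
    output_path = f"{binary_path}.pseudo"
    # The script path and its argument travel together in one -S token,
    # matching "-S${DUMP_SCRIPT} ${output_path}" in the shell script.
    return [ida_bin, "-A", f"-S{dump_script} {output_path}", binary_path]


cmd = idat_command("/opt/ida/idat", "scripts/dump_pseudo.py", "demo/demo.host.O0")
print(cmd)
```

The list can be passed to `subprocess.run(cmd, check=True)`; doing so requires a licensed IDA installation with the Hex-Rays decompiler, so the sketch stops at building the command.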
66
sk2decompile/evaluation/bringupbench/scripts/disasm-all-objdump.sh
Executable file
@ -0,0 +1,66 @@
#!/bin/bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
EVAL_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"

# Load config; allow environment overrides
if [[ -f "${EVAL_ROOT}/config.env" ]]; then
  set -a
  source "${EVAL_ROOT}/config.env"
  set +a
fi

BENCH_REPO_ROOT="${BENCH_REPO_ROOT:?Set BENCH_REPO_ROOT in config.env or environment}"

OBJDUMP_BIN="${OBJDUMP:-objdump}"
NUM_JOBS="${JOBS:-}"

if ! command -v "${OBJDUMP_BIN}" >/dev/null 2>&1; then
  echo "error: objdump binary '${OBJDUMP_BIN}' not found" >&2
  exit 1
fi

if [[ -z "${NUM_JOBS}" ]]; then
  if command -v nproc >/dev/null 2>&1; then
    NUM_JOBS="$(nproc)"
  elif [[ "$(uname)" == "Darwin" ]]; then
    NUM_JOBS="$(sysctl -n hw.ncpu)"
  else
    NUM_JOBS=4
  fi
fi

if ! [[ "${NUM_JOBS}" =~ ^[0-9]+$ ]] || (( NUM_JOBS <= 0 )); then
  echo "error: invalid JOBS value '${NUM_JOBS}'" >&2
  exit 1
fi

readarray -t BINARIES < <(
  find "${BENCH_REPO_ROOT}" -mindepth 2 -maxdepth 2 -type f \
    \( -iname '*.o0' -o -iname '*.o1' -o -iname '*.o2' -o -iname '*.o3' \) \
    ! -path "${BENCH_REPO_ROOT}/scripts/*" \
    ! -path "${BENCH_REPO_ROOT}/target/*" \
    ! -path "${BENCH_REPO_ROOT}/common/*" \
    ! -path "${BENCH_REPO_ROOT}/.git/*" \
    | sort
)

if [[ ${#BINARIES[@]} -eq 0 ]]; then
  echo "error: no O0/O1/O2/O3 binaries found under ${BENCH_REPO_ROOT}" >&2
  exit 1
fi

export OBJDUMP_BIN BENCH_REPO_ROOT

printf '%s\0' "${BINARIES[@]}" | xargs -0 -n1 -P "${NUM_JOBS}" bash -c '
  binary_path="$1"
  bench_repo_root="${BENCH_REPO_ROOT}"
  output_path="${binary_path}.s"
  rel_in="${binary_path#"${bench_repo_root}/"}"
  rel_out="${output_path#"${bench_repo_root}/"}"
  echo "==> Disassembling ${rel_in} -> ${rel_out}"
  "${OBJDUMP_BIN}" -d "${binary_path}" > "${output_path}"
' _

echo "Assembly listings written alongside each binary (extension .s)."
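The job-count handling and the parallel `objdump -d` fan-out above could be expressed in Python roughly as follows. This is a sketch under assumptions: `resolve_jobs` and `disassemble_all` are hypothetical names, `os.cpu_count()` stands in for the `nproc`/`sysctl` probing, and `disassemble_all` is defined but not called here because it needs real binaries and an `objdump` on `PATH`.

```python
import os
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


def resolve_jobs(jobs_env: Optional[str]) -> int:
    """Validate a JOBS-style value, falling back to the CPU count (or 4)."""
    if not jobs_env:
        return os.cpu_count() or 4
    # Same acceptance rule as the shell check: decimal digits only, and > 0.
    if not re.fullmatch(r"[0-9]+", jobs_env) or int(jobs_env) <= 0:
        raise ValueError(f"invalid JOBS value {jobs_env!r}")
    return int(jobs_env)


def disassemble_one(objdump: str, binary_path: str) -> str:
    """Write `objdump -d` output next to the binary and return the .s path."""
    output_path = f"{binary_path}.s"
    with open(output_path, "w", encoding="utf-8") as handle:
        subprocess.run([objdump, "-d", binary_path], stdout=handle, check=True)
    return output_path


def disassemble_all(binaries: List[str], objdump: str = "objdump",
                    jobs: Optional[int] = None) -> List[str]:
    """Disassemble every binary using a small thread pool."""
    workers = jobs or resolve_jobs(os.environ.get("JOBS"))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: disassemble_one(objdump, b), binaries))


print(resolve_jobs("8"))
```

Threads are sufficient here because each worker spends its time waiting on an external `objdump` process rather than running Python code.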
62
sk2decompile/evaluation/bringupbench/scripts/dump_pseudo.py
Normal file
@ -0,0 +1,62 @@
"""
Headless IDA/Hex-Rays helper to dump pseudocode for every discovered function.
Usage (from shell):
    idat -A -S"scripts/dump_pseudo.py /path/to/output" /path/to/binary
"""

from __future__ import annotations

import os
import sys

import ida_auto
import ida_funcs
import ida_hexrays
import ida_pro
import idautils
import idc


def _get_output_path() -> str:
    # IDA populates idc.ARGV with the script path at index 0 and the
    # user-provided arguments afterwards.
    if len(idc.ARGV) < 2:
        raise RuntimeError("output path argument missing")
    return os.path.abspath(idc.ARGV[1])


def main() -> None:
    try:
        output_path = _get_output_path()
    except Exception as exc:  # pragma: no cover - defensive
        print(f"[dump_pseudo] {exc}", file=sys.stderr)
        ida_pro.qexit(1)
        return

    ida_auto.auto_wait()

    if not ida_hexrays.init_hexrays_plugin():
        print("[dump_pseudo] Hex-Rays decompiler is unavailable", file=sys.stderr)
        ida_pro.qexit(1)
        return

    os.makedirs(os.path.dirname(output_path), exist_ok=True)

    with open(output_path, "w", encoding="utf-8") as handle:
        for ea in idautils.Functions():
            name = ida_funcs.get_func_name(ea)
            handle.write(f"/* {name} @ 0x{ea:x} */\n")
            try:
                cfunc = ida_hexrays.decompile(ea)
            except ida_hexrays.DecompilationFailure as exc:
                handle.write(f"// decompilation failed: {exc}\n\n")
                continue

            handle.write(str(cfunc))
            handle.write("\n\n")

    ida_pro.qexit(0)


if __name__ == "__main__":
    main()
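The dump written above is a flat text file in which each function starts with a `/* name @ 0xaddr */` marker, followed either by the pseudocode or by a `// decompilation failed: ...` line. A minimal sketch of splitting such a file back into per-function entries is shown below. `split_pseudo_dump` is a hypothetical helper for illustration; the repository's own `_parse_pseudo` is not shown in this part of the diff and may behave differently.

```python
import re
from typing import Dict

# Matches the marker line emitted as f"/* {name} @ 0x{ea:x} */".
HEADER_RE = re.compile(
    r"^/\* (?P<name>.+?) @ 0x(?P<addr>[0-9a-f]+) \*/$", re.MULTILINE
)


def split_pseudo_dump(text: str) -> Dict[str, Dict[str, str]]:
    """Split a dump into {name: {"address": ..., "content": ...}} entries."""
    functions: Dict[str, Dict[str, str]] = {}
    headers = list(HEADER_RE.finditer(text))
    for idx, match in enumerate(headers):
        end = headers[idx + 1].start() if idx + 1 < len(headers) else len(text)
        body = text[match.end():end].strip()
        if body.startswith("// decompilation failed"):
            continue  # skip functions Hex-Rays could not decompile
        functions[match.group("name")] = {
            "address": "0x" + match.group("addr"),
            "content": body,
        }
    return functions


sample = (
    "/* main @ 0x401136 */\n"
    "int main()\n{\n  return 0;\n}\n\n"
    "/* broken @ 0x401200 */\n"
    "// decompilation failed: example\n\n"
)
parsed = split_pseudo_dump(sample)
print(parsed)
```

One caveat of this line-based approach: a comment inside a function body that happens to match the marker pattern on its own line would be treated as a new function boundary.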
682
sk2decompile/evaluation/bringupbench/scripts/eval_infer_out.py
Normal file
@ -0,0 +1,682 @@
#!/usr/bin/env python3
"""
Evaluate infer-out-model2 functions by patching benchmark sources inside an
isolated workspace, rebuilding, executing, and collecting structured logs for
every case listed in a JSONL file.
"""
from __future__ import annotations

import argparse
import json
import os
import re
import shutil
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import asdict, dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Dict, Iterable, List, Optional, Tuple


def _load_config_env() -> dict:
    """Load config.env from the eval project root."""
    eval_root = Path(__file__).resolve().parents[1]
    config_path = eval_root / "config.env"
    config = {}
    if config_path.exists():
        for line in config_path.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            if "=" in line:
                key, _, value = line.partition("=")
                config[key.strip()] = value.strip()
    return config


def _get_bench_root(cli_value: str | None = None) -> Path:
    """Resolve the benchmark repo root from CLI arg, env var, or config.env."""
    if cli_value:
        return Path(cli_value).resolve()
    env_val = os.environ.get("BENCH_REPO_ROOT")
    if env_val:
        return Path(env_val).resolve()
    config = _load_config_env()
    if "BENCH_REPO_ROOT" in config:
        return Path(config["BENCH_REPO_ROOT"]).resolve()
    sys.exit("error: BENCH_REPO_ROOT not set. Use --bench-root, set the env var, or configure config.env")


@dataclass
class CaseResult:
    """Container for the outcome of processing a single case."""

    case_id: str
    source_path: str
    benchmark_dir: str
    output_dir: str
    workspace_dir: str = ""
    artifact_dir: str = ""
    replacement_applied: bool = False
    build_status: str = "skipped"  # succeeded | failed | skipped
    test_status: str = "skipped"
    notes: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)
    log_files: Dict[str, str] = field(default_factory=dict)


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Replace functions with infer-out-model2 bodies, build, "
        "execute, and record results without modifying the original benchmarks."
    )
    parser.add_argument(
        "jsonl",
        help="Path to the merged.*.jsonl file containing cases to evaluate.",
    )
    parser.add_argument(
        "--bench-root",
        default=None,
        help="Path to the Bringup-Bench repository root (default: from config.env).",
    )
    parser.add_argument(
        "--limit",
        type=int,
        default=None,
        help="Optional limit on the number of cases to process.",
    )
    parser.add_argument(
        "--target",
        default="host",
        help="Benchmark build target passed as TARGET=<target> (default: host).",
    )
    parser.add_argument(
        "--report-dir",
        default="reports/infer_out_eval",
        help="Directory (relative to eval root) where aggregated reports are written.",
    )
    parser.add_argument(
        "--workspace-root",
        default="reports/infer_out_eval/workspaces",
        help="Directory (relative to eval root) to host temporary build workspaces.",
    )
    parser.add_argument(
        "--skip-clean",
        action="store_true",
        help="Skip running 'make clean' inside the workspace (useful when iterating).",
    )
    parser.add_argument(
        "--keep-workspaces",
        action="store_true",
        help="Keep temporary workspaces after each case finishes (default removes them).",
    )
    parser.add_argument(
        "--command-timeout",
        type=int,
        default=20,
        help="Timeout (in seconds) for each make invocation; 0 disables the timeout.",
    )
    parser.add_argument(
        "--jobs",
        type=int,
        default=96,
        help="Number of cases to process in parallel (default: 96).",
    )
    return parser.parse_args()


def canonicalize(text: str) -> str:
    """Normalize newlines for reliable substring matching."""
    return text.replace("\r\n", "\n")


def replace_function_body(
    full_source: str, reference_function: str, inferred_function: str
) -> Tuple[str, bool]:
    """
    Replace the exact reference_function text with inferred_function.

    Returns the updated source and a boolean indicating if replacement happened.
    """
    source_norm = canonicalize(full_source)
    reference_norm = canonicalize(reference_function)
    inferred_norm = canonicalize(inferred_function).rstrip() + "\n"

    candidates = (
        reference_norm,
        reference_norm.rstrip() + "\n",
        reference_norm.strip(),
    )

    for snippet in candidates:
        start_idx = source_norm.find(snippet)
        if start_idx == -1:
            continue
        end_idx = start_idx + len(snippet)
        updated = source_norm[:start_idx] + inferred_norm + source_norm[end_idx:]
        return updated, True
    return full_source, False


def compose_case_id(case: Dict) -> str:
    """Build a stable identifier for a case."""
    return (
        f"{case['source']['path']}::{case['source']['function_name']}"
        f"@{case['pseudo']['address']}"
    )


def ensure_case_output_dir(
    output_root: Path, pseudo_path_str: str, pseudo_address: str, result: CaseResult
) -> Path:
    """Create the per-case output directory, handling file path collisions."""
    pseudo_rel = Path(pseudo_path_str)
    base_dir = output_root / pseudo_rel

    if base_dir.exists() and base_dir.is_file():
        fallback = base_dir.parent / f"{base_dir.name}.infer_eval"
        fallback.mkdir(parents=True, exist_ok=True)
        result.notes.append(
            f"pseudo.path '{pseudo_path_str}' is a file; using '{fallback.relative_to(output_root)}' for logs."
        )
        base_dir = fallback
    else:
        base_dir.mkdir(parents=True, exist_ok=True)

    case_dir = base_dir / pseudo_address
    if case_dir.exists():
        shutil.rmtree(case_dir)
    case_dir.mkdir(parents=True, exist_ok=True)
    return case_dir


def run_command(
    command: List[str],
    cwd: Path,
    log_handle,
    step_name: str,
    timeout: Optional[int],
) -> Optional[int]:
    """Run a command, capture stdout/stderr, and write everything to log_handle."""
    log_handle.write(f"\n[{step_name}] $ {' '.join(command)}\n")
    log_handle.flush()
    try:
        completed = subprocess.run(
            command,
            cwd=str(cwd),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            encoding="utf-8",
            errors="replace",
            timeout=timeout if timeout and timeout > 0 else None,
        )
        log_handle.write(completed.stdout)
        log_handle.write(f"[{step_name}] exit code: {completed.returncode}\n")
        log_handle.flush()
        return completed.returncode
    except subprocess.TimeoutExpired as exc:
        output = exc.output or exc.stdout
        if output:
            if isinstance(output, bytes):
                log_handle.write(output.decode("utf-8", "replace"))
            else:
                log_handle.write(output)
        log_handle.write(
            f"[{step_name}] timed out after {timeout} seconds; terminating process.\n"
        )
        log_handle.flush()
        return None


def write_case_artifacts(
    case_dir: Path,
    case: Dict,
    modified_source: str,
    original_source: str,
) -> None:
    """Persist reusable artifacts for a case."""
    (case_dir / "case.json").write_text(json.dumps(case, indent=2), encoding="utf-8")
    (case_dir / "modified_source.c").write_text(modified_source, encoding="utf-8")
    (case_dir / "original_source.c").write_text(original_source, encoding="utf-8")
    (case_dir / "original_function.c").write_text(
        canonicalize(case["source"]["content"]), encoding="utf-8"
    )
    (case_dir / "infer_function.c").write_text(
        canonicalize(case["pseudo"]["content-fix"]), encoding="utf-8"
    )


def sanitize_case_id(case_id: str) -> str:
    """Generate filesystem-safe case identifier."""
    sanitized = re.sub(r"[^A-Za-z0-9._-]+", "_", case_id)
    return sanitized.strip("_") or "case"


def copy_ignore_eval_dirs(_src: str, names: List[str]) -> List[str]:
    """Ignore helper to skip evaluation artifacts when copying benchmark dirs."""
    ignored: List[str] = []
    for name in names:
        if name.endswith(".infer_eval"):
            ignored.append(name)
    return ignored


def prepare_workspace(
    repo_root: Path,
    benchmark_dir: Path,
    workspace_root: Path,
    case_id: str,
) -> Tuple[Path, Path]:
    """Clone the necessary subset of the repo into a temporary workspace."""
    workspace_case_root = workspace_root / sanitize_case_id(case_id)
    if workspace_case_root.exists():
        shutil.rmtree(workspace_case_root)
    workspace_repo_root = workspace_case_root / "repo"
    workspace_repo_root.mkdir(parents=True, exist_ok=True)

    shutil.copy2(repo_root / "Makefile", workspace_repo_root / "Makefile")
    shutil.copytree(repo_root / "common", workspace_repo_root / "common", dirs_exist_ok=True)
    shutil.copytree(repo_root / "target", workspace_repo_root / "target", dirs_exist_ok=True)
    shutil.copytree(
        benchmark_dir,
        workspace_repo_root / benchmark_dir.name,
        dirs_exist_ok=True,
        ignore=copy_ignore_eval_dirs,
    )
    return workspace_case_root, workspace_repo_root


def relative_to_repo(path: Path, repo_root: Path) -> str:
    """Return a path relative to repo_root when possible."""
    try:
        return str(path.relative_to(repo_root))
    except ValueError:
        return str(path)


def init_case_result(case: Dict, repo_root: Path) -> CaseResult:
    """Create a CaseResult with basic metadata for the given case."""
    source_rel = Path(case["source"]["path"])
    benchmark_dir_path = (repo_root / source_rel).parent
    try:
        benchmark_rel = str(benchmark_dir_path.relative_to(repo_root))
    except ValueError:
        benchmark_rel = str(benchmark_dir_path)
    return CaseResult(
        case_id=compose_case_id(case),
        source_path=str(source_rel),
        benchmark_dir=benchmark_rel,
        output_dir="",
    )


def snapshot_artifacts(
    case_dir: Path,
    workspace_benchmark_dir: Path,
    eval_root: Path,
    result: CaseResult,
) -> None:
    """Copy the workspace benchmark directory into the case directory."""
    artifacts_dir = case_dir / "artifacts"
    if artifacts_dir.exists():
        shutil.rmtree(artifacts_dir)
    try:
        shutil.copytree(workspace_benchmark_dir, artifacts_dir)
        result.artifact_dir = relative_to_repo(artifacts_dir, eval_root)
    except Exception as exc:  # pragma: no cover - defensive
        result.notes.append(f"Failed to copy artifacts: {exc}")


def process_case(
    case: Dict,
    args: argparse.Namespace,
    repo_root: Path,
    eval_root: Path,
) -> CaseResult:
    """Process a single JSONL entry."""
    case_id = compose_case_id(case)
    source_rel = Path(case["source"]["path"])
    source_path = repo_root / source_rel
    benchmark_dir = source_path.parent

    result = init_case_result(case, repo_root)

    if not source_path.exists():
        result.errors.append(f"Source file '{source_rel}' does not exist.")
        return result

    try:
        case_dir = ensure_case_output_dir(
            eval_root, case["pseudo"]["path"], case["pseudo"]["address"], result
        )
    except Exception as exc:  # pragma: no cover - defensive
        result.errors.append(f"Failed to prepare case directory: {exc}")
        return result

    result.output_dir = str(case_dir.relative_to(eval_root))

    full_source_text = source_path.read_text(encoding="utf-8")
    updated_source, replaced = replace_function_body(
        full_source_text,
        case["source"]["content"],
        case["pseudo"]["content-fix"],
    )

    if not replaced:
        result.errors.append(
            "Could not locate the original function snippet in source file."
        )
        return result

    result.replacement_applied = True
    write_case_artifacts(case_dir, case, updated_source, full_source_text)

    workspace_root = Path(args.workspace_root)
    if not workspace_root.is_absolute():
        workspace_root = eval_root / workspace_root
    workspace_root.mkdir(parents=True, exist_ok=True)

    workspace_case_root: Optional[Path] = None
    try:
        workspace_case_root, workspace_repo_root = prepare_workspace(
            repo_root, benchmark_dir, workspace_root, case_id
        )
        workspace_benchmark_dir = workspace_repo_root / benchmark_dir.name
        artifacts_captured = False

        def capture_artifacts() -> None:
            nonlocal artifacts_captured
            if artifacts_captured:
                return
            snapshot_artifacts(case_dir, workspace_benchmark_dir, eval_root, result)
            artifacts_captured = True

        workspace_source_path = workspace_repo_root / source_rel
        workspace_source_path.write_text(updated_source, encoding="utf-8")

        result.workspace_dir = relative_to_repo(workspace_case_root, eval_root)

        log_path = case_dir / "case.log"
        with log_path.open("w", encoding="utf-8") as log_handle:
            log_handle.write(f"Case: {case_id}\n")
            log_handle.write(f"Workspace: {workspace_case_root}\n")
            log_handle.write(f"Benchmark copy: {workspace_benchmark_dir}\n")
            log_handle.write(f"Target: {args.target}\n")
            log_handle.flush()

            if not args.skip_clean:
                clean_rc = run_command(
                    ["make", f"TARGET={args.target}", "clean"],
                    workspace_benchmark_dir,
                    log_handle,
                    "clean",
                    args.command_timeout,
                )
                if clean_rc is None:
                    result.errors.append(
                        f"'make clean' timed out after {args.command_timeout} seconds."
                    )
                    capture_artifacts()
                    result.log_files["case"] = relative_to_repo(log_path, eval_root)
                    return result
                if clean_rc != 0:
                    result.build_status = "failed"
                    result.errors.append("make clean failed.")
                    capture_artifacts()
                    result.log_files["case"] = relative_to_repo(log_path, eval_root)
                    return result
            else:
                log_handle.write("Skipping 'make clean' per --skip-clean flag.\n")

            build_rc = run_command(
                ["make", f"TARGET={args.target}", "build"],
                workspace_benchmark_dir,
                log_handle,
                "build",
                args.command_timeout,
            )

            result.log_files["case"] = relative_to_repo(log_path, eval_root)
            if build_rc is None:
                result.build_status = "failed"
                result.errors.append(
                    f"'make build' timed out after {args.command_timeout} seconds."
                )
                capture_artifacts()
                log_handle.write("Skipping test because build timed out.\n")
                return result
            if build_rc == 0:
                result.build_status = "succeeded"
            else:
                result.build_status = "failed"
                result.errors.append("make build failed.")
                log_handle.write("Skipping test because build failed.\n")
                capture_artifacts()
                return result

            test_rc = run_command(
                ["make", f"TARGET={args.target}", "test"],
                workspace_benchmark_dir,
                log_handle,
                "test",
                args.command_timeout,
            )

            if test_rc is None:
                result.test_status = "failed"
                result.errors.append(
                    f"'make test' timed out after {args.command_timeout} seconds."
                )
            elif test_rc == 0:
                result.test_status = "succeeded"
            else:
                result.test_status = "failed"
                result.errors.append("make test failed.")

            capture_artifacts()

    finally:
        if (
            workspace_case_root
            and workspace_case_root.exists()
            and not args.keep_workspaces
        ):
            shutil.rmtree(workspace_case_root, ignore_errors=True)

    return result


def collect_cases(jsonl_path: Path, limit: Optional[int]) -> Iterable[Dict]:
    """Yield cases from jsonl file respecting the optional limit."""
    processed = 0
    with jsonl_path.open("r", encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                continue
            yield json.loads(stripped)
            processed += 1
            if limit is not None and processed >= limit:
                break


def compute_summary(results: List[CaseResult]) -> Dict:
    """Aggregate statistics over all case results."""
    total = len(results)
    replacements = sum(1 for r in results if r.replacement_applied)
    build_success = sum(1 for r in results if r.build_status == "succeeded")
    test_success = sum(1 for r in results if r.test_status == "succeeded")

    def frac(passed: int, denom: int) -> float:
        return round(passed / denom, 4) if denom else 0.0

    per_benchmark: Dict[str, Dict[str, float]] = {}
    for r in results:
        stats = per_benchmark.setdefault(
            r.benchmark_dir,
            {
                "cases": 0,
                "replacements": 0,
                "build_success": 0,
                "test_success": 0,
            },
        )
        stats["cases"] += 1
        if r.replacement_applied:
            stats["replacements"] += 1
        if r.build_status == "succeeded":
            stats["build_success"] += 1
        if r.test_status == "succeeded":
            stats["test_success"] += 1

    for stats in per_benchmark.values():
        stats["replacement_rate"] = frac(stats["replacements"], stats["cases"])
        stats["build_rate"] = frac(stats["build_success"], stats["cases"])
        stats["test_rate"] = frac(stats["test_success"], stats["cases"])

    summary = {
        "total_cases": total,
        "replacement_success_count": replacements,
        "replacement_success_rate": frac(replacements, total),
        "compilable_count": build_success,
        "compilable_rate": frac(build_success, total),
        "executable_count": test_success,
        "executable_rate": frac(test_success, total),
        "compilation_failures": [
            r.case_id for r in results if r.build_status == "failed"
        ],
        "execution_failures": [
            r.case_id
            for r in results
            if r.build_status == "succeeded" and r.test_status == "failed"
        ],
        "cases": [asdict(r) for r in results],
        "by_benchmark": per_benchmark,
    }
    return summary


def write_summary(
    eval_root: Path,
    args: argparse.Namespace,
    jsonl_path: Path,
    summary: Dict,
) -> Tuple[Path, Path]:
    """Write JSON and Markdown summary reports."""
    report_root = eval_root / args.report_dir
    report_root.mkdir(parents=True, exist_ok=True)

    timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base_name = f"{jsonl_path.stem}-{args.target}"
    json_report = report_root / f"{base_name}-{timestamp}.json"
    markdown_report = report_root / f"{base_name}-{timestamp}.md"

    json_report.write_text(json.dumps(summary, indent=2), encoding="utf-8")

    benchmark_lines = [
        "| Benchmark | Cases | Replacement% | Build% | Exec% |",
        "| --- | --- | --- | --- | --- |",
    ]
    for bench, stats in sorted(summary["by_benchmark"].items()):
        benchmark_lines.append(
            f"| {bench} | {stats['cases']} | "
            f"{stats['replacement_rate']*100:.2f}% | "
            f"{stats['build_rate']*100:.2f}% | "
            f"{stats['test_rate']*100:.2f}% |"
        )
    if len(benchmark_lines) == 2:
        benchmark_lines.append("| (none) | 0 | 0.00% | 0.00% | 0.00% |")

    compilation_items = summary["compilation_failures"] or ["None"]
    execution_items = summary["execution_failures"] or ["None"]

    relative_jsonl = relative_to_repo(jsonl_path, eval_root)

    lines = [
        f"# Infer-Out Model 2 Evaluation ({base_name})",
        "",
        f"- Timestamp: {timestamp}",
        f"- Source JSONL: {relative_jsonl}",
        f"- Target: {args.target}",
        f"- Total cases: {summary['total_cases']}",
        f"- Replacement success: {summary['replacement_success_count']} "
        f"({summary['replacement_success_rate']*100:.2f}%)",
        f"- Compilable: {summary['compilable_count']} "
        f"({summary['compilable_rate']*100:.2f}%)",
        f"- Executable: {summary['executable_count']} "
        f"({summary['executable_rate']*100:.2f}%)",
        "",
        "## Benchmark Breakdown",
        *benchmark_lines,
        "",
        "## Compilation Failures",
    ]
    lines.extend(f"- {cid}" for cid in compilation_items)
    lines.append("")
    lines.append("## Execution Failures")
    lines.extend(f"- {cid}" for cid in execution_items)

    markdown_report.write_text("\n".join(lines), encoding="utf-8")
    return json_report, markdown_report


def main() -> int:
    args = parse_args()
    eval_root = Path(__file__).resolve().parents[1]
    repo_root = _get_bench_root(args.bench_root)
    jsonl_path = Path(args.jsonl)
    if not jsonl_path.is_absolute():
        jsonl_path = eval_root / jsonl_path

    if not jsonl_path.exists():
        print(f"JSONL file '{jsonl_path}' not found.", file=sys.stderr)
        return 1

    cases = list(collect_cases(jsonl_path, args.limit))
    if not cases:
        print("No cases to process.")
        return 0

    results: List[Optional[CaseResult]] = [None] * len(cases)

    def record_result(idx: int, case_result: CaseResult) -> None:
        results[idx] = case_result
        status = (
            f"build={case_result.build_status}, test={case_result.test_status}"
            if case_result.replacement_applied
            else "replacement_failed"
        )
        print(f"[{idx + 1}] {case_result.case_id}: {status}")

    if args.jobs <= 1:
        for idx, case in enumerate(cases):
            case_result = process_case(case, args, repo_root, eval_root)
            record_result(idx, case_result)
    else:
        with ThreadPoolExecutor(max_workers=args.jobs) as executor:
            future_to_idx = {
                executor.submit(process_case, case, args, repo_root, eval_root): idx
                for idx, case in enumerate(cases)
            }
            for future in as_completed(future_to_idx):
                idx = future_to_idx[future]
                try:
                    case_result = future.result()
                except Exception as exc:  # pragma: no cover - defensive
                    case_result = init_case_result(cases[idx], repo_root)
                    case_result.errors.append(f"Unhandled exception: {exc}")
                record_result(idx, case_result)

    final_results = [res for res in results if res is not None]

    summary = compute_summary(final_results)
    json_report, markdown_report = write_summary(eval_root, args, jsonl_path, summary)
    print(f"Wrote summary reports:\n - {json_report}\n - {markdown_report}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
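The core of the evaluation is a text-level patch: the reference function is located by exact substring match (after newline normalization) and swapped for the inferred body, and the compile and re-execution rates are then simple ratios over all cases. The sketch below is a condensed illustration of those two ideas, not the script itself. `replace_once` and `rate` are hypothetical names, and the C snippets are invented for the example.

```python
from typing import Tuple


def replace_once(full_source: str, reference: str, inferred: str) -> Tuple[str, bool]:
    """Swap the first exact occurrence of `reference` for `inferred`."""
    source = full_source.replace("\r\n", "\n")
    reference = reference.replace("\r\n", "\n")
    inferred = inferred.replace("\r\n", "\n").rstrip() + "\n"
    # Try progressively looser trailing/leading whitespace variants.
    for snippet in (reference, reference.rstrip() + "\n", reference.strip()):
        start = source.find(snippet)
        if start != -1:
            return source[:start] + inferred + source[start + len(snippet):], True
    return full_source, False


def rate(passed: int, total: int) -> float:
    """Fraction rounded to four places; 0.0 when there are no cases."""
    return round(passed / total, 4) if total else 0.0


original = (
    "int add(int a, int b)\r\n{\r\n  return a + b;\r\n}\r\n\r\n"
    "int main(void) { return add(1, 2) - 3; }\r\n"
)
reference = "int add(int a, int b)\n{\n  return a + b;\n}\n"
inferred = "int add(int x, int y)\n{\n  return y + x;\n}"

patched, ok = replace_once(original, reference, inferred)
print(ok)
print(patched)
print(rate(1, 3))
```

Because matching is purely textual, a case whose recorded source snippet differs from the file on disk by anything other than line endings or surrounding whitespace is counted as a replacement failure, and it never reaches the build step.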