mirror of https://github.com/NomaDamas/k-skill.git synced 2026-06-24 02:04:11 +00:00

Jeffrey (Dongkyu) Kim 271ea185c4 Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)

* docs(flight-ticket-search): register skill in README table and add feature guide
When PR #224 was merged, the README "어떤 걸 할 수 있나" (what it can do) table, the "포함된 기능" (included features) list, and the docs/features/flight-ticket-search.md guide were not registered, so unlike every other skill on main, users and agents could not discover this skill from the README alone. This hotfix fills in the missing pieces.
- Add a `flight-ticket-search` row to the README table (in the flight cluster next to MyRealTrip)
- Add the guide link to the README "포함된 기능" list
- Write docs/features/flight-ticket-search.md:
  · usage scenarios, implementation surface (fast-flights==2.2, isolated in a user venv)
  · search / compare-month / compare-range / compare-years CLI examples
  · response fields, IATA input guide, booking-link policy
  · verified route list, failure modes, non-goals, sources
Verification:
- node --test scripts/skill-docs.test.js → 138/138 pass
- ./scripts/validate-skills.sh → skill layout looks valid
No code changes → no changeset needed.
* feat(daiso-product-search): replace blocked-API fallback with Bearer token auth
selStrPkupStck is no longer blocked. It is accessed with a Bearer token: a non-login JWT is issued by /api/auth/request and encrypted with AES-128-CBC (key: PRE_AUTH_ENC_KEY). On a 403 response the token is reissued and the request is retried once. The pickupEligibility (selPkupStr) fallback logic has been removed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Preserve Daiso pickup answers when Bearer auth degrades
Keep exact stock lookup on the official Bearer-token path while restoring the public selPkupStr fallback for repeated auth blocks.
Constraint: PR #250 review required Bearer auth to remain primary without removing the resilient pickup eligibility API.
Rejected: Throwing after the retry | it collapses callers back to a brittle single upstream-auth dependency.
Confidence: high
Scope-risk: narrow
Directive: Keep pickupStock quantity semantics separate from pickupEligibility yes/no fallback.
Tested: node --test packages/daiso-product-search/test/index.test.js; npm test --workspace daiso-product-search; npm run lint --workspace daiso-product-search; npm run ci; live lookupStoreProductAvailability smoke for 강남역2호점 / VT 리들샷 100.
Not-tested: Live forced 403 from Daiso upstream; covered with injected fetch regression tests.
* Prove Daiso stock retry sends auth headers
Strengthen the retry regression so the Bearer-token contract cannot regress while still returning success from mocked stock responses.
Constraint: PR #250 review requested explicit Authorization, X-DM-UID, and request body assertions on the retry path.
Rejected: Counting requests only | it allowed header/body regressions to pass.
Confidence: high
Scope-risk: narrow
Directive: Keep auth-header assertions on both initial and retry stock requests when editing this flow.
Tested: node --test packages/daiso-product-search/test/index.test.js; npm test --workspace daiso-product-search; npm run lint --workspace daiso-product-search; npm run ci; live lookupStoreProductAvailability smoke for 강남역2호점 / VT 리들샷 100; repeated-403 fixture probe.
Not-tested: Live repeated upstream 403 because forcing Daiso production auth failure is not available without changing upstream state.
* Preserve Daiso caller headers through Bearer stock lookup
Keep advanced caller headers on the authenticated stock endpoint while generated Bearer and X-DM-UID values remain authoritative.
Document the degraded selPkupStr fallback order in skill and source docs so the public workflow matches the restored API surface.
Constraint: PR #250 review required resilient Bearer-primary stock lookup plus selPkupStr fallback and header/body contract coverage.
Rejected: Replacing caller headers with only auth headers | It regressed tracing/test-control header pass-through.
Confidence: high
Scope-risk: narrow
Directive: Keep Authorization and X-DM-UID generated by the auth flow even when callers provide same-named headers.
Tested: node --test packages/daiso-product-search/test/index.test.js; npm test --workspace daiso-product-search; npm run lint --workspace daiso-product-search; node --test scripts/skill-docs.test.js; npm run ci; live lookupStoreProductAvailability smoke for 강남역2호점 / VT 리들샷 100.
Not-tested: Forced live upstream repeated 403; covered by injected fixture tests.
* fix(danawa-price-search): capture .ico.* payment-condition badges and surface as row labels
The PR #226 row parser was missing selectors for the payment-condition badges (`.ico.cash`/`.ico.point`/`.ico.coupon`/`.ico.card`), so cash-only, coupon-only, and point-only prices that cannot be paid by card were shown as the ordinary lowest price. This commit fixes that defect.
- Add a whitelist capture block for payment-condition badges to the `offers()` row parser (only the classes `cash`/`point`/`coupon`/`discount`/`card`/`membership` or the texts `현금`/`포인트`/`쿠폰`/`할인` are accepted, which filters out fast-delivery, notice, and product-review noise)
- 6 new row dict fields: `payment_badges`, `cash_only`, `point_only`, `coupon_only`, `card_only_badge`, `is_conditional_price`
- Add `normal_count` and `conditional_count` to the returned dict
- Update `SKILL.md` / `docs/features/danawa-price-search.md` (payment-condition policy and table examples stated under Output shape · Response style · Workflow · Failure modes)
The sort policy is unchanged and still keyed solely on `total_price`; payment conditions are exposed only as per-row flags and labels so that callers can decide for themselves based on their payment method.
Regression check (pcode=75001853, Galaxy S25 256GB unlocked, `offers --limit 5`):
- Rank 1: 킴스클럽 979,000 KRW / `cash_only=True` / `payment_badges=["현금"]`
- Rank 2: 롯데ON 1,072,080 KRW / `cash_only=False` / `payment_badges=[]`
- Ranks 3 to 5: ordinary-price rows, all with an empty `payment_badges` list (0 noise entries)
Closes #252
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Ensure captured Danawa payment badges stay conditional
Classify every whitelisted payment badge into normalized condition types so callers cannot count captured discount, membership, or text-only card rows as normal prices.
Constraint: PR #253 review required TDD follow-up on feature/#252 without changing total_price sorting.
Rejected: Removing discount and membership from the whitelist | would lose Danawa condition labels already captured by the parser.
Confidence: high
Scope-risk: narrow
Directive: Keep payment_badge whitelist and payment_condition_types in sync whenever adding new badge classes or text keywords.
Tested: PYTHONPATH=.:scripts python3 -m unittest scripts.test_danawa_price_search; live offers 75001853 --limit 5; npm run lint; npm run typecheck; npm run test; architect verification CLEAR.
Not-tested: Danawa markup variants not represented by current live page or synthetic badge fixtures.
* Keep icon-only Danawa payment badges visible
Class-only Danawa payment icons can carry eligibility information without visible text, so synthesize display labels from the same normalized condition map used for types and booleans.
This keeps raw row labels, condition fields, and returned-window counts aligned for downstream table renderers.
Constraint: PR #253 review follow-up requires TDD coverage before parser changes.
Rejected: Leaving payment_badges text-only | icon-only conditional rows would still render without visible payment labels.
Confidence: high
Scope-risk: narrow
Directive: Derive future payment badge labels, types, and booleans from one canonical mapping.
Tested: python3 -m py_compile danawa-price-search/scripts/danawa_search.py scripts/test_danawa_price_search.py; PYTHONPATH=.:scripts python3 -m unittest scripts.test_danawa_price_search; python3 danawa-price-search/scripts/danawa_search.py offers 75001853 --limit 5; npm run lint; npm run typecheck; npm run test
Not-tested: Danawa icon-only markup was verified with synthetic fixtures rather than a live page snapshot.
* Merge pull request #249 from NomaDamas/feature/#248
Feature/#248
* Restore SH notice lookup without proxy policy drift
Reintroduce SH notice search as a direct public HTML client so the skill complies with the free-API proxy boundary while preserving verifiable keyword, pagination, and attachment behavior.
Constraint: i-sh.co.kr board is public unauthenticated HTML, so k-skill-proxy must not host the scraper.
Rejected: Re-adding /v1/sh-notice proxy routes | public HTML scraping in proxy violates repository policy.
Confidence: high
Scope-risk: moderate
Directive: Keep SH public HTML access local/direct unless a key-required official free API is discovered and documented.
Tested: npm run ci; npm run lint --workspace sh-notice-search; npm test --workspace sh-notice-search; live SH smoke for 행복주택, 매입임대, 신혼희망타운, page 1/page 5, 1/6/9/11/0 attachment details.
Not-tested: authenticated SH flows, housing-subscription (청약) application/submission, direct attachment downloads.
* Preserve public SH helper semantics
Route exported URL builders through the same normalization as the CLI/API so natural category aliases cannot bypass srchTp title narrowing or category mapping.
Constraint: PR #254 review found exported helper callers could pass Korean/English public category inputs and get broken or broadened SH URLs.
Rejected: Keep normalized-only fast paths | exported helpers are public API and must protect natural inputs.
Confidence: high
Scope-risk: narrow
Directive: Keep exported helper behavior aligned with normalizeSearchOptions and normalizeDetailOptions when adding new public aliases.
Tested: npm test --workspace sh-notice-search; npm run lint --workspace sh-notice-search; npm run typecheck; npm run ci; node helper smoke for 임대 search/detail URLs.
Not-tested: Live SH network smoke was not rerun for this helper-only change.
* Preserve SH parser helper aliases
Route exported parser helpers through the same public normalizers used by the SH fetch and URL-builder APIs so natural category aliases stay consistent across the package surface.
Constraint: PR #254 Round 2 review found parser helpers still treated raw category aliases as pre-normalized inputs.
Rejected: Keep parser helpers normalized-only | inconsistent with exported URL builders and public helper ergonomics.
Confidence: high
Scope-risk: narrow
Directive: Keep exported SH helper entry points on canonical normalizeSearchOptions/normalizeDetailOptions unless a separate internal-only API is introduced.
Tested: npm test --workspace sh-notice-search; npm run lint --workspace sh-notice-search; npm run typecheck; npm pack --workspace sh-notice-search --dry-run; npm run ci; parser smoke for Korean 임대 list/detail helpers; Ralph architect verification CLEAR; post-deslop regression npm run ci
Not-tested: Live SH network smoke for this follow-up; fixture and injected-fetch coverage exercised the helper contract.
* Make SH parser failures explicit
Warn when SH returns block or maintenance HTML without the expected public board markup, and constrain exposed preview links to the SH converter origin/path.
Constraint: Round 3 review required TDD coverage for block/maintenance HTML and untrusted preview URLs.
Rejected: Throwing on unexpected HTML | Existing parser helpers return partial fixture-friendly results, so warnings preserve compatibility while exposing failure evidence.
Confidence: high
Scope-risk: narrow
Directive: Keep SH public HTML lookup direct; do not add proxy routing unless a key-required official free API is adopted.
Tested: npm run lint --workspace sh-notice-search; npm test --workspace sh-notice-search; npm run typecheck; npm pack --workspace sh-notice-search --dry-run; npm run ci; Node smoke for blocked HTML warnings and external preview filtering.
Not-tested: Live blocked/NetFunnel SH response, because no live blocked page was available during implementation.
* ci: install beautifulsoup4 so danawa price search tests can import bs4
The new scripts/test_danawa_price_search.py imports danawa_search.py, which requires beautifulsoup4. CI only runs npm ci, so the bs4 import fails with 'beautifulsoup4 is required: python -m pip install beautifulsoup4' and the validate job exits with code 1. Install beautifulsoup4 via pip before running npm run ci so the Python test suite can import danawa_search and run the new payment badge regression tests.
* Revert "ci: install beautifulsoup4 so danawa price search tests can import bs4"
This reverts commit `8330e5adf7`.
* test: install beautifulsoup4 inside npm test before Python tests
The new scripts/test_danawa_price_search.py imports danawa_search.py, which requires beautifulsoup4. CI runs npm ci + npm run ci and does not install Python packages, so the bs4 import fails at module load.
Install beautifulsoup4 via 'pip install --user' as the first step of the test script so it is available when Python unittests import the danawa helper. Local dev environments are unaffected because pip install is idempotent and quiet.
* feat(qa-bot): add k-skill-qa-bot under tools/
External macOS daemon that clones NomaDamas/k-skill main every 3 days, runs each skill through codex exec, has an LLM judge grade pass/fail/skip via codex exec --output-schema, and files dedup'd GitHub issues for true failures.
Layout:
- install.sh copies tools/k-skill-qa-bot/ to ~/.local/share/k-skill-qa-bot/ and registers a LaunchAgent at ~/Library/LaunchAgents/.
- update-clone.sh has a hard guard: refuses any K_SKILL_CLONE outside K_QA_HOME/k-skill-clone unless ALLOW_EXTERNAL_CLONE_TARGET=1.
- Force-skip 10 destructive/login-required skills (ktx-booking, srt-booking, catchtable-sniper, kakaotalk-mac, hipass-receipt, toss-securities, etc.) so the bot never triggers reservation abuse.
- Deprecated skills (strike-through + 지원 중단 in README) auto-detected and skipped, never failed.
- First-run safety: CREATE_ISSUES=false by default.
- mkdir-based concurrency lock with atomic stale reclaim.
- Issue dedup: sha1(skill_name + symptom_class)[:12] body marker.
- Deterministic gates override LLM judge to FAIL on exit_code != 0, missing VERDICT line, or near-timeout duration.
* Support nearby ER status checks
Add an E-Gen based emergency-room skill that resolves a user location, queries the public nearby emergency-room list, and reports operation flags while documenting that exact remaining bed counts are not exposed by this surface.
Constraint: Issue #255 requested NEMC emergency bed status using public monitoring/E-Gen surfaces.
Rejected: Scraping private monitoring dashboards or claiming exact bed utilization | public endpoints expose operation flags, not per-hospital remaining bed counts.
Confidence: high
Scope-risk: narrow
Directive: Preserve the public-data limitation text unless a verified official bed-count endpoint is added.
Tested: npm run lint --workspace emergency-room-beds; npm test --workspace emergency-room-beds; node --test scripts/skill-docs.test.js; npm run typecheck; npm pack --workspace emergency-room-beds --dry-run; ./scripts/validate-skills.sh; live E-Gen coordinate smoke.
Not-tested: npm run ci end-to-end due local Python 3.14 pip/pyexpat import error before tests.
* Prevent ER status ambiguity from reaching users
Constraint: Health-adjacent public E-Gen/Kakao data can be absent, delayed, schema-drifted, or partially unknown.
Rejected: Mapping all non-Y operation flags to false | It misrepresents missing upstream data as a negative operating status.
Rejected: Treating unknown E-Gen payloads as empty results | It hides upstream failure behind a false no-results response.
Confidence: high
Scope-risk: narrow
Directive: Keep unknown health availability data explicit and preserve upstream failure evidence.
Tested: npm run lint --workspace emergency-room-beds; npm test --workspace emergency-room-beds; node --test scripts/skill-docs.test.js; npm run typecheck; npm pack --workspace emergency-room-beds --dry-run; ./scripts/validate-skills.sh; direct Node smoke for tri-state/schema/coordinate guards.
Not-tested: npm run ci due pre-existing local Python 3.14 pyexpat/libexpat bootstrap failure noted on PR.
Co-authored-by: OmX <omx@oh-my-codex.dev>
* fix(ci): exclude tools/ from skill validator
The tools/ directory hosts repo tooling (e.g. k-skill-qa-bot), not skills, so validate-skills.sh should skip it like other non-skill top-level directories.
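The tri-state handling described in the "Prevent ER status ambiguity from reaching users" commit above can be sketched roughly as follows. This is a minimal illustration, not the package's code: the function name `parse_operation_flag` is hypothetical, and treating "N" as the explicit negative value is an assumption (the commit message only states that non-Y values must not all collapse to false).

```python
from typing import Optional


def parse_operation_flag(raw: Optional[str]) -> Optional[bool]:
    """Map an upstream operation flag to True, False, or None (unknown).

    Hypothetical helper: "Y" means operating, "N" (assumed) means not
    operating, and anything else stays explicitly unknown instead of
    being reported as a negative status.
    """
    if raw is None:
        return None
    value = raw.strip().upper()
    if value == "Y":
        return True
    if value == "N":
        return False
    # Missing, empty, or schema-drifted values are kept as unknown.
    return None
```

Returning `None` rather than `False` keeps "upstream did not say" distinguishable from "upstream said no", which is the distinction the commit's rejected alternatives are about.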
* Add cinema search skill (#260)
* Add korean cinema search skill
* Document playDate for cinema skill
* feat(kstartup-search): KISED (창업진흥원) K-Startup lookup skill + 4 proxy routes (#259)
* feat(kstartup-search): add KISED (창업진흥원) K-Startup lookup skill and proxy routes
Add a skill that queries, via k-skill-proxy, the 4 endpoints of data.go.kr (Public Data Portal) dataset 15125364 (창업진흥원_K-Startup(사업소개,사업공고,콘텐츠 등)_조회서비스).
- New routes: GET /v1/kstartup/{business-info,announcements,contents,statistics}
  - Relayed to getBusinessInformation01/getAnnouncementInformation01/getContentInformation01/getStatisticalInformation01 respectively
  - ServiceKey is injected from the server-side DATA_GO_KR_API_KEY; returnType=json is forced
  - Only successful responses are cached; data.go.kr error envelopes (resultCode != "00", errMsg, etc.) bypass the cache
- helper: kstartup-search/scripts/run_kstartup.py (stdlib only)
  - Normal lookups use the hosted proxy → no user key needed
  - The --direct option calls upstream directly with the user's own KSKILL_KSTARTUP_API_KEY (or DATA_GO_KR_API_KEY), and the key is redacted under --dry-run
  - Input validation: page/perPage integer and range checks, YYYYMMDD dates with start date ≤ end date, Y/N uppercasing, text-field length caps, 4-digit biz_yr
- Tests: 10 new k-skill-proxy server tests (normalizer, routes, cache separation, forced returnType=json, 503/400/502, key-leak regression), 13 Python unittests
- Docs: SKILL.md, docs/features/kstartup-search.md, README table/list, docs/sources.md, .changeset/kstartup-search.md (k-skill-proxy minor)
* docs(kstartup-search): add K-Startup entries to docs/setup · security · k-skill-setup · proxy README
Add the following, using the same placement and wording as the seoul-density · KOSIS · NTS precedents.
- docs/setup.md: add KSKILL_KSTARTUP_API_KEY to the dotenv example, add a K-Startup row to the credential table, add an entry to the "다음에 볼 문서" (what to read next) list
- docs/security-and-secrets.md: add KSKILL_KSTARTUP_API_KEY to the standard variable names, add K-Startup to the list of skills that use the hosted proxy and to the proxy operations prose, add a dotenv example
- k-skill-setup/SKILL.md: add K-Startup guidance to the credential resolution prose and the secrets summary table
- packages/k-skill-proxy/README.md: add /v1/kstartup/{business-info,announcements,contents,statistics} to the route list
- docs/features/k-skill-proxy.md: add the same 4 routes to the route list
* fix(kstartup-search): strict calendar-date validation in Python helper
validate_yyyymmdd() previously only checked month in [1,12] and day in [1,31], which accepted impossible dates like 20240230 or 20240431 in --direct mode. The proxy-side normalizer in packages/k-skill-proxy/src/kstartup.js already uses Date.UTC() to reject such inputs, so this aligns the --direct path with the proxy path and eliminates validator drift. Uses datetime.date(year, month, day) and raises HelperError on ValueError. Adds regression test covering impossible calendar dates (Feb 30, Apr 31, month 13, day 0) and the leap-year boundary (2024-02-29 valid, 2023-02-29 not).
---------
Co-authored-by: Jeffrey (Dongkyu) Kim <vkehfdl1@gmail.com>
* fix(qa-bot): upgrade judge to gpt-5.5 and run codex with sandbox bypass
PR #257 follow-up. Two changes:
1. JUDGE_MODEL default: gpt-5.4-mini -> gpt-5.5
The cheaper judge was misclassifying every wrong-output verdict because the offline matcher fell through to the dumb 'VERDICT: FAIL in transcript' check. Re-running the same 10 historical fail cases with gpt-5.5 + real LLM judge correctly reclassified 7 of them as pass (the codex agent actually accomplished the skill goal) and the remaining 3 as network-error / partial-success / skip with accurate reasons.
2. Drop -s read-only, add --dangerously-bypass-approvals-and-sandbox
The read-only codex sandbox was triggering spurious DNS resolution failures inside the test runs (host blocked at the syscall level even for legitimate proxy / public-API calls). Live re-test with the bypass flag and provider pin produced clean transcripts: cheap-gas-nearby, daangn-realty-search, han-river-water-level, naver-news-search, naver-shopping-search, seoul-density, seoul-subway-arrival all PASS. The QA bot is sandboxed externally by launchd anyway.
3. New CODEX_PROVIDER env (default: openai)
Lets users pin the codex model_provider explicitly so the bot does not accidentally route through a private OpenAI-compatible proxy that may not have keys registered for all model names.
* Add Ohou today deal skill
* fix spacing in package.json
* fix(qa-bot): per-skill test_prompt overrides and smarter judge
11 skills that need specific inputs (not just a 'demonstrate' query) now ship with a hardcoded test_prompt in config/skill-overrides.yml:
flight-ticket-search: ICN -> NRT, 2026-08-20 one-way
nts-business-registration: 124-81-00998 (Samsung Electronics)
korean-stock-search: 005930 Samsung 5-day quote
joseon-sillok-search: 키워드 훈민정음
korean-law-search: 산업안전보건법 제5조
library-book-search: 코스모스 칼 세이건
lotto-results: latest round
k-schoollunch-menu: 서울특별시교육청 초등학교 오늘 식단
delivery-tracking: CJ dummy invoice (negative case ok)
ticket-availability: YES24 / 인터파크 sample
zipcode-search: 서울특별시 강남구 테헤란로 152
These were previously synthesized from the SKILL.md first When-to-use bullet, which is a one-line teaser without concrete inputs. The agent would then either ask the user for the missing input (partial-success) or fall back to a generic demo (often producing a VERDICT: FAIL response). Both got mis-classified as fail by the judge. qa_utils.synthesize_test_prompt now honors default_inputs.test_prompt as a verbatim override (only appending the VERDICT line if the override does not already include it).
Two additional fixes for negative-case correctness:
1. judge-prompt.md: explicitly tells the judge that the agent's literal VERDICT: PASS / VERDICT: FAIL is just a hint, not binding. A skill that correctly returns 'no such business number' or 'invoice not found' for a deliberately invalid input is PASS, not fail.
2. judge-skill.py: drop the deterministic gate that flipped pass to fail when 'VERDICT: PASS' literal was missing from the transcript. That gate was producing false fails for negative-case tests where the agent correctly responded with VERDICT: FAIL because the skill rejected an invalid input. The judge LLM (gpt-5.5) is now trusted to evaluate the transcript against the SKILL.md 'Done when' criteria.
Verified live:
- nts-business-registration with valid number -> pass/success (0.99)
- nts-business-registration with fake number -> pass/success (0.99)
- flight-ticket-search ICN->NRT 2026-08-20 -> pass/success (0.99)
* fix(ohou-today-deal): address PR #264 review (live UA, explicit feed selection, argv validators)
- HIGH: switch fetch_html() to well-formed bot UA with contact URL (k-skill-ohou-today-deal/1.0 (+https://github.com/NomaDamas/k-skill)). ohou.se Akamai bot manager 403s anonymous UAs but allows identified bot UAs that include a contact URL. Live default workflow now returns 74 deals end-to-end instead of failing with HTTP 403.
- MEDIUM: extract_deals() now explicitly selects React Query entries with queryKey == ['today-deal-feed'] or ['special-today-deal-feed'] and reads only state.data.todayDealFeed.slots[type=='DEAL']. Unrelated DEAL-shaped nodes from navigation/banner modules are excluded. Legacy fixture/JSON-payload fallback path preserved for tests that construct simplified payloads.
- LOW: --limit now requires a positive integer; --min-discount is constrained to 0..100. Both validated via argparse.ArgumentTypeError so users get a clear CLI error instead of silent slicing or nonsensical thresholds.
- Tests: add 9 new unit tests covering explicit feed selection, navigation/GOODS exclusion, fallback compatibility, and argv validators. Strengthen skill-docs.test.js to lock the special-today-deal-feed surface and well-formed UA signature.
- Docs: update SKILL.md and feature doc to document the explicit today-deal-feed + special-today-deal-feed extraction boundary and the Akamai UA policy.
* Merge pull request #263 from NomaDamas/feature/#257
Feature/#257
* Feature/#256 (#266)
* Enable public local-election candidate lookups
Add an NEC integrated-search skill and helper package so agents can answer local-election candidate (지방선거 후보자) lookup requests without credentials or proxy routes.
Constraint: Issue #256 requested TDD, Ralph completion, branch feature/#256, and PR targeting dev.
Rejected: k-skill-proxy route | NEC integrated candidate search is public and requires no API key.
Confidence: high
Scope-risk: moderate
Directive: Keep the helper read-only and do not automate NEC login, CAPTCHA, filing, or privileged election workflows.
Tested: git diff --check; node --test packages/local-election-candidate-search/test/index.test.js; npm run lint --workspace local-election-candidate-search; npm run test --workspace local-election-candidate-search; npm pack --workspace local-election-candidate-search --dry-run; node packages/local-election-candidate-search/src/cli.js 오세훈 --election 시도지사 --region 서울 --limit 1; PATH=/usr/bin:/bin:/usr/sbin:/sbin:/opt/homebrew/bin:/Users/jeffrey/.codex/tmp/arg0/codex-arg0a6JueA:/opt/homebrew/lib/node_modules/@openai/codex/node_modules/@openai/codex-darwin-arm64/vendor/aarch64-apple-darwin/path:/Users/jeffrey/.cmuxterm/omo-bin:/opt/homebrew/share/android-commandlinetools/platform-tools:/opt/homebrew/share/android-commandlinetools/emulator:/opt/homebrew/share/android-commandlinetools/cmdline-tools/latest/bin:/Users/jeffrey/.local/bin:/Users/jeffrey/.bun/bin:/opt/homebrew/opt/node@22/bin:/opt/homebrew/opt/openjdk@21/bin:/opt/homebrew/opt/postgresql@18/bin:/Users/jeffrey/.jenv/shims:/Users/jeffrey/.jenv/bin:/opt/homebrew/opt/imagemagick/bin:/opt/homebrew/Cellar/pyenv-virtualenv/1.4.0/shims:/Users/jeffrey/.pyenv/shims:/opt/homebrew/opt/openssl@3/bin:/Users/jeffrey/.rbenv/shims:/Users/jeffrey/.rbenv/bin:/Users/jeffrey/google-cloud-sdk/bin:/Applications/cmux.app/Contents/Resources/bin:/Users/jeffrey/Library/pnpm:/Users/jeffrey/.nvm/versions/node/v24.13.0/bin:/Users/jeffrey/.cops/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin:/opt/pmk/env/global/bin:/Library/Apple/usr/bin:/Library/TeX/texbin:/Users/jeffrey/.cargo/bin:/Users/jeffrey/Library/Application 
Support/JetBrains/Toolbox/scripts:/Library/Java/JavaVirtualMachines/zulu-17.jdk/Contents/Home/bin:/Users/jeffrey/xcode-projects/marshroom/cli npm run ci
Not-tested: Exhaustive NEC markup variants for every historical election type.
Co-authored-by: OmX <omx@oh-my-codex.dev>
* Enforce fail-closed candidate identity parsing
Constraint: PR #266 review required exact candidate-name matching and CLI help regression coverage.
Rejected: fallback-to-query-name on missing upstream markup | it can mislabel unrelated candidates as exact matches.
Confidence: high
Scope-risk: narrow
Directive: Keep NEC parser changes fail-closed when candidate identity cannot be parsed.
Tested: git diff --check; node --test packages/local-election-candidate-search/test/index.test.js; npm run lint --workspace local-election-candidate-search; npm run test --workspace local-election-candidate-search; npm pack --workspace local-election-candidate-search --dry-run; live CLI smoke for 오세훈; CLI --help smoke.
Not-tested: repo-wide npm run ci remains blocked by pre-existing missing SKILL.md: ohou-today-deal.
* Preserve unique candidate lookup results
Deduplicate parsed NEC candidate/election rows before applying user limits, and make expected CLI validation failures concise by default while keeping an explicit debug stack escape hatch.
Constraint: PR #266 round-2 follow-up requested TDD fixes for duplicate NEC rows and CLI validation UX.
Rejected: Deduplicating after limit | would still allow duplicates to crowd out unique rows.
Rejected: Always printing stack traces | exposes local paths for normal user-input failures.
Confidence: high
Scope-risk: narrow
Directive: Keep dedupe keys stable enough to avoid collapsing legitimately distinct historical election rows.
Tested: git diff --check; node --test packages/local-election-candidate-search/test/index.test.js; npm run lint --workspace local-election-candidate-search; npm run test --workspace local-election-candidate-search; npm pack --workspace local-election-candidate-search --dry-run; live 오세훈 smoke; live 김동연 duplicate repro; CLI no-args/help.
Not-tested: Full npm run ci remains blocked by pre-existing missing SKILL.md: ohou-today-deal.
* Prevent filtered NEC lookup false negatives
Fix the candidate parser so documented education-superintendent and filtered local-election lookups return bounded, evidence-backed results instead of silently dropping valid rows.
Constraint: PR #266 round-3 review required TDD, Ralph verification, and branch update for issue #256.
Rejected: Full NEC pagination in this follow-up | broader than the approved change; bounded 100-row fetch now avoids user-limit false negatives and warns when capped.
Confidence: high
Scope-risk: narrow
Directive: Preserve exact-name fail-closed parsing and count raw parsed upstream rows before cap-warning decisions.
Tested: git diff --check; node --test packages/local-election-candidate-search/test/index.test.js; npm run lint --workspace local-election-candidate-search; npm run test --workspace local-election-candidate-search; npm pack --workspace local-election-candidate-search --dry-run; live CLI smokes for 오세훈, 조희연, 김동연; CLI help/no-args checks; architect verification CLEAR.
Not-tested: Full npm run ci remains blocked by pre-existing repo-wide missing SKILL.md: ohou-today-deal.
---------
Co-authored-by: OmX <omx@oh-my-codex.dev>
* chore(changesets): rename daiso bearer-auth changeset to avoid name collision with consumed main release
PR #245 already consumed .changeset/issue-207-daiso-pickup-eligibility.md into daiso-product-search v0.3.0 on main. The dev branch later modified that same changeset file in `d7263a5` to describe the newer Bearer-auth fix, which collides with main's deletion on the next dev→main sync. Renaming the still-unreleased Bearer-auth note to issue-207-daiso-bearer-auth.md preserves the release entry for the next version-packages run and clears the modify/delete conflict on PR #271 without losing the changelog content.
* fix(kstartup-search): implement promised client-side filter to deliver on SKILL.md L121
Live data revealed two unmet contracts in the kstartup-search helper:
1. SKILL.md L121 promised the helper re-applies supt_regin / aply_trgt / biz_enyy filters on the client side because K-Startup upstream ignores them server-side. The helper had no such logic: calling `--supt-regin 서울특별시 --rcrt-prgs-yn Y` returned 경북/충북/충남 announcements as-is, silently misleading callers.
2. The upstream `supt_regin` field is stored as the short form (`서울`, `경기`, `충북`, ...) but every CLI example in the skill used the standard metropolitan/provincial (광역지자체) long form (`서울특별시`), which would never substring-match even after a client filter was added.
Add `apply_client_filters()` that runs after `urlopen` returns. It honors the SKILL.md contract literally: substring match per token, AND-joined across comma-separated user values, with a 17-region (+`전국`) shortname normalisation table so both `--supt-regin 서울특별시` and `--supt-regin 서울` resolve to upstream's `서울`. Filtered responses expose a new `client_filter: {fields, upstream_returned, after_filter}` metadata block so callers can detect "first page was depleted by filter" and page through.
Tests: 9 new ClientFilterTests + 2 normalisation tests on top of the existing 14 (25 total, all passing).
Live smoke (against a dev proxy with DATA_GO_KR_API_KEY activated for dataset 15125364): `--supt-regin 서울특별시 --rcrt-prgs-yn Y --per-page 10` now returns 4 actual 서울 announcements (upstream returned 10 mixed-region rows; client filter narrowed to 4), with detl_pg_url to k-startup.go.kr.
Confidence: high.
Scope-risk: narrow; purely additive on the response path; other endpoints (business-info / contents / statistics) pass through unchanged.
---------
Co-authored-by: arnold714 <arnold714@naver.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: chanmin <cmju@cowave.kr>
Co-authored-by: OmX <omx@oh-my-codex.dev>
Co-authored-by: hmmhmmhm/ <hmmhmmhm@naver.com>
Co-authored-by: 배기민 <53887180+BAEM1N@users.noreply.github.com>
Co-authored-by: lee-ji-hong <zhffktkdlekghksxk@naver.com>
2026-05-19 11:08:10 +09:00
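The client-side filter contract described in the kstartup-search commit above (substring match per token, AND-joined across comma-separated values, short-form region normalisation, and a `client_filter` metadata block) might look roughly like the sketch below. It is an illustration under stated assumptions, not the helper's actual code: the function name `apply_client_filters_sketch`, the `items` key, and the three-entry `REGION_SHORT` table are hypothetical, whereas the commit describes a 17-region (+`전국`) table.

```python
from typing import Dict, List

# Hypothetical, partial long-form -> short-form table (the real one is larger).
REGION_SHORT: Dict[str, str] = {
    "서울특별시": "서울",
    "경기도": "경기",
    "충청북도": "충북",
}


def apply_client_filters_sketch(rows: List[dict], filters: Dict[str, str]) -> dict:
    """Keep rows whose fields contain every comma-separated token (AND)."""
    fields = [name for name, value in filters.items() if value]
    kept = []
    for row in rows:
        matches = True
        for name in fields:
            tokens = [t.strip() for t in filters[name].split(",") if t.strip()]
            if name == "supt_regin":
                # Normalise long region names to the upstream short form.
                tokens = [REGION_SHORT.get(t, t) for t in tokens]
            haystack = str(row.get(name, ""))
            if not all(token in haystack for token in tokens):
                matches = False
                break
        if matches:
            kept.append(row)
    return {
        "items": kept,
        "client_filter": {
            "fields": fields,
            "upstream_returned": len(rows),
            "after_filter": len(kept),
        },
    }
```

A caller comparing `upstream_returned` with `after_filter` can tell that a page was thinned out by the filter and decide to request the next page.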
..
Name	Last commit	Date
bin	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00
config	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00
launchd	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00
test	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00
.gitignore	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00
AGENTS.md	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00
install.sh	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00
Makefile	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00
README.md	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00
uninstall.sh	Sync dev → main: 6 new skills (emergency-room-beds · korean-cinema-search · kstartup-search · local-election-candidate-search · ohou-today-deal · sh-notice-search) + k-skill-qa-bot + daiso/danawa enhancements (#271)	2026-05-19 11:08:10 +09:00

README.md

k-skill-qa-bot

Automated QA daemon for the k-skill skill library. It runs every 3 days via macOS launchd, exercises every suitable skill through codex exec --json --dangerously-bypass-approvals-and-sandbox, has a read-only/no-approval LLM judge grade each result as pass, fail, or skip, and files deduplicated GitHub issues for skills that have broken.

What it does

Refreshes a shallow clone of NomaDamas/k-skill main every 3 days.
Discovers every <skill>/SKILL.md.
Classifies each skill (read-only / location / login / destructive / api-key / proxy-dependent / deprecated).
Runs each suitable skill through codex exec --json --dangerously-bypass-approvals-and-sandbox with a smoke-test prompt synthesized from the skill's ## When to use bullets. The daemon runs as a dedicated LaunchAgent with non-interactive approvals; avoiding the Codex sandbox prevents false DNS/network failures during skill smoke tests.
Judges the result via a second read-only/no-approval codex exec call using the configured judge model and a strict JSON Schema.
Files dedup'd issues on NomaDamas/k-skill for true failures (with auto-qa label). Skipped skills (deprecated, login-required, missing API key) never create issues.
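
The classify, run, judge, and file steps above amount to two small decisions per skill: whether to run it at all, and whether a judged result should become an issue. The Python sketch below illustrates that logic only. The `NEVER_RUN` set, the `Outcome` type, and the `issue_title` format are assumptions made for this example, not the bot's actual code.

```python
from dataclasses import dataclass

# Assumption: these categories are never smoke-tested. The real policy lives
# in the bot's classifier and config/skill-overrides.yml.
NEVER_RUN = {"login", "destructive", "deprecated"}


@dataclass(frozen=True)
class Outcome:
    skill: str
    category: str
    verdict: str  # "pass", "fail", or "skip", as graded by the judge


def is_suitable(category: str, has_api_key: bool = True) -> bool:
    """Decide whether a skill should be run at all."""
    if category in NEVER_RUN:
        return False
    if category == "api-key" and not has_api_key:
        return False
    return True


def issue_title(skill: str) -> str:
    # Hypothetical title format, used here only as a dedup key.
    return f"[auto-qa] {skill} smoke test failed"


def should_file_issue(outcome: Outcome, create_issues: bool, open_titles: set) -> bool:
    """File only for true failures, only when opted in, and never twice."""
    if not create_issues or outcome.verdict != "fail":
        return False
    return issue_title(outcome.skill) not in open_titles
```

Keeping the dedup key a pure function of the skill name means a repeated failure maps to the same open issue instead of a new one.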

The bot never modifies the k-skill repo itself; the repo is a read-only single source of truth (SSOT). Test prompts are synthesized from each SKILL.md.
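
As an illustration of synthesizing a prompt from SKILL.md, the Python sketch below pulls the bullets under a `## When to use` heading and folds them into a single prompt. The function names and the prompt wording are hypothetical; the bot's real template is not shown in this README.

```python
import re


def when_to_use_bullets(skill_md: str) -> list[str]:
    """Collect the bullet items under the '## When to use' heading."""
    bullets = []
    in_section = False
    for line in skill_md.splitlines():
        if line.startswith("## "):
            # Any level-2 heading either opens or closes the section.
            in_section = line[3:].strip().lower() == "when to use"
            continue
        if in_section:
            match = re.match(r"\s*[-*]\s+(.*\S)", line)
            if match:
                bullets.append(match.group(1))
    return bullets


def synthesize_prompt(skill_name: str, skill_md: str):
    """Return a smoke-test prompt, or None when there is nothing to test."""
    bullets = when_to_use_bullets(skill_md)
    if not bullets:
        return None
    scenarios = "\n".join(f"- {b}" for b in bullets)
    # Hypothetical wording, chosen only for this example.
    return (
        f"Use the {skill_name} skill for one of these scenarios "
        f"and report the result:\n{scenarios}"
    )
```

Returning None for a SKILL.md without usable bullets lets the caller record a skip rather than run an empty prompt.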

Install

Prereqs (one-time):

brew install bats-core coreutils gh jq python@3
pip3 install pyyaml jsonschema pytest

codex --version       # codex-cli >= 0.130
codex login           # one-time

gh auth login         # one-time, needs `repo` scope

Then:

cd /path/to/k-skill
bash tools/k-skill-qa-bot/install.sh

Re-run install.sh to upgrade — it is idempotent and preserves state/.

Configure

The default CREATE_ISSUES=false means the first run does NOT file any issues. After reviewing the first summary.md, opt in:

echo 'CREATE_ISSUES=true' >> ~/.local/share/k-skill-qa-bot/.env

Overridable variables (see config/defaults.sh):

Var	Default	Meaning
`CREATE_ISSUES`	`false`	File GH issues for failures
`CODEX_MODEL`	`gpt-5.5`	Model for skill exec
`JUDGE_MODEL`	`gpt-5.5`	Model for LLM judge
`CODEX_PROVIDER`	`openai`	Codex model provider for skill exec and judge calls
`TIMEOUT_SECS`	`180`	Per-skill timeout
`JUDGE_TIMEOUT_SECS`	`60`	Per-judge timeout
`MAX_PARALLEL`	`4`	Concurrent skill tests
`LAST_RUN_MIN_AGE`	`259200`	Min seconds between runs (72h)
`GH_REPO`	`NomaDamas/k-skill`	Where to file issues
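
To show how the defaults in the table combine with the `.env` file, here is a Python sketch of one plausible layering. It assumes `.env` holds plain KEY=VALUE lines and that a later line wins, as appending with `>>` would suggest. The bot itself may simply source the file from shell, so treat `parse_env` and `effective_config` as hypothetical helpers.

```python
# Defaults copied from the table above.
DEFAULTS = {
    "CREATE_ISSUES": "false",
    "CODEX_MODEL": "gpt-5.5",
    "JUDGE_MODEL": "gpt-5.5",
    "CODEX_PROVIDER": "openai",
    "TIMEOUT_SECS": "180",
    "JUDGE_TIMEOUT_SECS": "60",
    "MAX_PARALLEL": "4",
    "LAST_RUN_MIN_AGE": "259200",
    "GH_REPO": "NomaDamas/k-skill",
}


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, ignoring blanks and '#' comments."""
    overrides = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # A later assignment to the same key replaces the earlier one.
        overrides[key.strip()] = value.strip().strip("'\"")
    return overrides


def effective_config(env_text: str) -> dict:
    """Overlay recognised .env keys on top of the defaults."""
    config = dict(DEFAULTS)
    for key, value in parse_env(env_text).items():
        if key in DEFAULTS:
            config[key] = value
    return config
```

For example, `effective_config("CREATE_ISSUES=true")["CREATE_ISSUES"]` yields `"true"`, while every other key keeps its default.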

config/skill-overrides.yml controls per-skill force_skip and category overrides. Destructive booking flows (ktx-booking, srt-booking, catchtable-sniper, etc.) and session-required skills (kakaotalk-mac, hipass-receipt, toss-securities, iros-registry-automation) are force-skipped by default so the bot never abuses an account.
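
The exact layout of config/skill-overrides.yml is not reproduced here, so the Python sketch below assumes it parses into a mapping from skill name to `force_skip` and `category` keys. The `OVERRIDES` entries and the `resolve` helper are illustrative only.

```python
# Assumed shape of the parsed YAML: {skill_name: {"force_skip": bool, "category": str}}.
# The entries below are examples, not the shipped overrides.
OVERRIDES = {
    "ktx-booking": {"force_skip": True},
    "kbo-results": {"category": "read-only"},
}


def resolve(skill: str, detected_category: str, overrides: dict) -> tuple:
    """Return (action, category) after applying any per-skill override."""
    entry = overrides.get(skill, {})
    category = entry.get("category", detected_category)
    action = "skip" if entry.get("force_skip") else "run"
    return (action, category)
```

Under these assumptions `force_skip` is checked before any codex call would be issued, which is what keeps booking flows from ever touching a real account.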

Logs and inspection

tail -f ~/Library/Logs/k-skill-qa-bot/stderr.log
cat ~/.local/share/k-skill-qa-bot/state/runs/$(ls -t ~/.local/share/k-skill-qa-bot/state/runs/ | head -1)/summary.md

The bot keeps the most recent 12 runs and purges older ones.
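
A retention rule like this one can be expressed in a few lines. The Python sketch below assumes run directory names sort chronologically (for example, timestamp-based names); the naming scheme and the `runs_to_purge` helper are assumptions, not taken from the bot.

```python
def runs_to_purge(run_dirs: list, keep: int = 12) -> list:
    """Return the run directory names that fall outside the newest `keep`."""
    # Assumption: lexicographic order of names matches chronological order.
    newest_first = sorted(run_dirs, reverse=True)
    return newest_first[keep:]
```

With 15 runs on disk, the three oldest names are returned; with 12 or fewer, the list is empty and nothing is removed.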

Force a run

~/.local/share/k-skill-qa-bot/bin/run-qa.sh --force
~/.local/share/k-skill-qa-bot/bin/run-qa.sh --force --only kbo-results
~/.local/share/k-skill-qa-bot/bin/run-qa.sh --force --dry-run     # no issues regardless of CREATE_ISSUES
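
How `--force` and `--dry-run` interact with `LAST_RUN_MIN_AGE` and `CREATE_ISSUES` can be sketched as two checks. This is a hypothetical Python model of the gating, with timestamps as Unix seconds; run-qa.sh may implement it differently.

```python
def should_run(now: int, last_run, min_age: int = 259200, force: bool = False) -> bool:
    """Run when forced, on the first run, or once min_age seconds have passed."""
    if force or last_run is None:
        return True
    return now - last_run >= min_age


def may_file_issues(create_issues: bool, dry_run: bool) -> bool:
    """--dry-run suppresses issue filing regardless of CREATE_ISSUES."""
    return create_issues and not dry_run
```

With the default of 259200 seconds, a scheduled invocation 71 hours after the previous run would exit early, while the same invocation with `force=True` would proceed.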

Uninstall

bash ~/.local/share/k-skill-qa-bot/uninstall.sh
bash ~/.local/share/k-skill-qa-bot/uninstall.sh --yes --purge --purge-logs

Safety

Skill smoke tests use --dangerously-bypass-approvals-and-sandbox because the Codex sandbox can block legitimate DNS/network lookups for the public skill endpoints that the tests exercise.
A dedicated LaunchAgent is scheduling isolation only; it is not a separate OS user, container, or filesystem sandbox.
The bot-managed clone is not write-protected from the unsandboxed smoke agent; treat it as mutable bot state and judge only against inputs whose provenance is understood.
The LLM judge stays on the safer -s read-only path with approval_policy="never"; read-only/no-approval limits writes and approval prompts, but does not make the judge a no-tools or file-isolated model call. Treat transcript and skill Markdown as untrusted input.
10 destructive/login-required skills are force-skipped before any codex call is issued.
Deprecated skills (marked ~~name~~ ⚠️ 지원 중단, "support discontinued", in the README) are detected and skipped.
update-clone.sh refuses any K_SKILL_CLONE outside K_QA_HOME/k-skill-clone unless ALLOW_EXTERNAL_CLONE_TARGET=1 (prevents the script from git-reset'ing the wrong directory).
CREATE_ISSUES=false first-run default prevents accidental issue spam.
Local state only: ~/.local/share/k-skill-qa-bot/. Expected network egress is limited to git fetch, codex API, gh API, k-skill-proxy health checks, and the public skill endpoints exercised by smoke tests.
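
The clone-target guard in update-clone.sh is a shell check. The Python sketch below only illustrates the same idea: resolve both paths, then accept the target only if it is the expected clone directory or lies beneath it. The `clone_target_allowed` helper and its parameters are hypothetical; `allow_external` stands in for ALLOW_EXTERNAL_CLONE_TARGET=1.

```python
from pathlib import Path


def clone_target_allowed(clone: str, qa_home: str, allow_external: bool = False) -> bool:
    """Allow the target only inside <qa_home>/k-skill-clone unless overridden."""
    expected = (Path(qa_home) / "k-skill-clone").resolve()
    # Resolving first collapses ".." segments and symlinks, so a path such as
    # <qa_home>/k-skill-clone/../../elsewhere cannot slip past the comparison.
    target = Path(clone).resolve()
    inside = target == expected or expected in target.parents
    return inside or allow_external
```

Comparing resolved paths rather than raw strings is the design choice that matters here, because a destructive git reset should never follow a misleading relative path.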

Troubleshooting

codex: command not found → check the plist's EnvironmentVariables.PATH. Default is /opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin.
gh: not authenticated → run gh auth login with repo scope.
gtimeout: command not found → brew install coreutils.
Inspect LaunchAgent state: launchctl print "gui/$(id -u)/org.nomadamas.k-skill-qa-bot" | head.
Force a re-run: launchctl kickstart -k "gui/$(id -u)/org.nomadamas.k-skill-qa-bot".
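
Several of the failures above come down to a binary missing from the LaunchAgent's PATH. The Python sketch below checks the commands named in this section against the default PATH quoted above, using the standard library's `shutil.which`. The `missing_commands` helper is hypothetical and not part of the bot.

```python
import shutil

# Default PATH quoted in the troubleshooting notes above.
DEFAULT_PATH = "/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin"
REQUIRED = ("codex", "gh", "gtimeout")


def missing_commands(commands=REQUIRED, path: str = DEFAULT_PATH) -> list:
    """Return the commands that cannot be found on the given PATH."""
    return [cmd for cmd in commands if shutil.which(cmd, path=path) is None]
```

Calling `missing_commands()` from an interactive shell checks the plist's default PATH rather than the shell's own, which is usually the difference that explains a "command not found" seen only under launchd.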