mirror of
https://github.com/NomaDamas/k-skill.git
synced 2026-06-24 02:04:11 +00:00
| name | description | license | metadata |
|---|---|---|---|
| hwp | Use kordoc for agent-native HWP/HWPX document parsing, JSON extraction, diffing, form-field extraction, and Markdown→HWPX reverse conversion. | MIT | |
HWP
What this skill does
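Use kordoc to convert .hwp / .hwpx / .hwpml documents into Markdown or JSON that AI can read easily and, when needed, to compare documents, extract form fields, and reverse-convert Markdown to HWPX.
The default engine for this skill is always kordoc. Conversion, comparison, field extraction, and reverse conversion are all handled consistently with the same tool.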
When to use
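- "Convert this HWP file to Markdown"
- "Extract this official document as a JSON structure so AI can read it"
- "I want to see the differences between two versions of a document"
- "Extract which form fields are inside this application-form HWPX"
- "Save the AI-generated Markdown back as HWPX"
- "Convert all the documents in a folder at once"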
When not to use
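- Only image-based PDFs are available, OCR is required, and no OCR provider is connected at all
- The files are `.docx`, `.xlsx`, or `.pdf`, but the task needs editor GUI automation rather than document parsing itself
- Real-time UI control of the original program is strictly required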
Prerequisites
- Node.js 18+
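- Write permission on the output path
- `kordoc` and `pdfjs-dist` installed in the same global or local environment, or an `npx --yes --package kordoc --package pdfjs-dist kordoc ...` execution environment that includes both
- The currently published `kordoc` CLI loads `pdfjs-dist` immediately at startup, so install both together even if you never process PDFs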
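The Node.js 18+ prerequisite can be checked before anything else runs. The sketch below is only an illustration: `meetsNodeRequirement` is a hypothetical helper name, not part of kordoc, and it inspects only the major version.

```javascript
// Minimal sketch: verify the "Node.js 18+" prerequisite.
// `meetsNodeRequirement` is a hypothetical helper, not a kordoc API.
function meetsNodeRequirement(versionString, minimumMajor = 18) {
  // process.versions.node looks like "20.11.1"; only the major part matters here.
  const major = Number.parseInt(String(versionString).split(".")[0], 10);
  return Number.isInteger(major) && major >= minimumMajor;
}

if (meetsNodeRequirement(process.versions.node)) {
  console.log(`Node.js ${process.versions.node} satisfies the 18+ requirement`);
} else {
  console.error(`Node.js 18+ is required, found ${process.versions.node}`);
}
```

This does not check whether `kordoc` and `pdfjs-dist` are installed; running the `--help` command from the workflow is the more direct test for that.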
Inputs
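- Path to the source `.hwp`, `.hwpx`, or `.hwpml` file, or a folder/glob path
- Desired output form: `markdown`, `json`, `hwpx`
- Output file or directory path
- Whether a page range is specified
- Whether comparison, form-field extraction, or reverse conversion is needed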
Routing policy
Default: kordoc
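All of the following tasks are handled with kordoc by default.
- HWP/HWPX/HWPML → Markdown
- HWP/HWPX/HWPML → JSON (`blocks`, `metadata`)
- Batch conversion
- Page-range parsing
- Structure extraction from official documents that contain images, tables, or forms
- Directory watch conversion (`watch`)
- Markdown→HWPX reverse conversion
- HWPX form-field extraction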
Optional library path
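If the CLI alone is not enough, use the Node API.
- `parse()`: Markdown plus structured blocks
- `compare()`: comparison of old and new document versions
- `extractFormFields()`: form-field extraction from parsed blocks
- `markdownToHwpx()`: Markdown→HWPX reverse conversion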
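The routing above, combined with the Notes section and the workflow steps, can be sketched as a small lookup table. The task labels (`to-markdown`, `form-fields`, and so on) are hypothetical names chosen for this example; only the function names `compare`, `extractFormFields`, and `markdownToHwpx` come from this document.

```javascript
// Hypothetical routing table; task labels are illustrative, not kordoc identifiers.
const ROUTES = {
  "to-markdown": { surface: "cli" },
  "to-json": { surface: "cli" },
  "batch": { surface: "cli" },
  "page-range": { surface: "cli" },
  "watch": { surface: "cli" },
  "compare": { surface: "node-api", fn: "compare" },
  "form-fields": { surface: "node-api", fn: "extractFormFields" },
  "markdown-to-hwpx": { surface: "node-api", fn: "markdownToHwpx" },
};

// Returns the surface (CLI or Node API) that this document uses for a task.
function routeTask(task) {
  if (!Object.hasOwn(ROUTES, task)) {
    throw new Error(`Unknown task: ${task}`);
  }
  return ROUTES[task];
}

console.log(routeTask("form-fields"));
```

Keeping the table explicit makes it easy to see that every task still goes through kordoc and that only the entry point differs.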
Workflow
1. Prepare the CLI runtime
For a one-off conversion, use the npx form that includes both packages directly.
npx --yes --package kordoc --package pdfjs-dist kordoc --help
If you need a global install for repeated runs:
npm install -g kordoc pdfjs-dist
The currently published kordoc CLI fails as early as `kordoc --help` when pdfjs-dist is missing,
so in a clean environment install both packages together before running it.
2. Prepare a local project for Node API examples
Run ESM examples such as parse(), compare(), extractFormFields(), and markdownToHwpx()
against a local project install, not a global NODE_PATH.
mkdir -p ./kordoc-local && cd ./kordoc-local
npm init -y
npm install kordoc pdfjs-dist
If the working directory already has a package.json, running `npm install kordoc pdfjs-dist` there is enough.
3. Convert a document to Markdown
npx --yes --package kordoc --package pdfjs-dist kordoc 보고서.hwp -o 보고서.md
To process several documents at once:
npx --yes --package kordoc --package pdfjs-dist kordoc ./문서함/* -d ./변환결과
To read only a specific page range:
npx --yes --package kordoc --package pdfjs-dist kordoc 보고서.hwp --pages 1-3
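When these commands are launched from a script, the argument list can be assembled programmatically. The sketch below uses only the flags shown in this step (`-o`, `-d`, `--pages`); `buildKordocArgs` is a hypothetical helper, and whether the flags can be combined in a single invocation is not verified here.

```javascript
// The npx prefix is copied from the commands in this document.
const NPX_PREFIX = ["--yes", "--package", "kordoc", "--package", "pdfjs-dist", "kordoc"];

// Hypothetical helper: builds the argument list for one of the command shapes above.
function buildKordocArgs({ input, outputFile, outputDir, pages }) {
  if (!input) throw new Error("input path is required");
  const args = [...NPX_PREFIX, input];
  if (outputFile) args.push("-o", outputFile); // single-file output
  if (outputDir) args.push("-d", outputDir);   // batch output directory
  if (pages) args.push("--pages", pages);      // page range such as "1-3"
  return args;
}

// Prints the same command line as the first example in this step.
console.log(["npx", ...buildKordocArgs({ input: "보고서.hwp", outputFile: "보고서.md" })].join(" "));
```

A glob such as `./문서함/*` is expanded by the shell in the commands above, so an argument array passed without a shell would need an explicit file list instead.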
4. Extract structured JSON for AI/automation
npx --yes --package kordoc --package pdfjs-dist kordoc 검토서.hwpx --format json > 검토서.json
In the JSON result, check success, markdown, blocks, and metadata first.
If tables or images matter, check the table and image types inside blocks.
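The checks described above can be sketched as a small function over the parsed JSON. The sample object is hypothetical: it is shaped after the field names mentioned in this step, the `type` key on each block and the `heading` value are assumptions, and real kordoc output may carry more fields.

```javascript
// Hypothetical sample shaped after the fields named in this step.
// The `type` key and the "heading" value are assumptions for illustration.
const sample = {
  success: true,
  markdown: "# 제목",
  blocks: [{ type: "heading" }, { type: "table" }, { type: "image" }, { type: "table" }],
  metadata: {},
};

// Reports missing top-level fields and counts blocks per type.
function summarizeResult(result) {
  const problems = [];
  if (result.success !== true) problems.push("success is not true");
  if (typeof result.markdown !== "string") problems.push("markdown is missing");
  if (!Array.isArray(result.blocks)) problems.push("blocks is missing");
  if (result.metadata == null) problems.push("metadata is missing");

  const counts = new Map();
  for (const block of Array.isArray(result.blocks) ? result.blocks : []) {
    const type = block?.type ?? "unknown";
    counts.set(type, (counts.get(type) ?? 0) + 1);
  }
  return { ok: problems.length === 0, problems, counts: Object.fromEntries(counts) };
}

console.log(summarizeResult(sample));
```

In practice the input would come from `JSON.parse` on the file written by the `--format json` command.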
5. Inspect HWPX form fields from parsed blocks
node --input-type=module - <<'EOF'
import { parse, extractFormFields } from "kordoc";
const result = await parse("신청서.hwpx");
if (!result.success) {
  console.error(result.error);
  process.exit(1);
}
const fields = extractFormFields(result.blocks);
console.log(JSON.stringify(fields, null, 2));
EOF
For a folder that keeps receiving documents to convert automatically, use the CLI's watch command.
npx --yes --package kordoc --package pdfjs-dist kordoc watch ./문서함
6. Reverse-convert Markdown back to HWPX
node --input-type=module - <<'EOF'
import { markdownToHwpx } from "kordoc";
import { writeFileSync } from "node:fs";
const hwpx = await markdownToHwpx("# 제목\n\n본문\n\n| 항목 | 값 |\n| --- | --- |\n| 성명 | 홍길동 |");
writeFileSync("출력.hwpx", Buffer.from(hwpx));
EOF
7. Compare two document versions when diff matters
node --input-type=module - <<'EOF'
import { compare } from "kordoc";
import { readFileSync } from "node:fs";
const before = readFileSync("이전버전.hwp");
const after = readFileSync("최신버전.hwpx");
const diff = await compare(before, after);
console.log(diff.stats);
EOF
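Before trusting the printed statistics, a quick sanity check can help. The sketch below assumes that `diff.stats` exposes `added`, `removed`, and `modified` as counts, which is how the verification checklist in this document describes it; the numbers in the example are made up, and `statsLookSane` is a hypothetical helper.

```javascript
// Hypothetical check: every expected key is a non-negative integer.
// The key names follow this document's checklist; they are not verified against kordoc.
function statsLookSane(stats) {
  return ["added", "removed", "modified"].every(
    (key) => Number.isInteger(stats?.[key]) && stats[key] >= 0
  );
}

// Made-up numbers for illustration only.
console.log(statsLookSane({ added: 3, removed: 1, modified: 2 }));
console.log(statsLookSane({ added: -1, removed: 0, modified: 0 }));
```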
Verify outputs after every run
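- Markdown: confirm the file was created and the heading, body, and table structure is intact
- JSON: confirm `success: true` and that `blocks` and `metadata` exist
- Batch processing: confirm the number of inputs and outputs do not diverge significantly
- Form-field extraction: confirm the result of `extractFormFields(result.blocks)` is not empty
- Reverse conversion: confirm the generated `.hwpx` file opens and the basic formatting and table structure is preserved
- Comparison: confirm the added / removed / modified values in `diff.stats` are reasonable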
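The batch check in the list above can be sketched as a comparison of file names. This is an illustration under one loud assumption: each output keeps the input's base name with a `.md` extension, which this document does not confirm for kordoc's `-d` mode. `findMissingOutputs` is a hypothetical helper.

```javascript
// Hypothetical helper: lists inputs that have no matching Markdown output.
// Assumes outputs are named <input base name>.md, which is not confirmed here.
function findMissingOutputs(inputNames, outputNames) {
  const produced = new Set(outputNames);
  return inputNames.filter((name) => {
    const expected = name.replace(/\.(hwp|hwpx|hwpml)$/i, ".md");
    return !produced.has(expected);
  });
}

// Logs the inputs with no matching output; in this made-up example that is b.hwpx.
const missing = findMissingOutputs(["a.hwp", "b.hwpx", "c.hwpml"], ["a.md", "c.md"]);
console.log(missing);
```

The returned list doubles as the failure count that a batch summary needs.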
Done when
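- The requested Markdown / JSON / HWPX outputs have been generated
- Tables, images, and metadata of the official document have been checked to the level required
- If form-field extraction or reverse conversion was requested, the result and output structure have been verified as well
- For batch requests, the processing scope and the number of failures have been summarized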
Failure modes
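- Corrupted HWP/HWPX/HWPML files
- Partial parsing limits on encrypted or distribution-restricted documents
- Image-based PDF with no OCR provider available
- Insufficient permissions on the output directory
- Form labels placed unexpectedly inside the template, so some fields are not recognized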
Notes
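- kordoc handles HWPML, PDF, XLSX, and DOCX in addition to HWP/HWPX.
- The primary purpose is conversion to Markdown/JSON that AI can read.
- As of the currently published release, the documented CLI commands are basic conversion and `watch`; form handling goes through Node API calls such as `extractFormFields()`.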