Commit graph

1 commit

Author SHA1 Message Date
Jeffrey (Dongkyu) Kim
b2eedd827b Standardize Korean text counts for form-limited writing
Issue #65 needs deterministic Korean character, line, and byte counting for self-intros and similar forms, so this adds a bundled helper plus skill/docs/tests around a fixed grapheme/UTF-8 contract with an explicit NEIS compatibility profile.

Constraint: Must avoid LLM estimation and keep results deterministic across repeated runs
Constraint: No new dependencies; rely on Node 18+ Intl.Segmenter and Buffer.byteLength
Rejected: Per-request ad hoc scripts only | contract drift would make repeated counts inconsistent
Rejected: Legacy encoding heuristics as the default | modern UTF-8 contract is safer unless the target explicitly says otherwise
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep  and  profile semantics in sync with docs/tests before changing counting rules
Tested: node --test scripts/skill-docs.test.js scripts/test_korean_character_count.js; helper smoke via text/file/stdin; npm test; npm run ci; LSP diagnostics on changed JS files
Not-tested: Proprietary site-specific counters beyond the documented default/NEIS contracts
2026-04-08 22:55:31 +09:00