mirror of
https://github.com/NomaDamas/k-skill.git
synced 2026-06-24 02:04:11 +00:00
commit
9287ce1418
12 changed files with 1145 additions and 1 deletions
5
.changeset/quick-lamps-search.md
Normal file
|
|
@ -0,0 +1,5 @@
|
|||
---
|
||||
"daishin-report-search": minor
|
||||
---
|
||||
|
||||
Add a Daishin Securities report search skill backed by the public GitHub Pages report mirror.
|
||||
|
|
@ -55,6 +55,7 @@ Claude Code, Codex, OpenCode, OpenClaw/ClawHub, and other coding agents
|
|||
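| Food safety check | `mfds-food-safety` | Proxy lookup of MFDS non-compliant food and Food Safety Korea recall information through an interview-first flow | Not required | [Food safety check guide](docs/features/mfds-food-safety.md) |
| Korean stock information lookup | `korean-stock-search` | KRX-listed stock search, basic information, and daily price lookup | Not required | [Korean stock information lookup guide](docs/features/korean-stock-search.md) |
| FSS DART electronic disclosure lookup | `k-dart` | 14 endpoints including disclosure search, company overview, financial statements, dividends, capital increases/reductions, audit opinions, and major event reports | Required | [FSS DART electronic disclosure lookup guide](docs/features/k-dart.md) |
| Daishin Securities report lookup | `daishin-report-search` | Looks up the latest report list, full text, explanation pages, and Rating/Target tables from the Daishin Securities report HTML mirror published on GitHub Pages | Not required | [Daishin Securities report lookup guide](docs/features/daishin-report-search.md) |
| 국가데이터처 KOSIS statistics lookup | `kosis-stats` | Statistical table search plus metadata, data, and bulk dataset lookup through the KOSIS (Korean Statistical Information Service) Open API operated by 국가데이터처 (read-only) | Not required for general lookups (required for `bigdata`/`--direct`) | [국가데이터처 KOSIS statistics lookup guide](docs/features/kosis-stats.md) |
| Joseon Wangjo Sillok search | `joseon-sillok-search` | Keyword search of the Joseon Wangjo Sillok with per-king/per-year filters and article excerpt lookup | Not required | [Joseon Wangjo Sillok search guide](docs/features/joseon-sillok-search.md) |
| Korean patent information search | `korean-patent-search` | Keyword search of Korean patents/utility models and detail lookup by application number | Required | [Korean patent information search guide](docs/features/korean-patent-search.md) |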
|
||||
|
|
@ -174,6 +175,7 @@ Claude Code, Codex, OpenCode, OpenClaw/ClawHub, and other coding agents
|
|||
- [K League match results lookup](docs/features/kleague-results.md)
- [LCK match analysis guide](docs/features/lck-analytics.md)
- [Toss Securities lookup guide](docs/features/toss-securities.md)
- [Daishin Securities report lookup guide](docs/features/daishin-report-search.md)
- [Hi-Pass receipt issuance guide](docs/features/hipass-receipt.md)
- [CatchTable reservation sniping guide](docs/features/catchtable-sniper.md)
- [Performance schedule and remaining seats lookup guide](docs/features/ticket-availability.md)
|
||||
|
|
|
|||
148
daishin-report-search/SKILL.md
Normal file
|
|
@ -0,0 +1,148 @@
|
|||
---
|
||||
name: daishin-report-search
|
||||
description: Looks up the latest HTML report list and the full-text/explanation pages from the Daishin Securities report GitHub Pages mirror.
|
||||
license: MIT
|
||||
metadata:
|
||||
category: finance
|
||||
locale: ko-KR
|
||||
phase: v1
|
||||
---
|
||||
|
||||
# Daishin Report Search
|
||||
|
||||
## What this skill does
|
||||
|
||||
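Finds the latest reports in the Daishin Securities report HTML mirror (`jay-jo-0/github_pages_repo`) and returns a specific report's full text, title, headings, Rating/Target table, and source link as JSON that agents can easily reuse.

This skill does not give investment advice, automate trading, or make recommendations. It is a read-only skill that reads public HTML reports and organizes them into material that can be summarized.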
|
||||
|
||||
## When to use
|
||||
|
||||
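- "Show me the latest Daishin Securities reports"
- "Find Daishin Securities semiconductor reports"
- "Read the full text and explanation page of report 20260511082352"
- "Give me the Daishin Securities report list as agent-friendly JSON"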
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Internet connection
- Node.js 18+
- The `daishin-report-search` npm package from this repository, or equivalent logic
|
||||
|
||||
## Public access path discovered
|
||||
|
||||
### Primary source: GitHub recursive tree API
|
||||
|
||||
- list endpoint: `https://api.github.com/repos/jay-jo-0/github_pages_repo/git/trees/main?recursive=1`
|
||||
- selected paths: repository-root files matching `YYYYMMDDHHMMSS.html`
|
||||
- optional companion paths: `YYYYMMDDHHMMSS_explain.html`
|
||||
- detail raw HTML: `https://raw.githubusercontent.com/Jay-jo-0/github_pages_repo/main/<path>`
|
||||
- browser detail URL: `https://jay-jo-0.github.io/github_pages_repo/<path>`
|
||||
- reason selected: the sample GitHub Pages URL maps directly to a public GitHub repository. The recursive tree API exposes all timestamped HTML filenames without relying on a brittle directory listing screen scrape. Raw GitHub URLs provide stable unauthenticated detail fetches.
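
The selection step above can be sketched roughly as follows. This is a minimal illustration, assuming the tree response carries a `tree` array of `{ path, type }` entries; `selectReportIds` and the sample data are hypothetical and not part of the package's API. The main point is that fixed-width 14-digit ids sort chronologically as plain strings.

```js
// Hypothetical helper (not the package API): pick repository-root report
// files out of a GitHub recursive tree response and order them newest first.
const ROOT_REPORT = /^(\d{14})\.html$/ // no "/" can match, so nested paths are skipped

function selectReportIds(treeResponse) {
  const entries = Array.isArray(treeResponse.tree) ? treeResponse.tree : []
  return entries
    .filter((entry) => entry && entry.type === "blob")
    .map((entry) => ROOT_REPORT.exec(entry.path))
    .filter(Boolean)
    .map((match) => match[1])
    // Fixed-width digit strings compare the same way as the timestamps they encode.
    .sort((a, b) => (a < b ? 1 : a > b ? -1 : 0))
}

// Sample input invented for illustration only.
const sample = {
  tree: [
    { path: "20260510090000.html", type: "blob" },
    { path: "20260511082352.html", type: "blob" },
    { path: "20260511082352_explain.html", type: "blob" },
    { path: "assets/20260101000000.html", type: "blob" },
    { path: "assets", type: "tree" }
  ]
}
console.log(selectReportIds(sample))
```

Companion `_explain.html` files are deliberately excluded here; a fuller version would record them per id so that `hasExplain` can be reported.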
|
||||
|
||||
### Fallback source: GitHub contents API for an exact file
|
||||
|
||||
- exact-file endpoint: `https://api.github.com/repos/jay-jo-0/github_pages_repo/contents/<path>?ref=main`
|
||||
- used automatically for a known timestamp when the raw detail URL is unavailable; it also provides GitHub content metadata for manual diagnostics.
|
||||
|
||||
No `k-skill-proxy` route is used because the upstream is public and does not require an API key.
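
A rough sketch of the raw-then-contents fallback, assuming a fetch-compatible `fetcher` supplied by the caller. The function names are hypothetical; the only external fact relied on is that the GitHub contents API returns file bytes base64-encoded in a `content` field with `encoding: "base64"`.

```js
// Sketch only: names are illustrative, not the package's own helpers.
function decodeContentsBody(body) {
  // The contents API wraps file bytes as base64 text; anything else is
  // treated as an unusable payload rather than guessed at.
  if (!body || body.encoding !== "base64" || typeof body.content !== "string") {
    throw new Error("unexpected contents API payload")
  }
  return Buffer.from(body.content, "base64").toString("utf8")
}

async function fetchHtmlWithFallback(fetcher, rawUrl, apiUrl) {
  try {
    const raw = await fetcher(rawUrl)
    if (raw.ok) return await raw.text()
  } catch (error) {
    // Network failure on the raw host: fall through to the contents API.
  }
  const api = await fetcher(apiUrl)
  if (!api.ok) throw new Error(`contents API failed with status ${api.status}`)
  return decodeContentsBody(await api.json())
}
```

Keeping the decoder separate from the network call makes the fallback easy to exercise with a stubbed `fetcher`.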
|
||||
|
||||
## Workflow
|
||||
|
||||
### 1. List latest reports
|
||||
|
||||
```js
|
||||
const { listReports } = require("daishin-report-search")
|
||||
|
||||
const result = await listReports({
|
||||
limit: 10,
|
||||
query: "반도체", // optional; matches title/headings/detail text
|
||||
maxInspect: 100, // optional query crawl budget among newest pages
|
||||
githubToken: process.env.GITHUB_TOKEN // optional; raises GitHub API limits when caller has one
|
||||
})
|
||||
|
||||
console.log(result.items)
|
||||
```
|
||||
|
||||
CLI:
|
||||
|
||||
```bash
|
||||
node packages/daishin-report-search/src/cli.js --limit 10
|
||||
node packages/daishin-report-search/src/cli.js 반도체 --limit 5 --max-inspect 100
|
||||
```
|
||||
|
||||
Return each item with:
|
||||
|
||||
- `id` (`YYYYMMDDHHMMSS`)
|
||||
- `date`, `time`, `timestamp` (filename-derived KST timestamp)
|
||||
- `title`
|
||||
- `headings`
|
||||
- `excerpt`
|
||||
- `ratingTargets` when a Rating/Target table is present
|
||||
- `pageUrl`, `rawUrl`, `apiUrl`
|
||||
- `hasExplain`, `explainUrl` when a companion explanation page exists
|
||||
|
||||
### 2. Fetch one report
|
||||
|
||||
```js
|
||||
const { fetchReport } = require("daishin-report-search")
|
||||
|
||||
const report = await fetchReport("20260511082352", {
|
||||
includeExplain: true
|
||||
})
|
||||
|
||||
console.log(report.title)
|
||||
console.log(report.text)
|
||||
console.log(report.explain?.text)
|
||||
```
|
||||
|
||||
CLI:
|
||||
|
||||
```bash
|
||||
node packages/daishin-report-search/src/cli.js --id 20260511082352 --include-explain
|
||||
```
|
||||
|
||||
### 3. Summarize conservatively
|
||||
|
||||
When answering a user, show:
|
||||
|
||||
```text
|
||||
- Title: ...
- Estimated publication time: 2026-05-11 08:23:52 KST (filename-derived)
- Main headings: ...
- Rating/Target: ... (if present)
- Source: https://jay-jo-0.github.io/github_pages_repo/...
- Explanation page: ... (if present)
|
||||
```
|
||||
|
||||
Always state that the timestamp is filename-derived and that report contents can change in the public mirror.
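
One way to produce such a summary is sketched below. `formatSummary` is a hypothetical helper, and it assumes `report` has the shape described for `fetchReport` results (`title`, `date`, `time`, `headings`, `ratingTargets`, `pageUrl`, optional `explain`). The English labels are illustrative renderings of the template fields.

```js
// Hypothetical formatter: builds the conservative summary from a report object.
function formatSummary(report) {
  const lines = [
    `Title: ${report.title}`,
    // The time comes from the filename, so it is labeled as an estimate.
    `Estimated publication time: ${report.date} ${report.time} KST (filename-derived)`
  ]
  if (report.headings && report.headings.length) {
    lines.push(`Main headings: ${report.headings.join(" / ")}`)
  }
  if (report.ratingTargets && report.ratingTargets.length) {
    lines.push(`Rating/Target: ${JSON.stringify(report.ratingTargets)}`)
  }
  lines.push(`Source: ${report.pageUrl}`)
  if (report.explain && report.explain.pageUrl) {
    lines.push(`Explanation page: ${report.explain.pageUrl}`)
  }
  lines.push("Note: report contents in the public mirror can change.")
  return lines.join("\n")
}
```

Optional rows are omitted rather than printed empty, which keeps the answer from implying data that the page did not contain.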
|
||||
|
||||
## Fallback order
|
||||
|
||||
1. GitHub recursive tree API → filter timestamped root HTML files → sort newest filename first → fetch raw detail HTML for selected/latest candidates.
|
||||
2. If a query is present, inspect newer candidates up to `maxInspect` until enough matches are found or the budget is exhausted; return a warning if the budget is exhausted.
|
||||
3. For a known id, fetch raw detail directly. If explanation is requested, fetch `<id>_explain.html`; if absent, return the original report plus a warning.
|
||||
4. If the tree endpoint is truncated, blocked, rate-limited, or changed, report that as a source warning/failure instead of guessing hidden pages.
|
||||
5. For a known id, if the raw detail URL fails, fall back to the GitHub contents API for that exact file path. Explanation pages use the same exact-file fallback but remain optional and return a warning if unavailable.
|
||||
6. If the caller has authenticated GitHub access, pass `githubToken` / `githubHeaders` in library calls or set `DAISHIN_GITHUB_TOKEN` / `GITHUB_TOKEN` for the CLI; these credentials are scoped to `api.github.com` requests and are not sent to raw detail URLs. Do not require or proxy a token by default.
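
The host scoping in step 6 could look something like the sketch below. `headersFor` is a hypothetical name, and the `Accept` value is an assumption about what a caller might send; the essential behavior is that credentials are attached only when the request host is `api.github.com`.

```js
// Sketch: attach the caller-owned token to GitHub API requests only.
function headersFor(url, githubToken) {
  const { hostname } = new URL(url)
  // Raw and Pages hosts never receive credentials.
  if (hostname !== "api.github.com") return {}
  const headers = { Accept: "application/vnd.github+json" }
  if (githubToken) headers.Authorization = `Bearer ${githubToken}`
  return headers
}
```

Deciding per request from the parsed hostname, rather than per call site, means a newly added fetch path cannot leak the token by accident.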
|
||||
|
||||
## Done when
|
||||
|
||||
- Latest report rows or a specific report are returned with direct source URLs.
|
||||
- Query and limit were applied or explicitly left broad.
|
||||
- Explanation pages were included only when requested or when listing metadata shows they exist.
|
||||
- Empty results and upstream warnings are disclosed.
|
||||
|
||||
## Failure modes
|
||||
|
||||
- GitHub unauthenticated API rate limits can return 403/429; latest/search returns empty `items` plus `source.error.kind = "rate_limit"` and rate-limit reset metadata when GitHub exposes it. Retry later or use caller-supplied authenticated GitHub access if appropriate.
|
||||
- The repository path or branch can change; then tree/raw URLs will fail.
|
||||
- The tree response could become truncated; in that case the latest-list completeness is not guaranteed.
|
||||
- HTML structure can change; title/headings/table extraction may be partial, but URLs and raw text fallback should still be returned when available.
|
||||
- Some pages may not be authored by Daishin even though they are in the issue-scoped public mirror. Do not infer provenance beyond page title/content.
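
As an illustration of the rate-limit case, the sketch below classifies a failed GitHub API response. `classifyGithubFailure` and the `"http_error"` kind are hypothetical; only `"rate_limit"` appears in the text above. It assumes a fetch-style `response` whose `headers.get()` returns `null` for missing headers, and reads GitHub's `x-ratelimit-remaining`, `x-ratelimit-reset` (UTC epoch seconds), and `retry-after` headers.

```js
// Hypothetical classifier for a non-OK GitHub API response.
function classifyGithubFailure(response) {
  const remaining = response.headers.get("x-ratelimit-remaining")
  const reset = response.headers.get("x-ratelimit-reset")
  const retryAfter = response.headers.get("retry-after")
  // A 403 counts as rate limiting only when the headers say so.
  const limited =
    response.status === 429 ||
    (response.status === 403 && (remaining === "0" || retryAfter !== null))
  return {
    kind: limited ? "rate_limit" : "http_error",
    status: response.status,
    // Convert epoch seconds to an ISO string when GitHub exposes the reset time.
    resetAt: reset ? new Date(Number(reset) * 1000).toISOString() : null
  }
}
```

A plain 403 without rate-limit headers stays a generic error, so a permissions problem is not reported to the user as "retry later".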
|
||||
|
||||
## Notes
|
||||
|
||||
- Read-only lookup only; no login, trading, order placement, recommendation, or investment advice.
|
||||
- Do not scrape private Daishin services or bypass CAPTCHA/login walls.
|
||||
- No secrets or API keys are required. Optional GitHub tokens are caller-owned, used only when explicitly supplied via options or environment, and scoped to GitHub API hosts.
|
||||
45
docs/features/daishin-report-search.md
Normal file
|
|
@ -0,0 +1,45 @@
|
|||
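# Daishin Securities Report Lookup Guide

`daishin-report-search` is a read-only skill that finds Daishin Securities report HTML pages published to the `jay-jo-0/github_pages_repo` GitHub Pages mirror, newest first, and organizes the full text and explanation pages as JSON.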
|
||||
|
||||
## Public access path

- List: `https://api.github.com/repos/jay-jo-0/github_pages_repo/git/trees/main?recursive=1`
- Full-text HTML: `https://raw.githubusercontent.com/Jay-jo-0/github_pages_repo/main/<YYYYMMDDHHMMSS.html>`
- Exact-file fallback: `https://api.github.com/repos/jay-jo-0/github_pages_repo/contents/<YYYYMMDDHHMMSS.html>?ref=main`
- Browser URL: `https://jay-jo-0.github.io/github_pages_repo/<YYYYMMDDHHMMSS.html>`
- Explanation page: provided only when `<YYYYMMDDHHMMSS_explain.html>` exists

The filename timestamp is shown as the estimated KST publication time. The GitHub API and raw files are public unauthenticated endpoints, so no proxy is used.
|
||||
|
||||
## Usage examples
|
||||
|
||||
```bash
|
||||
node packages/daishin-report-search/src/cli.js --limit 10
|
||||
GITHUB_TOKEN=... node packages/daishin-report-search/src/cli.js --limit 10
|
||||
node packages/daishin-report-search/src/cli.js 반도체 --limit 5 --max-inspect 100
|
||||
node packages/daishin-report-search/src/cli.js --id 20260511082352 --include-explain
|
||||
```
|
||||
|
||||
```js
|
||||
const { listReports, fetchReport } = require("daishin-report-search")
|
||||
|
||||
const latest = await listReports({ limit: 10 })
|
||||
const semis = await listReports({ query: "반도체", limit: 5, maxInspect: 100 })
|
||||
const withToken = await listReports({ githubToken: process.env.GITHUB_TOKEN })
|
||||
const detail = await fetchReport("20260511082352", { includeExplain: true })
|
||||
```
|
||||
|
||||
## Output fields

List items include `id`, `date`, `time`, `timestamp`, `title`, `headings`, `excerpt`, `ratingTargets`, `pageUrl`, `rawUrl`, `apiUrl`, `hasExplain`, and `explainUrl`.

Detail lookups add the full `text`; when `includeExplain` is enabled, the `explain` object contains the explanation page's `title`, `headings`, `text`, `excerpt`, and `pageUrl`.

## Cautions

- This is a helper for looking up public reports, not investment judgment or a trading recommendation.
- Warnings or errors may be returned on GitHub unauthenticated API rate limits, upstream repository changes, or HTML structure changes. When the GitHub tree API used for listing is blocked with 403/429, the call returns empty `items` plus `source.error`/rate-limit metadata instead of throwing.
- When a higher API limit is needed, use the caller-owned `githubToken`/`githubHeaders` options or the CLI environment variables `DAISHIN_GITHUB_TOKEN`/`GITHUB_TOKEN`. These values are sent only to the GitHub API host (tree discovery and exact-file fallback), never to raw full-text URLs. The default behavior needs no token or proxy.
- Detail lookups read the raw full-text URL first and, on failure, fall back to the GitHub contents API for the known timestamp path.
- When a query is given, the full text of up to `maxInspect` of the newest files is read for matching, so setting it too low can cause results to be missed.
|
||||
|
|
@ -66,6 +66,7 @@ npx --yes skills add <owner/repo> \
|
|||
--skill real-estate-search \
|
||||
--skill korean-scholarship-search \
|
||||
--skill korean-stock-search \
|
||||
--skill daishin-report-search \
|
||||
--skill household-waste-info \
|
||||
--skill mfds-drug-safety \
|
||||
--skill mfds-food-safety \
|
||||
|
|
|
|||
14
package-lock.json
generated
|
|
@ -612,6 +612,10 @@
|
|||
"node": ">= 8"
|
||||
}
|
||||
},
|
||||
"node_modules/daishin-report-search": {
|
||||
"resolved": "packages/daishin-report-search",
|
||||
"link": true
|
||||
},
|
||||
"node_modules/daiso-product-search": {
|
||||
"resolved": "packages/daiso-product-search",
|
||||
"link": true
|
||||
|
|
@ -1759,6 +1763,16 @@
|
|||
"rebrowser-playwright": ">=1.40.0"
|
||||
}
|
||||
},
|
||||
"packages/daishin-report-search": {
|
||||
"version": "0.1.0",
|
||||
"license": "MIT",
|
||||
"bin": {
|
||||
"daishin-report-search": "src/cli.js"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=18"
|
||||
}
|
||||
},
|
||||
"packages/daiso-product-search": {
|
||||
"version": "0.2.0",
|
||||
"license": "MIT",
|
||||
|
|
|
|||
|
|
@ -13,7 +13,7 @@
|
|||
"lint": "node --check scripts/skill-docs.test.js scripts/korean_character_count.js scripts/test_korean_character_count.js scripts/build-manus-bundle.js scripts/test_build_manus_bundle.js && python3 -m py_compile scripts/k_skill_cleaner.py scripts/test_k_skill_cleaner.py corporate-registration-consulting/scripts/fill_official_hwp.py k-skill-cleaner/scripts/k_skill_cleaner.py scripts/fine_dust.py scripts/test_fine_dust.py scripts/ktx_booking.py scripts/test_ktx_booking.py scripts/sillok_search.py scripts/test_sillok_search.py scripts/korean_spell_check.py scripts/test_korean_spell_check.py scripts/patent_search.py scripts/test_patent_search.py scripts/mfds_drug_safety.py scripts/test_mfds_drug_safety.py scripts/nts_business_registration.py scripts/test_nts_business_registration.py scripts/mfds_food_safety.py scripts/test_mfds_food_safety.py scripts/zipcode_search.py scripts/test_zipcode_search.py scripts/subway_lost_property.py scripts/test_subway_lost_property.py scripts/geeknews_search.py scripts/test_geeknews_search.py nts-business-registration/scripts/nts_business_registration.py scripts/test_naver_blog_search.py scripts/test_korean_slang_writing.py scripts/kakaotalk_mac.py scripts/test_kakaotalk_mac.py scripts/test_coupang_partners_mcp_wrapper.py scripts/ticket_availability.py scripts/test_ticket_availability.py ticket-availability/scripts/ticket_availability.py coupang-product-search/scripts/coupang_partners_mcp.py kakaotalk-mac/scripts/kakaotalk_mac.py naver-blog-research/scripts/_naver_http.py naver-blog-research/scripts/naver_search.py naver-blog-research/scripts/naver_read.py naver-blog-research/scripts/naver_download_images.py korean-slang-writing/scripts/_slang_http.py korean-slang-writing/scripts/slang_search.py korean-slang-writing/scripts/slang_lookup.py korean-scholarship-search/scripts/scholarship_filter.py korean-scholarship-search/scripts/test_scholarship_filter.py korean-scholarship-search/scripts/university_search_plan.py 
danawa-price-search/scripts/danawa_search.py kosis-stats/scripts/run_kosis_stats.py kosis-stats/tests/test_run_kosis_stats.py intercity-bus-booking/scripts/intercity_bus_search.py daangn-used-goods-search/scripts/daangn_used_goods.py daangn-realty-search/scripts/daangn_realty.py daangn-jobs-search/scripts/daangn_jobs.py daangn-cars-search/scripts/daangn_cars.py && npm run lint --workspaces --if-present && ./scripts/validate-skills.sh",
|
||||
"typecheck": "tsc --noEmit",
|
||||
"test": "node --test scripts/skill-docs.test.js scripts/test_korean_character_count.js scripts/test_build_manus_bundle.js && PYTHONPATH=.:scripts python3 -m unittest scripts.test_k_skill_cleaner scripts.test_fine_dust scripts.test_ktx_booking scripts.test_sillok_search scripts.test_korean_spell_check scripts.test_patent_search scripts.test_mfds_drug_safety scripts.test_nts_business_registration scripts.test_mfds_food_safety scripts.test_zipcode_search scripts.test_subway_lost_property scripts.test_geeknews_search scripts.test_naver_blog_search scripts.test_korean_slang_writing scripts.test_kakaotalk_mac scripts.test_coupang_partners_mcp_wrapper scripts.test_ticket_availability && PYTHONPATH=.:scripts:korean-scholarship-search/scripts python3 -m unittest discover -s korean-scholarship-search/scripts -p 'test_scholarship_filter.py' && PYTHONPATH=.:scripts:kosis-stats/scripts python3 -m unittest discover -s kosis-stats/tests -p 'test_run_kosis_stats.py' && npm run test --workspaces --if-present && ./scripts/validate-skills.sh",
|
||||
"pack:dry-run": "npm pack --workspace k-lotto --dry-run && npm pack --workspace daiso-product-search --dry-run && npm pack --workspace market-kurly-search --dry-run && npm pack --workspace blue-ribbon-nearby --dry-run && npm pack --workspace kakao-bar-nearby --dry-run && npm pack --workspace cheap-gas-nearby --dry-run && npm pack --workspace public-restroom-nearby --dry-run && npm pack --workspace parking-lot-search --dry-run && npm pack --workspace court-auction-notice-search --dry-run && npm pack --workspace donation-place-search --dry-run && npm pack --workspace gongsijiga-search --dry-run && npm pack --workspace kbl-results --dry-run && npm pack --workspace kleague-results --dry-run && npm pack --workspace lck-analytics --dry-run && npm pack --workspace toss-securities --dry-run && npm pack --workspace hipass-receipt --dry-run && npm pack --workspace used-car-price-search --dry-run && npm pack --workspace k-skill-rhwp --dry-run && npm pack --workspace korean-marathon-schedule --dry-run && npm pack --workspace gangnamunni-clinic-search --dry-run",
|
||||
"pack:dry-run": "npm pack --workspace k-lotto --dry-run && npm pack --workspace daiso-product-search --dry-run && npm pack --workspace market-kurly-search --dry-run && npm pack --workspace blue-ribbon-nearby --dry-run && npm pack --workspace kakao-bar-nearby --dry-run && npm pack --workspace cheap-gas-nearby --dry-run && npm pack --workspace public-restroom-nearby --dry-run && npm pack --workspace parking-lot-search --dry-run && npm pack --workspace court-auction-notice-search --dry-run && npm pack --workspace donation-place-search --dry-run && npm pack --workspace gongsijiga-search --dry-run && npm pack --workspace kbl-results --dry-run && npm pack --workspace kleague-results --dry-run && npm pack --workspace lck-analytics --dry-run && npm pack --workspace toss-securities --dry-run && npm pack --workspace hipass-receipt --dry-run && npm pack --workspace used-car-price-search --dry-run && npm pack --workspace k-skill-rhwp --dry-run && npm pack --workspace korean-marathon-schedule --dry-run && npm pack --workspace gangnamunni-clinic-search --dry-run && npm pack --workspace daishin-report-search --dry-run",
|
||||
"ci": "npm run lint && npm run typecheck && npm run test && npm run pack:dry-run",
|
||||
"version-packages": "changeset version",
|
||||
"release:npm": "changeset publish"
|
||||
|
|
|
|||
40
packages/daishin-report-search/README.md
Normal file
|
|
@ -0,0 +1,40 @@
|
|||
# daishin-report-search
|
||||
|
||||
Public lookup client for timestamped Daishin Securities report HTML pages mirrored at `jay-jo-0/github_pages_repo`.
|
||||
|
||||
## Usage
|
||||
|
||||
```js
|
||||
const { listReports, fetchReport } = require("daishin-report-search")
|
||||
|
||||
const latest = await listReports({ limit: 10 })
|
||||
const filtered = await listReports({ query: "반도체", limit: 5, maxInspect: 100 })
|
||||
const authenticated = await listReports({ githubToken: process.env.GITHUB_TOKEN })
|
||||
const detail = await fetchReport("20260511082352", { includeExplain: true })
|
||||
```
|
||||
|
||||
```bash
|
||||
GITHUB_TOKEN=... daishin-report-search --limit 10
|
||||
daishin-report-search --limit 10
|
||||
daishin-report-search 반도체 --limit 5 --max-inspect 100
|
||||
daishin-report-search --id 20260511082352 --include-explain
|
||||
```
|
||||
|
||||
## Source path
|
||||
|
||||
- Tree: `https://api.github.com/repos/jay-jo-0/github_pages_repo/git/trees/main?recursive=1`
|
||||
- Raw detail: `https://raw.githubusercontent.com/Jay-jo-0/github_pages_repo/main/<path>`
|
||||
- Exact-file fallback: `https://api.github.com/repos/jay-jo-0/github_pages_repo/contents/<path>?ref=main`
|
||||
- Browser detail: `https://jay-jo-0.github.io/github_pages_repo/<path>`
|
||||
|
||||
No API key or proxy is required.
|
||||
|
||||
## Boundaries
|
||||
|
||||
- `limit` is normalized to a positive integer with a maximum of 50 results.
|
||||
- `maxInspect` is normalized to a positive integer with a maximum of 500 latest pages to avoid excessive raw GitHub fetches.
|
||||
- Invalid, zero, negative, or non-finite numeric options fall back to documented defaults.
|
||||
- Latest/search discovery returns an empty result with `source.error` metadata instead of throwing when the GitHub tree API is blocked or rate-limited.
|
||||
- Optional `githubToken` and `githubHeaders` options are forwarded only to `api.github.com` requests (tree discovery and exact-file contents fallback), not to raw detail requests. The CLI also honors `DAISHIN_GITHUB_TOKEN` or `GITHUB_TOKEN` from the environment.
|
||||
- Exact report fetches try raw GitHub HTML first, then the GitHub contents API for the known timestamp path if raw fetch fails.
|
||||
- The mirror can contain timestamped pages from sources other than Daishin Securities; inspect the returned title/headings/page URL before treating a result as Daishin-authored.
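
The numeric rules above can be read as something like the following sketch. `normalizePositiveInteger` is a hypothetical stand-in written from the documented behavior; the package's own helper may treat edge cases (for example fractional input) differently.

```js
// Hypothetical normalizer: positive integer, capped at `max`, else `fallback`.
function normalizePositiveInteger(value, fallback, max) {
  const parsed = Math.floor(Number(value))
  // NaN, Infinity, zero, and negatives all fall back to the default.
  if (!Number.isFinite(parsed) || parsed <= 0) return fallback
  return Math.min(parsed, max)
}

console.log(normalizePositiveInteger("5", 10, 50))
console.log(normalizePositiveInteger(999, 10, 50))
console.log(normalizePositiveInteger("abc", 10, 50))
```

Accepting strings matters because the CLI passes `--limit` and `--max-inspect` values through unparsed.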
|
||||
36
packages/daishin-report-search/package.json
Normal file
|
|
@ -0,0 +1,36 @@
|
|||
{
|
||||
"name": "daishin-report-search",
|
||||
"version": "0.1.0",
|
||||
"description": "Public Daishin Securities report lookup client for GitHub Pages mirrored HTML reports",
|
||||
"license": "MIT",
|
||||
"main": "src/index.js",
|
||||
"bin": {
|
||||
"daishin-report-search": "src/cli.js"
|
||||
},
|
||||
"files": [
|
||||
"src",
|
||||
"README.md"
|
||||
],
|
||||
"engines": {
|
||||
"node": ">=18"
|
||||
},
|
||||
"publishConfig": {
|
||||
"access": "public"
|
||||
},
|
||||
"repository": {
|
||||
"type": "git",
|
||||
"url": "git+https://github.com/NomaDamas/k-skill.git"
|
||||
},
|
||||
"keywords": [
|
||||
"k-skill",
|
||||
"daishin",
|
||||
"securities",
|
||||
"research",
|
||||
"reports",
|
||||
"korea"
|
||||
],
|
||||
"scripts": {
|
||||
"lint": "node --check src/index.js && node --check src/cli.js && node --check test/index.test.js",
|
||||
"test": "node --test"
|
||||
}
|
||||
}
|
||||
48
packages/daishin-report-search/src/cli.js
Executable file
|
|
@ -0,0 +1,48 @@
|
|||
#!/usr/bin/env node
|
||||
const { fetchReport, listReports } = require("./index")
|
||||
|
||||
async function main() {
|
||||
const args = parseArgs(process.argv.slice(2))
|
||||
const result = args.id
|
||||
? await fetchReport(args.id, args)
|
||||
: await listReports(args)
|
||||
console.log(JSON.stringify(result, null, 2))
|
||||
}
|
||||
|
||||
function parseArgs(argv) {
|
||||
const options = {}
|
||||
for (let i = 0; i < argv.length; i += 1) {
|
||||
const arg = argv[i]
|
||||
if (arg === "--query" || arg === "-q") options.query = argv[++i] || ""
|
||||
else if (arg === "--limit") options.limit = argv[++i]
|
||||
else if (arg === "--max-inspect") options.maxInspect = argv[++i]
|
||||
else if (arg === "--id") options.id = argv[++i]
|
||||
else if (arg === "--include-explain") options.includeExplain = true
|
||||
else if (arg === "--include-html") options.includeHtml = true
|
||||
else if (arg === "--help" || arg === "-h") {
|
||||
printHelp()
|
||||
process.exit(0)
|
||||
} else if (/^\d{14}(?:\.html)?$/.test(arg) && !options.id) {
|
||||
options.id = arg
|
||||
} else if (!options.query) {
|
||||
options.query = arg
|
||||
}
|
||||
}
|
||||
return options
|
||||
}
|
||||
|
||||
function printHelp() {
|
||||
console.log(`Usage: daishin-report-search [query] [options]\n\nList latest reports:\n daishin-report-search --limit 10\n daishin-report-search 반도체 --limit 5 --max-inspect 100\n\nFetch one report:\n daishin-report-search --id 20260511082352 --include-explain\n\nOptions:\n -q, --query <text> Filter by title/headings/detail text\n --limit <number> Maximum list results (default: 10)\n --max-inspect <n> Maximum latest pages to inspect for query matching\n --id <timestamp> Fetch one YYYYMMDDHHMMSS report\n --include-explain Fetch companion *_explain.html page for --id\n --include-html Include raw HTML in JSON output\n`)
|
||||
console.log("Environment:\n DAISHIN_GITHUB_TOKEN or GITHUB_TOKEN Optional caller-owned token for api.github.com requests\n")
|
||||
}
|
||||
|
||||
function run() {
|
||||
return main().catch((error) => {
|
||||
console.error(error && error.stack ? error.stack : String(error))
|
||||
process.exitCode = 1
|
||||
})
|
||||
}
|
||||
|
||||
if (require.main === module) run()
|
||||
|
||||
module.exports = { parseArgs, printHelp, main }
|
||||
441
packages/daishin-report-search/src/index.js
Normal file
|
|
@ -0,0 +1,441 @@
|
|||
const OWNER = "Jay-jo-0"
|
||||
const API_OWNER = "jay-jo-0"
|
||||
const REPO = "github_pages_repo"
|
||||
const BRANCH = "main"
|
||||
const PAGES_BASE_URL = "https://jay-jo-0.github.io/github_pages_repo"
|
||||
const RAW_BASE_URL = `https://raw.githubusercontent.com/${OWNER}/${REPO}/${BRANCH}`
|
||||
const API_BASE_URL = `https://api.github.com/repos/${API_OWNER}/${REPO}`
|
||||
const TREE_URL = `${API_BASE_URL}/git/trees/${BRANCH}?recursive=1`
|
||||
const REPORT_PATH_PATTERN = /^(\d{14})(?:_explain)?\.html$/
|
||||
const DEFAULT_LIMIT = 10
|
||||
const MAX_LIMIT = 50
|
||||
const DEFAULT_MAX_INSPECT = 50
|
||||
const MAX_INSPECT = 500
|
||||
|
||||
async function listReports(options = {}) {
|
||||
const {
|
||||
query = "",
|
||||
limit = DEFAULT_LIMIT,
|
||||
maxInspect,
|
||||
includeHtml = false,
|
||||
fetcher = global.fetch
|
||||
} = options
|
||||
|
||||
if (!fetcher) throw new Error("fetch is required")
|
||||
|
||||
const normalizedLimit = parsePositiveInteger(limit, DEFAULT_LIMIT, MAX_LIMIT)
|
||||
const normalizedQuery = String(query || "").trim()
|
||||
const defaultInspectBudget = Math.max(DEFAULT_MAX_INSPECT, normalizedLimit * 5)
|
||||
const normalizedMaxInspect = parsePositiveInteger(maxInspect, defaultInspectBudget, MAX_INSPECT)
|
||||
const inspectBudget = Math.max(normalizedLimit, normalizedMaxInspect)
|
||||
const warnings = []
|
||||
|
||||
let tree
|
||||
try {
|
||||
tree = await fetchJson(fetcher, TREE_URL, options)
|
||||
} catch (error) {
|
||||
warnings.push(`GitHub tree discovery failed: ${error.message}`)
|
||||
return {
|
||||
query: normalizedQuery,
|
||||
count: 0,
|
||||
items: [],
|
||||
warnings,
|
||||
source: buildSource(0, 0, error)
|
||||
}
|
||||
}
|
||||
if (tree.truncated) warnings.push("github tree response was truncated; latest report list may be incomplete")
|
||||
|
||||
const paths = Array.isArray(tree.tree)
|
||||
? tree.tree.filter((entry) => entry && entry.type === "blob").map((entry) => entry.path)
|
||||
: []
|
||||
const candidates = parseTreePaths(paths)
|
||||
const items = []
|
||||
let inspectedReports = 0
|
||||
|
||||
for (const candidate of candidates.slice(0, inspectBudget)) {
|
||||
let item = { ...candidate, ...buildReportUrls(candidate.path) }
|
||||
if (candidate.hasExplain) {
|
||||
item.explainUrl = buildReportUrls(candidate.explainPath).pageUrl
|
||||
item.explainRawUrl = buildReportUrls(candidate.explainPath).rawUrl
|
||||
}
|
||||
|
||||
try {
|
||||
inspectedReports += 1
|
||||
const html = await fetchText(fetcher, item.rawUrl, options)
|
||||
const parsed = parseReportHtml(html)
|
||||
item = {
|
||||
...item,
|
||||
title: parsed.title || item.id,
|
||||
headings: parsed.headings,
|
||||
excerpt: parsed.excerpt,
|
||||
ratingTargets: parsed.ratingTargets
|
||||
}
|
||||
if (includeHtml) item.html = html
|
||||
if (matchesQuery({ ...item, text: parsed.text }, normalizedQuery)) items.push(item)
|
||||
} catch (error) {
|
||||
warnings.push(`report detail failed for ${item.path}: ${error.message}`)
|
||||
if (!normalizedQuery) items.push({ ...item, title: item.id })
|
||||
}
|
||||
|
||||
if (items.length >= normalizedLimit) break
|
||||
}
|
||||
|
||||
if (items.length < normalizedLimit && candidates.length > inspectBudget) {
|
||||
warnings.push(`inspection budget exhausted after ${inspectBudget} of ${candidates.length} report pages`)
|
||||
}
|
||||
|
||||
return {
|
||||
query: normalizedQuery,
|
||||
count: items.length,
|
||||
items,
|
||||
warnings,
|
||||
source: buildSource(candidates.length, inspectedReports)
|
||||
}
|
||||
}
|
||||
|
||||
async function fetchReport(idOrPath, options = {}) {
|
||||
const { includeExplain = false, includeHtml = false, fetcher = global.fetch } = options
|
||||
if (!fetcher) throw new Error("fetch is required")
|
||||
|
||||
const path = normalizeReportPath(idOrPath)
|
||||
const meta = parseTimestamp(path)
|
||||
if (!meta || meta.isExplain) throw new Error(`invalid report id or path: ${idOrPath}`)
|
||||
|
||||
const urls = buildReportUrls(path)
|
||||
const html = await fetchReportHtml(fetcher, urls, options)
|
||||
const parsed = parseReportHtml(html)
|
||||
const report = {
|
||||
...meta,
|
||||
...urls,
|
||||
title: parsed.title || meta.id,
|
||||
headings: parsed.headings,
|
||||
text: parsed.text,
|
||||
excerpt: parsed.excerpt,
|
||||
ratingTargets: parsed.ratingTargets
|
||||
}
|
||||
if (includeHtml) report.html = html
|
||||
|
||||
if (includeExplain) {
|
||||
const explainPath = `${meta.id}_explain.html`
|
||||
const explainUrls = buildReportUrls(explainPath)
|
||||
try {
|
||||
const explainHtml = await fetchReportHtml(fetcher, explainUrls, options)
|
||||
const explainParsed = parseReportHtml(explainHtml)
|
||||
report.explain = {
|
||||
...parseTimestamp(explainPath),
|
||||
...explainUrls,
|
||||
title: explainParsed.title || `${meta.id} explanation`,
|
||||
headings: explainParsed.headings,
|
||||
text: explainParsed.text,
|
||||
excerpt: explainParsed.excerpt,
|
||||
ratingTargets: explainParsed.ratingTargets
|
||||
}
|
||||
if (includeHtml) report.explain.html = explainHtml
|
||||
} catch (error) {
|
||||
report.explain = null
|
||||
report.warnings = [`explanation page failed for ${explainPath}: ${error.message}`]
|
||||
}
|
||||
}
|
||||
|
||||
return report
|
||||
}
|
||||
|
||||
function parseTreePaths(paths) {
|
||||
const byId = new Map()
|
||||
for (const path of paths) {
|
||||
const meta = parseTimestamp(path)
|
||||
if (!meta) continue
|
||||
const record = byId.get(meta.id) || { id: meta.id }
|
||||
if (meta.isExplain) {
|
||||
record.explainPath = meta.path
|
||||
record.hasExplain = true
|
||||
} else {
|
||||
Object.assign(record, meta)
|
||||
record.hasExplain = Boolean(record.hasExplain)
|
||||
}
|
||||
byId.set(meta.id, record)
|
||||
}
|
||||
|
||||
return [...byId.values()]
|
||||
.filter((record) => record.path)
|
||||
.map((record) => ({ ...record, hasExplain: Boolean(record.explainPath) }))
|
||||
.sort((a, b) => b.id.localeCompare(a.id))
|
||||
}
|
||||
|
||||
function parseTimestamp(path) {
|
||||
const match = String(path || "").match(REPORT_PATH_PATTERN)
|
||||
if (!match) return null
|
||||
const id = match[1]
|
||||
const isExplain = String(path).includes("_explain.html")
|
||||
const year = id.slice(0, 4)
|
||||
const month = id.slice(4, 6)
|
||||
const day = id.slice(6, 8)
|
||||
const hour = id.slice(8, 10)
|
||||
const minute = id.slice(10, 12)
|
||||
const second = id.slice(12, 14)
|
||||
const timestamp = `${year}-${month}-${day}T${hour}:${minute}:${second}+09:00`
|
||||
|
||||
return {
|
||||
id,
|
||||
path: String(path),
|
||||
date: `${year}-${month}-${day}`,
|
||||
time: `${hour}:${minute}:${second}`,
|
||||
timestamp,
|
||||
epochMs: Date.parse(timestamp),
|
||||
isExplain
|
||||
}
|
||||
}
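
The timestamp handling in `parseTimestamp` can be illustrated in isolation. This is a minimal sketch, not the package's code: `ASSUMED_REPORT_PATH_PATTERN` and `sketchParseTimestamp` are hypothetical names, and the regular expression is an assumption, because the real `REPORT_PATH_PATTERN` is defined outside this excerpt.

```javascript
// Assumption: the real REPORT_PATH_PATTERN is not visible in this excerpt.
// This stand-in accepts top-level "<14 digits>.html" and "<14 digits>_explain.html".
const ASSUMED_REPORT_PATH_PATTERN = /^(\d{14})(?:_explain)?\.html$/

function sketchParseTimestamp(path) {
  const match = String(path || "").match(ASSUMED_REPORT_PATH_PATTERN)
  if (!match) return null
  const id = match[1]
  // The 14-digit id is read as a KST (UTC+09:00) wall-clock time.
  const date = `${id.slice(0, 4)}-${id.slice(4, 6)}-${id.slice(6, 8)}`
  const time = `${id.slice(8, 10)}:${id.slice(10, 12)}:${id.slice(12, 14)}`
  const timestamp = `${date}T${time}+09:00`
  return { id, timestamp, epochMs: Date.parse(timestamp) }
}

const sketchParsed = sketchParseTimestamp("20260511082352.html")
console.log(sketchParsed.timestamp) // 2026-05-11T08:23:52+09:00
console.log(new Date(sketchParsed.epochMs).toISOString()) // 2026-05-10T23:23:52.000Z
```

Because the offset is written into the string, `Date.parse` yields the same epoch value regardless of the host machine's time zone.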
|
||||
|
||||
function buildReportUrls(path, options = {}) {
|
||||
const branch = options.branch || BRANCH
|
||||
const encodedPath = encodeReportPath(path)
|
||||
return {
|
||||
pageUrl: `${PAGES_BASE_URL}/${encodedPath}`,
|
||||
rawUrl: `https://raw.githubusercontent.com/${OWNER}/${REPO}/${branch}/${encodedPath}`,
|
||||
apiUrl: `${API_BASE_URL}/contents/${encodedPath}?ref=${encodeURIComponent(branch)}`
|
||||
}
|
||||
}
|
||||
|
||||
function parseReportHtml(html) {
|
||||
const withoutScripts = String(html || "")
|
||||
.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, " ")
|
||||
.replace(/<style\b[^>]*>[\s\S]*?<\/style>/gi, " ")
|
||||
const title = firstText(withoutScripts, /<h1\b[^>]*>([\s\S]*?)<\/h1>/i)
|
||||
|| firstText(withoutScripts, /<title\b[^>]*>([\s\S]*?)<\/title>/i)
|
||||
const headings = [...withoutScripts.matchAll(/<h[1-3]\b[^>]*>([\s\S]*?)<\/h[1-3]>/gi)]
|
||||
.map((match) => normalizeText(stripTags(match[1])))
|
||||
.filter(Boolean)
|
||||
const ratingTargets = parseTables(withoutScripts).filter((row) => {
|
||||
const keys = Object.keys(row).join(" ")
|
||||
return /종목명|투자의견|목표주가|Rating|Target/i.test(keys)
|
||||
})
|
||||
const text = normalizeText(
|
||||
decodeEntities(
|
||||
withoutScripts
|
||||
.replace(/<\/?(p|div|br|li|tr|h[1-6]|table|thead|tbody|ul|ol)\b[^>]*>/gi, "\n")
|
||||
.replace(/<[^>]+>/g, " ")
|
||||
)
|
||||
)
|
||||
const excerpt = text.length > 300 ? `${text.slice(0, 297)}...` : text
|
||||
|
||||
return { title, headings, text, excerpt, ratingTargets }
|
||||
}
|
||||
|
||||
function parseTables(html) {
|
||||
const rows = []
|
||||
for (const tableMatch of String(html || "").matchAll(/<table\b[^>]*>([\s\S]*?)<\/table>/gi)) {
|
||||
const tableRows = [...tableMatch[1].matchAll(/<tr\b[^>]*>([\s\S]*?)<\/tr>/gi)].map((rowMatch) =>
|
||||
[...rowMatch[1].matchAll(/<t[hd]\b[^>]*>([\s\S]*?)<\/t[hd]>/gi)].map((cellMatch) => normalizeText(stripTags(cellMatch[1])))
|
||||
).filter((cells) => cells.length > 0)
|
||||
if (tableRows.length < 2) continue
|
||||
const headers = tableRows[0]
|
||||
for (const cells of tableRows.slice(1)) {
|
||||
const row = {}
|
||||
headers.forEach((header, index) => {
|
||||
if (header && cells[index]) row[header] = cells[index]
|
||||
})
|
||||
if (Object.keys(row).length > 0) rows.push(row)
|
||||
}
|
||||
}
|
||||
return rows
|
||||
}
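
The header-to-row mapping inside `parseTables` can be shown without any HTML parsing. A minimal sketch, assuming the cells have already been extracted into arrays of strings; `rowsToObjects` is a hypothetical helper name, not part of the package.

```javascript
// Treat the first row as headers and turn each later row into a keyed object.
// Empty headers and empty cells are skipped, as in the loop above.
function rowsToObjects(tableRows) {
  if (tableRows.length < 2) return []
  const [headers, ...bodyRows] = tableRows
  return bodyRows
    .map((cells) => {
      const row = {}
      headers.forEach((header, index) => {
        if (header && cells[index]) row[header] = cells[index]
      })
      return row
    })
    .filter((row) => Object.keys(row).length > 0)
}

const sketchRows = rowsToObjects([
  ["종목명", "투자의견", "목표주가"],
  ["삼성전자", "Buy", "450,000원"]
])
console.log(sketchRows) // [ { '종목명': '삼성전자', '투자의견': 'Buy', '목표주가': '450,000원' } ]
```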
|
||||
|
||||
function matchesQuery(item, query) {
|
||||
if (!query) return true
|
||||
const haystack = [item.id, item.title, item.excerpt, item.text, ...(item.headings || [])]
|
||||
.join("\n")
|
||||
.toLocaleLowerCase("ko-KR")
|
||||
return query.toLocaleLowerCase("ko-KR").split(/\s+/).filter(Boolean).every((term) => haystack.includes(term))
|
||||
}
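
`matchesQuery` applies AND semantics: every whitespace-separated term must appear somewhere in the combined text. The sketch below restates that rule on a plain array of strings; `matchesAllTerms` is a hypothetical name used only for illustration.

```javascript
// Returns true when every whitespace-separated term occurs in the joined fields.
function matchesAllTerms(fields, query) {
  if (!query) return true
  const haystack = fields.join("\n").toLocaleLowerCase("ko-KR")
  return query
    .toLocaleLowerCase("ko-KR")
    .split(/\s+/)
    .filter(Boolean)
    .every((term) => haystack.includes(term))
}

const sketchFields = ["[대신증권 류형근] 반도체업", "삼성전자 목표주가 상향 Buy"]
console.log(matchesAllTerms(sketchFields, "삼성전자 목표주가")) // true
console.log(matchesAllTerms(sketchFields, "삼성전자 하향")) // false
console.log(matchesAllTerms(sketchFields, "BUY")) // true
```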
|
||||
|
||||
async function fetchJson(fetcher, url, options = {}) {
|
||||
const response = await fetcher(url, { headers: requestHeaders(url, options) })
|
||||
await assertOk(response, url)
|
||||
if (typeof response.json === "function") return response.json()
|
||||
return JSON.parse(await response.text())
|
||||
}
|
||||
|
||||
async function fetchText(fetcher, url, options = {}) {
|
||||
const response = await fetcher(url, { headers: requestHeaders(url, options) })
|
||||
await assertOk(response, url)
|
||||
return response.text()
|
||||
}
|
||||
|
||||
// Fetch report HTML from the raw URL first; if that fails, fall back to the GitHub
// contents API. When both fail, the thrown error keeps the raw request's metadata
// (url, status, kind, rateLimit) and records the fallback failure in fallbackCause.
async function fetchReportHtml(fetcher, urls, options = {}) {
  try {
    return await fetchText(fetcher, urls.rawUrl, options)
  } catch (rawError) {
    try {
      const contents = await fetchJson(fetcher, urls.apiUrl, options)
      return decodeContentsApiHtml(contents, urls.apiUrl)
    } catch (contentsError) {
      const error = new Error(`${rawError.message}; contents fallback failed: ${contentsError.message}`)
      error.cause = rawError
      error.fallbackCause = contentsError
      error.url = rawError.url
      error.status = rawError.status
      error.statusText = rawError.statusText
      error.kind = rawError.kind
      error.rateLimit = rawError.rateLimit
      throw error
    }
  }
}
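
The primary-then-fallback shape of `fetchReportHtml` can be sketched independently of GitHub. In this illustration `fetchWithFallback` and the two stub loaders are hypothetical; only the error-combining pattern mirrors the function above.

```javascript
// Try the primary loader; on failure try the fallback; if both fail,
// throw one error that carries both causes.
async function fetchWithFallback(primary, fallback) {
  try {
    return await primary()
  } catch (primaryError) {
    try {
      return await fallback()
    } catch (fallbackError) {
      const error = new Error(`${primaryError.message}; fallback failed: ${fallbackError.message}`)
      error.cause = primaryError
      error.fallbackCause = fallbackError
      throw error
    }
  }
}

// Stub loaders standing in for the raw URL and the contents API.
const failingRaw = async () => { throw new Error("HTTP 404 Not Found for raw URL") }
const workingContents = async () => "<h1>fallback body</h1>"

fetchWithFallback(failingRaw, workingContents).then((html) => console.log(html))
```

Keeping the first error as `cause` means callers still see why the preferred source failed, even when the fallback is the step that finally throws.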
|
||||
|
||||
async function assertOk(response, url) {
|
||||
if (response && response.ok) return
|
||||
const statusCode = response && response.status
|
||||
const statusText = response && response.statusText
|
||||
const status = response ? `${statusCode || ""} ${statusText || ""}`.trim() : "no response"
|
||||
const error = new Error(`HTTP ${status} for ${url}`)
|
||||
error.url = url
|
||||
error.status = statusCode || null
|
||||
error.statusText = statusText || ""
|
||||
error.kind = statusCode === 403 || statusCode === 429 ? "rate_limit" : "http"
|
||||
error.rateLimit = readRateLimit(response && response.headers)
|
||||
throw error
|
||||
}
|
||||
|
||||
function requestHeaders(url, options = {}) {
|
||||
const headers = {
|
||||
"user-agent": "k-skill daishin-report-search (+https://github.com/NomaDamas/k-skill)",
|
||||
accept: "application/vnd.github+json, text/html;q=0.9, */*;q=0.8"
|
||||
}
|
||||
if (isGitHubApiUrl(url)) {
|
||||
Object.assign(headers, options.githubHeaders || {})
|
||||
const token = options.githubToken || readEnvToken()
|
||||
if (token && !hasHeader(headers, "authorization")) headers.authorization = `Bearer ${token}`
|
||||
}
|
||||
return headers
|
||||
}
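
`requestHeaders` attaches credentials only when the target host is `api.github.com`, so a token is never sent to the raw or Pages hosts. A reduced sketch of that scoping rule follows; `isApiHost` and `headersFor` are hypothetical names, and the header set is trimmed for brevity.

```javascript
// True only for URLs whose hostname is exactly api.github.com.
function isApiHost(url) {
  try {
    return new URL(url).hostname.toLowerCase() === "api.github.com"
  } catch {
    return false
  }
}

// Build request headers, adding the bearer token only for API requests.
function headersFor(url, token) {
  const headers = { accept: "application/vnd.github+json, text/html;q=0.9, */*;q=0.8" }
  if (token && isApiHost(url)) headers.authorization = `Bearer ${token}`
  return headers
}

const sketchApiHeaders = headersFor("https://api.github.com/repos/o/r/contents/a.html", "test-token")
const sketchRawHeaders = headersFor("https://raw.githubusercontent.com/o/r/main/a.html", "test-token")
console.log(sketchApiHeaders.authorization) // Bearer test-token
console.log(sketchRawHeaders.authorization) // undefined
```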
|
||||
|
||||
function decodeContentsApiHtml(contents, url) {
|
||||
if (!contents || typeof contents.content !== "string") {
|
||||
throw new Error(`GitHub contents response missing content for ${url}`)
|
||||
}
|
||||
if (contents.encoding && contents.encoding !== "base64") {
|
||||
throw new Error(`unsupported GitHub contents encoding ${contents.encoding} for ${url}`)
|
||||
}
|
||||
return Buffer.from(contents.content.replace(/\s+/g, ""), "base64").toString("utf8")
|
||||
}
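
The whitespace stripping in `decodeContentsApiHtml` matters because base64 payloads are often line-wrapped. The round trip below is a sketch under that assumption; `decodeBase64Html` is a hypothetical name and the 16-character wrap width is chosen only for the demonstration.

```javascript
// Remove all whitespace, then decode base64 as UTF-8 (Node.js Buffer).
function decodeBase64Html(content) {
  return Buffer.from(String(content).replace(/\s+/g, ""), "base64").toString("utf8")
}

const sketchHtml = "<h1>콘텐츠 API 원문</h1><p>fallback body</p>"
const sketchEncoded = Buffer.from(sketchHtml, "utf8").toString("base64")
// Simulate line wrapping by inserting a newline after every 16 characters.
const sketchWrapped = sketchEncoded.replace(/(.{16})/g, "$1\n")

console.log(decodeBase64Html(sketchWrapped) === sketchHtml) // true
```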
|
||||
|
||||
function isGitHubApiUrl(url) {
|
||||
try {
|
||||
return new URL(url).hostname.toLowerCase() === "api.github.com"
|
||||
} catch {
|
||||
return false
|
||||
}
|
||||
}
|
||||
|
||||
function buildSource(totalReportsDiscovered, inspectedReports, error) {
|
||||
const source = {
|
||||
treeUrl: TREE_URL,
|
||||
pagesBaseUrl: PAGES_BASE_URL,
|
||||
rawBaseUrl: RAW_BASE_URL,
|
||||
branch: BRANCH,
|
||||
totalReportsDiscovered,
|
||||
inspectedReports
|
||||
}
|
||||
if (error) source.error = serializeSourceError(error)
|
||||
return source
|
||||
}
|
||||
|
||||
function serializeSourceError(error) {
|
||||
return {
|
||||
message: error.message,
|
||||
url: error.url || TREE_URL,
|
||||
status: error.status || null,
|
||||
statusText: error.statusText || "",
|
||||
kind: error.kind || "unknown",
|
||||
rateLimit: error.rateLimit || {}
|
||||
}
|
||||
}
|
||||
|
||||
function readRateLimit(headers) {
|
||||
if (!headers || typeof headers.get !== "function") return {}
|
||||
const reset = headers.get("x-ratelimit-reset")
|
||||
const retryAfter = headers.get("retry-after")
|
||||
const rateLimit = {
|
||||
limit: headers.get("x-ratelimit-limit") || "",
|
||||
remaining: headers.get("x-ratelimit-remaining") || "",
|
||||
reset: reset || "",
|
||||
retryAfter: retryAfter || ""
|
||||
}
|
||||
if (reset && /^\d+$/.test(reset)) rateLimit.resetAt = new Date(Number(reset) * 1000).toISOString()
|
||||
return rateLimit
|
||||
}
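
`readRateLimit` converts the `x-ratelimit-reset` header, a Unix time in seconds, into an ISO timestamp. The sketch below isolates that conversion; `resetAtFrom` is a hypothetical name, and a `Map` stands in for a real headers object because both expose `get`.

```javascript
// Convert an epoch-seconds reset header to an ISO string, or null when it is absent or malformed.
function resetAtFrom(headers) {
  const reset = headers.get("x-ratelimit-reset")
  if (!reset || !/^\d+$/.test(reset)) return null
  return new Date(Number(reset) * 1000).toISOString()
}

const sketchReset = String(Date.parse("2026-05-14T01:00:00Z") / 1000)
const sketchHeaders = new Map([["x-ratelimit-reset", sketchReset]])

console.log(resetAtFrom(sketchHeaders)) // 2026-05-14T01:00:00.000Z
console.log(resetAtFrom(new Map())) // null
```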
|
||||
|
||||
function readEnvToken() {
|
||||
if (typeof process === "undefined" || !process.env) return ""
|
||||
return process.env.DAISHIN_GITHUB_TOKEN || process.env.GITHUB_TOKEN || ""
|
||||
}
|
||||
|
||||
function hasHeader(headers, name) {
|
||||
const normalized = name.toLowerCase()
|
||||
return Object.keys(headers).some((key) => key.toLowerCase() === normalized)
|
||||
}
|
||||
|
||||
function normalizeReportPath(idOrPath) {
|
||||
const value = String(idOrPath || "").trim()
|
||||
if (/^\d{14}$/.test(value)) return `${value}.html`
|
||||
return value.replace(/^\/+/, "")
|
||||
}
|
||||
|
||||
function firstText(html, pattern) {
|
||||
const match = String(html || "").match(pattern)
|
||||
return match ? normalizeText(stripTags(match[1])) : ""
|
||||
}
|
||||
|
||||
function stripTags(value) {
|
||||
return decodeEntities(String(value || "").replace(/<[^>]+>/g, " "))
|
||||
}
|
||||
|
||||
function normalizeText(value) {
|
||||
return String(value || "").replace(/\s+/g, " ").trim()
|
||||
}
|
||||
|
||||
// Coerce a caller-supplied option to a positive integer. Non-finite, zero, and negative
// values fall back to defaultValue; values above maxValue are clamped to maxValue.
function parsePositiveInteger(value, defaultValue, maxValue) {
  const parsed = Number(value)
  if (!Number.isFinite(parsed)) return defaultValue
  const integer = Math.floor(parsed)
  if (integer <= 0) return defaultValue
  return Math.min(integer, maxValue)
}
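
A few sample inputs make the clamping rules of `parsePositiveInteger` concrete. The sketch restates the same logic under the hypothetical name `clampPositiveInteger`; the default and maximum pairs (10/50 and 50/500) are inferred from the tests in this commit, not read from the library's constants.

```javascript
function clampPositiveInteger(value, defaultValue, maxValue) {
  const parsed = Number(value)
  if (!Number.isFinite(parsed)) return defaultValue
  const integer = Math.floor(parsed)
  return integer <= 0 ? defaultValue : Math.min(integer, maxValue)
}

console.log(clampPositiveInteger("Infinity", 10, 50)) // 10  (non-finite falls back to the default)
console.log(clampPositiveInteger("1e9", 50, 500)) // 500 (clamped to the maximum)
console.log(clampPositiveInteger(-25, 50, 500)) // 50  (negative falls back to the default)
console.log(clampPositiveInteger(2.9, 10, 50)) // 2   (floored)
```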
|
||||
|
||||
function decodeEntities(value) {
|
||||
const named = {
|
||||
amp: "&",
|
||||
lt: "<",
|
||||
gt: ">",
|
||||
quot: '"',
|
||||
apos: "'",
|
||||
nbsp: " "
|
||||
}
|
||||
return String(value || "")
|
||||
.replace(/&#(\d+);/g, (entity, code) => decodeCodePoint(Number(code), entity))
|
||||
.replace(/&#x([0-9a-f]+);/gi, (entity, code) => decodeCodePoint(Number.parseInt(code, 16), entity))
|
||||
.replace(/&([a-z]+);/gi, (_, name) => named[name.toLowerCase()] || `&${name};`)
|
||||
}
|
||||
|
||||
function decodeCodePoint(codePoint, originalEntity) {
|
||||
if (!Number.isInteger(codePoint) || codePoint < 0 || codePoint > 0x10ffff) return originalEntity
|
||||
return String.fromCodePoint(codePoint)
|
||||
}
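
The guard in `decodeCodePoint` exists because `String.fromCodePoint` throws a `RangeError` for values above `0x10FFFF`. The sketch below covers only numeric entities and uses the hypothetical name `decodeNumericEntities`; named entities are left out for brevity.

```javascript
// Decode decimal and hexadecimal numeric entities, leaving out-of-range ones untouched.
function decodeNumericEntities(value) {
  const decode = (codePoint, original) =>
    Number.isInteger(codePoint) && codePoint >= 0 && codePoint <= 0x10ffff
      ? String.fromCodePoint(codePoint)
      : original
  return String(value)
    .replace(/&#(\d+);/g, (entity, code) => decode(Number(code), entity))
    .replace(/&#x([0-9a-f]+);/gi, (entity, code) => decode(Number.parseInt(code, 16), entity))
}

console.log(decodeNumericEntities("&#65; &#x41; &#x110000;")) // A A &#x110000;
```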
|
||||
|
||||
function encodeReportPath(path) {
|
||||
return String(path || "").split("/").map(encodeURIComponent).join("/")
|
||||
}
|
||||
|
||||
module.exports = {
|
||||
API_BASE_URL,
|
||||
BRANCH,
|
||||
PAGES_BASE_URL,
|
||||
RAW_BASE_URL,
|
||||
TREE_URL,
|
||||
buildReportUrls,
|
||||
fetchReport,
|
||||
listReports,
|
||||
parseReportHtml,
|
||||
parseTimestamp,
|
||||
parseTreePaths
|
||||
}
|
||||
364
packages/daishin-report-search/test/index.test.js
Normal file
|
|
@ -0,0 +1,364 @@
|
|||
const test = require("node:test")
|
||||
const assert = require("node:assert/strict")
|
||||
|
||||
const {
|
||||
buildReportUrls,
|
||||
fetchReport,
|
||||
listReports,
|
||||
parseReportHtml,
|
||||
parseTimestamp,
|
||||
parseTreePaths
|
||||
} = require("../src/index")
|
||||
const { parseArgs } = require("../src/cli")
|
||||
|
||||
const TREE_URL = "https://api.github.com/repos/jay-jo-0/github_pages_repo/git/trees/main?recursive=1"
|
||||
|
||||
function jsonResponse(value, ok = true, responseOptions = {}) {
|
||||
const headers = responseOptions.headers || {}
|
||||
const getHeader = (name) => {
|
||||
const normalized = String(name).toLowerCase()
|
||||
if (headers[normalized]) return headers[normalized]
|
||||
return normalized === "content-type" && ok ? "application/json" : null
|
||||
}
|
||||
return {
|
||||
ok,
|
||||
status: responseOptions.status || (ok ? 200 : 500),
|
||||
statusText: responseOptions.statusText || (ok ? "OK" : "Server Error"),
|
||||
headers: { get: getHeader },
|
||||
text: async () => JSON.stringify(value),
|
||||
json: async () => value
|
||||
}
|
||||
}
|
||||
|
||||
function textResponse(value, ok = true) {
|
||||
return {
|
||||
ok,
|
||||
status: ok ? 200 : 404,
|
||||
statusText: ok ? "OK" : "Not Found",
|
||||
headers: { get: () => "text/html; charset=utf-8" },
|
||||
text: async () => value
|
||||
}
|
||||
}
|
||||
|
||||
function timestampPath(prefix, index) {
|
||||
const day = String((index % 28) + 1).padStart(2, "0")
|
||||
const hour = String(Math.floor(index / 28) % 24).padStart(2, "0")
|
||||
const minute = String(Math.floor(index / (28 * 24)) % 60).padStart(2, "0")
|
||||
const second = String(index % 60).padStart(2, "0")
|
||||
return `${prefix}${day}${hour}${minute}${second}.html`
|
||||
}
|
||||
|
||||
test("parseTimestamp parses timestamp filenames into ISO-like metadata", () => {
|
||||
assert.deepEqual(parseTimestamp("20260511082352.html"), {
|
||||
id: "20260511082352",
|
||||
path: "20260511082352.html",
|
||||
date: "2026-05-11",
|
||||
time: "08:23:52",
|
||||
timestamp: "2026-05-11T08:23:52+09:00",
|
||||
epochMs: Date.parse("2026-05-10T23:23:52.000Z"),
|
||||
isExplain: false
|
||||
})
|
||||
assert.equal(parseTimestamp("20260511082352_explain.html").isExplain, true)
|
||||
assert.equal(parseTimestamp("README.md"), null)
|
||||
})
|
||||
|
||||
test("parseTreePaths filters timestamp reports and pairs explanation pages", () => {
|
||||
const reports = parseTreePaths([
|
||||
"nested/ignored.html",
|
||||
"20260511082352.html",
|
||||
"20260511082352_explain.html",
|
||||
"20260512010102_explain.html",
|
||||
"20260512010102.html",
|
||||
"README.md"
|
||||
])
|
||||
|
||||
assert.deepEqual(reports.map((report) => report.id), ["20260512010102", "20260511082352"])
|
||||
assert.equal(reports[0].explainPath, "20260512010102_explain.html")
|
||||
assert.equal(reports[1].hasExplain, true)
|
||||
})
|
||||
|
||||
test("buildReportUrls returns GitHub Pages, raw, and API URLs", () => {
|
||||
assert.deepEqual(buildReportUrls("20260511082352.html"), {
|
||||
pageUrl: "https://jay-jo-0.github.io/github_pages_repo/20260511082352.html",
|
||||
rawUrl: "https://raw.githubusercontent.com/Jay-jo-0/github_pages_repo/main/20260511082352.html",
|
||||
apiUrl: "https://api.github.com/repos/jay-jo-0/github_pages_repo/contents/20260511082352.html?ref=main"
|
||||
})
|
||||
})
|
||||
|
||||
test("parseReportHtml extracts title, headings, text, rating table, and excerpt", () => {
|
||||
const parsed = parseReportHtml(`<!doctype html><html><head><title>[대신증권 류형근] 반도체업</title></head>
|
||||
<body><h1>[대신증권 류형근] [Issue & News] 반도체업: 새로운 역사</h1>
|
||||
<h2>반도체, 더 올라갑니다</h2><p>삼성전자와 SK하이닉스의 목표주가를 상향합니다.</p>
|
||||
<table><tr><th>종목명</th><th>투자의견</th><th>목표주가</th></tr><tr><td>삼성전자</td><td>Buy</td><td>450,000원</td></tr></table></body></html>`)
|
||||
|
||||
assert.equal(parsed.title, "[대신증권 류형근] [Issue & News] 반도체업: 새로운 역사")
|
||||
assert.deepEqual(parsed.headings, ["[대신증권 류형근] [Issue & News] 반도체업: 새로운 역사", "반도체, 더 올라갑니다"])
|
||||
assert.match(parsed.text, /삼성전자와 SK하이닉스/)
|
||||
assert.deepEqual(parsed.ratingTargets, [{ 종목명: "삼성전자", 투자의견: "Buy", 목표주가: "450,000원" }])
|
||||
assert.ok(parsed.excerpt.length <= 300)
|
||||
})
|
||||
|
||||
test("listReports reads the GitHub tree, sorts latest first, fetches selected titles, and preserves warnings", async () => {
|
||||
const calls = []
|
||||
const fetcher = async (url) => {
|
||||
calls.push(url)
|
||||
if (url === TREE_URL) {
|
||||
return jsonResponse({
|
||||
truncated: false,
|
||||
tree: [
|
||||
{ path: "20260511082352.html", type: "blob" },
|
||||
{ path: "20260511082352_explain.html", type: "blob" },
|
||||
{ path: "20260514074108.html", type: "blob" },
|
||||
{ path: "assets/logo.png", type: "blob" }
|
||||
]
|
||||
})
|
||||
}
|
||||
if (url.endsWith("20260514074108.html")) return textResponse("<h1>[JAEMINI] 미국 장마감 시황 26.05.14</h1><p>시장 요약</p>")
|
||||
if (url.endsWith("20260511082352.html")) return textResponse("<h1>[대신증권 류형근] 반도체업</h1><p>반도체 리포트</p>")
|
||||
throw new Error(`unexpected url ${url}`)
|
||||
}
|
||||
|
||||
const result = await listReports({ limit: 2, fetcher })
|
||||
|
||||
assert.equal(result.source.treeUrl, TREE_URL)
|
||||
assert.equal(result.items.length, 2)
|
||||
assert.deepEqual(result.items.map((item) => item.id), ["20260514074108", "20260511082352"])
|
||||
assert.equal(result.items[0].title, "[JAEMINI] 미국 장마감 시황 26.05.14")
|
||||
assert.equal(result.items[1].hasExplain, true)
|
||||
assert.equal(result.items[1].explainUrl, "https://jay-jo-0.github.io/github_pages_repo/20260511082352_explain.html")
|
||||
assert.equal(result.warnings.length, 0)
|
||||
assert.ok(calls.some((url) => url.includes("git/trees/main?recursive=1")))
|
||||
})
|
||||
|
||||
test("listReports can query detail text beyond the first page until it finds matches", async () => {
|
||||
const fetcher = async (url) => {
|
||||
if (url === TREE_URL) {
|
||||
return jsonResponse({
|
||||
tree: [
|
||||
{ path: "20260514074108.html", type: "blob" },
|
||||
{ path: "20260511082352.html", type: "blob" }
|
||||
]
|
||||
})
|
||||
}
|
||||
if (url.endsWith("20260514074108.html")) return textResponse("<h1>미국 장마감 시황</h1><p>시장</p>")
|
||||
if (url.endsWith("20260511082352.html")) return textResponse("<h1>[대신증권 류형근] 반도체업</h1><p>삼성전자 목표주가 상향</p>")
|
||||
throw new Error(`unexpected url ${url}`)
|
||||
}
|
||||
|
||||
const result = await listReports({ query: "삼성전자", limit: 1, maxInspect: 2, fetcher })
|
||||
|
||||
assert.deepEqual(result.items.map((item) => item.id), ["20260511082352"])
|
||||
assert.equal(result.query, "삼성전자")
|
||||
})
|
||||
|
||||
test("listReports clamps non-finite and huge numeric options before inspecting reports", async () => {
|
||||
const detailCalls = []
|
||||
const tree = Array.from({ length: 600 }, (_, index) => ({ path: timestampPath("202605", index), type: "blob" }))
|
||||
const fetcher = async (url) => {
|
||||
if (url === TREE_URL) return jsonResponse({ tree })
|
||||
detailCalls.push(url)
|
||||
return textResponse("<h1>시장 요약</h1><p>일반 내용</p>")
|
||||
}
|
||||
|
||||
const result = await listReports({ query: "없는검색어", limit: Infinity, maxInspect: 1e9, fetcher })
|
||||
|
||||
assert.equal(result.count, 0)
|
||||
assert.equal(detailCalls.length, 500)
|
||||
assert.equal(result.source.inspectedReports, 500)
|
||||
assert.match(result.warnings.at(-1), /inspection budget exhausted after 500 of 600 report pages/)
|
||||
|
||||
const hugeLimitResult = await listReports({ limit: 1e9, fetcher })
|
||||
assert.equal(hugeLimitResult.items.length, 50)
|
||||
})
|
||||
|
||||
test("listReports falls back to defaults for invalid, zero, and negative numeric options", async () => {
|
||||
const detailCalls = []
|
||||
const tree = Array.from({ length: 60 }, (_, index) => ({ path: timestampPath("202604", index), type: "blob" }))
|
||||
const fetcher = async (url) => {
|
||||
if (url === TREE_URL) return jsonResponse({ tree })
|
||||
detailCalls.push(url)
|
||||
return textResponse("<h1>시장 요약</h1><p>일반 내용</p>")
|
||||
}
|
||||
|
||||
const result = await listReports({ query: "없는검색어", limit: Number.NaN, maxInspect: -25, fetcher })
|
||||
|
||||
assert.equal(result.count, 0)
|
||||
assert.equal(detailCalls.length, 50)
|
||||
assert.equal(result.source.inspectedReports, 50)
|
||||
|
||||
const zeroLimit = await listReports({ limit: 0, maxInspect: 0, fetcher })
|
||||
assert.equal(zeroLimit.items.length, 10)
|
||||
})
|
||||
|
||||
test("parseArgs preserves numeric option text for library validation", () => {
|
||||
assert.deepEqual(parseArgs(["--limit", "Infinity", "--max-inspect", "1e9"]), {
|
||||
limit: "Infinity",
|
||||
maxInspect: "1e9"
|
||||
})
|
||||
})
|
||||
|
||||
test("parseReportHtml preserves malformed numeric entities instead of throwing", () => {
|
||||
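// Code points above U+10FFFF are not valid, so these entities must pass through unchanged.
const parsed = parseReportHtml("<h1>&#1114112; &#x110000; &#65; &#x41;</h1><p>본문</p>")

assert.match(parsed.title, /&#1114112;/)
assert.match(parsed.title, /&#x110000;/)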
|
||||
assert.match(parsed.title, /A A/)
|
||||
assert.match(parsed.text, /본문/)
|
||||
})
|
||||
|
||||
test("listReports returns structured source errors for GitHub tree rate limits", async () => {
|
||||
const reset = String(Math.floor(Date.parse("2026-05-14T01:00:00Z") / 1000))
|
||||
const fetcher = async (url) => {
|
||||
assert.equal(url, TREE_URL)
|
||||
return jsonResponse(
|
||||
{ message: "API rate limit exceeded" },
|
||||
false,
|
||||
{
|
||||
status: 403,
|
||||
statusText: "rate limit exceeded",
|
||||
headers: {
|
||||
"x-ratelimit-limit": "60",
|
||||
"x-ratelimit-remaining": "0",
|
||||
"x-ratelimit-reset": reset
|
||||
}
|
||||
}
|
||||
)
|
||||
}
|
||||
|
||||
const result = await listReports({ limit: 3, fetcher })
|
||||
|
||||
assert.equal(result.count, 0)
|
||||
assert.deepEqual(result.items, [])
|
||||
assert.equal(result.source.totalReportsDiscovered, 0)
|
||||
assert.equal(result.source.inspectedReports, 0)
|
||||
assert.equal(result.source.error.status, 403)
|
||||
assert.equal(result.source.error.kind, "rate_limit")
|
||||
assert.equal(result.source.error.rateLimit.limit, "60")
|
||||
assert.equal(result.source.error.rateLimit.remaining, "0")
|
||||
assert.equal(result.source.error.rateLimit.reset, reset)
|
||||
assert.equal(result.source.error.rateLimit.resetAt, "2026-05-14T01:00:00.000Z")
|
||||
assert.match(result.warnings[0], /GitHub tree discovery failed: HTTP 403 rate limit exceeded/)
|
||||
})
|
||||
|
||||
test("listReports classifies GitHub 429 responses as structured rate limits", async () => {
|
||||
const fetcher = async () => jsonResponse(
|
||||
{ message: "Too Many Requests" },
|
||||
false,
|
||||
{
|
||||
status: 429,
|
||||
statusText: "Too Many Requests",
|
||||
headers: { "retry-after": "42" }
|
||||
}
|
||||
)
|
||||
|
||||
const result = await listReports({ limit: 1, fetcher })
|
||||
|
||||
assert.equal(result.count, 0)
|
||||
assert.equal(result.source.error.status, 429)
|
||||
assert.equal(result.source.error.kind, "rate_limit")
|
||||
assert.equal(result.source.error.rateLimit.retryAfter, "42")
|
||||
assert.match(result.warnings[0], /GitHub tree discovery failed: HTTP 429 Too Many Requests/)
|
||||
})
|
||||
|
||||
test("listReports sends caller GitHub headers and token to discovery requests", async () => {
|
||||
const calls = []
|
||||
const fetcher = async (url, init = {}) => {
|
||||
calls.push({ url, headers: init.headers })
|
||||
if (url === TREE_URL) return jsonResponse({ tree: [{ path: "20260511082352.html", type: "blob" }] })
|
||||
return textResponse("<h1>헤더 테스트</h1><p>본문</p>")
|
||||
}
|
||||
|
||||
const result = await listReports({
|
||||
limit: 1,
|
||||
githubToken: "test-token",
|
||||
githubHeaders: { "x-github-api-version": "2022-11-28" },
|
||||
fetcher
|
||||
})
|
||||
|
||||
assert.equal(result.items.length, 1)
|
||||
assert.equal(calls[0].headers.authorization, "Bearer test-token")
|
||||
assert.equal(calls[0].headers["x-github-api-version"], "2022-11-28")
|
||||
assert.equal(calls[1].headers.authorization, undefined)
|
||||
assert.equal(calls[1].headers["x-github-api-version"], undefined)
|
||||
})
|
||||
|
||||
test("fetchReport does not forward caller GitHub auth to raw detail requests", async () => {
|
||||
const calls = []
|
||||
const fetcher = async (url, init = {}) => {
|
||||
calls.push({ url, headers: init.headers })
|
||||
return textResponse("<h1>권한 범위 테스트</h1><p>본문</p>")
|
||||
}
|
||||
|
||||
const report = await fetchReport("20260511082352", {
|
||||
githubToken: "test-token",
|
||||
githubHeaders: { "x-github-api-version": "2022-11-28" },
|
||||
fetcher
|
||||
})
|
||||
|
||||
assert.equal(report.title, "권한 범위 테스트")
|
||||
assert.equal(calls.length, 1)
|
||||
assert.match(calls[0].url, /raw\.githubusercontent\.com/)
|
||||
assert.equal(calls[0].headers.authorization, undefined)
|
||||
assert.equal(calls[0].headers["x-github-api-version"], undefined)
|
||||
})
|
||||
|
||||
test("fetchReport falls back to GitHub contents API when raw exact report fetch fails", async () => {
|
||||
const calls = []
|
||||
const html = Buffer.from("<h1>콘텐츠 API 원문</h1><p>fallback body</p>", "utf8").toString("base64")
|
||||
const fetcher = async (url, init = {}) => {
|
||||
calls.push({ url, headers: init.headers })
|
||||
if (url.includes("raw.githubusercontent.com")) {
|
||||
return textResponse("not found", false)
|
||||
}
|
||||
if (url.includes("/contents/20260511082352.html")) {
|
||||
return jsonResponse({ content: html, encoding: "base64" })
|
||||
}
|
||||
throw new Error(`unexpected url ${url}`)
|
||||
}
|
||||
|
||||
const report = await fetchReport("20260511082352", {
|
||||
githubToken: "test-token",
|
||||
githubHeaders: { "x-github-api-version": "2022-11-28" },
|
||||
fetcher
|
||||
})
|
||||
|
||||
assert.equal(report.title, "콘텐츠 API 원문")
|
||||
assert.match(report.text, /fallback body/)
|
||||
assert.match(calls[0].url, /raw\.githubusercontent\.com/)
|
||||
assert.equal(calls[0].headers.authorization, undefined)
|
||||
assert.match(calls[1].url, /api\.github\.com/)
|
||||
assert.equal(calls[1].headers.authorization, "Bearer test-token")
|
||||
assert.equal(calls[1].headers["x-github-api-version"], "2022-11-28")
|
||||
})
|
||||
|
||||
test("listReports reports the actual number of inspected detail pages", async () => {
|
||||
const detailCalls = []
|
||||
const tree = Array.from({ length: 10 }, (_, index) => ({ path: timestampPath("202603", index), type: "blob" }))
|
||||
const fetcher = async (url) => {
|
||||
if (url === TREE_URL) return jsonResponse({ tree })
|
||||
detailCalls.push(url)
|
||||
return textResponse("<h1>시장 요약</h1><p>일반 내용</p>")
|
||||
}
|
||||
|
||||
const result = await listReports({ limit: 2, fetcher })
|
||||
|
||||
assert.equal(result.items.length, 2)
|
||||
assert.equal(detailCalls.length, 2)
|
||||
assert.equal(result.source.inspectedReports, 2)
|
||||
})
|
||||
|
||||
test("fetchReport returns detail plus optional explanation page", async () => {
|
||||
const fetcher = async (url) => {
|
||||
if (url.endsWith("20260511082352.html")) return textResponse("<h1>원문 리포트</h1><p>원문 내용</p>")
|
||||
if (url.endsWith("20260511082352_explain.html")) return textResponse("<h1>쉬운 설명</h1><p>설명 내용</p>")
|
||||
throw new Error(`unexpected url ${url}`)
|
||||
}
|
||||
|
||||
const report = await fetchReport("20260511082352", { includeExplain: true, fetcher })
|
||||
|
||||
assert.equal(report.id, "20260511082352")
|
||||
assert.equal(report.title, "원문 리포트")
|
||||
assert.equal(report.explain.title, "쉬운 설명")
|
||||
assert.match(report.text, /원문 내용/)
|
||||
assert.match(report.explain.text, /설명 내용/)
|
||||
})
|
||||