kyush-llm-router/ARCHITECTURE.md
2026-03-05 23:46:54 +09:00

# Kyush LLM Router Architecture Document
## Overview
Kyush LLM Router is a proxy server that manages multiple users and routes their requests to various OpenAI-compatible backend APIs. It provides API key-based authentication, user permission management, and usage monitoring through an admin web dashboard.
## System Architecture
```
┌───────────────────────────────────────────────────────────────────────────┐
│                               Client Layer                                │
├─────────────────────┬───────────────────────────────┬─────────────────────┤
│     LLM Clients     │        Admin Dashboard        │   Vite Dev Server   │
│  (OpenAI SDK etc)   │       (Solid.js + Vite)       │     (Port 3001)     │
└──────────┬──────────┴──────────────┬────────────────┴──────────┬──────────┘
           │                         │                           │
           │ OpenAI-Compatible API   │ REST API                  │
           │ (Port 3000)             │ (Port 3001)               │
           ▼                         ▼                           │
┌───────────────────────────────────────────────────────────────────────────┐
│                      Router Server (Node.js/Express)                      │
│                                (Port 3000)                                │
├───────────────────────────────────────────────────────────────────────────┤
│   ┌───────────────────────────────────────────────────────────────────┐   │
│   │                     Authentication Middleware                     │   │
│   │  - API Key Validation                                             │   │
│   │  - Rate Limiting                                                  │   │
│   └───────────────────────────────────────────────────────────────────┘   │
│   ┌───────────────────────────────────────────────────────────────────┐   │
│   │                          Request Router                           │   │
│   │  - Backend Selection                                              │   │
│   │  - Load Balancing (optional)                                      │   │
│   │  - Request/Response Transformation                                │   │
│   └───────────────────────────────────────────────────────────────────┘   │
│   ┌───────────────────────────────────────────────────────────────────┐   │
│   │         Admin API Endpoints (proxied to Vite dev server)          │   │
│   │  - User Management                                                │   │
│   │  - Backend Management                                             │   │
│   │  - Permission Management                                          │   │
│   │  - Usage Analytics                                                │   │
│   └───────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────┬─────────────────────────────────────────────┘
            ┌─────────────────┴─────────────────┐
            ▼                                   ▼
┌───────────────────────┐          ┌─────────────────────────┐
│     Core Database     │          │   Analytics Database    │
│  (SQLite - core.db)   │          │ (SQLite - analytics.db) │
├───────────────────────┤          ├─────────────────────────┤
│  - users              │          │  - request_logs         │
│  - backends           │          │  - usage_stats          │
│  - permissions        │          │  - backend_metrics      │
└───────────────────────┘          └─────────────────────────┘
            │                                   │
            └─────────────────┬─────────────────┘
┌───────────────────────────────────────────────────────────────────────────┐
│                               Backend Layer                               │
├───────────────────────────────────────────────────────────────────────────┤
│           Multiple OpenAI-Compatible APIs (vLLM, SGLang, etc.)            │
└───────────────────────────────────────────────────────────────────────────┘
```
## Project Structure
```
kyush-llm-router/
├── server/
│   ├── src/
│   │   ├── index.ts                  # Express app entry point
│   │   ├── config/
│   │   │   ├── database.ts           # Core SQLite connection
│   │   │   └── analytics-db.ts       # Analytics SQLite connection
│   │   ├── routes/
│   │   │   ├── api.ts                # OpenAI-compatible API proxy
│   │   │   ├── admin.ts              # Admin management API
│   │   │   └── auth.ts               # Authentication middleware
│   │   ├── models/
│   │   │   ├── User.ts               # User model
│   │   │   ├── Backend.ts            # Backend model
│   │   │   └── Permission.ts         # Permission model
│   │   ├── analytics/
│   │   │   ├── RequestLog.ts         # Request log model (analytics DB)
│   │   │   ├── UsageStats.ts         # Usage stats model (analytics DB)
│   │   │   └── BackendMetrics.ts     # Backend metrics model
│   │   ├── services/
│   │   │   ├── AuthService.ts        # API key validation
│   │   │   ├── RouterService.ts      # Backend routing logic
│   │   │   └── AnalyticsService.ts   # Usage tracking
│   │   └── utils/
│   │       └── logger.ts             # Logging utility
│   ├── package.json
│   └── tsconfig.json
├── client/
│   ├── src/
│   │   ├── index.tsx                 # Solid.js entry point
│   │   ├── App.tsx                   # Main application component
│   │   ├── routes/
│   │   │   ├── Dashboard.tsx         # Main dashboard
│   │   │   ├── Users.tsx             # User management
│   │   │   ├── Backends.tsx          # Backend management
│   │   │   ├── Permissions.tsx       # Permission management
│   │   │   └── Analytics.tsx         # Usage monitoring
│   │   ├── components/
│   │   │   ├── Layout.tsx            # Admin layout
│   │   │   ├── UserTable.tsx         # User list table
│   │   │   ├── BackendTable.tsx      # Backend list table
│   │   │   └── StatsChart.tsx        # Usage chart component
│   │   ├── api/
│   │   │   └── client.ts             # API client for admin endpoints
│   │   └── types/
│   │       └── index.ts              # TypeScript type definitions
│   ├── package.json
│   ├── vite.config.ts
│   └── tsconfig.json
├── shared/
│   └── types.ts                      # Shared type definitions
├── database/
│   ├── schema.sql                    # Core database schema
│   └── analytics-schema.sql          # Analytics database schema
├── scripts/
│   └── dev.js                        # Dev server launcher (runs both)
├── docker-compose.yml                # Docker Compose setup
├── package.json                      # Root package.json (scripts)
└── README.md
```
## Core Components
### 1. Server (Node.js/Express)
#### Authentication Middleware (`src/routes/auth.ts`)
- Validates API keys from incoming requests
- Extracts user identity from `Authorization: Bearer <api_key>` header
- Returns 401 if authentication fails
- Attaches user info to request object for downstream handlers
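
The validation steps above can be sketched as a few pure functions. This is a minimal illustration, not the project's actual `auth.ts`: the `StoredUser` shape, the `apiKeyHash` field, the in-memory `Map` lookup, and the choice of SHA-256 are all assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stored-user shape; `apiKeyHash` is an assumed field name.
interface StoredUser {
  id: number;
  name: string;
  isActive: boolean;
  apiKeyHash: string;
}

type AuthResult =
  | { ok: true; user: StoredUser }
  | { ok: false; status: 401; error: string };

// Hash a raw API key so only the digest has to be stored and compared.
function hashApiKey(rawKey: string): string {
  return createHash("sha256").update(rawKey, "utf8").digest("hex");
}

// Pull the token out of an `Authorization: Bearer <api_key>` header value.
function extractBearerToken(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}

// Validate a header against a lookup table keyed by API-key hash.
function authenticate(
  header: string | undefined,
  usersByHash: Map<string, StoredUser>,
): AuthResult {
  const token = extractBearerToken(header);
  if (token === null) {
    return { ok: false, status: 401, error: "Missing or malformed Authorization header" };
  }
  const user = usersByHash.get(hashApiKey(token));
  if (!user || !user.isActive) {
    return { ok: false, status: 401, error: "Invalid API key" };
  }
  return { ok: true, user };
}
```

In an Express middleware, a failed result would map to a 401 response and a successful one would attach `result.user` to the request object before calling `next()`.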
#### API Proxy Route (`src/routes/api.ts`)
```text
# OpenAI-compatible endpoints
POST /v1/chat/completions
POST /v1/completions
GET  /v1/models
```
- Forwards requests to selected backend
- Handles request/response transformation
- Logs all requests for analytics
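
One detail of forwarding is how the incoming path is joined to a backend's `base_url`. The sketch below assumes a hypothetical `BackendTarget` shape and a helper named `buildUpstreamRequest`; it also assumes that the caller's own key is replaced by the backend's key, which the document does not state explicitly.

```typescript
// Hypothetical subset of a row from the `backends` table.
interface BackendTarget {
  baseUrl: string; // e.g. "http://localhost:8000/v1"
  apiKey: string | null;
}

interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
}

// Map an incoming router path such as "/v1/chat/completions" onto a backend.
function buildUpstreamRequest(backend: BackendTarget, incomingPath: string): UpstreamRequest {
  // Drop trailing slashes so the join below never produces "//".
  const base = backend.baseUrl.replace(/\/+$/, "");
  // Avoid "/v1/v1/..." when the stored base_url already ends in "/v1".
  const path =
    base.endsWith("/v1") && incomingPath.startsWith("/v1/")
      ? incomingPath.slice("/v1".length)
      : incomingPath;
  const headers: Record<string, string> = {};
  // Assumption: the user's key is not forwarded; the backend's key replaces it.
  if (backend.apiKey) headers["Authorization"] = `Bearer ${backend.apiKey}`;
  return { url: base + path, headers };
}
```

The returned `url` and `headers` could then be handed to `fetch` (or Axios) together with the original request body.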
#### Admin API Route (`src/routes/admin.ts`)
```text
# User Management
POST   /admin/users              # Create user
GET    /admin/users              # List users
PUT    /admin/users/:id          # Update user
DELETE /admin/users/:id          # Delete user

# Backend Management
POST   /admin/backends           # Add backend
GET    /admin/backends           # List backends
PUT    /admin/backends/:id       # Update backend
DELETE /admin/backends/:id       # Delete backend

# Permission Management
POST   /admin/permissions        # Grant permission
DELETE /admin/permissions        # Revoke permission
GET    /admin/permissions        # List permissions

# Analytics
GET    /admin/analytics/usage    # Usage statistics
GET    /admin/analytics/requests # Request logs
```
#### Router Service (`src/services/RouterService.ts`)
- Selects appropriate backend based on:
  - User's permissions
  - Backend availability
  - Load balancing strategy (round-robin, least-loaded)
- Handles backend failures with retry logic
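
The selection and retry behaviour listed above might look like the following sketch. It covers only the round-robin strategy; the class name `RoundRobinRouter`, the `RoutableBackend` shape, and the `withFailover` helper are assumptions for illustration rather than the contents of `RouterService.ts`.

```typescript
// Hypothetical minimal view of a backend for routing purposes.
interface RoutableBackend {
  id: number;
  isActive: boolean;
}

class RoundRobinRouter {
  private cursor = 0;

  // Candidates are active backends the user holds a permission row for.
  candidates(backends: RoutableBackend[], permittedIds: Set<number>): RoutableBackend[] {
    return backends.filter((b) => b.isActive && permittedIds.has(b.id));
  }

  // Return candidates in the order they should be tried, rotating the start each call.
  order(backends: RoutableBackend[], permittedIds: Set<number>): RoutableBackend[] {
    const pool = this.candidates(backends, permittedIds);
    if (pool.length === 0) return [];
    const start = this.cursor % pool.length;
    this.cursor += 1;
    return [...pool.slice(start), ...pool.slice(0, start)];
  }
}

// Try each backend in turn until one call succeeds (simple failover retry).
async function withFailover<T>(
  ordered: RoutableBackend[],
  call: (backend: RoutableBackend) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("No permitted backend is available");
  for (const backend of ordered) {
    try {
      return await call(backend);
    } catch (err) {
      lastError = err; // remember the failure and move on to the next backend
    }
  }
  throw lastError;
}
```

A least-loaded strategy could reuse `candidates` and sort the pool by an in-flight request counter instead of rotating it.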
### 2. Client (Solid.js + Vite)
#### Pages
- **Dashboard**: Overview of system status, recent activity
- **Users**: CRUD operations for users, API key generation
- **Backends**: Manage backend API configurations
- **Permissions**: Assign/revoke backend access per user
- **Analytics**: View usage statistics and request logs
#### Key Features
- Real-time updates via polling or WebSocket (optional)
- Form validation for user/backend input
- Error handling with user-friendly messages
- Responsive design for various screen sizes
### 3. Databases (SQLite)
#### Core Database Schema (core.db)
```sql
-- Users table
CREATE TABLE users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    api_key TEXT UNIQUE NOT NULL,
    name TEXT NOT NULL,
    email TEXT,
    is_active BOOLEAN DEFAULT 1,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Backends table
CREATE TABLE backends (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT UNIQUE NOT NULL,
    base_url TEXT NOT NULL,
    api_key TEXT,
    is_active BOOLEAN DEFAULT 1,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Permissions table (many-to-many: users ↔ backends)
CREATE TABLE permissions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    backend_id INTEGER NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_id) REFERENCES users(id),
    FOREIGN KEY (backend_id) REFERENCES backends(id),
    UNIQUE(user_id, backend_id)
);

-- Indexes for performance
CREATE INDEX idx_users_api_key ON users(api_key);
CREATE INDEX idx_permissions_user ON permissions(user_id);
CREATE INDEX idx_permissions_backend ON permissions(backend_id);
```
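
SQLite has no native boolean or timestamp storage class: `is_active` comes back as `0` or `1`, and `CURRENT_TIMESTAMP` is stored as a UTC string of the form `YYYY-MM-DD HH:MM:SS`. A row-mapping step along these lines could sit in the model layer; the `UserRow`, `User`, and `toUser` names are assumptions, and the API key is deliberately left out of the mapped object.

```typescript
// Raw row as a SQLite driver would return it from the `users` table.
interface UserRow {
  id: number;
  api_key: string;
  name: string;
  email: string | null;
  is_active: number; // 0 or 1
  created_at: string; // "YYYY-MM-DD HH:MM:SS" in UTC
  updated_at: string;
}

// Hypothetical application-level shape (camelCase naming is an assumption).
interface User {
  id: number;
  name: string;
  email: string | null;
  isActive: boolean;
  createdAt: Date;
  updatedAt: Date;
}

// Convert "2024-01-01 00:00:00" into a Date, treating the value as UTC.
function parseSqliteTimestamp(value: string): Date {
  return new Date(value.replace(" ", "T") + "Z");
}

// Map a raw row to the application shape without exposing the stored key.
function toUser(row: UserRow): User {
  return {
    id: row.id,
    name: row.name,
    email: row.email,
    isActive: row.is_active !== 0,
    createdAt: parseSqliteTimestamp(row.created_at),
    updatedAt: parseSqliteTimestamp(row.updated_at),
  };
}
```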
#### Analytics Database Schema (analytics.db)
```sql
-- user_id and backend_id refer to rows in core.db. SQLite cannot enforce
-- foreign keys across database files, so no FOREIGN KEY clauses are declared here.

-- Request logs table
CREATE TABLE request_logs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    backend_id INTEGER NOT NULL,
    endpoint TEXT NOT NULL,
    request_model TEXT,
    response_model TEXT,
    prompt_tokens INTEGER,
    completion_tokens INTEGER,
    total_tokens INTEGER,
    status_code INTEGER,
    response_time_ms INTEGER,
    error_message TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Usage stats table (aggregated daily)
CREATE TABLE usage_stats (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL,
    backend_id INTEGER NOT NULL,
    date DATE NOT NULL,
    total_requests INTEGER DEFAULT 0,
    total_tokens INTEGER DEFAULT 0,
    UNIQUE(user_id, backend_id, date)
);

-- Backend metrics table (aggregated metrics per backend)
CREATE TABLE backend_metrics (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    backend_id INTEGER NOT NULL,
    date DATE NOT NULL,
    total_requests INTEGER DEFAULT 0,
    total_tokens INTEGER DEFAULT 0,
    avg_response_time_ms REAL DEFAULT 0,
    error_count INTEGER DEFAULT 0,
    success_rate REAL DEFAULT 1.0,
    UNIQUE(backend_id, date)
);

-- Indexes for performance
CREATE INDEX idx_request_logs_user ON request_logs(user_id);
CREATE INDEX idx_request_logs_backend ON request_logs(backend_id);
CREATE INDEX idx_request_logs_date ON request_logs(created_at);
CREATE INDEX idx_usage_stats_user ON usage_stats(user_id);
CREATE INDEX idx_usage_stats_date ON usage_stats(date);
CREATE INDEX idx_backend_metrics_backend ON backend_metrics(backend_id);
CREATE INDEX idx_backend_metrics_date ON backend_metrics(date);
```
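
The two aggregate tables can be derived from `request_logs`. The sketch below shows one possible roll-up in application code; the camelCase field names, the function names, and the rule that any status code of 400 or above counts as an error are assumptions, not something the schema specifies.

```typescript
// Hypothetical in-memory shape of a `request_logs` row.
interface RequestLogEntry {
  userId: number;
  backendId: number;
  totalTokens: number | null;
  statusCode: number;
  responseTimeMs: number;
  createdAt: string; // "YYYY-MM-DD HH:MM:SS" as written by CURRENT_TIMESTAMP
}

interface DailyUsage {
  userId: number;
  backendId: number;
  date: string; // "YYYY-MM-DD"
  totalRequests: number;
  totalTokens: number;
}

interface BackendDailyMetrics {
  backendId: number;
  date: string;
  totalRequests: number;
  totalTokens: number;
  avgResponseTimeMs: number;
  errorCount: number;
  successRate: number;
}

// Roll request logs up into one `usage_stats` row per (user, backend, day).
function aggregateDailyUsage(logs: RequestLogEntry[]): DailyUsage[] {
  const rows = new Map<string, DailyUsage>();
  for (const log of logs) {
    const date = log.createdAt.slice(0, 10);
    const key = `${log.userId}:${log.backendId}:${date}`;
    const row: DailyUsage = rows.get(key) ?? {
      userId: log.userId, backendId: log.backendId, date, totalRequests: 0, totalTokens: 0,
    };
    row.totalRequests += 1;
    row.totalTokens += log.totalTokens ?? 0;
    rows.set(key, row);
  }
  return Array.from(rows.values());
}

// Roll request logs up into one `backend_metrics` row per (backend, day).
function aggregateBackendMetrics(logs: RequestLogEntry[]): BackendDailyMetrics[] {
  const acc = new Map<string, { row: BackendDailyMetrics; totalMs: number }>();
  for (const log of logs) {
    const date = log.createdAt.slice(0, 10);
    const key = `${log.backendId}:${date}`;
    const entry: { row: BackendDailyMetrics; totalMs: number } = acc.get(key) ?? {
      row: {
        backendId: log.backendId, date, totalRequests: 0, totalTokens: 0,
        avgResponseTimeMs: 0, errorCount: 0, successRate: 1,
      },
      totalMs: 0,
    };
    entry.row.totalRequests += 1;
    entry.row.totalTokens += log.totalTokens ?? 0;
    entry.totalMs += log.responseTimeMs;
    if (log.statusCode >= 400) entry.row.errorCount += 1; // assumption: 4xx/5xx are errors
    entry.row.avgResponseTimeMs = entry.totalMs / entry.row.totalRequests;
    entry.row.successRate = 1 - entry.row.errorCount / entry.row.totalRequests;
    acc.set(key, entry);
  }
  return Array.from(acc.values()).map((entry) => entry.row);
}
```

The same roll-up could instead be written as `INSERT ... SELECT ... GROUP BY` statements inside analytics.db; doing it in code keeps the example independent of any SQLite driver.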
## API Design
### OpenAI-Compatible API Proxy
The router exposes the same HTTP interface as the OpenAI API, so existing OpenAI-compatible clients can integrate by pointing at the router's URL and using a router-issued API key.
#### Example Request
```bash
curl http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer <user_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
#### Example Response
```json
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "llama-3",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}
```
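
The `usage` object in a response like the one above lines up with the token columns of `request_logs`. A defensive extraction step might look like this sketch; `extractUsage` and `TokenUsage` are hypothetical names, and missing or malformed fields are mapped to `null` on the assumption that some backends or streamed responses omit `usage`.

```typescript
// Token counts destined for the prompt_tokens / completion_tokens / total_tokens columns.
interface TokenUsage {
  promptTokens: number | null;
  completionTokens: number | null;
  totalTokens: number | null;
}

// Accept only finite numbers; anything else becomes null.
function toCount(value: unknown): number | null {
  return typeof value === "number" && Number.isFinite(value) ? value : null;
}

// Read the `usage` object from a parsed response body without trusting its shape.
function extractUsage(responseBody: unknown): TokenUsage {
  const empty: TokenUsage = { promptTokens: null, completionTokens: null, totalTokens: null };
  if (typeof responseBody !== "object" || responseBody === null) return empty;
  const usage = (responseBody as { usage?: unknown }).usage;
  if (typeof usage !== "object" || usage === null) return empty;
  const fields = usage as Record<string, unknown>;
  return {
    promptTokens: toCount(fields["prompt_tokens"]),
    completionTokens: toCount(fields["completion_tokens"]),
    totalTokens: toCount(fields["total_tokens"]),
  };
}
```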
### Admin API
#### Create User
```text
POST /admin/users
Content-Type: application/json

{
  "name": "John Doe",
  "email": "john@example.com"
}

Response: {
  "id": 1,
  "name": "John Doe",
  "api_key": "sk-xxx...",
  "created_at": "2024-01-01T00:00:00Z"
}
```
#### Add Backend
```bash
POST /admin/backends
Content-Type: application/json

{
  "name": "vLLM Server 1",
  "base_url": "http://localhost:8000/v1",
  "api_key": "backend-key-xxx"
}
```
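
Before a request body like this is written to the `backends` table, it would need validating. The technology stack lists Zod for that purpose; the hand-rolled sketch below avoids the dependency so it stays self-contained. The function name, result type, and specific rules (non-empty name, absolute http/https URL, optional string key) are assumptions.

```typescript
interface NewBackendInput {
  name: string;
  baseUrl: string;
  apiKey: string | null;
}

type ValidationResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

// Check an untrusted `POST /admin/backends` body and collect every problem found.
function validateNewBackend(body: unknown): ValidationResult<NewBackendInput> {
  const errors: string[] = [];
  const obj = (typeof body === "object" && body !== null ? body : {}) as Record<string, unknown>;

  const rawName = obj["name"];
  const name = typeof rawName === "string" ? rawName.trim() : "";
  if (name.length === 0) errors.push("name must be a non-empty string");

  const rawUrl = obj["base_url"];
  let baseUrl = "";
  if (typeof rawUrl !== "string") {
    errors.push("base_url must be a string");
  } else {
    try {
      const parsed = new URL(rawUrl); // throws on relative or malformed URLs
      if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
        errors.push("base_url must use http or https");
      } else {
        baseUrl = rawUrl;
      }
    } catch {
      errors.push("base_url must be an absolute URL");
    }
  }

  const rawKey = obj["api_key"];
  if (rawKey !== undefined && rawKey !== null && typeof rawKey !== "string") {
    errors.push("api_key must be a string when provided");
  }
  const apiKey = typeof rawKey === "string" ? rawKey : null;

  return errors.length > 0
    ? { ok: false, errors }
    : { ok: true, value: { name, baseUrl, apiKey } };
}
```

A handler could answer a failed result with status 400 and the collected `errors`.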
## Security Considerations
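1. **API Key Storage**: User API keys are stored hashed in the database. Backend API keys must stay recoverable so the router can send them upstream, so they need separate protection (for example, restricted file permissions on `core.db`)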
2. **Transport Security**: Use HTTPS in production
3. **Rate Limiting**: Implement per-user rate limits
4. **Input Validation**: Validate all admin API inputs
5. **CORS**: Configure CORS for admin dashboard only
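
Per-user rate limiting (item 3) is commonly done with a token bucket. The following is a minimal in-memory sketch under that assumption; the class name, the capacity and refill parameters, and the choice of algorithm are illustrative and not taken from the project.

```typescript
// Each user gets a bucket that refills continuously up to `capacity` tokens.
class TokenBucketLimiter {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly buckets = new Map<number, { tokens: number; updatedAtMs: number }>();

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true and spends one token if the user may proceed, false otherwise.
  tryConsume(userId: number, nowMs: number = Date.now()): boolean {
    const bucket = this.buckets.get(userId) ?? { tokens: this.capacity, updatedAtMs: nowMs };
    // Refill in proportion to the time elapsed since the last call.
    const elapsedSeconds = Math.max(0, nowMs - bucket.updatedAtMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.updatedAtMs = nowMs;
    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(userId, bucket);
    return allowed;
  }
}
```

A middleware placed after authentication could call `tryConsume(user.id)` and respond with HTTP 429 when it returns `false`. State held in memory like this is lost on restart and is not shared between processes.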
## Deployment
### Development
Run both server and client with a single command:
```bash
npm run dev
```
This starts:
- Express server on port 3000
- Vite dev server on port 3001 (admin API routes proxied from Express)
### Docker Compose
```yaml
version: '3.8'
services:
  router:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - SERVER_PORT=3000
      - CLIENT_PORT=3001
      - CORE_DB_PATH=/data/core.db
      - ANALYTICS_DB_PATH=/data/analytics.db
      # Required (see Environment Variables); read from the host environment or a .env file
      - ADMIN_PASSWORD=${ADMIN_PASSWORD}
    volumes:
      - router-data:/data
    restart: unless-stopped
volumes:
  router-data:
```
### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `SERVER_PORT` | Express server port | `3000` |
| `CLIENT_PORT` | Vite dev server port | `3001` |
| `CORE_DB_PATH` | Core database path | `./data/core.db` |
| `ANALYTICS_DB_PATH` | Analytics database path | `./data/analytics.db` |
| `ADMIN_PASSWORD` | Admin dashboard password | (required) |
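
The table above translates directly into a small configuration loader. This is a sketch only: `RouterConfig`, `parsePort`, and `loadConfig` are hypothetical names, and failing fast on a missing `ADMIN_PASSWORD` is an assumed interpretation of "(required)".

```typescript
interface RouterConfig {
  serverPort: number;
  clientPort: number;
  coreDbPath: string;
  analyticsDbPath: string;
  adminPassword: string;
}

// Parse a port variable, falling back to the documented default when unset.
function parsePort(value: string | undefined, fallback: number, name: string): number {
  if (value === undefined || value === "") return fallback;
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`${name} must be an integer between 1 and 65535`);
  }
  return port;
}

// Build the configuration from an environment-like record of strings.
function loadConfig(env: Record<string, string | undefined>): RouterConfig {
  const adminPassword = env["ADMIN_PASSWORD"];
  if (!adminPassword) {
    throw new Error("ADMIN_PASSWORD is required");
  }
  return {
    serverPort: parsePort(env["SERVER_PORT"], 3000, "SERVER_PORT"),
    clientPort: parsePort(env["CLIENT_PORT"], 3001, "CLIENT_PORT"),
    coreDbPath: env["CORE_DB_PATH"] || "./data/core.db",
    analyticsDbPath: env["ANALYTICS_DB_PATH"] || "./data/analytics.db",
    adminPassword,
  };
}
```

At startup the server would call `loadConfig(process.env)` once and pass the result to the components that need it.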
## Development Roadmap
### Phase 1: Core Infrastructure
- [ ] Set up Express server with TypeScript
- [ ] Implement SQLite database schema
- [ ] Create user and backend models
- [ ] Implement API key authentication
### Phase 2: API Proxy
- [ ] Implement OpenAI-compatible API endpoints
- [ ] Add request forwarding to backends
- [ ] Implement basic routing logic
- [ ] Add request logging
### Phase 3: Admin API
- [ ] Implement CRUD endpoints for users
- [ ] Implement CRUD endpoints for backends
- [ ] Implement permission management
- [ ] Add usage analytics endpoints
### Phase 4: Admin Dashboard
- [ ] Set up Solid.js + Vite project
- [ ] Create user management UI
- [ ] Create backend management UI
- [ ] Create permission management UI
- [ ] Create analytics dashboard
### Phase 5: Advanced Features
- [ ] Add rate limiting
- [ ] Implement load balancing
- [ ] Add WebSocket for real-time updates
- [ ] Implement backend health checks
## Technology Stack Summary
| Component | Technology |
|-----------|------------|
| Backend | Node.js 18+, Express.js, TypeScript |
| Core Database | SQLite (better-sqlite3) |
| Analytics Database | SQLite (better-sqlite3) |
| Frontend | Solid.js, Vite, TypeScript |
| HTTP Client | Axios/Fetch |
| Charts | Chart.js or Solid-chart |
| Styling | Tailwind CSS or CSS Modules |
| Validation | Zod |
| Testing | Vitest (frontend), Jest (backend) |
| Dev Server | Concurrently for running both servers |