Kyush LLM Router Architecture Document
Overview
Kyush LLM Router is a proxy server that manages multiple users and routes their requests to various OpenAI-compatible backend APIs. It provides API key-based authentication, user permission management, and usage monitoring through an admin web dashboard.
System Architecture
┌───────────────────────────────────────────────────────────────────────────┐
│ Client Layer │
├─────────────────────┬───────────────────────────────┬─────────────────────┤
│ LLM Clients │ Admin Dashboard │ Vite Dev Server │
│ (OpenAI SDK etc) │ (Solid.js + Vite) │ (Port 3001) │
└──────────┬──────────┴──────────────┬────────────────┴──────────┬──────────┘
│ │ │
│ OpenAI-Compatible API │ REST API │
│ (Port 3000) │ (Port 3001) │
▼ ▼ │
┌───────────────────────────────────────────────────────────────────────────┐
│ Router Server (Node.js/Express) │
│ (Port 3000) │
├───────────────────────────────────────────────────────────────────────────┤
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ Authentication Middleware │ │
│ │ - API Key Validation │ │
│ │ - Rate Limiting │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ Request Router │ │
│ │ - Backend Selection │ │
│ │ - Load Balancing (optional) │ │
│ │ - Request/Response Transformation │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ Admin API Endpoints (proxied to Vite dev server) │ │
│ │ - User Management │ │
│ │ - Backend Management │ │
│ │ - Permission Management │ │
│ │ - Usage Analytics │ │
│ └───────────────────────────────────────────────────────────────────┘ │
└────────────────────────────┬──────────────────────────────────────────────┘
│
┌─────────────────┴─────────────────┐
▼ ▼
┌───────────────────────┐ ┌─────────────────────────┐
│ Core Database │ │ Analytics Database │
│ (SQLite - core.db) │ │ (SQLite - analytics.db)│
├───────────────────────┤ ├─────────────────────────┤
│ - users │ │ - request_logs │
│ - backends │ │ - usage_stats │
│ - permissions │ │ - backend_metrics │
└───────────────────────┘ └─────────────────────────┘
│ │
└─────────────────┬─────────────────┘
▼
┌───────────────────────────────────────────────────────────────────────────┐
│ Backend Layer │
├───────────────────────────────────────────────────────────────────────────┤
│ Multiple OpenAI-Compatible APIs (vLLM, SGLang, etc.) │
└───────────────────────────────────────────────────────────────────────────┘
Project Structure
kyush-llm-router/
├── server/
│ ├── src/
│ │ ├── index.ts # Express app entry point
│ │ ├── config/
│ │ │ ├── database.ts # Core SQLite connection
│ │ │ └── analytics-db.ts # Analytics SQLite connection
│ │ ├── routes/
│ │ │ ├── api.ts # OpenAI-compatible API proxy
│ │ │ ├── admin.ts # Admin management API
│ │ │ └── auth.ts # Authentication middleware
│ │ ├── models/
│ │ │ ├── User.ts # User model
│ │ │ ├── Backend.ts # Backend model
│ │ │ └── Permission.ts # Permission model
│ │ ├── analytics/
│ │ │ ├── RequestLog.ts # Request log model (analytics DB)
│ │ │ ├── UsageStats.ts # Usage stats model (analytics DB)
│ │ │ └── BackendMetrics.ts # Backend metrics model
│ │ ├── services/
│ │ │ ├── AuthService.ts # API key validation
│ │ │ ├── RouterService.ts # Backend routing logic
│ │ │ └── AnalyticsService.ts # Usage tracking
│ │ └── utils/
│ │ └── logger.ts # Logging utility
│ ├── package.json
│ └── tsconfig.json
│
├── client/
│ ├── src/
│ │ ├── index.tsx # Solid.js entry point
│ │ ├── App.tsx # Main application component
│ │ ├── routes/
│ │ │ ├── Dashboard.tsx # Main dashboard
│ │ │ ├── Users.tsx # User management
│ │ │ ├── Backends.tsx # Backend management
│ │ │ ├── Permissions.tsx # Permission management
│ │ │ └── Analytics.tsx # Usage monitoring
│ │ ├── components/
│ │ │ ├── Layout.tsx # Admin layout
│ │ │ ├── UserTable.tsx # User list table
│ │ │ ├── BackendTable.tsx # Backend list table
│ │ │ └── StatsChart.tsx # Usage chart component
│ │ ├── api/
│ │ │ └── client.ts # API client for admin endpoints
│ │ └── types/
│ │ └── index.ts # TypeScript type definitions
│ ├── package.json
│ ├── vite.config.ts
│ └── tsconfig.json
│
├── shared/
│ └── types.ts # Shared type definitions
│
├── database/
│ ├── schema.sql # Core database schema
│ └── analytics-schema.sql # Analytics database schema
│
├── scripts/
│ └── dev.js # Dev server launcher (runs both)
│
├── docker-compose.yml # Docker Compose setup
├── package.json # Root package.json (scripts)
└── README.md
Core Components
1. Server (Node.js/Express)
Authentication Middleware (src/routes/auth.ts)
- Validates API keys from incoming requests
- Extracts user identity from Authorization: Bearer <api_key> header
- Returns 401 if authentication fails
- Attaches user info to request object for downstream handlers
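The header handling described above can be sketched as a pair of framework-independent functions. This is a minimal illustration, not the project's actual middleware: `AuthUser`, `AuthResult`, `extractBearerToken`, `authenticate`, and the `findUserByKey` callback are all hypothetical names, and the lookup is assumed to be backed by the users table.

```typescript
// Hypothetical shapes; the names are illustrative and not taken from the project.
interface AuthUser {
  id: number;
  name: string;
  isActive: boolean;
}

type AuthResult =
  | { ok: true; user: AuthUser }
  | { ok: false; status: 401; error: string };

// Pull the key out of an "Authorization: Bearer <api_key>" header value.
function extractBearerToken(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}

// Resolve a header to a user, or to a 401 result the middleware can send back.
function authenticate(
  header: string | undefined,
  findUserByKey: (apiKey: string) => AuthUser | undefined,
): AuthResult {
  const token = extractBearerToken(header);
  if (token === null) {
    return { ok: false, status: 401, error: "Missing or malformed Authorization header" };
  }
  const user = findUserByKey(token);
  if (!user || !user.isActive) {
    return { ok: false, status: 401, error: "Invalid API key" };
  }
  return { ok: true, user };
}
```

An Express middleware would presumably call `authenticate(req.headers.authorization, lookup)`, respond with the 401 on failure, and attach `result.user` to the request object on success.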
API Proxy Route (src/routes/api.ts)
// OpenAI-compatible endpoints
POST /v1/chat/completions
POST /v1/completions
GET /v1/models
- Forwards requests to selected backend
- Handles request/response transformation
- Logs all requests for analytics
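One detail of forwarding is how the incoming path maps onto a backend's base URL. The sketch below assumes the stored base_url already ends in /v1 (as in the Add Backend example later in this document) and that the backend's own key replaces the user's router key. `BackendTarget`, `UpstreamRequest`, and `buildUpstreamRequest` are hypothetical names.

```typescript
// Hypothetical subset of a row in the backends table.
interface BackendTarget {
  baseUrl: string; // e.g. "http://localhost:8000/v1"
  apiKey: string | null; // backend credential, if the backend requires one
}

interface UpstreamRequest {
  url: string;
  headers: Record<string, string>;
}

// Map an incoming router path such as "/v1/chat/completions" onto the backend.
// Assumption: baseUrl already includes the "/v1" prefix, so it is stripped
// from the incoming path to avoid producing ".../v1/v1/...".
function buildUpstreamRequest(backend: BackendTarget, incomingPath: string): UpstreamRequest {
  const base = backend.baseUrl.replace(/\/+$/, "");
  const suffix = incomingPath.replace(/^\/v1(?=\/|$)/, "");
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // The user's router key is not forwarded; the backend's own key is used instead.
  if (backend.apiKey) {
    headers["Authorization"] = `Bearer ${backend.apiKey}`;
  }
  return { url: base + suffix, headers };
}
```

The returned URL and headers could then be handed to Axios or Fetch together with the original request body.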
Admin API Route (src/routes/admin.ts)
// User Management
POST /admin/users # Create user
GET /admin/users # List users
PUT /admin/users/:id # Update user
DELETE /admin/users/:id # Delete user
// Backend Management
POST /admin/backends # Add backend
GET /admin/backends # List backends
PUT /admin/backends/:id # Update backend
DELETE /admin/backends/:id # Delete backend
// Permission Management
POST /admin/permissions # Grant permission
DELETE /admin/permissions # Revoke permission
GET /admin/permissions # List permissions
// Analytics
GET /admin/analytics/usage # Usage statistics
GET /admin/analytics/requests # Request logs
Router Service (src/services/RouterService.ts)
- Selects appropriate backend based on:
- User's permissions
- Backend availability
- Load balancing strategy (round-robin, least-loaded)
- Handles backend failures with retry logic
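The selection criteria above might combine as follows. This is a sketch of the round-robin strategy only, with failover to the next candidate standing in for the retry logic. `BackendInfo`, `RoundRobinSelector`, and `forwardWithRetry` are hypothetical names, and the permitted backend IDs are assumed to come from the permissions table.

```typescript
// Hypothetical subset of a backends row.
interface BackendInfo {
  id: number;
  name: string;
  isActive: boolean;
}

class RoundRobinSelector {
  private cursor = 0;

  // Candidates are active backends that the user has permission to use.
  candidates(backends: BackendInfo[], permittedIds: Set<number>): BackendInfo[] {
    return backends.filter((b) => b.isActive && permittedIds.has(b.id));
  }

  // Return candidates in the order they should be tried, rotating the start on each call.
  order(backends: BackendInfo[], permittedIds: Set<number>): BackendInfo[] {
    const pool = this.candidates(backends, permittedIds);
    if (pool.length === 0) return [];
    const start = this.cursor % pool.length;
    this.cursor += 1;
    return pool.slice(start).concat(pool.slice(0, start));
  }
}

// Try each backend in order until one succeeds; otherwise rethrow the last error.
async function forwardWithRetry<T>(
  ordered: BackendInfo[],
  send: (backend: BackendInfo) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("No permitted backend is available");
  for (const backend of ordered) {
    try {
      return await send(backend);
    } catch (err) {
      lastError = err; // remember the failure and fall through to the next backend
    }
  }
  throw lastError;
}
```

A least-loaded strategy could reuse `candidates` and sort by an in-flight request counter instead of rotating the start index.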
2. Client (Solid.js + Vite)
Pages
- Dashboard: Overview of system status, recent activity
- Users: CRUD operations for users, API key generation
- Backends: Manage backend API configurations
- Permissions: Assign/revoke backend access per user
- Analytics: View usage statistics and request logs
Key Features
- Real-time updates via polling or WebSocket (optional)
- Form validation for user/backend input
- Error handling with user-friendly messages
- Responsive design for various screen sizes
3. Databases (SQLite)
Core Database Schema (core.db)
-- Users table
CREATE TABLE users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
api_key TEXT UNIQUE NOT NULL,
name TEXT NOT NULL,
email TEXT,
is_active BOOLEAN DEFAULT 1,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Backends table
CREATE TABLE backends (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT UNIQUE NOT NULL,
base_url TEXT NOT NULL,
api_key TEXT,
is_active BOOLEAN DEFAULT 1,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Permissions table (many-to-many: users ↔ backends)
CREATE TABLE permissions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER NOT NULL,
backend_id INTEGER NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(id),
FOREIGN KEY (backend_id) REFERENCES backends(id),
UNIQUE(user_id, backend_id)
);
-- Indexes for performance
CREATE INDEX idx_users_api_key ON users(api_key);
CREATE INDEX idx_permissions_user ON permissions(user_id);
CREATE INDEX idx_permissions_backend ON permissions(backend_id);
Analytics Database Schema (analytics.db)
-- Request logs table
CREATE TABLE request_logs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER NOT NULL,
backend_id INTEGER NOT NULL,
endpoint TEXT NOT NULL,
request_model TEXT,
response_model TEXT,
prompt_tokens INTEGER,
completion_tokens INTEGER,
total_tokens INTEGER,
status_code INTEGER,
response_time_ms INTEGER,
error_message TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-- No FOREIGN KEY constraints: user_id and backend_id refer to rows in core.db,
-- and SQLite cannot enforce foreign keys across separate database files
);
-- Usage stats table (aggregated daily)
CREATE TABLE usage_stats (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER NOT NULL,
backend_id INTEGER NOT NULL,
date DATE NOT NULL,
total_requests INTEGER DEFAULT 0,
total_tokens INTEGER DEFAULT 0,
-- user_id and backend_id refer to rows in core.db; SQLite cannot enforce
-- foreign keys across separate database files, so none are declared here
UNIQUE(user_id, backend_id, date)
);
-- Backend metrics table (aggregated metrics per backend)
CREATE TABLE backend_metrics (
id INTEGER PRIMARY KEY AUTOINCREMENT,
backend_id INTEGER NOT NULL,
date DATE NOT NULL,
total_requests INTEGER DEFAULT 0,
total_tokens INTEGER DEFAULT 0,
avg_response_time_ms REAL DEFAULT 0,
error_count INTEGER DEFAULT 0,
success_rate REAL DEFAULT 1.0,
-- backend_id refers to a row in core.db; SQLite cannot enforce foreign keys
-- across separate database files, so none is declared here
UNIQUE(backend_id, date)
);
-- Indexes for performance
CREATE INDEX idx_request_logs_user ON request_logs(user_id);
CREATE INDEX idx_request_logs_backend ON request_logs(backend_id);
CREATE INDEX idx_request_logs_date ON request_logs(created_at);
CREATE INDEX idx_usage_stats_user ON usage_stats(user_id);
CREATE INDEX idx_usage_stats_date ON usage_stats(date);
CREATE INDEX idx_backend_metrics_backend ON backend_metrics(backend_id);
CREATE INDEX idx_backend_metrics_date ON backend_metrics(date);
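The usage_stats table holds one row per (user_id, backend_id, date). A sketch of how request logs could be rolled up into that shape follows. `RequestLogRow`, `UsageStatRow`, and `aggregateDailyUsage` are hypothetical names, and the date is assumed to be the first ten characters of created_at, which holds for both ISO 8601 strings and SQLite's CURRENT_TIMESTAMP format.

```typescript
// Hypothetical subset of a request_logs row.
interface RequestLogRow {
  userId: number;
  backendId: number;
  totalTokens: number | null; // NULL when the backend reported no usage
  createdAt: string; // "2024-01-01 12:00:00" or "2024-01-01T12:00:00Z"
}

// Mirrors the columns of usage_stats.
interface UsageStatRow {
  userId: number;
  backendId: number;
  date: string; // YYYY-MM-DD
  totalRequests: number;
  totalTokens: number;
}

// Group logs by (user_id, backend_id, date), matching the UNIQUE constraint on usage_stats.
function aggregateDailyUsage(logs: RequestLogRow[]): UsageStatRow[] {
  const byKey = new Map<string, UsageStatRow>();
  for (const log of logs) {
    const date = log.createdAt.slice(0, 10);
    const key = `${log.userId}:${log.backendId}:${date}`;
    let row = byKey.get(key);
    if (!row) {
      row = { userId: log.userId, backendId: log.backendId, date, totalRequests: 0, totalTokens: 0 };
      byKey.set(key, row);
    }
    row.totalRequests += 1;
    row.totalTokens += log.totalTokens ?? 0;
  }
  return Array.from(byKey.values());
}
```

The same rollup could instead be done in SQL with a GROUP BY over request_logs; the in-memory version is shown only to make the grouping key explicit.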
API Design
OpenAI-Compatible API Proxy
The router exposes the same API interface as OpenAI, so existing OpenAI-compatible clients can integrate by pointing their base URL at the router and using a router-issued API key.
Example Request
curl http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer <user_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Example Response
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "llama-3",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}
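The usage object in the response above lines up with the token columns of request_logs. A sketch of that mapping follows, under the assumption that some backends or streaming responses omit usage, in which case the columns are left NULL. `ChatCompletionResponse`, `TokenUsageLog`, and `toTokenUsageLog` are hypothetical names.

```typescript
// Minimal view of the chat completion response shown above.
interface ChatCompletionResponse {
  model: string;
  usage?: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}

// Hypothetical shape mirroring the model and token columns of request_logs.
interface TokenUsageLog {
  response_model: string;
  prompt_tokens: number | null;
  completion_tokens: number | null;
  total_tokens: number | null;
}

// Copy the reported usage into log fields, falling back to null when it is absent.
function toTokenUsageLog(response: ChatCompletionResponse): TokenUsageLog {
  const usage = response.usage;
  return {
    response_model: response.model,
    prompt_tokens: usage?.prompt_tokens ?? null,
    completion_tokens: usage?.completion_tokens ?? null,
    total_tokens: usage?.total_tokens ?? null,
  };
}
```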
Admin API
Create User
POST /admin/users
Content-Type: application/json
{
  "name": "John Doe",
  "email": "john@example.com"
}
Response: {
  "id": 1,
  "name": "John Doe",
  "api_key": "sk-xxx...",
  "created_at": "2024-01-01T00:00:00Z"
}
Add Backend
POST /admin/backends
Content-Type: application/json
{
  "name": "vLLM Server 1",
  "base_url": "http://localhost:8000/v1",
  "api_key": "backend-key-xxx"
}
Security Considerations
- API Key Storage: API keys are stored hashed in the database, so the plaintext key can only be shown to the user at creation time
- Transport Security: Use HTTPS in production
- Rate Limiting: Implement per-user rate limits
- Input Validation: Validate all admin API inputs
- CORS: Configure CORS for admin dashboard only
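Hashed key storage could look like the sketch below, which uses Node's built-in crypto module. The choice of SHA-256 is an assumption (the document does not name a hash function); it is generally considered adequate for long random keys, unlike passwords. `generateApiKey`, `hashApiKey`, and `verifyApiKey` are hypothetical names, and the "sk-" prefix follows the Create User example.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Generate a new random key; the plaintext is shown to the user once and never stored.
function generateApiKey(): string {
  return "sk-" + randomBytes(24).toString("hex");
}

// Assumption: SHA-256 as the storage hash. The hex digest is what the api_key column would hold.
function hashApiKey(apiKey: string): string {
  return createHash("sha256").update(apiKey).digest("hex");
}

// Compare a presented key against a stored hash in constant time.
function verifyApiKey(presented: string, storedHash: string): boolean {
  const a = Buffer.from(hashApiKey(presented), "hex");
  const b = Buffer.from(storedHash, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Because the hash is deterministic, a lookup can hash the presented key and query the indexed api_key column directly; `verifyApiKey` is only needed when comparing against a hash that is already in hand.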
Deployment
Development
Run both server and client with a single command:
npm run dev
This starts:
- Express server on port 3000
- Vite dev server on port 3001 (serves the admin dashboard; admin API requests are proxied to the Express server)
Docker Compose
version: '3.8'
services:
  router:
    build: .
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - SERVER_PORT=3000
      - CLIENT_PORT=3001
      - CORE_DB_PATH=/data/core.db
      - ANALYTICS_DB_PATH=/data/analytics.db
      # Required (see Environment Variables); supplied from the shell or a .env file
      - ADMIN_PASSWORD=${ADMIN_PASSWORD}
    volumes:
      - router-data:/data
    restart: unless-stopped
volumes:
  router-data:
Environment Variables
| Variable | Description | Default |
|---|---|---|
| SERVER_PORT | Express server port | 3000 |
| CLIENT_PORT | Vite dev server port | 3001 |
| CORE_DB_PATH | Core database path | ./data/core.db |
| ANALYTICS_DB_PATH | Analytics database path | ./data/analytics.db |
| ADMIN_PASSWORD | Admin dashboard password | (required) |
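A sketch of how the server might read these variables at startup, applying the defaults from the table and failing fast when ADMIN_PASSWORD is missing. `RouterConfig`, `parsePort`, and `loadConfig` are hypothetical names; the environment is passed in as a plain object so the function is easy to test.

```typescript
interface RouterConfig {
  serverPort: number;
  clientPort: number;
  coreDbPath: string;
  analyticsDbPath: string;
  adminPassword: string;
}

type Env = Record<string, string | undefined>;

// Parse a port variable, falling back to the documented default when it is unset.
function parsePort(value: string | undefined, fallback: number, name: string): number {
  if (value === undefined || value === "") return fallback;
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`${name} must be a port between 1 and 65535, got "${value}"`);
  }
  return port;
}

// Build the configuration from environment variables using the defaults in the table above.
function loadConfig(env: Env): RouterConfig {
  const adminPassword = env["ADMIN_PASSWORD"];
  if (!adminPassword) {
    throw new Error("ADMIN_PASSWORD is required");
  }
  return {
    serverPort: parsePort(env["SERVER_PORT"], 3000, "SERVER_PORT"),
    clientPort: parsePort(env["CLIENT_PORT"], 3001, "CLIENT_PORT"),
    coreDbPath: env["CORE_DB_PATH"] || "./data/core.db",
    analyticsDbPath: env["ANALYTICS_DB_PATH"] || "./data/analytics.db",
    adminPassword,
  };
}
```

In the server entry point this would presumably be called once as `loadConfig(process.env)`.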
Development Roadmap
Phase 1: Core Infrastructure
- Set up Express server with TypeScript
- Implement SQLite database schema
- Create user and backend models
- Implement API key authentication
Phase 2: API Proxy
- Implement OpenAI-compatible API endpoints
- Add request forwarding to backends
- Implement basic routing logic
- Add request logging
Phase 3: Admin API
- Implement CRUD endpoints for users
- Implement CRUD endpoints for backends
- Implement permission management
- Add usage analytics endpoints
Phase 4: Admin Dashboard
- Set up Solid.js + Vite project
- Create user management UI
- Create backend management UI
- Create permission management UI
- Create analytics dashboard
Phase 5: Advanced Features
- Add rate limiting
- Implement load balancing
- Add WebSocket for real-time updates
- Implement backend health checks
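For the rate limiting item in Phase 5, one possible approach is a per-user token bucket. The document does not specify an algorithm, so this is an assumption: `TokenBucketLimiter` is a hypothetical name, state is kept in memory only, and the clock is injectable so the behavior can be tested deterministically.

```typescript
interface Bucket {
  tokens: number;
  updatedAt: number; // milliseconds since epoch
}

// Per-user token bucket: up to `capacity` requests may burst, refilled at `refillPerSecond`.
class TokenBucketLimiter {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly buckets = new Map<number, Bucket>();

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  // Returns true if the request is allowed and consumes one token.
  tryConsume(userId: number, nowMs: number = Date.now()): boolean {
    const bucket = this.buckets.get(userId) ?? { tokens: this.capacity, updatedAt: nowMs };
    const elapsedSeconds = Math.max(0, nowMs - bucket.updatedAt) / 1000;
    // Refill for the elapsed time, never exceeding the bucket capacity.
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.updatedAt = nowMs;
    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(userId, bucket);
    return allowed;
  }
}
```

A middleware placed after authentication could call `tryConsume(user.id)` and respond with HTTP 429 when it returns false.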
Technology Stack Summary
| Component | Technology |
|---|---|
| Backend | Node.js 18+, Express.js, TypeScript |
| Core Database | SQLite (better-sqlite3) |
| Analytics Database | SQLite (better-sqlite3) |
| Frontend | Solid.js, Vite, TypeScript |
| HTTP Client | Axios/Fetch |
| Charts | Chart.js or Solid-chart |
| Styling | Tailwind CSS or CSS Modules |
| Validation | Zod |
| Testing | Vitest (frontend), Jest (backend) |
| Dev Server | Concurrently for running both servers |