feat(rewrite): enhance model rewrite logic with cycle detection and chain evaluation

Kyush 2026-04-23 17:39:58 +09:00
commit 7cef8635bd
9 changed files with 531 additions and 79 deletions


@ -145,7 +145,7 @@ export const Models: Component = () => {
<div class="ui-app-page">
<PageHeader
title="Models"
description="Inspect cached backend model catalogs and manage global model rewrite rules."
description="Inspect cached backend model catalogs and manage chained global model rewrite rules."
actions={<Button onClick={() => void Promise.all([refetchOverview(), refetchRules()])}>Refresh</Button>}
/>
@ -224,7 +224,7 @@ export const Models: Component = () => {
<Panel
title="Model Rewrite Rules"
description="Force rules always rewrite. Fallback rules rewrite only when the original model has no usable backend."
description="Force rules always rewrite and continue through the chain. Fallback rules continue only when the current model has no usable backend."
actions={<IconButton variant="primary" icon={<Plus />} label="Add Rule" onClick={openCreateDialog} />}
>
<div class="ui-stack ui-stack--tight">
@ -266,7 +266,7 @@ export const Models: Component = () => {
open={dialogOpen()}
onOpenChange={setDialogOpen}
title={editingRule() ? 'Edit Model Rule' : 'Add Model Rule'}
description="Choose whether the target model should always replace the source, or only act as a fallback when the source is unavailable."
description="Choose whether the target model should always replace the source, or only continue the chain when the current model is unavailable."
footer={
<>
<Button onClick={() => setDialogOpen(false)} disabled={submitting()}>Cancel</Button>
@ -282,7 +282,7 @@ export const Models: Component = () => {
<TextField label="Note" value={form().note} onInput={(event) => setForm((current) => ({ ...current, note: event.currentTarget.value }))} />
<Checkbox
label="Always force rewrite"
description="When enabled, requests always route to the target model. When disabled, the target model is only used as a fallback."
description="When enabled, requests always continue to the target model. When disabled, the target is used only if the current model has no backend."
checked={form().force}
onChange={(checked) => setForm((current) => ({ ...current, force: checked }))}
/>


@ -23,9 +23,10 @@
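`/v1/**` keeps the existing user API key authentication and is separate from admin authentication.
Additional behavior:
- `/v1/chat/completions` first resolves the requested model name through the global rewrite rules, then uses only allowed, active backends that serve the final model as candidates
- A `force=true` rewrite is always applied
- A `force=false` rewrite is applied as a fallback only when no allowed, active backend serves the original model
- `/v1/chat/completions` first resolves the requested model name through the global rewrite chain, then uses only allowed, active backends that serve the final model as candidates
- A `force=true` rewrite is always applied, and evaluation continues with the target model's next rule
- A `force=false` rewrite is applied as a fallback only when no allowed, active backend serves the current model, and evaluation then continues with the target model's next rule
- `/v1/models` returns not only native backend models but also rewrite source aliases that have a final candidate under the current user's permissions
- If there is no final candidate, a model-not-supported error is returned together with `request_model` and `routed_model`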
## Admin API
@ -80,6 +81,8 @@
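| PUT | `/admin/model-rewrites/:id` | Update a global model rewrite rule |
| DELETE | `/admin/model-rewrites/:id` | Delete a global model rewrite rule |
Create/update requests that would introduce a cycle into the active rewrite graph are rejected with `409 { error, cycle }`. Cycles among inactive rules can be saved, but the same check must pass at activation time.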
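As a rough illustration of the check behind the `409 { error, cycle }` response, the sketch below walks the active rules as a "one outgoing edge per source model" graph and reports the first loop it finds. The names `RewriteEdge` and `findActiveCycle` are hypothetical and exist only for this example; this is not the project's `detectRewriteCycle` implementation, just one way the same idea could be written.

```typescript
// Hypothetical rule shape; only the fields needed for the check.
type RewriteEdge = { source_model: string; target_model: string; is_active: boolean };

// Returns the cycle as a list of model ids (the first id is repeated at the end), or null.
function findActiveCycle(edges: RewriteEdge[]): string[] | null {
  // Only active rules take part in the graph; inactive rules are ignored.
  const next = new Map<string, string>();
  for (const edge of edges) {
    if (edge.is_active) next.set(edge.source_model, edge.target_model);
  }

  const done = new Set<string>(); // models already proven not to lead into a cycle
  for (const start of Array.from(next.keys())) {
    const indexOnPath = new Map<string, number>();
    const walk: string[] = [];
    let node: string | undefined = start;
    while (node !== undefined && !done.has(node)) {
      const seenAt = indexOnPath.get(node);
      if (seenAt !== undefined) return [...walk.slice(seenAt), node];
      indexOnPath.set(node, walk.length);
      walk.push(node);
      node = next.get(node);
    }
    for (const visited of walk) done.add(visited);
  }
  return null;
}

// Two active rules pointing at each other form a cycle.
const activeLoop = findActiveCycle([
  { source_model: 'a', target_model: 'b', is_active: true },
  { source_model: 'b', target_model: 'a', is_active: true },
]);

// The same pair is accepted while one side is inactive.
const inactiveLoop = findActiveCycle([
  { source_model: 'a', target_model: 'b', is_active: true },
  { source_model: 'b', target_model: 'a', is_active: false },
]);

console.log(activeLoop, inactiveLoop);
```

Running the check against "existing rules plus the proposed rule" before saving is what lets an inactive pair be stored and then rejected only when the second rule is activated.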
The `GET /admin/backends/:id/models` response also includes the following.
- `backend`: basic backend info + cache summary
- `cache`: in-memory cache state (`ready`, `uninitialized`, `error`, `inactive`)


@ -78,7 +78,8 @@ The SPA uses `/dashboard` as the router base, and the admin API continues to `
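- The `Backends` screen shows per-backend model cache status, model count, and last sync status
- On the `Backends` screen, active backends support manual refresh and viewing the cached model list
- Inactive backends do not attempt model lookups and are shown as `Skipped` in the UI
- The `Models` screen manages the full in-memory model catalog and the global model rewrite rules
- The `Models` screen manages the full in-memory model catalog and the global model rewrite chain
- Rewrite rules have two modes
- `Force`: always rewrite to the target model regardless of whether the original model is available
- `Fallback`: rewrite to the target model only when no allowed, active backend serves the original model
- `Force`: always move to the target model regardless of whether the current model is available, then continue evaluating the next rule
- `Fallback`: move to the target model only when no allowed, active backend serves the current model, then continue evaluating the next rule
- Active rewrite cycles are rejected at save time, and `/v1/models` also returns the rewrite aliases that are actually requestable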


@ -9,7 +9,7 @@
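1. Authenticate the user API key
2. Load the list of backend ids the user can access
3. Among accessible active backends, lazily fetch `/v1/models` only for those whose in-memory catalog is not yet initialized
4. Evaluate the global `model_rewrites` rules for the requested `model`
4. Evaluate the global `model_rewrites` chain to the end for the requested `model`
5. Select as candidates only allowed, active backends that serve the final model
6. Randomly pick one candidate and forward the request upstream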
@ -17,7 +17,8 @@
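1. Authenticate the user API key
2. Check the in-memory catalogs of accessible active backends
3. Return the union of model IDs
3. Evaluate native backend models and rewrite `source_model` aliases with the same chain resolver
4. Return the union of requestable model IDs that have a final model candidate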
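The listing steps above can be sketched as follows. This is a simplified illustration under stated assumptions: `AliasRule`, `followChain`, and `listRequestableModels` are hypothetical names, and the `native` set stands in for "models served by at least one allowed, active backend" rather than the real per-backend catalog.

```typescript
// Hypothetical rule shape, keyed by source model in the map below.
type AliasRule = { target: string; force: boolean };

// Compact chain resolver. The step bound guards against a misconfigured cycle.
function followChain(model: string, rules: Map<string, AliasRule>, native: Set<string>): string {
  let current = model;
  for (let step = 0; step <= rules.size; step += 1) {
    const rule = rules.get(current);
    // Stop when no rule applies, or when a fallback rule meets a servable model.
    if (!rule || (!rule.force && native.has(current))) return current;
    current = rule.target;
  }
  throw new Error('rewrite cycle detected');
}

// A model id is requestable when its resolved final model is servable.
function listRequestableModels(rules: Map<string, AliasRule>, native: Set<string>): string[] {
  const candidates = new Set<string>([...Array.from(native), ...Array.from(rules.keys())]);
  return Array.from(candidates)
    .filter((id) => native.has(followChain(id, rules, native)))
    .sort();
}

const nativeModels = new Set<string>(['final', 'forced-away']);
const aliasRules = new Map<string, AliasRule>([
  ['alias', { target: 'final', force: true }],         // alias with a servable target
  ['ghost', { target: 'missing', force: true }],       // alias whose target nobody serves
  ['forced-away', { target: 'missing', force: true }], // native model forced to a dead end
]);

const requestable = listRequestableModels(aliasRules, nativeModels);
console.log(requestable);
```

Note that a native model can drop out of the list when a force rule sends it to a model with no candidates, which matches the "requestable" wording in step 4.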
## Caching Rules
@ -37,13 +38,22 @@
| Mode | Condition | Result |
|------|-----------|--------|
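| `force=true` | Always | Immediately replace `source_model` with `target_model` |
| `force=false` | Only when the original model has no candidate | Use `target_model` as a fallback |
| `force=true` | Always | Replace `source_model` with `target_model` and continue evaluating the next rule |
| `force=false` | Only when the current model has no candidate | Use `target_model` as a fallback and continue evaluating the next rule |
Interpretation:
- “The original model has a candidate” means there is at least one backend that the user can access, that is active, and that serves the model according to the in-memory catalog
- If the original model has a candidate, the fallback rule is ignored
- “The current model has a candidate” means there is at least one backend that the user can access, that is active, and that serves the model according to the in-memory catalog
- If the current model has a candidate, the fallback rule is ignored and chain evaluation stops
- A force rule moves to the target regardless of whether the current model has a candidate
- If the final model has no candidate, the router does not forward the request and returns a model-not-supported error
- Admin create/update requests that would introduce a cycle into the active rewrite graph are rejected
- If a runtime cycle is found, for example after direct DB manipulation, the router returns a configuration error
- Chain evaluation uses a per-request allowed backend set and a candidate memo to avoid repeated DB lookups
Example:
`AutoModelTranslate -(Force)-> Qwen3.5 -(Force)-> Qwen/Qwen3.5-397B-A17B-FP8 -(Fallback)-> Gemma4 -(Force)-> cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit`
In the example above, if `Qwen/Qwen3.5-397B-A17B-FP8` has a candidate, the fallback is not applied and the request is routed to that model. If it has no candidate, evaluation moves to `Gemma4` and then continues with the force rule.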
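The example chain can be traced with a small, self-contained sketch. The type `Rule`, the function `resolveChain`, and the `servable` set are assumptions made for illustration only; the real resolver works against the in-memory catalog and per-request backend permissions.

```typescript
// Hypothetical shape for illustration; not the project's actual type.
type Rule = { target: string; force: boolean };

// Follows the rewrite chain from `requested` until no rule applies,
// or until a fallback rule is reached while the current model is servable.
function resolveChain(
  requested: string,
  rules: Map<string, Rule>,
  servable: Set<string>, // models with at least one allowed, active backend
): string {
  const seen = new Set<string>();
  let current = requested;
  while (true) {
    if (seen.has(current)) {
      throw new Error(`rewrite cycle detected at ${current}`);
    }
    seen.add(current);
    const rule = rules.get(current);
    if (!rule) return current;                                 // end of chain
    if (!rule.force && servable.has(current)) return current;  // fallback not needed
    current = rule.target;                                     // force, or fallback taken
  }
}

const exampleRules = new Map<string, Rule>([
  ['AutoModelTranslate', { target: 'Qwen3.5', force: true }],
  ['Qwen3.5', { target: 'Qwen/Qwen3.5-397B-A17B-FP8', force: true }],
  ['Qwen/Qwen3.5-397B-A17B-FP8', { target: 'Gemma4', force: false }],
  ['Gemma4', { target: 'cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit', force: true }],
]);

// Case 1: the Qwen model is servable, so the fallback rule is skipped.
const qwenUp = resolveChain(
  'AutoModelTranslate',
  exampleRules,
  new Set(['Qwen/Qwen3.5-397B-A17B-FP8']),
);

// Case 2: the Qwen model is not servable, so the chain continues through Gemma4.
const qwenDown = resolveChain(
  'AutoModelTranslate',
  exampleRules,
  new Set(['cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit']),
);

console.log(qwenUp);
console.log(qwenDown);
```

In the first case the result is `Qwen/Qwen3.5-397B-A17B-FP8`; in the second it is `cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit`, because `Gemma4` itself carries a force rule.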
## Admin Surface


@ -62,11 +62,12 @@ server/src/
## Model Routing
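- The requested model name is first checked against the global `model_rewrites` rules
- A `force=1` rule always converts `source_model -> target_model`
- A `force=0` rule is applied as a fallback only when no allowed, active backend serves the original model
- The requested model name is first checked against the global `model_rewrites` chain
- A `force=1` rule always converts `source_model -> target_model` and then continues checking the next rule
- A `force=0` rule is applied as a fallback only when no allowed, active backend serves the current model, and then continues checking the next rule
- Active rewrite cycles are rejected on admin create/update and are also guarded against at runtime
- If no allowed, active backend serves the final model, `/v1/chat/completions` returns a model-not-supported error
- `/v1/models` returns the union of the cached model lists of allowed, active backends
- `/v1/models` returns the union of the native models of allowed, active backends and the rewrite aliases that are actually requestable
Note:
- For detailed routing rules and cache triggers, see [docs/model-routing.md](./model-routing.md)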


@ -21,6 +21,19 @@ const router: Router = Router();
router.use('/scripts', scriptRoutes);
function sendRewriteCycleError(res: Response, rules: ReturnType<typeof ModelRewriteModel.findAll>): boolean {
const cycle = ModelCatalogService.detectRewriteCycle(rules);
if (!cycle) {
return false;
}
res.status(409).json({
error: 'Model rewrite cycle detected',
cycle,
});
return true;
}
router.get('/dashboard/summary', (req: Request, res: Response) => {
const days = req.query.days ? Number(req.query.days) : 30;
res.json(AnalyticsService.getDashboardSummary(days));
@ -303,10 +316,30 @@ router.post('/model-rewrites', (req: Request, res: Response) => {
return;
}
const sourceModel = source_model.trim();
const targetModel = target_model.trim();
const timestamp = getUtcTimestamp();
const candidateRules = [
...ModelRewriteModel.findAll(),
{
id: 0,
source_model: sourceModel,
target_model: targetModel,
is_active: is_active !== false,
force: !!force,
note,
created_at: timestamp,
updated_at: timestamp,
},
];
if (sendRewriteCycleError(res, candidateRules)) {
return;
}
try {
const rule = ModelRewriteModel.create({
source_model: source_model.trim(),
target_model: target_model.trim(),
source_model: sourceModel,
target_model: targetModel,
is_active,
force,
note,
@ -330,8 +363,40 @@ router.put('/model-rewrites/:id', (req: Request, res: Response) => {
return;
}
const data = req.body as UpdateModelRewriteData;
if (typeof data.source_model === 'string' && !data.source_model.trim()) {
res.status(400).json({ error: 'source_model cannot be empty' });
return;
}
if (typeof data.target_model === 'string' && !data.target_model.trim()) {
res.status(400).json({ error: 'target_model cannot be empty' });
return;
}
const candidateRules = ModelRewriteModel.findAll().map((rule) => {
if (rule.id !== id) {
return rule;
}
return {
...rule,
source_model: typeof data.source_model === 'string' ? data.source_model.trim() : rule.source_model,
target_model: typeof data.target_model === 'string' ? data.target_model.trim() : rule.target_model,
is_active: data.is_active !== undefined ? data.is_active : rule.is_active,
force: data.force !== undefined ? data.force : rule.force,
note: data.note !== undefined ? data.note : rule.note,
};
});
if (sendRewriteCycleError(res, candidateRules)) {
return;
}
try {
const updated = ModelRewriteModel.update(id, req.body as UpdateModelRewriteData);
const updated = ModelRewriteModel.update(id, {
...data,
source_model: typeof data.source_model === 'string' ? data.source_model.trim() : undefined,
target_model: typeof data.target_model === 'string' ? data.target_model.trim() : undefined,
});
ModelCatalogService.loadRewriteMap();
res.json(updated);
} catch (error) {


@ -1,11 +1,10 @@
import { Router, Request, Response } from 'express';
import { authenticate, AuthenticatedRequest } from './auth';
import { BackendModel } from '../models/Backend';
import { RouterService } from '../services/RouterService';
import { AnalyticsService } from '../services/AnalyticsService';
import { ScriptEngine } from '../services/ScriptEngine';
import { logger } from '../utils/logger';
import { ModelCatalogService } from '../services/ModelCatalogService';
import { ModelCatalogService, ModelRewriteCycleError } from '../services/ModelCatalogService';
import { getDetailStreamLogMode } from '../config/stream-logging';
import { ChatStreamLogAccumulator } from '../utils/streamLog';
@ -36,17 +35,14 @@ router.post('/chat/completions', async (req: AuthenticatedRequest, res: Response
const requestedModel = typeof req.body?.model === 'string' ? req.body.model : '';
await ModelCatalogService.ensureInitializedForBackends(allowedBackendIds);
const resolution = ModelCatalogService.resolveRequestedModel(requestedModel, allowedBackendIds);
const activeAllowedBackendIds = BackendModel.findActive()
.map((item) => item.id)
.filter((backendId) => allowedBackendIds.includes(backendId));
const activeAllowedBackendIds = ModelCatalogService.getActiveAllowedBackendIds(allowedBackendIds);
if (activeAllowedBackendIds.length === 0) {
AnalyticsService.logRequest({
user_id: user.id,
backend_id: 0,
endpoint: '/v1/chat/completions',
request_model: requestedModel,
routed_model: resolution.routedModel,
routed_model: requestedModel,
status_code: 403,
error_message: 'No active backends available',
detail_logged: user.detail_logging,
@ -56,7 +52,34 @@ router.post('/chat/completions', async (req: AuthenticatedRequest, res: Response
res.status(403).json({ error: 'No active backends available' });
return;
}
const candidateBackendIds = ModelCatalogService.getCandidateBackendIds(resolution.routedModel, allowedBackendIds);
let resolution: ReturnType<typeof ModelCatalogService.resolveRequestedModel>;
try {
resolution = ModelCatalogService.resolveRequestedModel(requestedModel, activeAllowedBackendIds);
} catch (error) {
const errorMsg = error instanceof ModelRewriteCycleError
? error.message
: error instanceof Error
? error.message
: 'Model rewrite resolution failed';
AnalyticsService.logRequest({
user_id: user.id,
backend_id: 0,
endpoint: '/v1/chat/completions',
request_model: requestedModel,
routed_model: requestedModel,
status_code: 500,
error_message: errorMsg,
detail_logged: user.detail_logging,
request_headers: user.detail_logging ? normalizeHeaders(req.headers) : undefined,
request_body: user.detail_logging ? req.body : undefined,
});
logger.error(`Model rewrite resolution failed for user ${user.id}: ${errorMsg}`);
res.status(500).json({ error: 'Model rewrite configuration error' });
return;
}
const candidateBackendIds = ModelCatalogService.getCandidateBackendIds(resolution.routedModel, activeAllowedBackendIds);
const backend = RouterService.selectBackend(candidateBackendIds);
if (!backend) {
AnalyticsService.logRequest({
@ -336,17 +359,23 @@ router.get('/models', async (req: AuthenticatedRequest, res: Response) => {
}
await ModelCatalogService.ensureInitializedForBackends(allowedBackendIds);
const activeAllowedBackendIds = BackendModel.findActive()
.map((item) => item.id)
.filter((backendId) => allowedBackendIds.includes(backendId));
const activeAllowedBackendIds = ModelCatalogService.getActiveAllowedBackendIds(allowedBackendIds);
if (activeAllowedBackendIds.length === 0) {
res.status(403).json({ error: 'No active backends available' });
return;
}
const models = ModelCatalogService.getModelsForAllowedBackends(activeAllowedBackendIds).map((entry) => ({
id: entry.model_id,
object: 'model',
}));
let models: Array<{ id: string; object: string }>;
try {
models = ModelCatalogService.getRequestableModelsForAllowedBackends(activeAllowedBackendIds).map((entry) => ({
id: entry.model_id,
object: 'model',
}));
} catch (error) {
const errorMsg = error instanceof Error ? error.message : 'Model rewrite resolution failed';
logger.error(`Model list resolution failed: ${errorMsg}`);
res.status(500).json({ error: 'Model rewrite configuration error' });
return;
}
res.json({ object: 'list', data: models });
});


@ -35,16 +35,34 @@ interface RewriteResolution {
requestedModel: string;
routedModel: string;
wasRewritten: boolean;
ruleType: 'none' | 'force' | 'fallback';
ruleType: 'none' | 'force' | 'fallback' | 'chain';
}
interface RewriteConfig {
id: number;
sourceModel: string;
targetModel: string;
force: boolean;
}
interface ResolutionContext {
allowedActiveBackendIds: number[];
allowedActiveBackendIdSet: Set<number>;
candidateMemo: Map<string, number[]>;
}
const DEFAULT_REFRESH_MIN_MS = 5 * 60 * 1000;
export class ModelRewriteCycleError extends Error {
cycle: string[];
constructor(cycle: string[]) {
super(`Model rewrite cycle detected: ${cycle.join(' -> ')}`);
this.name = 'ModelRewriteCycleError';
this.cycle = cycle;
}
}
export class ModelCatalogService {
private static backendModelsByBackendId = new Map<number, BackendCacheEntry>();
private static backendIdsByModel = new Map<string, Set<number>>();
@ -188,14 +206,60 @@ export class ModelCatalogService {
this.modelRewriteMap.clear();
for (const rule of ModelRewriteModel.findAll()) {
if (rule.is_active) {
this.modelRewriteMap.set(rule.source_model, {
targetModel: rule.target_model,
const sourceModel = this.normalizeModelId(rule.source_model);
const targetModel = this.normalizeModelId(rule.target_model);
this.modelRewriteMap.set(sourceModel, {
id: rule.id,
sourceModel,
targetModel,
force: rule.force,
});
}
}
}
private static createResolutionContext(allowedBackendIds: number[]): ResolutionContext {
const allowed = new Set(allowedBackendIds);
const allowedActiveBackendIds = BackendModel.findActive()
.map((backend) => backend.id)
.filter((backendId) => allowed.has(backendId));
return {
allowedActiveBackendIds,
allowedActiveBackendIdSet: new Set(allowedActiveBackendIds),
candidateMemo: new Map<string, number[]>(),
};
}
static getActiveAllowedBackendIds(allowedBackendIds: number[]): number[] {
return this.createResolutionContext(allowedBackendIds).allowedActiveBackendIds;
}
private static getCandidateBackendIdsWithContext(modelId: string, context: ResolutionContext): number[] {
const normalized = this.normalizeModelId(modelId);
const memoized = context.candidateMemo.get(normalized);
if (memoized) return memoized;
const backendIds = this.backendIdsByModel.get(normalized);
const candidates = backendIds
? Array.from(backendIds).filter((backendId) => context.allowedActiveBackendIdSet.has(backendId))
: [];
const sorted = candidates.sort((a, b) => a - b);
context.candidateMemo.set(normalized, sorted);
return sorted;
}
private static getRuleTypeFromAppliedRules(appliedRules: RewriteConfig[]): RewriteResolution['ruleType'] {
if (appliedRules.length === 0) {
return 'none';
}
if (appliedRules.length === 1) {
return appliedRules[0].force ? 'force' : 'fallback';
}
return 'chain';
}
static syncActiveBackendCacheState(): void {
const backends = BackendModel.findAll();
const backendIds = new Set(backends.map((backend) => backend.id));
@ -216,44 +280,128 @@ export class ModelCatalogService {
this.rebuildModelIndex();
}
static resolveRequestedModel(modelId: string, allowedBackendIds: number[]): RewriteResolution {
private static resolveRequestedModelWithContext(modelId: string, context: ResolutionContext): RewriteResolution {
const requestedModel = this.normalizeModelId(modelId);
const rewrite = this.modelRewriteMap.get(requestedModel);
if (!rewrite) {
return {
requestedModel,
routedModel: requestedModel,
wasRewritten: false,
ruleType: 'none',
};
const visitedModels = new Map<string, number>();
const path: string[] = [];
const appliedRules: RewriteConfig[] = [];
let currentModel = requestedModel;
const maxSteps = this.modelRewriteMap.size + 1;
for (let step = 0; step <= maxSteps; step += 1) {
const firstSeenAt = visitedModels.get(currentModel);
if (firstSeenAt !== undefined) {
throw new ModelRewriteCycleError([...path.slice(firstSeenAt), currentModel]);
}
visitedModels.set(currentModel, path.length);
path.push(currentModel);
const rewrite = this.modelRewriteMap.get(currentModel);
if (!rewrite) {
return {
requestedModel,
routedModel: currentModel,
wasRewritten: currentModel !== requestedModel,
ruleType: this.getRuleTypeFromAppliedRules(appliedRules),
};
}
if (!rewrite.force) {
const originalCandidates = this.getCandidateBackendIdsWithContext(currentModel, context);
if (originalCandidates.length > 0) {
return {
requestedModel,
routedModel: currentModel,
wasRewritten: currentModel !== requestedModel,
ruleType: this.getRuleTypeFromAppliedRules(appliedRules),
};
}
}
appliedRules.push(rewrite);
currentModel = this.normalizeModelId(rewrite.targetModel);
}
if (rewrite.force) {
return {
requestedModel,
routedModel: rewrite.targetModel,
wasRewritten: rewrite.targetModel !== requestedModel,
ruleType: 'force',
};
throw new ModelRewriteCycleError([...path, currentModel]);
}
static resolveRequestedModel(modelId: string, allowedBackendIds: number[]): RewriteResolution {
return this.resolveRequestedModelWithContext(modelId, this.createResolutionContext(allowedBackendIds));
}
static detectRewriteCycle(rules: ModelRewriteRule[]): string[] | null {
const activeRules = new Map<string, string>();
for (const rule of rules) {
if (rule.is_active) {
activeRules.set(this.normalizeModelId(rule.source_model), this.normalizeModelId(rule.target_model));
}
}
const originalCandidates = this.getCandidateBackendIds(requestedModel, allowedBackendIds);
if (originalCandidates.length > 0) {
return {
requestedModel,
routedModel: requestedModel,
wasRewritten: false,
ruleType: 'none',
};
}
const visited = new Set<string>();
const visiting = new Map<string, number>();
const path: string[] = [];
const routedModel = rewrite.targetModel;
return {
requestedModel,
routedModel,
wasRewritten: routedModel !== requestedModel,
ruleType: 'fallback',
const visit = (modelId: string): string[] | null => {
const firstSeenAt = visiting.get(modelId);
if (firstSeenAt !== undefined) {
return [...path.slice(firstSeenAt), modelId];
}
if (visited.has(modelId)) {
return null;
}
visiting.set(modelId, path.length);
path.push(modelId);
const targetModel = activeRules.get(modelId);
if (targetModel) {
const cycle = visit(targetModel);
if (cycle) {
return cycle;
}
}
path.pop();
visiting.delete(modelId);
visited.add(modelId);
return null;
};
for (const sourceModel of activeRules.keys()) {
const cycle = visit(sourceModel);
if (cycle) {
return cycle;
}
}
return null;
}
static getRequestableModelsForAllowedBackends(allowedBackendIds: number[]): BackendModelCatalogEntry[] {
const context = this.createResolutionContext(allowedBackendIds);
const requestableModelIds = new Set<string>();
const candidateModelIds = new Set<string>([
...this.backendIdsByModel.keys(),
...this.modelRewriteMap.keys(),
]);
for (const modelId of candidateModelIds) {
const resolution = this.resolveRequestedModelWithContext(modelId, context);
const routedBackendIds = this.getCandidateBackendIdsWithContext(resolution.routedModel, context);
if (routedBackendIds.length > 0) {
requestableModelIds.add(this.normalizeModelId(modelId));
}
}
return Array.from(requestableModelIds)
.sort((a, b) => a.localeCompare(b))
.map((modelId) => {
const resolution = this.resolveRequestedModelWithContext(modelId, context);
return {
model_id: modelId,
backend_ids: this.getCandidateBackendIdsWithContext(resolution.routedModel, context),
};
});
}
static getBackendCacheStatus(backendId: number): BackendModelCacheStatus {
@ -378,13 +526,7 @@ export class ModelCatalogService {
}
static getCandidateBackendIds(modelId: string, allowedBackendIds: number[]): number[] {
const normalized = this.normalizeModelId(modelId);
const backendIds = this.backendIdsByModel.get(normalized);
if (!backendIds) return [];
const allowed = new Set(allowedBackendIds);
const active = new Set(BackendModel.findActive().map((backend) => backend.id));
return Array.from(backendIds).filter((backendId) => allowed.has(backendId) && active.has(backendId));
return this.getCandidateBackendIdsWithContext(modelId, this.createResolutionContext(allowedBackendIds));
}
static getModelsForAllowedBackends(allowedBackendIds: number[]): BackendModelCatalogEntry[] {


@ -619,5 +619,206 @@ describe('OpenAI Compatible Backend Integration', () => {
expect(receivedModel).toBe('fallback-model');
expect(response.body.model).toBe('fallback-model');
});
it('should follow force rewrite chains before upstream forwarding', async () => {
let receivedModel: string | undefined;
const { server, port } = createMockBackend({
onRequest: (req) => {
if (req.path === '/v1/chat/completions') {
receivedModel = req.body.model;
}
},
chatResponse: {
id: 'force-chain-success',
model: 'chain-final-c',
choices: [{ index: 0, message: { role: 'assistant', content: 'chain' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 },
},
modelsResponse: [{ id: 'chain-final-c', object: 'model' }],
});
mockServer = server;
mockPort = port;
const userResponse = await admin.post('/admin/users').send({ name: 'Force Chain User 8-10' });
const userApiKey = userResponse.body.api_key;
const userId = userResponse.body.id;
const backendResponse = await admin.post('/admin/backends').send({
name: 'Force Chain Backend 8-10',
base_url: `http://localhost:${port}`,
});
await admin.post('/admin/permissions').send({ user_id: userId, backend_id: backendResponse.body.id });
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'chain-start-a', target_model: 'chain-mid-b', force: true })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'chain-mid-b', target_model: 'chain-final-c', force: true })).status).toBe(201);
const response = await request(app)
.post('/v1/chat/completions')
.set('Authorization', `Bearer ${userApiKey}`)
.send({ model: 'chain-start-a', messages: [] });
expect(response.status).toBe(200);
expect(receivedModel).toBe('chain-final-c');
});
it('should continue a mixed chain only when the current fallback model is unavailable', async () => {
let unavailableReceivedModel: string | undefined;
const unavailableBackend = createMockBackend({
onRequest: (req) => {
if (req.path === '/v1/chat/completions') {
unavailableReceivedModel = req.body.model;
}
},
chatResponse: {
id: 'mixed-chain-unavailable',
model: 'mixed-final-e',
choices: [{ index: 0, message: { role: 'assistant', content: 'fallback-chain' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 },
},
modelsResponse: [{ id: 'mixed-final-e', object: 'model' }],
});
mockServer = unavailableBackend.server;
mockPort = unavailableBackend.port;
const unavailableUser = await admin.post('/admin/users').send({ name: 'Mixed Chain Missing User 8-11' });
const unavailableBackendResponse = await admin.post('/admin/backends').send({
name: 'Mixed Chain Missing Backend 8-11',
base_url: `http://localhost:${unavailableBackend.port}`,
});
await admin.post('/admin/permissions').send({ user_id: unavailableUser.body.id, backend_id: unavailableBackendResponse.body.id });
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'mixed-missing-a', target_model: 'mixed-missing-b', force: true })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'mixed-missing-b', target_model: 'mixed-missing-c', force: true })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'mixed-missing-c', target_model: 'mixed-missing-d', force: false })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'mixed-missing-d', target_model: 'mixed-final-e', force: true })).status).toBe(201);
const unavailableResponse = await request(app)
.post('/v1/chat/completions')
.set('Authorization', `Bearer ${unavailableUser.body.api_key}`)
.send({ model: 'mixed-missing-a', messages: [] });
expect(unavailableResponse.status).toBe(200);
expect(unavailableReceivedModel).toBe('mixed-final-e');
let availableReceivedModel: string | undefined;
const availableBackend = createMockBackend({
onRequest: (req) => {
if (req.path === '/v1/chat/completions') {
availableReceivedModel = req.body.model;
}
},
chatResponse: {
id: 'mixed-chain-available',
model: 'mixed-available-c',
choices: [{ index: 0, message: { role: 'assistant', content: 'available-chain' }, finish_reason: 'stop' }],
usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 },
},
modelsResponse: [{ id: 'mixed-available-c', object: 'model' }, { id: 'mixed-available-e', object: 'model' }],
});
try {
const availableUser = await admin.post('/admin/users').send({ name: 'Mixed Chain Available User 8-12' });
const availableBackendResponse = await admin.post('/admin/backends').send({
name: 'Mixed Chain Available Backend 8-12',
base_url: `http://localhost:${availableBackend.port}`,
});
await admin.post('/admin/permissions').send({ user_id: availableUser.body.id, backend_id: availableBackendResponse.body.id });
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'mixed-available-a', target_model: 'mixed-available-b', force: true })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'mixed-available-b', target_model: 'mixed-available-c', force: true })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'mixed-available-c', target_model: 'mixed-available-d', force: false })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'mixed-available-d', target_model: 'mixed-available-e', force: true })).status).toBe(201);
const availableResponse = await request(app)
.post('/v1/chat/completions')
.set('Authorization', `Bearer ${availableUser.body.api_key}`)
.send({ model: 'mixed-available-a', messages: [] });
expect(availableResponse.status).toBe(200);
expect(availableReceivedModel).toBe('mixed-available-c');
} finally {
await new Promise<void>((resolve) => availableBackend.server.close(() => resolve()));
}
});
it('should expose only requestable native models and rewrite aliases from /v1/models', async () => {
const allowedBackend = createMockBackend({
modelsResponse: [
{ id: 'models-visible-final', object: 'model' },
{ id: 'models-native-forced-away', object: 'model' },
],
});
const deniedBackend = createMockBackend({
modelsResponse: [{ id: 'models-denied-final', object: 'model' }],
});
mockServer = allowedBackend.server;
mockPort = allowedBackend.port;
try {
const userResponse = await admin.post('/admin/users').send({ name: 'Requestable Models User 8-13' });
const allowedBackendResponse = await admin.post('/admin/backends').send({
name: 'Requestable Models Allowed Backend 8-13',
base_url: `http://localhost:${allowedBackend.port}`,
});
await admin.post('/admin/backends').send({
name: 'Requestable Models Denied Backend 8-13',
base_url: `http://localhost:${deniedBackend.port}`,
});
await admin.post('/admin/permissions').send({ user_id: userResponse.body.id, backend_id: allowedBackendResponse.body.id });
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'models-visible-alias', target_model: 'models-visible-final', force: true })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'models-missing-alias', target_model: 'models-missing-final', force: true })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'models-denied-alias', target_model: 'models-denied-final', force: true })).status).toBe(201);
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'models-native-forced-away', target_model: 'models-missing-final', force: true })).status).toBe(201);
const response = await request(app)
.get('/v1/models')
.set('Authorization', `Bearer ${userResponse.body.api_key}`);
expect(response.status).toBe(200);
const ids = response.body.data.map((item: any) => item.id);
expect(ids).toContain('models-visible-final');
expect(ids).toContain('models-visible-alias');
expect(ids).not.toContain('models-missing-alias');
expect(ids).not.toContain('models-denied-alias');
expect(ids).not.toContain('models-native-forced-away');
} finally {
await new Promise<void>((resolve) => deniedBackend.server.close(() => resolve()));
}
});
it('should reject active rewrite cycles while allowing inactive cycles until activation', async () => {
const selfLoop = await admin.post('/admin/model-rewrites').send({
source_model: 'cycle-self-a',
target_model: 'cycle-self-a',
force: true,
});
expect(selfLoop.status).toBe(409);
expect(selfLoop.body.error).toBe('Model rewrite cycle detected');
expect((await admin.post('/admin/model-rewrites').send({ source_model: 'cycle-active-a', target_model: 'cycle-active-b', force: true })).status).toBe(201);
const activeCycle = await admin.post('/admin/model-rewrites').send({
source_model: 'cycle-active-b',
target_model: 'cycle-active-a',
force: true,
});
expect(activeCycle.status).toBe(409);
const inactiveA = await admin.post('/admin/model-rewrites').send({
source_model: 'cycle-inactive-a',
target_model: 'cycle-inactive-b',
is_active: false,
force: true,
});
const inactiveB = await admin.post('/admin/model-rewrites').send({
source_model: 'cycle-inactive-b',
target_model: 'cycle-inactive-a',
is_active: false,
force: true,
});
expect(inactiveA.status).toBe(201);
expect(inactiveB.status).toBe(201);
const activation = await admin.put(`/admin/model-rewrites/${inactiveA.body.id}`).send({ is_active: true });
expect(activation.status).toBe(200);
const secondActivation = await admin.put(`/admin/model-rewrites/${inactiveB.body.id}`).send({ is_active: true });
expect(secondActivation.status).toBe(409);
});
});
});