fix(shared): accept tool role and multimodal content in chat schemas

The v1 chat completion proxy rejected requests with `role: "tool"` or
array-typed `content` (multimodal image/video payloads) because the
shared zod schemas were too restrictive:

- `ChatRoleSchema` was `z.enum(['system','user','assistant'])` and is now
  `z.string()`, so any role the backend supports passes through. The
  router is a proxy and has no reason to constrain which roles are
  valid; the upstream provider decides that.

- `ChatMessageSchema.content` was `z.string()` and is now
  `z.union([z.string(), z.array(z.record(z.unknown())), z.null()]).optional()`
  to accept the three shapes the OpenAI spec defines: plain text, an
  array of content-part objects (images, video frames, etc.), or null
  (e.g. assistant messages that only carry tool_calls).
  `.passthrough()` on the message object ensures extra fields like
  `tool_call_id`, `name`, `tool_calls`, etc. are forwarded untouched.

- `ChatCompletionChoiceSchema.finish_reason` was `z.string()` and is now
  `z.string().nullable().optional()`, since some providers return null
  for streaming chunks or incomplete generations.

Fixes #2, Fixes #3

Co-Authored-By: Claude <noreply@anthropic.com>
2026-04-11 18:28:35 +09:00
commit db58054fdb
parent f97e67382b
1 changed file with 31 additions and 10 deletions
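
As a rough illustration of what the relaxed schemas accept, the sketch below restates the same rules as plain TypeScript checks with no dependencies. It is not the code from this commit (which uses zod); `isPlainObject`, `isAcceptedMessage`, `isAcceptedFinishReason`, and the sample messages are hypothetical names and values chosen for the example.

```typescript
// Dependency-free sketch of the shapes the relaxed schemas are meant to accept.
// All names here are hypothetical; the commit itself expresses these rules in zod.

type JsonObject = Record<string, unknown>;

// True for non-null, non-array objects (the shape of one content part).
function isPlainObject(value: unknown): value is JsonObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Mirrors the message rules: `role` is any string; `content` may be a string,
// null, an array of objects, or absent. Extra keys are left alone.
function isAcceptedMessage(value: unknown): boolean {
  if (!isPlainObject(value)) return false;
  if (typeof value.role !== 'string') return false;
  const content = value.content;
  if (content === undefined || content === null) return true;
  if (typeof content === 'string') return true;
  return Array.isArray(content) && content.every(isPlainObject);
}

// Mirrors the finish_reason rule: a string, null, or absent.
function isAcceptedFinishReason(value: unknown): boolean {
  return value === undefined || value === null || typeof value === 'string';
}

// Sample payloads (illustrative values only).
const toolMessage = { role: 'tool', tool_call_id: 'call_1', content: '{"ok":true}' };
const multimodalMessage = {
  role: 'user',
  content: [
    { type: 'text', text: 'What is in this image?' },
    { type: 'image_url', image_url: { url: 'https://example.com/cat.png' } },
  ],
};
const toolCallOnlyMessage = { role: 'assistant', content: null, tool_calls: [] };
const numericContentMessage = { role: 'user', content: 42 };

console.log(isAcceptedMessage(toolMessage));           // true
console.log(isAcceptedMessage(multimodalMessage));     // true
console.log(isAcceptedMessage(toolCallOnlyMessage));   // true
console.log(isAcceptedMessage(numericContentMessage)); // false
console.log(isAcceptedFinishReason(null));             // true
```

The first three messages would have been rejected under the old schemas (unknown role, array content, null content); the last one is still rejected because a number is not one of the accepted `content` shapes.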
--- a/shared/schemas.ts
+++ b/shared/schemas.ts
@@ -220,13 +220,32 @@ export type CreateAdminTokenInput = z.infer<typeof CreateAdminTokenInputSchema>;
 * OpenAI v1
 * ────────────────────────────────────────────────────────────────────────── */

-export const ChatRoleSchema = z.enum(['system', 'user', 'assistant']);
+/**
+ * The router is a proxy — it must not reject roles or content shapes that
+ * a backend legitimately supports. The OpenAI spec defines `system`,
+ * `user`, `assistant`, `tool`, and `function`; other providers may add
+ * more. Accept any string so messages pass through unaltered.
+ */
+export const ChatRoleSchema = z.string();
 export type ChatRole = z.infer<typeof ChatRoleSchema>;

-export const ChatMessageSchema = z.object({
-  role: ChatRoleSchema,
-  content: z.string(),
-});
+/**
+ * `content` may be:
+ *   - a plain string (most common)
+ *   - `null` (e.g. assistant messages that only carry tool_calls)
+ *   - an array of content-part objects (multimodal: images, video, etc.)
+ *
+ * We validate the structural envelope but leave the inner content
+ * unconstrained so the backend decides what's valid.
+ */
+export const ChatMessageSchema = z
+  .object({
+    role: ChatRoleSchema,
+    content: z
+      .union([z.string(), z.array(z.record(z.unknown())), z.null()])
+      .optional(),
+  })
+  .passthrough();
 export type ChatMessage = z.infer<typeof ChatMessageSchema>;

 export const ChatCompletionRequestSchema = z
@@ -250,11 +269,13 @@ export const ChatCompletionUsageSchema = z.object({
 });
 export type ChatCompletionUsage = z.infer<typeof ChatCompletionUsageSchema>;

-export const ChatCompletionChoiceSchema = z.object({
-  index: z.number().int(),
-  message: ChatMessageSchema,
-  finish_reason: z.string(),
-});
+export const ChatCompletionChoiceSchema = z
+  .object({
+    index: z.number().int(),
+    message: ChatMessageSchema,
+    finish_reason: z.string().nullable().optional(),
+  })
+  .passthrough();
 export type ChatCompletionChoice = z.infer<typeof ChatCompletionChoiceSchema>;

 export const ChatCompletionResponseSchema = z