Add Lip Sync Studio with 9 models (Infinite Talk, Wan 2.2, LTX, Sync, LatentSync, Creatify, Veed)

Anil Matcha 2026-03-12 17:41:17 +05:30
commit d05c8448e8
6 changed files with 1057 additions and 8 deletions


@@ -2,13 +2,14 @@
> **The free, open-source alternative to Higgsfield AI.** Generate AI images and videos using 200+ state-of-the-art models — without the closed ecosystem or subscription fees.
Open Higgsfield AI is an open-source AI image, video, and cinema studio that brings Higgsfield-style creative workflows to everyone. Powered by [Muapi.ai](https://muapi.ai), it supports text-to-image, image-to-image, text-to-video, and image-to-video generation across models like Flux, Nano Banana, Midjourney, Kling, Sora, Veo, Seedream, and more — all from a sleek, modern interface you can self-host and customize.
Open Higgsfield AI is an open-source AI image, video, cinema, and lip sync studio that brings Higgsfield-style creative workflows to everyone. Powered by [Muapi.ai](https://muapi.ai), it supports text-to-image, image-to-image, text-to-video, image-to-video, and audio-driven lip sync generation across models like Flux, Nano Banana, Midjourney, Kling, Sora, Veo, Seedream, Infinite Talk, LTX Lipsync, Wan 2.2, and more — all from a sleek, modern interface you can self-host and customize.
**Why Open Higgsfield AI instead of Higgsfield AI?**
- **Free & open-source** — no subscription, no vendor lock-in
- **Self-hosted** — your data stays on your machine
- **200+ models** — text-to-image, image-to-image, text-to-video, image-to-video
- **200+ models** — text-to-image, image-to-image, text-to-video, image-to-video, lip sync
- **Multi-image input** — feed up to 14 reference images into compatible models
- **Lip Sync Studio** — animate portraits or sync lips to any audio with 9 dedicated models
- **Extensible** — add your own models, modify the UI, build on top of it
For a deep dive into the technical architecture and the philosophy behind the "Infinite Budget" cinema workflow, see our [comprehensive guide and roadmap](https://medium.com/@anilmatcha/building-open-higgsfield-ai-an-open-source-ai-cinema-studio-83c1e0a2a5f1).
@@ -20,6 +21,7 @@ For a deep dive into the technical architecture and the philosophy behind the "I
- **Image Studio** — Generate images from text prompts (50+ text-to-image models) or transform existing images (55+ image-to-image models). Switches model set automatically based on whether a reference image is provided. Quality and resolution controls visible for models that support them.
- **Multi-Image Input** — Upload up to 14 reference images for compatible edit models (Nano Banana 2 Edit, Flux Kontext Dev, GPT-4o Edit, and more). Multi-select picker with order badges, batch upload, and a "Use Selected" confirmation flow.
- **Video Studio** — Generate videos from text prompts (40+ text-to-video models) or animate a start-frame image (60+ image-to-video models). Same intelligent mode switching as Image Studio.
- **Lip Sync Studio** — Animate portrait images or sync lips on existing videos using audio. 9 dedicated models across two modes: portrait image + audio → talking video, and video + audio → lipsync video.
- **Cinema Studio** — Higgsfield AI-style interface for photorealistic cinematic shots with pro camera controls (Lens, Focal Length, Aperture)
- **Upload History** — Reference images are uploaded once and stored locally. A picker panel lets you reuse any previously uploaded image across sessions — no re-uploading.
- **Smart Controls** — Dynamic aspect ratio, resolution/quality, and duration pickers that adapt to each model's capabilities (including t2i models with resolution or quality options)
@@ -90,6 +92,43 @@ The Video Studio follows the same pattern:
| **Seedance 2.0 I2V** | Image-to-Video | ByteDance · Animate images into video · Up to 9 reference images · Aspect ratios 16:9 / 9:16 / 4:3 / 3:4 · Duration 5 / 10 / 15s · Quality basic/high |
| **Seedance 2.0 Extend** | Video Extension | ByteDance · Seamlessly continue any Seedance 2.0 generation · Preserves style, motion & audio · Optional continuation prompt · Duration 5 / 10 / 15s · Quality basic/high |
### 🎙️ Lip Sync Studio
The **Lip Sync Studio** generates audio-driven talking videos using 9 models across two input modes:
| Mode | Trigger | Description |
| :--- | :--- | :--- |
| **Portrait Image** | Default | Upload a portrait image + audio file → animated talking video |
| **Video** | Switch to Video mode | Upload an existing video + audio file → lipsync video |
#### Image-based Models (Portrait Image + Audio → Video)
| Model | Endpoint | Resolutions | Prompt |
| :--- | :--- | :--- | :--- |
| **Infinite Talk** | `infinitetalk-image-to-video` | 480p, 720p | Optional |
| **Wan 2.2 Speech to Video** | `wan2.2-speech-to-video` | 480p, 720p | Optional |
| **LTX 2.3 Lipsync** | `ltx-2.3-lipsync` | 480p, 720p, 1080p | Optional |
| **LTX 2 19B Lipsync** | `ltx-2-19b-lipsync` | 480p, 720p, 1080p | Optional |
#### Video-based Models (Video + Audio → Lipsync Video)
| Model | Endpoint | Resolutions | Prompt |
| :--- | :--- | :--- | :--- |
| **Sync Lipsync** | `sync-lipsync` | — | — |
| **LatentSync** | `latentsync-video` | — | — |
| **Creatify Lipsync** | `creatify-lipsync` | — | — |
| **Veed Lipsync** | `veed-lipsync` | — | — |
| **Infinite Talk V2V** | `infinitetalk-video-to-video` | 480p, 720p | Optional |
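
The two tables above could be backed by definitions along these lines. This is only a sketch: the array and helper names match what `LipSyncStudio.js` imports from `models.js`, but the field layout (`inputs.resolution.enum`), the `'480p'` defaults, and the use of the endpoint string as the model `id` are assumptions, not the repository's actual schema.

```javascript
// Hypothetical model definitions derived from the README tables.
// Assumed: `enum` lists the allowed resolutions, `default` is '480p',
// and each model's id equals its endpoint name.
const res = (...values) => ({ resolution: { enum: values, default: '480p' } });

const imageLipSyncModels = [
  { id: 'infinitetalk-image-to-video', name: 'Infinite Talk', hasPrompt: true, inputs: res('480p', '720p') },
  { id: 'wan2.2-speech-to-video', name: 'Wan 2.2 Speech to Video', hasPrompt: true, inputs: res('480p', '720p') },
  { id: 'ltx-2.3-lipsync', name: 'LTX 2.3 Lipsync', hasPrompt: true, inputs: res('480p', '720p', '1080p') },
  { id: 'ltx-2-19b-lipsync', name: 'LTX 2 19B Lipsync', hasPrompt: true, inputs: res('480p', '720p', '1080p') },
];

const videoLipSyncModels = [
  { id: 'sync-lipsync', name: 'Sync Lipsync', hasPrompt: false, inputs: {} },
  { id: 'latentsync-video', name: 'LatentSync', hasPrompt: false, inputs: {} },
  { id: 'creatify-lipsync', name: 'Creatify Lipsync', hasPrompt: false, inputs: {} },
  { id: 'veed-lipsync', name: 'Veed Lipsync', hasPrompt: false, inputs: {} },
  { id: 'infinitetalk-video-to-video', name: 'Infinite Talk V2V', hasPrompt: true, inputs: res('480p', '720p') },
];

const lipsyncModels = [...imageLipSyncModels, ...videoLipSyncModels];

// Look up a model across both modes; returns null when the id is unknown.
function getLipSyncModelById(id) {
  return lipsyncModels.find((m) => m.id === id) ?? null;
}

// Resolutions offered by a model, or an empty array when it has no resolution input.
function getResolutionsForLipSyncModel(id) {
  return getLipSyncModelById(id)?.inputs?.resolution?.enum ?? [];
}
```

With this shape, an empty result from `getResolutionsForLipSyncModel` is what lets the UI hide the resolution picker for models such as Sync Lipsync.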
**How it works:**
1. Select **Portrait Image** or **Video** mode using the toggle
2. Upload your portrait image (or video) using the image/video upload button
3. Upload your audio file using the audio upload button
4. Optionally enter a prompt to guide the motion style
5. Select a model and resolution (where supported), then click **Generate**
Generation history is saved separately in `lipsync_history` and pending jobs resume automatically on page reload.
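
The five steps above amount to assembling one request per generation. The following is a minimal sketch of that assembly, not the studio's actual code: `buildLipSyncRequest` is a hypothetical helper, the `image_url`, `video_url`, and `audio_url` keys follow the API notes in this README, and the `prompt` and `resolution` keys are assumed.

```javascript
// Hypothetical helper: validates the inputs for the chosen mode and
// builds the payload. `model` is assumed to carry `hasPrompt` and a
// `resolutions` array (empty when the model has no resolution option).
function buildLipSyncRequest({ mode, imageUrl, videoUrl, audioUrl, prompt, resolution, model }) {
  if (!audioUrl) throw new Error('An audio file is required');

  const payload = { audio_url: audioUrl };
  if (mode === 'image') {
    if (!imageUrl) throw new Error('A portrait image is required in Portrait Image mode');
    payload.image_url = imageUrl;
  } else if (mode === 'video') {
    if (!videoUrl) throw new Error('A source video is required in Video mode');
    payload.video_url = videoUrl;
  } else {
    throw new Error(`Unknown mode: ${mode}`);
  }

  // The prompt is optional and only sent to models that accept one.
  const trimmed = (prompt || '').trim();
  if (model.hasPrompt && trimmed) payload.prompt = trimmed;

  // Resolution is only sent when the model offers the requested value.
  if (model.resolutions.includes(resolution)) payload.resolution = resolution;

  return payload;
}
```

For example, calling it in `'video'` mode with a model whose `resolutions` is empty yields a payload containing only `video_url` and `audio_url`.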
### 🎥 Cinema Studio Controls
The **Cinema Studio** offers precise control over the virtual camera, translating your choices into optimized prompt modifiers:
@@ -150,22 +189,23 @@ src/
├── components/
│ ├── ImageStudio.js # Dual-mode t2i/i2i studio with dynamic model switching & multi-image support
│ ├── VideoStudio.js # Dual-mode t2v/i2v studio with dynamic model switching
│ ├── LipSyncStudio.js # Lip sync studio: portrait image/video + audio → talking video (9 models)
│ ├── CinemaStudio.js # Pro studio with camera controls & infinite canvas flow
│ ├── UploadPicker.js # Upload button + history panel; single & multi-image select modes
│ ├── CameraControls.js # Scrollable picker for camera/lens/focal/aperture
│ ├── Header.js # App header with settings and controls
│ ├── Header.js # App header with navigation (Image, Video, Lip Sync, Cinema Studio…)
│ ├── AuthModal.js # API key input modal
│ ├── SettingsModal.js # Settings panel for API key management
│ └── Sidebar.js # Navigation sidebar
├── lib/
│ ├── muapi.js # API client: generateImage, generateVideo, generateI2I, generateI2V, uploadFile
│ ├── models.js # 200+ model definitions with endpoints, inputs, maxImages, quality/resolution mappings
│ ├── muapi.js # API client: generateImage, generateVideo, generateI2I, generateI2V, processV2V, processLipSync, uploadFile
│ ├── models.js # 200+ model definitions: t2i, i2i, t2v, i2v, v2v, lipsync arrays with endpoints & input schemas
│ └── uploadHistory.js # localStorage CRUD + canvas thumbnail generation for upload history
├── styles/
│ ├── global.css # Global styles and animations
│ ├── studio.css # Studio-specific styles
│ └── variables.css # CSS custom properties
├── main.js # App entry point
├── main.js # App entry point & router (image / video / lipsync / cinema)
└── style.css # Tailwind imports
```
@@ -180,6 +220,8 @@ Authentication uses the `x-api-key` header. During development, a Vite proxy han
File uploads use `POST /api/v1/upload_file` (multipart/form-data) and return a hosted URL that is passed to image-conditioned models. For multi-image models the full `images_list` array is forwarded to the API in one request.
Lip sync jobs use the same two-step pattern: a dedicated `processLipSync()` method accepts `image_url` or `video_url` alongside `audio_url`, dispatches to the model's endpoint, and polls until the output video URL is available.
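
A submit-then-poll flow of this kind might look roughly like the sketch below. It is not the real `processLipSync()`: the URL paths, the `request_id` field, the `status` values, and the `outputs` array are all assumptions made for illustration. Only the `x-api-key` header and the `image_url` / `video_url` / `audio_url` inputs come from this README.

```javascript
// Hypothetical submit-and-poll sketch. `fetchImpl` is injectable so the
// flow can be exercised without a network; the paths and response fields
// below are assumed, not taken from the Muapi API documentation.
async function processLipSyncSketch({ endpoint, payload, apiKey, fetchImpl = fetch, pollMs = 2000, maxPolls = 150 }) {
  const headers = { 'Content-Type': 'application/json', 'x-api-key': apiKey };

  // Step 1: submit the job to the model's endpoint.
  const submit = await fetchImpl(`/api/v1/${endpoint}`, {
    method: 'POST',
    headers,
    body: JSON.stringify(payload),
  });
  if (!submit.ok) throw new Error(`Submit failed with status ${submit.status}`);
  const { request_id: requestId } = await submit.json();

  // Step 2: poll until the output video URL is available.
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const res = await fetchImpl(`/api/v1/predictions/${requestId}/result`, { headers });
    if (!res.ok) throw new Error(`Poll failed with status ${res.status}`);
    const data = await res.json();
    if (data.status === 'completed') return data.outputs[0];
    if (data.status === 'failed') throw new Error(data.error || 'Generation failed');
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  throw new Error('Timed out waiting for the lip sync result');
}
```

Bounding the loop with `maxPolls` keeps a stuck job from polling forever; the caller can persist `requestId` separately if it needs to resume after a page reload.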
## 🎨 Supported Model Categories
| Category | Count | Examples |
@@ -188,6 +230,7 @@ File uploads use `POST /api/v1/upload_file` (multipart/form-data) and return a h
| **Image-to-Image** | 55+ | Nano Banana 2 Edit (×14), Flux Kontext Pro, GPT-4o Edit, Seededit v3, Upscaler, Background Remover |
| **Text-to-Video** | 40+ | Kling v3, Sora 2, Veo 3, Wan 2.6, Seedance 2.0, Seedance 2.0 Extend, Seedance Pro, Hailuo 2.3, Runway Gen-3 |
| **Image-to-Video** | 60+ | Kling v2.1 I2V, Veo3 I2V, Runway I2V, Seedance 2.0 I2V, Midjourney v7 I2V, Hunyuan I2V, Wan2.2 I2V |
| **Lip Sync** | 9 | Infinite Talk I2V, Wan 2.2 Speech to Video, LTX 2.3 Lipsync, LTX 2 19B Lipsync, Sync, LatentSync, Creatify, Veed, Infinite Talk V2V |
## 🛠️ Tech Stack
@@ -205,6 +248,7 @@ Higgsfield AI is a proprietary AI video and image generation platform. **Open Hi
| **Cost** | Subscription-based | Free (open-source) |
| **Models** | Proprietary | 200+ open & commercial models |
| **Multi-image input** | Limited | Up to 14 images per request |
| **Lip sync** | No | 9 models, image & video modes |
| **Self-hosting** | No | Yes |
| **Customizable** | No | Fully hackable |
| **Data privacy** | Cloud-based | Your data stays local |


@@ -25,7 +25,7 @@ export function Header(navigate) {
const menu = document.createElement('nav');
menu.className = 'hidden lg:flex items-center gap-6 text-[13px] font-bold text-secondary';
const items = ['Explore', 'Image', 'Video', 'Edit', 'Character', 'Contests', 'Vibe Motion', 'Cinema Studio', 'AI Influencer', 'Apps', 'Assist', 'Community'];
const items = ['Explore', 'Image', 'Video', 'Lip Sync', 'Edit', 'Character', 'Contests', 'Vibe Motion', 'Cinema Studio', 'AI Influencer', 'Apps', 'Assist', 'Community'];
items.forEach(item => {
const link = document.createElement('a');
@@ -51,6 +51,7 @@ export function Header(navigate) {
if (item === 'Image') navigate('image');
else if (item === 'Video') navigate('video');
else if (item === 'Lip Sync') navigate('lipsync');
else if (item === 'Cinema Studio') navigate('cinema');
};


@@ -0,0 +1,793 @@
import { muapi } from '../lib/muapi.js';
import { lipsyncModels, imageLipSyncModels, videoLipSyncModels, getLipSyncModelById, getResolutionsForLipSyncModel } from '../lib/models.js';
import { AuthModal } from './AuthModal.js';
import { savePendingJob, removePendingJob, getPendingJobs } from '../lib/pendingJobs.js';
export function LipSyncStudio() {
const container = document.createElement('div');
container.className = 'w-full h-full flex flex-col items-center justify-center bg-app-bg relative p-4 md:p-6 overflow-y-auto custom-scrollbar overflow-x-hidden';
// --- State ---
// 'image' mode: portrait image + audio → video
// 'video' mode: existing video + audio → lipsync video
let inputMode = 'image';
let selectedModel = imageLipSyncModels[0].id;
let selectedResolution = imageLipSyncModels[0].inputs?.resolution?.default || '480p';
let uploadedImageUrl = null;
let uploadedVideoUrl = null;
let uploadedAudioUrl = null;
let dropdownOpen = null;
const getCurrentModels = () => inputMode === 'image' ? imageLipSyncModels : videoLipSyncModels;
const getCurrentModel = () => lipsyncModels.find(m => m.id === selectedModel);
// ==========================================
// 1. HERO SECTION
// ==========================================
const hero = document.createElement('div');
hero.className = 'flex flex-col items-center mb-10 md:mb-20 animate-fade-in-up transition-all duration-700';
hero.innerHTML = `
<div class="mb-10 relative group">
<div class="absolute inset-0 bg-primary/20 blur-[100px] rounded-full opacity-40 group-hover:opacity-70 transition-opacity duration-1000"></div>
<div class="relative w-24 h-24 md:w-32 md:h-32 bg-teal-900/40 rounded-3xl flex items-center justify-center border border-white/5 overflow-hidden">
<svg width="80" height="80" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1" class="text-primary opacity-20 absolute -right-4 -bottom-4">
<path d="M12 1a3 3 0 0 0-3 3v8a3 3 0 0 0 6 0V4a3 3 0 0 0-3-3z"/>
<path d="M19 10v2a7 7 0 0 1-14 0v-2"/>
<line x1="12" y1="19" x2="12" y2="23"/>
<line x1="8" y1="23" x2="16" y2="23"/>
</svg>
<div class="w-16 h-16 bg-primary/10 rounded-2xl flex items-center justify-center border border-primary/20 shadow-glow relative z-10">
<svg width="32" height="32" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="1.5" class="text-primary">
<path d="M12 1a3 3 0 0 0-3 3v8a3 3 0 0 0 6 0V4a3 3 0 0 0-3-3z"/>
<path d="M19 10v2a7 7 0 0 1-14 0v-2"/>
<line x1="12" y1="19" x2="12" y2="23"/>
<line x1="8" y1="23" x2="16" y2="23"/>
</svg>
</div>
<div class="absolute top-4 right-4 text-primary animate-pulse">🎙</div>
</div>
</div>
<h1 class="text-2xl sm:text-4xl md:text-7xl font-black text-white tracking-widest uppercase mb-4 selection:bg-primary selection:text-black text-center px-4">Lip Sync</h1>
<p class="text-secondary text-sm font-medium tracking-wide opacity-60">Animate portraits or sync lips to audio with AI</p>
`;
container.appendChild(hero);
// ==========================================
// 2. INPUT BAR
// ==========================================
const promptWrapper = document.createElement('div');
promptWrapper.className = 'w-full max-w-4xl relative z-40 animate-fade-in-up';
promptWrapper.style.animationDelay = '0.2s';
const bar = document.createElement('div');
bar.className = 'w-full bg-[#111]/90 backdrop-blur-xl border border-white/10 rounded-[1.5rem] md:rounded-[2.5rem] p-3 md:p-5 flex flex-col gap-3 md:gap-5 shadow-3xl';
// --- Mode Toggle (Image vs Video) ---
const modeToggleRow = document.createElement('div');
modeToggleRow.className = 'flex items-center gap-2 px-2';
const modeLabel = document.createElement('span');
modeLabel.className = 'text-xs text-muted font-bold uppercase tracking-widest mr-2';
modeLabel.textContent = 'Input:';
const imageModeBtn = document.createElement('button');
imageModeBtn.type = 'button';
imageModeBtn.className = 'px-4 py-1.5 rounded-xl text-xs font-bold transition-all border border-primary bg-primary/10 text-primary';
imageModeBtn.textContent = '🖼 Portrait Image';
const videoModeBtn = document.createElement('button');
videoModeBtn.type = 'button';
videoModeBtn.className = 'px-4 py-1.5 rounded-xl text-xs font-bold transition-all border border-white/10 text-muted hover:border-white/30 hover:text-white';
videoModeBtn.textContent = '🎬 Video';
modeToggleRow.appendChild(modeLabel);
modeToggleRow.appendChild(imageModeBtn);
modeToggleRow.appendChild(videoModeBtn);
bar.appendChild(modeToggleRow);
// --- Uploads Row ---
const uploadsRow = document.createElement('div');
uploadsRow.className = 'flex items-start gap-3 px-2';
// ── Image Upload ──
const imageFileInput = document.createElement('input');
imageFileInput.type = 'file';
imageFileInput.accept = 'image/*';
imageFileInput.className = 'hidden';
const imageUploadBtn = document.createElement('button');
imageUploadBtn.type = 'button';
imageUploadBtn.title = 'Upload portrait image';
imageUploadBtn.className = 'flex-shrink-0 w-14 h-14 rounded-xl border transition-all flex flex-col items-center justify-center gap-1 bg-white/5 border-white/10 hover:bg-white/10 hover:border-primary/40 group relative overflow-hidden';
imageUploadBtn.innerHTML = `
<div class="image-icon flex flex-col items-center gap-1">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" class="text-muted group-hover:text-primary transition-colors"><rect x="3" y="3" width="18" height="18" rx="2"/><circle cx="8.5" cy="8.5" r="1.5"/><polyline points="21 15 16 10 5 21"/></svg>
<span class="text-[9px] text-muted group-hover:text-primary font-bold">IMAGE</span>
</div>
<div class="image-spinner hidden items-center justify-center w-full h-full absolute inset-0"><span class="animate-spin text-primary text-sm"></span></div>
<div class="image-ready hidden flex-col items-center gap-1 absolute inset-0 bg-primary/10 rounded-xl">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" class="text-primary mt-3"><rect x="3" y="3" width="18" height="18" rx="2"/><circle cx="8.5" cy="8.5" r="1.5"/><polyline points="21 15 16 10 5 21"/><polyline points="7 18 10 15 13 18" stroke="#d9ff00" stroke-width="2.5"/></svg>
<span class="text-[9px] text-primary font-bold">READY</span>
</div>
`;
imageUploadBtn.appendChild(imageFileInput);
// ── Video Upload ──
const videoFileInput = document.createElement('input');
videoFileInput.type = 'file';
videoFileInput.accept = 'video/*';
videoFileInput.className = 'hidden';
const videoUploadBtn = document.createElement('button');
videoUploadBtn.type = 'button';
videoUploadBtn.title = 'Upload source video';
videoUploadBtn.className = 'flex-shrink-0 w-14 h-14 rounded-xl border transition-all flex flex-col items-center justify-center gap-1 bg-white/5 border-white/10 hover:bg-white/10 hover:border-primary/40 group relative overflow-hidden hidden';
videoUploadBtn.innerHTML = `
<div class="video-icon flex flex-col items-center gap-1">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" class="text-muted group-hover:text-primary transition-colors"><polygon points="23 7 16 12 23 17 23 7"/><rect x="1" y="5" width="15" height="14" rx="2" ry="2"/></svg>
<span class="text-[9px] text-muted group-hover:text-primary font-bold">VIDEO</span>
</div>
<div class="video-spinner hidden items-center justify-center w-full h-full absolute inset-0"><span class="animate-spin text-primary text-sm"></span></div>
<div class="video-ready hidden flex-col items-center gap-1 absolute inset-0 bg-primary/10 rounded-xl">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" class="text-primary mt-3"><polygon points="23 7 16 12 23 17 23 7"/><rect x="1" y="5" width="15" height="14" rx="2" ry="2"/><polyline points="7 10 10 13 15 8" stroke="#d9ff00" stroke-width="2.5"/></svg>
<span class="text-[9px] text-primary font-bold">READY</span>
</div>
`;
videoUploadBtn.appendChild(videoFileInput);
// ── Audio Upload ──
const audioFileInput = document.createElement('input');
audioFileInput.type = 'file';
audioFileInput.accept = 'audio/*';
audioFileInput.className = 'hidden';
const audioUploadBtn = document.createElement('button');
audioUploadBtn.type = 'button';
audioUploadBtn.title = 'Upload audio file';
audioUploadBtn.className = 'flex-shrink-0 w-14 h-14 rounded-xl border transition-all flex flex-col items-center justify-center gap-1 bg-white/5 border-white/10 hover:bg-white/10 hover:border-primary/40 group relative overflow-hidden';
audioUploadBtn.innerHTML = `
<div class="audio-icon flex flex-col items-center gap-1">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" class="text-muted group-hover:text-primary transition-colors"><path d="M12 1a3 3 0 0 0-3 3v8a3 3 0 0 0 6 0V4a3 3 0 0 0-3-3z"/><path d="M19 10v2a7 7 0 0 1-14 0v-2"/><line x1="12" y1="19" x2="12" y2="23"/></svg>
<span class="text-[9px] text-muted group-hover:text-primary font-bold">AUDIO</span>
</div>
<div class="audio-spinner hidden items-center justify-center w-full h-full absolute inset-0"><span class="animate-spin text-primary text-sm"></span></div>
<div class="audio-ready hidden flex-col items-center gap-1 absolute inset-0 bg-primary/10 rounded-xl">
<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" class="text-primary mt-3"><path d="M12 1a3 3 0 0 0-3 3v8a3 3 0 0 0 6 0V4a3 3 0 0 0-3-3z"/><path d="M19 10v2a7 7 0 0 1-14 0v-2"/><line x1="12" y1="19" x2="12" y2="23"/><polyline points="7 10 10 13 15 8" stroke="#d9ff00" stroke-width="2.5"/></svg>
<span class="text-[9px] text-primary font-bold">READY</span>
</div>
`;
audioUploadBtn.appendChild(audioFileInput);
// ── Prompt Textarea ──
const textarea = document.createElement('textarea');
textarea.placeholder = 'Optional: describe the talking style or motion...';
textarea.className = 'flex-1 bg-transparent text-white placeholder-muted/50 text-sm resize-none outline-none min-h-[56px] leading-relaxed pt-1';
textarea.rows = 2;
uploadsRow.appendChild(imageUploadBtn);
uploadsRow.appendChild(videoUploadBtn);
uploadsRow.appendChild(audioUploadBtn);
uploadsRow.appendChild(textarea);
bar.appendChild(uploadsRow);
// ── Status labels ──
const statusRow = document.createElement('div');
statusRow.className = 'flex items-center gap-3 px-2 text-xs text-muted';
const imageStatusLabel = document.createElement('span');
imageStatusLabel.className = 'text-muted';
imageStatusLabel.textContent = 'No image';
const audioStatusLabel = document.createElement('span');
audioStatusLabel.className = 'text-muted';
audioStatusLabel.textContent = 'No audio';
statusRow.appendChild(imageStatusLabel);
statusRow.appendChild(document.createTextNode(' · '));
statusRow.appendChild(audioStatusLabel);
bar.appendChild(statusRow);
// ── Bottom Controls Row ──
const bottomRow = document.createElement('div');
bottomRow.className = 'flex items-center gap-2 md:gap-3 flex-wrap px-2';
// Model selector
const modelBtn = document.createElement('button');
modelBtn.id = 'ls-model-btn';
modelBtn.type = 'button';
modelBtn.className = 'flex items-center gap-2 px-3 py-2 rounded-xl bg-white/5 hover:bg-white/10 border border-white/10 hover:border-primary/40 transition-all text-xs font-bold text-white group';
modelBtn.innerHTML = `<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" class="text-primary"><polygon points="23 7 16 12 23 17 23 7"/><rect x="1" y="5" width="15" height="14" rx="2" ry="2"/></svg><span id="ls-model-btn-label">${getCurrentModels()[0].name}</span><svg width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" class="text-muted group-hover:text-white transition-colors"><polyline points="6 9 12 15 18 9"/></svg>`;
// Resolution selector
const resolutionBtn = document.createElement('button');
resolutionBtn.id = 'ls-resolution-btn';
resolutionBtn.type = 'button';
resolutionBtn.className = 'flex items-center gap-2 px-3 py-2 rounded-xl bg-white/5 hover:bg-white/10 border border-white/10 hover:border-primary/40 transition-all text-xs font-bold text-white group';
resolutionBtn.innerHTML = `<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" class="text-primary"><rect x="2" y="3" width="20" height="14" rx="2" ry="2"/><line x1="8" y1="21" x2="16" y2="21"/><line x1="12" y1="17" x2="12" y2="21"/></svg><span id="ls-resolution-btn-label">${selectedResolution}</span><svg width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" class="text-muted group-hover:text-white transition-colors"><polyline points="6 9 12 15 18 9"/></svg>`;
// Generate button
const generateBtn = document.createElement('button');
generateBtn.id = 'ls-generate-btn';
generateBtn.type = 'button';
generateBtn.className = 'ml-auto px-6 py-2.5 bg-primary text-black font-black text-sm rounded-2xl hover:scale-105 active:scale-95 transition-all shadow-glow disabled:opacity-50 disabled:cursor-not-allowed disabled:hover:scale-100';
generateBtn.textContent = 'Generate ✨';
bottomRow.appendChild(modelBtn);
bottomRow.appendChild(resolutionBtn);
bottomRow.appendChild(generateBtn);
bar.appendChild(bottomRow);
promptWrapper.appendChild(bar);
container.appendChild(promptWrapper);
// ==========================================
// 3. DROPDOWN SYSTEM
// ==========================================
const dropdown = document.createElement('div');
dropdown.className = 'hidden fixed z-[100] bg-[#111] border border-white/10 rounded-2xl shadow-3xl p-2 min-w-[200px] max-h-[400px] overflow-y-auto custom-scrollbar';
dropdown.id = 'ls-dropdown';
const closeDropdown = (e) => {
if (!e || (!dropdown.contains(e.target) && !e.target.closest('[id^="ls-"]'))) {
dropdown.classList.add('hidden');
dropdownOpen = null;
}
};
const populateDropdown = (type) => {
dropdown.innerHTML = '';
if (type === 'model') {
const models = getCurrentModels();
models.forEach(m => {
const item = document.createElement('button');
item.type = 'button';
item.className = `w-full text-left px-4 py-2.5 rounded-xl text-sm transition-all hover:bg-white/10 ${m.id === selectedModel ? 'text-primary font-bold bg-primary/5' : 'text-white font-medium'}`;
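// Guard against a missing description (would otherwise render "undefined...") and only add an ellipsis when the text is truncated
item.innerHTML = `<div>${m.name}</div><div class="text-xs text-muted mt-0.5">${m.description ? `${m.description.slice(0, 60)}${m.description.length > 60 ? '...' : ''}` : ''}</div>`;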
item.onclick = () => {
selectedModel = m.id;
document.getElementById('ls-model-btn-label').textContent = m.name;
const resolutions = getResolutionsForLipSyncModel(selectedModel);
if (resolutions.length > 0) {
selectedResolution = m.inputs?.resolution?.default || resolutions[0];
document.getElementById('ls-resolution-btn-label').textContent = selectedResolution;
resolutionBtn.classList.remove('hidden');
} else {
resolutionBtn.classList.add('hidden');
}
textarea.style.display = m.hasPrompt ? '' : 'none';
closeDropdown();
};
dropdown.appendChild(item);
});
} else if (type === 'resolution') {
const resolutions = getResolutionsForLipSyncModel(selectedModel);
resolutions.forEach(r => {
const item = document.createElement('button');
item.type = 'button';
item.className = `w-full text-left px-4 py-2.5 rounded-xl text-sm transition-all hover:bg-white/10 ${r === selectedResolution ? 'text-primary font-bold bg-primary/5' : 'text-white font-medium'}`;
item.textContent = r;
item.onclick = () => {
selectedResolution = r;
document.getElementById('ls-resolution-btn-label').textContent = r;
closeDropdown();
};
dropdown.appendChild(item);
});
}
};
const openDropdown = (type, anchorBtn) => {
dropdownOpen = type;
// Populate and temporarily show off-screen to measure height
populateDropdown(type);
dropdown.style.top = '-9999px';
dropdown.style.bottom = 'auto';
dropdown.classList.remove('hidden');
const ddHeight = dropdown.offsetHeight;
const rect = anchorBtn.getBoundingClientRect();
const spaceBelow = window.innerHeight - rect.bottom - 8;
const spaceAbove = rect.top - 8;
if (spaceBelow >= ddHeight || spaceBelow >= spaceAbove) {
dropdown.style.top = `${rect.bottom + 8}px`;
dropdown.style.bottom = 'auto';
dropdown.style.maxHeight = `${Math.max(150, spaceBelow - 8)}px`;
} else {
dropdown.style.top = 'auto';
dropdown.style.bottom = `${window.innerHeight - rect.top + 8}px`;
dropdown.style.maxHeight = `${Math.max(150, spaceAbove - 8)}px`;
}
dropdown.style.left = `${Math.min(rect.left, window.innerWidth - 220)}px`;
};
modelBtn.onclick = (e) => { e.stopPropagation(); if (dropdownOpen === 'model') { closeDropdown(); } else { openDropdown('model', modelBtn); } };
resolutionBtn.onclick = (e) => { e.stopPropagation(); if (dropdownOpen === 'resolution') { closeDropdown(); } else { openDropdown('resolution', resolutionBtn); } };
window.addEventListener('click', closeDropdown);
container.appendChild(dropdown);
// ==========================================
// 4. MODE SWITCHING LOGIC
// ==========================================
const updateUIForMode = () => {
if (inputMode === 'image') {
imageModeBtn.className = 'px-4 py-1.5 rounded-xl text-xs font-bold transition-all border border-primary bg-primary/10 text-primary';
videoModeBtn.className = 'px-4 py-1.5 rounded-xl text-xs font-bold transition-all border border-white/10 text-muted hover:border-white/30 hover:text-white';
imageUploadBtn.classList.remove('hidden');
videoUploadBtn.classList.add('hidden');
} else {
videoModeBtn.className = 'px-4 py-1.5 rounded-xl text-xs font-bold transition-all border border-primary bg-primary/10 text-primary';
imageModeBtn.className = 'px-4 py-1.5 rounded-xl text-xs font-bold transition-all border border-white/10 text-muted hover:border-white/30 hover:text-white';
videoUploadBtn.classList.remove('hidden');
imageUploadBtn.classList.add('hidden');
}
// Switch to first model of new mode
const models = getCurrentModels();
selectedModel = models[0].id;
document.getElementById('ls-model-btn-label').textContent = models[0].name;
// Update resolution
const resolutions = getResolutionsForLipSyncModel(selectedModel);
if (resolutions.length > 0) {
selectedResolution = models[0].inputs?.resolution?.default || resolutions[0];
document.getElementById('ls-resolution-btn-label').textContent = selectedResolution;
resolutionBtn.classList.remove('hidden');
} else {
resolutionBtn.classList.add('hidden');
}
// Show/hide prompt
textarea.style.display = models[0].hasPrompt ? '' : 'none';
};
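imageModeBtn.onclick = () => {
if (inputMode === 'image') return;
inputMode = 'image';
uploadedVideoUrl = null;
updateVideoUploadState('idle');
// The image and video states share imageStatusLabel; reset it last so it reads "No image" in image mode
updateImageUploadState('idle');
updateUIForMode();
};
videoModeBtn.onclick = () => {
if (inputMode === 'video') return;
inputMode = 'video';
uploadedImageUrl = null;
updateImageUploadState('idle');
// Reset the shared status label last so it reads "No video" in video mode
updateVideoUploadState('idle');
updateUIForMode();
};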
// ==========================================
// 5. UPLOAD HANDLERS
// ==========================================
const updateImageUploadState = (state, filename) => {
const icon = imageUploadBtn.querySelector('.image-icon');
const spinner = imageUploadBtn.querySelector('.image-spinner');
const ready = imageUploadBtn.querySelector('.image-ready');
if (state === 'idle') {
icon.classList.remove('hidden'); icon.classList.add('flex');
spinner.classList.add('hidden'); spinner.classList.remove('flex');
ready.classList.add('hidden'); ready.classList.remove('flex');
imageUploadBtn.classList.remove('border-primary/60');
imageUploadBtn.classList.add('border-white/10');
imageUploadBtn.title = 'Upload portrait image';
imageStatusLabel.textContent = 'No image';
imageStatusLabel.className = 'text-muted';
} else if (state === 'loading') {
icon.classList.add('hidden'); icon.classList.remove('flex');
spinner.classList.remove('hidden'); spinner.classList.add('flex');
ready.classList.add('hidden'); ready.classList.remove('flex');
} else if (state === 'ready') {
icon.classList.add('hidden'); icon.classList.remove('flex');
spinner.classList.add('hidden'); spinner.classList.remove('flex');
ready.classList.remove('hidden'); ready.classList.add('flex');
imageUploadBtn.classList.remove('border-white/10');
imageUploadBtn.classList.add('border-primary/60');
imageUploadBtn.title = `${filename} — click to clear`;
imageStatusLabel.textContent = `${filename}`;
imageStatusLabel.className = 'text-primary';
}
};
const updateVideoUploadState = (state, filename) => {
const icon = videoUploadBtn.querySelector('.video-icon');
const spinner = videoUploadBtn.querySelector('.video-spinner');
const ready = videoUploadBtn.querySelector('.video-ready');
if (state === 'idle') {
icon.classList.remove('hidden'); icon.classList.add('flex');
spinner.classList.add('hidden'); spinner.classList.remove('flex');
ready.classList.add('hidden'); ready.classList.remove('flex');
videoUploadBtn.classList.remove('border-primary/60');
videoUploadBtn.classList.add('border-white/10');
videoUploadBtn.title = 'Upload source video';
imageStatusLabel.textContent = 'No video';
imageStatusLabel.className = 'text-muted';
} else if (state === 'loading') {
icon.classList.add('hidden'); icon.classList.remove('flex');
spinner.classList.remove('hidden'); spinner.classList.add('flex');
ready.classList.add('hidden'); ready.classList.remove('flex');
} else if (state === 'ready') {
icon.classList.add('hidden'); icon.classList.remove('flex');
spinner.classList.add('hidden'); spinner.classList.remove('flex');
ready.classList.remove('hidden'); ready.classList.add('flex');
videoUploadBtn.classList.remove('border-white/10');
videoUploadBtn.classList.add('border-primary/60');
videoUploadBtn.title = `${filename} — click to clear`;
imageStatusLabel.textContent = `${filename}`;
imageStatusLabel.className = 'text-primary';
}
};
const updateAudioUploadState = (state, filename) => {
const icon = audioUploadBtn.querySelector('.audio-icon');
const spinner = audioUploadBtn.querySelector('.audio-spinner');
const ready = audioUploadBtn.querySelector('.audio-ready');
if (state === 'idle') {
icon.classList.remove('hidden'); icon.classList.add('flex');
spinner.classList.add('hidden'); spinner.classList.remove('flex');
ready.classList.add('hidden'); ready.classList.remove('flex');
audioUploadBtn.classList.remove('border-primary/60');
audioUploadBtn.classList.add('border-white/10');
audioUploadBtn.title = 'Upload audio file';
audioStatusLabel.textContent = 'No audio';
audioStatusLabel.className = 'text-muted';
} else if (state === 'loading') {
icon.classList.add('hidden'); icon.classList.remove('flex');
spinner.classList.remove('hidden'); spinner.classList.add('flex');
ready.classList.add('hidden'); ready.classList.remove('flex');
} else if (state === 'ready') {
icon.classList.add('hidden'); icon.classList.remove('flex');
spinner.classList.add('hidden'); spinner.classList.remove('flex');
ready.classList.remove('hidden'); ready.classList.add('flex');
audioUploadBtn.classList.remove('border-white/10');
audioUploadBtn.classList.add('border-primary/60');
audioUploadBtn.title = `${filename} — click to clear`;
audioStatusLabel.textContent = `${filename}`;
audioStatusLabel.className = 'text-primary';
}
};
imageUploadBtn.onclick = async (e) => {
e.stopPropagation();
if (uploadedImageUrl) {
uploadedImageUrl = null;
updateImageUploadState('idle');
return;
}
imageFileInput.click();
};
imageFileInput.onchange = async (e) => {
const file = e.target.files[0];
if (!file) return;
const apiKey = localStorage.getItem('muapi_key');
if (!apiKey) { AuthModal(() => imageFileInput.click()); return; }
updateImageUploadState('loading');
try {
uploadedImageUrl = await muapi.uploadFile(file);
updateImageUploadState('ready', file.name);
} catch (err) {
updateImageUploadState('idle');
alert(`Image upload failed: ${err.message}`);
}
imageFileInput.value = '';
};
videoUploadBtn.onclick = async (e) => {
e.stopPropagation();
if (uploadedVideoUrl) {
uploadedVideoUrl = null;
updateVideoUploadState('idle');
return;
}
videoFileInput.click();
};
videoFileInput.onchange = async (e) => {
const file = e.target.files[0];
if (!file) return;
const apiKey = localStorage.getItem('muapi_key');
if (!apiKey) { AuthModal(() => videoFileInput.click()); return; }
updateVideoUploadState('loading');
try {
uploadedVideoUrl = await muapi.uploadFile(file);
updateVideoUploadState('ready', file.name);
} catch (err) {
updateVideoUploadState('idle');
alert(`Video upload failed: ${err.message}`);
}
videoFileInput.value = '';
};
audioUploadBtn.onclick = async (e) => {
e.stopPropagation();
if (uploadedAudioUrl) {
uploadedAudioUrl = null;
updateAudioUploadState('idle');
return;
}
audioFileInput.click();
};
audioFileInput.onchange = async (e) => {
const file = e.target.files[0];
if (!file) return;
const apiKey = localStorage.getItem('muapi_key');
if (!apiKey) { AuthModal(() => audioFileInput.click()); return; }
updateAudioUploadState('loading');
try {
uploadedAudioUrl = await muapi.uploadFile(file);
updateAudioUploadState('ready', file.name);
} catch (err) {
updateAudioUploadState('idle');
alert(`Audio upload failed: ${err.message}`);
}
audioFileInput.value = '';
};
// Hide resolution if first model has none
if (getResolutionsForLipSyncModel(selectedModel).length === 0) {
resolutionBtn.classList.add('hidden');
}
// ==========================================
// 6. CANVAS AREA + HISTORY
// ==========================================
const generationHistory = [];
const historySidebar = document.createElement('div');
historySidebar.className = 'fixed right-0 top-0 h-full w-20 md:w-24 bg-black/60 backdrop-blur-xl border-l border-white/5 z-50 flex flex-col items-center py-4 gap-3 overflow-y-auto transition-all duration-500 translate-x-full opacity-0';
historySidebar.id = 'lipsync-history-sidebar';
const historyLabel = document.createElement('div');
historyLabel.className = 'text-[9px] font-bold text-muted uppercase tracking-widest mb-2';
historyLabel.textContent = 'History';
historySidebar.appendChild(historyLabel);
const historyList = document.createElement('div');
historyList.className = 'flex flex-col gap-2 w-full px-2';
historySidebar.appendChild(historyList);
container.appendChild(historySidebar);
// Main canvas
const canvas = document.createElement('div');
canvas.className = 'absolute inset-0 flex flex-col items-center justify-center p-4 min-[800px]:p-16 z-10 opacity-0 pointer-events-none transition-all duration-1000 translate-y-10 scale-95';
const videoContainer = document.createElement('div');
videoContainer.className = 'relative group';
const resultVideo = document.createElement('video');
resultVideo.className = 'max-h-[60vh] max-w-[80vw] rounded-3xl shadow-3xl border border-white/10 interactive-glow object-contain';
resultVideo.controls = true;
resultVideo.loop = true;
resultVideo.autoplay = true;
resultVideo.muted = false;
resultVideo.playsInline = true;
videoContainer.appendChild(resultVideo);
const canvasControls = document.createElement('div');
canvasControls.className = 'mt-6 flex gap-3 opacity-0 transition-opacity delay-500 duration-500 justify-center';
const regenerateBtn = document.createElement('button');
regenerateBtn.className = 'bg-white/10 hover:bg-white/20 px-6 py-2.5 rounded-2xl text-xs font-bold transition-all border border-white/5 backdrop-blur-lg text-white';
regenerateBtn.textContent = '↻ Regenerate';
const downloadBtn = document.createElement('button');
downloadBtn.className = 'bg-primary text-black px-6 py-2.5 rounded-2xl text-xs font-bold transition-all shadow-glow active:scale-95';
downloadBtn.textContent = '↓ Download';
const newBtn = document.createElement('button');
newBtn.className = 'bg-white/10 hover:bg-white/20 px-6 py-2.5 rounded-2xl text-xs font-bold transition-all border border-white/5 backdrop-blur-lg text-white';
newBtn.textContent = '+ New';
canvasControls.appendChild(regenerateBtn);
canvasControls.appendChild(downloadBtn);
canvasControls.appendChild(newBtn);
canvas.appendChild(videoContainer);
canvas.appendChild(canvasControls);
container.appendChild(canvas);
const showVideoInCanvas = (videoUrl) => {
hero.classList.add('hidden');
promptWrapper.classList.add('hidden');
resultVideo.src = videoUrl;
resultVideo.onloadeddata = () => {
canvas.classList.remove('opacity-0', 'pointer-events-none', 'translate-y-10', 'scale-95');
canvas.classList.add('opacity-100', 'translate-y-0', 'scale-100');
canvasControls.classList.remove('opacity-0');
canvasControls.classList.add('opacity-100');
};
};
const addToHistory = (entry) => {
generationHistory.unshift(entry);
localStorage.setItem('lipsync_history', JSON.stringify(generationHistory.slice(0, 30)));
historySidebar.classList.remove('translate-x-full', 'opacity-0');
historySidebar.classList.add('translate-x-0', 'opacity-100');
renderHistory();
};
const renderHistory = () => {
historyList.innerHTML = '';
generationHistory.forEach((entry, idx) => {
const thumb = document.createElement('div');
thumb.className = `relative group/thumb cursor-pointer rounded-xl overflow-hidden border-2 transition-all duration-300 ${idx === 0 ? 'border-primary shadow-glow' : 'border-white/10 hover:border-white/30'}`;
thumb.innerHTML = `
<video src="${entry.url}" preload="metadata" muted class="w-full aspect-square object-cover"></video>
<div class="absolute inset-0 bg-black/60 opacity-0 group-hover/thumb:opacity-100 transition-opacity flex items-center justify-center">
<button class="hist-download p-1.5 bg-primary rounded-lg text-black hover:scale-110 transition-transform" title="Download">
<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="3"><path d="M21 15v4a2 2 0 01-2 2H5a2 2 0 01-2-2v-4M7 10l5 5 5-5M12 15V3"/></svg>
</button>
</div>
`;
thumb.onclick = (e) => {
if (e.target.closest('.hist-download')) { downloadFile(entry.url, `lipsync-${entry.id || idx}.mp4`); return; }
showVideoInCanvas(entry.url);
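// Only reset the thumbnails themselves; a bare 'div' selector would also match each thumbnail's hover overlay.
historyList.querySelectorAll(':scope > div').forEach(t => { t.classList.remove('border-primary', 'shadow-glow'); t.classList.add('border-white/10'); });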
thumb.classList.remove('border-white/10');
thumb.classList.add('border-primary', 'shadow-glow');
};
historyList.appendChild(thumb);
});
};
const downloadFile = async (url, filename) => {
try {
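const response = await fetch(url);
// Treat HTTP errors as failures so the catch below falls back to opening the URL directly.
if (!response.ok) throw new Error(`Download failed: ${response.status}`);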
const blob = await response.blob();
const blobUrl = URL.createObjectURL(blob);
const a = document.createElement('a');
a.href = blobUrl; a.download = filename;
document.body.appendChild(a); a.click();
document.body.removeChild(a);
URL.revokeObjectURL(blobUrl);
} catch { window.open(url, '_blank'); }
};
// Load history
try {
const saved = JSON.parse(localStorage.getItem('lipsync_history') || '[]');
if (saved.length > 0) {
saved.forEach(e => generationHistory.push(e));
historySidebar.classList.remove('translate-x-full', 'opacity-0');
historySidebar.classList.add('translate-x-0', 'opacity-100');
renderHistory();
}
} catch { /* ignore */ }
// Resume pending jobs
(async () => {
const pending = getPendingJobs('lipsync');
if (!pending.length) return;
const apiKey = localStorage.getItem('muapi_key');
if (!apiKey) return;
const banner = document.createElement('div');
banner.className = 'fixed top-4 left-1/2 -translate-x-1/2 z-[200] bg-[#111] border border-white/10 text-white text-sm px-5 py-3 rounded-2xl shadow-xl flex items-center gap-3';
banner.innerHTML = `<span class="animate-spin text-primary">◌</span> <span class="banner-text">Resuming ${pending.length} pending generation${pending.length > 1 ? 's' : ''}…</span>`;
document.body.appendChild(banner);
let remaining = pending.length;
pending.forEach(async (job) => {
const elapsedAttempts = Math.floor((Date.now() - job.submittedAt) / job.interval);
const attemptsLeft = Math.max(1, job.maxAttempts - elapsedAttempts);
try {
const result = await muapi.pollForResult(job.requestId, apiKey, attemptsLeft, job.interval);
const url = result.outputs?.[0] || result.url || result.output?.url;
if (url) addToHistory({ id: job.requestId, url, ...job.historyMeta, timestamp: new Date().toISOString() });
} catch (e) { console.warn('[LipSyncStudio] Pending job failed:', job.requestId, e.message); }
finally {
removePendingJob(job.requestId);
remaining--;
if (remaining === 0) banner.remove();
else banner.querySelector('.banner-text').textContent = `Resuming ${remaining} pending generation${remaining > 1 ? 's' : ''}…`;
}
});
})();
// ==========================================
// 7. CANVAS BUTTON HANDLERS
// ==========================================
downloadBtn.onclick = () => {
const current = resultVideo.src;
if (current) {
const entry = generationHistory.find(e => e.url === current);
downloadFile(current, `lipsync-${entry?.id || 'clip'}.mp4`);
}
};
regenerateBtn.onclick = () => generateBtn.click();
newBtn.onclick = () => {
canvas.classList.add('opacity-0', 'pointer-events-none', 'translate-y-10', 'scale-95');
canvas.classList.remove('opacity-100', 'translate-y-0', 'scale-100');
canvasControls.classList.add('opacity-0');
canvasControls.classList.remove('opacity-100');
hero.classList.remove('hidden', 'opacity-0', 'scale-95', '-translate-y-10', 'pointer-events-none');
promptWrapper.classList.remove('hidden', 'opacity-40');
textarea.value = '';
textarea.focus();
};
// ==========================================
// 8. GENERATION LOGIC
// ==========================================
generateBtn.onclick = async () => {
const model = getCurrentModel();
const prompt = textarea.value.trim();
// Validation
if (!uploadedAudioUrl) {
alert('Please upload an audio file first.');
return;
}
if (inputMode === 'image' && !uploadedImageUrl) {
alert('Please upload a portrait image first.');
return;
}
if (inputMode === 'video' && !uploadedVideoUrl) {
alert('Please upload a source video first.');
return;
}
const apiKey = localStorage.getItem('muapi_key');
if (!apiKey) { AuthModal(() => generateBtn.click()); return; }
hero.classList.add('opacity-0', 'scale-95', '-translate-y-10', 'pointer-events-none');
generateBtn.disabled = true;
generateBtn.innerHTML = `<span class="animate-spin inline-block mr-2 text-black">◌</span> Generating...`;
let hadError = false;
let capturedRequestId = null;
const historyMeta = { prompt, model: selectedModel };
const onRequestId = (rid) => {
capturedRequestId = rid;
savePendingJob({ requestId: rid, studioType: 'lipsync', historyMeta, maxAttempts: 900, interval: 2000, submittedAt: Date.now() });
};
try {
const lipsyncParams = {
model: selectedModel,
audio_url: uploadedAudioUrl,
onRequestId
};
if (inputMode === 'image') {
lipsyncParams.image_url = uploadedImageUrl;
} else {
lipsyncParams.video_url = uploadedVideoUrl;
}
if (prompt && model?.hasPrompt) lipsyncParams.prompt = prompt;
const resolutions = getResolutionsForLipSyncModel(selectedModel);
if (resolutions.length > 0) lipsyncParams.resolution = selectedResolution;
// -1 means "random seed"; processLipSync leaves it out of the request payload.
if (model?.hasSeed) lipsyncParams.seed = -1;
const res = await muapi.processLipSync(lipsyncParams);
console.log('[LipSyncStudio] Response:', res);
if (res && res.url) {
if (capturedRequestId) removePendingJob(capturedRequestId);
const genId = res.id || capturedRequestId || Date.now().toString();
addToHistory({ id: genId, url: res.url, prompt, model: selectedModel, timestamp: new Date().toISOString() });
showVideoInCanvas(res.url);
} else {
throw new Error('No video URL returned by API');
}
} catch (e) {
hadError = true;
if (capturedRequestId) removePendingJob(capturedRequestId);
console.error(e);
hero.classList.remove('opacity-0', 'scale-95', '-translate-y-10', 'pointer-events-none');
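// Use textContent so API error text is never parsed as HTML, and tolerate non-Error throwables.
generateBtn.textContent = `Error: ${String(e?.message ?? e).slice(0, 60)}`;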
setTimeout(() => { generateBtn.innerHTML = `Generate ✨`; }, 4000);
} finally {
generateBtn.disabled = false;
if (!hadError) generateBtn.innerHTML = `Generate ✨`;
}
};
return container;
}
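
The pending-job resume step above works out how many polls are still left from the time elapsed since submission. A minimal standalone sketch of that arithmetic, assuming the job shape passed to `savePendingJob` (`submittedAt`, `interval`, `maxAttempts`). The helper name `remainingAttempts` is hypothetical and not part of this commit:

```javascript
// Hypothetical helper (not in the commit): mirrors the attemptsLeft arithmetic
// used when resuming pending lip sync jobs.
function remainingAttempts(job, now = Date.now()) {
  // Polls that would already have run had the page stayed open.
  const elapsedAttempts = Math.floor((now - job.submittedAt) / job.interval);
  // Always allow at least one poll so a finished job can still be collected.
  return Math.max(1, job.maxAttempts - elapsedAttempts);
}

// Same budget the studio saves: 900 attempts, 2000 ms apart.
const sampleJob = { submittedAt: 0, interval: 2000, maxAttempts: 900 };
console.log(remainingAttempts(sampleJob, 60000));   // 870
console.log(remainingAttempts(sampleJob, 3600000)); // 1
```

With the 900 x 2000 ms budget, a job is polled for up to 30 minutes in total. After that, a resumed job gets a single final poll rather than none.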

@ -8013,4 +8013,149 @@ export const v2vModels = [
}
];
// ─── LipSync / Speech-to-Video models ────────────────────────────────────────
// Image-based: portrait image + audio → talking video
// Video-based: existing video + audio → lipsync video
export const lipsyncModels = [
// ── Image + Audio → Video ──────────────────────────────────────────────────
{
"id": "infinitetalk-image-to-video",
"name": "Infinite Talk",
"endpoint": "infinitetalk-image-to-video",
"family": "infinitetalk",
"category": "image",
"hasPrompt": true,
"description": "Animate a portrait image into a talking video driven by audio.",
"inputs": {
"resolution": {
"type": "string",
"title": "Resolution",
"name": "resolution",
"enum": ["480p", "720p"],
"default": "480p"
}
}
},
{
"id": "wan2.2-speech-to-video",
"name": "Wan 2.2 Speech to Video",
"endpoint": "wan2.2-speech-to-video",
"family": "wan",
"category": "image",
"hasPrompt": true,
"description": "Generate a talking portrait video from an image and audio using Wan 2.2.",
"inputs": {
"resolution": {
"type": "string",
"title": "Resolution",
"name": "resolution",
"enum": ["480p", "720p"],
"default": "480p"
}
}
},
{
"id": "ltx-2.3-lipsync",
"name": "LTX 2.3 Lipsync",
"endpoint": "ltx-2.3-lipsync",
"family": "ltx",
"category": "image",
"hasPrompt": true,
"hasSeed": true,
"description": "High-quality lipsync from portrait image and audio using LTX 2.3.",
"inputs": {
"resolution": {
"type": "string",
"title": "Resolution",
"name": "resolution",
"enum": ["480p", "720p", "1080p"],
"default": "720p"
}
}
},
{
"id": "ltx-2-19b-lipsync",
"name": "LTX 2 19B Lipsync",
"endpoint": "ltx-2-19b-lipsync",
"family": "ltx",
"category": "image",
"hasPrompt": true,
"description": "Lipsync from portrait image and audio using LTX 2 19B model.",
"inputs": {
"resolution": {
"type": "string",
"title": "Resolution",
"name": "resolution",
"enum": ["480p", "720p", "1080p"],
"default": "720p"
}
}
},
// ── Video + Audio → Video ──────────────────────────────────────────────────
{
"id": "sync-lipsync",
"name": "Sync Lipsync",
"endpoint": "sync-lipsync",
"family": "lipsync",
"category": "video",
"hasPrompt": false,
"description": "Generate realistic lipsync animations from audio using Sync's advanced algorithms."
},
{
"id": "latent-sync",
"name": "LatentSync",
"endpoint": "latentsync-video",
"family": "lipsync",
"category": "video",
"hasPrompt": false,
"description": "Video-to-video lipsync using LatentSync for high-quality audio-driven lip animations."
},
{
"id": "creatify-lipsync",
"name": "Creatify Lipsync",
"endpoint": "creatify-lipsync",
"family": "lipsync",
"category": "video",
"hasPrompt": false,
"description": "Realistic lipsync video optimized for speed, quality, and consistency by Creatify."
},
{
"id": "veed-lipsync",
"name": "Veed Lipsync",
"endpoint": "veed-lipsync",
"family": "lipsync",
"category": "video",
"hasPrompt": false,
"description": "Generate realistic lipsync from any audio using VEED's latest model."
},
{
"id": "infinitetalk-video-to-video",
"name": "Infinite Talk V2V",
"endpoint": "infinitetalk-video-to-video",
"family": "infinitetalk",
"category": "video",
"hasPrompt": true,
"description": "Apply audio-driven lipsync to an existing video using Infinite Talk.",
"inputs": {
"resolution": {
"type": "string",
"title": "Resolution",
"name": "resolution",
"enum": ["480p", "720p"],
"default": "480p"
}
}
}
];
export const getLipSyncModelById = (id) => lipsyncModels.find(m => m.id === id);
export const getResolutionsForLipSyncModel = (id) => {
const model = lipsyncModels.find(m => m.id === id);
return model?.inputs?.resolution?.enum || [];
};
export const imageLipSyncModels = lipsyncModels.filter(m => m.category === 'image');
export const videoLipSyncModels = lipsyncModels.filter(m => m.category === 'video');
export const getV2VModelById = (id) => v2vModels.find(m => m.id === id);
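
As a usage sketch for the lookup helpers above: the snippet below redefines `getResolutionsForLipSyncModel` against a trimmed, two-entry copy of `lipsyncModels` so it runs on its own. The trimmed array is for illustration only; the real list lives in `models.js`.

```javascript
// Trimmed copy of two entries from lipsyncModels, for illustration only.
const lipsyncModels = [
  {
    id: 'ltx-2.3-lipsync',
    endpoint: 'ltx-2.3-lipsync',
    category: 'image',
    inputs: { resolution: { enum: ['480p', '720p', '1080p'], default: '720p' } }
  },
  // Video-based models in this commit declare no "inputs" block at all.
  { id: 'latent-sync', endpoint: 'latentsync-video', category: 'video' }
];

// Same body as the exported helper: optional chaining falls through to []
// for unknown ids and for models without a resolution input.
const getResolutionsForLipSyncModel = (id) => {
  const model = lipsyncModels.find(m => m.id === id);
  return model?.inputs?.resolution?.enum || [];
};

console.log(getResolutionsForLipSyncModel('ltx-2.3-lipsync')); // [ '480p', '720p', '1080p' ]
console.log(getResolutionsForLipSyncModel('latent-sync'));     // []
console.log(getResolutionsForLipSyncModel('unknown-id'));      // []
```

The studio relies on the empty-array case: it hides the resolution button when the list is empty and only sends `resolution` when the list is non-empty.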

@ -1,4 +1,4 @@
import { getModelById, getVideoModelById, getI2IModelById, getI2VModelById, getV2VModelById } from './models.js';
import { getModelById, getVideoModelById, getI2IModelById, getI2VModelById, getV2VModelById, getLipSyncModelById } from './models.js';
export class MuapiClient {
constructor() {
@ -444,6 +444,68 @@ export class MuapiClient {
}
}
/**
* Processes lipsync / speech-to-video generation.
* Supports image + audio → video and video + audio → video models.
* @param {Object} params
* @param {string} params.model - lipsyncModel id
* @param {string} [params.image_url] - Portrait image URL (image-based models)
* @param {string} [params.video_url] - Source video URL (video-based models)
* @param {string} params.audio_url - Audio file URL
* @param {string} [params.prompt] - Optional prompt (for models that support it)
* @param {string} [params.resolution] - Output resolution
* @param {number} [params.seed] - Optional seed (-1 for random)
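* @param {Function} [params.onRequestId] - Called when request_id is received
* @returns {Promise<Object>} Poll result with a normalized `url` field, or the raw submit response when no request id is returned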
*/
async processLipSync(params) {
const key = this.getKey();
const modelInfo = getLipSyncModelById(params.model);
const endpoint = modelInfo?.endpoint || params.model;
const url = `${this.baseUrl}/api/v1/${endpoint}`;
const finalPayload = {};
if (params.audio_url) finalPayload.audio_url = params.audio_url;
if (params.image_url) finalPayload.image_url = params.image_url;
if (params.video_url) finalPayload.video_url = params.video_url;
if (params.prompt) finalPayload.prompt = params.prompt;
if (params.resolution) finalPayload.resolution = params.resolution;
if (params.seed !== undefined && params.seed !== -1) finalPayload.seed = params.seed;
console.log('[Muapi] LipSync Request:', url);
console.log('[Muapi] LipSync Payload:', finalPayload);
try {
const response = await fetch(url, {
method: 'POST',
headers: { 'Content-Type': 'application/json', 'x-api-key': key },
body: JSON.stringify(finalPayload)
});
if (!response.ok) {
const errText = await response.text();
console.error('[Muapi] LipSync API Error:', errText);
throw new Error(`API Request Failed: ${response.status} ${response.statusText} - ${errText.slice(0, 100)}`);
}
const submitData = await response.json();
console.log('[Muapi] LipSync Submit Response:', submitData);
const requestId = submitData.request_id || submitData.id;
if (!requestId) return submitData;
if (params.onRequestId) params.onRequestId(requestId);
const result = await this.pollForResult(requestId, key, 900, 2000);
const videoUrl = result.outputs?.[0] || result.url || result.output?.url;
console.log('[Muapi] LipSync Result URL:', videoUrl);
return { ...result, url: videoUrl };
} catch (error) {
console.error('[Muapi] LipSync Error:', error);
throw error;
}
}
getDimensionsFromAR(ar) {
// Base unit 1024 (Flux standard)
switch (ar) {
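
To make the payload rules in `processLipSync` easier to see in isolation, here is a small sketch that pulls them into a pure function. The name `buildLipSyncPayload` and the example URLs are hypothetical; the commit builds the payload inline, and this sketch only assumes the same field names.

```javascript
// Hypothetical extraction of the payload-building step from processLipSync.
function buildLipSyncPayload(params) {
  const payload = {};
  // Copy only the fields that are set; model and onRequestId are never sent.
  for (const key of ['audio_url', 'image_url', 'video_url', 'prompt', 'resolution']) {
    if (params[key]) payload[key] = params[key];
  }
  // A seed of -1 means "random", so it is left out of the request body.
  if (params.seed !== undefined && params.seed !== -1) payload.seed = params.seed;
  return payload;
}

const examplePayload = buildLipSyncPayload({
  model: 'ltx-2.3-lipsync',
  audio_url: 'https://example.com/voice.mp3',    // placeholder URL
  image_url: 'https://example.com/portrait.png', // placeholder URL
  resolution: '720p',
  seed: -1,
  onRequestId: () => {}
});
console.log(Object.keys(examplePayload)); // [ 'audio_url', 'image_url', 'resolution' ]
```

Since the studio always passes `seed: -1` for models with `hasSeed`, the seed is never sent in practice. A caller that wants reproducible output would have to pass an explicit seed value.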

@ -20,6 +20,10 @@ function navigate(page) {
import('./components/CinemaStudio.js').then(({ CinemaStudio }) => {
contentArea.appendChild(CinemaStudio());
});
} else if (page === 'lipsync') {
import('./components/LipSyncStudio.js').then(({ LipSyncStudio }) => {
contentArea.appendChild(LipSyncStudio());
});
}
}