
Deploy & Roadmap Plan

Current system status + what is coming next (reflects real work done)

✓ Done

  • Console: landing page, dashboard, presenter management, gallery with tabs/search/player, settings, i18n EN/中文/ไทย, dark mode + adjustable font/size/density
  • Per-presenter: background removal (rembg), image/video backdrop, camera zoom, 9-point frame position + presenter size, with live preview before save
  • Backend: Postgres + PostgREST (Supabase-lite) + nginx gateway on a single host
  • Pipeline split into clean layers: api / services / clients / gpu. ALL GPU work sits behind one boundary; swap the GPU server with env only (GPU_BACKEND + GPU_BASE_URL)
  • TTS: MMS-TTS Thai/English code-switch (offline, China-safe) | Azure Neural | edge-tts | GPT-SoVITS (pitch/emotion) wired into the chain
  • Voice clone: OpenVoice v2, matches the presenter voice from the source clip
  • Lip-sync: Wav2Lip on RTX 4090, mouth + head motion from a reference video (HD 1080p)
  • Render: async + progress bar, keeps every version, adjustable speed/length (up to 30 min)
  • 24/7 live: Qwen auto-generates a continuous script + 30s look-ahead pre-render → gapless RTMP (auto filler)
  • Live Console: real-time FB comments + AI auto-reply + the presenter can SPEAK replies on stream
  • Stream: ffmpeg → Facebook Live / TikTok / Shopee (RTMP)
  • Status panel: monitors every service + GPU/RAM/disk bars
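
The 30s look-ahead with auto filler from the 24/7 live item can be pictured as a small buffer that sits between the renderer and the RTMP feeder. The following is only a minimal sketch under stated assumptions, not the project's actual code: the names `Segment` and `LookAheadBuffer`, the 5-second filler length, and the idea of tracking segments by duration are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Segment:
    """One pre-rendered clip (hypothetical shape, for illustration only)."""
    text: str
    seconds: float
    filler: bool = False


class LookAheadBuffer:
    """Keeps pre-rendered segments queued ahead of the playhead."""

    def __init__(self, target_seconds=30.0, filler_seconds=5.0):
        self.target_seconds = target_seconds    # look-ahead window (30s in the list above)
        self.filler_seconds = filler_seconds    # assumed filler clip length
        self._queue = deque()

    def buffered_seconds(self):
        return sum(seg.seconds for seg in self._queue)

    def needs_render(self):
        # Ask the renderer for more while less than the target is queued.
        return self.buffered_seconds() < self.target_seconds

    def push(self, segment):
        self._queue.append(segment)

    def next_segment(self):
        # Never return "nothing": an empty queue yields a filler clip so the
        # stream output keeps flowing while rendering catches up.
        if self._queue:
            return self._queue.popleft()
        return Segment(text="", seconds=self.filler_seconds, filler=True)
```

In this sketch a render loop would call `needs_render()` to decide whether to request the next script chunk, while the stream feeder only ever calls `next_segment()`.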

▒ In progress / next

  • Deploy GPT-SoVITS api.py on the GPU box, then enable in Settings (system side is ready; see services/sovits/README.md)
  • Point a real domain + run deploy/scripts/enable-https.sh + add Google OAuth credentials → enable Gmail login (script/login page ready)
  • Set fb_page_token + fb_live_video_id in Settings for real live comments
  • Apply migrations 0005-0006 to the production DB + redeploy on UCloud
  • Later: full per-user accounts replacing demo mode, durable queue (Redis) replacing in-memory state
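
A small preflight check can show whether the live-comments item above is still pending. This is a hedged sketch: it assumes Settings can be read as a plain dict, which the page does not confirm, and the function name is hypothetical. Only the two keys `fb_page_token` and `fb_live_video_id` come from the list.

```python
# Keys named in the list above; everything else here is illustrative.
REQUIRED_FOR_LIVE_COMMENTS = ("fb_page_token", "fb_live_video_id")


def missing_live_comment_settings(settings):
    """Return the required keys that are absent or blank in `settings`."""
    return [
        key
        for key in REQUIRED_FOR_LIVE_COMMENTS
        if not str(settings.get(key) or "").strip()
    ]
```

An empty result would mean both values are present, so the console could leave demo comments and read real ones.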

◇ Architecture

  • web (Next.js 16) :3000 - console + API gateway (single forwarder lib/pipeline-client)
  • pipeline (FastAPI) :8000 - api → services → {clients, gpu, media} | /render /generate /live /gpu
  • gpu boundary - lipsync, MMS-TTS, OpenVoice, GFPGAN, rembg, SoVITS - local/remote per capability
  • lipsync :8001 - Wav2Lip/MuseTalk GPU | sovits :9880 - GPT-SoVITS (opt-in)
  • db (Postgres) :5432 + postgrest + gateway :8088 | qwen (ollama) :11434
  • nginx - reverse proxy + nip.io (status/console/api) + opt-in TLS via script
Everything runs on one UCloud GPU host, and the GPU part alone can move to another box by changing env only (GPU_BACKEND=remote)
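
The env-only GPU switch could be resolved in one place, roughly as sketched below. The variable names `GPU_BACKEND` and `GPU_BASE_URL` and the value `remote` come from this page. The default value `local`, the local URL (borrowed from the lipsync port :8001), and the names `GpuTarget` and `resolve_gpu_target` are assumptions made for illustration, not the pipeline's actual implementation.

```python
import os
from dataclasses import dataclass

# Assumption: the local backend talks to the lipsync service port listed above.
LOCAL_GPU_URL = "http://127.0.0.1:8001"


@dataclass(frozen=True)
class GpuTarget:
    backend: str
    base_url: str


def resolve_gpu_target(env=None):
    """Pick the GPU endpoint from environment variables only."""
    env = os.environ if env is None else env
    backend = env.get("GPU_BACKEND", "local").strip().lower()
    if backend == "local":
        return GpuTarget("local", LOCAL_GPU_URL)
    if backend == "remote":
        base_url = env.get("GPU_BASE_URL", "").strip().rstrip("/")
        if not base_url:
            # A remote backend without a URL is a configuration error.
            raise ValueError("GPU_BACKEND=remote requires GPU_BASE_URL")
        return GpuTarget("remote", base_url)
    raise ValueError(f"unsupported GPU_BACKEND: {backend!r}")
```

Keeping this lookup behind a single function is one way to make sure no other layer needs to know where the GPU work runs.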