ai-avatar-video

Skill

from agentspace-so/runcomfy-agent-skills

What it does

A Claude Code skill for creating AI avatar and talking-head videos through the RunComfy CLI, routing across the OmniHuman, Wan 2.7, HappyHorse, and Seedance v2 models for audio-driven lip-sync and virtual presenter production.

Overview

AI Avatar & Talking Head Video is a Claude Code skill for creating audio-driven avatar and talking-head videos through the RunComfy CLI. It routes across multiple model endpoints: ByteDance OmniHuman for full-body audio-driven avatars from a single portrait plus an audio file, Wan-AI Wan 2.7 for audio-driven mouth sync on portraits, HappyHorse 1.0 for text/image-to-video with in-pass audio, and Seedance v2 Pro for multi-modal cinematic generation with reference audio and subject. The skill classifies user intent (UGC voiceover, virtual presenter, dubbed product demo, lip-synced character, dialog scene) and selects the appropriate model.

Key Features

  • Intent-based model selection - Automatically routes to the right model based on what you are building: OmniHuman for a full-body avatar from portrait + audio, Wan 2.7 for mouth-sync on existing portraits, HappyHorse for text/image-to-video with audio, Seedance v2 Pro for cinematic multi-modal generation.
  • Portrait-to-video generation - Feed a single portrait image and an audio file to produce a video where the subject speaks, sings, or gestures naturally with full head, mouth, and body movement.
  • Multiple quality tiers - Choose between premium endpoints for hero-quality output and faster/cheaper tiers for iteration and drafting, with the skill guiding the selection based on your stated use case.
  • Documented prompting patterns - Each model route ships with its documented prompting format and the exact runcomfy run invocation, removing guesswork from API parameter construction.
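The intent-based routing described above can be pictured with a small sketch. This is not the skill's actual logic, which the listing does not show: the `AvatarRequest` fields and the `select_model` function are hypothetical names chosen for illustration, and the returned strings are display names from this page, not real RunComfy endpoint identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvatarRequest:
    """Hypothetical description of what the user has and wants."""
    has_portrait: bool = False
    has_audio: bool = False
    wants_full_body: bool = False
    wants_cinematic: bool = False


def select_model(req: AvatarRequest) -> str:
    """Return a model display name for the request (illustrative only)."""
    # Cinematic multi-modal work goes to Seedance v2 Pro.
    if req.wants_cinematic:
        return "Seedance v2 Pro"
    # Portrait + audio is the audio-driven case: full-body avatar
    # versus mouth sync on an existing portrait.
    if req.has_portrait and req.has_audio:
        return "OmniHuman" if req.wants_full_body else "Wan 2.7"
    # Everything else falls back to text/image-to-video with in-pass audio.
    return "HappyHorse 1.0"


if __name__ == "__main__":
    presenter = AvatarRequest(has_portrait=True, has_audio=True, wants_full_body=True)
    print(select_model(presenter))
```

The ordering of the checks is an assumption made for this sketch; the real skill may weigh the same signals differently or ask the user to disambiguate.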

Who is this for?

  • Marketing teams and content creators producing UGC-style voiceover videos, virtual presenter content, or dubbed product demos at scale
  • Developers building automated video generation pipelines that need audio-driven talking-head output from static portrait images
  • Educators and communicators who want to generate speaking avatar videos from scripts and portrait photos without filming
Same repository

agentspace-so/runcomfy-agent-skills (30 items)

ai-avatar-video

Installation

Vibe Index Install: installs to .claude/skills/
npx vibeindex add agentspace-so/runcomfy-agent-skills --skill ai-avatar-video
skills.sh Install (warning): installs to .agents/skills/
npx skills add agentspace-so/runcomfy-agent-skills --skill ai-avatar-video
Manual Install: copy the SKILL.md content and save it to the path below
~/.claude/skills/ai-avatar-video/SKILL.md
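
The manual install step can also be scripted. The sketch below assumes you have already downloaded the skill's SKILL.md to a local file; the `install_skill` helper and its `home` parameter are hypothetical and exist only for this example. The target path is the one given above.

```python
import shutil
from pathlib import Path
from typing import Optional


def install_skill(source: Path, home: Optional[Path] = None) -> Path:
    """Copy a local SKILL.md into <home>/.claude/skills/ai-avatar-video/."""
    # Default to the current user's home directory; a different root can be
    # passed in for testing (hypothetical convenience, not part of any tool).
    root = home if home is not None else Path.home()
    target_dir = root / ".claude" / "skills" / "ai-avatar-video"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "SKILL.md"
    shutil.copyfile(source, target)
    return target


if __name__ == "__main__":
    local_copy = Path("SKILL.md")  # assumed to exist in the working directory
    if local_copy.is_file():
        print(f"Installed to {install_skill(local_copy)}")
    else:
        print("SKILL.md not found in the current directory")
```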

175,062 installs
Added May 13, 2026

More from this repository (10)

video-edit (Skill)

A smart intent-routing skill for video editing on RunComfy that selects the best model based on the user's intent. Routes to Wan 2.7 Edit-Video for restyle and background swaps, Kling 2.6 Pro for precise motion transfer, or Lucy Edit for lightweight identity-stable restyle and outfit swaps.

image-to-video (Skill)

A smart intent-routing skill for image-to-video generation on RunComfy that automatically selects the best model for the task. Routes to HappyHorse 1.0 I2V for general animations, Wan 2.7 for custom-voiceover lip-sync, or Seedance 2.0 Pro for multi-modal composition from image, video, and audio references.

image-edit (Skill)

A smart intent-routing skill for image editing on RunComfy that selects the best model based on the editing task. Routes to Nano Banana Edit for batch edits up to 20 images, GPT Image 2 for multilingual text rewrite, Flux Kontext Pro for single-shot precise edits, or Z-Image Turbo for mask-driven inpainting.

flux-kontext (Skill)

Edit images with Black Forest Labs' Flux 1 Kontext Pro on RunComfy, specializing in single-reference precise local edits with high-fidelity source preservation. Ideal for targeted changes like adding objects or modifying details while keeping the rest of the image unchanged.

nano-banana-2 (Skill)

A RunComfy skill that generates images using Google Nano Banana 2, the flash-tier text-to-image model in the Gemini family. Optimized for rapid iteration, social thumbnails, and in-image typography with configurable resolution tiers and safety tolerance.

nano-banana-edit (Skill)

Edit images with Google Nano Banana 2 on RunComfy, supporting batch edits of up to 20 images per call with strong identity preservation. Features localized edits using spatial language, background swaps, and configurable resolution up to 4K.

happyhorse-1-0 (Skill)

Generate text-to-video with HappyHorse 1.0 on RunComfy, currently ranked #1 on Artificial Analysis Video Arena. Supports native 1080p with in-pass synchronized audio, multi-shot character consistency, and 6-language prompt support via the RunComfy CLI.

wan-2-7 (Skill)

Generate text-to-video with Wan-AI's Wan 2.7 on RunComfy, featuring multi-reference conditioning and audio-driven lip-sync via custom audio tracks. Supports prompt expansion, negative prompts, and up to 1080p resolution through the RunComfy CLI.

seedance-v2 (Skill)

Generate cinematic short-form video with ByteDance Seedance 2.0 Pro on RunComfy, supporting multi-modal references including up to 9 images, 3 videos, and 3 audio tracks. Features native lip-synced audio generation and is ideal for brand-consistent multi-language narratives.

gpt-image-edit (Skill)

Edit images with OpenAI GPT Image 2 on RunComfy, excelling at multilingual in-image text editing across any script (Latin, kana, CJK, Cyrillic, Arabic) and multi-reference composition with up to 10 input images. Ideal for identity-preserving edits and layout-precise repositioning.