AI Skill
Pass
Audit score 90

happyhorse-1-0

runcomfy-com/skills

#1 ranked text-to-video model with native 1080p, synchronized audio, and multi-shot character consistency.

What is happyhorse-1-0?

HappyHorse 1.0 is a top-performing text-to-video model hosted on RunComfy, ranked #1 on Artificial Analysis Video Arena. It generates native 1080p video with in-pass synchronized audio and maintains character consistency across multiple shots, making it ideal for brand stories, explainers, and multilingual content. Use it when you need high-quality cinematic video with audio and character continuity.

  • Generate text-to-video up to 15 seconds at native 1080p resolution
  • Produce synchronized audio (dialogue, ambient, Foley) in the same generation pass
  • Maintain character and wardrobe consistency across multi-shot sequences
  • Support prompts in 6 languages (Chinese, English, Japanese, Korean, German, French)
  • Offer flexible aspect ratios (16:9, 9:16, 1:1, 4:3, 3:4) and resolution options (720P, 1080P)
  • Control generation with seed values for reproducible variants

How to install happyhorse-1-0

npx skills add https://github.com/runcomfy-com/skills --skill happyhorse-1-0
Prerequisites
  • RunComfy CLI installed (npm i -g @runcomfy/cli)
  • RunComfy account with authentication (runcomfy login or RUNCOMFY_TOKEN environment variable)

How to use happyhorse-1-0

  1. Install the skill: npx skills add agentspace-so/runcomfy-skills --skill happyhorse-1-0
  2. Authenticate with RunComfy: runcomfy login (or set RUNCOMFY_TOKEN for CI/containers)
  3. Craft a detailed motion-focused prompt describing temporal action, camera work, and audio direction
  4. Run the generation: runcomfy run happyhorse/happyhorse-1-0/text-to-video --input '{"prompt": "<your prompt>"}' --output-dir <path>
  5. Specify optional parameters (aspect_ratio, resolution, duration, seed, watermark) in the input JSON as needed
  6. Wait for polling to complete; the CLI downloads the result video to your output directory

Use cases

Good for
  • Create multi-shot brand stories with one consistent character across scenes
  • Generate talking-head explainers with synchronized voiceover and ambient sound
  • Produce multilingual short-form ads without quality loss across languages
  • Deliver cinematic 1080p video content ready for broadcast or premium platforms
  • Iterate on motion and composition with detailed temporal descriptions
Who it's for
  • Video producers and content creators needing broadcast-quality output
  • Marketing teams creating multilingual brand narratives
  • Explainer video creators requiring synchronized audio
  • Filmmakers prototyping cinematic sequences with character consistency

happyhorse-1-0 FAQ

When should I use HappyHorse 1.0 vs. other models like Wan 2.7 or Seedance 2?

Use HappyHorse 1.0 for multi-shot character consistency and native in-pass audio. Choose Wan 2.7 for fine motion control and multi-reference conditioning, Seedance 2.0 Pro for detailed lip-synced dialogue with reference video, or LTX 2 for ultra-fast iteration.

What makes a good prompt for HappyHorse 1.0?

Describe motion over time with temporal verbs (walks, turns, lifts), front-load camera direction (wide shot, tracking shot), specify lens feel (35mm anamorphic, shallow DOF), and for multi-shot consistency, restate character details at each beat. Avoid static descriptions and keep prompts under 2,500 characters.

Can I use external audio or drive lip-sync with HappyHorse 1.0?

No, HappyHorse 1.0 generates audio in-pass only. For audio-driven lip-sync with external audio, use Wan 2.7 (accepts audio_url) or Seedance 2.0 Pro instead.

What are the duration and aspect ratio limits?

Duration is capped at 15 seconds (3–15 range). Supported aspect ratios are 16:9, 9:16, 1:1, 4:3, and 3:4 only; other ratios will be rejected with a 422 error.

How do I troubleshoot CLI errors?

Check exit codes: 0=success, 64=bad CLI args, 65=bad input JSON/schema mismatch, 69=upstream 5xx, 75=retryable timeout/429, 77=not signed in. See docs.runcomfy.com/cli/troubleshooting for full reference.

Full instructions (SKILL.md)

Source of truth, from runcomfy-com/skills.


name: happyhorse-1-0
displayName: "HappyHorse 1.0 — Pro Pack on RunComfy"
description: >
  Generate text-to-video with HappyHorse 1.0 on RunComfy. Documents HappyHorse 1.0's strengths (#1 on Artificial Analysis Video Arena, native 1080p with in-pass synchronized audio, multi-shot character consistency, 6-language prompt support), the duration / aspect-ratio / resolution schema, and when to route to Wan 2.7 / Seedance 2 / LTX 2 instead. Calls runcomfy run happyhorse/happyhorse-1-0/text-to-video through the local RunComfy CLI. Triggers on "happyhorse", "happy horse", "happyhorse 1.0", "happyhorse video", or any explicit ask to generate video with this model.
homepage: https://www.runcomfy.com
license: MIT

HappyHorse 1.0 — Pro Pack on RunComfy

runcomfy.com · Text-to-video · GitHub

HappyHorse 1.0 — currently #1 on Artificial Analysis Video Arena (Elo 1333 t2v / 1392 i2v) — hosted on the RunComfy Model API. Native 1080p video with in-pass synchronized audio (dialogue, ambient, Foley) and multi-shot character consistency.

npx skills add agentspace-so/runcomfy-skills --skill happyhorse-1-0 -g

When to pick this model (vs siblings)

| You want | Use |
| --- | --- |
| Multi-shot story with character / wardrobe consistency | HappyHorse 1.0 |
| Native audio in the same generation pass | HappyHorse 1.0 |
| Currently-#1 blind-vote video model | HappyHorse 1.0 |
| Detailed lip-synced dialogue + reference video | Seedance 2.0 Pro |
| Fine motion control + multi-reference conditioning | Wan 2.7 |
| Ultra-fast iteration (sub-second per frame) | LTX 2 |
| Cinematic motion editing on existing footage | Kling Video O1 |

If the user said "HappyHorse" / "happy horse video" explicitly, route here regardless.

Prerequisites

  1. RunComfy CLI: npm i -g @runcomfy/cli
  2. RunComfy account: runcomfy login opens a browser device-code flow.
  3. CI / containers: set RUNCOMFY_TOKEN=<token> instead of runcomfy login.

Endpoints + input schema

happyhorse/happyhorse-1-0/text-to-video

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | yes | | Up to 2,500 chars. 6 languages (CN/EN/JP/KR/DE/FR). |
| aspect_ratio | enum | no | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4 only. |
| resolution | enum | no | 1080P | 720P or 1080P. |
| duration | int | no | 5 | 3–15 seconds. |
| seed | int | no | 0 | 0..2^31-1. Reuse for variant comparisons. |
| watermark | bool | no | true | Provider watermark. |
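
The schema above can be checked locally before a request is submitted, which avoids a round trip that would end in a 422. What follows is a minimal sketch in Python, not part of the RunComfy CLI: validate_input and its constants are hypothetical names, the limits are copied from the table, and treating a blank prompt as invalid is an assumption.

```python
# Limits copied from the documented input schema; names are illustrative only.
ASPECT_RATIOS = ("16:9", "9:16", "1:1", "4:3", "3:4")
RESOLUTIONS = ("720P", "1080P")
MAX_PROMPT_CHARS = 2500
MAX_SEED = 2**31 - 1


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_input(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload fits the schema."""
    problems = []

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        # Assumption: a blank prompt is treated as missing.
        problems.append("prompt: required, non-empty string")
    elif len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt: {len(prompt)} chars exceeds {MAX_PROMPT_CHARS}")

    if "aspect_ratio" in payload and payload["aspect_ratio"] not in ASPECT_RATIOS:
        problems.append(f"aspect_ratio: must be one of {', '.join(ASPECT_RATIOS)}")

    if "resolution" in payload and payload["resolution"] not in RESOLUTIONS:
        problems.append("resolution: must be 720P or 1080P")

    if "duration" in payload:
        duration = payload["duration"]
        if not _is_int(duration) or not 3 <= duration <= 15:
            problems.append("duration: integer between 3 and 15 seconds")

    if "seed" in payload:
        seed = payload["seed"]
        if not _is_int(seed) or not 0 <= seed <= MAX_SEED:
            problems.append("seed: integer between 0 and 2^31-1")

    if "watermark" in payload and not isinstance(payload["watermark"], bool):
        problems.append("watermark: must be true or false")

    return problems
```

Calling validate_input({"prompt": "Wide shot. ...", "duration": 30}) returns a single problem for duration, so the caller can fix the payload before invoking the CLI.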

How to invoke

Default (16:9 1080p 5s):

runcomfy run happyhorse/happyhorse-1-0/text-to-video \
  --input '{"prompt": "<user prompt>"}' \
  --output-dir <absolute/path>

Vertical short (9:16, 8s, no watermark):

runcomfy run happyhorse/happyhorse-1-0/text-to-video \
  --input '{
    "prompt": "<user prompt>",
    "aspect_ratio": "9:16",
    "duration": 8,
    "watermark": false
  }' \
  --output-dir <absolute/path>

Cheaper test pass (720p):

runcomfy run happyhorse/happyhorse-1-0/text-to-video \
  --input '{"prompt": "<user prompt>", "resolution": "720P", "duration": 3}' \
  --output-dir <absolute/path>

The CLI submits, polls every 2s until terminal, then downloads any *.runcomfy.net / *.runcomfy.com URL from the result into --output-dir. Stdout is the result JSON. Stderr is progress.
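
The invocations above can also be driven from a script. The sketch below is one possible Python wrapper, assuming the runcomfy CLI is on PATH and already authenticated; build_command and generate are hypothetical helper names, and the shape of the result JSON is not specified here, so the parsed value is returned as-is. Only the --input and --output-dir flags shown above are used.

```python
import json
import subprocess
from pathlib import Path

MODEL = "happyhorse/happyhorse-1-0/text-to-video"


def build_command(payload: dict, output_dir: str) -> list:
    """Build the argv list for `runcomfy run`; no shell is involved."""
    return [
        "runcomfy", "run", MODEL,
        "--input", json.dumps(payload),                   # JSON body as one argument
        "--output-dir", str(Path(output_dir).resolve()),  # absolute path
    ]


def generate(payload: dict, output_dir: str):
    """Run the CLI, let progress (stderr) reach the terminal, parse stdout as JSON."""
    proc = subprocess.run(
        build_command(payload, output_dir),
        stdout=subprocess.PIPE,  # stdout carries the result JSON
        text=True,
    )
    if proc.returncode != 0:
        raise RuntimeError(f"runcomfy exited with code {proc.returncode}")
    return json.loads(proc.stdout)
```

Passing the command as a list rather than a shell string keeps quotes and special characters in the prompt from being interpreted by a shell.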

Prompting — what actually works

Describe motion over time, not a still. "A woman turns from the window, walks two paces to the desk, picks up the cup, lifts it to her face, takes a sip" beats "a woman drinking coffee".

Camera + shot in plain English. Front-load the shot: "Wide shot. ..." / "Tracking shot. ..." / "Locked tripod, low angle. ..." works as a real directive. Specify lens feel: "35mm anamorphic", "shallow DOF", "crushed shadows".

One visual beat per clip when iterating. Don't pile up "she walks AND the dog runs AND a car passes". Pick the beat, get it sharp, then layer with multi-shot prompts.

Multi-shot consistency — when describing two beats, restate the anchor at each: "Shot 1: tall woman in red wool coat, blue scarf, in a rainy alley. Shot 2: same woman in red coat / blue scarf, now ducking under an awning." HappyHorse holds the look but needs the anchor.
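
Restating the anchor by hand is easy to get wrong when a prompt has several shots. As a small illustration, the hypothetical helper below (multi_shot_prompt is not part of any RunComfy tooling) repeats the same character description at the start of every shot. The "Shot N:" layout follows the example above; the trailing "Audio:" label is an assumption about phrasing, not a documented keyword.

```python
MAX_PROMPT_CHARS = 2500  # prompt limit from the input schema


def multi_shot_prompt(anchor: str, beats: list, audio: str = "") -> str:
    """Prefix every beat with the same anchor so the character stays consistent."""
    parts = []
    for number, beat in enumerate(beats, start=1):
        # Normalise trailing punctuation so each shot ends with one period.
        parts.append(f"Shot {number}: {anchor}, {beat.strip().rstrip('.')}.")
    if audio.strip():
        parts.append(f"Audio: {audio.strip().rstrip('.')}.")
    prompt = " ".join(parts)
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt is {len(prompt)} chars; limit is {MAX_PROMPT_CHARS}")
    return prompt


prompt = multi_shot_prompt(
    "tall woman in red wool coat, blue scarf",
    ["in a rainy alley", "now ducking under an awning"],
)
```

Because the anchor is a single argument, changing the wardrobe in one place updates every shot.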

Audio direction — say what you want to hear: "distant temple bells, footsteps on wet pavement, no dialogue" or "warm friendly tone, English".

Anti-patterns:

  • Static-frame descriptions (no temporal verbs) → motion will be vague.
  • Conflicting style directions → they cancel each other out.
  • Prompts over 2,500 chars → quality degrades.
  • Aspect ratios outside the 5 supported → 422.

Where it shines

| Use case | Why HappyHorse 1.0 |
| --- | --- |
| Multi-shot brand stories with one consistent character | Native cross-shot identity preservation |
| Talking-head explainers needing in-clip voiceover + ambient | Synchronized audio in the same pass |
| Multilingual short-form ads | 6 prompt languages, no script-quality drop |
| Cinematic 1080p delivery | Native 1080p output, broadcast-ready |
| Blind-vote leader for general video quality | #1 on Artificial Analysis Video Arena |

Sample prompts (verified to produce strong results)

From the model page (cinematic scope):

Wide shot. A lone astronaut in dusty orange suit with blue-gray harness
skis across lunar plain, leaving parallel tracks in gray regolith.
Mid-stride, poles planted, pushing in 1/6th gravity with subtle upward
drift. Fine dust haze along ski tracks. Crescent Earth above lunar
horizon, blue-white glow against black sky. Raw sunlight, crushed
shadows, no fill. 8K photorealistic.

Multi-shot consistency:

Shot 1: Medium close-up. A woman in a navy trench coat enters a
rain-slick neon-lit Tokyo alley, looks left, holds up an umbrella.
Shot 2: Same woman in same navy trench, now under the awning of a
ramen shop, shaking water off the umbrella. Warm interior glow, soft
chatter, gentle rain on metal roof in the audio.

Vertical platform-native:

9:16 vertical short. A barista in a black apron pulls a single
espresso shot, steam rising into the morning sun, rich crema slowly
forming. Close-up handheld, shallow DOF, warm cafe ambience and the
hiss of the steam wand.

Limitations

  • Duration cap 15s — for longer narratives, segment into multi-shot prompts and stitch.
  • Aspect ratios — only the 5 documented values; ultra-wide cinematic gets cropped or rejected.
  • Audio is in-pass only — you can't pass external audio to drive lip-sync. For audio-driven lip-sync, use Wan 2.7 (which accepts an audio_url) or Seedance 2.0 Pro.
  • No free image-to-video on this template — i2v is supported by HappyHorse via a separate pipeline; the t2v endpoint here is text-only.
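
For narratives longer than the 15-second cap, the target runtime has to be divided into clips that each fit the 3–15 second range. The sketch below shows one way to do that arithmetic in Python; split_duration is a hypothetical helper, and the choice to make the clips as even as possible is an assumption rather than a RunComfy recommendation. Stitching the resulting clips is left to an external editor and is not shown.

```python
MIN_SECONDS = 3   # documented minimum duration
MAX_SECONDS = 15  # documented maximum duration


def split_duration(total_seconds: int) -> list:
    """Split a runtime into the fewest near-equal clips that fit the 3-15s range."""
    if total_seconds < MIN_SECONDS:
        raise ValueError(f"total must be at least {MIN_SECONDS} seconds")
    clips = -(-total_seconds // MAX_SECONDS)  # ceiling division
    base, extra = divmod(total_seconds, clips)
    # The first `extra` clips get one additional second so the sum is exact.
    return [base + 1] * extra + [base] * (clips - extra)


durations = split_duration(40)  # three clips that add up to 40 seconds
```

Each value can then be used as the duration field of a separate request, with the character anchor restated in every prompt.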

Exit codes

The runcomfy CLI uses sysexits-style codes:

| code | meaning |
| --- | --- |
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch (e.g. duration: 30 would 422) |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |

Full reference: docs.runcomfy.com/cli/troubleshooting.
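
A calling script can branch on these codes instead of parsing error text. The following Python sketch retries only on code 75, the one code marked retryable above; run_with_retry and describe are hypothetical names, and the attempt count and exponential backoff delays are assumptions, not documented CLI behaviour.

```python
import subprocess
import time

EXIT_MEANINGS = {
    0: "success",
    64: "bad CLI args",
    65: "bad input JSON / schema mismatch",
    69: "upstream 5xx",
    75: "retryable: timeout / 429",
    77: "not signed in or token rejected",
}
RETRYABLE = {75}


def describe(code: int) -> str:
    """Translate an exit code into the meaning listed in the table."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def run_with_retry(command, attempts=3, base_delay=2.0,
                   runner=subprocess.run, sleep=time.sleep) -> int:
    """Run the command, retrying with backoff while the exit code is retryable."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        code = runner(command).returncode
        if code not in RETRYABLE or attempt == attempts:
            return code
        sleep(base_delay * 2 ** (attempt - 1))  # 2s, 4s, 8s, ...
```

The runner and sleep parameters exist only so the retry logic can be exercised without calling the real CLI or waiting.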

How it works

  1. The skill invokes runcomfy run happyhorse/happyhorse-1-0/text-to-video with a JSON body matching the schema.
  2. The CLI POSTs to https://model-api.runcomfy.net/v1/models/happyhorse/happyhorse-1-0/text-to-video with the user's bearer token.
  3. The Model API returns a request_id; the CLI polls GET .../requests/<id>/status every 2 seconds.
  4. On terminal status, the CLI fetches GET .../requests/<id>/result and downloads any URL whose host ends with .runcomfy.net or .runcomfy.com into --output-dir. Other URLs are listed but not fetched.
  5. Ctrl-C while polling sends POST .../requests/<id>/cancel so you don't get billed for GPU you stopped.
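
The download rule in step 4 amounts to a suffix check on the URL's host. The Python sketch below illustrates that check; it is not the CLI's actual implementation, is_downloadable is a hypothetical name, and the hostnames in the usage lines are made up for the example.

```python
from urllib.parse import urlparse

# Host suffixes named in step 4; the leading dot prevents look-alike domains.
ALLOWED_SUFFIXES = (".runcomfy.net", ".runcomfy.com")


def is_downloadable(url: str) -> bool:
    """True when the URL's host ends with an allowed RunComfy suffix."""
    host = (urlparse(url).hostname or "").lower()
    return host.endswith(ALLOWED_SUFFIXES)


# Hypothetical URLs, for illustration only.
allowed = is_downloadable("https://cdn.runcomfy.net/outputs/video.mp4")
blocked = is_downloadable("https://runcomfy.net.example.org/video.mp4")
```

Matching on the parsed hostname rather than the raw string means a path or query that merely contains "runcomfy.net" does not pass the check.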

What this skill is not

Not a self-hosted video runner. Not a capability grant — depends on a working RunComfy account.

Security & Privacy

  • Token storage: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600 (owner-only read/write). Set RUNCOMFY_TOKEN env var to bypass the file entirely in CI / containers.
  • Input boundary: the user prompt is passed as a JSON string to the CLI via --input. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
  • Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
  • Outbound endpoints: only model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated outputs). No telemetry, no callbacks.
  • Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
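
The token-storage rule above can be verified from a script, for example in a pre-flight check. The Python sketch below is illustrative and assumes a POSIX file system; token_source and is_owner_only are hypothetical helpers, and the token path is the one given in the list above.

```python
import os
import stat
from pathlib import Path

TOKEN_FILE = Path.home() / ".config" / "runcomfy" / "token.json"


def token_source(env=None, token_file=TOKEN_FILE) -> str:
    """Report where credentials would come from: 'env', 'file', or 'none'."""
    env = os.environ if env is None else env
    if env.get("RUNCOMFY_TOKEN"):
        return "env"  # the environment variable bypasses the file
    if Path(token_file).is_file():
        return "file"
    return "none"


def is_owner_only(path) -> bool:
    """True when group and other have no permission bits (e.g. mode 0600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0
```

A check such as token_source() == "file" and not is_owner_only(TOKEN_FILE) would flag a token file whose permissions have been loosened.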