happyhorse-1-0
agentspace-so/runcomfy-agent-skills
Generate native 1080p text-to-video with synchronized audio via HappyHorse 1.0 on RunComfy
What is happyhorse-1-0?
HappyHorse 1.0 is a text-to-video model ranked #1 on Artificial Analysis Video Arena (Elo 1333 t2v / 1392 i2v), accessible through the RunComfy CLI. It produces native 1080p video with in-pass synchronized audio (dialogue, ambient, Foley) and multi-shot character consistency. Supports prompts up to 2,500 characters in 6 languages (Chinese, English, Japanese, Korean, German, French), durations from 3–15 seconds, and five aspect ratios. The skill wraps the `runcomfy run happyhorse/happyhorse-1-0/text-to-video` command and handles job submission, polling, and file download automatically.
- Runs HappyHorse 1.0 text-to-video generation via the RunComfy Model API CLI
- Produces native 720P or 1080P video with in-pass synchronized audio in a single generation pass
- Supports multi-shot character and wardrobe consistency across shots within a single prompt
- Accepts prompts in 6 languages: Chinese, English, Japanese, Korean, German, and French
- Allows control over aspect ratio (16:9, 9:16, 1:1, 4:3, 3:4), duration (3–15s), resolution, seed, and watermark
- Polls job status every 2 seconds and downloads output files automatically to a specified directory
How to install happyhorse-1-0
npx skills add https://github.com/agentspace-so/runcomfy-agent-skills --skill happyhorse-1-0- Node.js available to run npx
- RunComfy CLI installed globally: npm i -g @runcomfy/cli
- RunComfy account created at runcomfy.com
- Authenticated via `runcomfy login` (browser device-code flow) or RUNCOMFY_TOKEN env var set for CI/containers
- Skill installed: npx skills add https://github.com/agentspace-so/runcomfy-agent-skills --skill happyhorse-1-0
How to use happyhorse-1-0
- 1.Install the RunComfy CLI: npm i -g @runcomfy/cli
- 2.Authenticate: run `runcomfy login` or set RUNCOMFY_TOKEN=<token> for CI environments
- 3.Install this skill: npx skills add https://github.com/agentspace-so/runcomfy-agent-skills --skill happyhorse-1-0
- 4.Write a prompt describing motion over time, camera angle, and audio direction (up to 2,500 chars)
- 5.Run default 16:9 1080p 5s generation: runcomfy run happyhorse/happyhorse-1-0/text-to-video --input '{"prompt": "<your prompt>"}' --output-dir <absolute/path>
- 6.Override parameters as needed: set aspect_ratio, resolution (720P/1080P), duration (3–15), seed, or watermark in the JSON input
- 7.For multi-shot consistency, restate character anchors (clothing, appearance) at each shot description in the prompt
- 8.Check exit codes if errors occur: 65 = bad input schema, 75 = retryable timeout/rate-limit, 77 = auth failure
Use cases
- Multi-shot brand stories requiring consistent character appearance across shots
- Talking-head or explainer videos needing in-clip voiceover and ambient audio in one pass
- Multilingual short-form ads without script-quality degradation across languages
- Cinematic 1080p video delivery for broadcast-ready output
- Vertical short-form content for social platforms using 9:16 aspect ratio
- Video creators who need character consistency across multiple shots
- Marketers producing multilingual short-form video ads
- Developers building video generation pipelines via CLI
- Content producers needing synchronized audio without a separate audio pass
- Teams evaluating top-ranked open video generation models
happyhorse-1-0 FAQ
No. The t2v endpoint covered by this skill is text-only. HappyHorse does support i2v via a separate pipeline not included here.
No. Audio is generated in-pass from the text prompt only. For audio-driven lip-sync, use Wan 2.7 (accepts audio_url) or Seedance 2.0 Pro instead.
15 seconds per generation. For longer narratives, split into multiple prompts and stitch the outputs together.
Only five values are valid: 16:9, 9:16, 1:1, 4:3, and 3:4. Other values will result in a 422 error (exit code 65).
Use Wan 2.7 for fine motion control or audio-driven lip-sync, Seedance 2.0 Pro for detailed lip-synced dialogue with a reference video, LTX 2 for ultra-fast iteration, and Kling Video O1 for cinematic motion editing on existing footage.
Full instructions (SKILL.md)
Source of truth, from agentspace-so/runcomfy-agent-skills.
name: happyhorse-1-0
displayName: "HappyHorse 1.0 — Pro Pack on RunComfy"
description: >
Generate text-to-video with HappyHorse 1.0 on RunComfy. Documents HappyHorse
1.0's strengths (#1 on Artificial Analysis Video Arena, native 1080p
with in-pass synchronized audio, multi-shot character consistency,
6-language prompt support), the duration / aspect-ratio / resolution
schema, and when to route to Wan 2.7 / Seedance 2 / LTX 2 instead.
Calls runcomfy run happyhorse/happyhorse-1-0/text-to-video through
the local RunComfy CLI. Triggers on "happyhorse", "happy horse",
"happyhorse 1.0", "happyhorse video", or any explicit ask to generate
video with this model.
homepage: https://www.runcomfy.com
license: MIT
HappyHorse 1.0 — Pro Pack on RunComfy
runcomfy.com · Text-to-video · GitHub
HappyHorse 1.0 — currently #1 on Artificial Analysis Video Arena (Elo 1333 t2v / 1392 i2v) — hosted on the RunComfy Model API. Native 1080p video with in-pass synchronized audio (dialogue, ambient, Foley) and multi-shot character consistency.
npx skills add agentspace-so/runcomfy-skills --skill happyhorse-1-0 -g
When to pick this model (vs siblings)
| You want | Use |
|---|---|
| Multi-shot story with character / wardrobe consistency | HappyHorse 1.0 |
| Native audio in the same generation pass | HappyHorse 1.0 |
| Currently-#1 blind-vote video model | HappyHorse 1.0 |
| Detailed lip-synced dialogue + reference video | Seedance 2.0 Pro |
| Fine motion control + multi-reference conditioning | Wan 2.7 |
| Ultra-fast iteration (sub-second per frame) | LTX 2 |
| Cinematic motion editing on existing footage | Kling Video O1 |
If the user said "HappyHorse" / "happy horse video" explicitly, route here regardless.
Prerequisites
- RunComfy CLI —
npm i -g @runcomfy/cli - RunComfy account —
runcomfy loginopens a browser device-code flow. - CI / containers — set
RUNCOMFY_TOKEN=<token>instead ofruncomfy login.
Endpoints + input schema
happyhorse/happyhorse-1-0/text-to-video
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
prompt | string | yes | — | Up to 2,500 chars. 6 languages (CN/EN/JP/KR/DE/FR). |
aspect_ratio | enum | no | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4 only. |
resolution | enum | no | 1080P | 720P or 1080P. |
duration | int | no | 5 | 3–15 seconds. |
seed | int | no | 0 | 0..2^31-1. Reuse for variant comparisons. |
watermark | bool | no | true | Provider watermark. |
How to invoke
Default (16:9 1080p 5s):
runcomfy run happyhorse/happyhorse-1-0/text-to-video \
--input '{"prompt": "<user prompt>"}' \
--output-dir <absolute/path>
Vertical short (9:16, 8s, no watermark):
runcomfy run happyhorse/happyhorse-1-0/text-to-video \
--input '{
"prompt": "<user prompt>",
"aspect_ratio": "9:16",
"duration": 8,
"watermark": false
}' \
--output-dir <absolute/path>
Cheaper test pass (720p):
runcomfy run happyhorse/happyhorse-1-0/text-to-video \
--input '{"prompt": "<user prompt>", "resolution": "720P", "duration": 3}' \
--output-dir <absolute/path>
The CLI submits, polls every 2s until terminal, then downloads any *.runcomfy.net / *.runcomfy.com URL from the result into --output-dir. Stdout is the result JSON. Stderr is progress.
Prompting — what actually works
Describe motion over time, not a still. "A woman turns from the window, walks two paces to the desk, picks up the cup, lifts it to her face, takes a sip" beats "a woman drinking coffee".
Camera + shot in plain English. Front-load the shot: "Wide shot. ..." / "Tracking shot. ..." / "Locked tripod, low angle. ..." works as a real directive. Specify lens feel: "35mm anamorphic", "shallow DOF", "crushed shadows".
One visual beat per clip when iterating. Don't pile up "she walks AND the dog runs AND a car passes". Pick the beat, get it sharp, then layer with multi-shot prompts.
Multi-shot consistency — when describing two beats, restate the anchor at each: "Shot 1: tall woman in red wool coat, blue scarf, in a rainy alley. Shot 2: same woman in red coat / blue scarf, now ducking under an awning." HappyHorse holds the look but needs the anchor.
Audio direction — say what you want to hear: "distant temple bells, footsteps on wet pavement, no dialogue" or "warm friendly tone, English".
Anti-patterns:
- Static-frame descriptions (no temporal verbs) → motion will be vague.
- Conflicting style directions → cancels.
-
2500 char prompts → degrades.
- Aspect ratios outside the 5 supported → 422.
Where it shines
| Use case | Why HappyHorse 1.0 |
|---|---|
| Multi-shot brand stories with one consistent character | Native cross-shot identity preservation |
| Talking-head explainers needing in-clip voiceover + ambient | Synchronized audio in the same pass |
| Multilingual short-form ads | 6 prompt languages, no script-quality drop |
| Cinematic 1080p delivery | Native 1080p output, broadcast-ready |
| Blind-vote leader for general video quality | #1 on Artificial Analysis Video Arena |
Sample prompts (verified to produce strong results)
From the model page (cinematic scope):
Wide shot. A lone astronaut in dusty orange suit with blue-gray harness
skis across lunar plain, leaving parallel tracks in gray regolith.
Mid-stride, poles planted, pushing in 1/6th gravity with subtle upward
drift. Fine dust haze along ski tracks. Crescent Earth above lunar
horizon, blue-white glow against black sky. Raw sunlight, crushed
shadows, no fill. 8K photorealistic.
Multi-shot consistency:
Shot 1: Medium close-up. A woman in a navy trench coat enters a
rain-slick neon-lit Tokyo alley, looks left, holds up an umbrella.
Shot 2: Same woman in same navy trench, now under the awning of a
ramen shop, shaking water off the umbrella. Warm interior glow, soft
chatter, gentle rain on metal roof in the audio.
Vertical platform-native:
9:16 vertical short. A barista in a black apron pulls a single
espresso shot, steam rising into the morning sun, rich crema slowly
forming. Close-up handheld, shallow DOF, warm cafe ambience and the
hiss of the steam wand.
Limitations
- Duration cap 15s — for longer narratives, segment into multi-shot prompts and stitch.
- Aspect ratios — only the 5 documented values; ultra-wide cinematic gets cropped or rejected.
- Audio is in-pass only — you can't pass external audio to drive lip-sync. For audio-driven lip-sync, use Wan 2.7 (which accepts an
audio_url) or Seedance 2.0 Pro. - No free image-to-video on this template — i2v is supported by HappyHorse via a separate pipeline; the t2v endpoint here is text-only.
Exit codes
The runcomfy CLI uses sysexits-style codes:
| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch (e.g. duration: 30 would 422) |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |
Full reference: docs.runcomfy.com/cli/troubleshooting.
How it works
- The skill invokes
runcomfy run happyhorse/happyhorse-1-0/text-to-videowith a JSON body matching the schema. - The CLI POSTs to
https://model-api.runcomfy.net/v1/models/happyhorse/happyhorse-1-0/text-to-videowith the user's bearer token. - The Model API returns a
request_id; the CLI pollsGET .../requests/<id>/statusevery 2 seconds. - On terminal status, the CLI fetches
GET .../requests/<id>/resultand downloads any URL whose host ends with.runcomfy.netor.runcomfy.cominto--output-dir. Other URLs are listed but not fetched. Ctrl-Cwhile polling sendsPOST .../requests/<id>/cancelso you don't get billed for GPU you stopped.
What this skill is not
Not a self-hosted video runner. Not a capability grant — depends on a working RunComfy account.
Security & Privacy
- Token storage:
runcomfy loginwrites the API token to~/.config/runcomfy/token.jsonwith mode 0600 (owner-only read/write). SetRUNCOMFY_TOKENenv var to bypass the file entirely in CI / containers. - Input boundary: the user prompt is passed as a JSON string to the CLI via
--input. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content. - Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
- Outbound endpoints: only
model-api.runcomfy.net(request submission) and*.runcomfy.net/*.runcomfy.com(download whitelist for generated outputs). No telemetry, no callbacks. - Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
Related skills
More from agentspace-so/runcomfy-agent-skills and the wider catalog.
video-edit
>
image-to-video
>
nano-banana-2
>
image-edit
>
nano-banana-edit
>
flux-kontext
>