AI Skill
Pass
Audit score 90

happyhorse-1-0

agentspace-so/runcomfy-agent-skills

Generate native 1080p text-to-video with synchronized audio via HappyHorse 1.0 on RunComfy

What is happyhorse-1-0?

HappyHorse 1.0 is a text-to-video model ranked #1 on Artificial Analysis Video Arena (Elo 1333 t2v / 1392 i2v), accessible through the RunComfy CLI. It produces native 1080p video with in-pass synchronized audio (dialogue, ambient, Foley) and multi-shot character consistency. It supports prompts of up to 2,500 characters in 6 languages (Chinese, English, Japanese, Korean, German, French), durations of 3–15 seconds, and five aspect ratios. The skill wraps the `runcomfy run happyhorse/happyhorse-1-0/text-to-video` command and handles job submission, polling, and file download automatically.

  • Runs HappyHorse 1.0 text-to-video generation via the RunComfy Model API CLI
  • Produces native 720P or 1080P video with in-pass synchronized audio in a single generation pass
  • Supports multi-shot character and wardrobe consistency across shots within a single prompt
  • Accepts prompts in 6 languages: Chinese, English, Japanese, Korean, German, and French
  • Allows control over aspect ratio (16:9, 9:16, 1:1, 4:3, 3:4), duration (3–15s), resolution, seed, and watermark
  • Polls job status every 2 seconds and downloads output files automatically to a specified directory

How to install happyhorse-1-0

npx skills add https://github.com/agentspace-so/runcomfy-agent-skills --skill happyhorse-1-0
Prerequisites
  • Node.js available to run npx
  • RunComfy CLI installed globally: npm i -g @runcomfy/cli
  • RunComfy account created at runcomfy.com
  • Authenticated via `runcomfy login` (browser device-code flow) or RUNCOMFY_TOKEN env var set for CI/containers
  • Skill installed: npx skills add https://github.com/agentspace-so/runcomfy-agent-skills --skill happyhorse-1-0
Claude Code
Cursor
Windsurf
Cline

How to use happyhorse-1-0

  1. Install the RunComfy CLI: npm i -g @runcomfy/cli
  2. Authenticate: run `runcomfy login` or set RUNCOMFY_TOKEN=<token> for CI environments
  3. Install this skill: npx skills add https://github.com/agentspace-so/runcomfy-agent-skills --skill happyhorse-1-0
  4. Write a prompt describing motion over time, camera angle, and audio direction (up to 2,500 chars)
  5. Run the default 16:9 1080p 5s generation: runcomfy run happyhorse/happyhorse-1-0/text-to-video --input '{"prompt": "<your prompt>"}' --output-dir <absolute/path>
  6. Override parameters as needed: set aspect_ratio, resolution (720P/1080P), duration (3–15), seed, or watermark in the JSON input
  7. For multi-shot consistency, restate character anchors (clothing, appearance) in each shot description in the prompt
  8. Check exit codes if errors occur: 65 = bad input schema, 75 = retryable timeout/rate-limit, 77 = auth failure

Use cases

Good for
  • Multi-shot brand stories requiring consistent character appearance across shots
  • Talking-head or explainer videos needing in-clip voiceover and ambient audio in one pass
  • Multilingual short-form ads without script-quality degradation across languages
  • Cinematic 1080p video delivery for broadcast-ready output
  • Vertical short-form content for social platforms using 9:16 aspect ratio
Who it's for
  • Video creators who need character consistency across multiple shots
  • Marketers producing multilingual short-form video ads
  • Developers building video generation pipelines via CLI
  • Content producers needing synchronized audio without a separate audio pass
  • Teams evaluating top-ranked open video generation models

happyhorse-1-0 FAQ

Does this model support image-to-video?

No. The t2v endpoint covered by this skill is text-only. HappyHorse does support i2v via a separate pipeline not included here.

Can I use external audio to drive lip-sync?

No. Audio is generated in-pass from the text prompt only. For audio-driven lip-sync, use Wan 2.7 (accepts audio_url) or Seedance 2.0 Pro instead.

What is the maximum video duration?

15 seconds per generation. For longer narratives, split into multiple prompts and stitch the outputs together.

Which aspect ratios are supported?

Only five values are valid: 16:9, 9:16, 1:1, 4:3, and 3:4. Other values will result in a 422 error (exit code 65).

When should I use a different model instead of HappyHorse 1.0?

Use Wan 2.7 for fine motion control or audio-driven lip-sync, Seedance 2.0 Pro for detailed lip-synced dialogue with a reference video, LTX 2 for ultra-fast iteration, and Kling Video O1 for cinematic motion editing on existing footage.

Full instructions (SKILL.md)

Source of truth, from agentspace-so/runcomfy-agent-skills.


name: happyhorse-1-0
displayName: "HappyHorse 1.0 — Pro Pack on RunComfy"
description: >
  Generate text-to-video with HappyHorse 1.0 on RunComfy. Documents HappyHorse 1.0's strengths (#1 on Artificial Analysis Video Arena, native 1080p with in-pass synchronized audio, multi-shot character consistency, 6-language prompt support), the duration / aspect-ratio / resolution schema, and when to route to Wan 2.7 / Seedance 2 / LTX 2 instead. Calls runcomfy run happyhorse/happyhorse-1-0/text-to-video through the local RunComfy CLI. Triggers on "happyhorse", "happy horse", "happyhorse 1.0", "happyhorse video", or any explicit ask to generate video with this model.
homepage: https://www.runcomfy.com
license: MIT

HappyHorse 1.0 — Pro Pack on RunComfy

runcomfy.com · Text-to-video · GitHub

HappyHorse 1.0 — currently #1 on Artificial Analysis Video Arena (Elo 1333 t2v / 1392 i2v) — hosted on the RunComfy Model API. Native 1080p video with in-pass synchronized audio (dialogue, ambient, Foley) and multi-shot character consistency.

npx skills add agentspace-so/runcomfy-agent-skills --skill happyhorse-1-0 -g

When to pick this model (vs siblings)

| You want | Use |
|---|---|
| Multi-shot story with character / wardrobe consistency | HappyHorse 1.0 |
| Native audio in the same generation pass | HappyHorse 1.0 |
| Currently-#1 blind-vote video model | HappyHorse 1.0 |
| Detailed lip-synced dialogue + reference video | Seedance 2.0 Pro |
| Fine motion control + multi-reference conditioning | Wan 2.7 |
| Ultra-fast iteration (sub-second per frame) | LTX 2 |
| Cinematic motion editing on existing footage | Kling Video O1 |

If the user said "HappyHorse" / "happy horse video" explicitly, route here regardless.

Prerequisites

  1. RunComfy CLI: npm i -g @runcomfy/cli
  2. RunComfy account: runcomfy login opens a browser device-code flow.
  3. CI / containers: set RUNCOMFY_TOKEN=<token> instead of runcomfy login.

Endpoints + input schema

happyhorse/happyhorse-1-0/text-to-video

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| prompt | string | yes | | Up to 2,500 chars. 6 languages (CN/EN/JP/KR/DE/FR). |
| aspect_ratio | enum | no | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4 only. |
| resolution | enum | no | 1080P | 720P or 1080P. |
| duration | int | no | 5 | 3–15 seconds. |
| seed | int | no | 0 | 0..2^31-1. Reuse for variant comparisons. |
| watermark | bool | no | true | Provider watermark. |
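
A schema mismatch only surfaces after submission (as a 422 / exit code 65), so a local pre-flight check can save a round trip. The sketch below is a minimal POSIX-shell illustration of the constraints in the table; `validate_t2v_input` is a hypothetical helper name, not part of the RunComfy CLI, and returning 65 simply mirrors the CLI's documented code for bad input.

```shell
# Pre-flight check for the text-to-video input fields documented above.
# Hypothetical helper: returns 0 when values are in range, 65 otherwise.
# Usage: validate_t2v_input "<prompt>" [aspect_ratio] [resolution] [duration]
validate_t2v_input() (
  prompt=$1
  aspect_ratio=${2:-16:9}
  resolution=${3:-1080P}
  duration=${4:-5}

  # prompt is required and capped at 2,500 characters.
  # Note: some shells count bytes in ${#var}, so non-ASCII prompts may be over-counted.
  [ -n "$prompt" ] || { echo "prompt is required" >&2; exit 65; }
  [ "${#prompt}" -le 2500 ] || { echo "prompt exceeds 2,500 chars" >&2; exit 65; }

  case $aspect_ratio in
    16:9|9:16|1:1|4:3|3:4) ;;
    *) echo "unsupported aspect_ratio: $aspect_ratio" >&2; exit 65 ;;
  esac

  case $resolution in
    720P|1080P) ;;
    *) echo "unsupported resolution: $resolution" >&2; exit 65 ;;
  esac

  # duration must be a whole number of seconds between 3 and 15
  case $duration in
    ''|*[!0-9]*) echo "duration must be an integer" >&2; exit 65 ;;
  esac
  [ "$duration" -ge 3 ] && [ "$duration" -le 15 ] || {
    echo "duration out of range: $duration" >&2; exit 65
  }
)
```

For example, `validate_t2v_input "Wide shot. A skier crosses the plain." 9:16 720P 8` succeeds, while an aspect ratio of 21:9 or a duration of 30 is rejected before any job is submitted.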

How to invoke

Default (16:9 1080p 5s):

runcomfy run happyhorse/happyhorse-1-0/text-to-video \
  --input '{"prompt": "<user prompt>"}' \
  --output-dir <absolute/path>

Vertical short (9:16, 8s, no watermark):

runcomfy run happyhorse/happyhorse-1-0/text-to-video \
  --input '{
    "prompt": "<user prompt>",
    "aspect_ratio": "9:16",
    "duration": 8,
    "watermark": false
  }' \
  --output-dir <absolute/path>

Cheaper test pass (720p):

runcomfy run happyhorse/happyhorse-1-0/text-to-video \
  --input '{"prompt": "<user prompt>", "resolution": "720P", "duration": 3}' \
  --output-dir <absolute/path>

The CLI submits, polls every 2s until terminal, then downloads any *.runcomfy.net / *.runcomfy.com URL from the result into --output-dir. Stdout is the result JSON. Stderr is progress.
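
Because the result JSON goes to stdout and progress goes to stderr, the two streams can be captured separately in a script. A minimal sketch, assuming only the documented flags; `run_t2v` is a hypothetical wrapper, and the `result.json` / `progress.log` file names are illustrative choices rather than anything the CLI produces itself.

```shell
# Run one generation, keeping the result JSON (stdout) apart from progress (stderr).
# Hypothetical wrapper. Usage: run_t2v '<input json>' <absolute/output/dir>
run_t2v() (
  input_json=$1
  out_dir=$2
  mkdir -p "$out_dir" || exit 1
  # The function's exit status is the CLI's exit status.
  runcomfy run happyhorse/happyhorse-1-0/text-to-video \
    --input "$input_json" \
    --output-dir "$out_dir" \
    >"$out_dir/result.json" 2>"$out_dir/progress.log"
)
```

Call it as `run_t2v '{"prompt": "<user prompt>"}' <absolute/path>`; on failure, the progress log is usually the first place to look.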

Prompting — what actually works

Describe motion over time, not a still. "A woman turns from the window, walks two paces to the desk, picks up the cup, lifts it to her face, takes a sip" beats "a woman drinking coffee".

Camera + shot in plain English. Front-load the shot: "Wide shot. ..." / "Tracking shot. ..." / "Locked tripod, low angle. ..." works as a real directive. Specify lens feel: "35mm anamorphic", "shallow DOF", "crushed shadows".

One visual beat per clip when iterating. Don't pile up "she walks AND the dog runs AND a car passes". Pick the beat, get it sharp, then layer with multi-shot prompts.

Multi-shot consistency — when describing two beats, restate the anchor at each: "Shot 1: tall woman in red wool coat, blue scarf, in a rainy alley. Shot 2: same woman in red coat / blue scarf, now ducking under an awning." HappyHorse holds the look but needs the anchor.
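
Restating the anchor by hand is easy to get wrong when shots are edited independently. One way to keep it identical is to assemble the prompt from a single anchor string; `multishot_prompt` below is a hypothetical helper and the "Shot N:" layout simply follows the example above.

```shell
# Build a multi-shot prompt that repeats the same character anchor in every shot.
# Hypothetical helper. Usage: multishot_prompt "<anchor>" "<beat 1>" "<beat 2>" ...
multishot_prompt() (
  anchor=$1
  shift
  n=1
  out=
  for beat in "$@"; do
    # Separate shots with a single space; no leading space before the first one.
    out="${out}${out:+ }Shot $n: $anchor, $beat."
    n=$((n + 1))
  done
  printf '%s\n' "$out"
)
```

`multishot_prompt "tall woman in red wool coat, blue scarf" "in a rainy alley" "now ducking under an awning"` prints both shots with the anchor repeated verbatim.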

Audio direction — say what you want to hear: "distant temple bells, footsteps on wet pavement, no dialogue" or "warm friendly tone, English".

Anti-patterns:

  • Static-frame descriptions (no temporal verbs) → motion will be vague.
  • Conflicting style directions → they cancel each other out.
  • 2,500-char prompts (at the length cap) → output quality degrades.
  • Aspect ratios outside the 5 supported → 422.

Where it shines

| Use case | Why HappyHorse 1.0 |
|---|---|
| Multi-shot brand stories with one consistent character | Native cross-shot identity preservation |
| Talking-head explainers needing in-clip voiceover + ambient | Synchronized audio in the same pass |
| Multilingual short-form ads | 6 prompt languages, no script-quality drop |
| Cinematic 1080p delivery | Native 1080p output, broadcast-ready |
| Blind-vote leader for general video quality | #1 on Artificial Analysis Video Arena |

Sample prompts (verified to produce strong results)

From the model page (cinematic scope):

Wide shot. A lone astronaut in dusty orange suit with blue-gray harness
skis across lunar plain, leaving parallel tracks in gray regolith.
Mid-stride, poles planted, pushing in 1/6th gravity with subtle upward
drift. Fine dust haze along ski tracks. Crescent Earth above lunar
horizon, blue-white glow against black sky. Raw sunlight, crushed
shadows, no fill. 8K photorealistic.

Multi-shot consistency:

Shot 1: Medium close-up. A woman in a navy trench coat enters a
rain-slick neon-lit Tokyo alley, looks left, holds up an umbrella.
Shot 2: Same woman in same navy trench, now under the awning of a
ramen shop, shaking water off the umbrella. Warm interior glow, soft
chatter, gentle rain on metal roof in the audio.

Vertical platform-native:

9:16 vertical short. A barista in a black apron pulls a single
espresso shot, steam rising into the morning sun, rich crema slowly
forming. Close-up handheld, shallow DOF, warm cafe ambience and the
hiss of the steam wand.

Limitations

  • Duration cap 15s — for longer narratives, segment into multi-shot prompts and stitch.
  • Aspect ratios — only the 5 documented values; ultra-wide cinematic gets cropped or rejected.
  • Audio is in-pass only — you can't pass external audio to drive lip-sync. For audio-driven lip-sync, use Wan 2.7 (which accepts an audio_url) or Seedance 2.0 Pro.
  • No free image-to-video on this template — i2v is supported by HappyHorse via a separate pipeline; the t2v endpoint here is text-only.
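
For the 15s cap, stitching is left to the user. One common approach, sketched here under the assumption that ffmpeg is installed and that all clips share codec, resolution, and frame rate, is ffmpeg's concat demuxer. `write_concat_list` is a hypothetical helper that writes the list file; the clip paths are placeholders for whatever the CLI downloaded.

```shell
# Write an ffmpeg concat-demuxer list for the given clips, in playback order.
# Hypothetical helper. Usage: write_concat_list <list file> <clip> [<clip> ...]
# Paths containing single quotes are not handled by this sketch.
write_concat_list() (
  list_file=$1
  shift
  : >"$list_file"
  for clip in "$@"; do
    printf "file '%s'\n" "$clip" >>"$list_file"
  done
)

# Then, for example:
#   write_concat_list clips.txt /abs/out/shot1.mp4 /abs/out/shot2.mp4
#   ffmpeg -f concat -safe 0 -i clips.txt -c copy story.mp4
```

Stream copy (`-c copy`) avoids re-encoding, which is why generating every segment with the same resolution and aspect ratio matters.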

Exit codes

The runcomfy CLI uses sysexits-style codes:

| code | meaning |
|---|---|
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch (e.g. duration: 30 would 422) |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |

Full reference: docs.runcomfy.com/cli/troubleshooting.
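
In a pipeline, these codes can drive a retry policy. The sketch below retries only on 75, the one code the table labels retryable, and fails fast on everything else; whether 69 also deserves a retry is a judgment call it leaves out. `run_with_retry`, `RETRY_MAX`, and `RETRY_DELAY` are hypothetical names, not CLI features.

```shell
# Retry a command only when it exits 75 (retryable: timeout / 429).
# Hypothetical helper. Usage: run_with_retry <command> [args...]
run_with_retry() (
  attempt=1
  max=${RETRY_MAX:-3}        # total attempts
  delay=${RETRY_DELAY:-10}   # seconds before the first retry, doubled each time
  while :; do
    rc=0
    "$@" || rc=$?
    case $rc in
      0)  exit 0 ;;
      75) ;;                 # transient, fall through to the retry logic
      *)  exit "$rc" ;;      # 64, 65, 69, 77 and anything else: fail fast
    esac
    [ "$attempt" -lt "$max" ] || exit "$rc"
    echo "exit $rc on attempt $attempt; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
)
```

Prefix the usual invocation with it, e.g. `run_with_retry runcomfy run happyhorse/happyhorse-1-0/text-to-video --input '{"prompt": "<user prompt>"}' --output-dir <absolute/path>`. Each retry resubmits the whole job.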

How it works

  1. The skill invokes runcomfy run happyhorse/happyhorse-1-0/text-to-video with a JSON body matching the schema.
  2. The CLI POSTs to https://model-api.runcomfy.net/v1/models/happyhorse/happyhorse-1-0/text-to-video with the user's bearer token.
  3. The Model API returns a request_id; the CLI polls GET .../requests/<id>/status every 2 seconds.
  4. On terminal status, the CLI fetches GET .../requests/<id>/result and downloads any URL whose host ends with .runcomfy.net or .runcomfy.com into --output-dir. Other URLs are listed but not fetched.
  5. Ctrl-C while polling sends POST .../requests/<id>/cancel so you don't get billed for GPU you stopped.
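
The download rule in step 4 is a suffix match on the URL's host. As an illustration only (the CLI's actual parsing is not shown in this document and may differ), the check can be sketched as follows; `is_download_url` is a hypothetical name and the hosts in the usage note are made-up examples.

```shell
# Return 0 if the URL's host ends with .runcomfy.net or .runcomfy.com, else 1.
# Hypothetical helper; handles plain host[:port] authorities, not IPv6 literals.
is_download_url() (
  host=${1#*://}        # drop the scheme
  host=${host%%/*}      # drop the path
  host=${host%%[?#]*}   # drop a query or fragment that follows the host directly
  host=${host##*@}      # drop userinfo, so "allowed.host@other.host" is judged by other.host
  host=${host%%:*}      # drop the port
  case $host in
    *.runcomfy.net|*.runcomfy.com) exit 0 ;;
    *) exit 1 ;;
  esac
)
```

Under this rule `https://cdn.runcomfy.net/out/video.mp4` would be fetched, while `https://evilruncomfy.net/a.mp4` and `https://cdn.runcomfy.net.example.org/a.mp4` would only be listed.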

What this skill is not

Not a self-hosted video runner. Not a capability grant — depends on a working RunComfy account.

Security & Privacy

  • Token storage: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600 (owner-only read/write). Set RUNCOMFY_TOKEN env var to bypass the file entirely in CI / containers.
  • Input boundary: the user prompt is passed as a JSON string to the CLI via --input. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
  • Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
  • Outbound endpoints: only model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated outputs). No telemetry, no callbacks.
  • Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
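
In CI it can help to confirm which credential source is in play before submitting a job. A minimal sketch based on the token-storage note above: `auth_source` is a hypothetical helper, and it assumes the environment variable takes precedence over the file, as "bypass the file entirely" suggests.

```shell
# Report which credential source the documented setup would provide:
# "env" (RUNCOMFY_TOKEN set), "file" (token written by runcomfy login), or "none".
# Hypothetical helper; it does not verify that the token is still valid.
auth_source() {
  if [ -n "${RUNCOMFY_TOKEN:-}" ]; then
    echo env
  elif [ -f "$HOME/.config/runcomfy/token.json" ]; then
    echo file
  else
    echo none
  fi
}
```

A pipeline step could stop early with `[ "$(auth_source)" != none ] || exit 77`, matching the CLI's code for "not signed in".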