AI Skill
Fail
Audit score 45

nano-banana-2

agentspace-so/runcomfy-agent-skills

Generate images with Google Nano Banana 2 (Gemini flash-tier) via RunComfy CLI — optimized prompting patterns included.

What is nano-banana-2?

Nano Banana 2 is a flash-tier text-to-image skill that calls Google's Gemini-family image model through the RunComfy CLI. It bundles documented prompting patterns for sharper output, supports batch generation (up to 4 images), multiple aspect ratios and resolution tiers (0.5K–4K), in-image typography, optional web-grounded context, and a safety-tolerance dial. Best suited for rapid ideation, social thumbnails, and poster/card assets with quoted text rendering.

  • Runs `google/nano-banana-2/text-to-image` via the local RunComfy CLI
  • Supports 1–4 images per request across 11 aspect ratios and 4 resolution tiers
  • Renders in-image typography predictably when text is explicitly quoted in the prompt
  • Optionally grounds generation in current web context via `enable_web_search`
  • Locks composition across prompt variants using a seed parameter
  • Documents when to route to sibling models (Nano Banana Pro, GPT Image 2, Flux 2, Seedream)

How to install nano-banana-2

npx skills add https://github.com/agentspace-so/runcomfy-agent-skills --skill nano-banana-2
Prerequisites
  • Node.js available to run `npx`
  • RunComfy CLI installed: `npm i -g @runcomfy/cli`
  • RunComfy account created at runcomfy.com
  • Authenticated via `runcomfy login` (browser device-code flow)
  • For CI/containers: `RUNCOMFY_TOKEN=<token>` environment variable set instead of login

How to use nano-banana-2

  1. Install the skill: `npx skills add https://github.com/agentspace-so/runcomfy-agent-skills --skill nano-banana-2`
  2. Ensure the RunComfy CLI is installed and you are logged in (`runcomfy login`)
  3. Run a default 1K square draft: `runcomfy run google/nano-banana-2/text-to-image --input '{"prompt": "<your prompt>"}' --output-dir <path>`
  4. For ideation batches, set `num_images: 4`, `resolution: "0.5K"`, and the desired `aspect_ratio`
  5. Lock `seed` when iterating on small prompt variants to keep composition stable
  6. Quote in-image text literally in the prompt and specify placement and font style
  7. Set `enable_web_search: true` only when the prompt references current events or real entities
  8. Check exit codes (0 = success, 75 = retryable timeout/429, 77 = auth failure) for error handling

Use cases

Good for
  • Batch ideation drafts (4-up at 0.5K, promote winner to 2K)
  • Social-platform thumbnails in native aspect ratios (9:16, 4:5, 16:9, etc.)
  • Posters and product cards requiring quoted in-image text
  • Web-grounded imagery referencing current events or real entities
  • Reproducible variant testing with seed-locked compositions
Who it's for
  • Designers iterating on social media assets quickly
  • Marketers producing thumbnail batches for A/B testing
  • Developers integrating image generation into CI pipelines
  • Content creators needing platform-native vertical or wide-format images
  • Anyone explicitly requesting Nano Banana 2 or Gemini image generation

nano-banana-2 FAQ

When should I use Nano Banana 2 vs Nano Banana Pro?

Use Nano Banana 2 for rapid drafts, batch thumbnails, and typography-heavy assets. Default to Nano Banana Pro for hyperrealistic portraits or when maximum detail matters more than speed.

How do I get reliable in-image text rendering?

Quote the exact characters in your prompt (e.g., the label reads 'AURA') and specify placement and font style. Vague references like 'with the brand name on it' produce unpredictable results.

Does enabling web search cost more?

Yes. `enable_web_search` adds both latency and cost. Only enable it when the prompt explicitly references current events or real-world entities that require fresh context.

What resolutions are supported and which should I default to?

Supported tiers are 0.5K (drafts), 1K (default), 2K (final), and 4K (max). Default to 1K unless the user explicitly requests higher; 2K and 4K cost more.

Can I use this skill to edit an existing image?

No. This endpoint generates images from text only. For subject-preserving edits (swap background, apply changes), use the sibling `nano-banana-edit` skill instead.

Full instructions (SKILL.md)

Source of truth, from agentspace-so/runcomfy-agent-skills.


name: nano-banana-2
displayName: "Nano Banana 2 — Pro Pack on RunComfy"
description: >
  Generate images with Google Nano Banana 2 (Gemini-family flash-tier
  text-to-image) on RunComfy — bundled with the model's documented prompting
  patterns so the skill gets sharper output than naive prompting against the
  same model. Documents Nano Banana 2's strengths (rapid iteration, in-image
  typography rendering, predictable framing, optional web-grounded context),
  the resolution-tier pricing, the safety-tolerance dial, and when to route to
  Nano Banana Pro / GPT Image 2 / Flux 2 / Seedream instead. Calls
  runcomfy run google/nano-banana-2/text-to-image through the local RunComfy
  CLI. Triggers on "nano banana", "nano-banana-2", "nano banana 2",
  "google image gen", "gemini image", or any explicit ask to generate with
  this model.
homepage: https://www.runcomfy.com
license: MIT

Nano Banana 2 — Pro Pack on RunComfy


Google Nano Banana 2 — the flash-tier text-to-image model in the Gemini family — hosted on the RunComfy Model API. Optimized for ideation, social-thumbnail batches, and rapid drafts with strong in-image typography.

npx skills add agentspace-so/runcomfy-agent-skills --skill nano-banana-2 -g

When to pick this model (vs siblings)

Nano Banana 2 is the flash-tier of the Google image-gen line. Pick it when iteration speed and predictable framing matter more than maximum detail.

| You want | Use |
| --- | --- |
| Rapid drafts, social thumbnails, batch variants | Nano Banana 2 |
| In-image typography with predictable rendering | Nano Banana 2 |
| Web-grounded image (current events / real entities) | Nano Banana 2 + enable_web_search |
| Image edit (preserve subject, swap background) | Nano Banana Edit (sibling skill) |
| Heavy stylization, painterly look | Flux 2 |
| Maximum prompt adherence + multilingual text | GPT Image 2 |
| 2K–4K hero shots, max realism | Seedream 5 |
| Hyperrealistic portrait | Nano Banana Pro |

If the user explicitly said "Nano Banana 2" / "nano-banana-2" / "Gemini image", route here regardless. If they said "Nano Banana" without specifying 2 vs Pro, default to Pro for portraits and to 2 for everything else.

Prerequisites

  1. RunComfy CLI: `npm i -g @runcomfy/cli`
  2. RunComfy account: `runcomfy login` opens a browser device-code flow.
  3. CI / containers: set `RUNCOMFY_TOKEN=<token>` instead of `runcomfy login`.

Endpoints + input schema

google/nano-banana-2/text-to-image

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | yes | | Subject-first description. |
| num_images | int | no | 1 | 1–4. Use 4 for ideation rounds. |
| seed | int | no | 0 | Reuse for reproducibility. |
| aspect_ratio | enum | no | auto | auto, 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16. |
| resolution | enum | no | 1K | 0.5K (drafts), 1K (default), 2K (final), 4K (max). |
| output_format | enum | no | png | png, jpeg, webp. |
| safety_tolerance | int | no | 4 | 1 (strict) – 6 (permissive). |
| limit_generations | bool | no | true | Limit each prompt round to one generation. |
| enable_web_search | bool | no | false | Adds web grounding (extra cost + latency). |

For image edit (preserve subject + apply changes), see the sibling nano-banana-edit skill.
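
The schema above can be checked locally before a request is sent, which avoids spending a round trip on a 422. The following is a minimal sketch, not part of the skill or the CLI: `validate_input` and its message strings are hypothetical names, and the allowed values are copied from the table.

```python
# Hypothetical client-side check mirroring the input schema table.
ASPECT_RATIOS = ("auto", "21:9", "16:9", "3:2", "4:3", "5:4",
                 "1:1", "4:5", "3:4", "2:3", "9:16")
RESOLUTIONS = ("0.5K", "1K", "2K", "4K")
OUTPUT_FORMATS = ("png", "jpeg", "webp")


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_input(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload fits the table."""
    problems = []
    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt: required non-empty string")
    num_images = payload.get("num_images", 1)
    if not _is_int(num_images) or not 1 <= num_images <= 4:
        problems.append("num_images: integer 1-4")
    if "seed" in payload and not _is_int(payload["seed"]):
        problems.append("seed: integer")
    tolerance = payload.get("safety_tolerance", 4)
    if not _is_int(tolerance) or not 1 <= tolerance <= 6:
        problems.append("safety_tolerance: integer 1-6")
    if payload.get("aspect_ratio", "auto") not in ASPECT_RATIOS:
        problems.append("aspect_ratio: one of " + ", ".join(ASPECT_RATIOS))
    if payload.get("resolution", "1K") not in RESOLUTIONS:
        problems.append("resolution: one of " + ", ".join(RESOLUTIONS))
    if payload.get("output_format", "png") not in OUTPUT_FORMATS:
        problems.append("output_format: one of " + ", ".join(OUTPUT_FORMATS))
    for flag in ("limit_generations", "enable_web_search"):
        if flag in payload and not isinstance(payload[flag], bool):
            problems.append(f"{flag}: boolean")
    return problems
```

Returning a list rather than raising on the first problem lets an agent report every issue in one pass.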

How to invoke

Default draft (1K, square, png):

runcomfy run google/nano-banana-2/text-to-image \
  --input '{"prompt": "<user prompt>"}' \
  --output-dir <absolute/path>

Vertical 4-up batch for ideation:

runcomfy run google/nano-banana-2/text-to-image \
  --input '{
    "prompt": "<user prompt>",
    "num_images": 4,
    "aspect_ratio": "9:16",
    "resolution": "0.5K"
  }' \
  --output-dir <absolute/path>

Final at 2K with seed lock:

runcomfy run google/nano-banana-2/text-to-image \
  --input '{
    "prompt": "<user prompt>",
    "resolution": "2K",
    "aspect_ratio": "16:9",
    "seed": 42
  }' \
  --output-dir <absolute/path>

Web-grounded (current event / real entity):

runcomfy run google/nano-banana-2/text-to-image \
  --input '{
    "prompt": "<prompt referencing a real-world event from this week>",
    "enable_web_search": true
  }' \
  --output-dir <absolute/path>
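
When these invocations are driven from a script, the same commands can be assembled as an argument list so the JSON never passes through shell quoting. This is a sketch under stated assumptions: the helper names (`build_command`, `draft_payload`, `final_payload`) and the example output path are hypothetical, and only the `run` subcommand, `--input`, and `--output-dir` come from the examples above.

```python
import json

ENDPOINT = "google/nano-banana-2/text-to-image"


def build_command(payload: dict, output_dir: str) -> list[str]:
    """Build the argv for `runcomfy run`; hand it to subprocess.run as a list."""
    return [
        "runcomfy", "run", ENDPOINT,
        "--input", json.dumps(payload),
        "--output-dir", output_dir,
    ]


def draft_payload(prompt: str, aspect_ratio: str = "9:16") -> dict:
    """Four 0.5K candidates for an ideation round."""
    return {"prompt": prompt, "num_images": 4,
            "aspect_ratio": aspect_ratio, "resolution": "0.5K"}


def final_payload(prompt: str, seed: int, aspect_ratio: str = "16:9") -> dict:
    """One 2K render with a fixed seed."""
    return {"prompt": prompt, "resolution": "2K",
            "aspect_ratio": aspect_ratio, "seed": seed}


if __name__ == "__main__":
    # "/tmp/nb2-drafts" is an illustrative path, not one the skill requires.
    cmd = build_command(draft_payload("<user prompt>"), "/tmp/nb2-drafts")
    print(cmd)
    # To submit for real (needs the RunComfy CLI and a login):
    #   import subprocess
    #   exit_code = subprocess.run(cmd, check=False).returncode
```

Passing a list to `subprocess.run` (without `shell=True`) means the prompt is delivered as a single argument regardless of the quotes or special characters it contains.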

Prompting — what actually works

Subject-first declarative grammar. "A cinematic close-up portrait of an American woman standing under neon lights in rainy Tokyo, shallow depth of field, reflective wet streets, ultra-detailed, realistic skin texture" — primary subject, then action, environment, style, camera. Front-load subject; trail with directives.

Exact text quoting for in-image typography. "The label reads 'AURA' in clean bold sans-serif, centered, white on black" — quote the literal characters. Specify placement and font style. Don't say "with the brand name on it" and hope.

Consistent seeds for refinement. Lock seed when iterating a single prompt across small variants — keeps composition stable.

Web-grounding, sparingly. Turn on enable_web_search only when the prompt names current events / real entities. Adds latency + cost; off by default.

Don't conflict styles. Stacking "minimalist + ornate + retro + cyberpunk" makes the styles cancel each other out. Pick 1–2 anchors.

Anti-patterns:

  • Trying to verbally describe a stable subject identity — use the edit endpoint with image refs instead.
  • Asking for resolutions outside the 4 tiers → 422.
  • Aspect ratios outside the 11 supported values → 422.
  • Non-quoted in-image text → unpredictable rendering.
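
The prompting rules above (subject first, literal quoted text with placement and font, at most two style anchors) can be encoded in a small prompt assembler. This is an illustrative sketch only: `build_prompt` and its parameters are hypothetical, and the output format is one reasonable arrangement rather than a documented template.

```python
def build_prompt(subject, directives=(), quoted_text=None,
                 text_style="", style_anchors=()):
    """Assemble a subject-first prompt with optional literal in-image text."""
    if len(style_anchors) > 2:
        # Mirrors the "pick 1-2 anchors" guidance.
        raise ValueError("pick 1-2 style anchors; more tend to cancel out")
    parts = [subject.strip()]
    if quoted_text is not None:
        # Quote the exact characters that should appear in the image.
        clause = f'the text "{quoted_text}"'
        if text_style:
            clause += f" in {text_style}"
        parts.append(clause)
    parts.extend(anchor.strip() for anchor in style_anchors)
    # Trailing directives: lighting, camera, finish.
    parts.extend(directive.strip() for directive in directives)
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "A matte black ceramic mug centered on a soft warm-grey paper background",
        quoted_text="Brewed Quietly",
        text_style="clean bold sans-serif, top-right",
        style_anchors=["minimalist"],
        directives=["clean studio lighting"],
    ))
```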

Where it shines

| Use case | Why Nano Banana 2 |
| --- | --- |
| Marketing draft thumbnails (batch of 4) | Fast iteration at 0.5K, then promote winner to 2K |
| Social-platform-native | Wide aspect ratio support including 9:16, 4:5, 21:9 |
| In-image typography for posters / cards | Predictable text rendering when characters are quoted |
| Web-grounded current-event imagery | enable_web_search integrates fresh info |
| Reproducible variant testing | Strong seed + consistent framing |

Sample prompts (verified to produce strong results)

Cinematic portrait (page example):

A cinematic close-up portrait of an American woman standing under neon
lights in rainy Tokyo, shallow depth of field, reflective wet streets,
ultra-detailed, realistic skin texture

Brand-asset card with quoted text:

A minimalist 16:9 product card: a matte black ceramic mug centered on a
soft warm-grey paper background, rim highlight from upper-left, the
headline "Brewed Quietly" in clean bold sans-serif top-right, balanced
negative space below, e-commerce ready, clean studio lighting

Vertical platform-native:

A 9:16 vertical hero for a wellness brand: a single ceramic teacup on a
linen runner, soft morning side-light, the words "Slow Down" in
hand-drawn serif large at the top, gentle steam rising, neutral color
palette, uncluttered

Limitations

  • Still images only. No video on this endpoint.
  • Max 4 outputs per request.
  • Web search adds latency + cost — only enable on demand.
  • 2K / 4K cost more — default to 1K unless user asked for higher.
  • For image edit, use the /edit endpoint — not this one.

Exit codes

| code | meaning |
| --- | --- |
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |

Full reference: docs.runcomfy.com/cli/troubleshooting.
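
One way to act on these exit codes in a wrapper script is to retry only the code the table marks as retryable and surface everything else immediately. The sketch below assumes exponential backoff, which the source does not prescribe; `run_with_retry` and its parameters are hypothetical names, and `run` is any callable that returns a process exit code.

```python
import time

# Only 75 is labeled retryable in the table; 69 (upstream 5xx) is left
# non-retryable here because the table does not mark it as such.
RETRYABLE = {75}


def run_with_retry(run, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call run() until it returns a non-retryable code or attempts run out."""
    code = run()
    attempt = 1
    while code in RETRYABLE and attempt < max_attempts:
        # Backoff schedule (an assumption): base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
        code = run()
        attempt += 1
    return code


if __name__ == "__main__":
    # Simulated exit codes stand in for subprocess.run(cmd).returncode.
    fake_codes = iter([75, 0])
    print(run_with_retry(lambda: next(fake_codes), base_delay=0.01))
```

A caller would typically treat a final 77 as a prompt to run `runcomfy login` again, and 64 or 65 as a bug in the request rather than something to retry.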

How it works

The skill invokes runcomfy run google/nano-banana-2/text-to-image with a JSON body matching the schema. The CLI POSTs to https://model-api.runcomfy.net/v1/models/google/nano-banana-2/text-to-image, polls the request, fetches the result, and downloads any .runcomfy.net/.runcomfy.com URL into --output-dir. Ctrl-C cancels the remote request before exit.
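
The download rule described above, where only `.runcomfy.net` and `.runcomfy.com` URLs are fetched, can be illustrated with a host-suffix check. This is not the CLI's actual implementation: `is_allowed_download` is a hypothetical helper, and requiring `https` for downloads is an assumption on top of what the source states.

```python
from urllib.parse import urlsplit

ALLOWED_SUFFIXES = (".runcomfy.net", ".runcomfy.com")


def is_allowed_download(url: str) -> bool:
    """True for https URLs whose host is a subdomain of an allowed domain."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # The leading dot in each suffix rejects look-alikes such as
    # "evilruncomfy.net"; checking the parsed hostname rejects
    # "runcomfy.net.example.org".
    return parts.scheme == "https" and host.endswith(ALLOWED_SUFFIXES)
```

Matching on the parsed hostname rather than on the raw URL string is the important design choice: a substring test on the whole URL would accept a path or query that merely mentions an allowed domain.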

Security & Privacy

  • Token storage: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600 (owner-only read/write). Set RUNCOMFY_TOKEN env var to bypass the file entirely in CI / containers.
  • Input boundary: the user prompt is passed as a JSON string to the CLI via --input. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
  • Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
  • Outbound endpoints: only model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated outputs). No telemetry, no callbacks.
  • Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
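
The token-storage claim above is easy to verify on a POSIX machine by inspecting the file's permission bits. The following is a minimal sketch: `is_owner_only` is a hypothetical helper, and the path is the one stated in the list; the check is only meaningful on systems with Unix-style modes.

```python
import os
import stat
from pathlib import Path

# Location stated in the Security & Privacy list.
TOKEN_PATH = Path.home() / ".config" / "runcomfy" / "token.json"


def is_owner_only(path) -> bool:
    """True when no group/other permission bits are set (for example 0600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


if __name__ == "__main__":
    if os.environ.get("RUNCOMFY_TOKEN"):
        print("RUNCOMFY_TOKEN is set; the token file is bypassed")
    elif TOKEN_PATH.exists():
        print("token file owner-only:", is_owner_only(TOKEN_PATH))
    else:
        print("no token file found; run `runcomfy login`")
```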