Audit score 70

image-edit

runcomfy-com/skills

Smart router that picks the right image edit model (Nano Banana, GPT Image 2, Flux Kontext, or Z-Image) based on your intent.

What is image-edit?

This skill intelligently routes image editing requests to the best-fit model in the RunComfy catalog. Use it when you need batch edits, multilingual text rewrites, precise local edits, or mask-driven region replacements—the skill classifies your intent and invokes the right model with optimized prompting patterns.

  • Routes to Nano Banana Edit for batch edits (up to 20 images) and identity-preserving changes
  • Routes to GPT Image 2 Edit for multilingual in-image text rewrites and multi-reference composition
  • Routes to Flux Kontext Pro for single-shot precise local edits with high-fidelity preservation
  • Routes to Z-Image Turbo Inpaint for mask-driven object removal and region replacement
  • Bundles model-specific prompting patterns to reduce iteration and improve edit quality

How to install image-edit

npx skills add https://github.com/runcomfy-com/skills --skill image-edit
Prerequisites
  • RunComfy CLI: npm i -g @runcomfy/cli
  • RunComfy account with runcomfy login
  • For CI/containers: set RUNCOMFY_TOKEN environment variable
Claude Code
Cursor
Windsurf
Cline

How to use image-edit

  1. Install the skill with: npx skills add https://github.com/runcomfy-com/skills --skill image-edit
  2. Describe what you want to edit (e.g., 'swap background', 'rewrite headline', 'remove object')
  3. The skill classifies your intent against the routing table and selects the appropriate model
  4. Provide image URL(s) and any specific parameters (aspect ratio, resolution, mask)
  5. The skill invokes runcomfy run <vendor>/<model>/edit and returns the edited image(s)

Use cases

Good for
  • Batch-edit product SKU galleries with consistent styling and aspect ratio
  • Rewrite in-image headlines in multiple languages (Japanese, Cyrillic, Arabic) while preserving layout
  • Swap backgrounds while keeping subject identity and pose unchanged
  • Remove watermarks, cables, or distractions from images using mask-driven inpainting
  • Compose subjects from one image into scenes from another with matched lighting and palette
Who it's for
  • E-commerce teams managing product image variants
  • Marketing teams localizing visual assets across languages
  • Designers needing precise local edits without full re-renders
  • Content creators removing unwanted objects or backgrounds from photos

image-edit FAQ

Which model should I use for batch editing multiple product images?

Use Nano Banana Edit (default). It supports 1–20 images per call and maintains consistency when you lock aspect_ratio and resolution across the batch.

Can I rewrite text in non-Latin scripts like Japanese or Arabic?

Yes, use GPT Image 2 Edit. It is strongest in class for multilingual typography and honors directional languages like Arabic right-to-left.

How do I remove an object from an image?

For spatial/directional removal, use Nano Banana Edit with a prompt like 'remove the object in the bottom-right corner'. For precise mask-driven removal, use Z-Image Turbo Inpaint with a mask.

What if I need to compose multiple images together?

Use GPT Image 2 Edit with multi-reference composition. Number your references in the prompt (e.g., 'subject from image 1, lighting from image 2') and pass up to 10 image URLs.

Do I need to specify which model to use?

No. Describe your intent naturally ('edit image', 'swap background', 'rewrite headline', etc.) and the skill automatically routes to the best-fit model. Nano Banana Edit is the default if intent is unspecified.

Full instructions (SKILL.md)

Source of truth, from runcomfy-com/skills.


name: image-edit
displayName: "Image Edit — Pro Pack on RunComfy"
description: >
  Edit images on RunComfy — this skill is a smart router that matches the user's intent to the right edit model in the RunComfy catalog. Picks Nano Banana Edit (batch up to 20, identity-preserving default), OpenAI GPT Image 2 Edit (multilingual in-image text rewrite, multi-ref composition, layout precision), Flux Kontext Pro (single-ref high-fidelity local edit), or Z-Image Turbo Inpaint (mask-driven precise region edit). Bundles each model's documented prompting patterns so the skill gets sharper edits without burning iterations on the wrong model. Calls runcomfy run <vendor>/<model>/edit through the local RunComfy CLI. Triggers on "image edit", "edit image", "image-to-image", "i2i", "swap background", "remove object", "rewrite headline", or any explicit ask to edit a single or batch of images.
homepage: https://www.runcomfy.com
license: MIT

Image Edit — Pro Pack on RunComfy

runcomfy.com · Nano Banana Edit · GPT Image 2 Edit · Flux Kontext · Z-Image Inpaint · GitHub

Image edit, intent-routed. This skill doesn't lock you to one model — it picks the right edit model in the RunComfy catalog based on what the user actually wants: batch identity-preservation, multilingual text rewrite, single-shot precise edit, or mask-driven region replacement.

npx skills add agentspace-so/runcomfy-skills --skill image-edit -g

Pick the right model for the user's intent

| User intent | Model | Why |
| --- | --- | --- |
| Batch edit 1–20 images consistently (SKU gallery, A/B variants) | Nano Banana Edit | Up to 20 input images per call; locked aspect/resolution for series |
| Swap background, preserve subject identity | Nano Banana Edit | Strong identity preservation under "keep X unchanged" prompts |
| Localized object removal / addition with spatial language ("the left object", "upper-right corner") | Nano Banana Edit | Honors directional spatial scope |
| Multilingual / non-Latin in-image text rewrite (Japanese kana, Cyrillic, Arabic) | GPT Image 2 Edit | Strongest in class for multilingual typography |
| Multi-reference composition (subject from img1, scene from img2, palette from img3) | GPT Image 2 Edit | Numbered refs route cues correctly |
| Layout-precise repositioning ("move headline from top-right to bottom-center") | GPT Image 2 Edit | Directional language honored at layout level |
| Identity preservation across translated headline variants | GPT Image 2 Edit | Same source asset → many language variants, identity stable |
| Single-shot precise local edit ("she's now holding an orange umbrella") | Flux Kontext Pro | Single-ref single-instruction, high-fidelity preservation |
| Mask-driven object removal (cables, watermarks, distractions) | Z-Image Turbo Inpaint | Mask-required, strength-tunable, edge-consistent |
| Mask-driven region replacement (full background swap with mask) | Z-Image Turbo Inpaint | High strength + clean mask = clean replacement |
| Default if unspecified | Nano Banana Edit | Most flexible, supports both single and batch |

The agent reads this table, classifies the user's intent, and picks the matching subsection below.
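The classification step can be pictured as a keyword lookup. The sketch below is illustrative only: the `pick_route` helper and its trigger phrases are assumptions, not part of the skill, which relies on the agent reading the table. The model IDs are the ones listed in the route sections of this document.

```python
# Hypothetical keyword router; the real skill relies on the agent's judgment.
ROUTES = [
    # (model id from this document, illustrative trigger phrases)
    ("tongyi-mai/z-image/turbo/inpainting", ("mask", "inpaint")),
    ("openai/gpt-image-2/edit",
     ("headline", "translate", "japanese", "cyrillic", "arabic", "compose")),
    ("blackforestlabs/flux-1-kontext/pro/edit", ("holding", "precise local")),
]
DEFAULT_MODEL = "google/nano-banana-2/edit"  # default route per the table


def pick_route(request: str, has_mask: bool = False) -> str:
    """Return a model id for a free-text edit request."""
    if has_mask:
        # A supplied mask implies the mask-driven route.
        return "tongyi-mai/z-image/turbo/inpainting"
    text = request.lower()
    for model_id, keywords in ROUTES:
        if any(word in text for word in keywords):
            return model_id
    return DEFAULT_MODEL
```

Keyword matching is brittle compared with an agent's reading of intent; it is shown here only to make the table's fall-through to the default route concrete.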

Prerequisites

  1. RunComfy CLI: npm i -g @runcomfy/cli
  2. RunComfy account: runcomfy login.
  3. CI / containers: set RUNCOMFY_TOKEN=<token>.
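A wrapper script could verify these prerequisites before calling the CLI. The following is a minimal sketch; `missing_prerequisites` is a hypothetical helper, and the token file path is the one given in the Security & Privacy section of this document.

```python
from __future__ import annotations

import os
import shutil
from pathlib import Path

# Token path as stated in the Security & Privacy section of this document.
DEFAULT_TOKEN_FILE = Path(os.path.expanduser("~/.config/runcomfy/token.json"))


def missing_prerequisites(env=None, which=shutil.which,
                          token_file=DEFAULT_TOKEN_FILE) -> list[str]:
    """Return human-readable problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    problems = []
    if which("runcomfy") is None:
        problems.append("runcomfy CLI not on PATH: npm i -g @runcomfy/cli")
    # Either the environment variable or a saved login is assumed to suffice.
    if not env.get("RUNCOMFY_TOKEN") and not Path(token_file).exists():
        problems.append("no credentials: run `runcomfy login` or set RUNCOMFY_TOKEN")
    return problems
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.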

Route 1: Nano Banana Edit — default for general edit + batch

Model: google/nano-banana-2/edit

Schema

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | yes | | Lead with preservation goals, end with the change. |
| image_urls | array | yes | | 1–20 publicly-fetchable HTTPS URLs. |
| number_of_images | int | no | 1 | 1–4 outputs per call. |
| aspect_ratio | enum | no | auto | auto follows input; lock for batch consistency. |
| resolution | enum | no | 1K | 0.5K / 1K / 2K / 4K. |
| output_format | enum | no | png | png / jpeg / webp. |
| seed | int | no | | Reproducibility. |
| enable_web_search | bool | no | false | Web-grounded edits (extra latency). |
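Checking a request body against this schema locally can catch mistakes before the CLI reports a schema mismatch. The sketch below is an assumption-laden illustration: `validate_nano_banana` is a hypothetical helper, it encodes only the limits stated in the table, and it skips aspect_ratio because the allowed values are not listed here.

```python
from __future__ import annotations

ALLOWED_RESOLUTIONS = ("0.5K", "1K", "2K", "4K")
ALLOWED_FORMATS = ("png", "jpeg", "webp")


def validate_nano_banana(payload: dict) -> list[str]:
    """Return a list of schema problems; empty means the payload looks valid."""
    errors = []
    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt is required")
    urls = payload.get("image_urls")
    if not isinstance(urls, list) or not 1 <= len(urls) <= 20:
        errors.append("image_urls must hold 1-20 URLs")
    elif not all(isinstance(u, str) and u.startswith("https://") for u in urls):
        errors.append("image_urls must be HTTPS URLs")
    count = payload.get("number_of_images", 1)
    if not isinstance(count, int) or not 1 <= count <= 4:
        errors.append("number_of_images must be 1-4")
    if payload.get("resolution", "1K") not in ALLOWED_RESOLUTIONS:
        errors.append("resolution must be one of " + ", ".join(ALLOWED_RESOLUTIONS))
    if payload.get("output_format", "png") not in ALLOWED_FORMATS:
        errors.append("output_format must be one of " + ", ".join(ALLOWED_FORMATS))
    return errors
```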

Invoke

runcomfy run google/nano-banana-2/edit \
  --input '{
    "prompt": "Keep the subject identity, pose, and clothing unchanged. Convert the background into a rainy neon cyberpunk street.",
    "image_urls": ["https://.../portrait.jpg"]
  }' \
  --output-dir <absolute/path>

Batch (lock aspect + resolution):

runcomfy run google/nano-banana-2/edit \
  --input '{
    "prompt": "Replace the watermark in the bottom-right with the text \"AURA\" in clean white sans-serif. Keep everything else exactly as in the input.",
    "image_urls": ["https://.../sku-1.jpg", "https://.../sku-2.jpg", "https://.../sku-3.jpg"],
    "aspect_ratio": "1:1",
    "resolution": "1K"
  }' \
  --output-dir <absolute/path>

Prompting tips

  • Preservation first: "Keep [identity / pose / brand / framing] unchanged." Then state the change.
  • Spatial scope: "background only", "the left object", "upper-right quadrant" — concrete locations honored.
  • Batch consistency: lock aspect_ratio and resolution across the batch.
  • Iterate small: split compound edits into multiple shorter passes.
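The "preservation first" pattern is easy to apply mechanically. Here is a small sketch of a prompt builder; `build_edit_prompt` is a hypothetical helper, not something the skill ships.

```python
from __future__ import annotations


def _join(items: list[str]) -> str:
    """Join items as 'a', 'a and b', or 'a, b, and c'."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def build_edit_prompt(keep: list[str], change: str) -> str:
    """Put the preservation clause first, then the requested change."""
    change = change.strip().rstrip(".") + "."
    if not keep:
        return change
    return f"Keep the {_join(keep)} unchanged. {change}"
```

Called with `["subject identity", "pose", "clothing"]` and the background change from the first Invoke example, it reproduces that example's prompt.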

Route 2: GPT Image 2 Edit — multilingual text + multi-ref composition

Model: openai/gpt-image-2/edit

Schema

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | yes | | Edit instruction; lead with preservation. |
| images | string[] | yes | | Up to 10 HTTPS URLs. First is primary; rest are auxiliary. |
| size | enum | no | auto | auto, 1024_1024, 1024_1536, 1536_1024. Only these. |

Invoke

Multilingual text rewrite:

runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Keep the photograph, layout, and brand mark exactly as in the input. Replace only the in-image headline. The new headline reads \"今日のおすすめ\" in bold Japanese kana, same position and font weight.",
    "images": ["https://.../poster-en.jpg"]
  }' \
  --output-dir <absolute/path>

Multi-ref composition:

runcomfy run openai/gpt-image-2/edit \
  --input '{
    "prompt": "Compose subject from image 1 into the room from image 2. Match the lighting and color palette of image 2. Keep image 1 subject identity unchanged.",
    "images": ["https://.../subject.jpg", "https://.../room.jpg"]
  }' \
  --output-dir <absolute/path>

Prompting tips

  • Quote in-image text exactly. Name the script for non-Latin: "Japanese kana", "Cyrillic", "Arabic right-to-left".
  • Number multi-refs: "subject from image 1, lighting from image 2".
  • Directional layout language: "move the headline from top-right to bottom-center", "replace the watermark in the bottom-right".
  • size: "auto" preserves input ratio — recommended unless the edit changes framing.
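The numbered-reference tip and the schema limits for this route can be combined in a short helper. This is a sketch under the limits stated in this section (up to 10 images, four size values); `numbered_roles` and `build_gpt_image_2_payload` are hypothetical names.

```python
from __future__ import annotations

ALLOWED_SIZES = ("auto", "1024_1024", "1024_1536", "1536_1024")


def numbered_roles(roles: list[str]) -> str:
    """roles[i] names what to take from image i+1, in the order of the images list."""
    return ", ".join(f"{role} from image {i}" for i, role in enumerate(roles, start=1))


def build_gpt_image_2_payload(prompt: str, images: list[str], size: str = "auto") -> dict:
    """Build the JSON body, enforcing the limits documented for this route."""
    if not 1 <= len(images) <= 10:
        raise ValueError("images must hold 1-10 URLs; the first is the primary image")
    if size not in ALLOWED_SIZES:
        raise ValueError("size must be one of " + ", ".join(ALLOWED_SIZES))
    return {"prompt": prompt, "images": list(images), "size": size}
```

Keeping the role list in the same order as `images` avoids mismatches between "image 2" in the prompt and the second URL in the request.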

Route 3: Flux Kontext Pro — single-shot precise local edit

Model: blackforestlabs/flux-1-kontext/pro/edit

Schema (minimal)

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| prompt | string | yes | One declarative edit instruction. |
| image | string | yes | Single source image URL. |
| aspect_ratio | enum | no | Pick from supported W:H values. |
| seed | int | no | Reproducibility. |

Single image only — no array. For multi-image flows, use Route 1 (Nano Banana Edit).
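The single-image constraint is the main way this route's body differs from Route 1. A minimal sketch of a guard, using a hypothetical `build_flux_kontext_payload` helper:

```python
from __future__ import annotations


def build_flux_kontext_payload(prompt: str, images: list[str],
                               seed: int | None = None) -> dict:
    """Build the body for the single-image route; reject multi-image input."""
    if len(images) != 1:
        raise ValueError(
            "Flux Kontext Pro takes exactly one image; use Route 1 for multi-image flows"
        )
    payload = {"prompt": prompt, "image": images[0]}  # note: 'image', not an array
    if seed is not None:
        payload["seed"] = seed
    return payload
```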

Invoke

runcomfy run blackforestlabs/flux-1-kontext/pro/edit \
  --input '{
    "prompt": "Keep the person'\''s face, pose, and clothing unchanged. Add an orange umbrella in her left hand and a slight smile.",
    "image": "https://.../portrait.jpg"
  }' \
  --output-dir <absolute/path>

Prompting tips

  • One declarative instruction. "She is now holding an orange umbrella and smiling" states the end result in a single sentence.
  • Preservation first. Lead with "Keep [unchanged elements]" then state the change.
  • Iterate small. Compound edits drift on a single pass; split into sequential passes.

Route 4: Z-Image Turbo Inpaint — mask-driven precise region edit

Model: tongyi-mai/z-image/turbo/inpainting

Schema

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| prompt | string | yes | What to fill / replace; preservation constraints for the unmasked surround. |
| image | string | yes | Source image URL. |
| mask_image | string | yes | Grayscale mask URL (white = inpaint, black = preserve). |
| strength | float | no | 0.3–0.6 retouching, 0.7–1.0 full replacement. |
| control_scale | float | no | 0.6–0.9 typical. |
| aspect_ratio | enum | no | W:H output ratio. |
| seed | int | no | Reproducibility. |

Invoke

Object removal (low strength):

runcomfy run tongyi-mai/z-image/turbo/inpainting \
  --input '{
    "prompt": "Remove overhead cables; preserve rooflines and sky gradient; thin clean sky.",
    "image": "https://.../street.jpg",
    "mask_image": "https://.../cables-mask.png",
    "strength": 0.5,
    "control_scale": 0.8
  }' \
  --output-dir <absolute/path>

Region replacement (high strength):

runcomfy run tongyi-mai/z-image/turbo/inpainting \
  --input '{
    "prompt": "Replace busy backdrop with smooth light gray studio paper; mask background only.",
    "image": "https://.../product.jpg",
    "mask_image": "https://.../bg-mask.png",
    "strength": 0.9
  }' \
  --output-dir <absolute/path>

Prompting tips

  • A mask URL is required — grayscale, white = inpaint region, black = preserve. Slight blur on mask edges (1–3px) blends better than sharp binary.
  • Strength by intent: 0.3–0.5 for retouching / cleanup, 0.6–0.7 for object replacement with style match, 0.8–1.0 for full-region replacement.
  • Name what stays outside the mask in the prompt: "preserve rooflines and sky gradient", "match brick pattern and mortar tone".
  • Spatial labels still help even though the mask defines the region: "the left shelf", "upper-right quadrant".
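The strength guidance above can be turned into a lookup so callers name an intent instead of guessing a number. In this sketch the intent labels and the starting values (midpoints of the bands in the tips) are assumptions, and `build_inpaint_payload` is a hypothetical helper.

```python
from __future__ import annotations

# Starting strengths chosen as midpoints of the bands in the tips (an assumption).
STARTING_STRENGTH = {
    "retouch": 0.4,             # band 0.3-0.5: retouching / cleanup
    "object_replacement": 0.65, # band 0.6-0.7: replacement with style match
    "region_replacement": 0.9,  # band 0.8-1.0: full-region replacement
}


def build_inpaint_payload(prompt: str, image: str, mask_image: str,
                          intent: str = "retouch",
                          control_scale: float | None = None) -> dict:
    """Build the body for the mask-driven route; the mask is mandatory."""
    if not mask_image:
        raise ValueError("mask_image is required (white = inpaint, black = preserve)")
    if intent not in STARTING_STRENGTH:
        raise ValueError("intent must be one of " + ", ".join(STARTING_STRENGTH))
    payload = {
        "prompt": prompt,
        "image": image,
        "mask_image": mask_image,
        "strength": STARTING_STRENGTH[intent],
    }
    if control_scale is not None:
        payload["control_scale"] = control_scale
    return payload
```

The value is only a starting point; adjusting strength within the band after inspecting the first result is still expected.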

Limitations

  • Each route inherits its model's limits. Nano Banana: 1–20 inputs, 1–4 outputs. GPT Image 2 Edit: up to 10 refs, 4 fixed sizes. Flux Kontext: single ref. Z-Image Inpaint: mask required.
  • No multi-route blending. This skill picks one model per call.
  • Brand-specific overrides — if the user named a specific model, route to the corresponding brand skill (gpt-image-edit, flux-kontext, nano-banana-edit) for fuller treatment.

Exit codes

| code | meaning |
| --- | --- |
| 0 | success |
| 64 | bad CLI args |
| 65 | bad input JSON / schema mismatch |
| 69 | upstream 5xx |
| 75 | retryable: timeout / 429 |
| 77 | not signed in or token rejected |

Full reference: docs.runcomfy.com/cli/troubleshooting.
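A caller can use these codes to decide whether to retry. The sketch below retries only code 75, the one the table marks as retryable; the attempt count, the exponential backoff, and the `run_with_retry` helper itself are assumptions rather than documented CLI behavior.

```python
from __future__ import annotations

import time
from typing import Callable

EXIT_MEANINGS = {
    0: "success",
    64: "bad CLI args",
    65: "bad input JSON / schema mismatch",
    69: "upstream 5xx",
    75: "retryable: timeout / 429",
    77: "not signed in or token rejected",
}
RETRYABLE = {75}


def run_with_retry(run_once: Callable[[], int], max_attempts: int = 3,
                   base_delay: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Call run_once() until it returns a non-retryable exit code."""
    code = 0
    for attempt in range(max_attempts):
        code = run_once()
        if code not in RETRYABLE:
            return code
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return code
```

Here `run_once` would wrap a single `runcomfy run` invocation and return its exit code; `EXIT_MEANINGS` can be used to print a readable message for the final code.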

How it works

The skill picks one of Nano Banana Edit / GPT Image 2 Edit / Flux Kontext Pro / Z-Image Turbo Inpaint based on user intent and invokes runcomfy run <model_id> with the matching JSON body. The CLI POSTs to the Model API, polls the request, fetches the result, and downloads any *.runcomfy.net / *.runcomfy.com URL into --output-dir. Ctrl-C cancels the remote request before exit.
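From a script, the same invocation can be assembled as an argument list so the JSON body never passes through a shell. This is a sketch, not the skill's own code: `build_argv` and `run_edit` are hypothetical helpers, and the only CLI flags used are `--input` and `--output-dir` as shown in this document. The absolute-path check mirrors the `<absolute/path>` placeholder in the examples and is an assumption about what the CLI expects.

```python
from __future__ import annotations

import json
import subprocess
from pathlib import Path


def build_argv(model_id: str, payload: dict, output_dir: str) -> list[str]:
    """Assemble the runcomfy command as a list; no shell quoting is needed."""
    out = Path(output_dir)
    if not out.is_absolute():
        raise ValueError("output_dir should be an absolute path")
    return [
        "runcomfy", "run", model_id,
        "--input", json.dumps(payload, ensure_ascii=False),
        "--output-dir", str(out),
    ]


def run_edit(model_id: str, payload: dict, output_dir: str) -> int:
    """Run the CLI without a shell and return its exit code."""
    return subprocess.run(build_argv(model_id, payload, output_dir), check=False).returncode
```

Serializing with `json.dumps` also handles quotes and non-Latin text in prompts, which otherwise need manual escaping in the shell examples.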

Security & Privacy

  • Token storage: runcomfy login writes the API token to ~/.config/runcomfy/token.json with mode 0600 (owner-only read/write). Set RUNCOMFY_TOKEN env var to bypass the file entirely in CI / containers.
  • Input boundary: the user prompt is passed as a JSON string to the CLI via --input. The CLI does NOT shell-expand the prompt; it transmits the JSON body directly to the Model API over HTTPS. No shell injection surface from prompt content.
  • Third-party content: image / mask / video URLs you pass are fetched by the RunComfy model server, not by the CLI on your machine. Treat external URLs as untrusted; image-based prompt injection is a known risk for any image-edit / video-edit model.
  • Outbound endpoints: only model-api.runcomfy.net (request submission) and *.runcomfy.net / *.runcomfy.com (download whitelist for generated outputs). No telemetry, no callbacks.
  • Generated-file size cap: the CLI aborts any single download > 2 GiB to prevent disk-fill from a malicious or runaway model output.
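The owner-only permission on the token file can be checked from a script on POSIX systems. A minimal sketch with a hypothetical `token_file_is_private` helper; the path to pass in is the one named in the token storage bullet.

```python
import os
import stat


def token_file_is_private(path: str) -> bool:
    """True if neither group nor others have any permission bits on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

For example, `token_file_is_private(os.path.expanduser("~/.config/runcomfy/token.json"))` would be expected to return True after `runcomfy login`, given the 0600 mode described above.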