gpt-image-2

gargantuax/openskills

Full OpenAI-compatible GPT Image 2 coverage for text-to-image, edits, and streaming responses.

What is gpt-image-2?

A single Python entrypoint for all three GPT Image 2 routes (generations, edits, and responses), with strict pre-flight validation. Use it when you need image workflows beyond simple text-to-image: multi-image batches, mask edits, streaming, partial previews, and mixed text+image flows. Works with any OpenAI-compatible gateway.

Text-to-image generation with configurable size, quality, and format
Multi-image batch generation with filename patterns
Image editing with mask support and multiple reference images
Streaming responses with partial image previews
Mixed text and image input through Responses API
Strict pre-flight validation of model constraints and feature compatibility

How to install gpt-image-2

npx skills add https://github.com/gargantuax/openskills --skill gpt-image-2

Prerequisites

Python environment with OpenAI SDK
OPENAI_API_KEY environment variable or .env file
Optional: OPENAI_BASE_URL for non-OpenAI endpoints

How to use gpt-image-2

1. Install the skill via npx skills add
2. Review references/config.md for environment variables and defaults
3. Choose your API route: generations (text-to-image), edits (mask-based), or responses (advanced flows)
4. Run scripts/gpt_image.py with the appropriate subcommand and flags
5. Use --dry-run to validate the payload before sending
6. Use --save-response to debug raw JSON or SSE streams

Use cases

Good for

Generate hero images and marketing assets from text prompts
Create product variations by editing base images with masks
Batch-generate multiple design options with consistent parameters
Stream image generation with progressive preview updates
Build AI-powered image editing workflows with reference images and masks

Who it's for

AI agents and coding assistants automating image workflows
Developers building image generation pipelines
Product teams generating marketing and design assets
Teams using OpenAI-compatible image endpoints

gpt-image-2 FAQ

When should I use generations vs. edits vs. responses?

Use generations for simple text-to-image calls. Use edits for multipart image edits with masks. Use responses for advanced flows: streaming, mixed text+image input, previous_response_id, tool_choice, action, and optional tool_model.

Can I generate multiple images at once?

Yes. Use the --n flag with generations or responses, and provide an output pattern like image-{index}.png to save each result separately.
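As an illustration of how an output pattern can map to one file per image, here is a minimal Python sketch. The function name expand_output_pattern is hypothetical, and the zero-based numbering is an assumption; the real script may number files differently.

```python
from pathlib import Path


def expand_output_pattern(pattern: str, count: int) -> list[Path]:
    """Return one output path per image for a single path or an {index} pattern."""
    if "{index}" not in pattern:
        if count > 1:
            # Assumption: a fixed path cannot hold several images, so reject it.
            raise ValueError("use a pattern such as image-{index}.png when n > 1")
        return [Path(pattern)]
    # Assumption: indices start at 0; the real script may start at 1.
    return [Path(pattern.replace("{index}", str(i))) for i in range(count)]


paths = expand_output_pattern("out/skyline-{index}.webp", 3)
print([p.name for p in paths])
```

Rejecting a fixed path when several images are requested avoids silently overwriting earlier results.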

Does this work with non-OpenAI endpoints?

Yes. Set OPENAI_BASE_URL to any OpenAI-compatible gateway endpoint. The skill respects both .env files and process environment variables.
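The SKILL.md rules state that process environment variables override .env and that CLI flags override both. A minimal sketch of that precedence follows; parse_dotenv and resolve_setting are hypothetical helpers, not the script's actual functions, and the .env parser handles only plain KEY=VALUE lines.

```python
import os


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_setting(name, cli_value, dotenv, environ=os.environ, default=None):
    """Pick a setting: CLI flag first, then process environment, then .env."""
    if cli_value is not None:
        return cli_value
    if name in environ:
        return environ[name]
    return dotenv.get(name, default)


# The gateway URL below is a placeholder, not a real endpoint.
dotenv = parse_dotenv('OPENAI_BASE_URL="https://gateway.example/v1"\n')
base_url = resolve_setting("OPENAI_BASE_URL", None, dotenv, environ={},
                           default="https://api.openai.com/v1")
print(base_url)
```

Only non-secret values such as the base URL should be printed this way; the API key should never be echoed.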

What validation does the skill perform?

It checks each request against the model's size, aspect-ratio, and feature constraints before sending it. Use --dry-run to inspect the built request without sending it.
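A rough sketch of such a pre-flight check, covering only the combinations SKILL.md names as unsupported. The function validate_request and its parameters are hypothetical. The size check only tests the WIDTHxHEIGHT shape, because the allowed sizes are not listed here, and accepting "auto" is an assumption.

```python
def validate_request(route, *, size=None, n=1, stream=False,
                     partial_images=None, background=None):
    """Return a list of problems; an empty list means the request may be sent."""
    problems = []
    if background == "transparent":
        problems.append("transparent background is not supported")
    if partial_images is not None and not 0 <= partial_images <= 3:
        problems.append("partial_images must be between 0 and 3")
    if route in ("generations", "edits") and stream and n > 1:
        problems.append("stream=true cannot be combined with n>1 on Images routes")
    if size is not None and size != "auto":  # assumption: "auto" is accepted
        width, sep, height = size.partition("x")
        if not (sep and width.isdigit() and height.isdigit()):
            problems.append(f"size {size!r} is not in WIDTHxHEIGHT form")
    return problems


print(validate_request("generations", size="1536x1024", stream=True, n=2))
```

Collecting every problem before failing lets a caller fix all of them in one pass instead of one per run.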

How do I stream images with partial previews?

Use the responses subcommand with the --stream and --partial-images flags; --partial-images accepts 0 to 3 previews. Provide an output pattern to save the streamed results.
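For readers handling the stream themselves, the sketch below saves partial previews from events that have already been parsed into dicts. It assumes previews arrive as events of type response.image_generation_call.partial_image with partial_image_index and partial_image_b64 fields, as OpenAI's Responses streaming describes them; other gateways may differ. The save_partial_previews helper and the partial-N file naming are hypothetical.

```python
import base64
from pathlib import Path

# Assumed event name; check it against the gateway you are using.
PARTIAL_EVENT = "response.image_generation_call.partial_image"


def save_partial_previews(events, pattern):
    """Write each partial preview to disk and return the paths written."""
    written = []
    for event in events:
        if event.get("type") != PARTIAL_EVENT:
            continue  # skip status updates and any other event types
        index = event["partial_image_index"]
        # Hypothetical naming: poster-{index}.png becomes poster-partial-0.png.
        target = Path(pattern.replace("{index}", f"partial-{index}"))
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(base64.b64decode(event["partial_image_b64"]))
        written.append(target)
    return written
```

Passing the events in as an iterable keeps the function independent of how the SSE stream is read, so it can be exercised with a saved --save-response capture.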

Full instructions (SKILL.md)

Source of truth, from gargantuax/openskills.

name: gpt-image-2
description: Full OpenAI-compatible GPT Image 2 coverage across images/generations, images/edits, and responses with the image_generation tool. Use when the one-shot image helper is not enough: text-to-image, mask edits, multi-image batches, streaming, partial_images, and mixed text+image Responses flows. Reads .env and respects process environment variables; works with any OpenAI-compatible gateway.

GPT Image 2

A single Python entrypoint that covers every GPT Image 2 route, with strict pre-flight validation of the model's size, aspect, and feature constraints.

Workflow

1. Open references/config.md to pick environment variables and defaults.
2. Open references/api-surface.md to choose between generations, edits, and responses.
3. Prefer OPENAI_BASE_URL=https://api.openai.com/v1 unless the user asks for a different OpenAI-compatible endpoint.
4. Use gpt-image-2 for generations and edits; use a text-capable Responses model such as gpt-5.4 for responses.
5. Run scripts/gpt_image.py with one of the three subcommands.
6. Add --dry-run first when the payload shape is the main risk.
7. Add --save-response <path> when the raw JSON body or SSE event stream needs to be kept for debugging.

Commands

Text-to-image through the public Images API:

python .\skills\gpt-image-2\scripts\gpt_image.py generations `
  --prompt "A bold product hero image for a developer tool homepage" `
  --output .\out\hero.png `
  --size 1536x1024 `
  --quality high `
  --format png

Multi-image batch with a filename pattern:

python .\skills\gpt-image-2\scripts\gpt_image.py generations `
  --prompt "A cinematic city skyline at night" `
  --output .\out\skyline-{index}.webp `
  --n 3 `
  --format webp `
  --compression 90

Image edits with two inputs plus a mask:

python .\skills\gpt-image-2\scripts\gpt_image.py edits `
  --prompt "Blend the two references into one clean marketing illustration" `
  --image .\refs\subject.png `
  --image .\refs\background.png `
  --mask .\refs\mask.png `
  --output .\out\edit-{index}.png `
  --image-field-style brackets `
  --n 2

Responses API with streaming and partial previews:

python .\skills\gpt-image-2\scripts\gpt_image.py responses `
  --input-text "Generate a poster for an AI developer summit" `
  --model gpt-5.4 `
  --output .\out\poster-{index}.png `
  --stream `
  --partial-images 2 `
  --save-response .\out\poster-events.json

Responses API edit with a local image plus a mask:

python .\skills\gpt-image-2\scripts\gpt_image.py responses `
  --input-text "Turn this product shot into a clean studio ad" `
  --model gpt-5.4 `
  --input-image .\refs\product.png `
  --mask .\refs\mask.png `
  --output .\out\studio.png `
  --action edit

Inspect the built request without sending it:

python .\skills\gpt-image-2\scripts\gpt_image.py generations `
  --prompt "A minimal cover image" `
  --output .\out\cover.png `
  --dry-run

Rules

Use generations for public text-to-image calls.
Use edits for multipart image edits and mask uploads.
Use responses for advanced flows: streaming, mixed text + image input, previous_response_id, tool_choice, action, and optional tool_model.
Process environment variables override .env; CLI flags override both.
Never print secrets.
--output takes either a single path or a pattern such as image-{index}.png for multi-image or streaming flows.
responses uses a top-level Responses model separate from the image model; default it to gpt-5.4 unless you need another text-capable model.
quality on Responses tool flows is passed through, but final behavior still depends on the hosted image tool.
On OpenAI GPT image models, omit response_format; image data already comes back as base64.
Fail fast on unsupported gpt-image-2 combinations: transparent background, invalid size, partial_images outside 0..3, or stream=true with n>1 on public Images routes.
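Because image data comes back as base64, a caller working with a raw Images API response body has to decode it before writing files. A minimal sketch, assuming the body follows the OpenAI shape of a data list whose items carry a b64_json field; decode_images is a hypothetical helper, not part of the script.

```python
import base64


def decode_images(response_body: dict) -> list[bytes]:
    """Return raw image bytes for every entry in an Images API response body."""
    images = []
    for item in response_body.get("data", []):
        encoded = item.get("b64_json")
        if encoded is None:
            # Assumption: a gateway that returns URLs instead needs a download step.
            raise ValueError("response item has no b64_json field")
        images.append(base64.b64decode(encoded))
    return images
```

The returned bytes can then be written to the paths produced from the --output pattern, one file per item.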

Resources

Script: scripts/gpt_image.py
Config reference: references/config.md
API surface reference: references/api-surface.md
