claude-skill-codex-imagegen

Use gpt-image-2, OpenAI's most capable image generation model, from inside Claude Code.

📦 What

A Claude Code skill that calls Codex CLI's $imagegen (gpt-image-2) on plain natural-language asks — "generate a hero image", "make a favicon", "insert images that fit the site" — and lands the result where you actually wanted it. No new slash command to learn. Claude calls it as part of whatever it's already doing.

💡 Why

Claude Code has no built-in image model, so most vibe-coded sites either ship without imagery or paste in stock photos that don't match. Generated images from a year ago screamed "AI" louder than the layout did, so people stopped trying. gpt-image-2 is finally good enough to ship as-is (near-perfect text rendering, consistent lighting, real subject framing), which makes the image layer the cheapest way out of the "every AI site looks the same" trap. This skill makes image generation an in-session step, optionally guided by a DESIGN.md you keep at your project root.

🚀 Quickstart

git clone https://github.com/JunSeo99/claude-skill-codex-imagegen.git
mkdir -p ~/.claude/skills
cp -r claude-skill-codex-imagegen/skill ~/.claude/skills/codex-imagegen

Restart Claude Code, then ask in natural language:

"Generate a 1600×900 hero image for this landing page, save to assets/hero.png."

Want consistency across an entire site? Drop a DESIGN.md at the project root, then:

"Using DESIGN.md as the style reference, insert images that fit the site."

That's it. Full details below.

Claude Code does not ship with an image-generation model of its own. This skill closes that gap by teaching Claude Code to call gpt-image-2 through the OpenAI Codex CLI's built-in $imagegen feature — so you can generate icons, banners, OG cards, illustrations, infographics, and photo edits without ever leaving your Claude Code session.

The skill bundles a verified prompting playbook, a CLI reference, a security note, and a sample asset produced during validation.

Sample 1600×900 hero image generated by gpt-image-2 via this skill.

Why this skill exists

Claude Code can already drive the Codex CLI, but $imagegen has rough edges that Claude misses on its own:

gpt-image-2 ignores the exact output size you request (e.g. 256×256 → 1254×1254)
Transparent PNGs are not supported by gpt-image-2 (only gpt-image-1.5 supports them, per the OpenAI guide)
The raw PNG lands at ~/.codex/generated_images/<session-uuid>/ig_*.png — not where you asked
"Stunning, cinematic, 8K" keyword prompts produce visibly worse output than the five-part structured prompts the official OpenAI Cookbook recommends
The naive non-interactive recipe requires --dangerously-bypass-approvals-and-sandbox, which hands the Codex sub-agent broad shell power — not a safe default

This skill bakes those facts in, defaults to a safer split workflow (Codex generates only; the host does the file moves), and only opts into the bypass mode when explicitly requested.
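As a rough illustration of the host-side half of that split workflow, the sketch below finds the newest ig_*.png under the Codex output directory and copies it to the path you actually wanted. The function names are hypothetical and this is not the skill's own implementation; it only assumes the ~/.codex/generated_images/<session-uuid>/ig_*.png layout described above, which may change between Codex CLI releases.

```shell
# Print the most recently modified ig_*.png below a root directory.
# The root defaults to the location described above; pass another one for testing.
newest_generated_image() {
  root="${1:-$HOME/.codex/generated_images}"
  # ls -t sorts newest first. Session directories and ig_* names contain no
  # whitespace, so reading ls output is acceptable for this sketch.
  ls -t "$root"/*/ig_*.png 2>/dev/null | head -n 1
}

# Copy the newest generated image to the destination the user asked for.
collect_generated_image() {
  dest="$1"
  src="$(newest_generated_image "${2:-}")"
  [ -n "$src" ] || { echo "no generated image found" >&2; return 1; }
  mkdir -p "$(dirname "$dest")" && cp "$src" "$dest"
}
```

A call such as `collect_generated_image ./assets/hero.png` would run after Codex has finished generating, in the host's own approved-tool context.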

What people use it for

The tool is general — anything that needs a PNG/JPEG/WebP written to disk fits. In practice the workflows that come up most often:

Hero images and background photography for landing pages and marketing sites
OG cards and social previews generated per page
Favicons and app icons at the sizes you actually need
Blog post illustrations that match the post's tone instead of leaning on stock libraries
Brand asset drafts — logos, banners, badges — to iterate before committing to a designer
Infographic placeholders and diagrams with consistent visual language
Photo edits — change-X-keep-Y patterns on an existing image

The workflow it was originally built around is solo developers shipping sites without a designer — where image quality and stylistic consistency are the main signal separating a vibe-coded site from a polished product. With a DESIGN.md at the project root (see Usage), Claude Code can generate a coherent image set across the whole site in one pass. But none of that requires you to be using it for a site; the skill is just as happy producing a single OG card or a batch of game-asset placeholders.

⚠️ Security note: this skill defines two run modes. The default is safe; the opt-in "automated" mode uses --dangerously-bypass-approvals-and-sandbox. Read SECURITY.md before using the automated mode in a directory whose prompts or contents you do not control.

Requirements

macOS or Linux
Claude Code — this skill is a filesystem skill loaded from ~/.claude/skills/, which is a Claude Code feature (Claude.ai web uses a different skill upload mechanism)
Codex CLI v0.130 or newer (npm i -g @openai/codex)
A logged-in Codex session (codex login) — uses your ChatGPT/Codex subscription
Optional: OPENAI_API_KEY in your environment to bill batch jobs against the API instead

Verified against codex-cli 0.130.0 on macOS (Darwin 25.4.0). sips ships with macOS; on Linux the skill falls back to ImageMagick convert.
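If you want to check these requirements before installing, a preflight along the following lines may help. It is a sketch, not part of the skill: the function names are made up, and it assumes `codex --version` prints a dotted version number such as "codex-cli 0.130.0".

```shell
# Succeed when dotted version $1 is >= dotted version $2 (numeric compare per field).
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    n = split(have, h, "."); m = split(want, w, ".")
    max = (n > m) ? n : m
    for (i = 1; i <= max; i++) {
      a = (i <= n) ? h[i] + 0 : 0
      b = (i <= m) ? w[i] + 0 : 0
      if (a > b) exit 0
      if (a < b) exit 1
    }
    exit 0
  }'
}

# Check that codex is installed and recent enough, and that a resize tool exists.
preflight() {
  command -v codex >/dev/null 2>&1 || { echo "codex not on PATH" >&2; return 1; }
  # Assumption: the first dotted number in the output is the CLI version.
  have="$(codex --version 2>/dev/null | grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1)"
  version_at_least "${have:-0}" "0.130" || {
    echo "codex-cli ${have:-unknown} is older than 0.130" >&2; return 1; }
  command -v sips >/dev/null 2>&1 || command -v convert >/dev/null 2>&1 || {
    echo "need sips (macOS) or ImageMagick convert (Linux)" >&2; return 1; }
}
```

Running `preflight && echo ok` reports the first missing requirement on stderr. It does not check whether you are logged in.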

Installation

Option A — clone into the skills directory (recommended for daily use)

git clone https://github.com/JunSeo99/claude-skill-codex-imagegen.git
mkdir -p ~/.claude/skills
ln -s "$(pwd)/claude-skill-codex-imagegen/skill" ~/.claude/skills/codex-imagegen

Symlinking from skill/ lets you git pull to update without re-copying files.

Option B — install the prebuilt `.skill` bundle

A pre-packaged distributable lives in dist/codex-imagegen.skill (it's a zip with a .skill extension).

git clone https://github.com/JunSeo99/claude-skill-codex-imagegen.git
mkdir -p ~/.claude/skills
unzip claude-skill-codex-imagegen/dist/codex-imagegen.skill -d ~/.claude/skills/

Option C — copy the folder

git clone https://github.com/JunSeo99/claude-skill-codex-imagegen.git
mkdir -p ~/.claude/skills
cp -r claude-skill-codex-imagegen/skill ~/.claude/skills/codex-imagegen

Restart Claude Code (or start a new session) so the skill is discovered.
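To confirm that any of the three options put the files where Claude Code looks, you can check for SKILL.md at the top of the installed folder. The helper below is a hypothetical convenience, not something the repo ships.

```shell
# Succeed when SKILL.md sits directly inside the skill directory.
# -f follows symlinks, so this also covers the Option A symlink install.
skill_installed() {
  dir="${1:-$HOME/.claude/skills/codex-imagegen}"
  [ -f "$dir/SKILL.md" ]
}
```

For example, `skill_installed || echo "SKILL.md not found"` prints a warning when the folder is missing or nested one level too deep.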

Usage

The skill activates on phrases such as:

"generate an image", "make an icon", "create a banner", "OG image"
"hero illustration", "make a favicon", "brand mark", "product shot"
"imagegen", "GPT Image 2", "codex image"

Multilingual triggers are supported via the skill's description field — localized prompts (Korean, Japanese, etc.) work without configuration.

Basic usage — single asset

Any request that produces a visual file saved to disk:

You: Make a 512×512 hero icon for my landing page — a single seedling growing from a flat horizon, line-art only, no text.

Claude: (invokes the skill, composes a five-part prompt, runs codex exec in the safe Mode A, parses the absolute path from stdout, copies the file with cp and resizes it with sips to ./assets/hero-icon.png, then opens the result with Read to verify it matches your intent)

By default Claude runs Codex without --dangerously-bypass-approvals-and-sandbox and does the cp/sips step itself, in its own approved-tool context. The Codex sub-agent never gets carte-blanche shell access during normal use.
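The "parses the absolute path from stdout" step could look roughly like the filter below. This is a sketch under assumptions: the exact wording of Codex's output is not specified here, so the filter only relies on an absolute path that contains generated_images/ and ends in ig_*.png, and `extract_image_path` is a hypothetical name.

```shell
# Read captured Codex output on stdin and print the last absolute path that
# looks like .../generated_images/<session>/ig_*.png.
extract_image_path() {
  grep -Eo '/[^[:space:]]*/generated_images/[^[:space:]]*/ig_[^[:space:]]*\.png' |
    tail -n 1
}
```

You would pipe the saved output of a Mode A run into it, then hand the printed path to cp and the resize step. Nothing here requires the bypass flag.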

If you want a single-step automated flow (Mode B) — e.g. for batching — you can opt in:

You: Generate these 12 favicons in automated mode.

Claude: (after confirming, runs Codex with --dangerously-bypass-approvals-and-sandbox so the sub-agent handles cp/sips itself)

Read SECURITY.md before opting in.

For complex prompts (text in the image, photo edits, brand assets), Claude reads references/prompting-guide.md before generating to apply the structured prompt template and avoid known pitfalls.
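For text inside an image, the Known limitations section mentions an EXACT TEXT marker with double quotes and letter-by-letter spelling for brand names. The helpers below show one way to produce such fragments. The exact wording the skill uses lives in references/prompting-guide.md, so treat the output format and both function names here as assumptions.

```shell
# Wrap a literal string in an EXACT TEXT clause with double quotes.
exact_text_clause() {
  printf 'EXACT TEXT: "%s"\n' "$1"
}

# Spell a single-word brand name letter by letter, e.g. Acme becomes A-c-m-e.
spell_out() {
  printf '%s\n' "$1" | sed 's/./&-/g; s/-$//'
}
```

A constraint line might then combine the two, for instance `exact_text_clause "Acme"` followed by a note that the name is spelled `$(spell_out Acme)`.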

For full-site image sets — pair with a `DESIGN.md`

For projects that need a coherent visual language across multiple slots — hero, OG card, empty states, illustrations, favicons — drop a DESIGN.md at the project root with your palette, typography, and illustration style. Then ask Claude Code:

You: Using DESIGN.md as the style reference, insert images that fit the site.

Claude reads DESIGN.md, scans the codebase for slots that need imagery, writes prompts that incorporate the palette and tone, calls this skill for each, and inserts the resulting paths into the right <img> tags. The hero image, the empty-state illustration, and the OG card all end up looking like they belong to the same product.

A minimal DESIGN.md that works well:

# Design

## Concept
Calm, considered, modern.

## Palette
- Surface (main):  #F4F1ED  — warm off-white
- Surface (cards): #FFFFFF
- Text:            #1A1A1A
- Accent / CTA:    #C46A4E  — soft terracotta, used sparingly

## Typography
- Inter, system-ui sans-serif

## Illustration style
- Single subject, plenty of whitespace, no busy backgrounds
- Soft natural light from upper left
- No text inside images unless explicitly asked
- Avoid stock-photo vibes and over-saturated colors

The qualitative Illustration style block carries most of the consistency work. Palette matters too, but it's the descriptive instructions ("single subject, plenty of whitespace", "no busy backgrounds", "soft natural light from upper left") that keep each image from looking like it came from a different stock-image library.
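In the skill, Claude reads DESIGN.md directly. If you want to see which lines of a section would end up steering the prompts, a small extractor like this one can print them. It is illustrative only, the function name is hypothetical, and it assumes sections are introduced by "## " headings as in the sample above.

```shell
# Print the non-blank lines of one "## <heading>" section of a DESIGN.md file.
# $1 = heading text without the leading "## ", $2 = path to the file.
design_section() {
  awk -v want="## $1" '
    /^## / { inside = ($0 == want); next }   # entering or leaving a section
    inside && NF { print }                    # keep non-blank lines only
  ' "$2"
}
```

Calling `design_section "Illustration style" DESIGN.md` on the sample prints its four bullet lines, which could be appended to the constraints part of each prompt.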

Before / After — what the image layer changes

To make the difference concrete, here's the same coffee-shop landing page built two ways. Identical component code in both — same Next.js 15, same Tailwind, same shadcn-style markup, same content, same navigation. The only thing that varies is the image layer.

Without images	With images

0 images. Lucide `Coffee` over a purple-blue gradient hero, `Bean` icons inside product cards, `Sparkles` over a gradient story, `Droplet`/`Flame` brewing icons. The textbook AI default stack.	8 images generated by this skill with a `DESIGN.md` at the project root. Hero photography, five custom coffee-bag product shots (origin, roast, tasting notes, roast/best-by dates, brew recipe all on-label), a roastery story background, three brewing macro shots.

Both pages were produced in the same session. The right-hand one took roughly one extra prompt: "Using DESIGN.md as the style reference, insert images that fit the site." The demo project itself is intentionally kept outside this repo to keep the skill bundle small.

What's in the skill

skill/
├── SKILL.md                  Workflow (Mode A & B), recipes, failure modes, triggers
├── references/
│   ├── prompting-guide.md    5-part structure, text rendering, edit pattern, anti-patterns
│   └── cli-reference.md      codex exec flags, output paths, sips/convert post-processing, costs
└── assets/
    └── hero.png              Sample 1600×900 hero image generated by gpt-image-2

The five-part prompt structure (drawn from fal.ai and the OpenAI Cookbook) is:

Scene/context — environment, time of day, mood
Subject — main figure or object
Details — style, medium, lighting, lens, color, texture
Use case — drives output size and aspect
Constraints — what to preserve, what to forbid
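To make the structure concrete, here is a minimal sketch that assembles the five parts into one prompt. The labels and the `compose_prompt` name are assumptions for illustration; the skill's own template is in references/prompting-guide.md and may be worded differently.

```shell
# Assemble a five-part prompt: scene, subject, details, use case, constraints.
compose_prompt() {
  printf 'Scene: %s\nSubject: %s\nDetails: %s\nUse case: %s\nConstraints: %s\n' \
    "$1" "$2" "$3" "$4" "$5"
}
```

For example:

```shell
compose_prompt \
  "flat horizon at dawn, calm mood" \
  "a single seedling" \
  "line art, thin strokes, warm off-white background" \
  "landing page hero icon at exactly 512x512 pixels" \
  "no text, no busy background"
```

Compared with a keyword pile such as "stunning, cinematic, 8K", each line answers a distinct question, which is the point of the structure.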

Cost

ChatGPT/Codex subscription: one image turn uses roughly as much of your usage limit as 3–5 text turns
API key mode (OPENAI_API_KEY set): billed per token, which typically works out to $0.04 – $0.35 per image
- Image output: $30.00 / 1M output tokens
- Image input: $8.00 / 1M input tokens ($2.00 / 1M cached input)
- Plus text-input tokens for your prompt (see OpenAI pricing for the current text rate on gpt-image-2)

For batch work (10+ images), the API key mode is generally cheaper than the subscription.
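The per-token rates above translate into a per-image estimate with simple arithmetic. The sketch below applies the $30.00 and $8.00 per 1M rates to token counts you supply. The token counts in the example are hypothetical, text-input tokens are left out, and `image_cost_usd` is an illustrative name.

```shell
# Estimate the cost in USD of one image from its token counts.
# $1 = image output tokens, $2 = image input tokens (optional, default 0).
image_cost_usd() {
  awk -v out="$1" -v img_in="${2:-0}" 'BEGIN {
    # $30.00 per 1M image output tokens, $8.00 per 1M image input tokens
    printf "%.4f\n", out * 30 / 1000000 + img_in * 8 / 1000000
  }'
}
```

With a made-up 2,000 output tokens, `image_cost_usd 2000` prints 0.0600, and `image_cost_usd 10000 1000` prints 0.3080. Multiply by the batch size to compare against your subscription.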

Known limitations of gpt-image-2

Limitation	Workaround the skill applies
Output size doesn't match request	Always adds "at exactly WxH pixels"; host runs `sips -z H W` (macOS) or `convert -resize WxH!` (Linux)
No transparent PNG	Documents the limitation; suggests gpt-image-1.5 via Image API or post-processing
Long multi-line text passages, brand names, and very small text in dense layouts still wobble (short labels and CJK render near-perfectly)	EXACT TEXT marker + double quotes for literal strings; letter-by-letter spelling for brand names; HTML/CSS overlay for paragraph-length text
Latency up to 2 min on complex prompts	Bash timeout set to 300000 ms
Imprecise element placement in complex layouts	Falls back to simplification or SVG-then-rasterize suggestion
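The size workaround in the first row can be sketched as a single host-side function that prefers sips and falls back to ImageMagick. This is an approximation of the idea rather than the skill's actual code, and `resize_exact` is a hypothetical name. Note that both commands force the exact size and do not preserve aspect ratio.

```shell
# Resize an image in place to exactly WIDTH x HEIGHT pixels.
# $1 = file, $2 = width, $3 = height
resize_exact() {
  file="$1"; width="$2"; height="$3"
  if command -v sips >/dev/null 2>&1; then
    # sips -z takes the height first, then the width.
    sips -z "$height" "$width" "$file" >/dev/null
  elif command -v convert >/dev/null 2>&1; then
    # The trailing "!" tells ImageMagick to ignore the aspect ratio.
    convert "$file" -resize "${width}x${height}!" "$file"
  else
    echo "neither sips nor convert found" >&2
    return 127
  fi
}
```

For the 1600×900 hero, that would be `resize_exact assets/hero.png 1600 900`. If the generated image has a different aspect ratio, crop first or request a matching aspect in the prompt to avoid distortion.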

Compatibility

Component	Tested
`codex-cli`	0.130.0
OS	macOS (Darwin 25.4.0); Linux untested but expected to work with ImageMagick fallback
Claude Code	App / CLI (filesystem skills)

Output-path layout under ~/.codex/generated_images/ and $imagegen invocation semantics are not part of the Codex CLI's public contract. If a future codex-cli release changes them, please open an issue with the new behavior.

Contributing

Issues and PRs welcome. Useful directions:

Linux-side post-processing parity (ImageMagick verified end-to-end)
Additional recipes (favicons, app store screenshots, social card pipelines)
Improved non-Latin text rendering tips (CJK, Arabic, Devanagari, etc.)
Migration notes when newer Codex CLI versions change $imagegen behavior

When changing the skill body, run the validators from anthropics/skills:

python3 path/to/skill-creator/scripts/quick_validate.py skill/
python3 path/to/skill-creator/scripts/package_skill.py skill/ dist/

Security

See SECURITY.md for the trust boundary and the threat model around --dangerously-bypass-approvals-and-sandbox.

Changelog

See CHANGELOG.md.

Acknowledgements

This project is independent and not affiliated with, endorsed by, or sponsored by Anthropic or OpenAI. "Claude", "Claude Code", "OpenAI", "Codex", and "GPT" are trademarks of their respective owners.
