Use OpenAI's gpt-image-2 — OpenAI's most capable image generation model — from inside Claude Code.
A Claude Code skill that calls Codex CLI's $imagegen (gpt-image-2) on plain natural-language asks — "generate a hero image", "make a favicon", "insert images that fit the site" — and lands the result where you actually wanted it. No new slash command to learn. Claude calls it as part of whatever it's already doing.
Claude Code has no built-in image model. So most vibe-coded sites either ship without imagery or paste in stock that doesn't match. And generated images from a year ago screamed "AI" louder than the layout did, so people stopped trying. gpt-image-2 finally clears that bar — near-perfect text rendering, consistent lighting, real subject framing — which makes the image layer the cheapest way out of the "every AI site looks the same" trap. This skill makes that an in-session step, optionally guided by a DESIGN.md you keep at your project root.
git clone https://github.com/JunSeo99/claude-skill-codex-imagegen \
  ~/.claude/skills/codex-imagegen
Restart Claude Code, then ask in natural language:
"Generate a 1600×900 hero image for this landing page, save to assets/hero.png."
Want consistency across an entire site? Drop a DESIGN.md at the project root, then:
"Using DESIGN.md as the style reference, insert images that fit the site."
That's it. Full details below.
Claude Code does not ship with an image-generation model of its own. This skill closes that gap by teaching Claude Code to call gpt-image-2 through the OpenAI Codex CLI's built-in $imagegen feature — so you can generate icons, banners, OG cards, illustrations, infographics, and photo edits without ever leaving your Claude Code session.
The skill bundles a verified prompting playbook, a CLI reference, a security note, and a sample asset produced during validation.
Sample 1600×900 hero image generated by gpt-image-2 via this skill.
Claude Code can already drive the Codex CLI, but $imagegen has rough edges that Claude misses on its own:
- gpt-image-2 ignores the exact output size you request (e.g. 256×256 → 1254×1254)
- Transparent PNGs are not supported by gpt-image-2 (only gpt-image-1.5 supports them, per the OpenAI guide)
- The raw PNG lands at ~/.codex/generated_images/<session-uuid>/ig_*.png, not where you asked
- "Stunning, cinematic, 8K" keyword prompts produce visibly worse output than the five-part structured prompts the official OpenAI Cookbook recommends
- The naive non-interactive recipe requires --dangerously-bypass-approvals-and-sandbox, which hands the Codex sub-agent broad shell power and is not a safe default
This skill bakes those facts in, defaults to a safer split workflow (Codex generates only; the host does the file moves), and only opts into the bypass mode when explicitly requested.
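The host-side half of that split workflow can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the function names latest_generated_image and collect_image are hypothetical, and it picks the newest file by modification time rather than parsing the path from Codex's stdout. It assumes the <root>/<session-uuid>/ig_*.png layout described above, which may change between Codex CLI releases.

```shell
# Print the most recently modified ig_*.png under the Codex output root ($1).
# Assumption: images live at <root>/<session-uuid>/ig_*.png.
latest_generated_image() {
  local root newest f
  root="${1:-$HOME/.codex/generated_images}"
  newest=""
  for f in "$root"/*/ig_*.png; do
    [ -e "$f" ] || continue            # an unmatched glob stays literal; skip it
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"
    fi
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Copy the newest generated image to dest ($1), creating parent directories.
# The optional second argument overrides the output root.
collect_image() {
  local dest root src
  dest="$1"
  root="${2:-$HOME/.codex/generated_images}"
  src="$(latest_generated_image "$root")" || return 1
  mkdir -p "$(dirname "$dest")"
  cp "$src" "$dest"
}
```

Running collect_image assets/hero.png after a generation turn would copy the newest image into the project. Because the copy runs on the host, the Codex sub-agent needs no write access outside its own output directory.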
The tool is general — anything that needs a PNG/JPEG/WebP written to disk fits. In practice the workflows that come up most often:
- Hero images and background photography for landing pages and marketing sites
- OG cards and social previews generated per page
- Favicons and app icons at the sizes you actually need
- Blog post illustrations that match the post's tone instead of leaning on stock libraries
- Brand asset drafts — logos, banners, badges — to iterate before committing to a designer
- Infographic placeholders and diagrams with consistent visual language
- Photo edits — change-X-keep-Y patterns on an existing image
The workflow it was originally built around is solo developers shipping sites without a designer — where image quality and stylistic consistency are the main signal separating a vibe-coded site from a polished product. With a DESIGN.md at the project root (see Usage), Claude Code can generate a coherent image set across the whole site in one pass. But none of that requires you to be using it for a site; the skill is just as happy producing a single OG card or a batch of game-asset placeholders.
⚠️ Security note: this skill defines two run modes. The default is safe; the opt-in "automated" mode uses --dangerously-bypass-approvals-and-sandbox. Read SECURITY.md before using the automated mode in a directory whose prompts or contents you do not control.
- macOS or Linux
- Claude Code: this skill is a filesystem skill loaded from ~/.claude/skills/, which is a Claude Code feature (Claude.ai web uses a different skill upload mechanism)
- Codex CLI v0.130 or newer (npm i -g @openai/codex)
- A logged-in Codex session (codex login), which uses your ChatGPT/Codex subscription
- Optional: OPENAI_API_KEY in your environment to bill batch jobs against the API instead
Verified against codex-cli 0.130.0 on macOS (Darwin 25.4.0). sips ships with macOS; on Linux the skill falls back to ImageMagick convert.
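A quick way to check the version requirement before installing is sketched below. The helper names version_at_least and check_codex_version are hypothetical, and the sketch assumes that codex --version prints a line ending in a plain dotted version such as "codex-cli 0.130.0"; pre-release suffixes are not handled.

```shell
# Succeed when dotted version $1 is at least $2 (compares up to three
# numeric components; missing components count as 0).
version_at_least() {
  local have want h w i
  have="$1"
  want="$2"
  for i in 1 2 3; do
    h="${have%%.*}"
    w="${want%%.*}"
    [ "${h:-0}" -gt "${w:-0}" ] && return 0
    [ "${h:-0}" -lt "${w:-0}" ] && return 1
    # Drop the leading component, or fall back to 0 when none is left.
    case "$have" in *.*) have="${have#*.}" ;; *) have=0 ;; esac
    case "$want" in *.*) want="${want#*.}" ;; *) want=0 ;; esac
  done
  return 0
}

# Assumption: the last word of `codex --version` is the version number.
check_codex_version() {
  local installed
  installed="$(codex --version 2>/dev/null | awk '{print $NF}')"
  if version_at_least "$installed" "0.130.0"; then
    echo "codex-cli $installed is new enough"
  else
    echo "codex-cli 0.130.0 or newer is required (found: ${installed:-none})" >&2
    return 1
  fi
}
```

Calling check_codex_version reports whether the installed CLI meets the minimum; it also fails cleanly when codex is not on the PATH.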
git clone https://github.com/JunSeo99/claude-skill-codex-imagegen.git
mkdir -p ~/.claude/skills
ln -s "$(pwd)/claude-skill-codex-imagegen/skill" ~/.claude/skills/codex-imagegen
Symlinking from skill/ lets you git pull to update without re-copying files.
A pre-packaged distributable lives in dist/codex-imagegen.skill (it's a zip with a .skill extension).
git clone https://github.com/JunSeo99/claude-skill-codex-imagegen.git
mkdir -p ~/.claude/skills
unzip claude-skill-codex-imagegen/dist/codex-imagegen.skill -d ~/.claude/skills/
To copy the skill directory instead of symlinking or unzipping:
git clone https://github.com/JunSeo99/claude-skill-codex-imagegen.git
mkdir -p ~/.claude/skills
cp -r claude-skill-codex-imagegen/skill ~/.claude/skills/codex-imagegen
Restart Claude Code (or start a new session) so the skill is discovered.
The skill activates on phrases such as:
- "generate an image", "make an icon", "create a banner", "OG image"
- "hero illustration", "make a favicon", "brand mark", "product shot"
- "imagegen", "GPT Image 2", "codex image"
Multilingual triggers are supported via the skill's description field — localized prompts (Korean, Japanese, etc.) work without configuration.
Any request that produces a visual file saved to disk:
You: Make a 512×512 hero icon for my landing page — a single seedling growing from a flat horizon, line-art only, no text.
Claude: (invokes the skill, composes a five-part prompt, runs codex exec in safe Mode A, parses the absolute path from stdout, copies the file with cp and resizes it with sips to ./assets/hero-icon.png, then opens the file with Read to verify it matches your intent)
By default Claude runs Codex without --dangerously-bypass-approvals-and-sandbox and does the cp/sips step itself, in its own approved-tool context. The Codex sub-agent never gets carte-blanche shell access during normal use.
If you want a single-step automated flow (Mode B) — e.g. for batching — you can opt in:
You: Generate these 12 favicons in automated mode.
Claude: (after confirming, runs Codex with --dangerously-bypass-approvals-and-sandbox so the sub-agent handles cp/sips itself)
Read SECURITY.md before opting in.
For complex prompts (text in the image, photo edits, brand assets), Claude reads references/prompting-guide.md before generating to apply the structured prompt template and avoid known pitfalls.
For projects that need a coherent visual language across multiple slots — hero, OG card, empty states, illustrations, favicons — drop a DESIGN.md at the project root with your palette, typography, and illustration style. Then ask Claude Code:
You: Using DESIGN.md as the style reference, insert images that fit the site.
Claude reads DESIGN.md, scans the codebase for slots that need imagery, writes prompts that incorporate the palette and tone, calls this skill for each, and inserts the resulting paths into the right <img> tags. The hero image, the empty-state illustration, and the OG card all end up looking like they belong to the same product.
A minimal DESIGN.md that works well:
# Design
## Concept
Calm, considered, modern.
## Palette
- Surface (main): #F4F1ED — warm off-white
- Surface (cards): #FFFFFF
- Text: #1A1A1A
- Accent / CTA: #C46A4E — soft terracotta, used sparingly
## Typography
- Inter, system-ui sans-serif
## Illustration style
- Single subject, plenty of whitespace, no busy backgrounds
- Soft natural light from upper left
- No text inside images unless explicitly asked
- Avoid stock-photo vibes and over-saturated colors
The qualitative Illustration style block carries most of the consistency work. Palette obviously matters too, but it's the descriptive instructions ("hand-folded paper feel", "no busy backgrounds", "warm tones") that keep each image from looking like it came from a different stock-image library.
To make the difference concrete, here's the same coffee-shop landing page built two ways. Identical component code in both — same Next.js 15, same Tailwind, same shadcn-style markup, same content, same navigation. The only thing that varies is the image layer.
Both pages were produced in the same session. The right-hand one took roughly one extra command — "Using DESIGN.md as the style reference, insert images that fit the site." The demo project itself is intentionally kept outside this repo to keep the skill bundle small.
skill/
├── SKILL.md Workflow (Mode A & B), recipes, failure modes, triggers
├── references/
│ ├── prompting-guide.md 5-part structure, text rendering, edit pattern, anti-patterns
│ └── cli-reference.md codex exec flags, output paths, sips/convert post-processing, costs
└── assets/
└── hero.png Sample 1600×900 hero image generated by gpt-image-2
The five-part prompt structure (drawn from fal.ai and the OpenAI Cookbook) is:
- Scene/context — environment, time of day, mood
- Subject — main figure or object
- Details — style, medium, lighting, lens, color, texture
- Use case — drives output size and aspect
- Constraints — what to preserve, what to forbid
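As a rough illustration of the five parts listed above, the sketch below assembles them into a single prompt. The compose_prompt helper and its line labels are assumptions made for this example; the exact template the skill uses lives in references/prompting-guide.md and may differ.

```shell
# Join the five parts into one prompt, one labeled line per part.
# Arguments, in order: scene, subject, details, use case, constraints.
compose_prompt() {
  printf 'Scene: %s\nSubject: %s\nDetails: %s\nUse case: %s\nConstraints: %s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Example built from the seedling icon request in the usage section.
compose_prompt \
  "flat horizon, empty background" \
  "a single seedling growing from the horizon" \
  "line-art only, thin consistent strokes" \
  "512x512 hero icon for a landing page" \
  "no text, no extra objects"
```

Keeping each part on its own line makes it easy to see which of the five is missing when an image comes back wrong, which is harder with a comma-separated keyword string.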
- ChatGPT/Codex subscription: 1 image turn ≈ 3–5 text turns of usage limit
- API key mode (OPENAI_API_KEY set): priced per image, typically $0.04 – $0.35
  - Image output: $30.00 / 1M output tokens
  - Image input: $8.00 / 1M input tokens ($2.00 / 1M cached input)
  - Plus text-input tokens for your prompt (see OpenAI pricing for the current text rate on gpt-image-2)
For batch work (10+ images), the API key mode is generally cheaper than the subscription.
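To budget a batch in API key mode, the figures quoted above can be turned into a back-of-the-envelope estimate. The helpers batch_cost_range and output_token_cost are hypothetical names for this sketch; the defaults are the per-image range and the output-token rate quoted above, and real costs depend on image size, quality, and current OpenAI pricing.

```shell
# Estimate the cost range for n images ($1) from a per-image range.
# Defaults: $0.04 (low) and $0.35 (high) per image, as quoted above.
batch_cost_range() {
  awk -v n="$1" -v lo="${2:-0.04}" -v hi="${3:-0.35}" \
    'BEGIN { printf "$%.2f to $%.2f\n", n * lo, n * hi }'
}

# Convert a count of image output tokens ($1) to dollars at a rate
# given in dollars per 1M tokens (default 30, as quoted above).
output_token_cost() {
  awk -v t="$1" -v rate="${2:-30}" \
    'BEGIN { printf "$%.4f\n", t * rate / 1000000 }'
}

batch_cost_range 12   # prints "$0.48 to $4.20"
```

The twelve-image case matches the favicon batch used as the Mode B example: at the quoted range it would cost between $0.48 and $4.20.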
| Limitation | Workaround the skill applies |
|---|---|
| Output size doesn't match request | Always adds "at exactly WxH pixels"; host runs sips -z H W (macOS) or convert -resize WxH! (Linux) |
| No transparent PNG | Documents the limitation; suggests gpt-image-1.5 via Image API or post-processing |
| Long multi-line text passages, brand names, and very small text in dense layouts still wobble (short labels and CJK render near-perfectly) | EXACT TEXT marker + double quotes for literal strings; letter-by-letter spelling for brand names; HTML/CSS overlay for paragraph-length text |
| Latency up to 2 min on complex prompts | Bash timeout set to 300000 ms |
| Imprecise element placement in complex layouts | Falls back to simplification or SVG-then-rasterize suggestion |
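The resize workaround in the first row can be sketched as a small helper that picks whichever tool is available. The names detect_resizer and resize_exact are hypothetical; the two commands are the ones named in the table. Note that sips takes height before width, and that the "!" suffix tells ImageMagick to ignore the aspect ratio.

```shell
# Print the available resizer: sips (macOS) or convert (ImageMagick).
detect_resizer() {
  if command -v sips >/dev/null 2>&1; then
    echo sips
  elif command -v convert >/dev/null 2>&1; then
    echo convert
  else
    return 1
  fi
}

# Resize file ($1) in place to exactly width ($2) by height ($3).
# An optional fourth argument forces the tool instead of detecting it.
resize_exact() {
  local file w h tool
  file="$1"
  w="$2"
  h="$3"
  tool="${4:-$(detect_resizer)}"
  case "$tool" in
    sips)    sips -z "$h" "$w" "$file" >/dev/null ;;
    convert) convert "$file" -resize "${w}x${h}!" "$file" ;;
    *)       echo "no resizer found (need sips or ImageMagick)" >&2
             return 1 ;;
  esac
}
```

For example, resize_exact assets/hero-icon.png 512 512 would bring a 1254×1254 result down to the requested icon size. Forcing an exact size distorts the image if the generated aspect ratio differs from the requested one, so it helps to request the right aspect in the prompt as well.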
| Component | Tested |
|---|---|
| codex-cli | 0.130.0 |
| OS | macOS (Darwin 25.4.0); Linux untested but expected to work with ImageMagick fallback |
| Claude Code | App / CLI (filesystem skills) |
Output-path layout under ~/.codex/generated_images/ and $imagegen invocation semantics are not part of the Codex CLI's public contract. If a future codex-cli release changes them, please open an issue with the new behavior.
Issues and PRs welcome. Useful directions:
- Linux-side post-processing parity (ImageMagick verified end-to-end)
- Additional recipes (favicons, app store screenshots, social card pipelines)
- Improved non-Latin text rendering tips (CJK, Arabic, Devanagari, etc.)
- Migration notes when newer Codex CLI versions change $imagegen behavior
When changing the skill body, run the validators from anthropics/skills:
python3 path/to/skill-creator/scripts/quick_validate.py skill/
python3 path/to/skill-creator/scripts/package_skill.py skill/ dist/
See SECURITY.md for the trust boundary and the threat model around --dangerously-bypass-approvals-and-sandbox.
See CHANGELOG.md.
- OpenAI — Codex CLI image generation feature
- OpenAI — Image Generation guide
- OpenAI Cookbook — GPT Image Prompting Guide
- fal.ai — GPT Image 2 Prompting Guide
- anthropics/skills — skill-creator template
This project is independent and not affiliated with, endorsed by, or sponsored by Anthropic or OpenAI. "Claude", "Claude Code", "OpenAI", "Codex", and "GPT" are trademarks of their respective owners.
MIT © 2026 JunSeo99


