The book cover is where 2026 self-publishing finally became different from 2024. The cover used to be the hard part — $200-500 to a designer, two weeks of revisions, and a result that often missed the genre conventions Amazon's algorithm rewards.
ChatGPT Images 2.0 changes that. Not because it makes designers obsolete, but because for the first time, an AI image model produces book-cover-grade imagery with text rendering that's actually legible and aesthetic control sophisticated enough to nail genre conventions.
This guide is what we'd hand a self-publisher who's never used AI image generation. It assumes you understand publishing basics (KDP requirements, paperback vs ebook, genre conventions) but not necessarily prompt engineering. If you want the foundational technical review of gpt-image-2 first, start with the ChatGPT Images 2.0 Honest Guide.
---
Before We Start: What This Workflow Won't Do
Be clear about what you're getting. gpt-image-2 will give you:
- Strong photographic or illustrative imagery for the cover
- Genre-appropriate aesthetic that reads correctly at thumbnail size
- Text rendering at a quality good enough for ebook covers (though print covers should still use Figma)
- 5-10x faster iteration than working with a freelance designer
What it won't do:
- Replace genre research (you still need to study what bestsellers in your genre look like)
- Render your title in a licensed font (use Figma for that)
- Generate a print-ready KDP file with proper bleed (use Figma or Photoshop for that)
- Tell you when your cover is wrong (still get one human to review before publishing)
If that fits your workflow, here's the step-by-step.
---
Step 1: Genre Research (Yes, This Comes First)
Before you generate anything, spend 30 minutes doing this:
- Open Amazon
- Go to your genre's bestseller list (e.g., Kindle Store > Romance > Contemporary Romance)
- Screenshot the top 30 covers
- Note: dominant color palette, typography style, illustration vs photography vs text-only, presence of human subjects, mood (warm/cool, light/dark)
- Identify the 3-5 covers that you'd be proud to be confused with
This is the brief for your prompt. AI image generation amplifies what you ask for. If you don't know what a bestselling cover in your genre looks like, you'll generate something that doesn't fit.
As a starting point, these are the conventions you'll typically find by genre:
- Contemporary Romance: illustrated couples, bright pastel palettes, hand-lettered titles
- Thriller / Suspense: photographic, high contrast, single human silhouette, cool palette
- Literary Fiction: abstract or symbolic imagery, restrained typography, cream/cool/black
- Epic Fantasy: illustrated, character-focused, rich color, ornate typography
- Cozy Mystery: illustrated, warm palette, charming detail, hand-lettered title
- Business / Self-Help: typography-led, single bold concept, clean photography or geometric
Your prompt will reference these conventions explicitly.
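If you log each screenshot as a row of notes, a few lines of Python can show which conventions dominate. This is a minimal sketch: the field names and the sample rows are hypothetical placeholders, not real bestseller data.

```python
from collections import Counter

# Hypothetical research notes: one dict per screenshotted cover.
# The keys mirror the attributes listed in Step 1.
covers = [
    {"palette": "pastel", "imagery": "illustration", "typography": "hand-lettered", "people": True},
    {"palette": "pastel", "imagery": "illustration", "typography": "hand-lettered", "people": True},
    {"palette": "bright", "imagery": "illustration", "typography": "sans-serif", "people": True},
    {"palette": "pastel", "imagery": "photography", "typography": "hand-lettered", "people": False},
]


def summarize(covers, top_n=1):
    """Return the most common value(s) per attribute across all covers."""
    fields = sorted({key for cover in covers for key in cover})
    summary = {}
    for field in fields:
        counts = Counter(cover[field] for cover in covers if field in cover)
        summary[field] = counts.most_common(top_n)
    return summary


summary = summarize(covers)
for field, top in summary.items():
    print(field, top)
```

The dominant value for each attribute is what goes into the "genre conventions" line of your prompt.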
---
Step 2: Write the Prompt
Here's the structure that works for gpt-image-2 specifically. Note the order — it matters.
```
Book cover for [genre]. [Title] by [Author].
Genre conventions to reference: [explicit list — palette, typography
style, imagery type, mood from your research].
Specific concept: [the single visual idea — one clear image, not a
list of stacked metaphors].
Composition: [where the title text will go, what the focal point
is, where the negative space is, what the eye-tracking flow is].
Color palette: [3-5 hex codes or named colors, restricted].
Mood: [3-5 adjectives].
Critical: [whitespace instructions — where you'll composite the
title and author name later. DO NOT generate any text in this image.]
[Aspect ratio for the trim size — 2:3 for most paperback formats]
```
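If you plan to reuse this template across several books, it may help to keep the fields as data and assemble the prompt in a fixed order. The sketch below assumes a hypothetical `CoverBrief` container and `build_prompt` helper; neither is part of any library, and the output is plain text you paste or send to the model however you normally do.

```python
from dataclasses import dataclass


@dataclass
class CoverBrief:
    """Hypothetical container for the fields in the prompt template."""
    genre: str
    title: str
    author: str
    conventions: str
    concept: str
    composition: str
    palette: dict      # color name -> hex code, 3-5 entries
    mood: list         # 3-5 adjectives
    text_zone: str     # area kept clean for the title overlay
    aspect_ratio: str = "2:3"


def build_prompt(brief: CoverBrief) -> str:
    """Assemble the sections in the same order as the template."""
    if not 3 <= len(brief.palette) <= 5:
        raise ValueError("palette should hold 3-5 colors")
    palette = ", ".join(f"{name} ({code})" for name, code in brief.palette.items())
    sections = [
        f'Book cover, {brief.genre}. "{brief.title}" by {brief.author}.',
        f"Genre conventions to reference: {brief.conventions}",
        f"Specific concept: {brief.concept}",
        f"Composition: {brief.composition}",
        f"Color palette: {palette}.",
        f"Mood: {', '.join(brief.mood)}.",
        f"Critical: leave {brief.text_zone} clean for title overlay. "
        "DO NOT generate any text in this image.",
        f"Aspect ratio: {brief.aspect_ratio}.",
    ]
    return "\n".join(sections)


brief = CoverBrief(
    genre="literary fiction",
    title="The Quiet Year",
    author="Sarah Chen",
    conventions="abstract or symbolic photography, restrained palette, quiet composition.",
    concept="a single empty wooden chair beside a window with soft morning light.",
    composition="chair in the lower third, window light above, negative space at the top.",
    palette={"muted cream": "#E8E4DD", "cool light": "#D4DDE0",
             "warm wood": "#A0826D", "soft sage": "#9CAF88"},
    mood=["contemplative", "quiet", "restrained"],
    text_zone="the upper third of the canvas",
)
prompt = build_prompt(brief)
print(prompt)
```

Keeping the section order in one function means every variant you generate differs only in the fields you changed.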
Real Example: Literary Fiction Cover
```
Book cover, literary fiction. "The Quiet Year" by Sarah Chen.
Genre conventions to reference: literary fiction in 2026 favors
abstract or symbolic photography, restrained color palette
(cream + cool tones + one accent), elegant serif typography,
quiet composition over busy.
Specific concept: a single empty wooden chair beside a window with
soft morning light entering. Not a person — the absence is the point.
Slight shadow on the floor. Aged hardwood floor texture.
Composition: chair positioned in lower-third. Upper two-thirds
of the canvas: window with soft diffused light, gauze curtain
visible. Strong negative space at the very top for title placement.
Color palette: muted cream wall (#E8E4DD), cool window light (#D4DDE0),
warm wood floor accent (#A0826D), single thread of soft sage (#9CAF88).
Mood: contemplative, quiet, restrained, hopeful, literary.
Critical: leave the upper third of the canvas with clean
photographic space (the window light) for title overlay. DO NOT
generate any text in this image — title and author will be
composited separately in Figma using licensed typography.
Aspect ratio: 2:3 (portrait).
```
Why this prompt works: Specific, photographic, single concept (one chair, not "a chair beside a desk beside a bookshelf"). Color palette in hex codes constrains gpt-image-2's aesthetic bias. Whitespace instruction tells the model to compose for text overlay. The "DO NOT generate text" instruction prevents the model from inventing typography you don't want.
Real Example: Romance Cover (Illustrated)
```
Book cover, contemporary romance. "Summer in Cassis" by
Madeline Reeves.
Genre conventions to reference: contemporary romance in 2026 favors
illustrated covers, bright optimistic palette, two characters
in a romantic moment, hand-lettered title aesthetic.
Specific concept: two people walking along a Mediterranean coastal
village street at sunset, holding hands but not making eye contact,
suggesting the early stage of a relationship. Whitewashed buildings,
flower boxes, sun-drenched stone street.
Style: warm illustrated style, similar to Sally Rooney covers
crossed with travel illustration. Watercolor-like color application,
visible illustration line work, intentionally hand-made aesthetic.
Color palette: cream stone (#EBE3D2), terracotta accents (#C97A4F),
deep mediterranean blue (#3E5C76), warm sunset glow (#F6C29D),
single botanical green (#7B9560).
Mood: hopeful, warm, slightly nostalgic, romantic but restrained.
Composition: characters in the lower-third walking away from
viewer (so faces are not the focus). Strong street perspective
leading the eye toward the upper-third sunset. Top quarter
clean for title overlay.
Critical: leave the upper quarter of the canvas with clean
sunset sky for title placement. DO NOT generate any text in
this image.
Aspect ratio: 2:3 (portrait).
```
Why this prompt works: Names a reference aesthetic (Sally Rooney covers + travel illustration), constrains palette to specific hex codes, specifies the characters' positioning to support cover-typography placement, and explicitly excludes text generation.
---
Step 3: Generate + Iterate (3-8 Generations)
Here's how to iterate efficiently:
- Run the prompt as written. Get 1-2 outputs.
- Identify what's wrong: composition, palette drift, mood mismatch.
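- Don't keep iterating on the same image indefinitely. The noise amplification bug means quality drifts after 2-3 rounds of edits on the same image. Past that point, start a fresh session with a refined prompt.
- After 4-6 fresh prompts, you should have something usable.
Refinement instructions that work well within those first 2-3 rounds: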
- "Same composition, but cooler color temperature overall"
- "Same scene, but the chair is slightly larger and the window light is more dramatic"
- "Reduce the saturation by about 30%, keep the same composition"
When to abandon a generation: if the model keeps producing the same wrong element three times in a row, your prompt needs structural change, not iteration. Rewrite the prompt.
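Palette drift is easier to judge with numbers. One rough approach, sketched below, is to eyedrop a few colors from the output (in Figma, for instance) and compare them with the hex codes in your prompt. The helper names and the threshold of 40 are assumptions chosen for illustration, and plain RGB distance is only a crude stand-in for how different two colors look.

```python
import math


def hex_to_rgb(hex_code):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    hex_code = hex_code.lstrip("#")
    return tuple(int(hex_code[i:i + 2], 16) for i in (0, 2, 4))


def rgb_distance(a, b):
    """Euclidean distance between two hex colors in RGB space."""
    return math.dist(hex_to_rgb(a), hex_to_rgb(b))


def drift_report(target_palette, sampled, threshold=40.0):
    """For each sampled color, find the nearest target color and flag drift.

    Returns tuples of (sampled, nearest_target, distance, drifted).
    """
    report = []
    for color in sampled:
        nearest = min(target_palette, key=lambda t: rgb_distance(color, t))
        distance = rgb_distance(color, nearest)
        report.append((color, nearest, round(distance, 1), distance > threshold))
    return report


target = ["#E8E4DD", "#D4DDE0", "#A0826D", "#9CAF88"]
sampled = ["#E8E4DD", "#FF0000"]  # hypothetical eyedropper samples
report = drift_report(target, sampled)
for row in report:
    print(row)
```

If most sampled colors are flagged, that is a signal to restate the palette in a fresh prompt rather than nudge the same image again.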
---
Step 4: Compose in Figma
Take your final gpt-image-2 generation into Figma:
- Create a frame at exact KDP trim dimensions:
- 6×9 paperback: 6×9 inches at 300dpi → 1800×2700 pixels
- 5×8 paperback: 5×8 inches at 300dpi → 1500×2400 pixels
- For ebook: 1600×2560 (Amazon recommended)
- Place the gpt-image-2 PNG, scale to fill, lock position
- Add title in a licensed font that fits the genre:
- Literary fiction: Caslon, Garamond, or similar humanist serif
- Romance: hand-lettered or script-style font
- Thriller: bold sans-serif or condensed serif
- Self-help: clean sans-serif (Founders Grotesk, Inter)
- Add author name in complementary typography (smaller, lower hierarchy)
- For paperback: add spine + back cover designed in matching aesthetic
- Apply KDP-required bleed (0.125 inch on each side)
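Frame sizes are just inches multiplied by DPI, so other trim sizes are easy to work out. The sketch below uses two hypothetical helpers to convert a trim size to pixels and to reduce it to an aspect ratio, which shows how much of a 2:3 generation will be cropped when you scale it to fill.

```python
from fractions import Fraction


def trim_to_pixels(width_in, height_in, dpi=300):
    """Convert a trim size in inches to pixel dimensions at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def aspect_ratio(width_px, height_px):
    """Reduce pixel dimensions to the simplest width:height ratio."""
    ratio = Fraction(width_px, height_px)
    return f"{ratio.numerator}:{ratio.denominator}"


for label, size in {"6x9 paperback": (6, 9), "5x8 paperback": (5, 8)}.items():
    w, h = trim_to_pixels(*size)
    print(label, f"{w}x{h}", aspect_ratio(w, h))

print("ebook", "1600x2560", aspect_ratio(1600, 2560))
```

A 6×9 frame is exactly 2:3, while a 5×8 frame and the 1600×2560 ebook size are both 5:8, so a 2:3 image scaled to fill those frames loses a thin strip on the left and right. Keep the focal point away from the side edges.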
Why Figma instead of Photoshop: Figma's vector-first approach makes type editing instant. If you want to test 5 title variations, that's 30 seconds in Figma vs. 10 minutes in Photoshop.
---
Step 5: KDP Quality Check
Before uploading to KDP, run this checklist:
- Resolution: Minimum 300dpi at trim size
- Color space: RGB for ebook, can stay RGB for paperback (KDP converts)
- Bleed: 0.125 inch on all four sides (paperback only)
- Trim safety: Critical text/elements at least 0.25 inch from trim edges
- Spine width: Calculate using KDP's spine calculator (varies by page count)
- Barcode area: Lower-right back cover, 2×1.2 inch reserved space
- Title legibility: Test at 200×300 pixels (Amazon thumbnail size). If your title is unreadable at thumbnail size, it's wrong.
The thumbnail test is the most frequently skipped step. If your gorgeous cover doesn't read at thumbnail size, browsing readers will scroll past it without a click.
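For a paperback, the bleed and spine items combine into a single full-wrap canvas size: bleed, back cover, spine, front cover, bleed. The sketch below estimates it. The per-page thickness values are assumptions based on commonly cited KDP figures for black-and-white interiors, so treat the result as a sanity check and take the final numbers from KDP's own calculator.

```python
# Assumed paper thickness per page, in inches. Verify against KDP's calculator.
PAPER_THICKNESS_IN = {"white": 0.002252, "cream": 0.0025}
BLEED_IN = 0.125  # bleed on each outer edge, as in the checklist above


def full_cover_size(trim_w, trim_h, page_count, paper="white", dpi=300):
    """Estimate full-wrap cover dimensions (back + spine + front + bleed)."""
    spine = page_count * PAPER_THICKNESS_IN[paper]
    width = 2 * BLEED_IN + 2 * trim_w + spine
    height = 2 * BLEED_IN + trim_h
    return {
        "spine_in": round(spine, 3),
        "width_in": round(width, 3),
        "height_in": round(height, 3),
        "width_px": round(width * dpi),
        "height_px": round(height * dpi),
    }


size = full_cover_size(6, 9, page_count=300, paper="white")
print(size)
```

For a 300-page 6×9 book on white paper this gives a spine of roughly 0.68 inch and a canvas of about 12.93×9.25 inches.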
---
Step 6: A/B Test (For Books You're Serious About)
If you're treating this book as a real launch (not just "let's see what happens"), A/B test 2-3 cover variants:
- Generate 2-3 distinct concepts in gpt-image-2 (different specific concepts, not just iterations)
- Composite each in Figma with the same title typography
- Show the three to a small group of target-audience readers (Reddit subreddit for the genre, Facebook group, your email list)
- Ask one question: "If you saw all three of these on Amazon, which one would you click first?"
- Pick the winner
This is what serious authors do. AI generation drops the cost of producing multiple cover options from $1500+ to about $5. Take advantage of it.
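Once the answers come in, counting them takes a few lines. The sketch below tallies votes and flags whether the lead is large enough to trust. The `min_margin` of 15 percentage points is an arbitrary assumption, not a statistical test, and with small groups a narrow lead may simply be noise.

```python
from collections import Counter


def pick_winner(votes, min_margin=0.15):
    """Tally votes and report the leader, its share, and its lead over second place."""
    if not votes:
        raise ValueError("no votes to count")
    counts = Counter(votes)
    total = sum(counts.values())
    ranked = counts.most_common()
    winner, top = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    margin = (top - runner_up) / total
    return {
        "winner": winner,
        "share": top / total,
        "margin": margin,
        "clear": margin >= min_margin,
    }


# Hypothetical responses from 20 readers shown covers A, B, and C.
votes = ["A"] * 12 + ["B"] * 5 + ["C"] * 3
result = pick_winner(votes)
print(result)
```

If `clear` comes back false, ask a few more readers before committing to a cover.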
---
What This Doesn't Replace
A book cover designer who knows your genre intimately. If you're publishing a book that needs to compete with the top 100 in a busy genre, hiring a specialist designer who's done 100+ covers in your genre is still worth $500-1500. They know the algorithmic conventions in ways that take an outsider 50 hours to learn.
Your eye for aesthetics. AI generation amplifies your taste. If your taste is "this looks fine I guess," your covers will look fine I guess. If you've spent 20 hours studying covers in your genre, your prompts will produce winners.
The genre-fit decision. Sometimes the right cover for your book is wrong for the algorithm. Hard literary fiction with a thriller-styled cover will get clicks but disappoint readers. Genre-fit is a strategy decision, not a generation problem.
---
The Bottom Line
gpt-image-2 + Figma + 30 minutes of genre research = a book cover that competes at thumbnail size. Cost: $1-5 in API. Time: 1-3 hours total. Quality: comparable to a $400-800 freelance cover for genre fiction; not yet comparable to a $2000+ literary or premium genre cover.
For self-publishers shipping multiple books per year, this is the workflow that makes prolific publishing economically viable. For first-time authors with one big book, this is the workflow that gets you to 80% quality in hours instead of weeks. For both, it's a real change.
---
Get the Full Prompt Pack
ChatGPT Images 2.0 Prompts Pack includes book-cover-specific prompts across 6 genres, with full failure-mode documentation per prompt. MIT-licensed, free.
For the foundational review of gpt-image-2: ChatGPT Images 2.0 Honest Guide.
For deeper book-cover production: AI Book Cover Generator: KDP + Self-Publishing Guide.
— Atilla