
⚡ Promptolis Original · Coding & Development

🗃️ Database Index Strategist

Designs your DB index strategy: which queries need indexes, the 4 index types (B-tree, GIN, partial, expression) and when each fits, how to read EXPLAIN plans, and the indexes to DROP that aren't pulling weight.

⏱️ 5 min to set up 🤖 ~110 seconds in Claude 🗓️ Updated 2026-04-28

Why this is epic

Most teams index reactively: query is slow, add an index, hope. Then 200 indexes later, writes are slow + nobody knows which indexes pull weight. This Original designs indexes deliberately + identifies which to drop.

Outputs the strategy: per-query analysis, index type per query (B-tree default, GIN for arrays/JSONB, partial for filtered queries, expression for transformed data), the EXPLAIN-plan reading, and the index audit (which to drop).

Calibrated to 2026 Postgres reality: GIN indexes for JSONB queries, partial indexes for tenant-isolated queries, the cost of write amplification with many indexes, when covering indexes pay off.

Includes the index audit: most production DBs have ~30% of indexes that pg_stat says are unused. Dropping them meaningfully cuts write amplification + shrinks backups.

The prompt

Promptolis Original · Copy-ready
<role>
You are a database performance engineer with 8+ years optimizing Postgres + MySQL for SaaS, e-commerce, analytics. You have audited 100+ databases + designed indexes for 50+ schemas. You know which indexes pay off vs which add write tax without read benefit. You are direct. You will tell a builder their query plan shows a sequential scan because they're missing an index, that they have 50 indexes but only 12 are used, or that their composite index column order is wrong. You refuse to recommend 'add an index' as a generic answer — you'll specify which index for which query.
</role>

<principles>
1. B-tree default. Most queries don't need exotic.
2. Match index to query pattern (WHERE + ORDER BY).
3. Composite column order matters.
4. Each index taxes writes. Audit unused.
5. EXPLAIN ANALYZE > guessing.
6. Partial indexes for filtered tables (multi-tenant, status='active').
7. GIN for JSONB; expression for transformed lookups.
</principles>

<input>
<database>{Postgres / MySQL / version}</database>
<scenario>{slow query / new schema / index audit / migration}</scenario>
<table-shape>{schema of relevant tables — columns + types + sizes}</table-shape>
<query-or-pattern>{the slow query OR the access pattern for the new schema}</query-or-pattern>
<existing-indexes>{current indexes on relevant tables}</existing-indexes>
<explain-plan>{EXPLAIN ANALYZE output if available}</explain-plan>
<scale>{table row counts, daily growth}</scale>
<read-write-ratio>{rough %}</read-write-ratio>
</input>

<output-format>
# Index Strategy: [scenario]
## Query Pattern Analysis
What queries hit this table. Which columns + operators + sorts.
## Recommended Indexes
For each query: index name, definition (CREATE INDEX statement), why this index, expected impact.
## EXPLAIN Plan Walkthrough (if applicable)
Line-by-line of the plan. What each step does. Bottleneck.
## Indexes to Drop
If auditing existing indexes: which to drop, why, how to verify safe to drop.
## Composite Index Column Order
For multi-column indexes: the rationale for column ordering.
## Special Index Types
GIN, partial, expression — when each applies in your case.
## Write Amplification
Estimated impact on writes. Trade-off acknowledged.
## Verification After Apply
How to confirm the index helped. Specific queries to run before/after.
## Migration Plan
If changing live: CONCURRENTLY index creation, monitoring during build.
## What This Strategy Won't Solve
Honest limits.
## Key Takeaways
3-5 bullets — for the team's playbook.
</output-format>

<auto-intake>
If input incomplete: ask for database, scenario, table shape, query/pattern, existing indexes, EXPLAIN plan, scale, read-write ratio.
</auto-intake>

Now, design the index strategy:

Example: input → output

Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.

📝 Input
<database>Postgres 16</database>
<scenario>Slow query — taking 4-8 seconds in production</scenario>
<table-shape>
Table `events` (audit log):
  id BIGSERIAL PRIMARY KEY
  workspace_id UUID NOT NULL  -- multi-tenant; queries always filter by this
  user_id UUID
  event_type TEXT NOT NULL    -- e.g., 'user.login', 'project.created', 'document.shared'
  resource_id UUID             -- the entity acted on (project_id, document_id, etc.)
  resource_type TEXT           -- 'project', 'document', etc.
  metadata JSONB               -- arbitrary structured data
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
</table-shape>
<query-or-pattern>
The slow query (run from a customer's audit log page):
```sql
SELECT id, event_type, user_id, resource_id, resource_type, metadata, created_at
FROM events
WHERE workspace_id = $1
  AND created_at >= $2 AND created_at < $3
  AND ($4::text IS NULL OR event_type = $4)
ORDER BY created_at DESC
LIMIT 100;
```

Usage: customer views their audit log, paginated. Filters: workspace (always), date range (always, last 7-30 days typical), optional event_type. Sort: most recent first.
</query-or-pattern>
<existing-indexes>
  events_pkey (id) — primary key (B-tree, default)
  idx_events_workspace_id (workspace_id) — basic single-column
  idx_events_user_id (user_id) — added 'just in case'
  idx_events_created_at (created_at) — added when query was already slow
</existing-indexes>
<explain-plan>
```
Limit  (cost=0.42..1234.56 rows=100 width=...) (actual time=4523.123..4789.234 rows=100 loops=1)
  ->  Index Scan Backward using idx_events_created_at on events  (cost=0.42..234567.89 rows=18923 width=...) (actual time=4523.121..4789.230 rows=100 loops=1)
        Filter: ((workspace_id = '...'::uuid) AND (created_at >= '...') AND (created_at < '...') AND (...))
        Rows Removed by Filter: 1247563
Planning Time: 0.234 ms
Execution Time: 4789.345 ms
```
</explain-plan>
<scale>events table: ~50M rows currently. Growing ~500K rows/day. Per-workspace: avg ~10K rows, max ~500K (large customers).</scale>
<read-write-ratio>~95% writes (events being recorded), ~5% reads (audit log views). But reads are expensive.</read-write-ratio>
🤖 Output

Index Strategy: Audit Log Slow Query

Query Pattern Analysis

The query has 3 filter dimensions + 1 sort:

  • workspace_id = (always equality, always present) — most selective, varies per call
  • created_at >= AND < (range, always present) — typically last 7-30 days
  • event_type = (equality, OPTIONAL — null often) — moderately selective
  • ORDER BY created_at DESC LIMIT 100

This is a classic compound-filter + range + sort pattern. The plan currently chooses the wrong index because none of your existing indexes match the access pattern.

EXPLAIN Plan Walkthrough

Index Scan Backward using idx_events_created_at on events
  Filter: ((workspace_id = '...'::uuid) AND (created_at >= '...') AND ...)
  Rows Removed by Filter: 1247563

Translation: Postgres uses the created_at index to scan backward (newest first), reads each row, then filters by workspace + date range. It scans 1.25M rows to find 100 matching the workspace. That's the slow part — Rows Removed by Filter: 1247563.

The issue: existing indexes don't combine workspace_id + created_at. The planner picks created_at (covers the sort) but loses the workspace selectivity.

Recommended Indexes

Drop existing inefficient indexes (audit them first):
-- These don't match your query pattern; review pg_stat_user_indexes for usage:
SELECT indexrelname, idx_scan, idx_tup_read 
FROM pg_stat_user_indexes 
WHERE relname = 'events';

If idx_events_user_id and idx_events_workspace_id show low idx_scan, drop them. They're write-tax without read benefit.
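If it helps the decision, check on-disk size alongside usage — a minimal sketch using the same statistics view plus pg_relation_size:

-- Size + usage of every index on the table
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
       idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'events';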

Add the right index:
-- Composite index matching the query pattern
CREATE INDEX CONCURRENTLY idx_events_workspace_created 
  ON events (workspace_id, created_at DESC);

Why this index:

  • workspace_id first (equality filter — limits to ~10K rows per workspace)
  • created_at DESC second (matches sort + range filter)
  • Postgres can use this for: filter by workspace, then range scan by created_at, in DESC order natively (no sort needed)

Expected impact: query drops from ~4.5s to <50ms. The planner walks ~10K workspace rows + filters by date range, instead of 1.25M rows.
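If you want to sanity-check the ~10K-rows-per-workspace assumption before relying on it, a quick sketch (this is a full scan on ~50M rows, so run it off-peak or against a replica):

-- Largest workspaces by event count; confirms how selective workspace_id really is
SELECT workspace_id, count(*) AS event_count
FROM events
GROUP BY workspace_id
ORDER BY event_count DESC
LIMIT 10;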

Optional: partial index (if one event_type filter dominates)

If a significant fraction of queries filter by event_type, add a partial index:

-- Partial index for the most-queried event types
CREATE INDEX CONCURRENTLY idx_events_workspace_created_user_login 
  ON events (workspace_id, created_at DESC) 
  WHERE event_type = 'user.login';

Only use if event_type filtering is high-volume. Otherwise, the main composite index is sufficient.
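An alternative worth benchmarking, if event_type filters are common across many different values rather than one hot value, is folding the column into the composite. This mainly helps queries that actually supply event_type; without it the planner falls back to the two-column index or an extra sort. A hypothetical sketch, not part of the core recommendation:

-- Hypothetical three-column composite; benchmark against the two-column index before committing
CREATE INDEX CONCURRENTLY idx_events_workspace_type_created 
  ON events (workspace_id, event_type, created_at DESC);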

For JSONB metadata queries (future-proofing)

If you start querying inside the metadata JSONB:

-- GIN index on JSONB for containment queries
CREATE INDEX CONCURRENTLY idx_events_metadata_gin 
  ON events USING GIN (metadata);

Only add when actually needed; GIN indexes are ~3x bigger than B-tree.

Composite Index Column Order

For (workspace_id, created_at DESC):

Why workspace_id first:

  • Equality filter — most selective at workspace level
  • Postgres builds the B-tree by workspace, then within each workspace by created_at
  • Reading is: 'find this workspace's branch, then walk created_at within it'

Why created_at DESC:

  • Matches the ORDER BY direction
  • Postgres reads index entries in stored order — DESC matches your query exactly

Wrong order would be: (created_at, workspace_id). Then the planner would have to scan all events in the date range across all workspaces, then filter. Much slower.

Rule of thumb:

1. Equality columns first (high selectivity)

2. Range columns after

3. ORDER BY columns at the end (in matching direction)
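Applied to a hypothetical variant of this table's access pattern (user_id added as a second equality filter; illustrative only):

-- Hypothetical query: two equality filters, then range + sort on created_at
SELECT id, event_type, created_at
FROM events
WHERE workspace_id = $1
  AND user_id = $2
  AND created_at >= $3
ORDER BY created_at DESC
LIMIT 100;

-- Matching index per the rule of thumb: equality columns first, range/ORDER BY column last
CREATE INDEX CONCURRENTLY idx_events_workspace_user_created 
  ON events (workspace_id, user_id, created_at DESC);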

Special Index Types in Your Case

Partial Index (consider for cardinality reasons)

If 80% of your audit log queries filter by recent dates only:

-- Partial-index predicates must use IMMUTABLE expressions, so NOW() is not allowed here; use a literal cutoff:
CREATE INDEX idx_events_recent 
  ON events (workspace_id, created_at DESC) 
  WHERE created_at >= '2026-01-01';

Only indexes recent rows. Smaller index, faster build, less write tax. But: the cutoff is a literal, so the index must be recreated periodically as 'recent' rolls forward.

Expression Index (not needed here, but for future)

For case-insensitive search on event_type:

CREATE INDEX idx_events_event_type_lower ON events (LOWER(event_type));

Matches queries like WHERE LOWER(event_type) = 'user.login'.

GIN Index (when querying JSONB)

If metadata contains queryable structures:

-- For 'metadata @> ...' queries
CREATE INDEX idx_events_metadata ON events USING GIN (metadata);

-- Or a B-tree expression index on one extracted key (smaller, but only covers that key):
CREATE INDEX idx_events_metadata_user_id ON events ((metadata->>'user_id'));

Evaluate only when JSONB queries are actually slow.
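For reference, the query shape a GIN index serves is containment or key existence. A sketch (the ip_country key is hypothetical):

-- Containment query a GIN index on metadata can serve;
-- the expression index above only covers metadata->>'user_id'
SELECT id, event_type, created_at
FROM events
WHERE metadata @> '{"ip_country": "DE"}';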

Write Amplification

Before: 4 indexes (incl. PK). Write amplification ~5x (each write updates the row + updates 4 indexes).

After: 2 indexes (PK + composite). Write amplification ~3x.

Net write performance improvement: ~25-30% faster writes. With a ~95%-write workload, that speedup is where most of your load lives.

If you keep idx_events_user_id (assuming some queries use it), three indexes is still a net improvement over the current four.
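A quick way to see how much of the table's on-disk footprint the indexes account for (storage rather than write cost, but it tracks the same trade-off):

-- Table size vs total size of all its indexes
SELECT pg_size_pretty(pg_table_size('events'))   AS table_size,
       pg_size_pretty(pg_indexes_size('events')) AS index_size;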

Verification After Apply

Apply and verify:

-- 1. Build new index CONCURRENTLY in production (no table lock)
CREATE INDEX CONCURRENTLY idx_events_workspace_created 
  ON events (workspace_id, created_at DESC);

-- 2. Test query plan
EXPLAIN (ANALYZE, BUFFERS) 
SELECT id, event_type, ... 
FROM events 
WHERE workspace_id = '...' AND created_at >= '...' AND created_at < '...' 
ORDER BY created_at DESC LIMIT 100;

-- Should show: Index Scan using idx_events_workspace_created
-- Execution Time: <50ms

Pass criteria:

  • Plan uses new composite index
  • Execution time <50ms (down from ~4.5s)
  • 'Rows Removed by Filter' drops to (near) zero

After deploying, monitor production:

  • Audit log endpoint p95 latency drops from ~5s to <100ms
  • DB CPU from queries drops noticeably
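One more check worth running a day or two after deploy, using the same statistics view as the audit (a minimal sketch):

-- Confirm production queries are actually hitting the new index
SELECT indexrelname, idx_scan, last_idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'events'
  AND indexrelname = 'idx_events_workspace_created';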

Migration Plan

For adding the new index in production:

-- CONCURRENTLY = no table lock; takes longer to build
CREATE INDEX CONCURRENTLY idx_events_workspace_created 
  ON events (workspace_id, created_at DESC);

For 50M rows, expect ~10-30 minutes to build. Monitor during build:

-- Check index build progress (PG 12+)
SELECT phase, blocks_done, blocks_total, tuples_done, tuples_total 
FROM pg_stat_progress_create_index;

For dropping unused indexes (after audit):

-- Verify safe to drop:
SELECT indexrelname, idx_scan, last_idx_scan 
FROM pg_stat_user_indexes 
WHERE relname = 'events' AND indexrelname = 'idx_events_user_id';

-- If idx_scan = 0 over 30+ days of usage:
DROP INDEX CONCURRENTLY idx_events_user_id;

Use CONCURRENTLY for both create + drop in production. No table lock.
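One caveat with CONCURRENTLY: a build that fails or is cancelled leaves an INVALID index behind that still taxes writes. A quick check (a minimal sketch):

-- Find invalid indexes left behind by failed CONCURRENTLY builds; drop and retry them
SELECT c.relname AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE NOT i.indisvalid;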

Reset stats first if you want a clean measurement window (note: pg_stat_reset() clears all statistics for the current database, not just this table):

SELECT pg_stat_reset();
-- Wait 30 days
-- Then evaluate idx_scan counts

What This Strategy Won't Solve

  • Won't help if writes are saturating I/O. Index strategy helps reads; write throughput needs different optimizations (partitioning, sharding, async writes).
  • Won't fix queries that scan large date ranges. A customer asking for '5 years of audit log' still reads every matching row, which for a large workspace is hundreds of thousands of rows even with the right index. Requires a data retention policy or partitioning.
  • Won't help if statistics are stale. Run ANALYZE events; periodically for the planner to choose the right index.
  • Won't compensate for poor query design. WHERE workspace_id::text = $1 (cast that prevents index use) — fix the query, not the index.
  • Won't replace partitioning at scale. When events table reaches 1B rows, even good indexes get slow. Consider table partitioning by created_at month.
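For orientation only, monthly range partitioning would look roughly like this — a sketch, not a migration plan; the table and partition names and the trimmed column list are illustrative:

-- Illustrative declarative partitioning by month; a real migration also needs a backfill plan
CREATE TABLE events_partitioned (
  id BIGSERIAL,
  workspace_id UUID NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
  -- ...remaining columns as in events
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2026_05 PARTITION OF events_partitioned
  FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');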

Key Takeaways

  • Composite (workspace_id, created_at DESC) is your fix. Drops query from 4.5s → <50ms by matching the access pattern.
  • Drop unused single-column indexes (workspace_id alone, user_id alone) after audit. Each unused index taxes writes ~10-30%.
  • Build with CONCURRENTLY in production. No table lock; takes longer but safe.
  • Composite index column order: equality first (workspace_id), range second (created_at). Match query pattern.
  • Audit pg_stat_user_indexes quarterly. Most production DBs have 30%+ unused indexes silently taxing writes.

Common use cases

  • Engineer with a slow query needing the right index added
  • DBA reviewing existing indexes for waste
  • Backend engineer designing schema for a new feature
  • Tech lead doing performance audit of a Postgres database
  • Engineer migrating from MySQL to Postgres + needing index strategy
  • Solo dev whose DB is 'just slow' but they don't know what to index

Best AI model for this

Claude Opus 4. Index strategy needs reasoning about query patterns, write amplification, and storage costs — exactly Claude's strengths. ChatGPT GPT-5 second-best.

Pro tips

  • B-tree default. ~90% of indexes are B-tree. Don't over-think.
  • GIN for JSONB queries (`@>`, `?`). Partial for filtered queries (`WHERE status = 'active'`). Expression for transformed lookups (`LOWER(email)`).
  • Index columns matching WHERE clauses + ORDER BY. Not every column.
  • Composite index column order matters: most-selective first if equality, but ORDER BY columns at end.
  • Each index taxes writes ~10-30%. A table with dozens of indexes can end up spending more time maintaining indexes than writing rows.
  • Drop unused indexes. `pg_stat_user_indexes` shows usage. ~0 reads = candidate for drop.
  • EXPLAIN ANALYZE is your tool. Read the plan; don't guess.

Customization tips

  • Always paste the actual SQL query + EXPLAIN ANALYZE output. Index design depends on the precise access pattern.
  • Include table shape with column types + estimated sizes. Index choices differ for narrow tables vs wide.
  • List existing indexes precisely. The audit benefits from knowing what's already there.
  • Specify scale (row counts + growth). Indexing strategy at 1M rows differs from 1B rows (partitioning becomes the lever).
  • Mention read-write ratio. Heavily-read tables justify more indexes; heavily-written tables need fewer.
  • Use the Index Audit Mode variant for codebases with 50+ existing indexes — it identifies unused/redundant before adding new ones.

Variants

Slow Query Mode

For diagnosing + indexing a specific slow query — emphasizes EXPLAIN reading + targeted index.

Schema Design Mode

For designing indexes for a new schema — emphasizes upfront query patterns analysis.

Index Audit Mode

For auditing existing indexes — finds unused, redundant, or harmful indexes.

Multi-Tenant Mode

For tenant-isolated apps — emphasizes partial indexes + tenant-prefix patterns.

Frequently asked questions

How do I use the Database Index Strategist prompt?

Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.

Which AI model works best with Database Index Strategist?

Claude Opus 4. Index strategy needs reasoning about query patterns, write amplification, and storage costs — exactly Claude's strengths. ChatGPT GPT-5 second-best.

Can I customize the Database Index Strategist prompt for my use case?

Yes — every Promptolis Original is designed to be customized. Key levers: B-tree by default (~90% of indexes are B-tree; don't over-think it); GIN for JSONB queries (`@>`, `?`); partial for filtered queries (`WHERE status = 'active'`); expression for transformed lookups (`LOWER(email)`).

Explore more Originals

Hand-crafted 2026-grade prompts that actually change how you work.

← All Promptolis Originals