⚡ Promptolis Original · Coding & Development
🗃️ Database Index Strategist
Designs your DB index strategy: which queries need indexes, the 4 index types (B-tree, GIN, partial, expression) and when each fits, how to read EXPLAIN plans, and the indexes to DROP that aren't pulling weight.
Why this is epic
Most teams index reactively: query is slow, add an index, hope. Then 200 indexes later, writes are slow + nobody knows which indexes pull weight. This Original designs indexes deliberately + identifies which to drop.
Outputs the strategy: per-query analysis, index type per query (B-tree default, GIN for arrays/JSONB, partial for filtered queries, expression for transformed data), the EXPLAIN-plan reading, and the index audit (which to drop).
Calibrated to 2026 Postgres reality: GIN indexes for JSONB queries, partial indexes for tenant-isolated queries, the cost of write amplification with many indexes, when covering indexes pay off.
Includes the index audit: most production DBs have ~30% of indexes that pg_stat says are unused. Dropping them halves write amplification + reduces backup size.
The prompt
Example: input → output
Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.
<database>Postgres 16</database>
<scenario>Slow query — taking 4-8 seconds in production</scenario>
<table-shape>
Table `events` (audit log):
id BIGSERIAL PRIMARY KEY
workspace_id UUID NOT NULL -- multi-tenant; queries always filter by this
user_id UUID
event_type TEXT NOT NULL -- e.g., 'user.login', 'project.created', 'document.shared'
resource_id UUID -- the entity acted on (project_id, document_id, etc.)
resource_type TEXT -- 'project', 'document', etc.
metadata JSONB -- arbitrary structured data
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
</table-shape>
<query-or-pattern>
The slow query (run from a customer's audit log page):
```sql
SELECT id, event_type, user_id, resource_id, resource_type, metadata, created_at
FROM events
WHERE workspace_id = $1
AND created_at >= $2 AND created_at < $3
AND ($4::text IS NULL OR event_type = $4)
ORDER BY created_at DESC
LIMIT 100;
```
Usage: customer views their audit log, paginated. Filters: workspace (always), date range (always, last 7-30 days typical), optional event_type. Sort: most recent first.
</query-or-pattern>
<existing-indexes>
events_pkey (id) — primary key (B-tree, default)
idx_events_workspace_id (workspace_id) — basic single-column
idx_events_user_id (user_id) — added 'just in case'
idx_events_created_at (created_at) — added when query was already slow
</existing-indexes>
<explain-plan>
```
Limit (cost=0.42..1234.56 rows=100 width=...) (actual time=4523.123..4789.234 rows=100 loops=1)
-> Index Scan Backward using idx_events_created_at on events (cost=0.42..234567.89 rows=18923 width=...) (actual time=4523.121..4789.230 rows=100 loops=1)
Filter: ((workspace_id = '...'::uuid) AND (created_at >= '...') AND (created_at < '...') AND (...))
Rows Removed by Filter: 1247563
Planning Time: 0.234 ms
Execution Time: 4789.345 ms
```
</explain-plan>
<scale>events table: ~50M rows currently. Growing ~500K rows/day. Per-workspace: avg ~10K rows, max ~500K (large customers).</scale>
<read-write-ratio>~95% writes (events being recorded), ~5% reads (audit log views). But reads are expensive.</read-write-ratio>
Index Strategy: Audit Log Slow Query
Query Pattern Analysis
The query has 3 filter dimensions + 1 sort:
- workspace_id = (equality, always present) — most selective; value varies per call
- created_at >= AND < (range, always present) — typically last 7-30 days
- event_type = (equality, OPTIONAL — often NULL) — moderately selective
- ORDER BY created_at DESC LIMIT 100
This is a classic compound-filter + range + sort pattern. The plan currently chooses the wrong index because none of your existing indexes match the access pattern.
EXPLAIN Plan Walkthrough
Index Scan Backward using idx_events_created_at on events
Filter: ((workspace_id = '...'::uuid) AND (created_at >= '...') AND ...)
Rows Removed by Filter: 1247563
Translation: Postgres uses the created_at index to scan backward (newest first), reads each row, then filters by workspace + date range. It scans 1.25M rows to find 100 matching the workspace. That's the slow part — Rows Removed by Filter: 1247563.
The issue: existing indexes don't combine workspace_id + created_at. The planner picks created_at (covers the sort) but loses the workspace selectivity.
Recommended Indexes
Drop existing inefficient indexes (audit them first):
-- These don't match your query pattern; review pg_stat_user_indexes for usage:
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'events';
If idx_events_user_id and idx_events_workspace_id show low idx_scan, drop them. They're write-tax without read benefit.
Add the right index:
-- Composite index matching the query pattern
CREATE INDEX CONCURRENTLY idx_events_workspace_created
ON events (workspace_id, created_at DESC);
Why this index:
- workspace_id first (equality filter — limits the scan to ~10K rows per workspace)
- created_at DESC second (matches both the range filter and the sort)
- Postgres can use this to filter by workspace, then range-scan created_at in DESC order natively (no sort needed)
Expected impact: query drops from ~4.5s to <50ms. The planner walks ~10K workspace rows + filters by date range, instead of 1.25M rows.
Optional: partial index (if one event_type dominates queries)
If a significant fraction of queries filter by event_type, add a partial index:
-- Partial index for the most-queried event types
CREATE INDEX CONCURRENTLY idx_events_workspace_created_user_login
ON events (workspace_id, created_at DESC)
WHERE event_type = 'user.login';
Only use if event_type filtering is high-volume. Otherwise, the main composite index is sufficient.
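A hedged alternative, if queries filter on many different event types rather than one hot type, is a three-column composite. This is a sketch, not part of the main recommendation; the index name is illustrative:

```sql
-- Serves workspace + event_type + date-range + DESC sort when event_type is bound;
-- queries without event_type still use the main (workspace_id, created_at DESC) index
CREATE INDEX CONCURRENTLY idx_events_workspace_type_created
ON events (workspace_id, event_type, created_at DESC);
```

Weigh the extra write cost against how often event_type is actually supplied.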
For JSONB metadata queries (future-proofing)
If you start querying inside the metadata JSONB:
-- GIN index on JSONB for containment queries
CREATE INDEX CONCURRENTLY idx_events_metadata_gin
ON events USING GIN (metadata);
Only add when actually needed; GIN indexes are ~3x bigger than B-tree.
Composite Index Column Order
For (workspace_id, created_at DESC):
Why workspace_id first:
- Equality filter — most selective at workspace level
- Postgres builds the B-tree by workspace, then within each workspace by created_at
- Reading is: 'find this workspace's branch, then walk created_at within it'
Why created_at DESC:
- Matches the ORDER BY direction
- Postgres reads index entries in stored order — DESC matches your query exactly
Wrong order would be: (created_at, workspace_id). Then the planner would have to scan all events in the date range across all workspaces, then filter. Much slower.
Rule of thumb:
1. Equality columns first (high selectivity)
2. Range columns after
3. ORDER BY columns at the end (in matching direction)
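Applying the rule of thumb to this query, the contrast as DDL (the wrong-order index name is hypothetical, shown only for illustration):

```sql
-- Right: equality column first, then range/sort column
CREATE INDEX idx_events_workspace_created
ON events (workspace_id, created_at DESC);

-- Wrong: range/sort column first — the planner must walk the whole
-- date range across ALL workspaces before filtering by workspace_id
CREATE INDEX idx_events_created_workspace
ON events (created_at DESC, workspace_id);
```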
Special Index Types in Your Case
Partial Index (consider to shrink index size)
If 80% of your audit log queries filter by recent dates only:
-- Index predicates must use IMMUTABLE expressions, so NOW() is not allowed;
-- use a fixed cutoff date and roll it forward periodically:
CREATE INDEX idx_events_recent
ON events (workspace_id, created_at DESC)
WHERE created_at >= '2026-01-01';
Only indexes recent rows. Smaller index, faster build, less write tax. But: requires periodic rebuild as 'recent' rolls forward.
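The periodic rebuild can be a create-then-drop swap; names and dates below are illustrative:

```sql
-- Build the replacement with a newer cutoff, then drop the stale one
CREATE INDEX CONCURRENTLY idx_events_recent_next
ON events (workspace_id, created_at DESC)
WHERE created_at >= '2026-04-01';

DROP INDEX CONCURRENTLY idx_events_recent;
```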
Expression Index (not needed here, but for future)
For case-insensitive search on event_type:
CREATE INDEX idx_events_event_type_lower ON events (LOWER(event_type));
Matches queries like WHERE LOWER(event_type) = 'user.login'.
GIN Index (when querying JSONB)
If metadata contains queryable structures:
-- For 'metadata @> ...' queries
CREATE INDEX idx_events_metadata ON events USING GIN (metadata);
-- Or a B-tree expression index on a single path (smaller than a full GIN):
CREATE INDEX idx_events_metadata_user_id ON events ((metadata->>'user_id'));
Evaluate only when JSONB queries are actually slow.
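For reference, the kind of query the GIN index serves — the metadata key here is hypothetical:

```sql
-- Containment (@>) can use the GIN index; a plain ->> comparison cannot
SELECT id, created_at
FROM events
WHERE workspace_id = $1
  AND metadata @> '{"ip": "203.0.113.7"}';
```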
Write Amplification
Before: 4 indexes (incl. PK). Write amplification ~5x (each write updates the row + updates 4 indexes).
After: 2 indexes (PK + composite). Write amplification ~3x.
Net write improvement: going from ~5x to ~3x amplification cuts per-row write work by roughly 40%. With a ~95%-write workload, that speedup is meaningful.
If you keep idx_events_user_id (assume some queries use it), still net better than 4 indexes.
Verification After Apply
Before deploying:
-- 1. Build new index CONCURRENTLY in production (no table lock)
CREATE INDEX CONCURRENTLY idx_events_workspace_created
ON events (workspace_id, created_at DESC);
-- 2. Test query plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, event_type, ...
FROM events
WHERE workspace_id = '...' AND created_at >= '...' AND created_at < '...'
ORDER BY created_at DESC LIMIT 100;
-- Should show: Index Scan using idx_events_workspace_created
-- Execution Time: <50ms
Pass criteria:
- Plan uses new composite index
- Execution time <50ms (down from ~4.5s)
- No more 'Rows Removed by Filter' line
After deploying, monitor production:
- Audit log endpoint p95 latency drops from ~5s to <100ms
- DB CPU from queries drops noticeably
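A quick post-deploy check that the new index is actually being used (index name assumed from the CREATE above):

```sql
-- idx_scan should climb as audit-log pages are served
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'events'
  AND indexrelname = 'idx_events_workspace_created';
```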
Migration Plan
For adding the new index in production:
-- CONCURRENTLY = no table lock; takes longer to build
CREATE INDEX CONCURRENTLY idx_events_workspace_created
ON events (workspace_id, created_at DESC);
For 50M rows, expect ~10-30 minutes to build. Monitor during build:
-- Check index build progress (PG 12+)
-- pg_stat_progress_create_index reports blocks/tuples, not bytes:
SELECT phase, blocks_done, blocks_total, tuples_done, tuples_total
FROM pg_stat_progress_create_index;
For dropping unused indexes (after audit):
-- Verify safe to drop:
SELECT indexrelname, idx_scan, last_idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'events' AND indexrelname = 'idx_events_user_id';
-- If idx_scan = 0 over 30+ days of usage:
DROP INDEX CONCURRENTLY idx_events_user_id;
Use CONCURRENTLY for both create + drop in production. No table lock.
Reset stats first if you want a clean window of measurement:
SELECT pg_stat_reset();  -- note: resets ALL statistics counters for the current database
-- Wait 30 days
-- Then evaluate idx_scan counts
What This Strategy Won't Solve
- Won't help if writes are saturating I/O. Index strategy helps reads; write throughput needs different optimizations (partitioning, sharding, async writes).
- Won't fix queries that scan large date ranges. Customer asks for '5 years of audit log' — even with indexes, that's 100M rows. Requires data retention policy or partitioning.
- Won't help if statistics are stale. Run ANALYZE events; periodically so the planner can choose the right index.
- Won't compensate for poor query design. WHERE workspace_id::text = $1 (a cast that defeats the index) — fix the query, not the index.
- Won't replace partitioning at scale. When the events table reaches 1B rows, even good indexes slow down. Consider partitioning by created_at month.
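If partitioning does become necessary, a declarative range-partitioning sketch — table and partition names are illustrative:

```sql
-- Parent table partitioned by created_at; indexes created on the parent
-- propagate automatically to each partition
CREATE TABLE events_partitioned (
  LIKE events INCLUDING DEFAULTS
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2026_01 PARTITION OF events_partitioned
  FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
```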
Key Takeaways
- Composite (workspace_id, created_at DESC) is your fix. Drops the query from 4.5s → <50ms by matching the access pattern.
- Drop unused single-column indexes (workspace_id alone, user_id alone) after audit. Each unused index taxes writes ~10-30%.
- Build with CONCURRENTLY in production. No table lock; takes longer but safe.
- Composite index column order: equality first (workspace_id), range second (created_at). Match the query pattern.
- Audit pg_stat_user_indexes quarterly. Most production DBs have 30%+ unused indexes silently taxing writes.
Common use cases
- Engineer with a slow query needing the right index added
- DBA reviewing existing indexes for waste
- Backend engineer designing schema for a new feature
- Tech lead doing performance audit of a Postgres database
- Engineer migrating from MySQL to Postgres + needing index strategy
- Solo dev whose DB is 'just slow' but they don't know what to index
Best AI model for this
Claude Opus 4. Index strategy needs reasoning about query patterns, write amplification, and storage costs — exactly Claude's strengths. ChatGPT GPT-5 second-best.
Pro tips
- B-tree default. ~90% of indexes are B-tree. Don't over-think.
- GIN for JSONB queries (`@>`, `?`). Partial for filtered queries (`WHERE status = 'active'`). Expression for transformed lookups (`LOWER(email)`).
- Index columns matching WHERE clauses + ORDER BY. Not every column.
- Composite index column order matters: most-selective first if equality, but ORDER BY columns at end.
- Each index taxes writes ~10-30%. Stack up dozens and write throughput can drop by half or more.
- Drop unused indexes. `pg_stat_user_indexes` shows usage. ~0 reads = candidate for drop.
- EXPLAIN ANALYZE is your tool. Read the plan; don't guess.
Customization tips
- Always paste the actual SQL query + EXPLAIN ANALYZE output. Index design depends on the precise access pattern.
- Include table shape with column types + estimated sizes. Index choices differ for narrow tables vs wide.
- List existing indexes precisely. The audit benefits from knowing what's already there.
- Specify scale (row counts + growth). Indexing strategy at 1M rows differs from 1B rows (partitioning becomes the lever).
- Mention read-write ratio. Heavily-read tables justify more indexes; heavily-written tables need fewer.
- Use the Index Audit Mode variant for codebases with 50+ existing indexes — it identifies unused/redundant before adding new ones.
Variants
Slow Query Mode
For diagnosing + indexing a specific slow query — emphasizes EXPLAIN reading + targeted index.
Schema Design Mode
For designing indexes for a new schema — emphasizes upfront query patterns analysis.
Index Audit Mode
For auditing existing indexes — finds unused, redundant, or harmful indexes.
Multi-Tenant Mode
For tenant-isolated apps — emphasizes partial indexes + tenant-prefix patterns.
Frequently asked questions
How do I use the Database Index Strategist prompt?
Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.
Which AI model works best with Database Index Strategist?
Claude Opus 4. Index strategy needs reasoning about query patterns, write amplification, and storage costs — exactly Claude's strengths. ChatGPT GPT-5 second-best.
Can I customize the Database Index Strategist prompt for my use case?
Yes — every Promptolis Original is designed to be customized. Key levers: B-tree by default (~90% of indexes are B-tree — don't over-think); GIN for JSONB queries (`@>`, `?`); partial for filtered queries (`WHERE status = 'active'`); expression for transformed lookups (`LOWER(email)`).
Explore more Originals
Hand-crafted 2026-grade prompts that actually change how you work.
← All Promptolis Originals