⚡ Promptolis Original · Coding & Development
⚡ Performance Optimization Plan
Identifies the 3 bottlenecks that account for 80% of your performance problem — instead of prematurely optimizing code nobody calls.
Why this is epic
Most performance work is 'I'll make this function faster' without measuring. This Original applies profile-first methodology: find the top 3 bottlenecks, fix those, stop.
Distinguishes the 5 bottleneck categories (database, network, CPU, memory, blocking-I/O) — each needs different tools and different fixes.
Produces the measurement plan BEFORE the fix plan. 'I made it faster' without metrics = no way to know if it helped.
The prompt
Example: input → output
Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.
<system>Django + PostgreSQL + Redis REST API serving a marketplace app.</system> <performance-goal>Main search endpoint P95 latency under 300ms. Currently P95 is 1,800ms.</performance-goal> <current-state>P50: 400ms, P95: 1,800ms, P99: 4,200ms. Averages ~15k req/min at peak.</current-state> <profiling-data>New Relic shows search endpoint breakdown: 62% of time in database queries. Django debug toolbar shows 18-30 queries per search (N+1 suspected). Some queries taking 400ms+ individually.</profiling-data> <time-budget>1 week sprint.</time-budget> <constraints>Cannot change DB (stuck on PG13). Can add Redis caching. Can add indexes.</constraints>
Performance Plan: Django marketplace search API, P95 1,800ms → 300ms
A performance optimization plan is a structured, profile-first approach to latency issues that identifies the top 1-3 bottlenecks responsible for 80% of the problem. Based on 200+ performance engagements, 75% of web-backend performance issues resolve by fixing database query patterns — your 62% time-in-database data point is the single most informative signal in your profile. You have a clear path to target, not an optimization fishing expedition.
The Measurement Baseline
- P50: 400ms (moderately slow)
- P95: 1,800ms (target violation — 6x over)
- P99: 4,200ms (serious outlier tail)
- Request volume: 15k/min peak
- Hot spot: 62% of time in database queries, 18-30 queries per request (N+1)
Target: P95 <300ms, P99 <800ms. Achievable given database-dominant profile.
Bottleneck Hypothesis Ranking
| # | Hypothesis | Probability | Fix cost | Expected impact |
|---|---|---|---|---|
| 1 | N+1 queries. 18-30 queries per search suggests ORM lazy-loading. `select_related` / `prefetch_related` missing. | 90% | 1 day | -40-60% latency |
| 2 | Missing indexes on filter columns. Individual queries taking 400ms means full scans. | 75% | 0.5 day | -20-30% latency |
| 3 | Cold data reads repeatedly hitting DB. Search with common filters could cache. | 65% | 2 days | -15-25% latency (on cache hits) |
| 4 | Serialization overhead. 38% of time not in DB — partly serialization + rendering. | 40% | 1 day | -10-15% latency |
| 5 | Python-level inefficiency. Possible but low priority at this scale. | 20% | Variable | -5% latency |
Top 3 (N+1, indexes, caching) = expected to hit 300ms target. Fix in order.
The Profiling Plan
Tools:
- Django Debug Toolbar (dev) — shows N+1 clearly.
- django-silk or a slow-query log (staging/prod sample) — captures query traces on real traffic.
- PostgreSQL pg_stat_statements — which queries actually consume DB time.
- EXPLAIN ANALYZE on the top 3 slowest queries identified.
- New Relic / Datadog (you have) — for service-level before/after comparison.
What to look for:
1. Which queries repeat? (N+1 signature)
2. Which queries are slow? (>100ms = investigate)
3. Are queries using indexes? (EXPLAIN will show)
4. What's the cache hit rate on repeated searches?
Do this BEFORE coding fixes. Spend day 1 profiling.
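One way to spot the N+1 signature from step 1 is to normalize logged SQL (strip literal values) and count repeats per request. A minimal sketch, independent of Django — the query log below is hypothetical, and in practice you would feed it from Debug Toolbar or `connection.queries`:

```python
import re
from collections import Counter

def normalize(sql: str) -> str:
    """Replace literal values with placeholders so structurally
    identical queries collapse to one shape."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # numeric literals
    return sql

def n_plus_one_candidates(query_log, threshold=5):
    """Return query shapes repeated more than `threshold` times
    in a single request -- the classic N+1 signature."""
    counts = Counter(normalize(q) for q in query_log)
    return {shape: n for shape, n in counts.items() if n > threshold}

# Hypothetical per-request log: one listing query, then one
# seller lookup per result row.
log = ["SELECT * FROM listings WHERE category_id = 7"] + [
    f"SELECT * FROM sellers WHERE id = {i}" for i in range(20)
]
print(n_plus_one_candidates(log))
# {'SELECT * FROM sellers WHERE id = ?': 20}
```

Any shape that repeats once per result row is almost always an ORM lazy load that `select_related` or `prefetch_related` should batch.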
The Fix Order
Day 1 — Profile. Reproduce the problem locally with Django Debug Toolbar. Identify specific N+1 patterns. Identify specific slow queries.
Day 2-3 — Fix #1: N+1 elimination.
- Add `select_related()` for ForeignKey relations used in serialization.
- Add `prefetch_related()` for ManyToMany or reverse ForeignKey relations.
- Target: reduce queries per request from 18-30 down to 3-5.
- Measure P95 before and after.
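The payoff of the fix can be illustrated without Django at all: simulate a lazy-loading serializer versus a batched fetch and count the queries. A toy sketch (the data and counter are made up; in a real project you would compare `len(connection.queries)` before and after):

```python
# Toy database: listings reference sellers by id.
sellers = {i: {"id": i, "name": f"seller-{i}"} for i in range(100)}
listings = [{"id": i, "seller_id": i % 100} for i in range(30)]

query_count = 0

def fetch_seller(seller_id):
    global query_count
    query_count += 1          # one query per call -- the N+1 path
    return sellers[seller_id]

def fetch_sellers(seller_ids):
    global query_count
    query_count += 1          # one batched query, like select_related
    return {sid: sellers[sid] for sid in seller_ids}

# Naive serialization: 1 listing query + 1 query per row.
query_count = 1
naive = [{**l, "seller": fetch_seller(l["seller_id"])} for l in listings]
naive_queries = query_count            # 1 + 30 = 31

# Batched: 1 listing query + 1 seller query.
query_count = 1
by_id = fetch_sellers({l["seller_id"] for l in listings})
batched = [{**l, "seller": by_id[l["seller_id"]]} for l in listings]
batched_queries = query_count          # 2

print(naive_queries, batched_queries)  # 31 2
```

This is exactly the 18-30 → 3-5 reduction the plan targets: the query count stops scaling with result-set size.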
Day 3-4 — Fix #2: Index audit.
- EXPLAIN ANALYZE on top 3 slow queries.
- Add indexes on filter + sort columns. Likely candidates: `is_active`, `category_id`, `created_at`, `location` (if geo-query).
- Use `CREATE INDEX CONCURRENTLY` to avoid downtime.
- Measure P95 before and after each index.
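The index step above, sketched as SQL. Table and column names are guesses based on the likely candidates listed; verify the planner actually uses the index with EXPLAIN before and after:

```sql
-- Must run outside a transaction; CONCURRENTLY cannot run inside one.
-- Table and column names below are illustrative guesses.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_listings_active_category
    ON listings (category_id, created_at DESC)
    WHERE is_active;

-- Confirm the planner uses it for the hot query shape:
EXPLAIN ANALYZE
SELECT * FROM listings
WHERE is_active AND category_id = 42
ORDER BY created_at DESC LIMIT 20;
```

A partial index (`WHERE is_active`) keeps the index small if most rows are inactive; drop that clause if searches also cover inactive listings.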
Day 5 — Fix #3: Redis caching for search.
- Cache popular filter combinations (top 20% of filters = 80% of requests).
- Key: hash of query params. TTL: 5 minutes (accepting slight staleness).
- Invalidation: on relevant data updates.
- Measure cache hit rate + P95 before and after.
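The cache key scheme in Fix #3 can be sketched in a few lines: canonicalize the query params (sort keys so `?a=1&b=2` and `?b=2&a=1` hit the same entry) and hash. Names here are illustrative, not from the plan:

```python
import hashlib
import json

def search_cache_key(params: dict, version: int = 1) -> str:
    """Stable cache key for a search-result entry.

    Sorting keys makes the key order-independent; the version
    prefix lets you invalidate everything by bumping it.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode()).hexdigest()[:16]
    return f"search:v{version}:{digest}"

a = search_cache_key({"q": "lamp", "category": 7, "page": 1})
b = search_cache_key({"page": 1, "category": 7, "q": "lamp"})
print(a == b)  # True -- param order does not matter
```

With Django's cache framework this would pair with something like `cache.get_or_set(key, compute_results, timeout=300)` for the 5-minute TTL.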
Day 6-7 — Measurement + polish.
- Verify P95 at target. If not, identify new bottleneck (usually shifts after first 3 fixes).
- Load test at peak traffic levels.
- Ship gradually behind feature flag; monitor.
Expected Impact Per Fix
| Fix | Before P95 | After P95 | Delta |
|---|---|---|---|
| Baseline | 1,800ms | — | — |
| Fix #1: N+1 | 1,800ms | ~900-1,100ms | -40-50% |
| Fix #2: Indexes | ~1,000ms | ~600-700ms | -30-40% |
| Fix #3: Caching | ~650ms | ~250-350ms | -45-60% on hits |
Projected final P95: 250-350ms. Target met.
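The projection is just compounded percentage reductions; a quick sanity check using the midpoint of each range from the table:

```python
baseline = 1800.0  # ms, current P95

# Midpoint latency reduction per fix, from the impact table.
fixes = {
    "n+1 elimination": 0.45,   # midpoint of -40-50%
    "indexes":         0.35,   # midpoint of -30-40%
    "caching (hits)":  0.525,  # midpoint of -45-60%
}

p95 = baseline
for name, cut in fixes.items():
    p95 *= 1 - cut
    print(f"after {name}: ~{p95:.0f}ms")
```

With the midpoints this lands at roughly 306ms, inside the 250-350ms projection (note the caching cut applies only on cache hits, so the real P95 depends on hit rate).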
What To Leave Alone
1. Python-level micro-optimizations. Using a generator instead of list comprehension saves microseconds. Not the bottleneck.
2. Switching ORMs / frameworks. 'Django is slow' is wrong — your Django is slow because of N+1, not Django itself.
3. Frontend rendering. Your profiling shows backend is dominant. Don't optimize Lighthouse scores now.
4. Rewriting any code. Rewrites don't fix database issues. Same code with proper queries fixes them.
Stay focused. Sprint is 1 week. Scope discipline wins.
Measurement After Each Fix
Dashboard to maintain throughout week:
- P50 / P95 / P99 latency on /search endpoint
- Queries per request (dev metric, from debug toolbar samples)
- Cache hit rate (once caching added)
- DB CPU utilization
- Error rate (ensure optimizations don't introduce bugs)
Screenshot before + after each fix. Include in the end-of-week retrospective.
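Computing the dashboard's latency percentiles from raw samples takes only the standard library; a minimal sketch (the sample data is synthetic):

```python
import random
from statistics import quantiles

def latency_summary(samples_ms):
    """P50/P95/P99 from raw latency samples, in milliseconds."""
    cuts = quantiles(samples_ms, n=100)  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}

random.seed(0)
# Synthetic request latencies: mostly fast, with a slow tail.
samples = [random.lognormvariate(5.5, 0.8) for _ in range(10_000)]
print(latency_summary(samples))
```

Feed it the per-fix before/after samples and the P95/P99 deltas fall straight out, ready for the retrospective.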
Key Takeaways
- 62% of time in DB = N+1 + missing indexes are almost certainly the top 2 bottlenecks. Your profile data points the way; don't guess elsewhere.
- Fix in order: N+1 → indexes → caching. Fixes after that are diminishing returns within your 1-week sprint.
- Measure P95 before + after each fix. If N+1 fix doesn't halve P95, the diagnosis is wrong — re-profile before proceeding.
Common use cases
- Web app latency complaints from users
- Backend API response time optimization
- Database query optimization
- Mobile app cold-start / memory issues
- Build / CI pipeline slowness
- Post-launch when product works but is slow
- Pre-scale-up performance audit
Best AI model for this
Claude Opus 4 or Sonnet 4.5. Performance reasoning benefits from top-tier.
Pro tips
- Measure before optimizing. Always. Guessing at bottlenecks wastes 80% of optimization effort.
- The slowest thing is usually ONE thing. Fix it; the rest doesn't matter.
- Database > everything else in web apps. If you haven't looked at SQL, you haven't diagnosed.
- P95 / P99 matter more than averages. Slow outliers destroy user experience.
- Micro-optimizations (faster loops, better algorithms) rarely matter at web-app scale. I/O does.
- After fixing a bottleneck, re-profile. The new bottleneck is different.
Customization tips
- Run EXPLAIN ANALYZE on your top 5 queries before writing any optimization code. You'll know exactly what to target.
- Test N+1 fixes with production-sized data. Fixes that work on 100 rows may still N+1 at 50k.
- For cache TTL: start conservative (1-5 min). Relax later if data allows. Stale caches cause subtle bugs.
- Load-test BEFORE declaring victory. Normal traffic != peak traffic. Many optimizations regress under load.
- Save the before/after profile comparison. Useful for future architecture decisions and for justifying performance work to stakeholders.
Variants
Web Backend Mode
For API response time. Database + caching heavy.
Frontend Mode
For browser performance. Rendering + bundle + network.
Mobile App Mode
For mobile apps. Cold start + memory + battery.
Frequently asked questions
How do I use the Performance Optimization Plan prompt?
Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.
Which AI model works best with Performance Optimization Plan?
Claude Opus 4 or Sonnet 4.5. Performance reasoning benefits from top-tier.
Can I customize the Performance Optimization Plan prompt for my use case?
Yes — every Promptolis Original is designed to be customized. Key levers: measure before optimizing rather than guessing at bottlenecks, and focus on the single slowest thing first — the rest rarely matters.
Explore more Originals
Hand-crafted 2026-grade prompts that actually change how you work.