⚡ Promptolis Original · Coding & Development
🤖 AI Coding Agent Workflow
The prompt + loop structure for getting 10x out of Claude Code / Cursor / AI agents — without the 'spins forever, ships bad code' failure mode most people hit.
Why this is epic
Most developers use AI coding agents as autocomplete — missing 80% of the value. This Original teaches the 4-loop structure (spec → agent run → review → refine) that senior engineers who 10x'd their output actually use.
Names the 5 failure modes (underspecified prompts, no test-first, accepting first output, not checking dependencies, AI-generated over-engineering) — with specific counter-patterns for each.
Distinguishes 3 agent-use patterns (pair-programming, delegated-task, bulk-refactor) — each needs different prompt structure and oversight.
The prompt
Example: input → output
Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.
<task>Add a new REST API endpoint to our backend: POST /api/v1/reports/export that accepts a date range + format (csv/pdf) and returns a job ID for async generation. Need to integrate with our existing Celery queue.</task> <your-role>Senior engineer, 7 years experience.</your-role> <tools>Claude Code.</tools> <codebase-context>Python/Django, PostgreSQL, Celery + Redis. Existing API patterns use DRF. ~50k LOC codebase. 40% test coverage.</codebase-context> <time-pressure>Sprint task. 2-3 day window including PR review.</time-pressure>
AI Agent Workflow: Report export endpoint with async Celery job
An AI coding agent workflow is a structured 4-loop pattern (spec → run → review → refine) that extracts senior-engineer-level output from AI assistants while avoiding the 'fluent bad code' trap. Across two years of observed AI-assisted development, engineers using structured 4-loop workflows shipped 3-5x faster than those using AI as autocomplete, and produced code that passed review at rates similar to manually written code. Your task is a good fit for the delegated-task pattern because it has clear inputs, outputs, and integration points.
Pattern Classification
Delegated task. You have a clear goal (new endpoint), known integration points (DRF, Celery), and bounded scope. You can specify it fully and let the agent implement with minimal back-and-forth. Not pair-programming (too much interactive back-and-forth for well-defined work). Not bulk-refactor (single feature).
The 4-Loop Structure
Loop 1 — Spec (15-30 min): Write the full specification BEFORE opening Claude Code. Include existing patterns to follow, test expectations, integration requirements.
Loop 2 — Agent run (30-60 min): Give the spec; let agent draft endpoint + tests + Celery task. Ask clarifying questions if needed.
Loop 3 — Review (20-30 min): Review code against spec. Run tests. Check for hallucinations.
Loop 4 — Refine (30-60 min): Iterate with specific feedback. Fix what's wrong. Expand test coverage if thin.
Total: 2-3 hours to an MVP, versus 6-8 hours hand-written, saving 3-5 hours.
The Specification Prompt
Context: I'm working in a Django + DRF + Celery codebase. Existing API patterns use:
- DRF ViewSets with serializers
- Celery tasks in tasks.py per app
- Jobs table for tracking async work (look at reports/models.py)
Task: Add POST /api/v1/reports/export endpoint.
Requirements:
1. Accept JSON body: { "start_date": "YYYY-MM-DD", "end_date": "YYYY-MM-DD", "format": "csv" | "pdf" }
2. Validate: end_date >= start_date, date range <= 1 year
3. Authenticated endpoint (user must be logged in)
4. Create a ReportJob record with status="pending"
5. Dispatch Celery task generate_report_task(job_id)
6. Return 202 Accepted with { "job_id": "<uuid>", "status": "pending" }
Integration points:
- ReportJob model exists in reports/models.py — use existing pattern
- Celery task should update job status (pending → running → completed/failed)
- Don't implement actual CSV/PDF generation yet — stub with TODO + log statement
Tests required (TDD):
- Unit test: endpoint validation (missing fields, invalid dates, end < start)
- Unit test: endpoint creates job + dispatches task
- Integration test: task updates job status
- All tests using pytest + pytest-django conventions (match existing test/ structure)
Please:
1. First, write the tests (failing)
2. Then write the implementation
3. Show me both in your response
4. List any assumptions you're making
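The validation rules in requirement 2 are framework-independent, so they can be sketched in plain Python before the agent wires them into a DRF serializer. This is a minimal sketch; the function name and error messages are illustrative, not from the spec:

```python
from datetime import date, timedelta

VALID_FORMATS = {"csv", "pdf"}

def validate_export_request(start_date: date, end_date: date, fmt: str) -> list[str]:
    """Return a list of error messages; an empty list means the request is valid."""
    errors = []
    if fmt not in VALID_FORMATS:
        errors.append(f"format must be one of {sorted(VALID_FORMATS)}")
    if end_date < start_date:
        errors.append("end_date must be >= start_date")
    elif end_date - start_date > timedelta(days=365):
        errors.append("date range must not exceed 1 year")
    return errors
```

In the real endpoint these checks would typically live in the serializer's `validate()` method so DRF returns a 400 with field errors automatically.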
The Review Checklist
After agent produces code:
- [ ] Tests actually test the stated behavior. Not just 'tests exist' — do they verify the requirements?
- [ ] All imports resolve. No hallucinated packages. Check that `drf_standardized_errors` actually exists in your requirements.
- [ ] Follows existing patterns. Does the ViewSet match your other endpoints? If not, flag it.
- [ ] Error handling matches codebase conventions. Django exceptions vs. DRF exceptions — consistent?
- [ ] Celery task handles failure gracefully. Retry logic? DLQ?
- [ ] No over-engineering. Did agent add features you didn't ask for (pagination, filtering, extras)? Usually delete those.
- [ ] Migration needed? If ReportJob model is modified, did agent generate migration?
- [ ] Security: endpoint auth correctly enforced? No IDOR? Rate limiting needed?
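The failure-handling item above boils down to the status transitions the spec requires (pending → running → completed/failed). A framework-independent sketch, with the Celery wiring and the real `ReportJob` model omitted and all names illustrative:

```python
def run_report_job(job: dict, generate) -> None:
    """Drive a job through pending -> running -> completed/failed.

    `job` stands in for the ReportJob record; `generate` is the
    (still-stubbed) report generation callable from the spec.
    """
    job["status"] = "running"
    try:
        generate(job)
    except Exception:
        job["status"] = "failed"
        raise  # re-raise so Celery's retry/failure hooks still fire
    else:
        job["status"] = "completed"
```

The `try/except/else` shape is the thing to check for in review: a task that swallows exceptions silently defeats both Celery's retry logic and the job-status tracking.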
The 5 Failure Modes (Avoid These)
1. Underspecified prompt. 'Add an endpoint' gets you a generic implementation that doesn't fit your codebase. The spec is your biggest time investment; make it a good one.
2. Accepting first output. Review EVERY file. AI produces plausible code that subtly doesn't match your conventions. Refinement is not optional.
3. Skipping test-first. AI can write tests + code faster than you can, but only if you ASK for tests. Otherwise it often skips them or produces trivial ones.
4. Trusting imports. AI occasionally imports packages that don't exist, or that don't exist at the version you're using. Running `pip install -r requirements.txt` and then the test suite catches this immediately.
5. AI over-engineering. The agent may add pagination, filtering, or caching that was never in your spec. Delete the unrequested additions.
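Failure mode 4 can also be caught mechanically before running anything. A small sketch using only the standard library (the function name is mine, not from any tool):

```python
import importlib.util

def missing_modules(module_names):
    """Return the module names that cannot be resolved in the current environment.

    Useful as a quick pre-test pass over the imports an agent just wrote.
    """
    return [name for name in module_names if importlib.util.find_spec(name) is None]
```

One caveat: the import name and the PyPI package name can differ (e.g. `PIL` vs `Pillow`), so actually running the test suite remains the authoritative check.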
The Test-First Protocol
Ask for tests BEFORE implementation in the same prompt:
- 'First, write the tests (failing)'
- 'Then write the implementation to pass them'
This is TDD-like and produces 2-3x better test coverage than implementation-first prompts. Even if you iterate, the tests define the target and shift the agent into a known-good pattern.
If you've already gotten implementation-first output, refuse to accept without tests — 'Now write the tests that verify this implementation covers requirements X, Y, Z.'
Iteration Strategy
When to refine:
- Output matches spec but style is off → refine with specific feedback ('match the ViewSet pattern in reports/views.py — particularly the error response shape')
- Tests missing edge cases → 'Add tests for: empty request body, invalid format string, dates in wrong order'
- Imports wrong → 'The package X isn't in our requirements; use Y pattern from reports/views.py instead'
When to start over (new thread):
- Output diverges significantly from what you asked
- Agent is confused about your codebase patterns after 2 refinements
- Accumulated refinements are making the context muddy
Rule of thumb: if 3 rounds of refinement don't get you close, start over with a better spec.
Your Time-Budget
| Phase | Without AI | With AI |
|---|---|---|
| Spec | 30 min | 30 min |
| Implementation | 4-5 hrs | 1 hr (agent) + 30 min review + 30 min refine |
| Tests | 2 hrs | (included in agent run) |
| Integration | 1 hr | 1 hr |
| PR cleanup | 30 min | 30 min |
| Total | 8-9 hrs | 4 hrs |
50% time savings. Better spec = bigger savings.
Key Takeaways
- Spec writing is your highest-leverage activity. 30 min on spec saves 3 hours on implementation.
- Always ask for tests FIRST. Tests-then-implementation produces better output than reversed.
- Review against the spec. Not against 'does it look right' — does it match what you asked for?
- Iterate 2-3 times max per thread. If it's not landing, start over with a tighter spec.
Common use cases
- Developers new to AI coding agents learning to maximize productivity
- Senior engineers structuring team AI workflows
- Team leads training junior engineers on AI-assisted coding
- Code reviewers evaluating AI-generated code
- Building products faster (solo founders, small teams)
- Migrating legacy codebases with AI assistance
- Writing tests at scale with AI
Best AI model for this
Claude Opus 4 or Claude Sonnet 4.5. A meta-prompt like this benefits from a top-tier model.
Pro tips
- Write the spec BEFORE engaging the agent. AI without specification = AI guessing.
- Always test-first. Ask for tests before code. Or give the test, ask for implementation.
- Never accept first output for anything complex. Iterate 2-3 times with specific feedback.
- Check imports and dependencies. AI hallucinates packages that don't exist.
- For bulk refactors, test on 1 file first; extrapolate to 100 only after verifying the pattern.
- Track where AI saves you time vs. where it costs. After 30 days, you'll know your personal ROI pattern.
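The bulk-refactor tip above can be made concrete with a dry-run harness: apply the transformation to one file, inspect the diff, then widen. The regex and the `old_helper`/`new_helper` names are hypothetical placeholders for whatever pattern your refactor targets:

```python
import re
from pathlib import Path

def apply_refactor(source: str) -> str:
    """The single, verifiable transformation (hypothetical rename)."""
    return re.sub(r"\bold_helper\(", "new_helper(", source)

def refactor_files(paths, dry_run=True):
    """Apply the transformation; with dry_run=True, only report what would change."""
    changed = []
    for path in paths:
        original = path.read_text()
        updated = apply_refactor(original)
        if updated != original:
            changed.append(path)
            if not dry_run:
                path.write_text(updated)
    return changed
```

Keeping the transformation in one pure function (`apply_refactor`) is what makes "test on 1 file first" cheap: you can assert on strings before touching the filesystem.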
Customization tips
- Keep a 'spec template' for common task types (new endpoint, new feature, bug fix). Reusable specs compound over weeks.
- After every AI task, note what worked + what needed refining. After 30 tasks, you'll have your personal playbook.
- For teams: share effective specs. Good specs transfer; individual prompts don't.
- Don't use AI for work where you can't verify output. Areas where you're weak (e.g., security, infra) require extra review rigor.
- Expect AI productivity gain to flatten around 40-60% savings for most tasks. Beyond that, specification quality dominates, not AI capability.
Variants
Pair-Programming Mode
For interactive back-and-forth coding. Tighter loops, smaller chunks.
Delegated-Task Mode
For 'go implement feature X' style. Longer spec, more autonomous run.
Bulk-Refactor Mode
For applying changes across many files. Pattern + verification.
Frequently asked questions
How do I use the AI Coding Agent Workflow prompt?
Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.
Which AI model works best with AI Coding Agent Workflow?
Claude Opus 4 or Claude Sonnet 4.5. A meta-prompt like this benefits from a top-tier model.
Can I customize the AI Coding Agent Workflow prompt for my use case?
Yes — every Promptolis Original is designed to be customized. Key levers: write the spec before engaging the agent (AI without a specification is AI guessing), and always go test-first by asking for tests before code, or by supplying the test and asking for the implementation.
Explore more Originals
Hand-crafted 2026-grade prompts that actually change how you work.
← All Promptolis Originals