⚡ Promptolis Original · Coding & Development

🗄️ Database Schema Designer

Designs the schema that scales to 10M rows + handles the queries you haven't thought of yet — without premature optimization.

⏱️ 5 min to design 🤖 ~60 seconds in Claude 🗓️ Updated 2026-04-19

Why this is epic

Most schema designs either over-engineer (indexes on everything, sharding on day 1) or under-engineer (3 tables, no constraints, regret at 100k rows). This Original produces the schema that works at 10M rows without over-building for day 1.

Applies the decision trees: normalize vs. denormalize, SQL vs. NoSQL, foreign key or soft reference, when to add indexes — with the specific questions that answer each.

Names the 8 schema mistakes that hurt at scale (wrong PK type, no created_at, missing indexes on join columns, storing JSON in SQL without good reason, etc.) — caught before you ship.

The prompt

Promptolis Original · Copy-ready
<role>
You are a staff database engineer who has designed 100+ schemas from greenfield to 100M+ row scale. You know when to normalize, when to denormalize, and when to reach for NoSQL.
</role>

<principles>
1. Design for 10M rows, not 100 rows. But don't over-engineer for 1B on day 1.
2. UUIDs as PKs in modern systems.
3. created_at + updated_at on every table.
4. Soft delete > hard delete for most use cases.
5. Index on foreign keys. Always.
6. Avoid JSON-in-SQL unless justified.
</principles>

<input>
<product-description>{what the app does}</product-description>
<core-entities>{the 'nouns' in your system}</core-entities>
<key-queries>{queries you'll run most often}</key-queries>
<scale-expectation>{users, rows, QPS in 18 months}</scale-expectation>
<constraints>{compliance, audit, tenant isolation, etc.}</constraints>
<stack>{PostgreSQL, MySQL, MongoDB, DynamoDB, etc.}</stack>
</input>

<output-format>
# Schema Design: [Product]
## Entity Model
Core entities + relationships.
## Schema (DDL)
Actual CREATE TABLE statements or equivalent.
## Indexes
Which + why.
## Normalization Decisions
Where you normalized vs. denormalized + why.
## Common Queries (Example)
3-5 key queries + how schema serves them.
## The 8 Schema Mistakes (And Which You Avoided)
## Migration Strategy
How to evolve this schema over time.
## When To Revisit
Specific thresholds.
## Key Takeaways
3 bullets.
</output-format>

<auto-intake>
If incomplete: ask for product, entities, queries, scale, constraints, stack.
</auto-intake>

Now, design:

Example: input → output

Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.

📝 Input
<product-description>Online course platform. Instructors publish courses. Students enroll, watch lessons, track progress, take quizzes, get certificates.</product-description>
<core-entities>User, Course, Lesson, Enrollment, Progress, Quiz, QuizAttempt, Certificate.</core-entities>
<key-queries>1) Show student their enrolled courses + progress. 2) Show instructor their courses + student counts. 3) List all lessons in a course. 4) Record progress as student completes lessons. 5) Query leaderboard (top students by quiz scores).</key-queries>
<scale-expectation>18 months: 100k students, 500 instructors, 5k courses, 50k enrollments/mo, 5M progress records.</scale-expectation>
<constraints>GDPR (EU users). Audit log required for admin actions. Multi-tenant (organizations).</constraints>
<stack>PostgreSQL. Using Django ORM.</stack>
🤖 Output

Schema Design: Online Course Platform, PostgreSQL, multi-tenant

This is a normalized-yet-practical structure for the core entities + key queries, calibrated to the expected scale. Based on experience with 100+ production schemas: at your 18-month scale (5M progress records, 100k users), PostgreSQL with standard normalization + indexing will perform well without sharding, caching, or NoSQL. The multi-tenant + GDPR constraints shape this schema more than scale does; those require specific patterns.

Entity Model

  • Organization (tenant) → has many → Users (students + instructors)
  • Course belongs to Organization, belongs to instructor User
  • Lesson belongs to Course
  • Enrollment = (Student User, Course) + progress
  • Progress = (Enrollment, Lesson) — can be its own table, or embedded
  • Quiz belongs to Lesson; QuizAttempt = (User, Quiz)
  • Certificate = (User, Course)

Schema (PostgreSQL DDL)

CREATE TABLE organizations (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name VARCHAR(255) NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  deleted_at TIMESTAMPTZ
);

CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  organization_id UUID NOT NULL REFERENCES organizations(id),
  email VARCHAR(255) NOT NULL,
  role VARCHAR(32) NOT NULL CHECK (role IN ('student','instructor','admin')),
  name VARCHAR(255),
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  deleted_at TIMESTAMPTZ,
  UNIQUE (organization_id, email)
);

CREATE TABLE courses (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  organization_id UUID NOT NULL REFERENCES organizations(id),
  instructor_id UUID NOT NULL REFERENCES users(id),
  title VARCHAR(255) NOT NULL,
  description TEXT,
  published_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  deleted_at TIMESTAMPTZ
);

CREATE TABLE lessons (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  course_id UUID NOT NULL REFERENCES courses(id),
  title VARCHAR(255) NOT NULL,
  content_url TEXT,
  order_index INTEGER NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  deleted_at TIMESTAMPTZ,
  UNIQUE (course_id, order_index)
);

CREATE TABLE enrollments (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  student_id UUID NOT NULL REFERENCES users(id),
  course_id UUID NOT NULL REFERENCES courses(id),
  enrolled_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  completed_at TIMESTAMPTZ,
  -- Denormalized count for fast progress display
  lessons_completed_count INTEGER NOT NULL DEFAULT 0,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE (student_id, course_id)
);

CREATE TABLE progress (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  enrollment_id UUID NOT NULL REFERENCES enrollments(id),
  lesson_id UUID NOT NULL REFERENCES lessons(id),
  completed_at TIMESTAMPTZ,
  watch_seconds INTEGER,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE (enrollment_id, lesson_id)
);

CREATE TABLE quizzes (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  lesson_id UUID NOT NULL REFERENCES lessons(id),
  title VARCHAR(255) NOT NULL,
  pass_threshold INTEGER NOT NULL CHECK (pass_threshold BETWEEN 0 AND 100),
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  deleted_at TIMESTAMPTZ
);

CREATE TABLE quiz_attempts (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  quiz_id UUID NOT NULL REFERENCES quizzes(id),
  user_id UUID NOT NULL REFERENCES users(id),
  score INTEGER NOT NULL,
  passed BOOLEAN NOT NULL,
  attempted_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE certificates (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID NOT NULL REFERENCES users(id),
  course_id UUID NOT NULL REFERENCES courses(id),
  issued_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  UNIQUE (user_id, course_id)
);

CREATE TABLE audit_logs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  actor_user_id UUID REFERENCES users(id),
  action VARCHAR(64) NOT NULL,
  target_type VARCHAR(64) NOT NULL,
  target_id UUID NOT NULL,
  metadata JSONB,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

Indexes

-- Foreign key indexes (always)
CREATE INDEX idx_users_org ON users(organization_id);
CREATE INDEX idx_courses_org ON courses(organization_id);
CREATE INDEX idx_courses_instructor ON courses(instructor_id);
CREATE INDEX idx_lessons_course ON lessons(course_id);
CREATE INDEX idx_enrollments_student ON enrollments(student_id);
CREATE INDEX idx_enrollments_course ON enrollments(course_id);
CREATE INDEX idx_progress_enrollment ON progress(enrollment_id);

-- Query-specific indexes
CREATE INDEX idx_courses_published ON courses(published_at) WHERE published_at IS NOT NULL;
CREATE INDEX idx_enrollments_active ON enrollments(student_id) WHERE completed_at IS NULL;

-- Leaderboard query support
CREATE INDEX idx_quiz_attempts_score ON quiz_attempts(quiz_id, score DESC);

-- Audit log query support
CREATE INDEX idx_audit_logs_target ON audit_logs(target_type, target_id, created_at DESC);

Normalization Decisions

  • Normalized: lessons separate from courses; enrollments separate from users; progress in its own table.
  • Denormalized: enrollments.lessons_completed_count — cached counter avoids a COUNT query on every progress-display page. Updated via trigger or app code whenever a progress row is marked complete.
  • JSONB used sparingly: only on audit_logs.metadata for flexible action-specific data. All core data is relational.
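
A minimal trigger sketch for maintaining that counter (hypothetical function and trigger names; covers only the UPDATE path where completed_at flips from NULL):

```sql
-- Sketch: keep enrollments.lessons_completed_count in sync.
-- Handles the UPDATE path only; an INSERT arriving with
-- completed_at already set would need a separate AFTER INSERT trigger.
CREATE OR REPLACE FUNCTION bump_lessons_completed() RETURNS trigger AS $$
BEGIN
  IF NEW.completed_at IS NOT NULL AND OLD.completed_at IS NULL THEN
    UPDATE enrollments
       SET lessons_completed_count = lessons_completed_count + 1,
           updated_at = NOW()
     WHERE id = NEW.enrollment_id;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_progress_completed
AFTER UPDATE OF completed_at ON progress
FOR EACH ROW EXECUTE FUNCTION bump_lessons_completed();
```

App-code maintenance (e.g. in the Django save path) is the simpler alternative if you'd rather keep logic out of the database.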

Common Queries (Example)

Q1: Student's enrolled courses + progress:

SELECT c.title, e.lessons_completed_count, e.completed_at
FROM enrollments e
JOIN courses c ON c.id = e.course_id
WHERE e.student_id = $1 AND c.deleted_at IS NULL
ORDER BY e.enrolled_at DESC;

Served by idx_enrollments_student — fast even at 1M enrollments.

Q5: Leaderboard:

SELECT u.name, MAX(qa.score) as best_score
FROM quiz_attempts qa
JOIN users u ON u.id = qa.user_id
WHERE qa.quiz_id = $1
GROUP BY u.id, u.name
ORDER BY best_score DESC
LIMIT 10;

Served by idx_quiz_attempts_score.
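
To validate claims like these before shipping, inspect the plan against production-sized data (the quiz UUID below is a placeholder):

```sql
EXPLAIN ANALYZE
SELECT u.name, MAX(qa.score) AS best_score
FROM quiz_attempts qa
JOIN users u ON u.id = qa.user_id
WHERE qa.quiz_id = '00000000-0000-0000-0000-000000000000'  -- placeholder
GROUP BY u.id, u.name
ORDER BY best_score DESC
LIMIT 10;
-- Check that the plan uses idx_quiz_attempts_score rather than a
-- Seq Scan on quiz_attempts; plans on near-empty tables are not
-- representative of plans at 5M rows.
```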

The 8 Schema Mistakes (And Which You Avoided)

1. ✅ Integer auto-increment PK — using UUID.

2. ✅ Missing created_at/updated_at — present everywhere.

3. ✅ Missing FK indexes — all FKs indexed.

4. ✅ Hard delete — using soft delete via `deleted_at`.

5. ✅ No constraints — CHECK on role, pass_threshold; NOT NULL where sensible.

6. ✅ JSON when relational would work — JSONB only in audit_logs metadata.

7. ✅ Missing unique constraints — (org_id, email), (student_id, course_id), etc.

8. ✅ No audit table — audit_logs table present.

Migration Strategy

  • Use Django migrations + reviewable SQL.
  • For schema changes at scale: 'expand-migrate-contract' pattern. Add new columns (backward-compatible), backfill, update app code, remove old columns last.
  • Large table additions (progress at 5M rows): use CREATE INDEX CONCURRENTLY.
  • Archive old audit_logs (>1 year) to cheap storage quarterly.
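
The expand-migrate-contract pattern, sketched as SQL (display_name is a hypothetical new column, not part of the schema above):

```sql
-- 1. Expand: add the new column, nullable, backward-compatible.
ALTER TABLE users ADD COLUMN display_name VARCHAR(255);

-- 2. Migrate: backfill in batches to avoid long-held row locks.
--    Repeat until it reports 0 rows updated.
UPDATE users
   SET display_name = name
 WHERE id IN (
   SELECT id FROM users WHERE display_name IS NULL LIMIT 10000
 );

-- 3. Deploy app code that reads/writes display_name.

-- 4. Contract: drop the old column only once nothing references it.
ALTER TABLE users DROP COLUMN name;
```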

When To Revisit

  • Enrollments > 1M rows: review query plans for student-dashboard query; may need materialized views.
  • Progress > 10M rows: consider partitioning (range by month on created_at, or hash by enrollment_id).
  • Quiz attempts > 5M: leaderboard query may slow; consider materialized view refresh nightly.
  • Multi-region users: reconsider primary/replica strategy.
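
When the leaderboard threshold hits, a nightly materialized view is one option (sketch; view and index names are hypothetical):

```sql
CREATE MATERIALIZED VIEW quiz_leaderboard AS
SELECT quiz_id, user_id, MAX(score) AS best_score
FROM quiz_attempts
GROUP BY quiz_id, user_id;

-- REFRESH ... CONCURRENTLY requires a unique index on the view.
CREATE UNIQUE INDEX idx_quiz_leaderboard ON quiz_leaderboard(quiz_id, user_id);

-- Run nightly (cron, pg_cron, or a Celery beat task):
REFRESH MATERIALIZED VIEW CONCURRENTLY quiz_leaderboard;
```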

Key Takeaways

  • UUID PKs + created_at/updated_at/deleted_at on every table. Future-proofs nearly every concern.
  • Denormalize enrollments.lessons_completed_count to avoid COUNT queries. Maintain via trigger.
  • Index every FK column. Missing FK indexes are the #1 query perf issue at your scale.

Common use cases

  • Greenfield project schema from scratch
  • Reviewing an existing schema before adding new features
  • Migrating schema (adding columns, restructuring)
  • Choosing between SQL, NoSQL, or hybrid
  • Multi-tenant schema decisions
  • Schema for audit / versioning / soft-delete requirements
  • Reviewing AI-generated schema (common mistakes pattern)

Best AI model for this

Claude Opus 4 or Sonnet 4.5. Schema design benefits from top-tier reasoning models.

Pro tips

  • UUIDs for primary keys in most modern systems. Auto-increment integers leak information and make sharding hard.
  • created_at + updated_at on every table. No exceptions. You'll need them within 6 months.
  • Soft delete (deleted_at nullable timestamp) is better than hard delete for 90% of use cases.
  • Index on foreign key columns. Always. Most query performance issues come from missing FK indexes.
  • Don't store JSON in SQL columns unless you have a very specific reason. You lose query power + indexing.
  • Add constraints (NOT NULL, UNIQUE, CHECK) generously. DB-level constraints catch bugs app-level code won't.

Customization tips

  • Always add a `tenant_id` (organization_id in your case) to EVERY table, even one where it feels redundant. Makes row-level security and tenant isolation trivial.
  • Before any schema change in production, run `EXPLAIN ANALYZE` on your most common queries in staging with production-sized data.
  • Keep DDL in version control. Every schema change = PR. Reviewable.
  • Write a 'query catalog' — the 10-20 most common queries and how they're served. Review quarterly.
  • At 10M rows, revisit this design. What worked for 100k is not what works at 10M. Normal evolution.
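
The tenant_id-on-every-table tip pairs naturally with Postgres row-level security. A sketch, assuming the app sets the tenant id per transaction (the setting name is hypothetical):

```sql
ALTER TABLE courses ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON courses
  USING (organization_id = current_setting('app.current_org_id')::uuid);

-- The app sets the tenant for each transaction, e.g.:
--   SET LOCAL app.current_org_id = '<tenant uuid from the session>';
-- Note: RLS is bypassed by superusers and, unless FORCE ROW LEVEL
-- SECURITY is set, by the table owner.
```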

Variants

Greenfield App Mode

For brand-new schemas. Full design from entities to migrations.

Scale Review Mode

For existing schemas approaching scale issues. Identifies bottlenecks.

Multi-Tenant Mode

For B2B SaaS schemas. Handles tenant isolation strategies.

Frequently asked questions

How do I use the Database Schema Designer prompt?

Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.

Which AI model works best with Database Schema Designer?

Claude Opus 4 or Sonnet 4.5. Schema design benefits from top-tier reasoning models.

Can I customize the Database Schema Designer prompt for my use case?

Yes — every Promptolis Original is designed to be customized. Key levers: UUIDs for primary keys (auto-increment integers leak information and make sharding hard); created_at + updated_at on every table (no exceptions — you'll need them within 6 months).

Explore more Originals

Hand-crafted 2026-grade prompts that actually change how you work.

← All Promptolis Originals