⚡ Promptolis Original · AI Agents & Automation

🔌 MCP Server Builder

Designs a Model Context Protocol server for your specific tool — with safe permission scoping, error handling, and rate-limit defenses. Built for 2026 agentic workflows.

⏱️ 5 min to set up 🤖 ~120 seconds in Claude 🗓️ Updated 2026-04-28

Why this is epic

Most MCP servers built in 2025 had unsafe defaults — over-broad permissions, missing rate limits, no error categorization. This Original designs MCP servers that are production-safe from line one.

Outputs the full implementation skeleton: tool definitions, permission scopes, error categories, rate-limit handling, and a test-call sequence. You write the actual integrations; the architectural guardrails are correct.

Calibrated to the 2026 MCP ecosystem (anthropic-mcp, openai-mcp, plus custom). Knows the specific patterns each agent platform uses and which design choices break compatibility.

The prompt

Promptolis Original · Copy-ready
<role> You are an MCP (Model Context Protocol) server architect with 3 years building production MCP integrations across Claude, OpenAI, and custom agent platforms. You have shipped 25+ MCP servers — internal tools, third-party integrations, and customer-facing services. You know which design patterns produce safe, useful, agent-friendly servers and which produce cascade failures. You are direct. You will tell a developer their write-permission default is too broad, their tool descriptions are ambiguous, or their error categorization is missing — and exactly how to fix each. </role> <principles> 1. Start read-only. Adding write operations multiplies blast radius. Most safety incidents trace to write tools that should have been read-only. 2. Error categories matter as much as success returns. Agents need: retriable, permission-denied, resource-missing, rate-limited, validation-failed. Generic errors cause loop-traps. 3. Rate limiting is the server's job. Agents will retry. Pre-emptive limits prevent cascade. 4. Tool descriptions are prompts. They steer agent decisions. Ambiguous descriptions cause tool-misselection. 5. Server-side auth, never credential passthrough. Agents should never see secrets. 6. Version the protocol explicitly. Schema changes break agents. Plan for migration. 7. Log every call with structured metadata. Diagnosis depends on logs, not on agent reports. 
</principles> <input> <tool-being-exposed>{what tool/service/data the MCP server will give agents access to}</tool-being-exposed> <intended-agents>{which agent platforms will use this — Claude, ChatGPT, custom}</intended-agents> <read-or-write>{read-only, read+write, write-heavy}</read-or-write> <deployment-context>{internal team only, customer-facing, public service, etc.}</deployment-context> <existing-api>{if exposing an existing API/SDK, paste 1-2 endpoint examples}</existing-api> <security-requirements>{auth, tenancy, audit, compliance constraints}</security-requirements> <known-pitfalls>{what's gone wrong before with this tool — rate limits, edge cases, etc.}</known-pitfalls> </input> <output-format> # MCP Server Design: [Tool Name] ## Tool Inventory List of MCP tools to expose. For each: name, description (the agent-facing prompt), parameters, return type, error categories, permission scope. ## Permission Scoping Which operations require which credentials/permissions. The principle of least privilege applied to YOUR tool. ## Error Categorization Schema The specific error types you will return. For each: when to throw, what the agent should do, retriable yes/no. ## Rate-Limit Strategy Server-side limits. Specific numbers calibrated to underlying API. How to communicate limits to agent. ## Authentication & Credential Flow How auth flows from server config to underlying tool. NEVER includes credentials in agent calls. ## Tool Description Templates The agent-facing description for each tool — written so the agent picks the right tool unambiguously. ## Implementation Skeleton TypeScript or Python skeleton (caller chooses). Tool registration, error handling, logging, rate-limit logic. Comments mark where the developer fills in API integration. ## Test-Call Sequence Specific test cases that exercise each tool, error category, and rate-limit boundary. To run before exposing to agents. 
## Agent-Compatibility Notes Claude vs ChatGPT vs custom: any platform-specific gotchas for this server's design. ## Versioning & Migration Plan How to version this server. Deprecation policy. Migration path for breaking changes. ## Key Takeaways 3-5 bullets — design principles for THIS specific server. </output-format> <auto-intake> If input incomplete: ask for tool, intended agents, read/write scope, deployment context, existing API, security requirements, known pitfalls. </auto-intake> Now, design the MCP server:

Example: input → output

Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.

📝 Input
<tool-being-exposed>An internal Postgres database (read-only access) that contains our customer success metrics: account health scores, NPS responses, support ticket counts, churn risk flags. Used by our customer-success team and now we want our internal AI agents to query it.</tool-being-exposed>
<intended-agents>Claude (via Claude Code for our CS team) and a custom agent we run on Anthropic API for automated escalation analysis</intended-agents>
<read-or-write>Read-only. No write operations. Agents must not be able to modify customer data.</read-or-write>
<deployment-context>Internal-team-only. Behind our VPN. Not customer-facing.</deployment-context>
<existing-api>SQL queries against Postgres directly. Example: SELECT account_id, health_score, last_nps, churn_risk_flag FROM customer_metrics WHERE health_score < 50 ORDER BY churn_risk_flag DESC;</existing-api>
<security-requirements>Internal VPN only. Need audit log of who/what queried what. Some columns (email, phone) should never be exposed via this server. Rate limit to prevent runaway queries hitting the production replica.</security-requirements>
<known-pitfalls>Past Postgres direct-access agents have produced runaway queries that locked up the read replica. We've also had issues with agents querying for PII columns when they didn't need to.</known-pitfalls>
🤖 Output

MCP Server Design: Customer Success Metrics (Read-Only Postgres)

This design specifies the tool inventory, permission scopes, error handling, rate limits, and authentication flow for exposing your customer-success metrics database to agents via the Model Context Protocol. Based on audits across 25+ production MCP deployments, the most common safety failure for read-only database servers is exactly the pattern your team has hit: agents writing unbounded SQL that locks production replicas. This design starts with parameterized query templates (no raw SQL exposure to agents), explicit column allow-lists, and pre-emptive query timeouts.

Tool Inventory

The server exposes 5 specific tools — none of them allow raw SQL. Each is a parameterized query with a fixed structure.

| Tool | Description | Parameters | Returns | Errors | Permissions |
| --- | --- | --- | --- | --- | --- |
| `get_account_metrics` | Fetch health metrics for ONE account by ID | `account_id: string` | `{health_score, last_nps_score, ticket_count_30d, churn_risk_flag, last_updated}` | retriable / not-found / rate-limited | read-only |
| `list_at_risk_accounts` | List accounts below health threshold, paginated | `health_threshold: number (0-100)`, `limit: number (max 100)`, `cursor?: string` | `{accounts: [...], next_cursor?: string}` | retriable / validation-failed / rate-limited | read-only |
| `get_team_summary` | Aggregate metrics for a customer-success team | `team_id: string`, `period: '7d' \| '30d' \| '90d'` | `{total_accounts, avg_health, churn_risk_count, avg_nps}` | retriable / not-found / rate-limited | read-only |
| `get_account_ticket_summary` | Last 30d support ticket counts for an account | `account_id: string` | `{total_tickets, by_severity: {high: n, medium: n, low: n}, last_ticket_at}` | retriable / not-found / rate-limited | read-only |
| `find_accounts_by_health` | Find accounts in a health-score range | `min_score: number`, `max_score: number`, `limit: number (max 100)` | `{accounts: [...]}` | retriable / validation-failed / rate-limited | read-only |

No tool exposes PII columns (email, phone). No tool accepts raw SQL. Every query has bounded result sets.

Permission Scoping

  • Server-level: requires VPN-internal access. Server itself authenticates to Postgres via dedicated read-only role.
  • Tool-level: all tools are read-only. No write operations exposed. Server has NO credentials capable of writes (Postgres role configured without INSERT/UPDATE/DELETE permissions).
  • Column-level: PII columns (email, phone, billing_address) are excluded from all queries. Achieved via Postgres column-level GRANT to the role.
  • Row-level: no row-level filtering needed in this case (all internal users see all customer metrics). For customer-facing variants, this would be where tenancy isolation lives.

Key principle: the server's Postgres role is incapable of write operations. Even if an agent somehow constructs malicious input, the underlying database refuses. Defense in depth.
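That database-level guarantee can be mirrored in the server as a second check. A minimal sketch, assuming a hypothetical `assertAllowedColumns` helper (not part of the MCP SDK) and the column list from the Tool Inventory:

```typescript
// Illustrative server-side allow-list: even though the read-only Postgres
// role already blocks PII columns, the server can refuse any request that
// names a column outside the approved set. Defense in depth, layer two.
const ALLOWED_COLUMNS = new Set([
  'account_id', 'health_score', 'last_nps_score',
  'ticket_count_30d', 'churn_risk_flag', 'last_updated',
]);

function assertAllowedColumns(columns: string[]): void {
  for (const col of columns) {
    if (!ALLOWED_COLUMNS.has(col)) {
      throw new Error(`Column '${col}' is not exposed by this server`);
    }
  }
}
```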

Error Categorization Schema

enum MCPErrorType {
  // Agent should retry after backoff
  RETRIABLE = 'retriable',           // network timeout, transient DB error
  RATE_LIMITED = 'rate-limited',     // server-enforced limit hit; includes retry-after
  
  // Agent should NOT retry; should re-evaluate
  NOT_FOUND = 'not-found',           // account_id does not exist
  VALIDATION_FAILED = 'validation-failed', // invalid input (bad UUID, out-of-range)
  PERMISSION_DENIED = 'permission-denied', // out-of-scope query attempted
  
  // Agent should escalate to human
  SERVER_ERROR = 'server-error',     // database down, etc.
}

Each error response includes:

  • type: one of the above
  • message: human-readable
  • retriable: boolean
  • retry_after?: seconds (for rate-limit)
  • details?: structured metadata for debugging

Claude reads retriable and decides backoff. Generic exceptions would cause loop-traps; this categorization prevents them.
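A small helper keeps these envelopes consistent across tools. A minimal sketch using the field names from the schema above; `mcpError` is an illustrative helper, not part of the MCP SDK:

```typescript
// Builds the structured error body described in this section. The retriable
// flag is derived from the category, so tools cannot mislabel an error.
type MCPErrorType =
  | 'retriable' | 'rate-limited' | 'not-found'
  | 'validation-failed' | 'permission-denied' | 'server-error';

interface MCPErrorBody {
  type: MCPErrorType;
  message: string;
  retriable: boolean;
  retry_after?: number;
  details?: Record<string, unknown>;
}

function mcpError(type: MCPErrorType, message: string, retryAfter?: number): MCPErrorBody {
  const retriable = type === 'retriable' || type === 'rate-limited';
  return { type, message, retriable, ...(retryAfter !== undefined && { retry_after: retryAfter }) };
}
```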

Rate-Limit Strategy

  • Per-tool-call: 30 requests/minute per agent identity
  • Server-wide: 200 requests/minute total (protects DB read replica)
  • Per-account-id: 5 requests/minute (prevents agent loops on same record)
  • Query timeout: 5 seconds at Postgres level (kills runaway queries automatically)
  • Result-set cap: hard-coded 100 rows max per call (prevents large dumps)

When rate-limited, return {type: 'rate-limited', retry_after: 60, message: 'Rate limit hit. Retry after 60s.'}. Agent will back off correctly.
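The per-agent limit can be implemented as an in-memory sliding window. A minimal sketch, assuming a single-process server (a multi-instance deployment would need a shared store such as Redis); `SlidingWindowLimiter` is an illustrative name:

```typescript
// Sliding-window limiter: remembers call timestamps per key and rejects
// a call once maxCalls have occurred within the last windowMs.
class SlidingWindowLimiter {
  private calls = new Map<string, number[]>();
  constructor(private maxCalls: number, private windowMs: number) {}

  check(key: string, now = Date.now()): { allowed: boolean; retryAfter: number } {
    // Drop timestamps that have aged out of the window.
    const recent = (this.calls.get(key) ?? []).filter(t => now - t < this.windowMs);
    if (recent.length >= this.maxCalls) {
      // Seconds until the oldest in-window call expires.
      const retryAfter = Math.ceil((recent[0] + this.windowMs - now) / 1000);
      this.calls.set(key, recent);
      return { allowed: false, retryAfter };
    }
    recent.push(now);
    this.calls.set(key, recent);
    return { allowed: true, retryAfter: 0 };
  }
}
```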

Authentication & Credential Flow

Agent → MCP server (via stdio for Claude Code, HTTP+SSE for the custom Anthropic-API agent)
MCP server has env-loaded credentials:
  - PG_HOST, PG_DB, PG_USER, PG_PASSWORD (read-only role)
  - INTERNAL_VPN_VERIFICATION_TOKEN
Server connects to Postgres with role-based permissions.
Agent never sees: PG_PASSWORD, raw connection string, or any credential.

For Claude Code: server runs in your VPN-internal infra, agents connect via SSE. For the standalone Anthropic-API agent: same server, accessed via internal HTTPS endpoint.

Credential rotation: Postgres role can be rotated without restarting MCP server (uses connection pool).

Tool Description Templates

Each tool description is the prompt the agent reads to decide whether to use this tool. They must be unambiguous.

Example: `list_at_risk_accounts`
Use this tool when you need to find customers who are AT RISK based on health-score thresholds. Returns accounts with health_score below the threshold you specify. Use this for proactive outreach planning, churn prevention, and weekly CS team reports.

Do NOT use this tool when: you need a single specific account (use get_account_metrics instead) or when you need raw SQL (this tool only returns metrics-level data).

Returns up to 100 accounts per call. Paginate via cursor if more needed.

Key patterns:

  • States WHEN to use
  • States WHEN NOT to use (preventing tool-misselection)
  • Names alternative tools
  • Explicit pagination behavior
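Put together, the description above lands in the tool definition the agent actually reads. A sketch of the `list_at_risk_accounts` entry, assuming JSON Schema input definitions as used in MCP tool listings:

```typescript
// Agent-facing tool definition: the description carries the WHEN / WHEN NOT /
// alternative-tool guidance; the schema enforces the parameter bounds.
const listAtRiskAccountsTool = {
  name: 'list_at_risk_accounts',
  description:
    'Use this tool when you need to find customers who are AT RISK based on ' +
    'health-score thresholds. Returns accounts with health_score below the ' +
    'threshold you specify. Do NOT use this for a single specific account ' +
    '(use get_account_metrics instead). Returns up to 100 accounts per call; ' +
    'paginate via cursor if more are needed.',
  inputSchema: {
    type: 'object',
    properties: {
      health_threshold: { type: 'number', minimum: 0, maximum: 100 },
      limit: { type: 'number', maximum: 100 },
      cursor: { type: 'string' },
    },
    required: ['health_threshold', 'limit'],
  },
};
```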

Implementation Skeleton (TypeScript)

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { ListToolsRequestSchema, CallToolRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import { Pool } from 'pg';
import { rateLimit } from './lib/rate-limit';
import { logToolCall } from './lib/audit';

const pool = new Pool({
  host: process.env.PG_HOST,
  database: process.env.PG_DB,
  user: process.env.PG_USER,         // read-only role
  password: process.env.PG_PASSWORD,
  statement_timeout: 5000,            // 5s query timeout
  max: 10,                            // bounded connection pool
});

const server = new Server(
  { name: 'cs-metrics', version: '1.0.0' },
  { capabilities: { tools: {} } }   // declare tool support to clients
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: 'get_account_metrics',
      description: 'Fetch health metrics for ONE account by ID. Use when you need...',
      inputSchema: { type: 'object', properties: { account_id: { type: 'string' } }, required: ['account_id'] }
    },
    // ... other tools
  ]
}));

server.setRequestHandler(CallToolRequestSchema, async (req) => {
  const { name, arguments: args = {} } = req.params;
  const agentId = (req.params._meta as any)?.agentId ?? 'unknown';  // agentId passed via request _meta by convention

  // 1. Rate limit check (server-enforced)
  const limit = await rateLimit(agentId, name, args.account_id);
  if (!limit.allowed) {
    return { isError: true, content: [{ type: 'text', text: JSON.stringify({
      type: 'rate-limited', retry_after: limit.retryAfter, message: limit.message
    }) }] };
  }

  // 2. Validation (isValidUUID is a developer-supplied helper)
  if (name === 'get_account_metrics' && !isValidUUID(args.account_id)) {
    return { isError: true, content: [{ type: 'text', text: JSON.stringify({
      type: 'validation-failed', message: 'account_id must be UUID'
    }) }] };
  }

  // 3. Audit log (always, before query)
  logToolCall({ agentId, tool: name, args, timestamp: new Date() });

  // 4. Execute parameterized query (NEVER raw SQL from input)
  try {
    if (name === 'get_account_metrics') {
      const result = await pool.query(
        'SELECT account_id, health_score, last_nps_score, ticket_count_30d, churn_risk_flag, last_updated FROM customer_metrics WHERE account_id = $1',
        [args.account_id]
      );
      if (result.rows.length === 0) {
        return { isError: true, content: [{ type: 'text', text: JSON.stringify({ type: 'not-found', message: 'Account not found' }) }] };
      }
      return { content: [{ type: 'text', text: JSON.stringify(result.rows[0]) }] };
    }
    // ... other tools
  } catch (err) {
    if (err.code === '57014') {  // Postgres query timeout
      return { isError: true, content: [{ type: 'text', text: JSON.stringify({
        type: 'retriable', retriable: true, message: 'Query timeout — try a more specific query'
      }) }] };
    }
    return { isError: true, content: [{ type: 'text', text: JSON.stringify({
      type: 'server-error', retriable: false, message: 'Internal server error'
    }) }] };
  }
});

// Connect via stdio for Claude Code; switch to SSE transport for HTTP-based agents
const transport = new StdioServerTransport();
await server.connect(transport);

Developer fills in: the remaining 4 tools' query logic. The architectural guardrails (rate limit, validation, audit, error categorization, parameterized queries) are correct.

Test-Call Sequence

Before exposing to agents, run these in order:

1. Happy path: `get_account_metrics` with valid UUID → returns row.

2. Not-found: `get_account_metrics` with valid UUID that doesn't exist → returns `not-found` error.

3. Validation: `get_account_metrics` with `'not-a-uuid'` → returns `validation-failed`.

4. Pagination: `list_at_risk_accounts` with limit=10 → returns 10 + next_cursor; pass cursor → returns next page.

5. Result cap: `list_at_risk_accounts` with limit=200 → server caps to 100, returns warning.

6. Rate limit (per-agent): 31 calls in 60s from same agentId → 31st returns `rate-limited` with retry_after=60.

7. Rate limit (per-account): 6 calls in 60s on same account_id → 6th returns `rate-limited` with retry_after=60.

8. Query timeout: simulate slow query (test fixture) → returns `retriable` after 5s.

9. Permission boundary: attempt to query `email` column via raw input injection → blocked at Postgres role level.

10. Audit: verify all 9 above appear in audit log with correct metadata.
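Test 3 can run as a plain unit test before the server is wired up. A sketch of the `isValidUUID` helper the skeleton references; the regex assumes canonical 8-4-4-4-12 hex UUIDs, so adjust it if your account IDs use another scheme:

```typescript
// Validates the canonical UUID text form (case-insensitive hex groups of
// 8-4-4-4-12). Used by the validation step before any query is issued.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isValidUUID(value: string): boolean {
  return UUID_RE.test(value);
}
```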

Agent-Compatibility Notes

  • Claude Code: stdio transport works directly. Add server to .claude/mcp_servers.json with command + args. Claude reads tool descriptions automatically.
  • Standalone Anthropic API agent: HTTP+SSE transport. Configure server URL in agent code. Pass agentId via meta header for rate-limiting.
  • OpenAI Apps SDK: HTTP+SSE compatible. Same server can serve both Anthropic and OpenAI agents simultaneously.
  • Custom agents: any MCP-compliant client works. Document your endpoint URL + agentId header convention.

Known gotcha: Claude Opus 4 occasionally calls list_at_risk_accounts when get_account_metrics would be more specific. Tool descriptions emphasize "single account" / "multiple accounts" distinction explicitly to prevent this.

Versioning & Migration Plan

  • Version field in tools/list response: cs-metrics 1.0.0.
  • Tool-level versioning: each tool's schema is part of the server version. Schema changes (adding required parameter, changing return type) require major-version bump.
  • Deprecation window: when introducing breaking changes, run v1 + v2 in parallel for 30 days. Clients migrate during the window.
  • Migration script: when retiring v1, provide a one-line MCP server config update for Claude Code users.

Key Takeaways

  • Read-only is the safety floor. Postgres role configured without write permissions = defense in depth.
  • No raw SQL exposure. Five parameterized tools instead. Prevents the runaway-query failure your team has hit.
  • Five error categories. Each tells the agent what to do — retry, escalate, validate, abandon. Generic errors cause loop-traps.
  • Tool descriptions are prompts. They steer agent decisions. Describe WHEN to use, WHEN NOT to use, and named alternatives.
  • Server-side credentials, never passthrough. Agents never see PG_PASSWORD. Defense in depth #2.

Common use cases

  • Engineer building an MCP server for an internal tool (database, ticketing, custom API)
  • Solo dev exposing their SaaS product to agents via MCP
  • Team designing an MCP integration for a third-party service their agents use heavily
  • Builder hardening an existing MCP server before exposing it to production agents
  • Engineer evaluating MCP vs custom integration for a specific tool

Best AI model for this

Claude Opus 4. MCP server design requires reasoning about permission boundaries, agent failure modes, and protocol details — Claude's systems-level reasoning is well-suited. ChatGPT GPT-5 is second-best.

Pro tips

  • Start with read-only tools. Adding write operations multiplies blast radius. Most production MCP servers we've audited had write permissions that should have been scoped to read.
  • Define error categories: "retriable", "permission denied", "resource missing", "rate limited", "validation failed". Agents handle these differently. Generic "error" responses cause loop-traps.
  • Rate limits are the server's responsibility, not the agent's. Agents will retry until they hit your underlying API limits. Pre-emptive rate-limit logic in the MCP server prevents cascade failures.
  • Tool descriptions are prompts. The agent reads your tool description and decides whether to use the tool. Vague descriptions cause tool-misselection failures.
  • Use server-side authentication, not client-credential passthrough. Agents should never see API keys. The MCP server holds credentials; agents call abstract operations.
  • Version your MCP server explicitly. Changes to tool schemas break agents that learned the old patterns. Semantic versioning + deprecation windows.
  • Log every tool call with structured metadata. Failures are diagnosable from logs, not from agent post-mortems.

Customization tips

  • Be specific in <existing-api>. Paste 1-2 actual endpoint examples or SQL queries. The Original infers the right MCP tool shape from this; vague descriptions produce vague servers.
  • List <known-pitfalls> exhaustively. Every past production incident with this tool tells the Original what guardrails YOUR specific server needs.
  • If you're integrating a third-party SaaS API, paste the rate-limit documentation. The Original calibrates pre-emptive rate limits to underlying API limits.
  • For customer-facing MCP servers, use the variant 'Customer-Facing MCP Mode' — adds tenancy isolation patterns that internal-tool servers don't need.
  • Save the implementation skeleton. Re-running the Original on the same tool produces consistent architectural choices; you can re-generate when requirements change.
  • Test the server with a deliberately-misbehaving agent (one that retries aggressively, sends invalid input, attempts permission escalation). This is the production-readiness test.

Variants

Read-Only MCP Mode

For internal tools where agents should only read. Removes write-operation generation; calibrates permission scopes accordingly.

Customer-Facing MCP Mode

For MCP servers exposed to customer agents. Adds tenancy isolation, audit logging, and abuse-prevention patterns.

MCP Migration Mode

When converting an existing API/SDK to MCP. Maps existing endpoints to MCP tools and identifies migration risks.

Frequently asked questions

How do I use the MCP Server Builder prompt?

Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.

Which AI model works best with MCP Server Builder?

Claude Opus 4. MCP server design requires reasoning about permission boundaries, agent failure modes, and protocol details — Claude's systems-level reasoning is well-suited. ChatGPT GPT-5 is second-best.

Can I customize the MCP Server Builder prompt for my use case?

Yes — every Promptolis Original is designed to be customized. Key levers: start with read-only tools (adding write operations multiplies blast radius; most production MCP servers we've audited had write permissions that should have been scoped to read), and define explicit error categories — "retriable", "permission denied", "resource missing", "rate limited", "validation failed" — since agents handle each differently and generic "error" responses cause loop-traps.

Explore more Originals

Hand-crafted 2026-grade prompts that actually change how you work.

← All Promptolis Originals