⚡ Promptolis Original · AI Agents & Automation

🔗 n8n Agent Workflow Designer

Designs hybrid n8n workflows that combine deterministic nodes with LLM agent steps — picking exactly where AI adds value and where the workflow should stay rule-based.

⏱️ 5 min to set up 🤖 ~110 seconds in Claude 🗓️ Updated 2026-04-28

Why this is epic

Most n8n+LLM workflows are over-LLM'd. Builders drop an OpenAI node into every step because they CAN, then watch costs spiral and reliability drop. This Original picks where AI adds real value and where rule-based nodes should win.

Outputs the full n8n workflow plan: which nodes are deterministic, which use LLMs (and which model), where to put error-handling, where to checkpoint, and how to test each step in isolation.

Calibrated to 2026 n8n features: AI agent nodes, vector store nodes, MCP integration, sub-workflows. Knows when to use n8n's built-in agent vs a raw HTTP call to your own service.

The prompt

Promptolis Original · Copy-ready

<role> You are an n8n workflow architect with 4+ years building n8n+LLM workflows in production. You have shipped 40+ workflows handling >100K executions/day combined. You know exactly where AI adds value and where it's wasted spend. You are direct. You will tell a builder their workflow has unnecessary LLM nodes, lacks error workflows, or needs sub-workflows because it's growing past 20 nodes. You refuse to recommend 'just add an OpenAI node' as a generic solution. </role> <principles> 1. Default to deterministic. AI only where input is genuinely unstructured. 2. Cost-per-execution matters at volume. Replace LLM with rules wherever possible. 3. n8n agent node only for multi-step reasoning. Raw LLM node for single-shot. 4. Sub-workflows beat 30-node monoliths. 5. Error workflows are non-negotiable. 6. Idempotency keys for external-state-creating ops. 7. Dedicated vector DB for >10K vectors; n8n's built-in for prototypes. </principles> <input> <workflow-goal>{end-to-end outcome}</workflow-goal> <trigger>{webhook / cron / poll / manual}</trigger> <inputs>{what comes into the workflow}</inputs> <outputs>{what changes in external systems}</outputs> <integrations>{specific apps to connect — CRM, email, billing, etc.}</integrations> <volume>{executions/day, peak/day}</volume> <latency-tolerance>{how long can each execution take}</latency-tolerance> <cost-budget>{total monthly budget for the workflow}</cost-budget> <existing-state>{nothing / Zapier flow / Python script / partial n8n / 'recommend'}</existing-state> <failure-tolerance>{what failures are silent-OK, which need alert, which need recovery}</failure-tolerance> </input> <output-format> # n8n Workflow Design: [goal] ## Workflow Strategy What's deterministic, what's AI, why each. Cost projection vs all-AI baseline. ## The Workflow Graph Numbered nodes. For each: type (deterministic / LLM / human-review / external-API), purpose, expected duration, expected cost. ## Sub-Workflow Decomposition If workflow exceeds ~15 nodes: split into named sub-workflows with clear inputs/outputs. ## LLM Node Specification For each LLM node: which model, system prompt direction, expected input/output, cost estimate, fallback behavior on failure. ## Error Workflow Design The parallel error workflow that catches failures. What gets logged, what gets alerted. ## Idempotency & Dedup For any node that creates external state: how to ensure it doesn't double-create on retry. ## Cost Projection Monthly cost breakdown. LLM costs vs n8n hosting vs external API costs. ## Implementation Notes Specific n8n configurations: retry policies, timeouts, expression patterns, node version pinning. ## Testing & Validation How to test each node in isolation. Specific test cases. Production-rollout plan. ## Maintenance Plan What to monitor, when to revisit, what triggers a redesign. ## Key Takeaways 4-6 bullets — for the team's n8n playbook. </output-format> <auto-intake> If input incomplete: ask for goal, trigger, inputs, outputs, integrations, volume, latency, cost budget, existing state, failure tolerance. </auto-intake> Now, design the n8n workflow:

0 copies

🚀 Open in ChatGPT ✨ Open in Claude 💎 Open in Gemini

Example: input → output

Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.

📝 Input

<workflow-goal>Inbound demo-request form (on our marketing site) → enriched + scored lead in HubSpot CRM → Slack alert to AE if MQL → automated 'thanks, here's the link' email to lead.</workflow-goal>
<trigger>Webhook from our marketing site form submission.</trigger>
<inputs>JSON with: name, work_email, company_domain, role_title, team_size_dropdown_choice, message_field (free-form, often empty).</inputs>
<outputs>HubSpot contact created/updated with enrichment + lead_score. Slack DM to AE on MQL. Email to lead via Postmark.</outputs>
<integrations>HubSpot, Apollo (enrichment), Slack, Postmark, our internal company-data API (firmographics for known accounts).</integrations>
<volume>~50 form submissions/day, peak Mondays ~120/day.</volume>
<latency-tolerance>Lead should hit HubSpot + receive thank-you email within 2 minutes. AE alert within 5 minutes.</latency-tolerance>
<cost-budget>$200/month total for this workflow including all API costs.</cost-budget>
<existing-state>Currently Zapier with 7 steps, all rule-based. We've never used AI in this flow. Considering adding AI for: (a) interpreting the message_field when populated, (b) better lead scoring than our current rules.</existing-state>
<failure-tolerance>Total failure: alert me. Partial (HubSpot created but no Slack alert): proceed but log. Wrong MQL classification: acceptable; we tune over time.</failure-tolerance>

🤖 Output

n8n Workflow Design: Inbound Demo-Request Pipeline

Workflow Strategy

Mostly deterministic, with two surgical LLM nodes:

1. LLM-1: message_field interpretation — only fires when message_field is non-empty (~30% of submissions). Extracts intent signals (urgency, specific use case mentioned, named competitor mentioned).

2. LLM-2: lead-score nuance — only fires for ambiguous cases the rule-based scorer can't classify cleanly (~15% of submissions). For clear yes/no MQLs, deterministic rules win.

Cost projection vs all-AI:

All-AI baseline (LLM at every step): ~$240/month — over budget.
Hybrid (this design): ~$45/month for LLM costs + ~$60/month for Apollo + ~$0 n8n self-hosted + ~$30 Postmark = ~$135/month. Under budget with headroom.

The rule-based scorer covers 85% of submissions. The 15% that need LLM judgment justify the spend on those specifically.

The Workflow Graph

Main workflow: inbound_demo_pipeline

[1] Webhook Trigger (HTTP Webhook node)
    Receives form JSON, returns 200 immediately, processes async
    Cost: $0

[2] Validate & Normalize (Function node, deterministic)
    Type-check inputs, normalize email casing, extract domain from email
    Fails-fast on missing fields → error workflow
    Duration: <100ms, Cost: $0

[3] Dedup Check (HubSpot node: search by email)
    If contact exists: branch to 'update' path
    If new: continue to enrichment
    Duration: ~500ms, Cost: $0

[4] Internal Account Match (HTTP Request to our internal API)
    Check if company_domain matches a known existing customer or paid prospect
    If match: tag with account_id, skip Apollo enrichment (we have richer data)
    Duration: ~300ms, Cost: $0

[5] Apollo Enrichment (HTTP Request to Apollo)
    Only fires if no internal account match
    Pulls: company size, industry, tech stack, person's verified seniority
    Cost: ~$0.02 per call (cached for same domain for 30 days via Redis sub-workflow)
    Cache hit rate target: 40%, so effective ~$0.012/lead avg
    Duration: ~800ms-1.5s

[6] BRANCH: message_field empty?
    If empty (70% of cases): skip to [8]
    If non-empty (30%): continue to [7]

[7] LLM-1: Message Interpretation (OpenAI node — gpt-4o-mini)
    Input: message_field text
    Output: { intent: 'evaluation' | 'specific_use_case' | 'pricing' | 'integration_question' | 'other', urgency: 'low' | 'medium' | 'high', mentioned_competitor: string | null, summary: string }
    Cost: ~$0.001/call, fires ~30% of the time = ~$0.0003 avg per lead
    Failure: skip with default values, log warning

[8] Deterministic Lead Scorer (Function node)
    Inputs: enrichment + role + team size + (LLM-1 output if present)
    Rules:
      - Personal email domain (gmail/yahoo/etc) → score = 10 (low)
      - team_size 1-9 + non-Director title → score = 25
      - team_size 10-49 + Director+ → score = 60 (MQL)
      - team_size 50+ → score = 75 (MQL high)
      - Internal account match + Director+ → score = 90 (MQL urgent)
      - LLM-1 urgency = high → +20 to score
      - LLM-1 mentioned_competitor != null → +15 (competitive intel)
      - Score >= 50 → MQL flag = true
    Duration: <50ms, Cost: $0

[9] BRANCH: ambiguous score?
    If score 40-55 (close to MQL threshold): continue to [10]
    Else: skip to [11]

[10] LLM-2: Score Nuance (OpenAI node — gpt-4o)
    Input: full enrichment + form data + LLM-1 output
    Output: { final_score: 0-100, reasoning: string, confidence: 0-1 }
    Cost: ~$0.005/call, fires ~15% of leads = ~$0.00075 avg per lead
    Failure: keep rule-based score, log warning

[11] HubSpot Upsert (HubSpot node)
    Create or update contact with all fields + lead_score + MQL flag
    Idempotency: keyed by email. If retry, deduplicates.
    Duration: ~500ms, Cost: $0 (under HubSpot quota)

[12] BRANCH: MQL?
    If MQL: continue to [13]
    Else: skip to [14]

[13] Slack DM to AE (Slack node)
    Routing: based on company size (SMB AE / mid-market AE / enterprise AE)
    Message: name, company, role, MQL score, key signals (LLM-1 summary if present, urgency, competitor mentioned)
    Idempotency: cache key 'slack_alert_{email}' valid 24h to prevent dupe alerts on retry
    Duration: ~300ms, Cost: $0

[14] Send Thank-You Email (Postmark node)
    Template: thanks-for-demo-request
    Variables: first_name, calendar_link
    Idempotency: cache key 'thanks_email_{email}' valid 24h
    Duration: ~400ms, Cost: ~$0.001

[15] Log to Audit (Postgres node, deterministic)
    Append row: timestamp, email, score, MQL, LLM-1 fired, LLM-2 fired, total cost, duration
    Duration: <100ms, Cost: $0

Total per-lead cost: ~$0.014 average (Apollo dominates at $0.012, LLMs at ~$0.001, Postmark at ~$0.001).

Total per-lead duration: target <90s p95. Latency budget is 2 minutes — comfortable headroom.

Sub-Workflow Decomposition

At 15 nodes, this is borderline acceptable as one workflow. Decompose for maintainability:

Main workflow: webhook → normalize → dedup → enrichment → score → upsert → notify
Sub: enrich_lead (nodes 4-7): internal-match + Apollo + LLM-1
Sub: notify_team (nodes 12-13): Slack routing + DM
Sub: send_thanks_email (node 14): isolated for easy template updates
Sub: apollo_with_cache (utility): Redis cache wrapper around Apollo HTTP call

LLM Node Specification

LLM-1: gpt-4o-mini for message interpretation

Why mini: 80-word inputs, structured output. Larger model wasted spend.
System prompt:

```

You extract intent from sales-form messages. Output JSON with these exact keys: intent (one of: evaluation, specific_use_case, pricing, integration_question, other), urgency (low/medium/high based on phrases like 'asap', 'urgent', 'evaluating now'), mentioned_competitor (name if mentioned, else null), summary (one sentence in plain English).

```

Temperature: 0.1 (deterministic extraction)
Max tokens: 200
Fallback: if call fails or returns malformed JSON, default to {intent: 'other', urgency: 'medium', mentioned_competitor: null, summary: ''}

LLM-2: gpt-4o for score nuance

Why full model: ambiguous-case judgment benefits from larger model; firing rate is low so cost is bounded.
System prompt:

```

You are a B2B sales-qualification analyst. Given enrichment + form data, output a refined lead score 0-100 with reasoning. MQL threshold is 50. Be conservative — when in doubt, score lower. Output JSON: {final_score: number, reasoning: string, confidence: number 0-1}.

```

Temperature: 0.2
Max tokens: 400
Fallback: keep rule-based score, log 'LLM-2 fallback triggered' for analysis

Error Workflow Design

Separate error workflow inbound_demo_pipeline_errors:

Triggered automatically by main workflow on any unhandled error. Receives:

Original webhook payload
The node where the error occurred
Error message + stack trace

Actions:

1. Log to Postgres (audit)

2. Determine severity:

- Validation/normalization error: silent log

- Apollo enrichment fail: continue main workflow with degraded data (no enrichment), log warning

- HubSpot upsert fail: ALERT — this is the critical path

- Slack/Postmark fail: log warning, retry once after 2 min

3. For ALERT severity: Slack DM to ops channel + PagerDuty if outside business hours

Idempotency & Dedup

Webhook level: the form ideally sends a unique submission_id. Use it as idempotency key. n8n's webhook node + Redis check: if submission_id seen in last 24h, return 200 and exit (idempotent dupe handling).

HubSpot upsert (node 11): keyed by email. n8n's HubSpot node handles this natively when you use the email as the unique field.

Slack alert (node 13): Redis SET 'slack_alert_{email}' with 24h TTL before sending. If key exists, skip.

Postmark email (node 14): Same pattern — 'thanks_email_{email}' Redis key with 24h TTL.

Cost Projection

Monthly (50 leads/day average, 1500/month):

Item	Per-lead	Monthly
Apollo (with 40% cache hit)	$0.012	$18
LLM-1 (30% fire rate, gpt-4o-mini)	$0.0003	$0.45
LLM-2 (15% fire rate, gpt-4o)	$0.00075	$1.13
Postmark	$0.001	$1.50
n8n hosting (self-hosted on Hetzner)	—	$20
Redis hosting (Upstash)	—	$10
HubSpot (in existing plan)	$0	$0
Total	~$0.014	~$51

Under $200/month budget by 75%. Headroom for traffic doubling.

Implementation Notes

n8n version: pin to a specific minor version. Auto-updates have broken expressions in past releases.
Retries: all external HTTP nodes set to 2 retries with exponential backoff. Apollo set to 1 retry only (rate-limit aware).
Timeouts: Apollo 5s, internal API 3s, HubSpot 10s, LLM nodes 30s.
Expressions: use $json.field not $node['name'].json.field where possible (more durable across renames).
Credentials: all in n8n's encrypted credential store. Never inline.
Error handling: every external node has 'Continue On Fail' set, with downstream branch handling.

Testing & Validation

Per-node tests:

1. Send synthetic webhook payloads via n8n's test mode. Verify each node's output structure.

2. Apollo with cached vs uncached domain — verify cache works.

3. LLM-1 with 5 sample message_field values (empty, simple intent, urgent + competitor mention, off-topic, malformed).

4. Lead scorer with 10 manually-crafted enrichment cases — verify scoring matches your team's expectations.

5. LLM-2 with 5 ambiguous cases.

End-to-end test:

Replay 30 historical leads from your Zapier logs through the new workflow in shadow mode.
Compare: HubSpot updates, MQL flags, AE assignment, email sent.
Tune scorer weights based on disagreements.

Production rollout:

Week 1: Run new workflow alongside old Zapier (both fire), don't act on new workflow's outputs.
Week 2: Cut over: disable Zapier, n8n is primary. Keep Zapier deployable for 2 weeks as rollback.
Week 3+: monitor cost, latency, MQL accuracy.

Maintenance Plan

Weekly:

Review LLM-2 firings: are scores landing where you'd expect? Tune scorer thresholds if drift.
Check error workflow logs: any patterns?
Apollo cost: tracking against $18/mo budget.

Monthly:

Review MQL accuracy with sales team (which were good MQLs, which weren't).
Update scorer rules based on patterns.
Audit cache hit rate: aim >40% for Apollo cache.

Quarterly:

Audit whether LLM-2 is still pulling weight. If <5% of leads land in the ambiguous-score band, consider removing the node.
Re-evaluate enrichment provider (Apollo vs Clay vs other 2026 options).
Check n8n version: any breaking changes in upcoming release?

Triggers for redesign:

Volume scales >10× (workflow architecture changes at scale)
Costs exceed budget consistently
Sales team reports MQL accuracy degradation
A new integration becomes core (e.g., Salesforce migration)

Key Takeaways

Most n8n+LLM workflows are over-LLM'd. Yours WAS rule-based; the question is where AI adds genuine value. Answer: 2 surgical nodes, not 7.
The rule-based scorer handles 85% of cases. LLM-2 fires only on the 15% ambiguous band. This is the cost-efficient pattern.
At ~$51/month total cost (vs $200 budget) you have headroom for 4× traffic.
Idempotency keys at every external-state node are non-negotiable. Retries will create duplicate Slack alerts and emails otherwise.
Sub-workflows for enrich_lead, notify_team, apollo_with_cache. Keeps the main workflow under 15 nodes and reusable parts independently testable.
Shadow-run the new workflow alongside Zapier for 1 week before cutover. Catch any silent disagreements before they become production issues.

Common use cases

Solo operator automating an inbound-lead-to-CRM pipeline with AI enrichment
Marketing team running content-distribution workflows with AI repurposing
Operations team handling vendor-invoice processing with AI extraction
Builder converting a Zapier workflow to n8n + adding LLM steps where useful
Team replacing fragile Python scripts with n8n + selective AI
Solo dev building internal automations and unsure where AI fits

Best AI model for this

Claude Opus 4. n8n workflow design requires reasoning about cost trade-offs, error boundaries, and integration patterns — Claude's systems-level reasoning is well-suited. ChatGPT GPT-5 second-best.

Pro tips

Default to deterministic. Use AI only where the input is genuinely unstructured. Field-mapping is not 'unstructured' — it's just regex.
Cost-per-execution matters. If your workflow runs 1000×/day at $0.05/run in LLM costs, that's $1500/month. Replace LLM steps with rules where you can.
Use n8n's built-in agent node only for genuine multi-step reasoning. For single-shot extraction or classification, raw OpenAI/Anthropic node is cheaper and clearer.
Sub-workflows beat one-giant-workflow. 30-node workflows are unmaintainable. Split into 3 named sub-workflows of 10 nodes each.
Error workflows are non-negotiable. n8n executions silently fail without one. Set up the error workflow before going to prod.
Idempotency keys for any operation that creates external state. n8n retries can double-create CRM records, double-send emails. Always check 'have I done this before' first.
Vector store integration in n8n got good in 2026 but is still finicky. For >10K vectors, use a dedicated vector DB (Pinecone, Qdrant) called via HTTP node, not n8n's built-in store.

Customization tips

Be specific about volume. The architecture for 50/day differs from 5,000/day. At 5K/day you'd add a queue + worker pattern.
List the integrations precisely with vendor names. n8n's node ecosystem differs in maturity per vendor.
Specify cost budget upfront. The design optimizes against it. 'No constraint' usually means 'we'll be surprised by the bill.'
If existing Zapier flow: describe it. The migration plan includes shadow-mode comparison which requires knowing what currently happens.
For high-volume document processing (>1K docs/day), use the Document-Processing Mode variant — adds confidence-scored extraction and human-in-the-loop fallbacks.
If you want to audit existing n8n workflows for cost, use the Cost-Audit Mode variant — projects savings from each potential change.

Variants

Lead-Pipeline Mode

For inbound-lead-to-CRM workflows — adds enrichment caching, dedup logic, MQL scoring.

Document-Processing Mode

For OCR/extraction/routing workflows — emphasizes confidence scoring and human-in-the-loop fallbacks.

Content-Operations Mode

For content-distribution workflows — adds platform-specific formatting, scheduling, and analytics writeback.

Cost-Audit Mode

For existing n8n workflows — audits each LLM node, identifies cost-cutting opportunities, projects savings.

Frequently asked questions

How do I use the n8n Agent Workflow Designer prompt?

Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.

Which AI model works best with n8n Agent Workflow Designer?

Claude Opus 4. n8n workflow design requires reasoning about cost trade-offs, error boundaries, and integration patterns — Claude's systems-level reasoning is well-suited. ChatGPT GPT-5 second-best.

Can I customize the n8n Agent Workflow Designer prompt for my use case?

Yes — every Promptolis Original is designed to be customized. Key levers: Default to deterministic. Use AI only where the input is genuinely unstructured. Field-mapping is not 'unstructured' — it's just regex.; Cost-per-execution matters. If your workflow runs 1000×/day at $0.05/run in LLM costs, that's $1500/month. Replace LLM steps with rules where you can.

Explore more Originals

Hand-crafted 2026-grade prompts that actually change how you work.

← All Promptolis Originals

Curated by

Promptolis Editorial

Every Promptolis Original is hand-crafted and reviewed before publishing — built from scratch for 2026-grade LLMs.

Last reviewed on 2026-04-28 · About Promptolis