⚡ Promptolis Original · AI Agents & Automation
🔗 n8n Agent Workflow Designer
Designs hybrid n8n workflows that combine deterministic nodes with LLM agent steps — picking exactly where AI adds value and where the workflow should stay rule-based.
Why this is epic
Most n8n+LLM workflows are over-LLM'd. Builders drop an OpenAI node into every step because they CAN, then watch costs spiral and reliability drop. This Original picks where AI adds real value and where rule-based nodes should win.
Outputs the full n8n workflow plan: which nodes are deterministic, which use LLMs (and which model), where to put error-handling, where to checkpoint, and how to test each step in isolation.
Calibrated to 2026 n8n features: AI agent nodes, vector store nodes, MCP integration, sub-workflows. Knows when to use n8n's built-in agent vs a raw HTTP call to your own service.
The prompt
Example: input → output
Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.
<workflow-goal>Inbound demo-request form (on our marketing site) → enriched + scored lead in HubSpot CRM → Slack alert to AE if MQL → automated 'thanks, here's the link' email to lead.</workflow-goal> <trigger>Webhook from our marketing site form submission.</trigger> <inputs>JSON with: name, work_email, company_domain, role_title, team_size_dropdown_choice, message_field (free-form, often empty).</inputs> <outputs>HubSpot contact created/updated with enrichment + lead_score. Slack DM to AE on MQL. Email to lead via Postmark.</outputs> <integrations>HubSpot, Apollo (enrichment), Slack, Postmark, our internal company-data API (firmographics for known accounts).</integrations> <volume>~50 form submissions/day, peak Mondays ~120/day.</volume> <latency-tolerance>Lead should hit HubSpot + receive thank-you email within 2 minutes. AE alert within 5 minutes.</latency-tolerance> <cost-budget>$200/month total for this workflow including all API costs.</cost-budget> <existing-state>Currently Zapier with 7 steps, all rule-based. We've never used AI in this flow. Considering adding AI for: (a) interpreting the message_field when populated, (b) better lead scoring than our current rules.</existing-state> <failure-tolerance>Total failure: alert me. Partial (HubSpot created but no Slack alert): proceed but log. Wrong MQL classification: acceptable; we tune over time.</failure-tolerance>
n8n Workflow Design: Inbound Demo-Request Pipeline
Workflow Strategy
Mostly deterministic, with two surgical LLM nodes:
1. LLM-1: message_field interpretation — only fires when message_field is non-empty (~30% of submissions). Extracts intent signals (urgency, specific use case mentioned, named competitor mentioned).
2. LLM-2: lead-score nuance — only fires for ambiguous cases the rule-based scorer can't classify cleanly (~15% of submissions). For clear yes/no MQLs, deterministic rules win.
Cost projection vs all-AI:
- All-AI baseline (LLM at every step): ~$240/month — over budget.
- Hybrid (this design): ~$2/month LLM + ~$18/month Apollo (with caching) + ~$30/month self-hosted n8n and Redis + ~$1.50/month Postmark ≈ $51/month. Under budget with headroom.
The rule-based scorer covers 85% of submissions. The 15% that need LLM judgment justify the spend on those specifically.
The Workflow Graph
Main workflow: inbound_demo_pipeline
[1] Webhook Trigger (HTTP Webhook node)
Receives form JSON, returns 200 immediately, processes async
Cost: $0
[2] Validate & Normalize (Function node, deterministic)
Type-check inputs, normalize email casing, extract domain from email
Fails fast on missing fields → error workflow
Duration: <100ms, Cost: $0
[3] Dedup Check (HubSpot node: search by email)
If contact exists: branch to 'update' path
If new: continue to enrichment
Duration: ~500ms, Cost: $0
[4] Internal Account Match (HTTP Request to our internal API)
Check if company_domain matches a known existing customer or paid prospect
If match: tag with account_id, skip Apollo enrichment (we have richer data)
Duration: ~300ms, Cost: $0
[5] Apollo Enrichment (HTTP Request to Apollo)
Only fires if no internal account match
Pulls: company size, industry, tech stack, person's verified seniority
Cost: ~$0.02 per call (cached for same domain for 30 days via Redis sub-workflow)
Cache hit rate target: 40%, so effective ~$0.012/lead avg
Duration: ~800ms-1.5s
[6] BRANCH: message_field empty?
If empty (70% of cases): skip to [8]
If non-empty (30%): continue to [7]
[7] LLM-1: Message Interpretation (OpenAI node — gpt-4o-mini)
Input: message_field text
Output: { intent: 'evaluation' | 'specific_use_case' | 'pricing' | 'integration_question' | 'other', urgency: 'low' | 'medium' | 'high', mentioned_competitor: string | null, summary: string }
Cost: ~$0.001/call, fires ~30% of the time = ~$0.0003 avg per lead
Failure: skip with default values, log warning
[8] Deterministic Lead Scorer (Function node)
Inputs: enrichment + role + team size + (LLM-1 output if present)
Rules:
- Personal email domain (gmail/yahoo/etc) → score = 10 (low)
- team_size 1-9 + non-Director title → score = 25
- team_size 10-49 + Director+ → score = 60 (MQL)
- team_size 50+ → score = 75 (MQL high)
- Internal account match + Director+ → score = 90 (MQL urgent)
- LLM-1 urgency = high → +20 to score
- LLM-1 mentioned_competitor != null → +15 (competitive intel)
- Score >= 50 → MQL flag = true
Duration: <50ms, Cost: $0 (a sketch of this scorer follows the workflow graph)
[9] BRANCH: ambiguous score?
If score 40-55 (close to MQL threshold): continue to [10]
Else: skip to [11]
[10] LLM-2: Score Nuance (OpenAI node — gpt-4o)
Input: full enrichment + form data + LLM-1 output
Output: { final_score: 0-100, reasoning: string, confidence: 0-1 }
Cost: ~$0.005/call, fires ~15% of leads = ~$0.00075 avg per lead
Failure: keep rule-based score, log warning
[11] HubSpot Upsert (HubSpot node)
Create or update contact with all fields + lead_score + MQL flag
Idempotency: keyed by email; on retry, the upsert deduplicates against the existing contact.
Duration: ~500ms, Cost: $0 (under HubSpot quota)
[12] BRANCH: MQL?
If MQL: continue to [13]
Else: skip to [14]
[13] Slack DM to AE (Slack node)
Routing: based on company size (SMB AE / mid-market AE / enterprise AE)
Message: name, company, role, MQL score, key signals (LLM-1 summary if present, urgency, competitor mentioned)
Idempotency: cache key 'slack_alert_{email}' valid 24h to prevent dupe alerts on retry
Duration: ~300ms, Cost: $0
[14] Send Thank-You Email (Postmark node)
Template: thanks-for-demo-request
Variables: first_name, calendar_link
Idempotency: cache key 'thanks_email_{email}' valid 24h
Duration: ~400ms, Cost: ~$0.001
[15] Log to Audit (Postgres node, deterministic)
Append row: timestamp, email, score, MQL, LLM-1 fired, LLM-2 fired, total cost, duration
Duration: <100ms, Cost: $0
Total per-lead cost: ~$0.014 average (Apollo dominates at $0.012, LLMs at ~$0.001, Postmark at ~$0.001).
Total per-lead duration: target <90s p95. Latency budget is 2 minutes — comfortable headroom.
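As a rough illustration of node [8], here is a minimal Function-node sketch of the scorer (in the newer Code node, "Run Once for All Items" mode, you would read items via $input.all() instead of the legacy items variable). It assumes upstream nodes have merged form data, enrichment, and the optional LLM-1 output into each item; field names such as work_email, team_size, internal_account_id, and llm1, plus the title regex, are illustrative assumptions rather than anything prescribed by n8n.

```
// Node [8]: deterministic lead scorer (n8n Function node sketch).
// Assumes form data, enrichment, and optional LLM-1 output were merged upstream.
const PERSONAL_DOMAINS = ['gmail.com', 'yahoo.com', 'hotmail.com', 'outlook.com'];

for (const item of items) {
  const d = item.json;
  const domain = ((d.work_email || '').split('@')[1] || '').toLowerCase();
  const teamSize = Number(d.team_size) || 0; // parsed upstream from team_size_dropdown_choice
  const directorPlus = /director|vp|head|chief|founder/i.test(d.role_title || '');

  let score;
  if (PERSONAL_DOMAINS.includes(domain)) score = 10;          // personal email: low
  else if (d.internal_account_id && directorPlus) score = 90; // known account + senior: MQL urgent
  else if (teamSize >= 50) score = 75;                        // MQL high
  else if (teamSize >= 10 && directorPlus) score = 60;        // MQL
  else if (teamSize <= 9 && !directorPlus) score = 25;
  else score = 40; // combinations the rules don't cover land in the ambiguous band (assumption)

  // Signals from LLM-1, if that node fired.
  if (d.llm1 && d.llm1.urgency === 'high') score += 20;
  if (d.llm1 && d.llm1.mentioned_competitor) score += 15;

  item.json.lead_score = Math.min(score, 100);
  item.json.is_mql = item.json.lead_score >= 50;
}
return items;
```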
Sub-Workflow Decomposition
At 15 nodes, this is borderline acceptable as one workflow. Decompose for maintainability:
- Main workflow: webhook → normalize → dedup → enrichment → score → upsert → notify
- Sub: enrich_lead (nodes 4-7): internal-match + Apollo + LLM-1
- Sub: notify_team (nodes 12-13): Slack routing + DM
- Sub: send_thanks_email (node 14): isolated for easy template updates
- Sub: apollo_with_cache (utility): Redis cache wrapper around Apollo HTTP call
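One way to sketch the apollo_with_cache utility: a small Function node computes the cache key and TTL, and the Redis Get, IF (cache hit?), Apollo HTTP Request, and Redis Set nodes are wired around it. The key format and field names below are assumptions for illustration, not an n8n or Apollo convention.

```
// Sub-workflow apollo_with_cache: key/TTL helper (n8n Function node sketch).
// Illustrative wiring: Redis Get(key) → IF (hit?) → HTTP Request to Apollo → Redis Set(key, 30-day TTL).
const THIRTY_DAYS_SECONDS = 30 * 24 * 60 * 60;

for (const item of items) {
  const domain = (item.json.company_domain || '').trim().toLowerCase();
  item.json.apollo_cache_key = `apollo:enrich:${domain}`; // one cached entry per domain
  item.json.apollo_cache_ttl = THIRTY_DAYS_SECONDS;
}
return items;
```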
LLM Node Specification
LLM-1: gpt-4o-mini for message interpretation
- Why mini: inputs are ~80 words with a tightly structured output; a larger model would be wasted spend.
- System prompt:
```
You extract intent from sales-form messages. Output JSON with these exact keys: intent (one of: evaluation, specific_use_case, pricing, integration_question, other), urgency (low/medium/high based on phrases like 'asap', 'urgent', 'evaluating now'), mentioned_competitor (name if mentioned, else null), summary (one sentence in plain English).
```
- Temperature: 0.1 (deterministic extraction)
- Max tokens: 200
- Fallback: if call fails or returns malformed JSON, default to {intent: 'other', urgency: 'medium', mentioned_competitor: null, summary: ''}
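A hedged sketch of that fallback as a small guard node placed right after LLM-1. The field holding the model's raw text output ('message') is an assumption; check what your OpenAI node version actually emits.

```
// Guard node after LLM-1 (n8n Function node sketch): parse model output,
// fall back to safe defaults on malformed JSON.
const FALLBACK = { intent: 'other', urgency: 'medium', mentioned_competitor: null, summary: '' };
const INTENTS = ['evaluation', 'specific_use_case', 'pricing', 'integration_question', 'other'];

for (const item of items) {
  let parsed = { ...FALLBACK };
  try {
    const candidate = JSON.parse(item.json.message || ''); // assumed raw-output field
    if (INTENTS.includes(candidate.intent)) {
      parsed = { ...FALLBACK, ...candidate };
    }
  } catch (err) {
    item.json.llm1_warning = `LLM-1 fallback triggered: ${err.message}`; // picked up by the audit node
  }
  item.json.llm1 = parsed;
}
return items;
```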
LLM-2: gpt-4o for score nuance
- Why full model: ambiguous-case judgment benefits from a larger model, and the firing rate is low enough that cost stays bounded.
- System prompt:
```
You are a B2B sales-qualification analyst. Given enrichment + form data, output a refined lead score 0-100 with reasoning. MQL threshold is 50. Be conservative — when in doubt, score lower. Output JSON: {final_score: number, reasoning: string, confidence: number 0-1}.
```
- Temperature: 0.2
- Max tokens: 400
- Fallback: keep rule-based score, log 'LLM-2 fallback triggered' for analysis
Error Workflow Design
Separate error workflow inbound_demo_pipeline_errors:
Triggered automatically by main workflow on any unhandled error. Receives:
- Original webhook payload
- The node where the error occurred
- Error message + stack trace
Actions:
1. Log to Postgres (audit)
2. Determine severity:
- Validation/normalization error: silent log
- Apollo enrichment fail: continue main workflow with degraded data (no enrichment), log warning
- HubSpot upsert fail: ALERT — this is the critical path
- Slack/Postmark fail: log warning, retry once after 2 min
3. For ALERT severity: Slack DM to ops channel + PagerDuty if outside business hours
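One possible shape for step 2 as a Function node inside the error workflow. The error-trigger payload differs between n8n versions, so the field paths used below to read the failed node and error message are assumptions to verify against a real error execution.

```
// Error workflow, step 2: severity classifier (n8n Function node sketch).
for (const item of items) {
  const failedNode = (item.json.execution && item.json.execution.lastNodeExecuted) || item.json.node || '';
  const message = (item.json.execution && item.json.execution.error && item.json.execution.error.message) || '';

  let severity = 'log';                                                 // validation/normalization: silent log
  if (/hubspot/i.test(failedNode)) severity = 'alert';                  // critical path: page someone
  else if (/apollo/i.test(failedNode)) severity = 'warn';               // degraded data, continue
  else if (/slack|postmark/i.test(failedNode)) severity = 'warn_retry'; // retry once after 2 min

  item.json.severity = severity;
  item.json.failed_node = failedNode;
  item.json.error_message = message;
}
return items;
```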
Idempotency & Dedup
Webhook level: ideally the form sends a unique submission_id; use it as the idempotency key. Pair the Webhook node with a Redis check: if the submission_id was seen in the last 24 hours, return 200 and exit (the duplicate is handled idempotently).
HubSpot upsert (node 11): keyed by email. n8n's HubSpot node handles this natively when you use the email as the unique field.
Slack alert (node 13): Redis SET 'slack_alert_{email}' with 24h TTL before sending. If key exists, skip.
Postmark email (node 14): Same pattern — 'thanks_email_{email}' Redis key with 24h TTL.
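A sketch of the Slack-alert gate under stated assumptions: a Redis Get node has already looked up slack_alert_{email} and written its result to a 'cached' field, and an IF node downstream skips the Slack DM when skip_duplicate is true. The same pattern covers the thanks_email_{email} key for Postmark.

```
// Idempotency gate before the Slack DM node (n8n Function node sketch).
// Assumed wiring: Redis Get('slack_alert_{email}') → this node → IF (skip_duplicate?) → Slack DM.
const DAY_SECONDS = 24 * 60 * 60;

for (const item of items) {
  const email = (item.json.work_email || '').trim().toLowerCase();
  item.json.idem_key = `slack_alert_${email}`;
  item.json.idem_ttl = DAY_SECONDS;                    // TTL for the Redis Set after a successful send
  item.json.skip_duplicate = Boolean(item.json.cached); // set by the upstream Redis Get (assumption)
}
return items;
```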
Cost Projection
Monthly (50 leads/day average, 1500/month):
| Item | Per-lead | Monthly |
|---|---|---|
| Apollo (with 40% cache hit) | $0.012 | $18 |
| LLM-1 (30% fire rate, gpt-4o-mini) | $0.0003 | $0.45 |
| LLM-2 (15% fire rate, gpt-4o) | $0.00075 | $1.13 |
| Postmark | $0.001 | $1.50 |
| n8n hosting (self-hosted on Hetzner) | — | $20 |
| Redis hosting (Upstash) | — | $10 |
| HubSpot (in existing plan) | $0 | $0 |
| Total | ~$0.014 | ~$51 |
Roughly 75% under the $200/month budget, with headroom for traffic to more than double.
Implementation Notes
- n8n version: pin to a specific minor version. Auto-updates have broken expressions in past releases.
- Retries: all external HTTP nodes set to 2 retries with exponential backoff. Apollo set to 1 retry only (rate-limit aware).
- Timeouts: Apollo 5s, internal API 3s, HubSpot 10s, LLM nodes 30s.
- Expressions: use $json.field rather than $node['name'].json.field where possible (more durable across node renames).
- Credentials: all in n8n's encrypted credential store. Never inline.
- Error handling: every external node has 'Continue On Fail' set, with downstream branch handling.
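For example, a downstream parameter that reads the lead score might look like this (the node name is illustrative):

```
Prefer:  {{ $json.lead_score }}
Avoid:   {{ $node['Deterministic Lead Scorer'].json.lead_score }}
```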
Testing & Validation
Per-node tests:
1. Send synthetic webhook payloads via n8n's test mode (sample payload after this list). Verify each node's output structure.
2. Apollo with cached vs uncached domain — verify cache works.
3. LLM-1 with 5 sample message_field values (empty, simple intent, urgent + competitor mention, off-topic, malformed).
4. Lead scorer with 10 manually-crafted enrichment cases — verify scoring matches your team's expectations.
5. LLM-2 with 5 ambiguous cases.
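A sample synthetic payload for step 1, sent from a local script (Node.js 18+ with global fetch). The URL is a placeholder; copy the real Test URL from the Webhook node while the workflow is listening, and adjust field names to match your form.

```
// Synthetic webhook test: POST a form-like payload to the workflow's test URL.
const payload = {
  name: 'Ada Example',
  work_email: 'ada@example-company.com',
  company_domain: 'example-company.com',
  role_title: 'Director of Operations',
  team_size_dropdown_choice: '10-49',
  message_field: 'Evaluating now, need something before end of quarter. Currently on CompetitorX.',
};

fetch('https://n8n.example.com/webhook-test/inbound-demo', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(payload),
}).then((res) => console.log('webhook responded with status', res.status));
```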
End-to-end test:
- Replay 30 historical leads from your Zapier logs through the new workflow in shadow mode.
- Compare: HubSpot updates, MQL flags, AE assignment, email sent.
- Tune scorer weights based on disagreements.
Production rollout:
- Week 1: Run the new workflow alongside the old Zapier flow (both fire), but don't act on the new workflow's outputs.
- Week 2: Cut over: disable Zapier and make n8n primary. Keep Zapier deployable for 2 weeks as a rollback.
- Week 3+: monitor cost, latency, MQL accuracy.
Maintenance Plan
Weekly:
- Review LLM-2 firings: are scores landing where you'd expect? Tune scorer thresholds if they drift.
- Check error workflow logs: any patterns?
- Apollo cost: track against the ~$18/month projection.
Monthly:
- Review MQL accuracy with sales team (which were good MQLs, which weren't).
- Update scorer rules based on patterns.
- Audit cache hit rate: aim >40% for Apollo cache.
Quarterly:
- Audit whether LLM-2 is still pulling weight. If <5% of leads land in the ambiguous-score band, consider removing the node.
- Re-evaluate enrichment provider (Apollo vs Clay vs other 2026 options).
- Check n8n version: any breaking changes in upcoming release?
Triggers for redesign:
- Volume scales >10× (workflow architecture changes at scale)
- Costs exceed budget consistently
- Sales team reports MQL accuracy degradation
- A new integration becomes core (e.g., Salesforce migration)
Key Takeaways
- Most n8n+LLM workflows are over-LLM'd. Yours WAS rule-based; the question is where AI adds genuine value. Answer: 2 surgical nodes, not 7.
- The rule-based scorer handles 85% of cases. LLM-2 fires only on the 15% ambiguous band. This is the cost-efficient pattern.
- At ~$51/month total cost (vs $200 budget) you have headroom for 4× traffic.
- Idempotency keys at every external-state node are non-negotiable. Retries will create duplicate Slack alerts and emails otherwise.
- Sub-workflows for enrich_lead, notify_team, and apollo_with_cache keep the main workflow under 15 nodes and make reusable parts independently testable.
- Shadow-run the new workflow alongside Zapier for 1 week before cutover. Catch any silent disagreements before they become production issues.
Common use cases
- Solo operator automating an inbound-lead-to-CRM pipeline with AI enrichment
- Marketing team running content-distribution workflows with AI repurposing
- Operations team handling vendor-invoice processing with AI extraction
- Builder converting a Zapier workflow to n8n + adding LLM steps where useful
- Team replacing fragile Python scripts with n8n + selective AI
- Solo dev building internal automations and unsure where AI fits
Best AI model for this
Claude Opus 4. n8n workflow design requires reasoning about cost trade-offs, error boundaries, and integration patterns — Claude's systems-level reasoning is well-suited. ChatGPT GPT-5 second-best.
Pro tips
- Default to deterministic. Use AI only where the input is genuinely unstructured. Field-mapping is not 'unstructured' — it's just regex.
- Cost-per-execution matters. If your workflow runs 1000×/day at $0.05/run in LLM costs, that's $1500/month. Replace LLM steps with rules where you can.
- Use n8n's built-in agent node only for genuine multi-step reasoning. For single-shot extraction or classification, raw OpenAI/Anthropic node is cheaper and clearer.
- Sub-workflows beat one-giant-workflow. 30-node workflows are unmaintainable. Split into 3 named sub-workflows of 10 nodes each.
- Error workflows are non-negotiable. Without one, failed n8n executions go unnoticed. Set up the error workflow before going to prod.
- Idempotency keys for any operation that creates external state. n8n retries can double-create CRM records, double-send emails. Always check 'have I done this before' first.
- Vector store integration in n8n got good in 2026 but is still finicky. For >10K vectors, use a dedicated vector DB (Pinecone, Qdrant) called via HTTP node, not n8n's built-in store.
Customization tips
- Be specific about volume. The architecture for 50/day differs from 5,000/day. At 5K/day you'd add a queue + worker pattern.
- List the integrations precisely, with vendor names. The maturity of n8n's node ecosystem varies from vendor to vendor.
- Specify cost budget upfront. The design optimizes against it. 'No constraint' usually means 'we'll be surprised by the bill.'
- If existing Zapier flow: describe it. The migration plan includes shadow-mode comparison which requires knowing what currently happens.
- For high-volume document processing (>1K docs/day), use the Document-Processing Mode variant — adds confidence-scored extraction and human-in-the-loop fallbacks.
- If you want to audit existing n8n workflows for cost, use the Cost-Audit Mode variant — projects savings from each potential change.
Variants
Lead-Pipeline Mode
For inbound-lead-to-CRM workflows — adds enrichment caching, dedup logic, MQL scoring.
Document-Processing Mode
For OCR/extraction/routing workflows — emphasizes confidence scoring and human-in-the-loop fallbacks.
Content-Operations Mode
For content-distribution workflows — adds platform-specific formatting, scheduling, and analytics writeback.
Cost-Audit Mode
For existing n8n workflows — audits each LLM node, identifies cost-cutting opportunities, projects savings.
Frequently asked questions
How do I use the n8n Agent Workflow Designer prompt?
Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.
Which AI model works best with n8n Agent Workflow Designer?
Claude Opus 4. n8n workflow design requires reasoning about cost trade-offs, error boundaries, and integration patterns — Claude's systems-level reasoning is well-suited. ChatGPT GPT-5 second-best.
Can I customize the n8n Agent Workflow Designer prompt for my use case?
Yes. Every Promptolis Original is designed to be customized. Key levers: default to deterministic, using AI only where the input is genuinely unstructured (field-mapping is just regex), and watch cost-per-execution, since a workflow running 1,000×/day at $0.05/run in LLM costs is $1,500/month; replace LLM steps with rules where you can.
Explore more Originals
Hand-crafted 2026-grade prompts that actually change how you work.