Every wasted token is money. Every wasted token is latency. Every wasted token dilutes the model's attention on what actually matters. In 2026, as AI becomes infrastructure, prompt efficiency is less "nice to have" and more "margin on every API call."
These are the 10 most common mistakes we've seen across thousands of prompts — including our own early ones at Promptolis. Some are obvious; some are sneaky. All of them cost you.
Mistake 1: The politeness tax
"Hi! Could you please, if it's not too much trouble, help me understand the following concept? I'd really appreciate if you could explain it in a way that's easy to understand. Thank you so much in advance!"
"Explain [concept] in plain language, 3 paragraphs."
Cost: ~40 tokens wasted per prompt. Over 1,000 prompts that's ~40,000 tokens of pure overhead in direct cost, plus meaningfully degraded model attention.
Models don't have feelings. They don't reward politeness with better output. Every "please" and "thank you" is token spend with zero utility return.
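You can sanity-check the politeness tax yourself. A minimal sketch, using the common ~4-characters-per-token heuristic rather than a real tokenizer (actual counts come from the model's own tokenizer, e.g. a library like tiktoken):

```python
# Rough token estimate: ~4 characters per token is a common heuristic.
# Real counts require the model's own tokenizer.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

polite = ("Hi! Could you please, if it's not too much trouble, help me "
          "understand the following concept? I'd really appreciate if you "
          "could explain it in a way that's easy to understand. "
          "Thank you so much in advance!")
terse = "Explain [concept] in plain language, 3 paragraphs."

saved = estimate_tokens(polite) - estimate_tokens(terse)
print(saved)  # tokens saved per prompt, roughly
```

Run that over your prompt library and the per-prompt savings multiply fast.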
Mistake 2: Role inflation
"You are a world-class senior software engineer with 20 years of experience at FAANG companies, specializing in distributed systems, having authored multiple books on scalable architecture, and deeply knowledgeable about AWS, GCP, and Azure..."
"You are a senior backend engineer focused on distributed systems."
Cost: 60-80 tokens wasted. The model extracts "senior backend engineer" + "distributed systems" and ignores the rest.
One or two specific descriptors beat an elaborate resume. More is not better.
Mistake 3: Instruction stacking without structure
"Write me a blog post about productivity and make sure it's optimized for SEO and includes 5 subheadings and has a word count of 1500 and uses active voice and targets the keyword 'productivity tips' and has a compelling introduction and a strong conclusion and formats in Markdown."
```
- Topic: productivity
- Keyword target: "productivity tips"
- Length: 1500 words
- Structure: 5 H2 subheadings + intro + conclusion
- Voice: active
- Format: Markdown
```
Cost: Not tokens — accuracy. Unstructured instruction stacking gets partial compliance. Structured lists get full compliance.
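If you build prompts programmatically, generating the structured form is trivial. A sketch (the function name and constraint keys are illustrative, not from any library):

```python
def structured_prompt(task: str, constraints: dict[str, str]) -> str:
    """Render a task plus its constraints as a bulleted spec
    instead of one run-on sentence of stacked instructions."""
    lines = [f"Write a {task}."]
    lines += [f"- {key}: {value}" for key, value in constraints.items()]
    return "\n".join(lines)

spec = structured_prompt("blog post", {
    "Topic": "productivity",
    "Keyword target": '"productivity tips"',
    "Length": "1500 words",
    "Structure": "5 H2 subheadings + intro + conclusion",
    "Voice": "active",
    "Format": "Markdown",
})
print(spec)
```

The same dict can double as your compliance checklist when you review the output.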
Mistake 4: The "please be specific" trap
"Please be specific and concrete. Don't be vague. Include examples. Be detailed."
"Include 3 specific examples with numbers. For each, state the context, the action, and the measurable result."
"Be specific" is itself vague. The meta-instruction doesn't fix the specificity problem — a specific instruction does.
Mistake 5: Hedging in the system prompt
"You're an expert but remember you might be wrong, always acknowledge uncertainty, don't make claims without evidence, be humble..."
"Flag claims you're uncertain about with [uncertain: reason]. State confidence level on key claims (high/medium/low)."
Vague humility instructions produce vague over-hedging in output. Specific uncertainty markup produces useful calibration.
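A side benefit of the explicit markup: it's machine-readable. A sketch of pulling the `[uncertain: reason]` markers out of a response so flagged claims can be routed for review (the marker format follows the instruction above; the function is illustrative):

```python
import re

# Matches the "[uncertain: reason]" markers the system prompt asks for.
UNCERTAIN = re.compile(r"\[uncertain:\s*([^\]]+)\]")

def uncertain_reasons(response: str) -> list[str]:
    """Extract the reasons the model flagged as uncertain."""
    return [m.strip() for m in UNCERTAIN.findall(response)]

reply = ("Revenue grew 40% [uncertain: figure from a 2023 estimate]. "
         "The API supports batching.")
print(uncertain_reasons(reply))  # ['figure from a 2023 estimate']
```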
Mistake 6: The "improve this" loop
Round 1: "Improve this paragraph."
Round 2: "Make it better."
Round 3: "Make it more engaging."
Round 4: "Hmm, not quite. Try again."
"Identify the 3 weakest sentences and rewrite each. For each rewrite, explain what you changed and why."
Generic improvement requests produce random variations. Specific diagnostic requests produce directed improvements.
Mistake 7: Re-specifying context
(Message 1) [full context]
(Message 2) "As I said before, my company is a B2B SaaS with..."
(Message 3) "Remember my company is B2B SaaS..."
(Message 4) "Like I mentioned, we're a B2B SaaS..."
Set context once. Reference it with a keyword: "For our B2B SaaS (see above)..."
Repeating context wastes tokens on every turn. Long chats compound this quickly.
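In code, "set context once" means putting it in the system message and letting the conversation history carry it. A sketch using the common chat-completions message shape (the actual API call is omitted; the context string is invented for illustration):

```python
# Long-lived context goes in the system message exactly once.
CONTEXT = "Company: B2B SaaS, 40 employees, sells workflow automation."

messages = [{"role": "system", "content": CONTEXT}]

def ask(question: str) -> list[dict]:
    # Each turn appends only the new question; the context rides
    # along in the history instead of being re-pasted every time.
    messages.append({"role": "user", "content": question})
    return messages

ask("For our B2B SaaS (see context), draft a pricing-page headline.")
```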
Mistake 8: The mega-prompt antipattern
A 2,000-token prompt that tries to do 15 things at once: analyze, summarize, translate, critique, generate variants, optimize for SEO, format as Markdown, match brand voice, etc.
Five focused 400-token prompts, each doing one thing well.
Why: Models degrade with complexity. A 15-instruction prompt gets ~60% compliance on each instruction. A 1-instruction prompt gets ~95%. Do the math.
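Doing that math, under the (simplifying) assumption that each instruction's compliance is independent — the compliance figures are the article's rough estimates, not measurements:

```python
# Chance that ALL instructions are satisfied at once, assuming
# independent per-instruction compliance rates.
p_all_mega = 0.60 ** 15   # 15 stacked instructions at ~60% each
p_all_split = 0.95 ** 5   # five focused prompts at ~95% each

print(f"mega-prompt: {p_all_mega:.5f}, split: {p_all_split:.2f}")
```

Under those assumptions the mega-prompt satisfies everything well under 1% of the time, while the split approach lands around 77%.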
Mistake 9: Asking for N items without constraints
"Give me 20 ideas for blog posts."
Result: 20 ideas, most of them generic or obvious, many overlapping.
"Give me 20 ideas for blog posts. Requirements:
- Each must target a different specific long-tail keyword
- No two can be rephrasings of the same concept
- Each must include the estimated search volume (roughly)
- Skip the 5 most obvious ideas; push for the 15 underserved ones"
Constraints force quality. Without them, the model gives you the fastest possible answers — which are rarely the best.
Mistake 10: Trusting the model's self-evaluation
"How good is this output, on a scale of 1-10?"
Result: "This is a strong response, I'd rate it 8/10."
Why it's wrong: Training bias. Models almost always self-rate favorably. Asking for self-critique in the same session is close to useless.
Open a fresh session. Paste the output. Ask: "Critique this harshly. What are the 3 weakest elements? Rank by severity."
Fresh context = less bias. Adversarial framing = more useful critique.
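A sketch of wiring this up: the critique runs in a brand-new conversation containing only the pasted output, so the critic has no memory of having produced it. Message shape follows the common chat-completions format; the client call itself is omitted:

```python
def fresh_critique_messages(output: str) -> list[dict]:
    """Build a single-message conversation for adversarial critique,
    with none of the original session's history attached."""
    return [{
        "role": "user",
        "content": (
            "Critique this harshly. What are the 3 weakest elements? "
            "Rank by severity.\n\n" + output
        ),
    }]

msgs = fresh_critique_messages("Our draft landing-page copy goes here.")
print(msgs[0]["content"][:40])
```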
The 4 quick wins
If you do nothing else, these 4 changes will save you 20-40% of token spend AND improve quality:
1. Delete politeness. Every time. 10-40 tokens saved per prompt. Adds up quickly.
2. Use XML for anything complex. See the XML Prompt Method.
3. One task per prompt. Never "do X AND Y AND Z" if you can split.
4. Replace vague requests with measurable ones. "Concise" → "under 100 words." "Engaging" → "include a question in the first sentence."
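"Measurable" means a script can verify it. A sketch of two such checks, matching the examples in point 4 (both functions are illustrative helpers, crude by design):

```python
def meets_word_limit(text: str, limit: int = 100) -> bool:
    """Checkable version of 'concise': under `limit` words."""
    return len(text.split()) < limit

def opens_with_question(text: str) -> bool:
    """Checkable version of 'engaging': a question mark appears
    before the first period (crude first-sentence check)."""
    q, p = text.find("?"), text.find(".")
    return q != -1 and (p == -1 or q < p)

print(meets_word_limit("A short answer."))                      # True
print(opens_with_question("Ever miss a deadline? Here's why.")) # True
```

You can't write `meets_engaging()`. That's exactly why "engaging" doesn't belong in a prompt.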
The cost math
A company running 100K prompt calls per month at an average 200 tokens of waste per prompt is throwing away:
- 20M tokens/month
- At $3/M for GPT-5: $60/month
- At $15/M for Opus 4: $300/month
- Annually (Opus): $3,600+ in pure waste
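The arithmetic above, as a reusable calculator (prices are the article's example rates, not quoted from any vendor's current price list):

```python
def monthly_waste_usd(calls: int, waste_tokens: int,
                      usd_per_million: float) -> float:
    """Dollar cost of wasted prompt tokens per month."""
    return calls * waste_tokens / 1_000_000 * usd_per_million

# 100K calls/month, 200 wasted tokens per call:
print(monthly_waste_usd(100_000, 200, 3.0))   # 60.0  ($3/M rate)
print(monthly_waste_usd(100_000, 200, 15.0))  # 300.0 ($15/M rate)
```

Plug in your own call volume and model pricing to size your waste before the quarterly audit.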
Plus the opportunity cost of degraded output quality (harder to measure, probably higher).
If you're using AI at any scale, audit your prompts quarterly. Remove politeness, fix instruction stacking, break up mega-prompts. The ROI is nearly immediate.
The meta-principle
Good prompt engineering is subtractive. Most prompts get better when you remove words, not add them. The minimum-viable prompt that produces your target output is almost always the best prompt.
Start with the shortest version that could possibly work. Add only if output is insufficient. Stop when output is good enough. Resist the urge to over-instruct.