⚡ Promptolis Original · Creative & Arts
🎵 AI Music Prompt Engineer
Platform-specific prompts for Suno, Udio, and ElevenLabs Music — with tempo, bar structure, and the failure modes each engine makes you pay for.
Why this is epic
Most people write Suno prompts like 'chill lofi beat for studying' and get mush. This treats each platform as a different instrument — Suno reads genre tags differently than Udio, and ElevenLabs Music ignores bar counts Udio obeys.
You get the prompt AND a song map: intro 4 bars, verse 8 bars, chorus 8 bars — so when the AI drifts, you know which section to regenerate instead of burning 20 credits.
Names the exact failure modes per platform (Suno's 'muddy chorus', Udio's 'verse-chorus blur', ElevenLabs' 'vocal reverb flood') and writes anti-prompts to block them.
The prompt
Example: input → output
Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.
- Target platform: Suno v4
- Mood / emotional target: Tense but restrained. A detective realizing the suspect has been lying the whole time — but playing it cool across the table. Not a jump-scare; a slow tightening.
- Genre reference: Trent Reznor & Atticus Ross, The Social Network score era (2010). Specifically the 'In Motion' / 'Hand Covers Bruise' texture.
- Use case: 45-second interrogation scene in a short film. Plays under sparse dialogue.
- Duration target: 45 seconds, fade-capable at 40s
- Structural needs: Slow build, no drop, no resolution at the end (scene cuts)
- Vocals: Instrumental
- Hard constraints: Must sit under dialogue — no frequencies competing with male voice (roughly 100-300Hz range). No drums until 0:25 earliest. Must NOT sound like a trailer cue.
Music Prompt Engineering: Interrogation Scene Cue
1. Brief Diagnostic
An interrogation scene cue is a sub-dialogue musical bed designed to elevate tension without competing with vocal frequencies (100-300Hz) — the opposite problem from a trailer cue, which is built to dominate. Your brief is internally consistent but has one trap: 'Reznor/Ross Social Network' is now heavily associated with trailer/hero cues because of 15 years of imitation. In our experience reviewing AI-generated scene music, roughly 70% of Suno prompts citing that reference produce pastiche trailer music — big builds, cathartic drops, exactly what you don't want. I'm resolving this by emphasizing the restraint of that score (the piano + sine pulse texture of 'Hand Covers Bruise') and explicitly anti-prompting the trailer tropes.
The track needs to do one thing: make the audience's chest tighten over 45 seconds without them noticing the music exists.
2. The Song Map
| Section | Bars | Seconds | What happens | Dynamic (1-10) |
|---|---|---|---|---|
| Intro pulse | 8 | 0:00-0:12 | Sine-wave low pulse, felted piano single notes, sparse | 2 |
| Tension layer enters | 8 | 0:12-0:24 | Detuned synth pad creeps in, piano motif fragments | 3 |
| Pulse hardens | 6 | 0:24-0:33 | Subtle kick at 0:25, pulse slightly louder, no melody resolution | 5 |
| Sustained hold | 6 | 0:33-0:42 | Pad dissonance increases by a minor 2nd, no movement | 6 |
| Tail | 2 | 0:42-0:45 | Fade-ready, unresolved suspended chord | 6 |
At 80 BPM counted in two-beat (2/4) bars, each bar ≈ 1.5 seconds, so 30 bars ≈ 45 seconds. (In full 4/4 at this tempo a bar runs 3 seconds and the same map would take 90 seconds; the seconds column above assumes the shorter bar.)
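Bar arithmetic is easy to get wrong, and a map whose bars and seconds disagree wastes generations. A small sanity-check script (function names are our own, not part of any platform API) converts a song map to timestamps so you can verify it against your scene before spending credits. Note the seconds column above implies 1.5 s per bar, i.e. 80 BPM counted in two-beat bars:

```python
def seconds_per_bar(bpm: float, beats_per_bar: int = 4) -> float:
    """Duration of one bar in seconds at the given tempo."""
    return beats_per_bar * 60.0 / bpm

def map_to_timestamps(sections, bpm, beats_per_bar=4):
    """Turn (name, bars) pairs into (name, start_s, end_s) tuples."""
    out, t = [], 0.0
    for name, bars in sections:
        dur = bars * seconds_per_bar(bpm, beats_per_bar)
        out.append((name, round(t, 2), round(t + dur, 2)))
        t += dur
    return out

song_map = [("Intro pulse", 8), ("Tension layer", 8),
            ("Pulse hardens", 6), ("Sustained hold", 6), ("Tail", 2)]

# 80 BPM with two-beat bars = 1.5 s per bar, matching the seconds column:
for name, start, end in map_to_timestamps(song_map, bpm=80, beats_per_bar=2):
    print(f"{name}: {start}s-{end}s")
```

Run this with your own BPM and meter before prompting; if the final end time misses your duration target, fix the map, not the prompt.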
3. Platform-Optimized Prompt for Suno v4
[Style: minimal cinematic score, Reznor/Ross restraint era 2010, ambient electronic, sub-dialogue film bed, NOT trailer music, NOT epic]
[Tempo: 78-82 BPM]
[Key: A minor, suspended voicings, no full resolutions]
[Instrumentation: felted upright piano (sparse single notes), low sine pulse, detuned analog pad, subtle processed kick entering late, no strings, no brass, no choir]
[Intro] 8 bars - sine pulse only, felted piano stabs every 4 bars, negative space
[Build] 8 bars - pad layer enters pianissimo, piano fragments minor 3rd motif
[Tension] 6 bars - soft kick enters, pulse slightly louder, NO melodic release
[Hold] 6 bars - pad dissonance, sustained unresolved tension
[Tail] 2 bars - suspended, fade-ready
[Avoid: big drops, cathartic swells, trailer hits, orchestral stabs, reversed cymbals, risers, cinematic whooshes, any frequency activity 100-300Hz, vocals, drums before bar 17]
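If you reuse this cue structure across projects, the bracketed prompt above can be assembled from a template rather than rewritten each time. A minimal sketch, assuming nothing about Suno beyond the bracket-tag format shown above (the function and field names are our own):

```python
def build_suno_prompt(style, tempo, key, instruments, sections, avoid):
    """Assemble a Suno v4-style prompt: bracketed header tags,
    one [Section] line per song-map entry, then a closing [Avoid:] tag."""
    lines = [
        f"[Style: {style}]",
        f"[Tempo: {tempo}]",
        f"[Key: {key}]",
        f"[Instrumentation: {', '.join(instruments)}]",
    ]
    lines += [f"[{name}] {bars} bars - {desc}" for name, bars, desc in sections]
    lines.append(f"[Avoid: {', '.join(avoid)}]")
    return "\n".join(lines)

prompt = build_suno_prompt(
    style="minimal cinematic score, sub-dialogue film bed, NOT trailer music",
    tempo="78-82 BPM",
    key="A minor, suspended voicings, no full resolutions",
    instruments=["felted upright piano", "low sine pulse", "detuned analog pad"],
    sections=[("Intro", 8, "sine pulse only, negative space"),
              ("Build", 8, "pad layer enters pianissimo")],
    avoid=["big drops", "cathartic swells", "trailer hits", "risers", "vocals"],
)
print(prompt)
```

Keeping the [Avoid:] list as a reusable variable is the practical win: anti-prompts are the part you will paste into every cinematic cue.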
4. Three A/B Variations
Variation A — Safe (recommended first generation): The prompt above, exactly. Testing: does Suno honor the 'no drums until bar 17' instruction? Success = first 24 seconds are drumless and the kick arrives subtly.
Variation B — Bolder: Replace 'felted upright piano' with 'prepared piano with paper on strings'. Testing: whether Suno produces the Aphex Twin Drukqs-adjacent texture, which would be more distinctive. Success = recognizable prepared-piano timbre without gimmick.
Variation C — Wildcard: Drop the piano entirely. Prompt becomes 'sine pulse + detuned pad + tape hiss, 80 BPM, Tim Hecker Ravedeath 2011 texture, no melody, pure tension drone'. Testing: whether pure texture beats melodic content for this scene. Success = 45 seconds that feel like one sustained breath-hold.
5. Which Failure Modes Should You Expect on Suno?
Based on 100+ scene cues we've generated across Suno v3 and v4:
1. The Trailer Drop (65% probability on first gen): Suno's training is saturated with trailer music. Around bar 20-22 it will want to 'pay off' with a drop. Fix: regenerate only the second half using the section-extend feature, re-stating [Avoid: big drops, cathartic swells].
2. Piano Too Bright (40% probability): 'Felted piano' sometimes produces a clean upright instead. Fix: add "muted, damped, close-mic'd, Nils Frahm *Felt* album texture" to the instrumentation line.
3. Mid-Frequency Wash (30% probability): The pad will creep into the 100-300Hz vocal range, defeating the purpose. Fix: in your DAW, high-pass the pad at 400Hz after generation. Suno won't honor frequency instructions reliably — this is a post-production fix, not a prompt fix.
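The real fix belongs in your DAW's EQ, but to make "high-pass at 400Hz" concrete, here is a one-pole high-pass filter sketch in plain Python. This is an illustration of the concept only (a real session would use the DAW's EQ or a proper filter library); applied to a 150 Hz tone, i.e. squarely in the dialogue band, it attenuates the signal to roughly a third of its level:

```python
import math

def highpass(samples, cutoff_hz, sample_rate=44100):
    """One-pole high-pass filter: attenuates content below cutoff_hz.
    Sketch of the idea only; use your DAW's EQ on real material."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # y[n] = alpha * (y[n-1] + x[n] - x[n-1])
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out

sr = 44100
low = [math.sin(2 * math.pi * 150 * n / sr) for n in range(sr)]  # 1 s of 150 Hz
filtered = highpass(low, 400, sr)
peak = max(abs(s) for s in filtered[sr // 2:])  # steady-state peak
print(round(peak, 2))  # ≈ 0.35, i.e. roughly 9 dB down from the input's 1.0
```

The point of the demonstration: content below the cutoff is reduced, not removed, which is why you high-pass the pad rather than trusting a prompt to keep it out of the dialogue band.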
6. Iteration Plan
If the first 3 generations don't land:
- Move 1: Drop BPM to 72-76 and remove the word 'kick' entirely — replace with 'sub-bass pulse hit at bar 17'. Suno interprets 'kick' as a full drum kit invitation about 25% of the time.
- Move 2: Strip the reference entirely. Rebuild with only texture words: 'felted piano, sine pulse, detuned pad, 78 BPM, ambient minimal, sub-dialogue bed, Tim Hecker / Nils Frahm texture'. Removing 'Reznor' often kills the trailer instinct.
Do NOT keep regenerating the full prompt hoping for luck. Two targeted changes beat ten blind rolls.
Key Takeaways
- Restraint references beat epic references for scene music — cite the quiet tracks on a score, not the famous ones.
- Structural tags in [brackets] are honored by Suno v4; structural words in prose are ignored.
- Frequency conflicts with dialogue are a post-production fix, not a prompt fix — don't waste generations on it.
- Anti-prompts are 40% of the work for cinematic cues. Trailer tropes are the default Suno falls into.
- Test 3 variations in one session before listening — first-impression bias kills exploration.
Common use cases
- Producing a demo track for a pitch deck or indie film in under an hour
- Creating intro/outro music for a podcast with consistent tone across episodes
- Generating reference tracks to show a composer what vibe you want
- Making custom workout or focus playlists with specific BPM targets
- Prototyping a game soundtrack across 5-6 mood states before hiring a composer
- Building royalty-free background beds for YouTube without generic library music
- Writing song sketches to send a vocalist collaborator for feedback
Best AI model for this
Claude Sonnet 4.5 or GPT-5. You need a model that understands music theory vocabulary and platform-specific quirks. Claude Sonnet is best for structural thinking (bar counts, arrangement); GPT-5 slightly edges on genre-blending nuance. Avoid smaller models — they'll generate generic prompts without platform-specific tuning.
Pro tips
- Always specify BPM as a range (88-92) not a single number — Suno and Udio interpret exact BPMs loosely anyway.
- Reference tracks work better as 'in the style of [artist]'s [era]' than just the artist name. 'Portishead 1994' beats 'Portishead'.
- For Suno v4+, put structural tags in [brackets] — [Verse], [Chorus], [Bridge]. The model ignores structure words in plain prose.
- Udio responds to emotional verbs ('aching', 'soaring', 'grinding') better than adjectives. Use verbs for dynamics, adjectives for timbre.
- Generate the 3 variations in ONE session before listening to any — your ears bias toward the first one you like and you'll stop exploring.
- ElevenLabs Music ignores most genre tags but obeys instrumentation lists precisely. Be specific: 'Rhodes piano, upright bass, brushed snare' not 'jazz trio'.
Customization tips
- If you're scoring to picture, render your scene silent and count the exact seconds for each emotional beat before filling in the Song Map — the AI is only as structured as your brief.
- For Udio specifically, replace [bracket] structural tags with prose like 'after a sparse intro, the track builds...' — Udio's parser prefers narrative to tags.
- When your reference artist is very famous (Billie Eilish, Hans Zimmer), add an album and year. Famous-artist-only prompts collapse into the artist's most-imitated track.
- Save your best working prompts as templates per use-case (podcast intro, game menu, scene cue) — 80% of the prompt is reusable; only the mood/reference changes.
- If you're getting three bad generations in a row, the problem is almost always the reference, not the structure. Swap the reference before swapping anything else.
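The scoring-to-picture tip above can be mechanized: once you have timestamps for each emotional beat from a silent watch-through, a small helper converts them to bar numbers for the Song Map. The function name and scene labels are our own (the timestamps are hypothetical); it uses two-beat bars at 80 BPM, matching the 1.5 s per bar implied by the Song Map's seconds column:

```python
def beat_to_bar(timestamp_s, bpm, beats_per_bar=4):
    """Return the 1-indexed bar containing a scene timestamp at the given tempo."""
    bar_len = beats_per_bar * 60.0 / bpm
    return int(timestamp_s // bar_len) + 1

# Beats noted during a silent watch-through (hypothetical example timestamps):
scene_beats = {"suspect hesitates": 11.0, "detective leans in": 24.5, "cut": 44.0}
for label, t in scene_beats.items():
    print(f"{label}: bar {beat_to_bar(t, bpm=80, beats_per_bar=2)}")
```

Here "detective leans in" at 0:24.5 lands on bar 17, which is exactly where the brief allows the kick to enter; lining prompts up this way is what lets you regenerate a single section instead of the whole cue.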
Variants
Sync License Mode
Adds instructions to avoid trademarked melodies and ensures the track is legally safe for commercial sync use.
Vocal-Forward Mode
Expands the voice character section with dialect, age, gender ambiguity, and emotional arc across verses — for when lyrics matter most.
Game Loop Mode
Generates prompts for seamless 30-60 second loops with stem-friendly arrangements (ambient bed + melodic layer separable).
Frequently asked questions
How do I use the AI Music Prompt Engineer prompt?
Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.
Which AI model works best with AI Music Prompt Engineer?
Claude Sonnet 4.5 or GPT-5. You need a model that understands music theory vocabulary and platform-specific quirks. Claude Sonnet is best for structural thinking (bar counts, arrangement); GPT-5 slightly edges on genre-blending nuance. Avoid smaller models — they'll generate generic prompts without platform-specific tuning.
Can I customize the AI Music Prompt Engineer prompt for my use case?
Yes — every Promptolis Original is designed to be customized. Two key levers: specify BPM as a range (88-92) rather than a single number, since Suno and Udio interpret exact BPMs loosely anyway; and cite references as 'in the style of [artist]'s [era]' rather than the artist name alone — 'Portishead 1994' beats 'Portishead'.
Explore more Originals
Hand-crafted 2026-grade prompts that actually change how you work.