⚡ Promptolis Original · AI Agents & Automation

🌐 Browser Agent Workflow Builder

Designs reliable browser-agent workflows (Manus, Computer Use, Claude in Chrome) with explicit recovery for the 7 failure modes that sink most browser automations.

⏱️ 5 min to set up 🤖 ~110 seconds in Claude 🗓️ Updated 2026-04-28

Why this is epic

Browser agents fail differently than text agents. Page state, viewport, timing, popups, anti-bot detection — none of these exist in API-based agents. Most browser automations crash on these specifics.

This Original designs the workflow with explicit recovery for the 7 dominant failure modes: stale-state, hidden-element, popup-interrupt, viewport-occlusion, anti-bot-block, navigation-timing, modal-overlay.

Calibrated to 2026 browser-agent reality: Manus, Anthropic Computer Use, Claude in Chrome, Browserbase, and custom Playwright+LLM. Picks the right platform for your task.

The prompt

Promptolis Original · Copy-ready
<role>
You are a browser-automation engineer with 5+ years building production browser agents on Manus, Anthropic Computer Use, Claude in Chrome, and Playwright+LLM hybrids. You have shipped 25+ browser automations to production. You have seen every failure mode and designed for them. You are direct. You will tell a builder their workflow lacks popup handling, has open-loop assumptions, or needs hard timeouts. You refuse to recommend 'just retry' as a fix for browser-agent failures — every retry needs a reason and a budget.
</role>

<principles>
1. Seven dominant failure modes: stale-state, hidden-element, popup-interrupt, viewport-occlusion, anti-bot-block, navigation-timing, modal-overlay. Design for each.
2. Verify before acting. Read state, act, verify state changed. Open-loop browser agents are flaky.
3. Semantic selectors beat coordinate-based ones. Pages re-render; coordinates break.
4. Popups are expected, not exceptional. Build first-class handling from day one.
5. Anti-bot detection is real. Don't bypass; respect. Authorized accounts, residential IPs if needed.
6. Hard timeouts at every step. Hung pages are the most common silent killer.
7. Checkpoint state after expensive steps. Resume from checkpoint, not from start.
</principles>

<input>
<task-description>{end-to-end browser task — be specific about start state and end state}</task-description>
<target-site>{URL or product. Note: anti-bot policies, login required, MFA, etc.}</target-site>
<frequency>{one-time / daily / continuous / on-demand}</frequency>
<authorization>{do you have credentials? are you authorized to automate? have you read their ToS?}</authorization>
<platform-preference>{Manus / Computer Use / Claude in Chrome / Browserbase / custom Playwright+LLM / 'recommend'}</platform-preference>
<failure-tolerance>{which failures can fail silently, which need alert, which need automatic recovery}</failure-tolerance>
<latency-budget>{how long can the task take? hours? minutes?}</latency-budget>
<volume>{how many tasks per run? per day?}</volume>
</input>

<output-format>
# Browser Agent Workflow: [task description]

## Authorization & ToS Check
Is this automation safe and legitimate? If unclear, flag explicitly.

## Platform Recommendation
Which browser-agent platform fits this task. Why this rather than alternatives.

## The Workflow Steps
Numbered list. Each step: action, expected page-state before, expected page-state after, verification check, timeout, failure-recovery.

## The 7 Failure Mode Defenses
For each of the 7 modes: how this workflow defends against it. Concrete code patterns or strategies.

## Selector Strategy
Which elements get semantic selectors vs visual. The fallback hierarchy when primary selector fails.

## Popup Handling Plan
What popups to expect. How to detect them. How to dismiss vs interact-with.

## State Checkpoints
Where state is persisted. What the resume point is. How to detect partial completion.

## Anti-Bot Considerations
What anti-bot measures the target uses. Authorized approach. What NOT to do.

## Implementation Skeleton
File structure or pseudo-code for the chosen platform.

## Health Checks & Alerts
For production workflows: health pings, dead-man switches, alert thresholds.

## Test Plan
How to validate the workflow. Specific test scenarios including failure simulations.

## Key Takeaways
4-6 bullets — for the workflow's design doc.
</output-format>

<auto-intake>
If input incomplete: ask for task, target site, frequency, authorization, platform preference, failure tolerance, latency budget, volume.
</auto-intake>

Now, design the browser-agent workflow:

Example: input → output

Here's how this prompt actually performs. Real input below, real output from Claude Opus 4.

📝 Input
<task-description>Every Monday morning, log into our 4 ad-platform admin consoles (Google Ads, Meta Ads, LinkedIn Ads, Reddit Ads), pull the last 7 days of spend + impressions + clicks per campaign, paste into a Google Sheet, then post a summary in Slack #marketing.</task-description>
<target-site>Google Ads, Meta Ads Manager, LinkedIn Campaign Manager, Reddit Ads. All require login + MFA on first use.</target-site>
<frequency>Weekly, Monday 8am Pacific.</frequency>
<authorization>I'm an admin on all 4 accounts, authorized to access. ToS allows manual export; not 100% sure about agent automation specifically — none have explicit prohibition.</authorization>
<platform-preference>Recommend. Currently I do this manually in 90 minutes; want to cut to <15 min of human time.</platform-preference>
<failure-tolerance>Total failure: alert me, I'll do it manually. Partial failure (3 of 4 platforms succeed): proceed and flag the failed one. Silent wrong-data (got data but it's wrong): worst case — I'd present bad numbers in standup.</failure-tolerance>
<latency-budget>Workflow can take up to 30 minutes. I just need it done by 9am.</latency-budget>
<volume>1 run per week. ~12 campaigns total across all platforms.</volume>
🤖 Output

Browser Agent Workflow: Weekly Ad Platform Spend Aggregation

Authorization & ToS Check

You are an admin on all 4 accounts and authorized to access them manually. The ToS question around agent automation is the genuine grey area. Recommendation:

  • Use OFFICIAL APIs where they exist: Google Ads API and Meta Marketing API both expose all the data you need. Free tier covers your volume. Use these instead of browser automation.
  • Browser automation only for LinkedIn Ads and Reddit Ads (LinkedIn API is very limited; Reddit Ads API is in beta with gaps).
  • For browser-automated platforms, run from your authorized account with reasonable pacing. Do not attempt to bypass any anti-bot measures.

This moves 50% of your workflow off browser automation entirely, which is the right call. APIs are more reliable than browser agents for structured-data extraction.

Platform Recommendation

Hybrid: Python script for API calls + Anthropic Computer Use for the 2 platforms without good APIs.

Why:

  • Computer Use handles MFA, session persistence, and complex SaaS UIs better than headless Playwright in 2026.
  • For LinkedIn + Reddit Ads, the data export flows are complex enough that Computer Use's screenshot-based reasoning is more reliable than scraping.
  • For Google + Meta, raw API calls are 100× more reliable and don't risk ToS issues.

Manus is overkill for a 1-run-per-week workflow with strict structure. Claude in Chrome would work but lacks the headless scheduling capability you need for an 8am Monday cron.

The Workflow Steps

Phase A: API extraction (parallel, ~2 minutes)
[A1] Google Ads API → fetch last 7 days spend/impr/clicks by campaign
  Verify: response has campaigns, dates align with last 7d, sums are non-zero
  Timeout: 60s
  Failure: log + continue to A2
[A2] Meta Marketing API → same data shape
  Verify: same as A1
  Timeout: 60s
  Failure: log + continue
Phase B: Browser automation (sequential, ~20 minutes)
[B1] Computer Use: open LinkedIn Campaign Manager
  Pre-state: browser open, signed-out
  Action: navigate to campaign-manager URL, sign in (saved cookies if MFA from previous run still valid, else trigger MFA notification)
  Verify: dashboard URL loaded, account-name visible matches expected
  Timeout: 90s for nav, 5min for MFA cycle
  Recovery: if sign-in fails 2x, alert and skip

[B2] Computer Use: navigate to last-7-days view
  Pre-state: dashboard open
  Action: click date-range selector → 'Last 7 days' → apply
  Verify: date range visible at top reads correct range
  Timeout: 30s
  Recovery: if date-range UI moved, take screenshot, re-plan

[B3] Computer Use: export campaign data
  Pre-state: dashboard with last-7-days filter
  Action: click export → choose CSV → download
  Verify: download dialog appears, then file appears in download folder, file size > 0
  Timeout: 60s for the download itself
  Recovery: if export fails, screenshot the page state, alert

[B4] Python: parse downloaded CSV
  Verify: row count > 0, expected columns present
  Failure: alert with the file path

[B5-B8] Repeat B1-B4 for Reddit Ads
Phase C: Aggregate and Post (~3 minutes)
[C1] Python: combine all platforms into unified row format
  Verify: total rows = expected campaign count
[C2] Append to Google Sheet (use Sheets API)
[C3] Format Slack summary message
[C4] Post to #marketing
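Every step above follows the same contract: act, verify the page state actually changed, enforce a hard timeout, and recover within a retry budget. A minimal sketch of that contract in plain Python — the `action`/`verify` callables and `StepFailed` are illustrative placeholders, not a real Computer Use API:

```python
import time

class StepFailed(Exception):
    """Raised when a step's verification never passes within its budget."""

def run_step(name, action, verify, timeout_s, retries=1, on_fail=None):
    """Act, then poll the verification check until it passes or the hard
    timeout expires. Retries are budgeted, never open-ended."""
    for _attempt in range(retries + 1):
        action()
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            if verify():
                return True
            time.sleep(0.05)  # poll interval; tune per platform
    if on_fail:
        on_fail(name)  # alert / checkpoint / skip, per your failure tolerance
    raise StepFailed(f"{name}: state never matched within {timeout_s}s")
```

Step B2, for instance, would pass a click on the date-range selector as `action` and a check that the visible range reads "Last 7 days" as `verify`, with `timeout_s=30`.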

The 7 Failure Mode Defenses

1. Stale-state: Each Computer Use action verifies state AFTER acting (screenshot + describe). If page state isn't what was expected, re-plan rather than blind-retry.

2. Hidden-element: When clicking 'export' or 'date-range', if the element isn't visible, the agent first scrolls and checks for collapsed menus. Hidden elements often sit inside accordion menus that were collapsed.

3. Popup-interrupt: Before each action, the agent screenshots and looks for: cookie banners, 'we've made changes!' modals, MFA prompts, session-timeout overlays. Dismisses each by type. Specifically:

  • LinkedIn: a known 'Try LinkedIn Premium' modal appears in ~30% of sessions. Dismiss with the X button.
  • Reddit Ads: occasional 'New feature' tour modal. Skip.

4. Viewport-occlusion: Before clicking, scrollIntoView the target element. Use semantic selector (button[aria-label='Export']), not coordinates.

5. Anti-bot-block: Run with reasonable pacing (3-5s between major actions, not 100ms). Use authorized account. If you hit a CAPTCHA, ABORT and alert — don't try to solve. CAPTCHAs are a signal you're being detected as a bot, and bypassing breaks ToS.

6. Navigation-timing: All page loads have explicit waits. Wait for specific elements, not arbitrary timeouts. 'wait until campaign table is visible AND has rows' beats 'wait 10 seconds.'

7. Modal-overlay: Same as popup-interrupt, but specifically for overlays that block interaction with the page underneath. Check z-index; dismiss before resuming the main flow.
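Defense 6 reduces to a small helper: wait on a concrete page condition, not a fixed sleep. A hedged sketch in plain Python — `condition` is any zero-arg callable you compose; the helper is generic, not a Playwright API:

```python
import time

def wait_until(condition, timeout_s, poll_s=0.25, desc="condition"):
    """Block until condition() is truthy, or raise after a hard timeout.
    'campaign table visible AND has rows' beats 'sleep 10 seconds'."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll_s)
    raise TimeoutError(f"timed out after {timeout_s}s waiting for {desc}")

# Composed condition per the navigation-timing rule (table_visible and
# row_count are hypothetical page-inspection helpers you supply):
# wait_until(lambda: table_visible() and row_count() > 0,
#            timeout_s=30, desc="campaign table with rows")
```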

Selector Strategy

Primary: semantic selectors (aria-label, role, text content, data attributes).

Secondary: structural selectors (nth-child, sibling-of-known-element).

Last resort: visual (coordinate-based with relative positioning).

For LinkedIn Campaign Manager:

  • Primary: button[aria-label*='Export'], [data-test='date-range-selector']
  • Secondary: 'the third button in the toolbar'
  • LinkedIn changes its DOM frequently. Plan for selector breakage every 2-3 months.

For Reddit Ads:

  • Reddit Ads is more stable; primary semantic selectors usually hold.
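The fallback hierarchy is worth making explicit in code so tier usage is observable: if the secondary tier starts winning, the primary selector broke and needs maintenance. A sketch under stated assumptions — the `page` object and the locator lambdas are placeholders for your platform's real API:

```python
def find_with_fallback(page, tiers):
    """tiers: ordered list of (tier_name, locate_fn). Returns the first
    element found plus which tier found it, so breakage shows up in logs."""
    for tier_name, locate in tiers:
        element = locate(page)
        if element is not None:
            return element, tier_name
    raise LookupError("all selector tiers failed; page layout likely changed")

# Hypothetical LinkedIn export-button hierarchy:
# find_with_fallback(page, [
#     ("semantic",   lambda p: p.query("button[aria-label*='Export']")),
#     ("structural", lambda p: p.query(".toolbar button:nth-child(3)")),
# ])
```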

Popup Handling Plan

For each platform, maintain a known_popups config:

linkedin_ads:
  - id: try_premium_modal
    selector: "[data-test='premium-upsell']"
    action: click_close_button
  - id: tour_modal
    selector: ".tour-overlay"
    action: click_skip
reddit_ads:
  - id: feature_announcement
    selector: "[role='dialog']:has-text('New')"
    action: click_dismiss

Before each major action, run dismiss_known_popups(). Unknown popups → screenshot + alert + abort that platform's run.
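A minimal sketch of that dismissal loop, assuming a hypothetical `page` interface with `is_visible(selector)` and `do(action, selector)` methods — adapt these to your platform's actual calls:

```python
# Per-platform registry of known popups (ids and selectors match the
# config above; treat any dialog NOT in this registry as unknown).
KNOWN_POPUPS = {
    "linkedin_ads": [
        {"id": "try_premium_modal",
         "selector": "[data-test='premium-upsell']",
         "action": "click_close_button"},
        {"id": "tour_modal",
         "selector": ".tour-overlay",
         "action": "click_skip"},
    ],
    "reddit_ads": [
        {"id": "feature_announcement",
         "selector": "[role='dialog']",
         "action": "click_dismiss"},
    ],
}

def dismiss_known_popups(page, platform):
    """Dismiss every known popup currently visible; return the ids handled.
    Any dialog still visible afterwards is unknown: screenshot, alert, abort."""
    handled = []
    for popup in KNOWN_POPUPS.get(platform, []):
        if page.is_visible(popup["selector"]):
            page.do(popup["action"], popup["selector"])
            handled.append(popup["id"])
    return handled
```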

State Checkpoints

After each successful Phase A or Phase B platform:

  • Save: platform name, success boolean, output file path or data, timestamp
  • Persist to local JSON weekly_ad_run_state_{date}.json
  • If subsequent platform fails: workflow can complete with what it has and post 'partial' summary

Resume logic: on workflow restart, check state file. Skip platforms already successful. Resume from first incomplete.
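The checkpoint file and resume logic are small enough to sketch directly. The schema (platform, success flag, output, timestamp) matches the plan above; the helper names are illustrative:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def save_checkpoint(state_path, platform, ok, output=None):
    """Record one platform's result in the per-run state file."""
    path = Path(state_path)
    state = json.loads(path.read_text()) if path.exists() else {}
    state[platform] = {
        "ok": ok,
        "output": output,  # file path or inline data
        "ts": datetime.now(timezone.utc).isoformat(),
    }
    path.write_text(json.dumps(state, indent=2))

def platforms_to_run(state_path, all_platforms):
    """On restart, skip platforms already marked successful."""
    path = Path(state_path)
    state = json.loads(path.read_text()) if path.exists() else {}
    return [p for p in all_platforms if not state.get(p, {}).get("ok")]
```

A partial run (say, LinkedIn failed) leaves its entry with `"ok": false`, so the restarted workflow retries only LinkedIn and the platforms it never reached.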

Anti-Bot Considerations

  • Run from a residential or datacenter IP that has been used to log in normally before. Don't run from a brand-new IP.
  • Use the SAME browser profile/cookies across runs (Computer Use supports persistent profiles). Fresh profiles look more bot-like.
  • Pace actions: 3-5s between major actions, longer (10-15s) before/after sensitive operations like export.
  • If you see a CAPTCHA: ABORT immediately. Do not try to solve. Alert. Manual remediation required.
  • If LinkedIn or Reddit explicitly prohibits agent automation in their ToS (re-check annually), switch to manual collection for that platform.
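The pacing rule reduces to a tiny jittered-sleep helper. The default intervals come from the plan above; the jitter avoids the perfectly regular timing that fingerprints bots:

```python
import random
import time

def paced(lo=3.0, hi=5.0):
    """Sleep a jittered interval between major actions and return the
    delay used. Call paced(10.0, 15.0) before/after sensitive operations
    like login or export."""
    delay = random.uniform(lo, hi)
    time.sleep(delay)
    return delay
```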

Implementation Skeleton

/weekly_ad_workflow
  /api_clients
    google_ads_client.py    (uses google-ads-python SDK)
    meta_ads_client.py      (uses Meta Marketing API)
  /browser_agents
    linkedin_runner.py      (Computer Use orchestration)
    reddit_runner.py        (Computer Use orchestration)
    popup_handlers.py       (shared dismiss logic)
  /aggregator
    combine.py              (unified row format)
    sheets_writer.py        (Google Sheets API append)
    slack_poster.py         (Slack webhook)
  /state
    weekly_ad_run_state_*.json (per-run state files)
  main.py                   (orchestrates A1-C4)
  cron.sh                   (called by scheduler at 8am Mon Pacific)

Use a workflow engine (Inngest, simple cron, or AWS EventBridge) to trigger main.py weekly. main.py loads the state file if it exists, runs the phases, and posts results.

Health Checks & Alerts

  • Run start: Slack DM 'Weekly ad run starting'
  • Run end: Slack DM with summary (X of 4 platforms succeeded) + post to #marketing if all 4 succeeded
  • Run timeout: if workflow exceeds 35 minutes, alert + abort
  • No run on Monday by 9:30am Pacific: dead-man switch alerts you (workflow didn't run at all — likely cron issue, not workflow issue)
  • Repeated platform failure: if same platform fails 3 weeks in a row, escalate (likely the site changed)
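The dead-man switch is just a second scheduled check that runs independently of the workflow (e.g. a separate 9:30am cron). A sketch, assuming `last_run` is read from the state file's newest timestamp:

```python
from datetime import datetime, time as dtime

def dead_man_tripped(now, last_run, deadline=dtime(9, 30)):
    """True if it's Monday past the deadline and the workflow has not run
    today — the cron likely failed, not the workflow itself."""
    if now.weekday() != 0 or now.time() < deadline:
        return False  # only fires on Monday, after the deadline
    return last_run is None or last_run.date() < now.date()
```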

Test Plan

Local validation:

1. Test each API client individually with last week's known data. Verify output matches numbers from manual extract.

2. Test each browser runner against staging-like data (or against real data with ABORT-on-export-success — i.e., go through flow but skip the actual download).

3. Test aggregator with synthetic platform data.

4. Test Slack post in a private channel.

Failure simulations:

  • Disconnect network mid-Computer-Use run → verify checkpoint resume
  • Trigger a fake popup → verify popup handler dismisses
  • Use expired credentials → verify clean abort + alert
  • Hit a CAPTCHA (use a VPN to a fresh IP) → verify abort + alert

End-to-end smoke test: run the full workflow on a Sunday afternoon (off-peak). Should complete successfully. Verify Slack post matches manual numbers within 0.5%.

Production rollout:

  • Week 1: run in shadow mode (post to private DM, not #marketing). Compare to manual numbers.
  • Week 2: if shadow agreement >99%: cut over.
  • Week 3+: monitor weekly.

Key Takeaways

  • Use APIs for Google Ads + Meta — don't browser-automate when an API exists. This eliminates 50% of your failure surface immediately.
  • For LinkedIn + Reddit, Computer Use is the right tool. Manus is overkill for once-weekly; raw Playwright is harder than Computer Use's screenshot reasoning.
  • Build popup handling from day one as first-class. All 4 platforms will throw modals at you intermittently.
  • Save state checkpoints between platforms. Partial completion (3 of 4 succeed) is acceptable per your spec — but only if the system can post a partial summary cleanly.
  • Anti-bot measures are real and increasing. Run from authorized account, residential IP, paced actions. CAPTCHA = abort, not solve.
  • This workflow saves 75 minutes/week (90 → 15) at a one-time engineering cost of ~12 hours. Payback in 10 weeks. Worth building, but only if you commit to maintenance when LinkedIn changes its DOM (likely every 2-3 months).

Common use cases

  • Builder automating data extraction from a SaaS without API access
  • Ops team running weekly form-fills across 50+ portals
  • Developer automating QA of a customer-facing web flow
  • Solo dev scraping competitor pricing data daily
  • Builder running scheduled tasks in legacy admin tools that have no API

Best AI model for this

Claude Opus 4. Browser-agent design requires reasoning about UI state machines, timing, and recovery — Claude's systems-level reasoning + multimodal input is ideal. ChatGPT GPT-5 second-best.

Pro tips

  • Always verify before acting. Read the page state, then act — never act blindly. 'Click the submit button' fails when the page is in a different state than expected.
  • Build for popups from day one. Cookie banners, newsletter modals, 'are you still here?' overlays — these will appear unpredictably. Treat them as expected, not exceptional.
  • Use semantic selectors, not visual ones. 'The button labeled Submit' is more durable than 'the button at coordinates (450, 300).' When pages re-render, visual selectors die.
  • Anti-bot detection is real and increasing. Don't try to bypass it; respect it. Use authorized accounts, accept rate limits, run from residential IPs if necessary.
  • Save state checkpoints. After step 4 of a 10-step workflow, persist the cookies + URL + DOM snapshot. Resume from checkpoint after failure, not from start.
  • Run with timeouts at every step. A page that 'almost' loads then hangs will hang the whole workflow. Hard timeouts force re-evaluation.
  • Use a feedback loop with screenshots. After each action, screenshot, verify the expected change happened, decide next step. Open-loop browser agents are flaky.

Customization tips

  • List all popups you've seen on the target sites. They're predictable per-site; the workflow design depends on knowing them upfront.
  • Specify whether you have API access. Browser automation is the LAST resort, not the first — APIs are more reliable, cheaper, and don't have ToS grey areas.
  • Be explicit about authorization. 'I'm authorized to access' is necessary; 'agent automation is permitted by ToS' is a separate question worth checking annually.
  • If the workflow is recurring (weekly+), specify run cadence. The architecture for a daily run differs from monthly — daily justifies more upfront automation; monthly may not.
  • For longer workflows (>30 min), use the Production Browser Mode variant — it adds dead-man switches and partial-completion handling.
  • Re-run quarterly. Browser sites change DOMs; selectors break. Plan maintenance, don't be surprised by it.

Variants

Computer Use Mode

For Anthropic Computer Use — handles desktop-level interactions (file uploads, downloads, multi-window coordination).

Claude in Chrome Mode

For the Chrome extension — works within the security model and known platform constraints.

Manus Mode

For Manus and similar autonomous browser agents — adds long-running task continuity and budget limits.

Production Browser Mode

For browser agents running unattended on schedule — adds health checks, dead-man switches, and rollback for partial completion.

Frequently asked questions

How do I use the Browser Agent Workflow Builder prompt?

Open the prompt page, click 'Copy prompt', paste it into ChatGPT, Claude, or Gemini, and replace the placeholders in curly braces with your real input. The prompt is also launchable directly in each model with one click.

Which AI model works best with Browser Agent Workflow Builder?

Claude Opus 4. Browser-agent design requires reasoning about UI state machines, timing, and recovery — Claude's systems-level reasoning + multimodal input is ideal. ChatGPT GPT-5 second-best.

Can I customize the Browser Agent Workflow Builder prompt for my use case?

Yes — every Promptolis Original is designed to be customized. Key levers: verify before acting (read the page state, then act; 'click the submit button' fails when the page is in a different state than expected), and build for popups from day one (cookie banners, newsletter modals, and 'are you still here?' overlays appear unpredictably — treat them as expected, not exceptional).

Explore more Originals

Hand-crafted 2026-grade prompts that actually change how you work.

← All Promptolis Originals