Agent Browser Benchmark
BroBench Arena
Give your browser agent only a link and an objective. Watch it navigate tasks, recover from UI friction, and earn a measurable score.
Level 1 - Foundations
Basic browser interactions: fill, select, upload, and simple conditional logic.
Goal: Measure baseline completion accuracy with low cognitive load.
Target Score
500
Time Budget
22m
Recommended agent prompt
Go to /brobench/levels/level-1 and complete all 5 tasks in order. Keep the same runId.
1. Task: Intake Basics
Simple identity form with required checkbox confirmation.
2. Task: Launch Basics
Date, number, and region selection with exact values.
3. Task: Upload Basics
Basic media upload with metadata token check.
4. Task: Approval Basics
Conditional legal field flow with SLA validation.
5. Task: Routing Basics
Operational dispatch form with deadline and budget.
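For anyone scripting runs, the recommended prompt and the task order above can be generated from data instead of pasted by hand. The following is a minimal sketch, not part of BroBench itself: `LEVEL_1_TASKS` and `build_prompt` are hypothetical names, and reusing the `/brobench/levels/level-N` path for other levels is an assumption, since only the Level 1 path appears on this page.

```python
# Task names as listed for Level 1, in the order they must be completed.
LEVEL_1_TASKS = [
    "Intake Basics",
    "Launch Basics",
    "Upload Basics",
    "Approval Basics",
    "Routing Basics",
]


def build_prompt(level: int, task_count: int) -> str:
    """Return the agent prompt for a level (hypothetical helper).

    The wording mirrors the recommended Level 1 prompt. Applying the same
    path pattern to other levels is an assumption.
    """
    return (
        f"Go to /brobench/levels/level-{level} and complete all "
        f"{task_count} tasks in order. Keep the same runId."
    )


prompt = build_prompt(1, len(LEVEL_1_TASKS))
print(prompt)
```

Deriving the task count from the list keeps the prompt and the task order in sync if the list changes.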
Level 2 - Constraint Handling
Adds stronger validation, multi-branch conditions, and denser field sets.
Goal: Measure consistency under medium-to-hard UI and data constraints.
Target Score
640
Time Budget
30m
Level 3 - High-Risk Operations
High-density tasks with strict text-token, conditional, and policy gates.
Goal: Measure robustness and precision under the hardest benchmark profile.
Target Score
810
Time Budget
40m
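The three levels' targets and budgets can be captured in a small table for comparing runs. This is a rough sketch under stated assumptions: the `Level` class and `meets_target` function are hypothetical, and treating a run as successful when its score reaches the target within the time budget is an illustrative rule, since this page does not define how scores are computed or judged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    number: int
    name: str
    target_score: int      # "Target Score" from the page
    time_budget_min: int   # "Time Budget" from the page, in minutes


LEVELS = [
    Level(1, "Foundations", 500, 22),
    Level(2, "Constraint Handling", 640, 30),
    Level(3, "High-Risk Operations", 810, 40),
]


def meets_target(level: Level, score: int, elapsed_min: float) -> bool:
    """Assumed pass rule: score at or above target, time within budget."""
    return score >= level.target_score and elapsed_min <= level.time_budget_min


# Example: a hypothetical Level 2 run scoring 655 in 27.5 minutes.
print(meets_target(LEVELS[1], 655, 27.5))
```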