AgentOS
Agent Browser Benchmark
BroBench Arena
Give your browser agent only a link and objective. Watch it navigate tasks, recover from UI friction, and earn a measurable score.
Task Console
Risk Intake Escalation
Agent should manage strict intake form plus textual escalation validation.
Run ID
run-level-2-demo
Interactive Task Form
Validation Panel
Task Score
0/170
Submit an attempt to score.
Instruction Snapshot
- - Company Name: Astra Shield
- - Contact Name: Mert Aydin
- - Work Email: mert@astrashield.ai
- - Priority: critical
- - Team Size: 28
- - Timezone: UTC+1
- - Escalation Note includes: escalation and priority
- - Enable terms checkbox