🔬 Arbor: An Agent That Runs Research on a Hypothesis Tree
TL;DR
Arbor frames autonomous research as a persistent hypothesis tree: a coordinator sets strategy while short-lived executors test ideas in isolated git worktrees. The paper reports the best held-out results on all six real tasks, which span model training, harness engineering, and data synthesis, and it carries lessons across runs.
Key Points
Hypothesis Tree Refinement links hypotheses, artifacts, evidence, and distilled lessons
Executors run individual hypotheses in isolated git worktrees, then update the shared tree
Evaluated under Autonomous Optimization with no step-level human supervision
Reports the best held-out result on all six real research tasks tested
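The structure described in these points can be sketched in a few lines. This is a minimal illustration, not Arbor's actual code: the class `HypothesisNode`, the helpers `record_result` and `worktree_command`, the `hyp/<id>` branch naming, and the example score are all assumptions made up for this sketch. The only real interface used is the standard `git worktree add -b <branch> <path>` command, which is built here but not run.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class HypothesisNode:
    """One node of a hypothetical hypothesis tree (all names are illustrative)."""

    node_id: str
    statement: str
    # repr=False avoids infinite recursion when printing parent <-> child links.
    parent: Optional[HypothesisNode] = field(default=None, repr=False)
    children: list[HypothesisNode] = field(default_factory=list)
    artifacts: list[str] = field(default_factory=list)   # e.g. branch names
    evidence: list[float] = field(default_factory=list)  # e.g. held-out scores
    lesson: Optional[str] = None                         # distilled takeaway

    def add_child(self, node_id: str, statement: str) -> HypothesisNode:
        """Attach a refined hypothesis under this node."""
        child = HypothesisNode(node_id, statement, parent=self)
        self.children.append(child)
        return child

    def lessons_on_path(self) -> list[str]:
        """Collect lessons from the root down to this node."""
        lessons = []
        node: Optional[HypothesisNode] = self
        while node is not None:
            if node.lesson:
                lessons.append(node.lesson)
            node = node.parent
        return lessons[::-1]


def worktree_command(node: HypothesisNode, base_dir: str = "worktrees") -> list[str]:
    """Build (but do not run) a git command giving this node its own checkout."""
    branch = f"hyp/{node.node_id}"  # assumed naming scheme
    return ["git", "worktree", "add", "-b", branch, f"{base_dir}/{node.node_id}"]


def record_result(node: HypothesisNode, score: float, artifact: str,
                  lesson: Optional[str] = None) -> None:
    """What an executor might write back to the shared tree after one run."""
    node.evidence.append(score)
    node.artifacts.append(artifact)
    if lesson is not None:
        node.lesson = lesson


# Usage with made-up values: one root goal and one tested hypothesis.
root = HypothesisNode("root", "Improve the held-out score")
h1 = root.add_child("h1", "A longer warmup helps")
record_result(h1, 0.5, "hyp/h1", lesson="Warmup helps at small batch sizes")
print(worktree_command(h1))
print(h1.lessons_on_path())
```

Keeping artifacts, evidence, and the lesson on the same node is what lets a later executor read the path from the root and inherit what earlier attempts learned, rather than starting from scratch.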
Why It Matters
Persisting strategy and evidence across attempts is what separates a one-shot research agent from one whose progress compounds, and it is a prerequisite for any self-improving AI lab.