How the Pennyfarthing agent pipeline (TEA → Dev → Reviewer) performs on seeded-defect scenarios, across persona themes — weighted catch rate, per-finding hit rate, and phase attribution.
Ground truth = real findings external reviewers flagged on PRs the pipeline had already approved. See the
Peloton guide
for methodology. Data is a snapshot; regenerate with pf benchmark viz.