Trace Checklist
Use this checklist after every live run. The trace is the unit of diagnosis.
Run Metadata
| Field | Value |
|---|---|
| Repo and SHA | |
| Prompt | |
| Model | |
| Tool list | |
| Workspace type | |
| Started from Canvas or SDK |
Event Review
Record:
- first useful search or file read
- every tool type used
- files read
- files edited
- commands run
- confirmation events
- compaction events
- model switch events
- final answer
Metrics
Record:
- total events
- turns
- input tokens
- output tokens
- accumulated cost
- wall time
- pass/fail
- cost per solved task, when comparing strategies
Diagnosis Questions
- Did the agent retrieve the right evidence before answering?
- Did it repeat the same failed action?
- Did it edit before understanding?
- Did it verify with the right command?
- Did it stop too early?
- Did the harness expose enough information to explain the result?