benchcad leaderboard
Model performance across four matched programmatic-CAD tasks. Scoring is execution-grounded and objective — geometry by voxel IoU between executed STEP solids, QA by symmetric ratio accuracy. No LLM judge. Numbers are re-graded from submitted predictions, never self-reported.
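The two metric families named above can be illustrated with a minimal sketch. It is not the benchmark's grader. It assumes the executed solids have already been voxelized into sets of occupied integer grid cells, and it assumes "symmetric ratio" means the min/max ratio of predicted and reference values; the page does not define either detail. The function names `voxel_iou` and `symmetric_ratio` are hypothetical.

```python
from typing import Iterable, Tuple

Voxel = Tuple[int, int, int]


def voxel_iou(pred: Iterable[Voxel], ref: Iterable[Voxel]) -> float:
    """Intersection-over-union of two sets of occupied voxels.

    Assumption: two empty solids are treated as a perfect match.
    """
    p, r = set(pred), set(ref)
    union = p | r
    if not union:
        return 1.0
    return len(p & r) / len(union)


def symmetric_ratio(pred: float, ref: float) -> float:
    """Assumed definition: min/max ratio of two numeric answers, in [0, 1].

    The score is the same whichever argument is the prediction.
    Values of opposite sign, or exactly one zero, score 0.
    """
    if pred == ref:
        return 1.0
    if pred * ref <= 0:
        return 0.0
    a, b = abs(pred), abs(ref)
    return min(a, b) / max(a, b)


def box(x0: int, x1: int, y1: int, z1: int) -> set:
    """Occupied voxels of an axis-aligned box spanning x0..x1, 0..y1, 0..z1 (exclusive ends)."""
    return {(x, y, z) for x in range(x0, x1) for y in range(y1) for z in range(z1)}


if __name__ == "__main__":
    reference = box(0, 2, 2, 2)   # 2x2x2 cube
    predicted = box(1, 3, 2, 2)   # same cube shifted one voxel along x
    print(voxel_iou(predicted, reference))
    print(symmetric_ratio(8.0, 10.0))
```

Both scores depend only on the executed output and the reference, which is what makes re-grading from submitted predictions possible without a judge model.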
reproduce this task — full split
frontier (proprietary)
open weights
control / baseline
To add a model to this leaderboard, run the command above and open a PR on the
code repo with your results.jsonl.
Submissions are re-graded before listing.