VLMs cannot automate construction takeoff. The data is not in the drawings
An empirical case study on the accuracy ceiling of vision-language models on real Australian construction takeoff. Given everything but the final numbers, a Claude Opus 4.7 / Sonnet 4.6 extractor peaked at 77.5% accuracy, because the per-job billing conventions that decide a takeoff are not present in the drawings.