OCR'ing 100k pages with open-source VLMs on Modal
We OCR'd 100,000 pages with open-source vision-language models in under an hour for $223, roughly 9–27× cheaper than proprietary APIs of comparable quality.
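To put those figures in per-page terms, here is a small back-of-the-envelope sketch. It uses only the numbers quoted above ($223, 100,000 pages, 9–27×); the function names are illustrative, and the proprietary-API totals are implied by the stated multiplier, not figures reported in the article.

```python
# Figures quoted in the summary above.
TOTAL_COST_USD = 223.0
PAGES = 100_000
CHEAPER_FACTORS = (9, 27)  # "roughly 9-27x cheaper"


def cost_per_page(total_usd: float, pages: int) -> float:
    """Average cost of OCR'ing a single page."""
    return total_usd / pages


def implied_api_cost(total_usd: float, factor: float) -> float:
    """Total cost implied for an API that is `factor` times more expensive."""
    return total_usd * factor


per_page = cost_per_page(TOTAL_COST_USD, PAGES)
per_thousand = per_page * 1000
low, high = (implied_api_cost(TOTAL_COST_USD, f) for f in CHEAPER_FACTORS)

print(f"Open-source run: ${per_thousand:.2f} per 1,000 pages")
print(f"Implied proprietary cost for the same job: ${low:,.0f} to ${high:,.0f}")
```

In other words, the run works out to about $2.23 per 1,000 pages, and the quoted multiplier implies roughly $2,000 to $6,000 for the same workload on the proprietary side.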