Production RAG: what I learned from processing 5M+ documents
blog.abdellatif.io

October 20, 2025 • 3 min read

I’ve spent the past 8 months in the RAG trenches, and I want to share what actually worked vs. what wasted our time. We built RAG for Usul AI (9M pages) and an unnamed legal AI enterprise (4M pages).

LangChain + LlamaIndex

We started out with YouTube tutorials: first LangChain, then LlamaIndex. We got to a working prototype in a couple of days and were optimistic about the progress. We ran tests on a subset of the data (100 documents) and the results looked great. We spent the next few days running the pipeline on the production dataset and got everything working in a week. Incredible.

Except it wasn’t: the results were subpar, and only the end users could tell. We spent the following few months rewriting pieces of the system, one at a time, until the perfo…
