Over the weekend, I was scrolling through Twitter to see what was happening in the AI community, and saw that MIT had just released a paper addressing a significant issue with large language models.
It sounds very academic, but here's the simple version: if you have the AI take a second pass over the task, the results can be remarkable.
Over the past two years, almost all mainstream large models have been racing to expand their context windows. Gemini has pushed its window into the millions of tokens, the GPT series keeps growing its own limits, and Llama has even announced a goal of tens of millions of tokens.
On the surface, this is an arms race over who can fit the most text. But the problem is that a larger context window does not mean the model can actually read in and remember all of that content.
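To see why the window size matters at all, here is a minimal sketch of the hard limit every model has. It uses a naive whitespace "tokenizer" purely for illustration (real models use BPE-style tokenizers, and the function name `fit_to_window` is my own):

```python
def fit_to_window(text, max_tokens=8):
    """Naive whitespace 'tokenizer': keep only the first max_tokens words.
    Real tokenizers differ, but the failure mode is the same: anything
    past the window is simply never seen by the model."""
    tokens = text.split()
    dropped = max(0, len(tokens) - max_tokens)
    return " ".join(tokens[:max_tokens]), dropped

prompt = "one two three four five six seven eight nine ten"
kept, dropped = fit_to_window(prompt, max_tokens=8)
# everything after the 8th token is silently discarded
```

And, as the paragraph above notes, even content that *does* fit inside the window is not guaranteed to be used reliably; fitting is necessary but not sufficient.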
Another popular approach is Retrieval-Augmented Generation (RAG), which first segments long documents into chunks and stores them in a vector database, then retrieves relevant segments based on the question and feeds them to the model.
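The RAG pipeline described above can be sketched in a few lines. This is a toy version under loud assumptions: chunking is fixed-size word windows, and the "embedding" is a bag-of-words counter with cosine similarity standing in for a neural encoder and a vector database:

```python
from collections import Counter
import math

def chunk(text, size=10):
    """Split a document into fixed-size word chunks (a stand-in for
    smarter sentence- or paragraph-aware chunking)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy bag-of-words 'embedding'; real systems use a neural encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

doc = ("The context window limits how much text a model sees at once. "
       "Retrieval-augmented generation stores chunks in a vector database. "
       "At query time the most relevant chunks are fed to the model.")
top = retrieve("How does retrieval-augmented generation work?", chunk(doc), k=1)
```

The retrieved chunks, not the whole document, are what get pasted into the model's prompt, which is exactly why the answer quality is capped by the retrieval step.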
This avoids having the model consume the entire long document at once, but its effectiveness depends heavily on retrieval quality, and it often struggles with questions that…