From Claude 100K to Gemini 10M, we are in the era of long-context large language models (LLMs).
Retrieval-Augmented Generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources. [^1] [^2]
The hype surrounding RAG is largely driven by its potential to address some of the limitations of language models by enabling the development of more accurate, contextually grounded, and creative language generation systems. However, it's essential to note that RAG is still a relatively new and evolving field, and there are several challenges and limitations that need to be addressed before it can be widely adopted. For example, RAG requires large amounts of high-quality training data, and it can be challenging to integrate the retrieval and generation components in a way that produces coherent and natural-sounding language.
LLMs such as Claude 100K and Gemini 10M support long contexts spanning tens