Don’t do “RAG” because now there is “CAG”
Attention, data scientists: CAG could be the new successor to RAG!

In recent years, Large Language Models (LLMs) have made great progress on knowledge-intensive tasks. The traditional Retrieval-Augmented Generation (RAG) approach produces better answers by feeding LLMs external knowledge sources at query time.
However, it is limited by retrieval latency and by errors in selecting the right passages.
This is where a new paradigm, Cache-Augmented Generation (CAG), comes in. CAG relies on models with long context windows to preload all the relevant information, so the retrieval step is eliminated entirely.
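To make the difference concrete, here is a minimal sketch of the two call patterns in Python. The `retriever` and `llm` objects and their methods are hypothetical placeholders used only for illustration, not a real library API:

```python
# Hypothetical objects, used only to contrast the two pipelines.

def answer_with_rag(query, retriever, llm):
    # RAG: retrieval happens at query time, so retrieval latency and
    # selection errors enter the pipeline right here.
    passages = retriever.search(query, top_k=5)
    context = "\n".join(passages)
    return llm.generate(f"{context}\n\nQuestion: {query}\nAnswer:")

def preload_knowledge(llm, documents):
    # CAG: done once, offline. The entire knowledge base is placed in
    # the model's context window (in practice, a precomputed KV cache).
    llm.set_context("\n".join(documents))

def answer_with_cag(query, llm):
    # At query time there is no retrieval step at all.
    return llm.generate(f"Question: {query}\nAnswer:")
```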
Advantages of CAG:
- Faster response times
- Lower risk of retrieval errors
- A simpler architecture
CAG Theory
LLMs have a fixed context window, which determines the maximum amount of information the model can attend to at once. CAG loads all the necessary knowledge into this context window in advance, so the model never needs to dynamically fetch information from a separate source at query time.
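Here is a minimal sketch of how this preloading can look with Hugging Face transformers; the model name and knowledge file below are placeholder assumptions, and any long-context causal LM follows the same pattern. The knowledge base is run through the model once, its key-value cache is kept, and each query is answered by appending new tokens after the cached prefix:

```python
# Hedged sketch of KV-cache preloading. Model name and knowledge file
# are placeholders; swap in any long-context causal LM you have access to.
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-7B-Instruct"  # placeholder long-context model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# 1) Preload: one forward pass over the whole knowledge base, keeping the
#    resulting KV cache. This replaces building a retrieval index.
knowledge = open("knowledge_base.txt").read()  # placeholder file
knowledge_inputs = tokenizer(knowledge, return_tensors="pt").to(model.device)
with torch.no_grad():
    kv_cache = model(**knowledge_inputs, use_cache=True).past_key_values

# 2) Query: pass knowledge + question together with the cache; generate()
#    only processes the new question tokens, since the prefix is cached.
question = "\nQuestion: What does CAG stand for?\nAnswer:"
full_inputs = tokenizer(knowledge + question, return_tensors="pt").to(model.device)
outputs = model.generate(
    **full_inputs,
    past_key_values=copy.deepcopy(kv_cache),  # copy, so the cache is reusable
    max_new_tokens=64,
)
answer = tokenizer.decode(
    outputs[0, full_inputs.input_ids.shape[-1]:], skip_special_tokens=True
)
print(answer)
```

The cache is deep-copied per query because generation mutates it; the preloaded state can then be reused for every subsequent question. The one-time forward pass replaces the vector index of a RAG system, and the trade-off is that the entire knowledge base must fit inside the model's context window.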