
Don’t do “RAG” because now there is “CAG”

Attention, data scientists: the new successor to RAG could be CAG!

Berika Varol Malkoçoğlu
Towards AI
5 min read · Jan 13, 2025


In recent years, Large Language Models (LLMs) have made great progress on knowledge-intensive tasks. The traditional Retrieval-Augmented Generation (RAG) approach produces better answers by feeding the model external knowledge sources retrieved at query time.

However, this method is limited by retrieval latency and by errors in selecting the right documents.
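To make the contrast concrete, here is a minimal sketch of the traditional RAG loop. Everything in it is a placeholder, not a specific library's API: real systems use embedding-based vector search, while a naive word-overlap ranker stands in for the retriever here, and `llm` is any text-in/text-out callable.

```python
# Minimal RAG sketch: retrieve per query, then generate.
# The retriever is a naive word-overlap ranker standing in for
# embedding/vector search; `llm` is any text-in/text-out callable.
def rag_answer(query: str, corpus: list[str], llm, top_k: int = 3) -> str:
    query_words = set(query.lower().split())
    # Retrieval step: rank documents by overlap with the query.
    # This per-query step is exactly where latency and document-selection
    # errors creep in.
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    context = "\n\n".join(ranked[:top_k])
    # Generation step: the model only sees the top-k retrieved documents.
    return llm(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
```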

This is where a new paradigm, Cache-Augmented Generation (CAG), comes in. CAG lets models with long context windows preload all of the relevant information, eliminating the retrieval step entirely.

Advantages of CAG:

  • Faster response times
  • A lower risk of retrieval errors
  • A simpler architecture

CAG Theory

LLMs have a fixed context window, which determines the maximum amount of information the model can attend to at once. CAG loads all of the necessary information into the context window in advance, so the model never needs to dynamically fetch information from a separate source while answering a query.
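As a rough illustration, here is a minimal sketch of this preloading idea, assuming the Hugging Face transformers library; the model name, documents, and question are placeholders. The key point is that the knowledge base is encoded once into the model's key/value (KV) cache, and each query reuses that cache instead of triggering retrieval.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; in practice CAG needs a long-context model so the
# whole knowledge base fits inside the context window.
model_name = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# 1. Preload: concatenate all documents into one prefix and run a single
#    forward pass to build the key/value (KV) cache for that prefix.
documents = ["Doc 1: ...", "Doc 2: ..."]  # placeholder knowledge base
prefix = "Answer using only these documents.\n\n" + "\n\n".join(documents) + "\n\n"
prefix_inputs = tokenizer(prefix, return_tensors="pt")
with torch.no_grad():
    kv_cache = model(**prefix_inputs, use_cache=True).past_key_values

# 2. Query: append only the question. The cached prefix is not re-encoded,
#    and no retrieval step runs at query time. (Note: generation mutates
#    the cache, so serving many queries would mean copying it, or
#    truncating it back to the prefix length between queries.)
question = "Question: How does CAG differ from RAG?\nAnswer:"
inputs = tokenizer(prefix + question, return_tensors="pt")
outputs = model.generate(**inputs, past_key_values=kv_cache, max_new_tokens=64)
print(tokenizer.decode(outputs[0, inputs.input_ids.shape[-1]:],
                       skip_special_tokens=True))
```

The preload pass pays the cost of encoding the documents exactly once; every subsequent query only pays for its own question and answer tokens, which is where CAG's speed advantage comes from.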

