Retrieval-Augmented Generation

Also known as: RAG, retrieval augmented generation, retrieval-augmented generation

Giving a language model the right documents at answer time so it reasons over real, current sources instead of only its trained-in memory.

RAG retrieves relevant documents at query time and feeds them to a language model as context, so the answer is grounded in real, current sources rather than whatever the model happened to memorise in training. I architected a multi-modal RAG platform at North AI on AWS Bedrock, Aurora, and pgvector.

It’s the most practical lever I know for trust. A model left to its own memory confabulates fluently; the same model handed the right passage cites something you can check. The hard part is retrieval quality: bad context produces confident nonsense just as easily.
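
A minimal sketch of the retrieve-then-ground flow, to make the moving parts concrete. It is not the North AI platform: the word-overlap cosine score stands in for embedding search (the role pgvector would play), and the function names, sample documents, and prompt wording are all hypothetical. The model call itself is left out; the sketch stops at the prompt a model would receive.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents that share vocabulary with the query, best first.

    Toy stand-in for vector search: a real system would compare embeddings.
    Documents with zero overlap are dropped rather than passed on as context.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Assemble the grounded prompt: numbered sources first, then the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return (
        "Answer using only the numbered sources below and cite them.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical corpus, for illustration only.
DOCS = [
    "pgvector adds a vector column type and similarity search to PostgreSQL.",
    "Aurora is a managed relational database service on AWS.",
    "Sourdough bread needs a long, slow fermentation.",
]

if __name__ == "__main__":
    question = "What does pgvector add to PostgreSQL?"
    print(build_prompt(question, retrieve(question, DOCS)))
```

Numbering the sources in the prompt is what lets the answer cite something checkable. Dropping zero-score documents is one small guard on retrieval quality: an unrelated passage never reaches the model as if it were evidence.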