Develop a RAG (Retrieval-Augmented Generation) System Using Vercel and Next.js

4 min read · Jul 20, 2024

Baptiste Adrien described on X how he is building a RAG (Retrieval-Augmented Generation) system with @vercel and @nextjs, and shared the entire development process.

His walkthrough explains the basic principles and building blocks of RAG in a detailed yet intuitive way. The ten steps are summarized below.

1. Documents

The first step in developing a RAG system is to prepare the documents. These documents are the source data that the system will draw on when answering questions.

2. Text Extraction

Next, text is extracted from each document. When a document is a scan or an image rather than digital text, an OCR (Optical Character Recognition) model is used to recover the text.

3. Text Chunking

Break the extracted text into smaller, more manageable parts. This chunking helps make subsequent processing and analysis more efficient.
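As a rough illustration of this step, here is a minimal sketch of fixed-size chunking with overlap. The function name chunkText and the size and overlap defaults are assumptions made for this example, not values from the original thread. Other strategies, such as splitting on sentences or paragraphs, are just as valid.

```typescript
// Minimal sketch: split text into overlapping, fixed-size character chunks.
// `chunkText`, `size`, and `overlap` are hypothetical names and defaults.
function chunkText(text: string, size = 500, overlap = 50): string[] {
  if (size <= 0) throw new Error("size must be positive");
  if (overlap < 0 || overlap >= size) {
    throw new Error("overlap must be in the range [0, size)");
  }

  // Collapse runs of whitespace so chunk lengths are predictable.
  const normalized = text.replace(/\s+/g, " ").trim();
  const chunks: string[] = [];
  const step = size - overlap; // how far the window advances each time

  for (let start = 0; start < normalized.length; start += step) {
    chunks.push(normalized.slice(start, start + size));
    // Stop once the current window has reached the end of the text.
    if (start + size >= normalized.length) break;
  }
  return chunks;
}
```

The overlap repeats a little text between neighbouring chunks, so a sentence that straddles a boundary is less likely to lose its context.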

4. Embedding Model

Each chunk of text is converted into a vector through an embedding model. These vectors are numerical representations that capture the semantic meaning of the text.
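To make "text in, vector out" concrete, here is a toy stand-in for an embedding model. It is explicitly not a real embedding model: it only hashes words into a fixed-length, normalized count vector, so it captures word overlap rather than meaning. A real system would call a trained model (for example one from OpenAI, which is part of the stack listed later in this article). The name toyEmbed and the 64-dimension default are assumptions for this sketch.

```typescript
// Toy stand-in for an embedding model (hypothetical helper, for illustration only).
// It maps text to a fixed-length vector by hashing each word into a bucket.
// Unlike a trained model, it does NOT capture semantic meaning.
function toyEmbed(text: string, dimensions = 64): number[] {
  const vector: number[] = new Array<number>(dimensions).fill(0);
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];

  for (const word of words) {
    // FNV-1a style 32-bit hash of the word.
    let hash = 2166136261;
    for (let i = 0; i < word.length; i++) {
      hash ^= word.charCodeAt(i);
      hash = Math.imul(hash, 16777619);
    }
    const index = (hash >>> 0) % dimensions;
    vector[index] = (vector[index] ?? 0) + 1;
  }

  // L2-normalize so every non-empty text maps to a unit-length vector.
  const norm = Math.hypot(...vector);
  return norm === 0 ? vector : vector.map((v) => v / norm);
}
```

The property that matters for the later steps is that the same function, with the same number of dimensions, is applied to both the document chunks and the user's question.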

5. Vector Storage

The generated vectors are stored in a vector database. This database enables the system to efficiently retrieve related information based on semantic similarity.

6. User Question

The user enters a question through the system. This question will be used to retrieve the most relevant information from the vector database.

7. Question Embedding

The user's question is embedded with the same embedding model that was used for the documents, so that the question and the text chunks lie in the same vector space and can be compared directly.

8. Vector Matching

The system matches the embedded questions with the vectors in the database based on similarity and retrieves the most similar chunks of text.
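The matching step can be sketched with cosine similarity, a common choice of similarity measure for embeddings. The thread does not say which measure is used, so treat that choice as an assumption, along with the names StoredChunk, cosineSimilarity, and topK. A vector database performs the same ranking at scale, usually with an index instead of the full scan shown here.

```typescript
// Hypothetical record shape: a chunk of text plus its embedding vector.
interface StoredChunk {
  content: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction; treat it as not similar to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Score every stored chunk against the query and return the k best matches.
function topK(
  query: number[],
  store: StoredChunk[],
  k = 3,
): { content: string; score: number }[] {
  return store
    .map((chunk) => ({
      content: chunk.content,
      score: cosineSimilarity(query, chunk.embedding),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```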

9. Information Processing

The system selects the most relevant chunks based on their similarity scores. These chunks are then passed to the LLM (Large Language Model) as context, together with the user's question, and the LLM uses them to generate a detailed answer.
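One way to hand the retrieved text to the LLM is to assemble it into a prompt, as in the sketch below. The prompt wording, the buildPrompt name, and the 0.5 score cutoff are illustrative assumptions, not details from the thread. The call to the LLM itself is left out because it depends on the provider.

```typescript
// Hypothetical shape of a retrieval result: chunk text plus similarity score.
interface RetrievedChunk {
  content: string;
  score: number;
}

// Build a prompt that grounds the model in the retrieved chunks.
// `minScore` is an illustrative cutoff for dropping weak matches.
function buildPrompt(
  question: string,
  chunks: RetrievedChunk[],
  minScore = 0.5,
): string {
  const context = chunks
    .filter((chunk) => chunk.score >= minScore)
    .map((chunk, i) => `[${i + 1}] ${chunk.content}`)
    .join("\n");

  return [
    "Answer the question using only the context below.",
    'If the context does not contain the answer, say "I don\'t know."',
    "",
    "Context:",
    context || "(no relevant context found)",
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Telling the model to rely only on the supplied context, and to admit when the context is insufficient, is a common way to reduce made-up answers.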

10. Final Answer

The final answer is presented to the user. Because it is generated from the most relevant passages in the retrieved documents, it is more likely to be accurate and relevant than an answer drawn from the model's training data alone.

Vercel RAG Chatbot Guide

This guide details how to build a chatbot application based on Retrieval-Augmented Generation (RAG). RAG improves the output of a Large Language Model (LLM) by supplying it with specific information related to the prompt. The key points include:

Basic concept of RAG: RAG is the process of providing an LLM with specific contextual information so that it can generate better answers.
Importance: An LLM on its own can only answer from its training data. RAG addresses this by retrieving relevant information and passing it to the model as context, so that it can give accurate answers.
Embedding and vector databases: Words, phrases, or images are represented as vectors, and semantic search is implemented by calculating the similarity between those vectors.
Content chunking: Source material is broken down into smaller parts before embedding, which improves the quality of the embeddings. The chunks and their embeddings are then stored in a database.
Project setup: The project uses Next.js 14, the Vercel AI SDK, OpenAI, Drizzle ORM, Postgres with pgvector, shadcn-ui, and TailwindCSS.

Details: https://sdk.vercel.ai/docs/guides/rag-chatbot