Build a RAG Chatbot From Your Company Documents

November 12, 2024

In this cookbook, I’ll walk you through how to build a RAG Chatbot using Ragie and Vercel’s Next.js AI SDK. The chatbot retrieves information from your documents in real time and answers users in a Q&A format.

Click on this link to deploy the template to Vercel instantly. If you don’t use Vercel, clone the GitHub repository and follow the steps below:

Prerequisites

Before starting, ensure Node.js and Next.js are installed on your machine.

Setup

Start by creating a new directory for the Next.js TypeScript project and moving into it:

 mkdir ragie-chatbot
 cd ragie-chatbot

Next, install the required dependencies:

 npm install 

If you run into an error, add the legacy-peer-deps flag and retry the installation:

 npm install --legacy-peer-deps

Create a .env file

Create a .env file in the root of your project and add the following environment variables:

    # Instructions on how to get a Ragie API Key: https://secure.ragie.ai/api-keys
    RAGIE_API_KEY=""

    # Instructions on how to get an Open AI API Key: https://platform.openai.com/api-keys
    OPENAI_API_KEY=""

    # Generate a random secret: https://generate-secret.vercel.app/32 or `openssl rand -base64 32`
    AUTH_SECRET=""

    # Instructions to create a database here: https://vercel.com/docs/storage/vercel-postgres/quickstart
    POSTGRES_URL=""
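
As a minimal sketch, assuming you want the app to fail fast when one of these variables is missing, the four names above could be checked once at startup. The loadEnv helper is hypothetical and not part of the template:

```typescript
// Names of the variables listed in the .env file above.
const REQUIRED_VARS = [
  "RAGIE_API_KEY",
  "OPENAI_API_KEY",
  "AUTH_SECRET",
  "POSTGRES_URL",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];
type AppEnv = Record<RequiredVar, string>;

// Hypothetical helper (not part of the template): returns the validated
// values, or throws an error naming every missing or empty variable.
function loadEnv(source: Record<string, string | undefined>): AppEnv {
  const missing = REQUIRED_VARS.filter((name) => !source[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const env = {} as AppEnv;
  for (const name of REQUIRED_VARS) {
    env[name] = source[name] as string;
  }
  return env;
}
```

In a server-side module you could call loadEnv(process.env) once and import the result wherever the keys are needed, so a forgotten key surfaces as one clear error instead of a failed API call later.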

Connect your Data Source on Ragie

For this tutorial, we’re uploading PDFs to Google Drive, which you can then connect to Ragie for easy integration. Once connected, Ragie will index the PDFs, making them accessible to your RAG Chatbot.

Connect your Google Drive via this link: https://secure.ragie.ai/connectors


When the import mode is set to hi_res, images and tables are extracted from the document along with the text. Fast mode extracts only text, and it can be up to 20x faster than hi_res mode.
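
If you upload files through Ragie’s API instead of a connector, the import mode is chosen per document. The sketch below assumes that the create-document endpoint lives at https://api.ragie.ai/documents and accepts a multipart form with file and mode fields plus a Bearer token; treat those details as assumptions and check them against the Ragie API reference. The helper names are hypothetical:

```typescript
type ImportMode = "hi_res" | "fast";

// Assumed endpoint; verify against the Ragie API reference before use.
const RAGIE_DOCUMENTS_URL = "https://api.ragie.ai/documents";

// Hypothetical helper: builds the multipart body for one document.
// "fast" extracts text only; "hi_res" also extracts images and tables.
function buildUploadForm(file: Blob, fileName: string, mode: ImportMode): FormData {
  const form = new FormData();
  form.append("file", file, fileName);
  form.append("mode", mode);
  return form;
}

// Hypothetical helper: sends the form to Ragie. It is only defined here,
// not called, so the sketch itself makes no network request.
async function uploadDocument(apiKey: string, form: FormData): Promise<unknown> {
  const response = await fetch(RAGIE_DOCUMENTS_URL, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response.json();
}
```

A text-only handbook could be sent with buildUploadForm(file, "handbook.pdf", "fast"), while a document that relies on tables or diagrams is the case where hi_res is worth the slower import.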

In this cookbook, we want our RAG Chatbot to be an HR/People Ops AI assistant that helps onboard new employees. We’ll use PostHog’s handbook as the data source. The handbook is ingested into Ragie as a PDF via Google Drive, which lets the chatbot answer questions about company policies and procedures based on its content.

Here’s a link to download the PDF file: https://github.com/Dphenomenal101/rag-chatbot/blob/main/PostHog-Handbook.pdf

Adjust the System Prompt Based on your Use Case

The system prompt can be found in the app/(chat)/api/chat/route.ts file. Edit the instructions based on your use case.

For this tutorial, let’s use this prompt:

    `You are an internal AI assistant, “Ragie AI”, designed to answer questions about Working at PostHog. Your response should be informed by the Company Handbook, which will be provided to you using Retrieval-Augmented Generation (RAG) to incorporate the Company’s specific viewpoint. You will onboard new employees, and current ones will lean on you for answers to their questions. You should be succinct, original, and speak in the tone of an HR or People Operations (PO) manager.

    When asked a question, keep your responses short, clear, and concise. Ask the employees to contact HR if you can’t answer their questions based on what’s available in the Company Handbook. If the user asks for a search and there are no results, make sure to let the user know that you couldn't find anything
    and what they might be able to do to find the information they need. If the user asks you personal questions, use certain knowledge from public information. Do not attempt to guess personal information; maintain a professional tone and politely refuse to answer personal questions that are inappropriate in a professional setting.

    Be friendly to chat about random topics, like the best ergonomic chair for a home-office setup or helping an engineer generate or debug their code. NEVER mention that you couldn't find information in the company handbook.


    Here are relevant chunks from PostHog’s Handbook that you can use to respond to the user. Remember to incorporate these insights into your responses. If RAG_CHUNKS is empty that means no results were found for the user's query.
    ==== START RAG_CHUNKS ====
    ${chunkText}
    ====END RAG_CHUNKS====


    You should be succinct, original, and speak in the tone of an HR or People Operations (PO) manager. Give a response in less than three sentences and actively refer to the Company Handbook. Do not use the word "delve" and try to sound as professional as possible.
    Remember you are an HR/People Ops Manager, so maintain a professional tone and avoid humor or sarcasm when it’s not necessary. You are here to provide serious answers and insights. Do not entertain or engage in personal conversations. NEVER mention "according to our handbook" in your response.

    IMPORTANT RULES:
    • Be concise
    • Keep response to FIVE sentences max
    • USE correct English
    • REFUSE to sing songs
    • REFUSE to tell jokes
    • REFUSE to write poetry
    • DECLINE responding to nonsense messages
    • NEVER refuse to answer questions about the leadership team
    • You are an HR Manager, speak in the first person`;
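
The ${chunkText} placeholder in the prompt is filled with text retrieved for the user’s question. As a rough sketch, assuming each retrieved chunk carries a text and a score field (an assumption about the retrieval response, not something confirmed by the template), the chunks could be ranked, joined, and wrapped in the RAG_CHUNKS markers as follows. The helper names are hypothetical:

```typescript
// Shape of one retrieved chunk. The `text` and `score` field names are an
// assumption about the retrieval response, not taken from the template.
interface ScoredChunk {
  text: string;
  score: number;
}

// Hypothetical helper: keeps the highest-scoring chunks and joins their
// text into the string that replaces ${chunkText} in the system prompt.
function buildChunkText(chunks: ScoredChunk[], maxChunks = 5): string {
  return [...chunks]
    .sort((a, b) => b.score - a.score) // best match first
    .slice(0, maxChunks)
    .map((chunk) => chunk.text.trim())
    .filter((text) => text.length > 0)
    .join("\n\n");
}

// Hypothetical helper: wraps the joined text in the same markers the
// prompt uses. An empty string between the markers signals "no results".
function wrapChunks(chunkText: string): string {
  return [
    "==== START RAG_CHUNKS ====",
    chunkText,
    "====END RAG_CHUNKS====",
  ].join("\n");
}
```

Capping the number of chunks keeps the prompt short, and passing an empty list yields an empty RAG_CHUNKS section, which is the case the prompt tells the model to treat as no results found.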

If you’re building something different, here’s a list of system prompts that you can tweak to fit your use case: https://github.com/mustvlad/ChatGPT-System-Prompts

Running the code

Run the code with:

 npm run dev

Testing the RAG Chatbot

Let’s ask our RAG Chatbot a question about working at PostHog.


It worked! We have successfully created our own RAG Chatbot using Ragie and the Next.js AI SDK by Vercel. This cookbook ships with GPT-4o as the default LLM, so feel free to experiment with different models and prompts until you find the combination that works best for your use case.