Retrieval-Augmented Generation (RAG)

Ingested Files

No files ingested yet.

You can add files to the rag_input folder on the server and process them for RAG.
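When a file is processed, its extracted text is typically split into overlapping chunks before embedding, so each piece fits the embedding model's input window without losing context at the boundaries. A minimal sketch of that chunking step in plain JavaScript (chunkText and its size/overlap parameters are illustrative, not the names used in the actual source):

```javascript
// Split text into overlapping chunks so each piece fits the
// embedding model's input window while preserving continuity
// across chunk boundaries.
function chunkText(text, chunkSize = 500, overlap = 100) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    start += chunkSize - overlap;
  }
  return chunks;
}

// Example: a 1200-character document yields three overlapping chunks.
const doc = "x".repeat(1200);
const chunks = chunkText(doc);
console.log(chunks.length);    // 3
console.log(chunks[0].length); // 500
```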

Ask a Question

Try asking a question about information in the ingested RAG documents that was not part of the LLM's training data.

Example Questions:
  • How much did Amazon make in 2024?
  • How much debt did Amazon have at the end of 2024?
  • How much was Microsoft's advertising expense in fiscal year 2024?
  • What is the total amount of stock Microsoft gave to its employees in fiscal year 2024?

Boilerplate RAG with Semantic Caching

You can start with this base RAG implementation and extend it for your project, focusing on your business logic rather than boilerplate code. This implementation uses:

  • LangChain.js 🦜️🔗 for the pipeline and integrations
  • BAAI/bge-small-en-v1.5 embedding model through the Hugging Face 🤗 Inference API
  • llama-3.3-70b-versatile 🦙 hosted by Groq as the LLM
  • PDF.js for extracting text from PDF files
  • MongoDB Atlas Vector Search as the vector DB for RAG document storage and semantic caching of LLM responses
  • MongoDB as the key-value store for embedding caching
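The retrieve-then-generate flow these pieces implement can be sketched in plain JavaScript, with the embedding model and vector store reduced to in-memory stand-ins (cosine, buildRagPrompt, and the toy 2-D "embeddings" below are illustrative; the real pipeline in this project goes through LangChain.js, the Hugging Face Inference API, and MongoDB Atlas Vector Search):

```javascript
// Cosine similarity between two embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Retrieve the top-k stored chunks most similar to the query
// embedding, then build an augmented prompt for the LLM.
function buildRagPrompt(queryEmbedding, store, k = 2) {
  const ranked = [...store].sort(
    (x, y) =>
      cosine(queryEmbedding, y.embedding) -
      cosine(queryEmbedding, x.embedding)
  ).slice(0, k);
  const context = ranked.map((c) => c.text).join("\n---\n");
  return `Answer using only this context:\n${context}\n\nQuestion:`;
}

// Toy store with 2-D "embeddings" for illustration only.
const store = [
  { text: "Amazon 2024 revenue figures", embedding: [1, 0] },
  { text: "Microsoft advertising expense", embedding: [0, 1] },
  { text: "Unrelated note", embedding: [-1, 0] },
];
console.log(buildRagPrompt([0.9, 0.1], store, 1));
```

In the real pipeline the retrieval step is a vector-search query against MongoDB Atlas rather than an in-memory sort, but the shape of the flow is the same: embed the question, fetch the nearest chunks, and prepend them to the prompt sent to the LLM.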



The commented source code, including helper functions, is in controllers/ai.js.
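The semantic caching mentioned in the heading above works by comparing the embedding of an incoming question against embeddings of previously answered questions, and returning the cached answer when the similarity clears a threshold, skipping the LLM call entirely. A minimal in-memory sketch of the idea (the real implementation stores the cache in MongoDB Atlas Vector Search; cacheLookup and the 0.95 threshold are illustrative):

```javascript
// Cosine similarity between two embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return a cached answer if a previously seen query is close enough
// in embedding space; otherwise return null (a miss, so the caller
// would invoke the LLM and then add the new entry to the cache).
function cacheLookup(queryEmbedding, cache, threshold = 0.95) {
  for (const entry of cache) {
    if (cosine(queryEmbedding, entry.embedding) >= threshold) {
      return entry.answer; // semantic hit: paraphrased query, same answer
    }
  }
  return null; // miss
}

const cache = [{ embedding: [1, 0], answer: "Cached answer" }];
console.log(cacheLookup([0.99, 0.05], cache)); // near-identical query: hit
console.log(cacheLookup([0, 1], cache));       // unrelated query: null
```

Because the match is on embedding similarity rather than exact string equality, a rephrased question ("What was Amazon's 2024 revenue?" vs. "How much did Amazon make in 2024?") can still hit the cache.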

RAG Demo Block Diagram