Leveraging MongoDB Atlas Vector Search for Semantic Search and RAG

Sahitya Arya Last Updated : 23 May, 2024
7 min read

Introduction

Did you know that MongoDB Atlas now provides powerful vector search capabilities? Yes, it now lets you perform semantic search on your data and implement retrieval-augmented generation (RAG) for large language model (LLM) applications. By integrating Atlas Vector Search with popular frameworks like LangChain, LlamaIndex, and client libraries, you can easily build advanced natural language processing (NLP) solutions. In this article, we will see how to leverage MongoDB Atlas Vector Search for semantic search and RAG.

Using MongoDB Atlas Vector Search for Semantic Search and RAG

Learning Objectives

  • Understand the concept of vector search and its applications in natural language processing and information retrieval.
  • Learn how to integrate MongoDB Atlas Vector Search with the LangChain framework for building retrieval-augmented generation (RAG) applications.
  • Develop the ability to construct RAG chains that combine vector search retrieval, prompt templates, and LLMs to generate context-aware responses.
  • Appreciate the benefits of using MongoDB Atlas Vector Search as a vector store, including efficiency, consistency, scalability, and simplicity.
  • Explore the flexibility and extensibility of LangChain by learning about customization options for retrieval processes, LLM providers, and more.

What is Vector Search?

Vector search, also known as semantic search, is a technique that goes beyond traditional keyword-based searching. It employs machine learning models to transform data like text, audio, or images into high-dimensional vector representations called embeddings. These embeddings capture the semantic meaning of the data, allowing you to find similar content based on proximity in the vector space, even when the exact words don’t match.

The core benefit of vector search is its ability to understand the intent and context behind queries, making it incredibly useful for various applications, including search engines, recommendation systems, and language models.

MongoDB Atlas, the fully managed cloud database service, now supports vector search natively. By storing vector embeddings alongside your data in MongoDB, you can perform efficient semantic searches without the need for a separate vector store, ensuring data consistency and simplifying your application architecture.

The process typically involves:

  1. Loading your data into a MongoDB Atlas cluster.
  2. Generating vector embeddings for your data using pre-trained models like OpenAI’s text-embedding-ada-002.
  3. Storing the embeddings alongside your data in MongoDB.
  4. Creating an Atlas Vector Search index on the embedded fields.
  5. Running vector search queries using Atlas’s powerful $vectorSearch aggregation pipeline stage.
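To make step 5 concrete, here is a minimal sketch of a raw $vectorSearch query issued with PyMongo and the OpenAI Python client. The connection string, database, collection, field, and index names are placeholder assumptions, and the embedding model must be the same one used when the data was ingested.

from openai import OpenAI
from pymongo import MongoClient

# Placeholder connection details -- substitute your own cluster and namespace
client = MongoClient("<your-atlas-connection-string>")
collection = client["langchain_db"]["documents"]

# Embed the query with the same model used to embed the stored documents
openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
query_vector = openai_client.embeddings.create(
    model="text-embedding-ada-002",
    input="How can I secure my MongoDB Atlas cluster?",
).data[0].embedding

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",       # name of the Atlas Vector Search index
            "path": "embedding",           # field that stores the vectors
            "queryVector": query_vector,
            "numCandidates": 100,          # candidates considered before final ranking
            "limit": 5,                    # number of results to return
        }
    },
    {"$project": {"text": 1, "score": {"$meta": "vectorSearchScore"}}},
]

for doc in collection.aggregate(pipeline):
    print(doc["score"], doc.get("text"))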

Prerequisites: To integrate Atlas Vector Search with LangChain, you need an Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later, an OpenAI API key (or an alternative LLM provider), and a Python environment to run your project.

LangChain Integration

LangChain is an open-source framework written in Python that aims to simplify the development of applications powered by LLMs. It provides a modular and extensible architecture, allowing developers to build complex workflows by combining reusable components called “chains.”

One of the key features of LangChain is its support for retrieval-augmented generation (RAG), a technique that combines the power of LLMs with external data sources. By integrating MongoDB Atlas Vector Search with LangChain, developers can leverage MongoDB as a high-performance vector database, enabling efficient semantic search and RAG implementations.

MongoDB Atlas Vector Search integration using LangChain

The integration process typically involves the following steps:

Step 1: Set Up the Environment

  1. Install the required Python packages, including langchain, langchain-mongodb, langchain-openai, and pypdf (needed by the PDF loader used below).
  2. Define environment variables, such as your OpenAI API key and Atlas cluster connection string.
import os
import getpass
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
ATLAS_CONNECTION_STRING = getpass.getpass("MongoDB Atlas SRV Connection String:")

Step 2: Use Atlas as a Vector Store

  1. Connect to your Atlas cluster using the provided connection string.
  2. Load your data into Atlas, either by inserting documents directly or using LangChain’s built-in data loaders for various file formats (e.g., PDF, CSV, JSON).
  3. Split your data into smaller chunks or documents using LangChain’s text splitters.
  4. Instantiate Atlas as a vector store using the `MongoDBAtlasVectorSearch` class, specifying the collection and index name.
  5. Generate vector embeddings for your data using a pre-trained model like OpenAI’s text-embedding-ada-002.
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load data from a PDF
loader = PyPDFLoader("https://example.com/document.pdf")
data = loader.load()

# Split data into documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
docs = text_splitter.split_documents(data)

Now, we’re done with the first phase.

Step 3: Create the Atlas Vector Search Index

  1. Define the Atlas Vector Search index schema, specifying the vector field (e.g., “embedding”) and any additional filter fields.
  2. Create the index on your Atlas collection using the Atlas UI or the MongoDB Atlas Search API.
# Connect to Atlas and instantiate it as a vector store
from pymongo import MongoClient

client = MongoClient(ATLAS_CONNECTION_STRING)
atlas_collection = client["langchain_db"]["documents"]

# from_documents() embeds each chunk with the supplied model and stores
# the text, metadata, and embeddings together in the collection
vector_search = MongoDBAtlasVectorSearch.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    collection=atlas_collection,
    index_name="vector_index"
)
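The index itself can be created in the Atlas UI, through the Atlas Administration API, or programmatically from the driver. Below is a minimal programmatic sketch; it assumes PyMongo 4.7 or later (where SearchIndexModel accepts type="vectorSearch"), 1536-dimensional text-embedding-ada-002 embeddings, and the field and index names used above.

from pymongo.operations import SearchIndexModel

# Index definition: the "embedding" field holds 1536-dimensional vectors
# compared with cosine similarity
index_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": 1536,
                "similarity": "cosine",
            }
        ]
    },
    name="vector_index",
    type="vectorSearch",
)

atlas_collection.create_search_index(model=index_model)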

Step 4: Run Vector Search Queries

  1. Use LangChain’s `MongoDBAtlasVectorSearch.as_retriever` method to instantiate Atlas Vector Search as a retriever for semantic search.
  2. Perform various types of vector search queries, such as basic semantic search, search with relevance scores, or search with metadata filtering.
query = "MongoDB Atlas security"
results = vector_search.similarity_search(query)
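Beyond the basic call above, the vector store also exposes scored and filtered variants. A minimal sketch follows; note that the pre-filter field used here is an assumption and must be declared as a filter field in the index definition.

# Semantic search that also returns relevance scores
for doc, score in vector_search.similarity_search_with_score(query, k=3):
    print(score, doc.page_content[:80])

# Semantic search restricted by a metadata pre-filter
filtered_results = vector_search.similarity_search(
    query,
    k=3,
    pre_filter={"page": {"$eq": 1}},  # assumes "page" is indexed as a filter field
)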

This marks the completion of the second phase.

Step 5: Implement RAG

  1. Define a prompt template that instructs the LLM to use the retrieved documents as context for generating a response.
  2. Construct a RAG chain by combining the Atlas Vector Search retriever, the prompt template, and an LLM like OpenAI’s ChatGPT.
  3. Prompt the RAG chain with your query, and it will retrieve relevant documents from Atlas, pass them to the LLM, and generate a context-aware response.
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

# Define the prompt template
template = """Use the following context to answer the question:
Context: {context}
Question: {question}"""
prompt = PromptTemplate(template=template, input_variables=["context", "question"])

# Create the RAG chain (the custom prompt is passed via chain_type_kwargs)
rag = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=vector_search.as_retriever(),
    chain_type_kwargs={"prompt": prompt})

# Ask a question
query = "How can I secure my MongoDB Atlas cluster?"
result = rag({"query": query})
print(result['result'])

LangChain provides a high degree of flexibility and extensibility, allowing developers to customize the integration with Atlas Vector Search to suit their specific requirements. For example, you can fine-tune the retrieval process by adjusting parameters like the number of documents to retrieve, the relevance score threshold, or the similarity metric used for ranking.
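For instance, a minimal sketch of such tuning through the retriever's options is shown below. The k and score_threshold values are illustrative assumptions, and whether the score-threshold search type is available depends on the vector store implementation and LangChain version.

# Return up to 5 chunks, but only those above a relevance-score threshold
retriever = vector_search.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k": 5, "score_threshold": 0.75},
)
docs = retriever.invoke("How can I secure my MongoDB Atlas cluster?")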

While this integration focuses on MongoDB Atlas Vector Search, LangChain supports various vector databases and search engines, including Chroma, Weaviate, and Pinecone, among others. Additionally, LangChain supports various LLM providers, such as OpenAI, Anthropic, Cohere, and more, allowing you to leverage different language models for your RAG implementations easily.

By combining the power of LangChain’s modular architecture with MongoDB Atlas Vector Search’s efficient semantic search capabilities, developers can build sophisticated natural language processing applications that can understand context, retrieve relevant information, and generate informed responses, all while leveraging the scalability and consistency of MongoDB’s document database.

LlamaIndex Integration

LlamaIndex is another open-source framework designed to simplify the integration of custom data sources with LLMs. It provides tools for loading and preparing vector embeddings, enabling RAG implementations. By integrating Atlas Vector Search with LlamaIndex, you can use MongoDB as a vector store and retrieve semantically similar documents to augment your LLM’s knowledge.

MongoDB Atlas Vector Search integration using LlamaIndex

The process involves setting up your Atlas cluster, loading data into a LlamaIndex index, and storing the vector embeddings in MongoDB using the MongoDBAtlasVectorSearch vector store. You can then run semantic searches using LlamaIndex’s VectorIndexRetriever and leverage a query engine to generate context-aware responses based on the retrieved documents.
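A minimal sketch of that flow is shown below. It assumes the llama-index-vector-stores-mongodb integration package is installed and that LlamaIndex's default OpenAI embedding model is used; the exact constructor keyword names (for example the index-name argument) can differ between LlamaIndex releases, so treat this as an outline rather than copy-paste code.

from pymongo import MongoClient
from llama_index.core import VectorStoreIndex, StorageContext, SimpleDirectoryReader
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

mongo_client = MongoClient(ATLAS_CONNECTION_STRING)

# Point the vector store at your database, collection, and Atlas index
vector_store = MongoDBAtlasVectorSearch(
    mongo_client,
    db_name="llamaindex_db",
    collection_name="documents",
    vector_index_name="vector_index",   # keyword name may vary by LlamaIndex version
)

# Load local files (assumed to live in ./data), embed them, and store
# text plus embeddings in Atlas
documents = SimpleDirectoryReader("./data").load_data()
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Semantic retrieval plus answer generation over the retrieved context
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("How can I secure my MongoDB Atlas cluster?")
print(response)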

Client Library Integration

In addition to popular frameworks, you can also integrate Atlas Vector Search directly into your applications using MongoDB’s official client libraries. This approach involves generating vector embeddings for your data (e.g., using the OpenAI API), storing them in MongoDB, creating a vector search index, and running $vectorSearch queries from your application code.

For example, with the Node.js client library, you can set up an Atlas trigger that automatically generates embeddings for new documents using the OpenAI API; you can then create a vector search index and perform semantic searches with the $vectorSearch aggregation pipeline stage.
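On the Python side, the ingestion half of this approach might look like the following minimal sketch, assuming the openai package v1 or later and the database, collection, and connection string used earlier.

from openai import OpenAI
from pymongo import MongoClient

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
collection = MongoClient(ATLAS_CONNECTION_STRING)["langchain_db"]["documents"]

texts = [
    "Atlas supports network peering and IP access lists.",
    "Database users can be scoped to specific clusters.",
]

# Embed each text and store it next to the vector it was derived from
response = openai_client.embeddings.create(model="text-embedding-ada-002", input=texts)
collection.insert_many(
    {"text": text, "embedding": item.embedding}
    for text, item in zip(texts, response.data)
)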

Benefits of MongoDB Atlas Vector Search

Integrating vector search capabilities with MongoDB Atlas offers several key benefits:

  • Efficiency: By storing vectors alongside your data, you avoid the need to sync between your application database and a separate vector store. This improves performance and simplifies your architecture.
  • Consistency: Storing embeddings with the original data ensures that vectors are always associated with the correct data, even if the vector generation process changes over time.
  • Scalability: MongoDB Atlas provides horizontal and vertical scalability, allowing you to handle demanding vector search workloads seamlessly.
  • Simplicity: With a single database for your data and vector embeddings, you reduce the complexity of your application and potential points of failure.
  • Managed Service: As a fully managed cloud database service, MongoDB Atlas offloads the operational burden and lets you focus on building your applications.

Use Cases of Vector Search and RAG

Vector search and RAG have numerous applications across various industries and domains, including:

  1. Intelligent search engines: Provide more relevant and contextual search results, even when users’ queries are ambiguous or imprecise.
  2. Customer support: Build chatbots and virtual assistants that can understand natural language queries and provide accurate, context-aware responses by leveraging relevant knowledge bases.
  3. E-commerce and recommendations: Improve product recommendations by understanding user preferences and finding semantically similar items.
  4. Content analysis: Identify similar content across large datasets. This helps in tasks like plagiarism detection, content deduplication, and topic clustering.
  5. Biomedical research: Accelerate drug discovery and medical research by finding relevant scientific literature and data based on semantic similarity.

Conclusion

MongoDB Atlas Vector Search opens up exciting possibilities for building advanced NLP applications that can understand context and intent. By integrating with popular frameworks like LangChain and LlamaIndex, or leveraging client libraries, you can easily implement semantic search and RAG capabilities. Go ahead, try it out, and unlock new levels of intelligence and relevance in your applications!

Frequently Asked Questions

Q1. What is retrieval-augmented generation (RAG)?

A. RAG is a technique that combines the power of large language models (LLMs) with external data sources. It involves retrieving relevant information from a data source and using it as context for the LLM to generate more informed and accurate responses.

Q2. Why integrate MongoDB Atlas Vector Search with LangChain?

A. By integrating Atlas Vector Search with LangChain, developers can leverage MongoDB as a high-performance vector database for efficient semantic search and RAG implementations. This integration provides benefits such as data consistency, scalability, and a simplified application architecture.

Q3. How to integrate MongoDB Atlas Vector Search with LangChain?

A. The integration process typically begins with setting up the environment and loading data into Atlas. This is followed by creating a vector search index and running semantic search queries using LangChain’s `MongoDBAtlasVectorSearch` module. Finally, RAG chains are constructed to combine vector search retrieval, prompt templates, and LLMs.

Q4. How does the performance of MongoDB Atlas Vector Search compare to other vector databases?

A. MongoDB Atlas Vector Search is designed to provide efficient and scalable vector search by leveraging MongoDB’s distributed architecture. Its performance can be comparable or superior to that of dedicated vector databases, depending on dataset size, query complexity, and available hardware resources.

I'm Sahitya Arya, a seasoned Deep Learning Engineer with one year of hands-on experience in both Deep Learning and Machine Learning. Throughout my career, I've authored more than three research papers and have gained a profound understanding of Deep Learning techniques. Additionally, I possess expertise in Large Language Models (LLMs), contributing to my comprehensive skill set in cutting-edge technologies for artificial intelligence.
