Did you know that MongoDB Atlas now provides powerful vector search capabilities? It lets you perform semantic search on your data and implement retrieval-augmented generation (RAG) for large language model (LLM) applications. By integrating Atlas Vector Search with popular frameworks like LangChain and LlamaIndex, or directly through MongoDB's client libraries, you can build advanced natural language processing (NLP) solutions. In this article, we will see how to leverage MongoDB Atlas Vector Search for semantic search and RAG.
Vector search, also known as semantic search, is a technique that goes beyond traditional keyword-based searching. It employs machine learning models to transform data like text, audio, or images into high-dimensional vector representations called embeddings. These embeddings capture the semantic meaning of the data, allowing you to find similar content based on proximity in the vector space, even when the exact words don’t match.
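To make "proximity in the vector space" concrete, here is a tiny illustrative sketch. The three-dimensional vectors below are made-up stand-ins for real embeddings, which typically have hundreds or thousands of dimensions:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Near 1.0 = same direction (similar meaning); near 0 = unrelated
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 3-D "embeddings"; real models output far more dimensions
dog = np.array([0.9, 0.1, 0.3])
puppy = np.array([0.8, 0.2, 0.35])
invoice = np.array([0.1, 0.9, 0.7])

print(cosine_similarity(dog, puppy))    # high: semantically close
print(cosine_similarity(dog, invoice))  # low: semantically distant
```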
The core benefit of vector search is its ability to understand the intent and context behind queries, making it incredibly useful for various applications, including search engines, recommendation systems, and language models.
MongoDB Atlas, the fully managed cloud database service, now supports vector search natively. By storing vector embeddings alongside your data in MongoDB, you can perform efficient semantic searches without the need for a separate vector store, ensuring data consistency and simplifying your application architecture.
The process typically involves:

1. Generating vector embeddings for your data with an embedding model (for example, via the OpenAI API).
2. Storing the embeddings alongside the source documents in a MongoDB collection (see the sketch after this list).
3. Creating an Atlas Vector Search index on the embedding field.
4. Running semantic search queries against that index.
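For illustration, a document that carries its own embedding could look like the following; the field names here are an assumption, not a required schema:

```python
document = {
    "text": "MongoDB Atlas is a fully managed cloud database service.",
    "embedding": [0.018, -0.042, 0.103],  # truncated; real embeddings have e.g. 1536 floats
}
```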
Prerequisites: To integrate Atlas Vector Search with LangChain, you need an Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later, an OpenAI API key (or an alternative LLM provider), and a Python environment to run your project.
LangChain is an open-source framework written in Python that aims to simplify the development of applications powered by LLMs. It provides a modular and extensible architecture, allowing developers to build complex workflows by combining reusable components called “chains.”
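As a taste of what a chain looks like, here is a minimal illustrative sketch (it assumes the OPENAI_API_KEY environment variable is set; the prompt text is arbitrary):

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# A tiny chain: a prompt template piped into an LLM
prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
chain = prompt | ChatOpenAI()

print(chain.invoke({"text": "MongoDB Atlas supports native vector search."}).content)
```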
One of the key features of LangChain is its support for retrieval-augmented generation (RAG), a technique that combines the power of LLMs with external data sources. By integrating MongoDB Atlas Vector Search with LangChain, developers can leverage MongoDB as a high-performance vector database, enabling efficient semantic search and RAG implementations.
The integration process typically involves the following steps:

1. Set up the environment and provide your OpenAI and Atlas credentials.
2. Load your data and split it into chunks.
3. Instantiate Atlas as a vector store, embedding and storing the chunks.
4. Run semantic search queries, then build a RAG chain that combines retrieval, a prompt template, and an LLM.

First, set up the environment:
```python
import os
import getpass

from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

# Prompt for credentials instead of hard-coding them
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
ATLAS_CONNECTION_STRING = getpass.getpass("MongoDB Atlas SRV Connection String:")
```
Next, load a sample document and split it into chunks:

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load data from a PDF (PyPDFLoader also accepts remote URLs)
loader = PyPDFLoader("https://example.com/document.pdf")
data = loader.load()

# Split the data into overlapping chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
docs = text_splitter.split_documents(data)
```
Now we’re done with the first phase: the data is loaded and chunked. Next, embed the chunks and store them in Atlas:
```python
from pymongo import MongoClient

# Connect to your Atlas cluster
client = MongoClient(ATLAS_CONNECTION_STRING)
atlas_collection = client["langchain_db"]["documents"]

# Instantiate Atlas as a vector store: embed the chunks and store them
vector_search = MongoDBAtlasVectorSearch.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    collection=atlas_collection,
    index_name="vector_index",
)

# Run a semantic search query
query = "MongoDB Atlas security"
results = vector_search.similarity_search(query)
```
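Note that `vector_index` must already exist as an Atlas Vector Search index on the collection's embedding field; `similarity_search` returns nothing until it does. If you also want relevance scores with the results, the vector store exposes a scored variant; a minimal sketch:

```python
# Top 3 chunks together with their relevance scores
for doc, score in vector_search.similarity_search_with_score(query, k=3):
    print(f"{score:.3f}  {doc.page_content[:80]}")
```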
This marks the completion of the second phase. The final phase builds the RAG chain:
```python
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

# Define the prompt template
template = """Use the following context to answer the question:
Context: {context}
Question: {question}"""
prompt = PromptTemplate(template=template, input_variables=["context", "question"])

# Create the RAG chain: retrieve relevant chunks, "stuff" them into
# the prompt, and let the LLM generate the answer
rag = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=vector_search.as_retriever(),
    chain_type_kwargs={"prompt": prompt},
)

# Ask a question
query = "How can I secure my MongoDB Atlas cluster?"
result = rag.invoke({"query": query})
print(result["result"])
```
LangChain provides a high degree of flexibility and extensibility, allowing developers to customize the integration with Atlas Vector Search to suit their specific requirements. For example, you can fine-tune the retrieval process by adjusting parameters like the number of documents to retrieve, the relevance score threshold, or the similarity metric used for ranking.
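For instance, here is a minimal sketch of a tuned retriever; the `search_type` and the values for `k` and `score_threshold` are illustrative choices, not recommendations:

```python
# Return up to 5 documents, keeping only those above a similarity threshold
retriever = vector_search.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k": 5, "score_threshold": 0.75},
)
```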
While this integration focuses on MongoDB Atlas Vector Search, LangChain supports various vector databases and search engines, including Chroma, Weaviate, and Pinecone, among others. It also supports various LLM providers, such as OpenAI, Anthropic, and Cohere, allowing you to easily swap language models in and out of your RAG implementations.
By combining the power of LangChain’s modular architecture with MongoDB Atlas Vector Search’s efficient semantic search capabilities, developers can build sophisticated natural language processing applications that can understand context, retrieve relevant information, and generate informed responses, all while leveraging the scalability and consistency of MongoDB’s document database.
LlamaIndex is another open-source framework designed to simplify the integration of custom data sources with LLMs. It provides tools for loading and preparing vector embeddings, enabling RAG implementations. By integrating Atlas Vector Search with LlamaIndex, you can use MongoDB as a vector store and retrieve semantically similar documents to augment your LLM’s knowledge.
The process involves setting up your Atlas cluster, loading data into a LlamaIndex index, and storing the vector embeddings in MongoDB using the MongoDBAtlasVectorSearch vector store. You can then run semantic searches using LlamaIndex’s VectorIndexRetriever and leverage a query engine to generate context-aware responses based on the retrieved documents.
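A minimal sketch of that flow, assuming the `llama-index-vector-stores-mongodb` integration package, an existing Atlas Vector Search index named `vector_index`, and a local `./data` folder of documents (constructor parameter names can differ between LlamaIndex versions):

```python
from pymongo import MongoClient
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, StorageContext
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

client = MongoClient(ATLAS_CONNECTION_STRING)

# Use an Atlas collection as the LlamaIndex vector store
vector_store = MongoDBAtlasVectorSearch(
    client,
    db_name="llamaindex_db",
    collection_name="documents",
    vector_index_name="vector_index",
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Embed the documents and store them in Atlas
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Retrieve semantically similar chunks and generate a context-aware answer
query_engine = index.as_query_engine()
print(query_engine.query("How do I secure my Atlas cluster?"))
```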
In addition to popular frameworks, you can also integrate Atlas Vector Search directly into your applications using MongoDB’s official client libraries. This approach involves generating vector embeddings for your data (e.g., using the OpenAI API), storing them in MongoDB, creating a vector search index, and running $vectorSearch queries from your application code.
For example, with the Node.js client library, you can set up an Atlas Trigger that automatically generates embeddings for new documents using the OpenAI API. Your application can then create a vector search index and perform semantic searches with the `$vectorSearch` aggregation pipeline stage.
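The same flow sketched in Python with PyMongo (a recent PyMongo version is assumed for the `type="vectorSearch"` index support; the database, collection, and field names match the LangChain example above):

```python
from pymongo import MongoClient
from pymongo.operations import SearchIndexModel
from langchain_openai import OpenAIEmbeddings

client = MongoClient(ATLAS_CONNECTION_STRING)
collection = client["langchain_db"]["documents"]

# Create the Atlas Vector Search index programmatically
# (index builds are asynchronous: allow a moment before querying)
collection.create_search_index(
    SearchIndexModel(
        name="vector_index",
        type="vectorSearch",
        definition={
            "fields": [{
                "type": "vector",
                "path": "embedding",
                "numDimensions": 1536,  # must match the embedding model's output size
                "similarity": "cosine",
            }]
        },
    )
)

# Embed the query text, then run a $vectorSearch aggregation
query_vector = OpenAIEmbeddings().embed_query("MongoDB Atlas security")
results = collection.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_vector,
            "numCandidates": 100,  # candidates considered before final ranking
            "limit": 5,            # documents returned
        }
    },
    {"$project": {"text": 1, "score": {"$meta": "vectorSearchScore"}}},
])
for doc in results:
    print(doc)
```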
Integrating vector search capabilities with MongoDB Atlas offers several key benefits:

- Data consistency: embeddings live alongside the source data, with no separate vector store to keep in sync.
- Simplified architecture: a single database serves both operational queries and semantic search.
- Scalability: searches run on MongoDB's distributed, fully managed infrastructure.
Vector search and RAG have numerous applications across various industries and domains, including:

- Search engines that match on meaning rather than exact keywords.
- Recommendation systems that surface semantically similar content.
- LLM-powered chatbots and assistants that answer questions over private data.
MongoDB Atlas Vector Search opens up exciting possibilities for building advanced NLP applications that can understand context and intent. By integrating with popular frameworks like LangChain and LlamaIndex, or leveraging client libraries, you can easily implement semantic search and RAG capabilities. Go ahead, try it out, and unlock new levels of intelligence and relevance in your applications!
Q. What is retrieval-augmented generation (RAG)?
A. RAG is a technique that combines the power of large language models (LLMs) with external data sources. It involves retrieving relevant information from a data source and using it as context for the LLM to generate more informed and accurate responses.
Q. What are the benefits of integrating Atlas Vector Search with LangChain?
A. By integrating Atlas Vector Search with LangChain, developers can leverage MongoDB as a high-performance vector database for efficient semantic search and RAG implementations. This integration provides benefits such as data consistency, scalability, and a simplified application architecture.
Q. What does the integration process with LangChain look like?
A. The integration process typically begins with setting up the environment and loading data into Atlas. This is followed by creating a vector search index and running semantic search queries using LangChain’s `MongoDBAtlasVectorSearch` module. Finally, RAG chains are constructed to combine vector search retrieval, prompt templates, and LLMs.
Q. How does MongoDB Atlas Vector Search compare to other vector databases?
A. MongoDB Atlas Vector Search is designed to provide efficient and scalable vector search capabilities using MongoDB’s distributed architecture. Its performance can be comparable or superior to that of other vector databases, depending on dataset size, query complexity, and hardware resources.