Imagine chatting with a virtual assistant that remembers not just your last question but the entire flow of your conversation—personal details, preferences, even follow-up queries. This memory transforms chatbots from simple Q&A machines into sophisticated conversational partners, capable of handling complex topics over multiple interactions. In this article, we dive into the fascinating world of conversational memory in Retrieval-Augmented Generation (RAG) systems, exploring the techniques that allow chatbots to hold onto context, personalize responses, and manage multi-step queries seamlessly. You’ll learn about different memory strategies, their advantages and limitations, and even get hands-on with Python and LangChain to see how these concepts work in real time.
This article was published as a part of the Data Science Blogathon.
Conversational memory is crucial in chatbots and conversational agents because it lets the system maintain context over extended interactions, making responses more relevant and personalized. In chatbot-based applications, especially when the conversation spans complex topics or multiple queries, memory helps by maintaining context across turns, keeping responses relevant, personalizing answers, and handling multi-step queries.
There are multiple ways to incorporate conversational memory in Retrieval-Augmented Generation. In LangChain, all of the techniques covered here can be used through ConversationChain.
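To make the idea concrete, here is a rough, library-free sketch of what a chain with memory does on every turn: load the stored history, build a prompt, call the model, and save the new exchange. The names `TinyConversation` and `fake_llm` are hypothetical and exist only for illustration; this is not LangChain's implementation, and the stand-in model simply reports how many earlier turns it was shown.

```python
from typing import Callable, List, Tuple


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat model: reports how many earlier turns it saw."""
    seen = prompt.count("Human:") - 1  # subtract the current question
    return f"(reply after seeing {seen} earlier turn(s))"


class TinyConversation:
    """Illustrative only: load history, build the prompt, call the model, save the turn."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.turns: List[Tuple[str, str]] = []

    def predict(self, user_input: str) -> str:
        # 1. Load everything remembered so far.
        history = "\n".join(f"Human: {q}\nAI: {a}" for q, a in self.turns)
        # 2. Put the history in front of the new question.
        prompt = f"Current conversation:\n{history}\nHuman: {user_input}\nAI:"
        # 3. Call the model, then 4. save the exchange for the next turn.
        reply = self.llm(prompt)
        self.turns.append((user_input, reply))
        return reply


chat = TinyConversation(fake_llm)
first = chat.predict("I am in Miami with my fiancee.")
second = chat.predict("Who am I with?")
print(first)   # (reply after seeing 0 earlier turn(s))
print(second)  # (reply after seeing 1 earlier turn(s))
```

The memory strategies discussed in this article differ only in what step 1 returns: the full history, a summary, a window, or extracted facts.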
We’ll dive into implementing conversational memory using Python and LangChain, setting up essential components to enable chatbots to remember and refer back to previous exchanges. We’ll cover everything from creating memory types to enhancing response relevance, allowing you to build chatbots that handle extended, context-rich conversations smoothly.
To get started, we’ll install and import the necessary libraries for building conversational memory with Python and LangChain. This setup will provide the tools required to implement and test memory functions effectively.
!pip -q install openai langchain huggingface_hub transformers
!pip install langchain_community
!pip install langchain_openai
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
import os
os.environ['OPENAI_API_KEY'] = ''
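Leaving the key as an empty string tends to fail later with a less obvious authentication error. One possible guard, sketched below under the assumption that the key is exported in the environment beforehand, is to fail early with a readable message. The helper name `get_api_key` and the demo key value are hypothetical.

```python
import os
from typing import Mapping, Optional


def get_api_key(env: Optional[Mapping[str, str]] = None,
                name: str = "OPENAI_API_KEY") -> str:
    """Return a non-empty key from the environment, or raise a readable error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return key


# Demonstration with plain dicts, so no real secret is needed to run this.
demo_key = get_api_key({"OPENAI_API_KEY": "sk-demo-not-a-real-key"})
try:
    get_api_key({"OPENAI_API_KEY": ""})
    missing_detected = False
except RuntimeError:
    missing_detected = True
print(demo_key, missing_detected)
```

In a notebook, calling `get_api_key()` with no arguments would read from `os.environ`.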
We will explore how to implement Conversation Buffer Memory, which stores the complete interaction history between the user and the system. This memory type helps retain all prior exchanges, ensuring that the chatbot can maintain context throughout the conversation, though it may lead to higher token usage. We’ll walk through the process of setting it up and explain how it enhances the chatbot’s ability to respond with greater relevance.
# Define the LLM
llm = ChatOpenAI(temperature=0, model="gpt-4o", max_tokens=1000)
# verbose=True prints the full prompt sent to the model on each turn
conversation = ConversationChain(
    llm=llm,
    verbose=True,
    memory=ConversationBufferMemory()
)
conversation.predict(input="Hi there! I am in Miami Today with my fiancee and want to go for shopping.")
conversation.predict(input="Can you tell me some shopping malls?")
conversation.predict(input="Can you tell me who I am with in Miami?")
Output:
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is
talkative and provides lots of specific details from its context. If the AI does not
know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi there! I am in Miami Today with my fiancee and want to go for shopping.
AI: Hello! That sounds like a fun day ahead. Miami is a fantastic place for shopping
with a variety of options to explore. If you're looking for luxury brands and
high-end fashion, you might want to check out the Bal Harbour Shops. I
If you're interested in a more eclectic mix of shops, the Miami Design District is a
great choice. It offers a blend of fashion, art, and design with boutiques like Louis
Vuitton, Dior, and unique art galleries.
For a more traditional mall experience, you could visit the Aventura Mall,
which is one of the largest malls in the U.S. It has a wide range of stores from Zara
and H&M to Nordstrom and Bloomingdale's.
And if you're looking for something more budget-friendly, the Dolphin Mall is an
outlet mall with a variety of stores offering discounts on popular brands.
No matter where you choose to go, I'm sure you'll have a great time shopping in
Miami! Let me know if you need more information or have any other questions.
Human: Can you tell me some shopping malls?
AI: Of course! Here are some popular shopping malls in Miami that you might want to
consider:
1. **Aventura Mall**: As I mentioned earlier, this is one of the largest malls in
the U.S. It features over 300 stores, including popular brands like Apple, Zara, and
Bloomingdale's. It also has a variety of dining options and an imp
2. **Dolphin Mall**: This is a great option for outlet shopping, with over 240 stores
offering discounts on brands like Nike, Calvin Klein, and Tommy Hilfiger. It also has
a range of dining and entertainment options, including a movi
3. **Bal Harbour Shops**: Known for its luxury shopping experience, this open-air
mall features high-end brands like Chanel, Gucci, and Prada. It's set in a beautiful
tropical environment, making it a pleasant place to stroll and shop
4. **The Shops at Merrick Park**: Located in Coral Gables, this upscale shopping
center offers a mix of luxury and contemporary brands such as Neiman Marcus,
Nordstrom, and Tiffany & Co. It also has a selection of fine dining restaur
5. **Brickell City Centre**: Situated in the heart of Miami's financial district,
this modern shopping center offers a mix of high-end and contemporary brands,
including Saks Fifth Avenue, AllSaints, and Sephora. It also features a vi
6. **Lincoln Road Mall**: While not a traditional mall, this pedestrian street in
South Beach is lined with shops, cafes, and restaurants. It's a great place to enjoy
the Miami weather while shopping at stores like H&M, Anthropologie
Each of these malls offers a unique shopping experience, so it depends on what
you're looking for. Enjoy your shopping adventure in Miami!
Human: Can you tell me who I am with in Miami?
AI:
> Finished chain.
"You mentioned earlier that you are in Miami with your fiancée. I hope you both have
a wonderful time exploring the city and enjoying your shopping trip! If there's
anything else you'd like to know or any other way I can assist you, feel free to
ask."
Let’s check what is stored in the buffer by printing it.
print(conversation.memory.buffer)
Output:
Human: Hi there! I am in Miami Today with my fiancee and want to go for shopping.
AI: Hello! That sounds like a fun day ahead. Miami is a great place for shopping with a variety of options to explore. If you're looking for high-end fashion and luxury brands, you might want to check out
Human: Can you tell me some shopping malls?
AI: Of course! Here are some popular shopping malls in Miami that you might want to visit:
1. **Aventura Mall**: As one of the largest malls in the United States, Aventura Mall offers a vast selection of stores, including both high-end and more affordable brands. You'll find everything from Nords
2. **Dolphin Mall**: This is a great spot for outlet shopping, with a wide range of stores offering discounts on popular brands. It's a bit more budget-friendly and includes stores like Nike, Calvin Klein,
3. **Brickell City Centre**: Located in the heart of Miami's financial district, this modern shopping center features luxury brands like Saks Fifth Avenue, as well as a variety of dining options and a ciner
4. **The Falls**: This is an open-air shopping center with a beautiful setting, featuring a waterfall and tropical landscaping. It has a mix of popular retailers like Macy's and specialty stores.
5. **Dadeland Mall**: Known for its large selection of department stores, including Macy's, JCPenney, and Nordstrom, Dadeland Mall also offers a variety of specialty shops and dining options.
Each of these malls offers a unique shopping experience, so you can choose based on your preferences and what you're looking to buy. Enjoy your time in Miami!
Human: Can you tell me who I am with in Miami?
AI: You mentioned earlier that you are in Miami with your fiancée. I hope you both have a wonderful time exploring the city and enjoying your shopping adventure! If there's anything else you'd like to know
As we can see, conversation buffer memory saves every interaction in the chat history directly.
While storing everything gives the LLM the maximum amount of information, more tokens mean slower response times and higher costs.
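The growth is easy to see in a small, library-free sketch. `FullBuffer` is a hypothetical class that keeps every turn verbatim, and `rough_token_count` is a crude word-count proxy (an assumption made for illustration; a real tokenizer would give different numbers). The point is only that the text sent to the model grows linearly with the number of turns.

```python
from typing import List, Tuple


def rough_token_count(text: str) -> int:
    """Crude proxy: whitespace-separated words. Real tokenizers will differ."""
    return len(text.split())


class FullBuffer:
    """Keeps every turn verbatim, in the spirit of a full conversation buffer."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def render(self) -> str:
        # This is the history that would be prepended to every new question.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


buffer = FullBuffer()
sizes = []
for i in range(1, 6):
    buffer.save(f"question number {i}", "a fairly long answer " * 10)
    sizes.append(rough_token_count(buffer.render()))

print(sizes)  # [45, 90, 135, 180, 225]
```

Every new question pays for the whole history again, which is why the remaining strategies try to shrink what is replayed.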
With ConversationBufferMemory, a long conversation quickly consumes a lot of tokens and can eventually exceed the context window of even the most advanced LLMs available today. To avoid excessive token usage, we can use ConversationSummaryMemory. As the name suggests, this form of memory summarizes the conversation history.
from langchain.chains.conversation.memory import ConversationSummaryMemory
llm = ChatOpenAI(temperature=0, model="gpt-4o", max_tokens=1000)
conversation = ConversationChain(
    llm=llm,
    memory=ConversationSummaryMemory(llm=llm)
)
conversation.predict(input="Hi there! I am in Miami Today with my fiancee and want to go for shopping.")
conversation.predict(input="Can you tell me some shopping malls?")
conversation.predict(input="Can you tell me who I am with in Miami?")
print(conversation.memory.buffer)
Output:
The human is in Miami with their fiancée and wants to go shopping. The AI suggests
several shopping destinations in Miami, including Bal Harbour Shops for luxury
brands, the Miami Design District for a mix
We pass the LLM to the ConversationSummaryMemory constructor because the memory uses it to summarize the previous turns. Let us check the prompt that is passed to the LLM for summarizing the conversation history.
print(conversation.memory.prompt.template)
Output:
Progressively summarize the lines of conversation provided, adding onto the previous
summary returning a new summary.
EXAMPLE
Current summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks
artificial intelligence is a force for good.
New lines of conversation:
Human: Why do you think artificial intelligence is a force for good?
AI: Because artificial intelligence will help humans reach their full potential.
New summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks
artificial intelligence is a force for good because it will help humans reach their
full potential.
END OF EXAMPLE
Current summary:
{summary}
New lines of conversation:
{new_lines}
New summary:
The advantage of ConversationSummaryMemory is that it reduces the number of tokens used in long conversations. The drawback is that the whole memory depends on the saved summary, whose quality varies with the summarization capability of the LLM used.
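The progressive summarization loop can be sketched without any library. Below, `SUMMARY_PROMPT` is a shortened version of the template printed above (the worked example is omitted), and `stub_summarizer` is a hypothetical stand-in for the LLM that keeps only what the human said. It is deliberately lossy, which illustrates the drawback just described: whatever the summarizer drops is gone for good.

```python
from typing import Callable

# Shortened form of the summarization prompt, with the same two placeholders.
SUMMARY_PROMPT = (
    "Progressively summarize the lines of conversation provided, "
    "adding onto the previous summary returning a new summary.\n\n"
    "Current summary:\n{summary}\n\n"
    "New lines of conversation:\n{new_lines}\n\n"
    "New summary:"
)


def stub_summarizer(prompt: str) -> str:
    """Hypothetical stand-in for the LLM: appends each new Human line to the old summary."""
    body = prompt.split("Current summary:\n", 1)[1]
    summary, rest = body.split("\n\nNew lines of conversation:\n", 1)
    new_lines = rest.rsplit("\n\nNew summary:", 1)[0]
    human_lines = [
        line[len("Human: "):]
        for line in new_lines.splitlines()
        if line.startswith("Human: ")
    ]
    addition = " ".join(f"The human said: {line}" for line in human_lines)
    return (summary + " " + addition).strip()


class SummaryMemory:
    """Stores one running summary instead of the raw turns."""

    def __init__(self, summarize: Callable[[str], str]) -> None:
        self.summarize = summarize
        self.summary = ""

    def save(self, human: str, ai: str) -> None:
        new_lines = f"Human: {human}\nAI: {ai}"
        prompt = SUMMARY_PROMPT.format(summary=self.summary, new_lines=new_lines)
        self.summary = self.summarize(prompt)  # the old summary is overwritten


memory = SummaryMemory(stub_summarizer)
memory.save("I am in Miami with my fiancee.", "Sounds fun!")
memory.save("Can you suggest malls?", "Aventura Mall and Dolphin Mall.")
print(memory.summary)
```

After two turns the summary still mentions the fiancee, but the mall names from the AI's answer are lost because this particular stub never keeps them.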
The Conversation Buffer Window Memory is similar to buffer memory but with a predefined window. The model is asked to remember only the last k interactions, which reduces the total number of tokens used compared with ConversationBufferMemory.
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
llm = ChatOpenAI(temperature=0, model="gpt-4o", max_tokens=1000)
conversation = ConversationChain(llm=llm, memory=ConversationBufferWindowMemory(k=3))
conversation.predict(input="Hi there! I am in Miami Today with my fiancee and want to go for shopping.")
conversation.predict(input="Can you tell me some shopping malls?")
conversation.predict(input="Can you tell me who I am with in Miami?")
Output:
"You mentioned earlier that you are in Miami with your fiancée. I hope you both have
a fantastic time exploring the city and enjoying all the shopping and attractions
it has to offer! If there's anything else you'd like to know or need help with,
feel free to ask."
As we can see, with k set to 3, the memory keeps the last 3 exchanges. Since this conversation has only three turns, the first message is still inside the window, so the model can recall that the person is in Miami with their fiancée.
If we only want our chatbot to remember a fixed number of recent exchanges, this memory type is a good choice. However, it cannot help the chatbot remember more distant interactions.
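A sliding window is simple to sketch with `collections.deque`. The class `WindowMemory` below is a hypothetical, library-free illustration rather than LangChain's implementation. It compares two window sizes on the same three-turn conversation to show when the earliest fact falls out of memory.

```python
from collections import deque


class WindowMemory:
    """Keeps only the last k exchanges; older ones are discarded."""

    def __init__(self, k: int) -> None:
        # A deque with maxlen drops the oldest item automatically on append.
        self.turns = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def render(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


conversation_turns = [
    ("I am in Miami with my fiancee.", "Sounds like a fun trip!"),
    ("Can you tell me some shopping malls?", "Aventura Mall and Dolphin Mall."),
    ("What about dinner?", "Try a restaurant in Brickell."),
]

wide, narrow = WindowMemory(k=3), WindowMemory(k=2)
for human, ai in conversation_turns:
    wide.save(human, ai)
    narrow.save(human, ai)

print("fiancee" in wide.render())    # True: the first turn is still in the window
print("fiancee" in narrow.render())  # False: the first turn was pushed out
```

With the smaller window, a later question such as "who am I with?" could no longer be answered from memory.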
The ConversationSummaryBufferMemory is a combination of ConversationSummaryMemory and ConversationBufferWindowMemory. This memory system keeps recent interactions verbatim in a buffer and folds older ones into a running summary, so both remain available. Rather than dropping older interactions based on their count, it moves them out of the buffer and into the summary once the buffer exceeds a total token length (max_token_limit).
from langchain.chains.conversation.memory import ConversationSummaryBufferMemory
llm = ChatOpenAI(temperature=0, model="gpt-4o", max_tokens=1000)
conversation_sum_bufw = ConversationChain(
    llm=llm,
    memory=ConversationSummaryBufferMemory(
        llm=llm,
        max_token_limit=650
    )
)
conversation_sum_bufw.predict(input="Hi there! I am in Miami Today with my fiancee and want to go for shopping.")
conversation_sum_bufw.predict(input="Can you tell me some shopping malls?")
conversation_sum_bufw.predict(input="Can you tell me who I am with in Miami?")
Output:
'You are in Miami with your fiancée. If you need any more information or
recommendations for your trip, feel free to ask!'
Let us now check how the memory is saved in the buffer for this technique.
print(conversation_sum_bufw.memory.buffer)
Output:
System: The human is in Miami with their fiancée and wants to go shopping. The AI
suggests several shopping destinations, including Bal Harbour Shops, the Miami
Design District, Aventura Mall, and Wynwood.
Human: Can you tell me who I am with in Miami?
AI: You are in Miami with your fiancée. If you need any more information or
recommendations for your trip, feel free to ask!
As we can see in the output above, the buffer holds a summary of the older parts of the conversation together with the verbatim text of the most recent exchange.
The ConversationSummaryBufferMemory requires some tuning (for example, choosing max_token_limit) to decide what gets summarized and what stays in the buffer, but it provides great flexibility in retaining distant interactions while keeping recent interactions in their original, most detailed form.
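The token-based pruning can be sketched as follows. Everything here is an illustrative assumption: `SummaryBufferMemory` is a hypothetical class, `rough_token_count` counts words instead of real tokens, `stub_summarize` stands in for the LLM summarizer, and the sketch always keeps at least the latest exchange in the buffer as a simplification.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]


def rough_token_count(text: str) -> int:
    """Crude proxy: whitespace-separated words. Real tokenizers will differ."""
    return len(text.split())


def stub_summarize(summary: str, turns: List[Turn]) -> str:
    """Hypothetical stand-in for the LLM summarizer: keeps only the human side."""
    facts = " ".join(f"The human said: {h}" for h, _ in turns)
    return (summary + " " + facts).strip()


class SummaryBufferMemory:
    """Recent turns stay verbatim; turns over the token budget move into a summary."""

    def __init__(self, max_token_limit: int,
                 summarize: Callable[[str, List[Turn]], str] = stub_summarize) -> None:
        self.max_token_limit = max_token_limit
        self.summarize = summarize
        self.summary = ""
        self.recent: List[Turn] = []

    def _recent_tokens(self) -> int:
        return sum(rough_token_count(h) + rough_token_count(a) for h, a in self.recent)

    def save(self, human: str, ai: str) -> None:
        self.recent.append((human, ai))
        pruned: List[Turn] = []
        # Move the oldest turns out until the buffer fits the budget again.
        while self._recent_tokens() > self.max_token_limit and len(self.recent) > 1:
            pruned.append(self.recent.pop(0))
        if pruned:
            self.summary = self.summarize(self.summary, pruned)

    def render(self) -> str:
        lines = [f"System: {self.summary}"] if self.summary else []
        lines += [f"Human: {h}\nAI: {a}" for h, a in self.recent]
        return "\n".join(lines)


memory = SummaryBufferMemory(max_token_limit=12)
memory.save("I am in Miami with my fiancee.", "Sounds like a fun trip!")
memory.save("Can you tell me some shopping malls?", "Aventura Mall and Dolphin Mall.")
memory.save("Who am I with in Miami?", "You are with your fiancee.")
print(memory.render())
```

With this small budget only the last exchange stays verbatim, while the first two survive as summary text under a `System:` line, similar in shape to the buffer printed above.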
In this technique, LangChain builds a mini knowledge graph of connected information by identifying key entities and their relationships, helping the model better understand and respond to different situations.
from langchain.chains.conversation.memory import ConversationKGMemory
from langchain.prompts.prompt import PromptTemplate
llm = ChatOpenAI(temperature=0, model="gpt-4", max_tokens=1000)
template = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context.
If the AI does not know the answer to a question, it truthfully says it does not know. The AI ONLY uses information contained in the "Relevant Information" section and does not hallucinate.
Relevant Information:
{history}
Conversation:
Human: {input}
AI:"""
prompt = PromptTemplate(
    input_variables=["history", "input"], template=template
)
conversation_with_kg = ConversationChain(
    llm=llm,
    verbose=True,
    prompt=prompt,
    memory=ConversationKGMemory(llm=llm)
)
conversation_with_kg.predict(input="Hello, My name is Myra")
conversation_with_kg.predict(input="I am in Miami and need some assistance in booking hotels.")
conversation_with_kg.predict(input="I need hotel recommendations near Miami Beaches")
As seen in the code above, ConversationChain is given a custom prompt that asks the LLM to use only the relevant information for a query and not to hallucinate.
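The two placeholders in that prompt are filled in before each model call: `{history}` receives whatever the memory returns and `{input}` receives the new user message. A minimal sketch using plain `str.format` is shown below. The template is shortened, and the history string and the helper `build_prompt` are hypothetical values chosen for illustration.

```python
# Shortened version of the custom prompt, keeping the same two placeholders.
TEMPLATE = """The AI ONLY uses information contained in the "Relevant Information" section.

Relevant Information:

{history}

Conversation:
Human: {input}
AI:"""


def build_prompt(history: str, user_input: str) -> str:
    """Fill the two placeholders the way a prompt template would."""
    return TEMPLATE.format(history=history, input=user_input)


prompt_text = build_prompt(
    history="On Myra: Myra is in Miami.",  # hypothetical memory content
    user_input="Where am I right now?",
)
print(prompt_text)
```

Because the model is told to rely only on the "Relevant Information" section, the quality of the answer depends directly on what the memory places there.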
import networkx as nx
import matplotlib.pyplot as plt
print(conversation_with_kg.memory.kg.get_triples())
Output:
[('Myra', 'name', 'is'), ('Myra', 'Miami', 'is in'), ('Myra', 'booking hotels',
'needs assistance in'), ('Human', 'hotel recommendations near Miami Beaches',
'need')]
As can be seen from the output above, the memory stores key entities and their relationships as triples, printed here in the order (subject, object, relation). Structured information can therefore be extracted easily using this technique.
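To show why triples are convenient to query, here is a small, library-free sketch of a triple store. `TinyKnowledgeGraph` is a hypothetical class for illustration only. In the real memory the triples are extracted by the LLM, whereas here they are added by hand using the same (subject, object, relation) order as the printed output.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, object, relation)


class TinyKnowledgeGraph:
    """Stores directed, labeled edges between entities."""

    def __init__(self) -> None:
        self._edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_triple(self, subject: str, obj: str, relation: str) -> None:
        self._edges[subject].append((obj, relation))

    def get_triples(self) -> List[Triple]:
        return [(s, o, r) for s, pairs in self._edges.items() for o, r in pairs]

    def facts_about(self, subject: str) -> List[str]:
        """Turn the stored edges for one entity back into short sentences."""
        return [f"{subject} {r} {o}" for o, r in self._edges.get(subject, [])]


kg = TinyKnowledgeGraph()
kg.add_triple("Myra", "Miami", "is in")
kg.add_triple("Myra", "booking hotels", "needs assistance in")

facts = kg.facts_about("Myra")
print(facts)  # ['Myra is in Miami', 'Myra needs assistance in booking hotels']
```

Only the facts about entities mentioned in the current question need to be placed in the prompt, which keeps the context short.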
Entity memory, like Knowledge Graph memory, pulls specific details from conversations, such as names, objects, or places. This targeted method helps the model respond to user questions with more accuracy.
from langchain.chains.conversation.memory import ConversationEntityMemory
from langchain.chains.conversation.prompt import ENTITY_MEMORY_CONVERSATION_TEMPLATE
# Print the prompt template used with entity memory
print(ENTITY_MEMORY_CONVERSATION_TEMPLATE.template)
Output:
'You are an assistant to a human, powered by a large language model trained by
OpenAI. \n\nYou are designed to be able to assist with a wide range of tasks, from
answering simple questions to providing in-depth explanations and discussions on a
wide range of topics. As a language model, you are able to generate human-like text
based on the input you receive, allowing you to engage in natural-sounding
conversations and provide responses that are coherent and relevant to the topic at hand.
\n\nYou are constantly learning and improving, and your capabilities are constantly
evolving. You are able to process and understand large amounts of text, and can use
this knowledge to provide accurate and informative responses to a wide range of
questions. You have access to some personalized information provided by the human
in the Context section below. Additionally, you are able to generate your own text
based on the input you receive, allowing you to engage in discussions and provide
explanations an
The above output shows the prompt given to the LLM. Let us now see how the ConversationEntityMemory can be implemented taking into account the above prompt template.
llm = ChatOpenAI(temperature=0, model="gpt-4o", max_tokens=200)
conversation = ConversationChain(
    llm=llm,
    verbose=True,
    prompt=ENTITY_MEMORY_CONVERSATION_TEMPLATE,
    memory=ConversationEntityMemory(llm=llm)
)
conversation.predict(input="Hello, My name is Myra")
conversation.predict(input="I am in Miami and need some assistance in booking hotels.")
conversation.predict(input="I need hotel recommendations near Miami Beaches")
from pprint import pprint
pprint(conversation.memory.entity_store)
Output:
InMemoryEntityStore(store={'Myra': "Myra's name is Myra.", 'Miami': 'Miami is a
city where Myra is currently located and is seeking assistance in booking hotels.',
'Miami Beaches': 'Miami Beaches is a popul
As can be seen from the output above, the relevant entities are identified and mapped to their associated details. For example, “Miami is a city where Myra is currently located and is seeking assistance in booking hotels.” is mapped to the entity “Miami”.
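The mapping from entity to description can be sketched with a plain dictionary. `TinyEntityStore` is a hypothetical class for illustration. In the real memory the LLM both extracts the entities and rewrites their descriptions, whereas this sketch takes them as given and only shows how facts accumulate per entity and how the relevant ones are looked up for a new question.

```python
from typing import Dict


class TinyEntityStore:
    """Maps each entity name to an accumulated free-text description."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def update(self, entity: str, fact: str) -> None:
        # Append the new fact to whatever is already known about the entity.
        previous = self.store.get(entity, "")
        self.store[entity] = (previous + " " + fact).strip()

    def relevant(self, user_input: str) -> Dict[str, str]:
        """Return only the entities whose names appear in the current input."""
        lowered = user_input.lower()
        return {
            name: text
            for name, text in self.store.items()
            if name.lower() in lowered
        }


entities = TinyEntityStore()
entities.update("Myra", "Myra's name is Myra.")
entities.update("Miami", "Miami is a city where Myra is currently located.")
entities.update("Miami", "Myra is seeking hotel bookings there.")

context = entities.relevant("Any good hotels in Miami?")
print(context)
```

Only the "Miami" entry is returned for this question, so the prompt carries the facts that matter and nothing else.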
In Retrieval-Augmented Generation (RAG) systems, conversational memory is vital for maintaining context and improving relevance. It also helps in personalizing responses and handling multi-step queries. Unlike naive RAG, which processes each query independently, conversational memory builds a continuous experience. Techniques like Conversation Buffer Memory store full interaction histories but may increase token usage. On the other hand, Conversation Summary Memory reduces token usage by summarizing past interactions.
Conversation Buffer Window Memory retains only a set number of recent exchanges, and Conversation Summary Buffer Memory combines recent conversations with summaries of older ones, balancing context and efficiency. More advanced methods, such as Conversation Knowledge Graph Memory, structure information as interconnected entities, while Entity Memory captures specific details like names or locations for precise responses. Together, these techniques enable RAG systems to offer contextually rich and adaptive interactions tailored to user needs.
A. Conversational memory helps RAG systems remember past interactions, making responses more relevant and context-aware.
A. Memory enhances continuity, personalization, and relevance in chatbot conversations, especially in multi-step or complex interactions.
A. Conversation Buffer Memory saves all interactions, while Summary Memory condenses past conversations to save token usage.
A. It creates a mini knowledge graph of key entities and their relationships, helping the chatbot understand complex queries better.
A. Conversation Summary Buffer Memory is ideal as it combines recent interactions with summaries of older ones for efficient context retention.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.