Create a Powerful Chatbot with ChatGPT Using Your Documents

Adnan Last Updated : 14 Sep, 2023
6 min read

Introduction

Today, we will build a ChatGPT-based chatbot that reads the documents you provide and answers users’ questions based on them. Companies are always looking for new ways to improve client service and engagement, and a chatbot that responds to client inquiries quickly and accurately is one way to do this. This post demonstrates how to build such a chatbot from your own business documents using prompt engineering.

There are several ways to build such a chatbot, and in this post we’ll look at a few of them, along with their benefits and drawbacks: fine-tuning GPT-3, direct prompt engineering, and integrating a vector index with the GPT-3 API. If you have been chatting with ChatGPT as a hobby or using it to research new concepts, you already know that it is both entertaining and educational.

Read: https://www.analyticsvidhya.com/blog/2022/12/chatgpt-unlocking-the-potential-of-artificial-intelligence-for-human-like-conversation/

Learning Objectives

  • Understand the process of building a chatbot using ChatGPT and document-based question answering.
  • Gain knowledge about the importance of enhancing client service and engagement through chatbot technology.
  • Explore various techniques for building chatbots, including fine-tuning GPT-3, direct prompt engineering, and integrating a vector index with the GPT-3 API.

This article was published as a part of the Data Science Blogathon.

Effectiveness of ChatGPT as a Chatbot

What more effective uses could we give it? Thanks to OpenAI’s latest release of the GPT-3.5 series API (openai.com), we are now capable of much more than idle conversation. Question answering is a particularly effective use case for both corporate and personal use. You ask a question in plain language about your own documents or data, and it responds promptly by gathering the necessary information.


ChatGPT Based Chatbot Applications

A ChatGPT-based chatbot can be used for a variety of things, including managing your personal knowledge and synthesizing user research. In this article, I’ll discuss how to create your own Q&A chatbot using your own data, explain why some methods won’t work, and provide a step-by-step tutorial for using llama-index and the GPT API to create a document Q&A chatbot.


Think Out of the Box

As a product manager, I spend a significant portion of my time reading internal documents and customer reviews. I immediately considered employing ChatGPT as a personal assistant to help me compile client feedback or locate relevant older product documentation for a feature.

To achieve this goal, I initially considered fine-tuning the GPT model on my own data. However, fine-tuning is quite expensive and requires a sizable dataset of examples. It is also impractical to fine-tune the model again every time a document changes.

Prompt Engineering

Prompting is a way of giving an AI the instructions it needs to carry out a task: we provide a set of instructions (the prompt), and the AI performs the work. Prompts come in different types depending on their application. They can be short instructions, questions, passages, or polls, and they can be simple or complex. Prompt engineering is the practice of wording prompts well so that AI tools answer queries precisely.

Learn Prompting: Your Guide to Communicating with AI.

Prompt engineering that includes context in the prompt is the second strategy that springs to mind. For instance, instead of asking the question directly, I could insert the original document’s text before the question. However, the GPT model has a limited context window and can only process about 4,000 tokens (roughly 3,000 words) in a prompt.

Given that we have tens of thousands of customer feedback emails and hundreds of product documents, it is impossible to fit all the context into the prompt. It is also expensive to send a lengthy context to the API, because the cost depends on the number of tokens you use.
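To make the limit concrete, here is a minimal sketch that estimates whether a set of documents would fit in a single prompt. It assumes the figures quoted above (about 4,000 tokens, or roughly 3,000 words); the function names and the sample data are hypothetical, and a real tokenizer would give different counts.

```python
# Rough token-budget check. The 4,000-token limit and the ratio of about
# 3,000 words per 4,000 tokens are taken from the article; real tokenizers
# will give different counts, so treat this as an estimate only.
TOKEN_LIMIT = 4000
WORDS_PER_TOKEN = 3000 / 4000  # about 0.75 words per token


def estimate_tokens(text: str) -> int:
    """Estimate a text's token count from its whitespace-separated word count."""
    words = len(text.split())
    return round(words / WORDS_PER_TOKEN)


def fits_in_prompt(documents: list[str], question: str, limit: int = TOKEN_LIMIT) -> bool:
    """Return True if all documents plus the question fit in the estimated budget."""
    total = estimate_tokens(question) + sum(estimate_tokens(d) for d in documents)
    return total <= limit


if __name__ == "__main__":
    # Hypothetical data: 500 feedback emails of roughly 120 words each.
    emails = ["word " * 120 for _ in range(500)]
    print(fits_in_prompt(emails, "What features do users want?"))
```

Even this modest, made-up collection is far over the budget, which is why stuffing every document into the prompt does not scale.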

I’ll walk you through the process of utilizing LlamaIndex & GPT to create a Q&A chatbot using your own data in the part that follows.

Q&A Chatbot Development Using Your Documents

In this section, we will use LlamaIndex and GPT (text-davinci-003) to create a Q&A chatbot that lets you ask questions about your documents and receive responses in natural language.

Prerequisites

Before we begin the tutorial, we need the following:

  • An OpenAI API key, which you can find at https://platform.openai.com/account/api-keys.
  • A repository for your documents. LlamaIndex supports numerous data sources, including Notion, Google Docs, Asana, etc. We’ll only use a plain text file for demonstration purposes in this article.
  • A local Python environment, such as Jupyter Notebook.

Steps to Follow:

The process is simple and just requires a few steps:

  • Create an index of your document data with LlamaIndex.
  • Use natural language to search the index.
  • LlamaIndex retrieves the pertinent parts and passes them to the GPT prompt.
  • Ask GPT with the pertinent context and receive a response.

LlamaIndex transforms your original document data into a vectorized index, which can be queried very efficiently. It uses this index to find the most pertinent sections based on how closely the query and the data match. The retrieved sections are then loaded into the prompt sent to GPT, so that GPT has the background necessary to answer your question.
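The idea of matching a query against a vectorized index can be sketched in a few lines of plain Python. This is a toy illustration, not how LlamaIndex is implemented: it assumes a simple bag-of-words vector in place of a real embedding model, and the names embed, build_index, and retrieve are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Store each chunk of text next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index: list[tuple[str, Counter]], query: str, top_k: int = 1) -> list[str]:
    """Return the top_k chunks whose vectors are most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical feedback snippets used only for illustration.
    chunks = [
        "Users keep asking for a dark mode in the mobile app.",
        "The billing page loads slowly on older browsers.",
        "Support tickets about login errors dropped last month.",
    ]
    print(retrieve(build_index(chunks), "Which page loads slowly?"))
```

Only the retrieved chunks, rather than the whole document collection, need to be sent to GPT, which keeps the prompt within the token limit.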

Setting up Python Environment

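Installing the libraries is the first step. Run the following commands in a Jupyter notebook to install OpenAI and LlamaIndex. Note that the code below relies on the GPTSimpleVectorIndex interface from early llama-index releases; later releases changed this API, so you may need to install an older version.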

!pip install openai #install openAI
!pip install llama-index #install llama index

Next, create a new .py file, import the libraries, and set your OpenAI API key.

# Import necessary packages
from llama_index import GPTSimpleVectorIndex, Document, SimpleDirectoryReader
import os

os.environ['OPENAI_API_KEY'] = 'sk-YOUR-API-KEY'
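Hardcoding the key is fine for a quick demo, but you may prefer to set it in your shell and only read it in Python. A minimal sketch, assuming the key has already been exported as an environment variable; the helper name get_openai_api_key is hypothetical and not part of any library.

```python
import os


def get_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key
```

This keeps the key out of notebooks and source files that you might share.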

Creating and Storing the Index

After installing and importing the necessary libraries, we must create an index for your documents.

You can load your documents from strings or by using the SimpleDirectoryReader class offered by LlamaIndex.

# Loading from a directory
documents = SimpleDirectoryReader('your_directory').load_data()

# Loading from strings, assuming you saved your data to strings text1, text2, ...
text_list = [text1, text2, ...]
documents = [Document(t) for t in text_list]
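If you want to see what "loading from strings" looks like end to end, the sketch below reads every text file in a folder into a list of strings using only the standard library. It is an illustration, not a replacement for SimpleDirectoryReader; the helper name load_texts and the "*.txt" pattern are assumptions.

```python
from pathlib import Path


def load_texts(directory: str, pattern: str = "*.txt") -> list[str]:
    """Read every matching file in a directory into a list of strings."""
    paths = sorted(Path(directory).glob(pattern))
    return [p.read_text(encoding="utf-8") for p in paths]
```

The resulting list plays the role of text_list in the snippet above.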

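Having loaded the documents, we next build the index from them using the GPTSimpleVectorIndex class imported earlier. The resulting object is the index variable used in the snippets that follow.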

Use the following techniques to store the index and retrieve it later.

# Save your index to an index.json file
index.save_to_disk('index.json')
# Load the index from your saved index.json file
index = GPTSimpleVectorIndex.load_from_disk('index.json')
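Saving the index pays off when you avoid rebuilding it on every run. The sketch below shows that load-or-build pattern with a plain dictionary and the json module standing in for the index; the helper name load_or_build is hypothetical, and with LlamaIndex you would swap in the save_to_disk and load_from_disk calls shown above.

```python
import json
import os
from typing import Callable


def load_or_build(path: str, build: Callable[[], dict]) -> dict:
    """Load cached index data from path, or build it once and save it."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    data = build()  # the expensive step, run only when no saved file exists
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data
```

On the first call the build function runs and the result is written to disk; later calls read the file and skip the build step entirely.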

Querying the Index and Receiving a Response

Searching the index is easy.

# Querying the index
response = index.query("What features do users want to see in the app?")
print(response)

Internally, LlamaIndex takes your prompt, searches the index for pertinent chunks, and then passes both your prompt and those chunks to GPT. In our run, the bot answered the query correctly: it identified the author of the document.
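To picture what "passing both your prompt and the pertinent chunks to GPT" means, here is a minimal sketch of assembling such a prompt. The template wording and the function name build_prompt are assumptions made for illustration; LlamaIndex uses its own internal templates.

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the user's question into one prompt string."""
    # Number the chunks so that each piece of context is clearly separated.
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below.\n"
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    # Hypothetical retrieved chunk, used only for illustration.
    print(build_prompt(
        "What features do users want to see in the app?",
        ["Users keep asking for a dark mode in the mobile app."],
    ))
```

The string returned here is what would be sent to the model as the final prompt.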

Way Forward

As we saw above, the bot’s output was correct. We can ask any question related to the company or the author, and the bot should be able to answer it from the indexed documents, thanks to the power of the llama_index library.

Figure: Model’s output

The procedures above simply demonstrate a very basic first use of LlamaIndex and GPT for answering questions. However, there is much more you can do. You can actually set up LlamaIndex to utilize an alternate large language model (LLM), use another type of index for a variety of jobs, replace an existing index with a new index, and more.

Read the documentation here: https://gpt-index.readthedocs.io/en/latest/index.html.

Conclusion

This post has demonstrated how to use Python and several potent AI technologies to build a virtual assistant that relies on your own business documents. With the help of this bot, you can quickly and accurately respond to your clients’ inquiries, enhancing engagement and customer service. You can design a chatbot that meets your particular demands by personalizing the bot’s answers based on the distinctive resources of your business.

Key takeaways from the article are:

  • Chatbots are getting a lot of attention from companies, and thanks to OpenAI’s API and libraries such as LlamaIndex, creating one is now much easier.
  • We can set up the chatbot to read the documents provided to it and answer queries based on those documents.
  • These chatbots can help businesses save on costs and improve their efficiency.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Frequently Asked Questions

Q1. Can I use ChatGPT as a chatbot?

A. Yes, ChatGPT can be used as a chatbot. It’s designed to generate human-like responses and engage in conversation with users.

Q2. Does ChatGPT have an app?

A. Yes, the ChatGPT app is a free app without any ads. It is available on both iOS and Android.

Q3. Is ChatGPT available for free?

A. OpenAI offers ChatGPT both as a free service and as a subscription-based service called ChatGPT Plus, which provides additional benefits like faster response times and priority access.

Q4. What are the basics of ChatGPT?

A. The basics of ChatGPT involve providing text prompts or messages to the model, which then generates a response based on the input received. It uses a large language model trained on a diverse range of internet text to generate human-like and contextually relevant responses.

Competent and passionate professional holding over 3 years of Python, Data Science, Data Analytics, and ML experience with recent experience in Prompt Engineering. I love writing and one of my blogs at Analytics Vidhya was among the top-3 winners of the Data Science Blogathon, read by 700+ users.
