Generative AI, especially generative Large Language Models, has taken the world by storm since its arrival. This was only possible because these models integrate with so many different applications, from generating working program code to powering fully AI-managed chat support systems. Yet most of the Large Language Models in the Generative AI space have remained closed to the public and were never open-sourced. A few open-source models do exist, but they are nowhere near the closed-source Large Language Models in capability. Recently, however, Falcon AI, an LLM that topped the OpenLLM Leaderboard, was released as open source. In this guide, we will use this model to create a chat application with Falcon AI, LangChain, and Chainlit.
In this article, you will also learn the differences between Chainlit and Streamlit and what each is best suited for.
This article was published as a part of the Data Science Blogathon.
In the Generative AI field, Falcon AI is one of the recently introduced Large Language Models, known for taking first place on the OpenLLM Leaderboard. Falcon AI was introduced by the UAE’s Technology Innovation Institute (TII), and its architecture is optimized for inference. When it was first introduced, Falcon AI topped the OpenLLM Leaderboard, moving ahead of state-of-the-art models such as Llama and those from Anthropic, DeepMind, and others. The model was trained on the AWS Cloud with 384 GPUs running continuously for two months.
Currently, it comes in two sizes: Falcon 40B (40 billion parameters) and Falcon 7B (7 billion parameters). Importantly, the makers of Falcon AI have stated that the model is open source, allowing developers to use it for commercial purposes without restrictions. Falcon AI also provides instruct variants, Falcon-7B-Instruct and Falcon-40B-Instruct, with which we can quickly get started building chat applications. In this guide, we will work with the Falcon-7B-Instruct model.
The Chainlit library is similar to Python’s Streamlit library, but it is purpose-built for quickly creating chat applications around Large Language Models, i.e., a UI similar to ChatGPT. With the Chainlit package, it is possible to develop conversational chat applications within minutes. It integrates seamlessly with LangFlow and LangChain (the library for building applications with Large Language Models), which we will use later in this guide.
Chainlit also allows for visualizing multi-step reasoning; it lets us see the intermediate results and understand how the Large Language Model arrived at its output for a question. So you can clearly follow the model’s chain of thought through the UI itself. Chainlit is not restricted to text conversations either: it supports sending and receiving images to and from the respective Generative AI models. It even lets us update the Prompt Template in the UI instead of returning to the code and changing it.
There are two ways to work with the Falcon-7B-Instruct model. One is the traditional way, where we download the model to the local machine and use it directly. But because this is a Large Language Model, it needs a lot of GPU memory to run. Hence, we go with the other option: calling the model directly through the Inference API. The Inference API is accessed with a HuggingFace API token, which lets us reach the transformer models hosted on HuggingFace.
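For reference, the local route would look roughly like the sketch below, using the transformers text-generation pipeline. This is only an illustration and we do not take this route in the guide; it assumes a GPU with enough memory (roughly 16 GB+ for Falcon-7B in half precision), the accelerate package for device_map="auto", and details may vary with your transformers version.

# A rough sketch of the local-download approach (not used in this guide)
from transformers import AutoTokenizer, pipeline
import torch

model_id = "tiiuae/falcon-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
generate = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,   # needed by older transformers releases for Falcon
    device_map="auto",
)
print(generate("What are the colors in the Rainbow?", max_new_tokens=100)[0]["generated_text"])

For the rest of this guide, however, we stick to the Inference API token described next.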
To obtain this token, we need to create an account on HuggingFace, which we can do on the official HuggingFace website. After signing up or logging in with your details, go to your profile and click on the Settings section. The process from there is as follows.
In Settings, go to Access Tokens. Here you will create a new token, which we need in order to work with the Falcon-7B-Instruct model. Click on New Token, enter a name for the token, and set the Role option to Write. Now click on Generate to create the new token. With this token, we can access the Falcon-7B-Instruct model and build applications.
Before we dive into our application, we will set up an environment for the code to work in. For this, we need to install the necessary Python libraries. We will start by installing the libraries that support the model, with a pip install of the packages below.
$ pip install huggingface_hub
$ pip install transformers
These commands install the HuggingFace Hub and Transformers libraries, which we use to call the Falcon-7B-Instruct model hosted on HuggingFace. Next, we install the LangChain library for Python.
$ pip install langchain
This installs the LangChain package for Python, which we will use to build our chat application around the Falcon Large Language Model. Finally, a conversational application is not complete without a UI, so we will install the Chainlit library.
$ pip install chainlit
This installs the Chainlit library for Python. With this library, we will build the UI for our conversational chat application. After installing Chainlit, we should test the package. For this, use the command below in the terminal.
chainlit hello
After entering this command, a new browser window opens at localhost on port 8000, and the Chainlit UI becomes visible. This confirms that the Chainlit library is installed properly and ready to work with the other libraries in Python.
In this section, we will start building our application. We have all the necessary libraries to go ahead and build our very own conversational chat application. The first thing we will do is import the libraries and store the HuggingFace Inference API token in an environment variable.
import os
import chainlit as cl
from langchain import HuggingFaceHub, PromptTemplate, LLMChain
os.environ['API_KEY'] = 'Your API Key'
Now we will access the Falcon Instruct model through the HuggingFaceHub module. For this, we first provide the path to the model on the Hugging Face Hub. The code for this is:
model_id = 'tiiuae/falcon-7b-instruct'
falcon_llm = HuggingFaceHub(huggingfacehub_api_token=os.environ['API_KEY'],
                            repo_id=model_id,
                            model_kwargs={"temperature": 0.8, "max_new_tokens": 2000})
Now we have clearly defined which model we will work with, and the HuggingFace API lets us connect to this model and run our queries as we build the application.
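As an optional sanity check (our own addition, not required for the walkthrough), the LangChain LLM wrapper of this era can be called directly with a string prompt, which is a quick way to confirm the Inference API token works before we build the chain:

# Optional quick check that the Inference API token and model path are working
print(falcon_llm("What is the capital of France?"))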
After selecting the model, the next step is defining the Prompt Template. The Prompt Template tells the model how to behave: how it should interpret the question provided by the user and how it should frame the answer it returns. The code for defining our Prompt Template is:
template = """
You are an AI assistant that provides helpful answers to user queries.
{question}
"""
prompt = PromptTemplate(template=template, input_variables=['question'])
The template variable above defines and sets the context of the Prompt Template for the Falcon model. The context here is simple: the AI needs to provide helpful answers to user queries, followed by the input variable {question}. This template, along with the variables defined in it, is passed to the PromptTemplate function, and the result is assigned to a variable. That variable is now our Prompt Template, which will later be chained together with the model.
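To see what the model will actually receive, we can format the template with a sample question. This small check is our own illustration and is not required for the application:

# Prints the template text with {question} replaced by the supplied query
print(prompt.format(question="What are the colors in the Rainbow?"))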
Now we have both the Falcon LLM and the Prompt Template ready. The final step is chaining the two together. For this, we will use the LLMChain object from the LangChain library. The code for this is:
falcon_chain = LLMChain(llm=falcon_llm,
                        prompt=prompt,
                        verbose=True)
With the help of LLMChain, we have chained the Falcon-7B-Instruct model with the PromptTemplate we created. We have also set verbose=True, which is helpful for seeing what happens while the code runs. Now let’s test the model by giving it a query.
print(falcon_chain.run("What are the colors in the Rainbow?"))
Here, we asked the model what the colors of the rainbow are: VIBGYOR (Violet, Indigo, Blue, Green, Yellow, Orange, and Red). The output generated by the Falcon-7B-Instruct model is spot on. Setting the verbose option lets us see the prompt after formatting and shows where the chain starts and ends. Finally, we are ready to create a UI for our conversational chat application.
Chainlit: Built specifically for chat interfaces. It keeps the conversation flowing smoothly, remembers past messages, and has features such as showing the “thought process” behind answers and handling images. It is a newer library with a smaller community and less help available, but it is ideal for creating chatbots and talking to Large Language Models.
Streamlit: More like a general-purpose toolbox with many components (charts, buttons, widgets) for any kind of data app (dashboards, reports). It is easy to use, even for beginners, but it reruns the whole script every time something changes, which can be slow for large applications (see the minimal sketches below).
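To make the difference concrete, here are two minimal “echo bot” sketches, one per library. They are illustrative only, and the exact APIs shown (for example cl.on_message, st.chat_input) depend on the Chainlit and Streamlit versions you have installed.

# chainlit_echo.py -- run with: chainlit run chainlit_echo.py
import chainlit as cl

@cl.on_message
async def respond(message: cl.Message):
    # Chainlit keeps the chat session alive; only this handler runs per message
    await cl.Message(content=f"You said: {message.content}").send()

# streamlit_echo.py -- run with: streamlit run streamlit_echo.py
import streamlit as st

st.title("Echo chat")
if "history" not in st.session_state:
    # Streamlit reruns the whole script on every interaction, so state must be stored explicitly
    st.session_state.history = []

user_input = st.chat_input("Say something")
if user_input:
    st.session_state.history.append(("user", user_input))
    st.session_state.history.append(("assistant", f"You said: {user_input}"))

for role, text in st.session_state.history:
    with st.chat_message(role):
        st.write(text)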
In this section, we will use the Chainlit package to create the UI for our application. Chainlit is a Python library that lets us build chat interfaces for Large Language Models in minutes. It integrates with LangFlow and LangChain, the library we worked with earlier. Creating the chat interface with Chainlit is simple; we only have to write the following code:
@cl.langchain_factory(use_async=False)
def factory():
    prompt = PromptTemplate(template=template, input_variables=['question'])
    falcon_chain = LLMChain(llm=falcon_llm,
                            prompt=prompt,
                            verbose=True)
    return falcon_chain
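Note that cl.langchain_factory comes from the Chainlit releases available when this guide was written. If your Chainlit version has removed that decorator, a rough equivalent (our own sketch, so verify it against the Chainlit docs for your version) is to handle each incoming message explicitly:

# Sketch for newer Chainlit releases: run the chain ourselves per message
@cl.on_message
async def handle_message(message: cl.Message):
    # cl.make_async runs the synchronous LangChain call without blocking the event loop
    response = await cl.make_async(falcon_chain.run)(message.content)
    await cl.Message(content=response).send()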
With this factory function in place, that’s it: when we run the code, a chat interface becomes visible. How is this possible? Chainlit takes care of everything. Behind the scenes, it manages the websocket connections and creates a separate LangChain instance (Chain, Agent, etc.) for each user that visits the site. To run our application, we type the following in the terminal.
$ chainlit run app.py -w
The -w flag enables auto-reload whenever we make live changes to our application code. After entering this command, a new tab opens at localhost:8000.
This is the opening page, i.e., the welcome screen of Chainlit. We see that Chainlit builds an entire chat interface for us with just a single decorator. Let’s try interacting with the Falcon model through this UI.
We see that the UI and the Falcon Instruct model work perfectly well together. The model provides swift answers to the questions asked and even adapts the second answer to the user’s context (explain it to a 5-year-old). This is only the beginning of what we can achieve with these open-source Generative AI models; with a few modifications, we can create much more problem-oriented, real-scenario applications.
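For instance, steering the same chain toward a narrower use case is often just a matter of rewriting the template. The snippet below is our own illustrative sketch: the persona and wording are hypothetical and not part of the original guide, but it reuses only the objects already defined above.

# Hypothetical example: repurpose the chain as a customer-support assistant
support_template = """
You are a polite customer support assistant for an online bookstore.
Answer the user's question clearly, and ask for an order number if one is needed.
{question}
"""
support_prompt = PromptTemplate(template=support_template, input_variables=['question'])
support_chain = LLMChain(llm=falcon_llm, prompt=support_prompt, verbose=True)
print(support_chain.run("My book arrived damaged, what should I do?"))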
Because the chat interface is a website, it is entirely possible to host it on any cloud platform. We can containerize the application and deploy it on any container-based service on Google Cloud, AWS, Azure, or another cloud provider. With that, we can share our application with the outside world.
In this walkthrough, we have seen how to build a simple chat application with the new open-source Falcon Large Language Model, LangChain, and Chainlit. We leveraged these three packages and interconnected them to create a full-fledged solution, from code to working application. We also saw how to obtain the HuggingFace Inference API key to access thousands of pre-trained models in the HuggingFace library. With LangChain, we chained the LLM with a custom Prompt Template, and finally, with Chainlit, we created a chat application interface around our LangChain Falcon model within minutes. We hope this article also made the differences between Chainlit and Streamlit, and their respective uses, clear.
Some of the key takeaways from this guide include:
Falcon AI is an open-source Large Language Model from TII whose instruct variants (Falcon-7B-Instruct and Falcon-40B-Instruct) make it easy to start building chat applications.
The HuggingFace Inference API lets us call the Falcon-7B-Instruct model without downloading it, so we do not need large amounts of GPU memory.
LangChain lets us chain the model with a custom Prompt Template through the LLMChain object.
Chainlit turns that chain into a complete chat interface with a single decorator, within minutes.
Q. What is the HuggingFace Inference API?
A. The Inference API is created by HuggingFace and allows you to access thousands of pre-trained models in the HuggingFace library. With this API, you can access a variety of models, including Generative AI models, Natural Language Processing models, audio classification models, and computer vision models.
Q. Are the Falcon models really state of the art?
A. They are, especially the Falcon 40B (40 billion parameters) model. It has surpassed other state-of-the-art models such as Llama and those from DeepMind, acquiring the top position on the OpenLLM Leaderboard.
Q. What is Chainlit?
A. Chainlit is a Python library developed for creating chat UIs. With Chainlit, you can create ready-to-use chat interfaces for Large Language Models within minutes. The Chainlit package integrates seamlessly with LangFlow and LangChain, other packages used to create applications with Large Language Models.
Q. Are the Falcon models open source?
A. Yes. Falcon 40B (40 billion parameters) and Falcon 7B (7 billion parameters) are open-sourced, which means anyone can use these models to create commercial applications without restrictions.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.