In today’s digital age, platforms like Twitter, Goodreads, and Amazon overflow with people’s opinions, making it crucial for organizations to extract insights from this massive volume of data. Sentiment analysis in Python offers a powerful solution to this challenge. This technique, a subset of Natural Language Processing (NLP), involves classifying texts into sentiments such as positive, negative, or neutral. By employing various Python libraries and models, analysts can automate this process efficiently. Let’s delve into how to perform sentiment analysis in Python and explore some examples of its application.
In this article, you will get a clear understanding of the sentiment analysis model, including its applications and a practical sentiment analysis example.
Sentiment analysis is a use case of Natural Language Processing (NLP) and comes under the category of text classification. To put it simply, sentiment analysis involves classifying a text into sentiments such as positive, negative, happy, sad, or neutral. Thus, the ultimate goal of sentiment analysis is to decipher the underlying mood, emotion, or sentiment of a text. This is also referred to as Opinion Mining.
A quick Google search defines sentiment analysis as "the process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude towards a particular topic, product, etc. is positive, negative, or neutral."
Sentiment analysis in Python typically works by employing natural language processing (NLP) techniques to analyze and understand the sentiment expressed in text. The process generally involves several steps: collecting and cleaning the text, tokenizing it, extracting features (such as word counts or embeddings), and finally classifying the sentiment with a rule-based or machine learning model.
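As a rough illustration of these steps, the minimal scikit-learn sketch below cleans a handful of invented example sentences, turns them into bag-of-words features, trains a simple classifier, and predicts the sentiment of a new sentence. The tiny in-line dataset and the choice of Naive Bayes are assumptions made purely for demonstration.
#A minimal end-to-end sketch of the typical sentiment analysis steps
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
texts = ["I love this phone", "Worst purchase ever",
         "Absolutely fantastic service", "I hate the new update"]
labels = ["positive", "negative", "positive", "negative"]
#Step 1: basic pre-processing (lowercasing, stripping non-alphabetic characters)
cleaned = [re.sub(r"[^a-z\s]", "", t.lower()) for t in texts]
#Step 2: feature extraction (bag-of-words counts)
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(cleaned)
#Step 3: training a classifier on the labelled examples
clf = MultinomialNB()
clf.fit(X, labels)
#Step 4: classifying unseen text
print(clf.predict(vectorizer.transform(["i really love the service"])))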
Various types of sentiment analysis can be performed, depending on the specific focus and objective of the analysis. Common types include document-level and sentence-level sentiment analysis, aspect-based sentiment analysis, and emotion detection.
Sentiment analysis is a valuable tool for organizations that want to understand customer sentiment and make informed decisions. For example, a perfume company selling online can analyze customer reviews to identify which fragrances are popular and offer discounts on the unpopular ones. However, with so many fragrances and reviews to go through, reading them all manually is practically impossible.
You simply gather all the reviews in one place and apply sentiment analysis to them. The following is a schematic representation of sentiment analysis on the reviews of three perfume fragrances: Lavender, Rose, and Lemon. (Please note that these reviews may contain incorrect spelling, grammar, and punctuation, as is common in real-world data.)
From results like these, we can clearly see which fragrances customers like and which ones they do not.
This was just a simple example of how sentiment analysis can help you gain insights into your products/services and help your organization make decisions.
We just saw how sentiment analysis can empower organizations with insights that help them make data-driven decisions. Other common use cases include social media monitoring, customer feedback and review analysis, brand reputation management, and market research.
Python is one of the most powerful tools when it comes to performing data science tasks; it offers a multitude of ways to perform sentiment analysis. The most popular ones are listed here:
1. Using TextBlob
2. Using VADER
3. Using Bag of Words Vectorization-based models
4. Using LSTM-based models
5. Using Transformer-based models
Let’s dive deep into them one by one.
Note: For the demonstrations of methods 3 & 4 (Using Bag of Words Vectorization-based Models and Using LSTM-based Models), a labelled financial sentiment dataset from Kaggle has been used. It comprises more than 5000 texts labelled as positive, negative, or neutral. The dataset lies under the Creative Commons license.
TextBlob is a Python library for Natural Language Processing. Using TextBlob for sentiment analysis is quite simple. It takes text as input and can return polarity and subjectivity as outputs.
Here are the steps to perform sentiment analysis in Python using TextBlob:
pip install textblob
from textblob import TextBlob
Writing code for sentiment analysis using TextBlob is fairly simple. Just import the TextBlob object and pass the text to be analyzed with appropriate attributes as follows:
from textblob import TextBlob
text_1 = "The movie was so awesome."
text_2 = "The food here tastes terrible."
#Determining the Polarity
p_1 = TextBlob(text_1).sentiment.polarity
p_2 = TextBlob(text_2).sentiment.polarity
#Determining the Subjectivity
s_1 = TextBlob(text_1).sentiment.subjectivity
s_2 = TextBlob(text_2).sentiment.subjectivity
print("Polarity of Text 1 is", p_1)
print("Polarity of Text 2 is", p_2)
print("Subjectivity of Text 1 is", s_1)
print("Subjectivity of Text 2 is", s_2)
Output
Polarity of Text 1 is 1.0
Polarity of Text 2 is -1.0
Subjectivity of Text 1 is 1.0
Subjectivity of Text 2 is 1.0
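Since TextBlob returns only numeric scores, a common next step is to map the polarity to a sentiment label. The helper below is a minimal sketch; the cut-off of 0.05 is an assumption chosen for illustration, not a TextBlob convention.
#Mapping TextBlob polarity to a sentiment label (illustrative thresholds)
from textblob import TextBlob
def polarity_to_label(text, threshold=0.05):
    polarity = TextBlob(text).sentiment.polarity   # ranges from -1.0 to 1.0
    if polarity > threshold:
        return "positive"
    elif polarity < -threshold:
        return "negative"
    return "neutral"
print(polarity_to_label("The movie was so awesome."))       # positive
print(polarity_to_label("The food here tastes terrible."))  # negative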
VADER (Valence Aware Dictionary and Sentiment Reasoner) is a rule-based sentiment analyzer that has been trained on social media text. Just like TextBlob, its usage in Python is pretty simple. We’ll see its usage in code implementation with an example in a while.
pip install vaderSentiment
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
Firstly, we need to create an object of the SentimentIntensityAnalyzer class; then we need to pass the text to the polarity_scores() function of the object as follows:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
sentiment = SentimentIntensityAnalyzer()
text_1 = "The book was a perfect balance between wrtiting style and plot."
text_2 = "The pizza tastes terrible."
sent_1 = sentiment.polarity_scores(text_1)
sent_2 = sentiment.polarity_scores(text_2)
print("Sentiment of text 1:", sent_1)
print("Sentiment of text 2:", sent_2)
Output:
Sentiment of text 1: {'neg': 0.0, 'neu': 0.73, 'pos': 0.27, 'compound': 0.5719}
Sentiment of text 2: {'neg': 0.508, 'neu': 0.492, 'pos': 0.0, 'compound': -0.4767}
As we can see, the SentimentIntensityAnalyzer object returns a dictionary of sentiment scores for the text being analyzed.
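To turn these scores into a single label, the compound score is usually compared against thresholds; the values of +/-0.05 below are the ones commonly suggested in the VADER documentation, and the helper itself is just an illustrative sketch.
#Converting VADER's compound score into a sentiment label
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()
def vader_label(text):
    compound = analyzer.polarity_scores(text)["compound"]
    if compound >= 0.05:
        return "positive"
    elif compound <= -0.05:
        return "negative"
    return "neutral"
print(vader_label("The pizza tastes terrible."))  # negative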
In the two approaches discussed so far, i.e., TextBlob and VADER, we simply used Python libraries to perform sentiment analysis. Now we’ll discuss an approach in which we train our own model for the task. The steps involved in performing sentiment analysis using the Bag of Words Vectorization method are as follows: pre-process the text, create a Bag of Words representation with a count vectorizer, split the data into training and testing sets, train a classifier (here, Multinomial Naive Bayes), and evaluate its accuracy.
Code for Sentiment Analysis using Bag of Words Vectorization Approach:
To build a sentiment analysis model in Python using the BOW Vectorization approach, we need a labelled dataset. As stated earlier, the dataset used for this demonstration has been obtained from Kaggle. We simply use sklearn’s CountVectorizer to create the BOW representation and then train a Multinomial Naive Bayes classifier, which achieves an accuracy score of about 0.91, as shown in the output below.
The dataset can be obtained from Kaggle.
#Loading the Dataset
import pandas as pd
data = pd.read_csv('Finance_data.csv')
#Pre-Processing and Bag of Words Vectorization using Count Vectorizer
from sklearn.feature_extraction.text import CountVectorizer
from nltk.tokenize import RegexpTokenizer
token = RegexpTokenizer(r'[a-zA-Z0-9]+')
cv = CountVectorizer(stop_words='english',ngram_range = (1,1),tokenizer = token.tokenize)
text_counts = cv.fit_transform(data['sentences'])
#Splitting the data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(text_counts, data['feedback'], test_size=0.25, random_state=5)
#Training the model
from sklearn.naive_bayes import MultinomialNB
MNB = MultinomialNB()
MNB.fit(X_train, Y_train)
#Calculating the accuracy score of the model
from sklearn import metrics
predicted = MNB.predict(X_test)
accuracy_score = metrics.accuracy_score(predicted, Y_test)
print("Accuracuy Score: ",accuracy_score)
Output:
Accuracy Score: 0.9111675126903553
The trained classifier can be used to predict the sentiment of any given text input.
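For example, a couple of new reviews can be passed through the same fitted CountVectorizer before calling predict. This is a small illustrative sketch that reuses the cv and MNB objects from the code above; the review texts are made up.
#Predicting the sentiment of new, unseen reviews
new_reviews = ["The stock rallied after strong quarterly earnings.",
               "Investors were disappointed by the sudden drop in revenue."]
new_counts = cv.transform(new_reviews)   # use transform, not fit_transform, on new data
print(MNB.predict(new_counts))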
Though we were able to obtain a decent accuracy score with the Bag of Words Vectorization method, it might fail to yield the same results when dealing with larger datasets. This gives rise to the need for deep learning-based models for training the sentiment analysis model.
For NLP tasks, we generally use RNN-based models since they are designed to deal with sequential data. Here, we’ll train an LSTM (Long Short-Term Memory) model using TensorFlow with Keras. The steps to perform sentiment analysis using an LSTM-based model are as follows: pre-process and clean the text, tokenize it and convert it into padded sequences, build the LSTM model, train it on the labelled data, and evaluate it on the test set.
Here, we have used the same dataset as we used in the case of the BOW approach. A training accuracy of 0.90 was obtained.
#Importing necessary libraries
import nltk
import pandas as pd
from textblob import Word
from nltk.corpus import stopwords
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report,confusion_matrix,accuracy_score
from keras.models import Sequential
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.layers import Dense, Embedding, LSTM, SpatialDropout1D, LeakyReLU
from sklearn.model_selection import train_test_split
#Loading the dataset
data = pd.read_csv('Finance_data.csv')
#Pre-Processing the text
def cleaning(df, stop_words):
    # Lowercasing the text
    df['sentences'] = df['sentences'].apply(lambda x: ' '.join(x.lower() for x in x.split()))
    # Removing the digits/numbers
    df['sentences'] = df['sentences'].str.replace(r'\d+', '', regex=True)
    # Removing stop words
    df['sentences'] = df['sentences'].apply(lambda x: ' '.join(x for x in x.split() if x not in stop_words))
    # Lemmatization
    df['sentences'] = df['sentences'].apply(lambda x: ' '.join([Word(x).lemmatize() for x in x.split()]))
    return df
stop_words = stopwords.words('english')
data_cleaned = cleaning(data, stop_words)
#Generating Embeddings using tokenizer
tokenizer = Tokenizer(num_words=500, split=' ')
tokenizer.fit_on_texts(data_cleaned['sentences'].values)
X = tokenizer.texts_to_sequences(data_cleaned['sentences'].values)
X = pad_sequences(X)
#One-hot encoding the labels and splitting the data into training and testing sets
y = pd.get_dummies(data_cleaned['feedback']).values.astype('float32')
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=5)
#Model Building
model = Sequential()
model.add(Embedding(500, 120, input_length = X.shape[1]))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(704, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(352))
model.add(LeakyReLU())
model.add(Dense(3, activation='softmax'))
model.compile(loss = 'categorical_crossentropy', optimizer='adam', metrics = ['accuracy'])
print(model.summary())
#Model Training
model.fit(X_train, y_train, epochs = 20, batch_size=32, verbose =1)
#Model Testing
model.evaluate(X_test,y_test)
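Once the model is trained, the same tokenizer and padding length can be reused to score new text. The short sketch below reuses the tokenizer, X, and model objects defined above; the example sentence is invented, and the predicted class index corresponds to the column order of the one-hot labels created earlier.
#Predicting the sentiment of a new sentence (illustrative sketch)
import numpy as np
new_text = ["profits grew faster than analysts expected"]
seq = tokenizer.texts_to_sequences(new_text)
seq = pad_sequences(seq, maxlen=X.shape[1])
pred = model.predict(seq)
print(pred)                     # class probabilities
print(np.argmax(pred, axis=1))  # index of the predicted class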
Transformer-based models are one of the most advanced Natural Language Processing techniques. They follow an Encoder-Decoder-based architecture and employ the concept of self-attention to yield impressive results. Though one can always build a transformer model from scratch, it is quite a tedious task. Thus, we can use pre-trained transformer models available on Hugging Face. Hugging Face is an open-source AI community that offers a multitude of pre-trained models for NLP applications. You can use these models as they are or fine-tune them for specific tasks.
pip install transformers
import transformers
To perform any task using transformers, we first need to import the pipeline function from transformers. Then, a pipeline object is created, and the task to be performed is passed as an argument (i.e., sentiment analysis in our case). We can also specify the model that we want to use for the task. Here, since we have not mentioned a model, the distilbert-base-uncased-finetuned-sst-2-english model is used by default for sentiment analysis. You can check out the list of available tasks and models here.
from transformers import pipeline
sentiment_pipeline = pipeline("sentiment-analysis")
data = ["It was the best of times.", "t was the worst of times."]
sentiment_pipeline(data)
Output
[{'label': 'POSITIVE', 'score': 0.999457061290741}, {'label': 'NEGATIVE', 'score': 0.9987301230430603}]
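You can also name a checkpoint explicitly instead of relying on the default; for instance, passing the same DistilBERT checkpoint (or any other sentiment model from the Hugging Face Hub) to the pipeline:
from transformers import pipeline
#Explicitly selecting the checkpoint instead of the pipeline default
specific_model = pipeline("sentiment-analysis",
                          model="distilbert-base-uncased-finetuned-sst-2-english")
print(specific_model(["It was the best of times.", "It was the worst of times."]))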
There is no single best library for sentiment analysis in Python; the right choice depends on your needs. Here’s a quick comparison, with a short code check after the list:
NLTK: Powerful, versatile, good for multiple NLP tasks, but complex for sentiment analysis.
TextBlob: Beginner-friendly, simple interface for sentiment analysis (polarity, subjectivity).
Pattern: More comprehensive analysis (comparatives, superlatives, fact/opinion), steeper learning curve.
Polyglot: Fast, multilingual support (136+ languages), ideal for multiple languages.
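To get a feel for how the libraries differ in practice, you can run the same sentence through two of them and compare the outputs. This is a quick illustrative check with an invented sentence, not a benchmark.
#Comparing TextBlob and VADER on the same sentence
from textblob import TextBlob
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
sentence = "The delivery was quick but the packaging was damaged."
print("TextBlob polarity:", TextBlob(sentence).sentiment.polarity)
print("VADER scores:", SentimentIntensityAnalyzer().polarity_scores(sentence))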
Sentiment analysis in Python offers powerful tools and methodologies to extract insights from textual data across diverse applications. Through this article, we have explored various approaches such as TextBlob, VADER, and machine learning-based models for sentiment analysis. We have learned how to preprocess text data, extract features, and train models to classify sentiments as positive, negative, or neutral. Additionally, we delved into advanced techniques including LSTM and transformer-based models, highlighting their capabilities in handling complex language patterns.
These methods enable organizations to monitor brand perception, analyze customer feedback, and even predict market trends based on sentiment. As sentiment analysis continues to evolve with advancements in natural language processing, mastering these techniques in Python will prove invaluable for making data-driven decisions in today’s digital age. We hope this article has clarified how to perform sentiment analysis in Python and how to write the code for it.
I hope you understand sentiment analysis better now. A sentiment analysis model looks at text to see if it shows positive, negative, or neutral feelings. For example, it can check customer reviews to find out if people like or dislike a product.
Q1. What is sentiment analysis?
A. Sentiment analysis means extracting and determining a text’s sentiment or emotional tone, such as positive, negative, or neutral.
Q2. What is an example of sentiment analysis?
A. Sentiment analysis helps with social media posts, customer reviews, or news articles. For example, analyzing Twitter data to determine the overall sentiment towards a particular product or tracking customer sentiment in online reviews.
Q3. What are the two types of sentiment analysis?
A. The two types of sentiment analysis are (1) document-level sentiment analysis, which analyzes the sentiment of an entire document, and (2) sentence-level sentiment analysis, which focuses on analyzing the sentiment of individual sentences within a document.
Q4. Which Python library is best for sentiment analysis?
A. Best Python libraries for sentiment analysis:
NLTK: Versatile, powerful, but complex.
TextBlob: User-friendly, built on NLTK, good for beginners.
VADER: Excellent for social media sentiment, fast, and accurate.
spaCy: Efficient, accurate, and offers advanced NLP features.