At 6:30 AM on a Thursday, I received a call from my senior regarding a sentiment analysis project. Still in the haze of sleep, I relied on the call recorder to recap the requirements and deadline. I was tasked with analyzing review data and categorizing them as Negative, Positive, or Neutral, with a delivery deadline of the upcoming Monday. Given my keen interest in Natural Language Processing (NLP), sentiment analysis intrigued me incredibly. Aware of Lexicon-based sentiment analyzers, particularly TextBlob, I opted for it due to time constraints. TextBlob, developed by Steven Loria, is a Python library that leverages the Natural Language Toolkit for various tasks. I’ve observed numerous projects employing TextBlob as a sentiment analyzer, often analyzing Twitter data or movie reviews.
This article was published as a part of the Data Science Blogathon.
Reviews are super important. They tell us what people think about stuff like products or movies. Whether someone loves something or hates it, their review helps others decide. For example, “The Shawshank Redemption” got a really high score, while “Student of the Year 2” didn’t do so well.
I recently looked at reviews for mobile phones from different places like online stores and social media. First, I had to collect all the data, which took a bit of work. Then, I had to clean it up because there was a lot of extra stuff that didn’t matter. I used some tools to help with that. Now, the data is all set for me to analyze and see what people are saying.
TextBlob is a Python library used for Natural Language Processing (NLP). It relies on NLTK (Natural Language Toolkit). When you give it a sentence, it gives back two things: polarity and subjectivity.
The polarity score ranges from -1 to 1. A score of -1 means the words are super negative, like “disgusting” or “awful.” A score of 1 means the words are super positive, like “excellent” or “best.”
The subjectivity score, on the other hand, ranges from 0 to 1. A value close to 1 means the sentence is mostly personal opinion rather than fact.
For my project, I was interested only in the polarity score, so I didn’t use the subjectivity score. TextBlob can do a lot of other things too, like figuring out noun phrases, tagging parts of speech, breaking down words, and more.
To start working with TextBlob, you need Python installed and pip configured. The pip installation command for TextBlob is:
pip install textblob
To import TextBlob, write:
from textblob import TextBlob
TextBlob syntax to get polarity score:
res = TextBlob(sentence)
print(res.sentiment.polarity)
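Since the project needed Negative, Positive, or Neutral labels rather than raw scores, the polarity has to be mapped to a category at some point. The following is a minimal sketch of one way to do that. The function names and the neutral_band cutoff are illustrative assumptions, not something TextBlob provides.

```python
def polarity_to_label(polarity, neutral_band=0.0):
    """Map a polarity score in [-1, 1] to a sentiment label.

    neutral_band is a hypothetical cutoff: scores whose absolute value
    is at most neutral_band are treated as Neutral.
    """
    if polarity > neutral_band:
        return "Positive"
    if polarity < -neutral_band:
        return "Negative"
    return "Neutral"


def textblob_label(sentence, neutral_band=0.0):
    """Score a sentence with TextBlob and map its polarity to a label."""
    # Imported here so polarity_to_label can be used without TextBlob installed.
    from textblob import TextBlob

    return polarity_to_label(TextBlob(sentence).sentiment.polarity, neutral_band)


print(polarity_to_label(0.6))    # Positive
print(polarity_to_label(-0.69))  # Negative
print(polarity_to_label(0.0))    # Neutral
```

A wider neutral_band (for example 0.05) would push weakly scored sentences into the Neutral class. The right value depends on the data.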
Because TextBlob is a lexicon-based sentiment analyzer, it relies on predefined rules, in effect a dictionary of words and weights, whose scores are used to calculate a sentence’s polarity. That’s why lexicon-based sentiment analyzers are also called “rule-based sentiment analyzers”.
Let’s check the polarity of some random sentences with TextBlob. The beauty of TextBlob is that it has a very easy syntax.
We get polarity values of 0.85, -0.69, and 0.73 respectively. In the above data, the negative sentence “This movie is badly directed” has a polarity score of -0.69, which marks it as strongly negative.
Let’s change the word “badly” to “amazingly”.
res = TextBlob("This movie is amazingly directed")
print(res.sentiment.polarity)
The output is 0.6000000000000001.
Here, TextBlob works amazingly as a sentiment analyzer. I delivered my project successfully the following Monday and got appreciation from my colleagues as well.
The next day, I was looking through the result files when one particular sentence caught my attention.
It was “no slow-motion camera”.
As I mentioned, my domain was mobile phone reviews, so anyone who writes this sentence means it negatively, yet TextBlob classified it as positive with a polarity score of 0.15. That made me curious enough to explore how TextBlob works. I found that when a negation precedes a word, TextBlob simply multiplies the word’s polarity score by -0.5. In my case, the word “slow” is negative, with a polarity score of -0.3, so multiplying by -0.5 makes the resulting polarity of the sentence a positive 0.15.
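The arithmetic behind that surprising result can be reproduced in a few lines. This is only a toy illustration of the negation rule as described above, not TextBlob’s actual implementation. apply_negation is a hypothetical helper, and the word score is the one quoted in this article.

```python
NEGATION_FACTOR = -0.5  # multiplier applied to a negated word, as described above


def apply_negation(word_polarity, negated):
    """Return a word's polarity, flipped and damped when the word is negated."""
    return word_polarity * NEGATION_FACTOR if negated else word_polarity


# "slow" on its own is mildly negative.
print(apply_negation(-0.3, negated=False))           # -0.3
# "no slow ..." comes out mildly positive, which is wrong for a phone review.
print(round(apply_negation(-0.3, negated=True), 2))  # 0.15
```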
Another issue I faced with TextBlob appeared when the negation word sits somewhere in between, i.e. not adjacent to the word that has a polarity other than 0.
In the above example, the word “best” has a polarity score of 1.0. In the second sentence, the negation should multiply 1.0 by -0.5 and give -0.5, but this is not the case. The reason is that TextBlob treats “not best” differently from “not the best”, and that creates the issue. This needed to be fixed because it was affecting the overall sentiment for the product.
I delved back into exploring sentiment analyzers and came across a research paper by C.J. Hutto and Eric Gilbert introducing VADER (Valence Aware Dictionary and sEntiment Reasoner). VADER, like TextBlob, is a lexicon-based sentiment analyzer with predefined rules for words or lexicons. However, what sets VADER apart is its ability to not only classify words as positive, negative, or neutral but also evaluate the overall sentiment of a sentence.
The output from VADER is presented in a Python dictionary format, consisting of four keys: ‘neg’ for negative, ‘neu’ for neutral, ‘pos’ for positive, and ‘compound’. The compound score is particularly noteworthy as it represents the overall sentiment of the sentence: it is computed by summing the valence scores of the words in the sentence, adjusting them according to VADER’s rules, and normalizing the result to lie between -1 and +1. Similar to TextBlob, a score of -1 indicates the most negative sentiment, while a score of +1 indicates the most positive sentiment.
It works differently than TextBlob. I took some of the problematic sentences and executed them with VADER and the output was correct.
To start working on VADER we need to install it with pip.
pip install vaderSentiment
We need to import and initialize it as:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
sid_obj = SentimentIntensityAnalyzer()
I checked with my problematic sentence:
print(sid_obj.polarity_scores("no slow motion camera"))
The output of the above sentence is a compound score of -0.296.
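To turn a compound score into one of the three categories, the VADER documentation suggests cutoffs of 0.05 and -0.05. Here is a minimal sketch using those cutoffs. compound_to_label is a hypothetical helper, and the dictionary in the example carries only the compound value reported above (the real output also has the ‘neg’, ‘neu’, and ‘pos’ keys).

```python
def compound_to_label(scores, cutoff=0.05):
    """Map a VADER polarity_scores() dictionary to a sentiment label."""
    compound = scores["compound"]
    if compound >= cutoff:
        return "Positive"
    if compound <= -cutoff:
        return "Negative"
    return "Neutral"


# Only the compound value from the sentence above is used here.
print(compound_to_label({"compound": -0.296}))  # Negative
```

With the analyzer initialized as shown earlier, compound_to_label(sid_obj.polarity_scores(text)) would label any sentence.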
I analyzed the whole corpus with VADER and TextBlob. The output led me to the conclusion that TextBlob was struggling with negative sentences, particularly those containing negations.
The scatter plot above illustrates the Pearson correlation coefficient between VADER and TextBlob. Analyzing the graph, we observe that while VADER classified certain sentences as negative, TextBlob identified them mostly as positive. In the first and third quadrants, both algorithms show agreement. However, in the second and fourth quadrants, there is noticeable inconsistency, particularly in the fourth quadrant, where contradictory data is prevalent. Here, TextBlob labels sentences as positive, whereas VADER categorizes them as negative.
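For readers who want to reproduce this kind of comparison, here is a minimal sketch that computes the Pearson correlation between two lists of scores and counts how many sentences fall in each quadrant. It assumes TextBlob polarity on the x-axis and VADER compound on the y-axis, which matches the quadrant description above. The function names and the three sample score pairs are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def quadrant_counts(textblob_scores, vader_scores):
    """Count points per quadrant; points lying on an axis are skipped."""
    counts = {"Q1": 0, "Q2": 0, "Q3": 0, "Q4": 0}
    for tb, vd in zip(textblob_scores, vader_scores):
        if tb > 0 and vd > 0:
            counts["Q1"] += 1  # both positive: agreement
        elif tb < 0 and vd > 0:
            counts["Q2"] += 1  # TextBlob negative, VADER positive
        elif tb < 0 and vd < 0:
            counts["Q3"] += 1  # both negative: agreement
        elif tb > 0 and vd < 0:
            counts["Q4"] += 1  # TextBlob positive, VADER negative
    return counts


# Made-up score pairs, one per sentence.
textblob_scores = [0.15, 0.5, -0.4]
vader_scores = [-0.296, 0.6, -0.5]
print(pearson(textblob_scores, vader_scores))
print(quadrant_counts(textblob_scores, vader_scores))
```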
To mitigate my bias towards TextBlob, I sought further evidence to ascertain whether VADER indeed outperforms TextBlob in my project. Additional experiments were necessary to validate this hypothesis.
As said by Richard Feynman, “It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong.”
The most effective approach was to compare the two algorithms I had, but the challenge was determining the benchmark for comparison. I needed a reliable method for evaluating sentiment accurately. Initially, I considered personally annotating all sentiments, but after some research, I discovered the concept of the “Wisdom of Crowds.” Described in James Surowiecki’s book, “The Wisdom of Crowds,” this concept suggests that the collective knowledge of a group of people, expressed through their combined opinions, can be as reliable as that of an expert. Therefore, I chose to rely on the collective judgments of a group of individuals to establish the correct sentiment.
I selected 20 people for this task, of whom 10 had expertise in the mobile domain while the rest did not. I gave them 150 random sentences to mark as Positive, Negative, or Neutral. Then, for each sentence, I averaged the ratings given by all 20 people to arrive at a final sentiment rating. That was the gold standard.
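One simple way to average categorical ratings is to map each label to a number first. The sketch below assumes a mapping of -1, 0, and +1 and a cutoff of 1/3 for converting the mean back to a label. Both are illustrative choices rather than the exact procedure used for the gold standard, and aggregate_votes is a hypothetical helper.

```python
LABEL_TO_SCORE = {"Negative": -1, "Neutral": 0, "Positive": 1}  # assumed mapping


def aggregate_votes(votes, cutoff=1 / 3):
    """Average the annotators' labels and map the mean back to a label."""
    mean = sum(LABEL_TO_SCORE[vote] for vote in votes) / len(votes)
    if mean > cutoff:
        return "Positive"
    if mean < -cutoff:
        return "Negative"
    return "Neutral"


# Made-up votes from 20 annotators for a single sentence.
votes = ["Negative"] * 14 + ["Neutral"] * 4 + ["Positive"] * 2
print(aggregate_votes(votes))  # Negative
```

A majority vote would be another reasonable choice. Averaging has the side effect that a sentence with evenly split Positive and Negative votes ends up Neutral.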
Now we can compare TextBlob and VADER. To measure the accuracy of each algorithm against the human-annotated sentences, I created a confusion matrix for each algorithm versus the crowdsourced data.
The results are convincing: VADER outperforms TextBlob when it comes to negative polarity detection. In the above-mentioned confusion matrices, VADER achieves an overall accuracy of 63.3%, whereas TextBlob achieves 41.3%.
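As a reminder of how such numbers are obtained, here is a minimal sketch that builds a three-class confusion matrix and computes overall accuracy from it. The function names and the four toy label pairs are made up for illustration and are not taken from the project data.

```python
from collections import Counter

LABELS = ["Negative", "Neutral", "Positive"]


def confusion_matrix(gold, predicted):
    """Rows are gold labels, columns are predicted labels, in LABELS order."""
    counts = Counter(zip(gold, predicted))
    return [[counts[(g, p)] for p in LABELS] for g in LABELS]


def accuracy(matrix):
    """Share of sentences on the diagonal, i.e. classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy data: crowd labels versus one algorithm's labels.
gold = ["Negative", "Negative", "Neutral", "Positive"]
predicted = ["Negative", "Positive", "Neutral", "Positive"]

matrix = confusion_matrix(gold, predicted)
print(matrix)            # [[1, 0, 1], [0, 1, 0], [0, 0, 1]]
print(accuracy(matrix))  # 0.75
```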
It depends on the user’s needs. In my opinion, VADER is not universally better than TextBlob. However, it does excel particularly in classifying negative sentiment.
In the above-mentioned table, the F1 score for negative polarity detection is 0.80 for VADER and 0.56 for TextBlob. From this, we can conclude that VADER does better when it comes to negative polarity detection.
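The F1 score for a single class is the harmonic mean of precision and recall for that class. The following sketch computes it for one label. f1_for_label is a hypothetical helper and the toy labels are made up, so the value it prints has nothing to do with the scores reported above.

```python
def f1_for_label(gold, predicted, label):
    """F1 score of one class, treating every other class as 'not label'."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # no correct hits, so precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: crowd labels versus one algorithm's labels.
gold = ["Negative", "Negative", "Neutral", "Positive"]
predicted = ["Negative", "Positive", "Neutral", "Positive"]

# One of two Negative sentences is found: precision 1.0, recall 0.5.
print(round(f1_for_label(gold, predicted, "Negative"), 3))  # 0.667
```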
In conclusion, my exploration of sentiment analysis compared TextBlob and VADER algorithms. While TextBlob struggled with negative sentences, VADER outperformed in detecting negative polarity. However, VADER isn’t universally superior; its strength lies in negative sentiment classification. Ultimately, the choice between the two depends on specific project needs, highlighting the importance of understanding algorithm capabilities for effective sentiment analysis.
A. By default, TextBlob’s sentiment analysis is lexicon-based: it looks up the words of a given text in a built-in dictionary of polarity and subjectivity scores and combines them, applying rules for modifiers and negations, into a polarity score and a subjectivity score for the text.
A. TextBlob is a Python library that simplifies text processing, including tasks like part-of-speech tagging, noun phrase extraction, and sentiment analysis. It provides a simple API for diving into common natural language processing tasks.
A. TextBlob’s default sentiment analyzer, PatternAnalyzer, is lexicon-based rather than a trained model. TextBlob also provides an optional NaiveBayesAnalyzer, an NLTK classifier trained on a movie reviews corpus, which classifies text as positive or negative.
A. The sentiment score in TextBlob’s polarity ranges from -1 to 1, where -1 represents a highly negative sentiment, 0 is neutral, and 1 indicates a highly positive sentiment. The subjectivity score ranges from 0 to 1, with 0 being objective and 1 being subjective.
This is an interesting topic that I have not come across before. It's great to learn about the use of textblob and vader libraries in sentiment analysis for python. Can you provide more information on how these libraries can help us analyze the sentiment of texts?
Great blog, I have gained useful insights from this thank you very much!!