How to Summarize Text with Transformer-based Models?

Janvi Kumari Last Updated : 30 May, 2024

Introduction

Text summarization is one of the most important tasks in natural language processing: it reduces long texts to brief summaries while preserving the key information. Transformers, deep learning models built around attention, have reshaped this field and deliver strong performance in both extractive and abstractive summarization. Their contextual understanding powers a wide range of applications, from document management to news aggregation. With Transformers and a few Python libraries, text summarization is straightforward to implement and opens up new opportunities for efficient information processing and decision-making.

What is Text Summarization?

Text summarization takes a long document and produces a shorter version that captures its important points. The goal is to extract the most important information in the document and present it in a clear and concise manner. News aggregation, content analysis, and information retrieval are among the uses for text summarization.

How Text Summarization is Performed Using Transformers?

There are two ways to summarize text using transformers:

Extractive Summarization: Extractive summarization identifies the important sections of a text and reproduces them verbatim, so the summary is a subset of sentences from the original text. Transformers improve this procedure by processing the text to extract features and then ranking sentences according to those features. The main steps are:

Text Processing: Transformers examine the text to determine its context and the connections among its various sections.
Feature Extraction: Key words, phrases, and other significant properties are extracted from the text.
Sentence Ranking: The order of sentences is determined by how closely they relate to the main idea of the document.
Summary Generation: A logical summary is created by combining the sentences that scored highest.
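The four steps above can be illustrated without any model at all. The following is a minimal sketch that swaps the transformer for plain word-frequency scoring, so it shows the shape of an extractive pipeline rather than how a transformer ranks sentences. The stop-word list, the function names, and the sample text are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are",
              "on", "for", "it", "that", "this", "with", "as", "by"}

def split_sentences(text):
    """Naively split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase a sentence and keep words that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]

def extractive_summary(text, num_sentences=2):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Feature extraction: document-wide word frequencies.
    freq = Counter(w for s in sentences for w in tokenize(s))

    # Sentence ranking: average frequency of the words in each sentence.
    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    # Summary generation: take the best sentences and restore document order.
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)

article = (
    "Transformers changed text summarization. "
    "The weather was pleasant yesterday. "
    "Summarization models built on transformers rank sentences by relevance. "
    "Many teams now use transformers for summarization."
)
print(extractive_summary(article, num_sentences=2))
```

In a transformer-based system, the frequency score would typically be replaced by a score derived from contextual sentence representations, while the select-and-reorder step stays the same.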

Abstractive Summarization: Abstractive summarization uses natural language techniques to interpret the important aspects of a text and generate a more “human-friendly” summary, written in new words much as a person would write it. Here, methods like encoder-decoder models are used, where:

Encoder: Processes the input text to understand and extract its features.
Decoder: Generates the summary by creating new sentences that encapsulate the essence of the original text.

In this architecture, transformers can function as the encoder, the decoder, or both. In addition to offering greater freedom, this approach frequently results in summaries that are simpler to read and seem more natural.

Transformers are trained on enormous volumes of textual data for both extractive and abstractive summarization. This extensive training makes them especially adept at summarization tasks, since it teaches them intricate patterns and connections between words, sentences, and entire documents.

Why Should You Use Transformers to Summarize Text?

The amount of available information is growing constantly, whether it comes from news articles, research papers, or any other source. Text summarization comes in handy here because it reduces large amounts of information to a short, readable format.

High Accuracy and Context Awareness

Transformers are designed to understand context at a deep level. Unlike traditional methods, they don’t just pick out keywords; they grasp the nuances and meaning of the entire text. This means the summaries they produce are more accurate and retain the essential information without losing the context.

Handling Complex and Varied Content

Whether you’re dealing with news stories, customer feedback, legal documents, or academic papers, transformers can handle it all. They are versatile and capable of summarizing various types of content effectively. This makes them ideal for applications across different fields, from marketing and research to corporate and legal settings.

Efficiency and Time-Saving

Manually summarizing documents can take a lot of time and labor. Transformers automate this process, delivering concise summaries in seconds. This allows you to quickly grasp the main points and make informed decisions without reading every document in full.

Improved Information Retrieval

In the digital age, search engines and digital libraries are essential tools. By summarizing search results, transformers help users find the most relevant information faster. This improves the overall effectiveness of information retrieval systems and enhances user experience.

Enhanced Document Management

Managing long documents, especially in corporate, legal, and academic environments, can be cumbersome. Transformers help by condensing long documents into manageable summaries, making them easier to organize and reference. This streamlines workflow and boosts productivity.

Better Customer Insights

For businesses, understanding customer feedback is crucial. Transformers can summarize vast amounts of feedback to highlight common themes and issues. This helps companies quickly identify areas for improvement and enhance their products and services.

Clarity in Legal Contracts

Legal contracts can be dense and difficult to understand. Transformers can summarize these documents, providing a clear overview of key terms and conditions. This makes it easier for stakeholders to comprehend and compare different contracts.

Streamlined Customer Service

In customer service, quickly identifying the root cause of an issue is vital. Transformers can summarize customer support requests, helping service teams resolve problems more efficiently. This leads to faster response times and improved customer satisfaction.

Transformers are quite useful for text summarization since they provide a number of important benefits.

Contextual Understanding: To comprehend the context of words, sentences, and documents, transformers make use of attention mechanisms. Accurately determining the most significant information within a text document depends on this. Transformers’ self-attention mechanism enables them to concentrate on various textual elements and comprehend the connections between disparate sections.

Large Language Models: Transformers have a deep grasp of linguistic relationships and patterns because they have been trained on enormous volumes of textual data. This extensive training lets them perform well on text summarization tasks that call for a thorough command of language.
Scalability: Transformers are well suited to summarizing lengthy documents or large volumes of text because they process the tokens of an input in parallel rather than one at a time. This parallel processing speeds up summarization considerably.
End-to-End Training: By training transformers on text summarization tasks from beginning to end, we can tailor their performance to the particular task at hand, so they learn to produce summaries suited to that task.
State-of-the-Art: Text summarization is just one of the many natural language processing tasks on which Transformers have achieved state-of-the-art results. Their reputation for producing high-quality summaries has made them the preferred choice in numerous summarization applications.
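The self-attention mechanism mentioned under Contextual Understanding can be sketched in a few lines. The example below is a simplified illustration in plain Python: it computes scaled dot-product attention over three made-up token vectors and deliberately omits the learned query, key, and value projections that a real transformer applies first. The function names and the numbers are assumptions.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Real transformers first map each vector through learned query, key and
    value matrices; they are omitted here for brevity (an assumption).
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # New representation: weighted average of all token vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy 2-dimensional "token embeddings" (made-up numbers).
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sequence; similar vectors receive larger weights, which is how related parts of a text end up influencing each other's representations.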

Summary of the Coding Procedure

Let’s now examine the code!

The first step in putting these ideas into effect is to acquire the BBC news dataset. The long articles in this dataset make excellent candidates for summarization tasks. We will go over each stage of preparing the data, creating summaries, and working with a Transformer model.

A high-level summary of the coding procedure is as follows:

Download the Dataset: Access the BBC news dataset, which contains a number of long stories that can be summarized.
Preprocess the Data: Tokenize and eliminate any extraneous information from the text data in order to make it clean and ready for training.
Train the Model: To learn from the dataset, apply a Transformer model. For abstractive summarization, this entails configuring the encoder-decoder architecture; for extractive summarization, it requires feature extraction and sentence ranking.
Create Summaries: Use the model to create summaries for newly published articles after training, and assess the coherence and quality of the created summaries.
Evaluate and Improve: Using metrics like ROUGE scores, evaluate the summarization model’s performance and make necessary adjustments to improve it.
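To make the evaluation step more concrete, here is a minimal sketch of ROUGE-1, the variant of ROUGE that compares single words between a reference summary and a generated one. It uses whitespace tokenization only and skips the stemming and other normalization that established ROUGE implementations usually apply, so treat the numbers as illustrative. The function name and the sample sentences are assumptions.

```python
from collections import Counter

def rouge_1(reference, candidate):
    """Compute ROUGE-1 precision, recall and F1 from unigram overlap."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a word counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_1(reference, candidate))
```

Recall measures how much of the reference the generated summary covers, precision measures how much of the generated summary is supported by the reference, and F1 balances the two.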

Let’s dive into the coding part and see how we can implement text summarization using Transformers with the BBC news dataset.

The command will download the file from the URL .

Steps to Summarize Text with Transformer-based Models

Let us now dive deeper into the steps we need to follow to summarize text with a transformer-based model.

Step1: Install Transformers

!pip install transformers

Step2: Importing the pipeline Function from the transformers Library

from transformers import pipeline

Step3: Importing the textwrap Library

import textwrap

The textwrap library is a standard Python library used for text formatting. It provides functionalities to format and manipulate text, such as wrapping text to a certain width, indenting text, and filling text paragraphs. This is particularly useful when you need to display text in a more readable format, especially when working with long strings of text data.
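As a quick illustration of the functionality described above, the sketch below wraps, indents, and shortens a sample sentence. The sample text and the widths are arbitrary choices made for this example.

```python
import textwrap

text = (
    "Transformers are deep learning models that use attention to relate "
    "every word in a passage to every other word."
)

# Wrap to at most 40 characters per line; fill() returns a single string.
wrapped = textwrap.fill(text, width=40)
print(wrapped)

# Indent every wrapped line by four spaces, e.g. for quoting an article.
indented = textwrap.indent(wrapped, "    ")
print(indented)

# Collapse the text to at most 30 characters, ending with a placeholder.
short = textwrap.shorten(text, width=30, placeholder="...")
print(short)
```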

Step4: Importing the numpy Library

import numpy as np

numpy is a fundamental package for numerical computing in Python. It provides support for arrays, matrices, and many mathematical functions to operate on these data structures. In the context of NLP and data manipulation, numpy is often used to handle numerical operations, create arrays for data processing, and perform statistical analysis.

Step5: Importing the pandas Library

import pandas as pd

Step6: Importing the pprint Function from the pprint Library

from pprint import pprint

The pprint module stands for “pretty-print” and is used to display data structures in a more readable and organized way. This is particularly helpful when you need to print large dictionaries or nested data structures in a human-readable format.
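The following sketch shows pprint on a nested dictionary. The record is made up for illustration; only the 'labels' and 'text' keys echo the column names used later in this article.

```python
from pprint import pformat, pprint

# A made-up nested record, used only to show the formatting.
record = {
    "labels": "business",
    "text": "Headline\n\nBody of the article",
    "stats": {"characters": 1820, "sentences": 14,
              "keywords": ["profits", "shares", "market"]},
}

# width limits the line length; long structures are split across lines.
pprint(record, width=60)

# depth hides anything nested deeper than the given level.
pprint(record, depth=1)

# pformat returns the same layout as a string instead of printing it.
formatted = pformat(record, width=60)
```

pformat is handy when the formatted text needs to go into a log message or a file rather than straight to the console.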

Step7: Loading the Dataset into a DataFrame

After importing the necessary libraries, the next step is to load the dataset into a pandas DataFrame. Here’s how you can do it:

df = pd.read_csv('bbc_text_cls.csv?dl=0')

Step8: Display the first few rows of the DataFrame to ensure it loaded correctly

pprint(df.head())

In this section of the code:

The pd.read_csv() function from the pandas library reads the CSV file at the given path and parses its contents into a DataFrame. Because the string passed here has no URL scheme, pandas treats it as a local file name (the ?dl=0 suffix is part of that name), so the file must already have been downloaded. pd.read_csv() can also read directly from a full URL.

We use the df.head() method to display the first few rows of the DataFrame. This is a quick way to verify that the dataset has been loaded correctly. The pprint function is used here to print the DataFrame in a more readable format.

Step9: Selecting a Business News Article from the DataFrame

doc = df[df.labels == 'business']['text'].sample(random_state=42)

DataFrame Filtering: df[df.labels == 'business'] filters the DataFrame to include only the rows where the 'labels' column is equal to 'business'.
Selecting the 'text' Column: ['text'] extracts the 'text' column from the filtered DataFrame.
Random Sampling: .sample(random_state=42) randomly selects one row from the 'text' column. Setting random_state=42 makes the sampling reproducible, meaning the same row is selected each time we run the code with this seed value.
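The filtering and sampling steps can be tried on a small stand-in DataFrame before running them on the real dataset. In the sketch below, the rows are invented for illustration; only the column names 'text' and 'labels' follow the code in this article.

```python
import pandas as pd

# Made-up rows standing in for the BBC dataset.
toy = pd.DataFrame(
    {
        "text": [
            "Shares rise\n\nMarkets rallied.",
            "Film wins\n\nAn award.",
            "Bank cuts rates\n\nRates fell.",
            "Profits up\n\nEarnings grew.",
        ],
        "labels": ["business", "entertainment", "business", "business"],
    }
)

# Boolean mask -> filtered rows -> 'text' column (a Series).
business = toy[toy.labels == "business"]["text"]

# sample() draws one row by default; a fixed random_state makes it repeatable.
first_draw = business.sample(random_state=42)
second_draw = business.sample(random_state=42)

print(toy["labels"].value_counts())
print(first_draw.iloc[0])
```

Checking value_counts() first is a cheap way to confirm that the category you filter on actually exists in the 'labels' column.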

Step10: Defining the Text Wrapping Function

def wrap(x):
  return textwrap.fill(x, replace_whitespace=False, fix_sentence_endings=True)

Function Definition: def wrap(x): defines a function named wrap that takes a single parameter x.
Text Wrapping with textwrap.fill: return textwrap.fill(x, replace_whitespace=False, fix_sentence_endings=True) calls the textwrap.fill function on x with specific parameters to format the text.
replace_whitespace Parameter: We set this boolean parameter to False, so whitespace characters in the input string x, such as the newlines that separate paragraphs, are kept as they are rather than each being replaced with a single space.
fix_sentence_endings Parameter: We set this boolean parameter to True, so the function tries to detect sentence endings and makes sure sentences are separated by exactly two spaces. It does not change where lines are broken.

The wrap function inserts line breaks into the input string x, ensuring each line is no longer than a specified number of characters (default is 70), and returns the modified version.
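The effect of the two parameters is easiest to see side by side. The sketch below runs textwrap.fill on a short made-up string with and without each option and prints the results with repr so that newlines and double spaces are visible.

```python
import textwrap

# Made-up text: a headline, a blank line, then two sentences.
sample = "Headline\n\nFirst sentence. Second sentence."

# Default behaviour: every whitespace character becomes a single space.
default = textwrap.fill(sample)

# replace_whitespace=False keeps the newlines from the input.
kept = textwrap.fill(sample, replace_whitespace=False)

# fix_sentence_endings=True puts exactly two spaces between sentences.
fixed = textwrap.fill(sample, replace_whitespace=False, fix_sentence_endings=True)

for label, value in [("default", default), ("kept", kept), ("fixed", fixed)]:
    print(label, repr(value))
```

For news articles where the headline and paragraphs are separated by newlines, keeping the whitespace preserves that structure in the printed output.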

Step11: Printing the Wrapped News Article

print(wrap(doc.iloc[0]))

To access the selected article, we use doc.iloc[0] to retrieve the first (and in this case, the only) element from the doc Series. We use iloc to access elements by their integer-location based index.
Applying the wrap Function: wrap(doc.iloc[0]) calls the wrap function with the selected article text as its argument. This formats the text according to the specified wrapping rules.
Printing the Formatted Text: print(wrap(doc.iloc[0])) prints the wrapped text, making it more readable by breaking it into lines of at most 70 characters (the default width).

Step12: Creating the Summarization Pipeline

summarizer = pipeline('summarization')

This line creates a summarization pipeline using the pipeline function from the transformers library. The argument 'summarization' specifies the task we will use the pipeline for.

By default, the pipeline uses the sshleifer/distilbart-cnn-12-6 model for abstractive summarization.
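Relying on the default model works, but it can be clearer to name the checkpoint and the length limits explicitly. The sketch below is one possible way to do that; the helper names load_summarizer and summarize and the length values 130 and 30 are assumptions chosen for illustration, not settings taken from this article.

```python
def load_summarizer(model_name="sshleifer/distilbart-cnn-12-6"):
    """Build a summarization pipeline for an explicitly named checkpoint."""
    # Imported here so the helper below can be used without loading a model.
    from transformers import pipeline
    return pipeline("summarization", model=model_name)

def summarize(text, summarizer, max_length=130, min_length=30):
    """Run a summarizer and return only the generated string."""
    result = summarizer(
        text,
        max_length=max_length,  # upper bound on summary length, in tokens
        min_length=min_length,  # lower bound on summary length, in tokens
        do_sample=False,        # deterministic decoding instead of sampling
        truncation=True,        # cut inputs that exceed the model's limit
    )
    # The pipeline returns a list with one dict per input text.
    return result[0]["summary_text"]

# Example usage (downloads the model on first use):
# summarizer = load_summarizer()
# print(summarize(article_text, summarizer))
```

Passing truncation=True matters for long news articles, because the model accepts only a limited number of input tokens and longer inputs would otherwise raise an error or be handled unpredictably.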

Step13: Selecting an Article and Generating a Summary

doc = df[df.labels == 'business']['text'].sample(random_state=42)

summary = summarizer(doc.iloc[0].split('\n', 1)[1])

The first line randomly selects an article from the 'business' category in the DataFrame df.

The second line applies the summarization pipeline to the selected article. We split the article text once at the first '\n' using the split method, then pass the second part, representing the main body of the article, to the summarization pipeline and store the result in summary.

The summarization pipeline generates a condensed summary of the article and returns it as a list containing one dictionary, with the generated text stored under the 'summary_text' key.

Step14: Printing the Summarized Text

summarized_text = summary[0]['summary_text']
print(wrap(summarized_text))

These lines extract the summary string from the pipeline output and print it, wrapped for readability.

Step15: Repeating the Process for Another Article

doc = df[df.labels == 'entertainment']['text'].sample(random_state=50)

summarizer(doc.iloc[0].split('\n',1)[1])

These lines select and summarize an article from the ‘entertainment’ category in a similar manner as above.
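Both summarization calls rely on split('\n', 1)[1] to drop the first line of the article, which raises an IndexError if an article contains no line break. When repeating the process across many articles, a small helper can make that step safer. The sketch below is one way to write it; the function names are hypothetical, and summarizer stands for any callable that behaves like the pipeline created in Step12.

```python
def split_headline_body(article):
    """Split an article into (headline, body) at the first line break.

    Falls back to an empty headline when the text has no line break,
    a case where article.split('\n', 1)[1] would raise an IndexError.
    """
    headline, sep, body = article.partition("\n")
    if not sep:
        return "", article.strip()
    return headline.strip(), body.strip()

def summarize_article(article, summarizer):
    """Summarize only the body; summarizer is any pipeline-like callable."""
    headline, body = split_headline_body(article)
    result = summarizer(body)
    return headline, result[0]["summary_text"]

# Example usage with the objects defined earlier in this article:
# headline, summary = summarize_article(doc.iloc[0], summarizer)
```

Returning the headline alongside the summary makes it easy to compare the two, since a headline is itself a very short human-written summary.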

Conclusion

Transformers-powered text summarization marks a substantial development in natural language processing, making it possible to extract crucial information from massive amounts of text with unmatched precision and effectiveness. Transformers’ adaptability and efficiency in extractive and abstractive summarization methods have opened up new avenues for creative applications in content analysis, news aggregation, and information retrieval, among other fields. Organizations may improve decision-making processes, optimize information processing workflows, and extract new insights from textual data by utilizing Python modules like `pandas` and `transformers`. We expect the influence of Transformers in this sector to rise as text summarization progresses due to advances in deep learning and NLP, providing intriguing potential for additional study.

Frequently Asked Questions

Q1. What is text summarization?

A. Text summarization is the process of condensing a large text document into a shorter version while preserving its key information and meaning.

Q2. What are Transformers in the context of text summarization?

A. Transformers are advanced deep learning models that have demonstrated remarkable performance in various natural language processing tasks, including text summarization. They use attention mechanisms to understand the context of words, sentences, and documents, making them well suited to summarization tasks.

Q3. What are the two main approaches to text summarization using Transformers?

A. The two main approaches are extractive summarization and abstractive summarization. Extractive summarization involves selecting and combining important sentences or phrases from the original text, while abstractive summarization generates new sentences to convey the main ideas of the text.

Q4. What are some common applications of text summarization?

A. Text summarization has various applications, including news aggregation, content analysis, information retrieval, document management, meeting minutes, customer feedback analysis, legal contract summarization, and customer service optimization.

Q5. Why are Transformers preferred for text summarization tasks?

A. We prefer transformers for text summarization because they understand context, train extensively on large datasets, scale effectively, allow for end-to-end training, and consistently deliver state-of-the-art results.

Q6. How can I implement text summarization with Transformers in Python?

A. You can implement text summarization with Transformers by using libraries such as transformers and pandas in Python. These libraries provide high-level APIs for loading pre-trained models, preprocessing data, training summarization models, and generating summaries.
