Summarizing long pieces of text is a challenging problem. Summarization is done in two primary ways: the extractive approach and the abstractive approach. In this work, we break the problem of meeting summarization into an extractive component and an abstractive component that together generate a summary of the conversation.
What is Dialogue Summarization?
Humans are social animals: we exchange ideas, share information, and make plans with each other. Text and speech are the two common mediums of conversation, with speech the more common of the two. With the abundance of digital conversation happening over online messaging, IRC, and meeting platforms, and the ubiquity of automatic speech recognition systems, vast amounts of meeting transcripts are being produced.
The need to succinctly summarize conversational content therefore arises naturally, and several methods of generating such summaries have been proposed. One standard and crucial application of conversational dialogue summarization is meeting summary generation.
Traditional methods used various extractive summarization approaches, which could only pull the important phrases out of a document. Newer techniques introduced transformer-based models capable of generating coherent, abstractive summaries.
Given this body of research and the limitations of current models, we present a novel hierarchical model: an extractive-to-abstractive approach that performs extractive summarization followed by abstractive summarization, and achieves the highest ROUGE scores.
Our best-performing extractor is a two-step hierarchical model that encodes each sentence with a pre-trained BERT model, applies a bidirectional LSTM to create each utterance's sentence embedding, and then passes these embeddings to an unsupervised clustering mechanism to detect key sentences. For the abstractive module, our best-performing model is a fine-tuned PEGASUS model that generates the abstractive summary.
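Here is a minimal PyTorch sketch of such an utterance encoder; the checkpoint name, hidden size, and pooling of the final LSTM states are illustrative assumptions, not the exact training setup:

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class UtteranceEncoder(nn.Module):
    """Encodes one utterance: BERT token states pooled by a BiLSTM."""
    def __init__(self, hidden_size=256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.lstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,  # 768 for bert-base
            hidden_size=hidden_size,
            bidirectional=True,
            batch_first=True,
        )

    def forward(self, input_ids, attention_mask):
        token_states = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        _, (h_n, _) = self.lstm(token_states)
        # Concatenate the final forward and backward hidden states into
        # a single fixed-size embedding per utterance.
        return torch.cat([h_n[0], h_n[1]], dim=-1)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["okay let's move on to the remote design"], return_tensors="pt")
encoder = UtteranceEncoder()
with torch.no_grad():
    embedding = encoder(batch["input_ids"], batch["attention_mask"])  # shape (1, 512)
```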
In summary, our contributions are as follows:
We present a novel approach to summarizing long inputs.
We beat the state-of-the-art (SOTA) results on the AMI meeting dataset.
Let’s Dive into the Methodology
The methodology is a two-step hierarchical extractive-to-abstractive summary generation pipeline built on the transformer-based PEGASUS architecture (Zhang et al., 2020). The base model is a standard transformer encoder-decoder with a novel pre-training technique in which whole sentences, along with a few other randomly chosen tokens, are masked with [MASK] tokens.
The input to the encoder-decoder model is an embedding of length 512, generated by processing the AMI input through a BERT-based extractive summarizer, in which we apply k-means to BERT sentence-level embeddings. We use two variants of this approach: with and without fine-tuning.
Our baseline model is also transformer-based: a pre-trained BERT encoder is used to produce extractive summaries, which are then passed to a GPT-2 decoder model to construct meaningful abstractive summaries (see figure below).
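As a rough sketch of how the second stage of such a baseline can be wired up, the extractive output can be handed to GPT-2 as a prompt. The "TL;DR:" prompt trick and the decoding settings here are illustrative assumptions, not the exact baseline configuration:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")

# Hypothetical extractive output from the BERT stage.
extract = "The team agreed the remote should use a rubber case."
inputs = tok(extract + "\nTL;DR:", return_tensors="pt")

# Greedy decoding of the continuation; the prompt tokens are sliced off.
out = gpt2.generate(**inputs, max_new_tokens=60, do_sample=False,
                    pad_token_id=tok.eos_token_id)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```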
Extractive Summarization Approach
Extractive summary generation is itself performed in two steps: the input text is converted into BERT sentence embeddings, which are then passed through an unsupervised clustering algorithm (Jin and Han, 2010) to cluster the most important sentences. These clusters are collections of sentences from the input text and represent its most relevant sentences.
Unsupervised extractive summary generation has been attempted before, and earlier work showed how clustering techniques can help select the key components of a text. Because this extractive step is unsupervised, the need for parallel annotated data is eliminated, so the approach can be applied to large corpora. As shown in Figure 1, the left section is the extractive summary generator, whose output is passed on to the abstractive summary generator on the right.
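A minimal sketch of this selection step, assuming we already have one embedding per utterance (for example, from the encoder sketched above); the number of clusters and the nearest-to-centroid selection rule are illustrative assumptions:

```python
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin_min

def extract_key_sentences(sentences, embeddings, n_clusters=10):
    """Cluster sentence embeddings and keep one representative per cluster."""
    kmeans = KMeans(n_clusters=n_clusters, random_state=0).fit(embeddings)
    # For each centroid, find the index of its nearest sentence embedding.
    closest, _ = pairwise_distances_argmin_min(kmeans.cluster_centers_, embeddings)
    # Restore original transcript order so the extract reads chronologically.
    return [sentences[i] for i in sorted(set(closest))]
```

The concatenation of these representative sentences is what gets passed on to the abstractive stage.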
Abstractive Summarization Approach
The abstractive summarizer is an encoder-decoder language model, PEGASUS, used to generate a semantically sound summary. PEGASUS is a pre-training technique that introduces gap-sentence masking and generation.
The PEGASUS architecture contains 16 encoder layers and 16 decoder layers (in its large variant), which together take the masked text documents as input. The authors' hypothesis is that the closer the pre-training self-supervised objective is to the final downstream task, the better the fine-tuning performance. In this pre-training approach, several sentences are removed from each document and the model is tasked with recovering them: the input is a document with missing sentences, while the output consists of the missing sentences concatenated together.
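A toy illustration of what one such gap-sentence generation (GSG) pre-training pair looks like; in the actual objective, the gap sentences are selected by importance, scored with ROUGE against the rest of the document:

```python
# Build one (source, target) GSG pair from a toy document.
doc = [
    "The meeting opened with a recap of last week.",
    "The designer proposed a rubber case for the remote.",
    "The group agreed to keep the budget low.",
]
masked = {1}  # pretend sentence 1 was scored as most important
source = " ".join("[MASK1]" if i in masked else s for i, s in enumerate(doc))
target = " ".join(doc[i] for i in sorted(masked))
print(source)  # document with gap sentences replaced by [MASK1]
print(target)  # the sentences the model must generate
```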
This is an incredibly difficult task that may seem impossible even for people, and we don't expect the model to solve it perfectly. However, the difficulty of the task encourages the model to learn about language and facts about the world, as well as how to distill information from across a document to produce output that closely resembles the fine-tuning task. The advantage of this self-supervision is that we can create as many examples as there are documents, without any annotation, which is often the bottleneck in supervised systems.
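A minimal sketch of the abstractive step with an off-the-shelf PEGASUS checkpoint from Hugging Face; the checkpoint name and decoding parameters are assumptions, not our fine-tuned model:

```python
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

model_name = "google/pegasus-cnn_dailymail"  # public checkpoint, assumed for illustration
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)

extractive_summary = "..."  # output of the clustering step above
batch = tokenizer(extractive_summary, truncation=True,
                  padding="longest", return_tensors="pt")
summary_ids = model.generate(**batch, num_beams=4, max_length=128)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```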
Results
About the Metrics: ROUGE-1 and ROUGE-2
ROUGE-1 and ROUGE-2 measure unigram and bigram overlap, respectively, between a system summary and a reference summary, and have been shown to correlate well with human evaluations. ROUGE-2 scores can also be read as a rough measure of summary readability. The alternative way to evaluate the performance of an approach is human evaluation.
However, human evaluation is slow and very expensive, and empirical studies show that ROUGE-based automatic metrics track human judgments closely, so a model's performance can be judged reliably with ROUGE alone. Table 1 summarizes all the experiments performed on the summarization task with the AMI dataset; the results clearly show improvement by a considerable margin.
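For reference, a quick sketch of computing ROUGE-1 and ROUGE-2 F-scores with Google's rouge-score package (the example sentences are made up):

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2"], use_stemmer=True)
scores = scorer.score(
    "the group agreed on a rubber remote case",          # reference summary
    "the team decided the remote case will be rubber",   # system summary
)
print(scores["rouge1"].fmeasure, scores["rouge2"].fmeasure)
```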
[Table 1: results of all experiments on the AMI dataset]
So, in the end, let's see whether this work was something new, or something you could find between the first and last pages of a Google search!
So far, meeting summarization has not been studied at a large scale, and there is a lot of scope for improvement in this research area, since most summarization work targets documents or news articles. To make good use of previous state-of-the-art models, we converted conversations from dialogue format to article format, so that conversation summaries could be generated with the same methodology used for news article summarization.
Our key contribution, the extractive-to-abstractive pipeline, is fundamentally the idea of combining two different steps, since it is hard to train transformers on long inputs. This two-step summarization technique builds on a state-of-the-art implementation of the PEGASUS model, fine-tuned on our data.