Real-world data can be very messy and skewed, which can undermine the effectiveness of a predictive model if it is not addressed correctly and in time.
The consequences of skewness become more pronounced when a large model is trained on a skewed dataset, because it is often not practical to retrain such a model from scratch. Besides that, if those models are placed into production immediately, we must be ready for the implications.
This article will test for genre skewness in the GPT and GPT-2 models. I came across this interesting experiment while going through the NLP with Transformers book (which I heartily recommend), so I thought of documenting my own experience and sharing it with you all.
Now, let’s begin!
We will make use of the GPT (openai-gpt) and GPT-2 (gpt2) pre-trained models from the Hugging Face Hub. We will also use Hugging Face’s text-generation pipeline to detect whether skewness (due to over- or under-representation of genres in the training data) is evident in GPT and GPT-2 text generations.
GPT was trained on the BooksCorpus dataset, which consists of about 7,000 unpublished books, while GPT-2 was trained on WebText, a corpus of web pages linked from Reddit posts.
But before we compare them, let’s make sure the two models are of similar size so that the comparison is fair.
For this, first off, we will install transformers and import the necessary libraries.
!pip install transformers
from transformers import pipeline, set_seed
Next, we will define the names of the models we will use for drawing the comparison.
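A minimal way to do this, using the openai-gpt and gpt2 checkpoints from the Hugging Face Hub mentioned above (the variable names here are my own choice):
model_name_gpt = "openai-gpt"
model_name_gpt2 = "gpt2"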
Following that, we will set up a pipeline for the text-generation task for each model.
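A sketch of the pipeline setup, reusing the model names defined above and the text_generation_gpt and text_generation_gpt2 variables that the later snippets reference (the seed value of 42 is an arbitrary choice for reproducibility):
set_seed(42)  # arbitrary seed so that the generations below are reproducible
text_generation_gpt = pipeline("text-generation", model=model_name_gpt)
text_generation_gpt2 = pipeline("text-generation", model=model_name_gpt2)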
Now, we will define a function for calculating the number of parameters in each model.
def model_size(model):
    # Total number of parameters across all of the model's weight tensors
    return sum(params.numel() for params in model.parameters())
Printing the number of parameters in GPT and GPT-2.
print(f"Number of Parameters in GPT: {model_size(text_generation_gpt.model)/1000**2:.1f}M parameters") print(f"Number of Parameters in GPT-2: {model_size(text_generation_gpt2.model)/1000**2:.1f}M parameters")
>> Output:
Number of Parameters in GPT: 116.5M parameters
Number of Parameters in GPT-2: 124.4M parameters
Hence, both models are of similar size, and the comparison is fair.
Now we will define a function to generate completions from each model.
def enum_pipeline_outputs(pipe, prompt, num_return_sequences):
    # Generate several completions for the prompt and return them as one numbered string
    out = pipe(prompt, num_return_sequences=num_return_sequences,
               clean_up_tokenization_spaces=True)
    return "\n".join(f"{i+1}." + s["generated_text"] for i, s in enumerate(out))
We will use a prompt for generating four text completions to draw comparisons between the generated text from both models.
prompt = "Before they left for the supermarket"
I) Generating four output text completions for GPT
print("Text Generated by GPT for the given prompt:n" + enum_pipeline_outputs(text_generation_gpt, prompt, 4))
>> Output of GPT model:
Text Generated by GPT for the given prompt:
1.Before they left for the supermarket.
as she was preparing a pot of coffee the telephone rang. she put it to her ear. " hi, it's me. "
" you've got a visitor. we got the new computer i'm
2.Before they left for the supermarket. " but since he was still holding her captive, and he hadn't released her yet, she didn't understand why he felt the need to keep all her plans a secret from her.
he let go of the
3.Before they left for the supermarket. "
i was shocked. " he's... he's not in love with you. "
" he never was. he never will be again. it's over and over. this is the end for both
4.Before they left for the supermarket. i've already eaten breakfast now and i think i 'll put in a few hours in the gym this morning just to give myself time to go to the bathroom and clean up and get the better of it, but i
II) Generating four output text completions for GPT-2
print("Text Generated by GPT-2 for the given prompt:n" + enum_pipeline_outputs(text_generation_gpt2, prompt, 4))
>> Output of GPT-2 model:
Observation: By comparing just a handful of GPT and GPT-2 outputs, we can clearly sense a genre skew toward romance in the text produced by GPT! This highlights the challenges we face while creating a large text corpus. Also, the biases in the model’s behavior need to be considered with respect to the target audience interacting with the model.
This article presented a comparison of text generations from GPT and GPT-2 to test whether genre skewness is evident in the outputs of either model.
To summarize, the key takeaways from this article are:
1. GPT shows a genre skew toward “romance” due to a strong overrepresentation of romance novels in BooksCorpus. It often imagines a romantic interaction between a man and a woman.
2. GPT-2 was trained on data linked from Reddit. Hence, it mostly adopts the neutral “they” in its generations, which have blog-like or adventure-like elements.
3. The results highlight the challenges that should be addressed while creating a large text corpus. Moreover, the biases in the model’s behavior need to be considered when it comes to the target audience interacting with the model.