Navigating the dense jungle of academic research can be a daunting task. With their intricate arguments and specialized language, research papers often leave readers struggling to grasp the core message. This is where AI steps in, offering tools like the GPT-powered assistant, a powerful ally in conquering the research landscape.
This article was published as a part of the Data Science Blogathon.
Researchers face several hurdles when dealing with research papers:
The GPT Assistant, built on OpenAI’s Assistants API, tackles these challenges head-on, offering a suite of functionalities to streamline research and unlock the insights hidden within papers:
import sys
import json
import asyncio
from openai import AsyncOpenAI

# With no api_key argument, the client reads the OPENAI_API_KEY environment variable.
client = AsyncOpenAI()
Let’s delve into the key steps of the GPT Assistant’s operation:
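async def create_file(paper):
    # Upload the paper so the assistant can read it; the file handle is closed afterwards.
    with open(paper, "rb") as pdf:
        file = await client.files.create(
            file=pdf,
            purpose="assistants"
        )
    print("File created and uploaded, id: ", file.id)
    return file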
async def create_assistant(file):
    # The "retrieval" tool and file_ids are the parameters of the original (v1)
    # Assistants API; later API versions rename them.
    assistant = await client.beta.assistants.create(
        name="Research Assistant 1",
        instructions="""You are a machine learning researcher. Answer
        questions based on the research paper. Only focus on the details
        and information mentioned in the paper and do not consider any
        information outside the context of the research paper.""",
        model="gpt-3.5-turbo-1106",
        tools=[{"type": "retrieval"}],
        file_ids=[file.id]
    )
    print("Assistant created, id: ", assistant.id)
    return assistant
async def create_thread():
    thread = await client.beta.threads.create()
    print("Thread created, id: ", thread.id)
    return thread
async def create_message(thread, content):
    message = await client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=content
    )
    print("User message sent!")
    # Return the message so callers that store the result do not receive None.
    return message
async def run_assistant(thread, assistant):
    run = await client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=assistant.id,
    )
    print("Assistant Running, id: ", run.id)
    return run
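async def extract_run(thread, run):
    # Poll until the run reaches a terminal state, pausing between requests
    # so a failed or expired run does not cause an endless loop.
    while run.status not in ("completed", "failed", "cancelled", "expired"):
        await asyncio.sleep(1)
        run = await client.beta.threads.runs.retrieve(
            thread_id=thread.id,
            run_id=run.id
        )
        print("Extracting run, status: ", run.status)
    print("Extracted run, status: ", run.status)
    return run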
async def extract_result(thread):
    messages = await client.beta.threads.messages.list(
        thread_id=thread.id
    )
    return messages
if __name__ == "__main__":
    async def main():
        paper = sys.argv[1]
        file = await create_file(paper)
        assistant = await create_assistant(file)
        thread = await create_thread()

        # Step 1: ask the assistant for the paper's abstract.
        content1 = """Please provide the abstract of the research paper.
        The abstract should be concise and to the point. Only consider the
        context of the research paper and do not consider any information
        not present in it."""
        message1 = await create_message(thread, content1)
        run1 = await run_assistant(thread, assistant)
        run2 = await extract_run(thread, run1)
        messages1 = await extract_result(thread)
        # Messages are listed newest first, so the first assistant message is the reply.
        abstract = None
        for message in messages1.data:
            if message.role == "assistant":
                abstract = message.content[0].text.value
                print("Abstract : " + abstract)
                break
        if abstract is None:
            sys.exit("No assistant reply found for the abstract request.")

        # Step 2: collect the paraphrasing options from the user.
        tone = input("Please enter the desired tone (Academic, Creative, or Aggressive): ")
        output_length = input("Please enter the desired output length (1x, 2x, or 3x): ")
        length_phrases = {
            "1x": "SAME IN LENGTH AS",
            "2x": "TWO TIMES THE LENGTH OF",
            "3x": "THREE TIMES THE LENGTH OF",
        }
        # Fall back to 1x on unrecognized input so `output` is always defined.
        output = length_phrases.get(output_length.strip().lower(), length_phrases["1x"])

        content2 = f"""Text: {abstract}. \nGenerate a paraphrased version of the
        provided text in the {tone} tone. Expand on each key point and provide
        additional details where possible. Aim for a final output that is
        approximately {output} the original text. Ensure that the paraphrased
        version retains the core information and meaning while offering a more
        detailed and comprehensive explanation."""
        message2 = await create_message(thread, content2)
        run3 = await run_assistant(thread, assistant)
        run4 = await extract_run(thread, run3)
        messages2 = await extract_result(thread)
        paraphrased_text = None
        for message in messages2.data:
            if message.role == "assistant":
                paraphrased_text = message.content[0].text.value
                print("Paraphrased abstract : " + paraphrased_text)
                break
        if paraphrased_text is None:
            sys.exit("No assistant reply found for the paraphrase request.")

        # Convert paraphrased text to JSON format
        paraphrased_sentences = paraphrased_text.split(". ")
        paraphrased_json = json.dumps(paraphrased_sentences)
        print("Paraphrased JSON:", paraphrased_json)

    asyncio.run(main())
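Splitting on `". "` strips the period from every sentence except the last and does not split after `?` or `!`. The following is a minimal sketch of a slightly more careful conversion, offered as one possible alternative, not the script's own method. The function name `sentences_to_json` and the regular expression are assumptions for illustration, and abbreviations such as "e.g." will still be split incorrectly.

```python
import json
import re


def sentences_to_json(text):
    """Split text after '.', '?' or '!' followed by whitespace; return a JSON array."""
    # The lookbehind keeps each sentence's closing punctuation.
    parts = re.split(r"(?<=[.?!])\s+", text.strip())
    sentences = [part for part in parts if part]
    return json.dumps(sentences, ensure_ascii=False)


if __name__ == "__main__":
    sample = "We study retrieval. Does it help? Results improve."
    print(sentences_to_json(sample))
```

A downstream tool can recover the list of sentences with `json.loads` on the returned string.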
The GPT Assistant offers a multitude of benefits for researchers:
The GPT Assistant is just the beginning. As AI technology evolves, we can expect even more sophisticated functionalities, such as:
The GPT Assistant marks a significant step towards democratizing access to research and empowering researchers to navigate the academic landscape more efficiently and clearly. This is not just a tool; it’s a bridge between the dense world of research and the diverse audiences who seek its insights. As AI continues to evolve, we can expect this bridge to become even sturdier and more expansive, paving the way for a future where research is not just accessible but truly transformative.
A. Extract the abstract of a research paper. Paraphrase the abstract in different academic tones. Convert the paraphrased text into a JSON format for easy integration with other tools.
A. You upload a research paper as a PDF. The Assistant analyzes the paper and generates the requested outputs, such as the abstract. You receive the results in a conversation thread format.
A. Saves time by automating paper summarization and paraphrasing. Improves comprehension through concise summaries and personalized paraphrases. Enhances communication by adapting the language to different audiences. Integrates seamlessly with other research tools via JSON format.
A. Currently, it only extracts abstracts and paraphrases existing papers. Relies on the accuracy of the uploaded paper; may not identify errors or biases. Creative paraphrasing options are still under development.
A. Fact-checking and citation generation features are in the pipeline. Automatic topic modeling and knowledge extraction capabilities are being explored. Personalized research recommendations and collaborative research tools are potential future additions.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.