Building a RAG System for AI Reasoning with DeepSeek R1 Distilled Model

Nibedita Dutta Last Updated : 11 Feb, 2025

DeepSeek R1, released in January 2025 by Chinese AI startup DeepSeek, is making waves in the AI industry as an open-source language model that rivals some of the most advanced models, such as OpenAI's o1. DeepSeek-R1 distinguishes itself through its mixture-of-experts (MoE) architecture, reinforcement learning techniques, and focus on reasoning capabilities, enabling it to perform text-based tasks with efficiency and accuracy. It has 671 billion parameters but activates only 37 billion per request, reducing computational costs. DeepSeek R1 also distills its advanced reasoning capabilities into smaller, more accessible open-source models such as Llama and Qwen, fine-tuning them on reasoning data generated by the main DeepSeek R1 model.

In this tutorial, we will build a Retrieval Augmented Generation (RAG) system using the DeepSeek-R1-Distill-Qwen-1.5B model. This distilled DeepSeek-R1 model was created by fine-tuning a smaller Qwen2.5 base model on data generated with DeepSeek-R1.

Learning Objectives

  • Understand the architecture, key innovations, and reinforcement learning techniques behind the DeepSeek-R1 model.
  • Explore the role of Group Relative Policy Optimization (GRPO) in enhancing DeepSeek-R1’s reasoning capabilities.
  • Analyze DeepSeek-R1’s benchmark performance and its efficiency compared to other leading AI models.
  • Implement a Retrieval Augmented Generation (RAG) system using DeepSeek-R1 distilled models like Llama and Qwen.

This article was published as a part of the Data Science Blogathon.

What is the DeepSeek-R1 Model?

DeepSeek-R1 and DeepSeek-R1-Zero are first-generation reasoning models. DeepSeek-R1-Zero is trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, and it develops remarkable reasoning capabilities and interesting reasoning behaviors through RL alone. This approach marks a step toward improving language model reasoning using pure RL. However, DeepSeek-R1-Zero faces challenges such as poor readability and language mixing.

DeepSeek-R1 overcomes the limitations of DeepSeek-R1-Zero by incorporating cold-start data before reinforcement learning, providing a strong foundation for reasoning and non-reasoning tasks.

What Makes DeepSeek-R1 Stand Out?

DeepSeek-R1 stands out with its advanced architecture and enhanced efficiency, pushing the boundaries of AI performance. This model introduces key innovations that set it apart from its predecessors and competitors.

Key Innovations in DeepSeek R1 model

The following features set the DeepSeek R1 model apart:

  • Mixture-of-Experts (MoE) Architecture: Unlike standard dense transformer-based models, DeepSeek R1 employs an MoE architecture, activating only 37 billion of its 671 billion parameters per request. This improves efficiency and reduces computational costs (an illustrative routing sketch follows this list).
  • Reinforcement Learning (RL): DeepSeek-R1’s training process uses reinforcement learning to enhance its reasoning capabilities. This approach eliminates the need for a separate value function model, making the fine-tuning process more efficient.
  • Cost-Effectiveness: DeepSeek reportedly trained R1 using far fewer resources (around 2,000 Nvidia GPUs and approximately $5.6 million) than comparable projects by major U.S.-based tech companies. Its API costs are also substantially lower than competitors', making it a cost-effective option for developers.
  • Superior Benchmark Performance: DeepSeek-R1 posts strong scores across accuracy and percentile benchmarks, often matching or exceeding competing models. For example, it achieved 79.8% on AIME 2024, a 96.3 percentile rating on Codeforces, 71.5% on GPQA Diamond, 97.3% on MATH-500, 90.8% on MMLU, and 49.2% on SWE-bench Verified.
  • Scalability: DeepSeek has introduced “distilled” versions of R1, ranging from 1.5 billion to 70 billion parameters, making it accessible for various hardware configurations.
  • Long Context Handling: Supports a context window of 128K tokens, enabling efficient handling of complex tasks that require detailed analysis, and is adept at maintaining logic and context over long interactions.
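To make the MoE bullet above concrete, here is a toy top-k routing layer in PyTorch. It is purely an illustrative sketch (the class name ToyMoELayer and all sizes are invented), not DeepSeek-R1's actual implementation; it only shows how a router can send each token to a small subset of experts so that just a fraction of the total parameters is active for any single request.

# Toy top-k expert routing (illustrative only, not DeepSeek-R1's MoE code)
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)   # router that scores each expert
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, 4 * d_model),
                           nn.GELU(),
                           nn.Linear(4 * d_model, d_model)) for _ in range(n_experts)]
        )

    def forward(self, x):                            # x: (tokens, d_model)
        scores = self.gate(x)                        # (tokens, n_experts)
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(topk_scores, dim=-1)     # mix only the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = topk_idx[:, slot] == e        # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
tokens = torch.randn(10, 64)
print(layer(tokens).shape)   # torch.Size([10, 64])

In DeepSeek-R1 the same routing idea is applied at a much larger scale, which is how a 671-billion-parameter model ends up activating only about 37 billion parameters per request.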

Reinforcement Learning in DeepSeek R1 Model

DeepSeek-R1’s innovative use of reinforcement learning (RL) signifies a radical shift from traditional AI training methods, which typically depend on massive labeled datasets. Unlike supervised learning, RL allows models to learn through interaction and feedback, significantly reducing reliance on large datasets and mitigating ethical concerns related to data privacy and bias.

  • Pure RL: The DeepSeek R1 family pioneers a training process centered on RL rather than the traditional reliance on supervised fine-tuning; DeepSeek-R1-Zero learns complex reasoning behaviors purely through reinforcement learning, without any supervised fine-tuning.
  • Self-Evolution: The model refines its behavior through trial and error, achieving higher performance with each training iteration.
  • Accuracy Rewards: The model earns rewards by matching its predictions to ground-truth answers, creating a precise feedback loop in tasks with clear right or wrong answers, such as mathematics. The system uses rule-based verification, testing code against specific cases and validating mathematical solutions against established formulas.
  • Format Rewards: The model receives additional rewards for clear, well-structured responses and learns to express its reasoning process using specific tags (a toy sketch of both reward types follows this list).
  • Chain-of-Thought (CoT) Reasoning: The model articulates its thought process step-by-step, allowing it to refine its own reasoning, identify errors, and correct them on the fly, making it more accurate over time. Reinforcement learning and fine-tuning use long Chain of Thought data to encourage the model to deliver longer, more introspective outputs.
  • Efficiency and Innovation: DeepSeek’s approach shifts the focus from merely accumulating more data to enhancing the quality of data through smarter computation.
  • Combination of RL and SFT: DeepSeek-R1 combines a small amount of high-quality “cold-start” data alongside iterative reinforcement learning and supervised fine-tuning to produce more coherent, user-friendly outputs while maintaining state-of-the-art reasoning performance.
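To illustrate the accuracy and format rewards described above, here is a toy pair of rule-based reward functions. The tag convention and reward values below are assumptions chosen for illustration; DeepSeek has not published its exact reward rules as code.

# Toy rule-based rewards (illustrative only)
import re

def accuracy_reward(model_answer: str, ground_truth: str) -> float:
    """Reward 1.0 when the extracted final answer exactly matches the ground truth."""
    return 1.0 if model_answer.strip() == ground_truth.strip() else 0.0

def format_reward(completion: str) -> float:
    """Reward completions that wrap their reasoning in <think>...</think> tags."""
    pattern = r"^<think>.*?</think>.*$"
    return 0.5 if re.match(pattern, completion.strip(), flags=re.DOTALL) else 0.0

completion = "<think>2+2 equals 4 because ...</think> The answer is 4"
print(accuracy_reward("4", "4") + format_reward(completion))   # 1.5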

Group Relative Policy Optimization in DeepSeek-R1

GRPO, or Group Relative Policy Optimization, represents a reinforcement learning approach designed to enhance the reasoning prowess of Large Language Models (LLMs). First presented in the DeepSeekMath publication concerning mathematical reasoning, GRPO innovates upon traditional Proximal Policy Optimization (PPO) by dispensing with a value function model.

How GRPO Works in DeepSeek-R1

GRPO works with both rule-based/binary rewards and general reward models, and can be used to refine models for qualities such as helpfulness. The process unfolds as follows:

  • Sampling: The current policy generates multiple outputs for each given prompt.
  • Reward Scoring: A rule-based or outcome-based reward function assigns a score to each generated output.
  • Advantage Calculation: The average reward of the group serves as a baseline; each output's advantage is computed relative to this baseline and normalized within the group (a small numeric sketch of this step follows the list).
  • Policy Optimization: The policy is updated to maximize the GRPO objective, which incorporates the calculated advantages and a KL divergence term; this contrasts with PPO, which folds the KL term into the reward.
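Here is a small numeric sketch of the advantage step described in the list above. The reward values are made up, and the closing comment only paraphrases the shape of the GRPO objective rather than reproducing the exact formula.

# Group-relative advantage: a toy numeric example (illustrative only)
import torch

# Suppose the current policy samples G = 6 completions for one prompt and a
# rule-based reward function scores them:
rewards = torch.tensor([1.0, 0.0, 1.0, 0.0, 0.0, 1.0])

# Normalize each reward against the group's own statistics, so no separate
# value (critic) model is needed to estimate a baseline.
advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-6)
print(advantages)

# Conceptually, the policy update then maximizes something like
#   E[min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)] - beta * KL(policy || reference)
# where 'ratio' compares new and old token probabilities and beta weights the KL term.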

Performance Benchmarks of DeepSeek R1 model

DeepSeek R1 has demonstrated impressive performance on several benchmarks.

  • Benchmark Results: DeepSeek reports that R1 matches or surpasses OpenAI's o1 on benchmarks such as AIME 2024, MATH-500, and SWE-bench Verified.
  • MATH-500: DeepSeek-R1 leads with 97.3%, slightly surpassing OpenAI's o1-1217 at 96.4%.
  • SWE-bench Verified: DeepSeek-R1 achieved a score of 49.2% on this benchmark, which assesses reasoning in software engineering tasks.
  • AIME 2024: DeepSeek-R1 scored 79.8%, performing on par with OpenAI's o1-1217.

What are DeepSeek-R1 Distilled models?

To adapt DeepSeek R1's advanced reasoning abilities for use in more compact language models, the creators compiled a dataset of roughly 800,000 examples generated by DeepSeek R1 itself. These examples were then used to fine-tune existing models such as Qwen and Llama. The results demonstrated that this relatively simple knowledge distillation method effectively transferred R1's sophisticated reasoning capabilities to these other models. Remarkably, this transfer was achieved without any further reinforcement learning, highlighting the quality and instructional power of the original DeepSeek R1's outputs.
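As a rough illustration of this distillation-by-fine-tuning idea, the sketch below runs plain supervised fine-tuning on teacher-generated examples. Everything here is an assumption for illustration: the file distill_data.jsonl and its prompt/response fields are hypothetical, Qwen/Qwen2.5-1.5B is just one possible student model, and the hyperparameters are placeholders; DeepSeek's actual 800K-example dataset and training recipe are not public beyond the paper's description.

# Illustrative SFT-style distillation sketch (not DeepSeek's actual recipe)
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base_model = "Qwen/Qwen2.5-1.5B"          # hypothetical student model
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Hypothetical JSONL file of teacher-generated {"prompt": ..., "response": ...} pairs
dataset = load_dataset("json", data_files="distill_data.jsonl", split="train")

def tokenize(example):
    # Concatenate the prompt and the teacher's reasoning trace into one training sequence
    return tokenizer(example["prompt"] + example["response"],
                     truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilled-student",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()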

Benefits of RAG with DeepSeek R1 Distilled Models

  • Improved Reasoning in Smaller Models: Distillation transfers the reasoning capabilities of the larger DeepSeek R1 model into more compact architectures. This allows smaller models, such as the 8B version, to improve over their corresponding base Llama models on specific reasoning tasks.
  • Enhanced Efficiency: Distilled models significantly improve inference speed and reduce computational costs compared to the original 671B-parameter model. Smaller distilled models process requests much faster and consume fewer resources, making them more cost-effective for production deployments.
  • Cost-Effectiveness: Distilled models provide sufficient capability for many applications at a lower cost, making them a cost-effective solution for developers.
  • Accessibility: Distilled models extend the reach of advanced reasoning by fine-tuning smaller open-source models like Llama and Qwen, bringing powerful reasoning capabilities to hardware and applications that could not host the full model.

Building a RAG System using DeepSeek-R1-Distill-Qwen-1.5B model

We will build a RAG system based on the DeepSeek-R1-Distill-Qwen-1.5B model on Google Colab with a T4 GPU.

Step 1: Install the prerequisite libraries

Install all necessary libraries to set up the RAG system on Google Colab.

!pip install -q torch transformers sentence-transformers faiss-cpu pypdf
!pip install -U langchain-huggingface 
!pip install -q langchain langchain-community 

Step 2: Importing Necessary Libraries

Load essential Python libraries for document processing, embedding storage, retrieval, and model interaction.

# Document loading, chunking, and vector storage
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

# Embeddings and LLM wrappers for Hugging Face models
from langchain_huggingface import HuggingFaceEmbeddings, HuggingFacePipeline

# Prompting and chain composition
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Model loading and text generation
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

Step 3: Loading the PDF 

Use a PDF file as the knowledge source for the RAG system by extracting its text.

We have used this PDF (saved locally as Coffee.pdf) as the knowledge source for the RAG system.

# Load content from local PDFs
loader = PyPDFLoader("./Coffee.pdf")
docs = loader.load()

Step 4: Storing the Embeddings of the Chunked Data in a DB

Split the document into smaller chunks and store their vector embeddings in a FAISS database.

splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=30)
chunked_docs = splitter.split_documents(docs)

db = FAISS.from_documents(chunked_docs,
                          HuggingFaceEmbeddings(model_name='BAAI/bge-base-en-v1.5'))
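Optionally, you can sanity-check the chunking before building the retriever; this quick check is an add-on to the original walkthrough, not a required step.

# Optional: confirm the document was split and indexed as expected
print(f"Number of chunks indexed: {len(chunked_docs)}")
print(chunked_docs[0].page_content[:200])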

Step 5: Defining the Retriever

Create a retriever to fetch relevant document chunks based on similarity search.

retriever = db.as_retriever(
    search_type="similarity",
    search_kwargs={'k': 3}
)
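Before wiring the retriever into the chain, you can optionally spot-check what it returns for a sample query. This snippet is an add-on (not part of the original steps) and assumes a recent LangChain version in which retrievers expose invoke().

# Optional: inspect the top-k chunks returned for a sample query
sample_docs = retriever.invoke("coffee by-products and intestinal pH")
for i, d in enumerate(sample_docs, 1):
    print(f"--- Chunk {i} ---")
    print(d.page_content[:200])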

Step 6: Loading the Model

Load the DeepSeek-R1-Distill-Qwen-1.5B model and its tokenizer for text generation.

model_name ="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
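On a memory-constrained Colab T4, you may prefer to load the model in half precision and place it on the GPU. This variant is optional and not part of the original steps; device_map="auto" additionally requires the accelerate package.

# Optional: half-precision loading on GPU (requires the accelerate package)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)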

Step 7: Loading the RAG pipeline

Set up the retrieval-augmented generation (RAG) pipeline using the model and a custom prompt template.

# Pipeline for text generation
text_generation_pipeline = pipeline(
    model=model,
    tokenizer=tokenizer,
    task="text-generation",
    temperature=0.2,
    do_sample=True,
    repetition_penalty=1.1,
    return_full_text=False,
    max_new_tokens=500,
)

llm = HuggingFacePipeline(pipeline=text_generation_pipeline)

# Prompt template to match desired output format
prompt_template = """
You are an academic researcher who is doing research on Chemical Sciences. Use the following context to answer the question using information provided by the paper:

{context}

Question: {question}
"""

prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=prompt_template,
)

llm_chain = prompt | llm | StrOutputParser()


rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | llm_chain
)

Step 8: Querying the model

Ask a question related to the document and use the RAG pipeline to generate an answer.

question = "Which coffee by-products can lead to reduction of intestinal pH? "

# Invoke the chain to generate the answer
result = rag_chain.invoke(question)

# Display the output
print(result)

Output

Based on the given documents, what conclusion can you draw?

The options are:
A) Melanoidins
B) Chlorogenic acids
C) Osmolytes
D) Carbohydrates

I need to choose the correct option.
Okay, so I'm trying to figure out this chemistry question about coffee by-products 
and how they affect the pH of the intestine. Let me start by understanding the
 question.

The question asks: Which coffee by-products can lead to a reduction in the 
intestinal pH? The options are A) Melanoidins, B) Chlorogenic acids, C) Osmolytes,
 D) Carbohydrates.

Looking at the documents provided, each one seems to discuss different aspects
 related to coffee by-products and their potential roles in the gut microbiota. 
Since all three documents are about coffee by-products, I'll focus on those.

First, let's recall some basic concepts. Intestinal pH refers to the acidity or
 basicity of the soil around the digestive system. A lower pH means more acidic, 
while a higher pH means more alkaline. In the gut microbiota, bacteria often live in
 environments that are either acidic or basic. For example, some bacteria thrive in
 acidic conditions, others in neutral, and some in alkaline.

Now, looking at the documents:
1. The first document talks about the effects of certain coffee products on gut
 microbiota but doesn't directly mention pH changes. It focuses more on the impact
 on the microbiome rather than the chemical properties of the by-products.

2. The second and third documents seem to delve deeper into specific by-products.
 They mention melanoidins and chlorogenic acids. Also, there's a discussion about
 probiotics and gut health.

Let me break down the key points from these documents.

Starting with melanoidins: These are pigments produced by coffee beans. They are
 known to have anti-inflammatory properties. From what I remember, melanoidins can
 act as cofactors in various biochemical processes. One study I've heard about
 suggests that melanoidins might influence the activity of enzymes involved in the
 gut microbiome. Specifically, they could help maintain the balance of certain
 microbial species. If melanoidins are present, maybe they contribute to keeping the
 gut environment more balanced, possibly affecting pH levels.

Chlorogenic acids: These are another type of pigment produced by coffee beans.
 They're similar to melanoidins but have slightly different structures. Chlorogenic
 acids are also known for their antioxidant properties.

As observed from the output above, the answer is enriched with elaborate reasoning since we used the DeepSeek-R1 distilled model (deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B).
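Because the distilled model emits its chain of thought before a closing </think> tag (visible in the second query's output later in this article), you can optionally post-process the result to keep only the final answer. This is a small add-on, not part of the original walkthrough.

# Split the generated text at the closing reasoning tag, if present
reasoning, sep, final_answer = result.partition("</think>")
print(final_answer.strip() if sep else result.strip())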

Output from Original Qwen2.5-1.5B

Let's now see what the output would have been with the original Qwen2.5-1.5B model. We can simply replace the model name with "Qwen/Qwen2.5-1.5B" and re-run the code.

Answer: 
melonoidins

As seen from the output of the original Qwen2.5-1.5B model, it lacks the reasoning and human-like text we got from the DeepSeek-R1-Distill-Qwen-1.5B model. Also, "Chlorogenic acids" is not mentioned in the output from the original model.

Another Query 

question = "What are three main polysaccharides found in non-defective coffee beans?"

# Invoke the chain to generate answers
result = rag_chain.invoke(question)

# Display the output
print(result)

Output

Based on the provided context, select all correct options from A to D.
To solve this, I need to look for the relevant information about polysaccharides in
 non-defective coffee beans.

First, I'll go through each document's page content to find mentions of
 polysaccharides like arabinogalactan, mannan, etc.

Looking at the first document, it lists arabinogalactan, mannan, and cellulose as
 the main polysaccharides. So that's one set.

The second document also mentions arabinogalactan, mannan, and cellulose. It further
 notes that xylan is predominant, but that's more about the byproduct, so maybe not
 directly related to the main ones.

Third document again lists arabinogalactan, mannan, and cellulose. It talks about
 pectins and xylan, which might be byproducts.

So, putting it together, the main polysaccharides are arabinogalactan, mannan, and
 cellulose. Therefore, the correct options should include these three.
</think>

The three main polysaccharides found in non-defective coffee beans are 
arabinogalactan, mannan, and cellulose.

Answer: A, B, C

As observed from the output above, the answer is enriched with detailed reasoning and human-like text even with the small 1.5-billion-parameter DeepSeek-R1 distilled model (deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B).

Conclusion

DeepSeek-R1 is a major leap in language model reasoning. It combines large-scale reinforcement learning (RL) with a mixture-of-experts architecture and advanced training methods such as Group Relative Policy Optimization (GRPO) to achieve strong benchmark performance. These innovations improve efficiency, scalability, and cost-effectiveness. DeepSeek-R1 also excels at distilling its complex reasoning into smaller models. Pairing RAG with distilled models such as DeepSeek-R1-Distill-Qwen boosts the reasoning quality of smaller architectures while reducing costs and increasing speed, enabling faster, more resource-efficient deployments for developers.

Key Takeaways

  • DeepSeek-R1's training centers on reinforcement learning (RL) to enhance reasoning capabilities, marking a shift from purely supervised fine-tuning and reducing reliance on large labeled datasets.
  • The Mixture-of-Experts (MoE) architecture of DeepSeek-R1 activates only a subset of its massive 671 billion parameters per request, improving efficiency and reducing computational costs.
  • Despite its advanced capabilities, DeepSeek-R1 uses fewer resources than other models and reduces API costs, making it an affordable option for developers.
  • DeepSeek-R1 outperforms competitors across multiple benchmarks, such as MATH-500 and AIME, demonstrating its strong reasoning performance and accuracy.
  • DeepSeek R1's reasoning abilities have been successfully transferred to smaller, compact models through knowledge distillation, enabling high-quality performance across various hardware configurations without additional reinforcement learning.
  • Using RAG with distilled models like DeepSeek R1 enhances the efficiency and reasoning capabilities of smaller architectures, offering significant advantages in cost and speed.

Frequently Asked Questions

Q1. What is the key difference between DeepSeek-R1 and DeepSeek-R1-Zero?

A. DeepSeek-R1 improves upon DeepSeek-R1-Zero by incorporating cold-start data before reinforcement learning (RL), which enhances its reasoning capabilities and reduces challenges like poor readability and language mixing that were present in DeepSeek-R1-Zero.

Q2. How does DeepSeek-R1 use reinforcement learning (RL) in its training?

A. DeepSeek-R1 employs pure RL to refine its reasoning abilities. Unlike traditional models that rely on supervised fine-tuning, RL allows the model to learn through interaction, feedback, and self-evolution, improving its performance over time. It also uses rewards for accurate predictions and well-structured responses.

Q3. What are the key benefits of the Mixture-of-Experts (MoE) architecture in DeepSeek-R1?

A. The MoE architecture in DeepSeek-R1 allows it to activate only a subset of its 671 billion parameters (37 billion per request), significantly improving computational efficiency and reducing costs, which makes it a more resource-effective solution than standard transformer-based models.

Q4. How does DeepSeek-R1 perform on standard benchmark tests compared to other models?

A. DeepSeek-R1 consistently outperforms competitors, achieving top scores in benchmarks like MATH-500, AIME 2024, and SWE-bench Verified. It has been shown to surpass OpenAI’s o1 model in tasks like mathematical reasoning and software engineering problem-solving.

Q5. What is knowledge distillation, and how is it used in DeepSeek-R1?

A. Knowledge distillation in DeepSeek-R1 refers to transferring its advanced reasoning abilities to smaller models like Qwen and Llama. By fine-tuning on a dataset of roughly 800,000 examples generated by DeepSeek R1, the distilled models adopt its sophisticated reasoning capabilities without needing additional reinforcement learning.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Nibedita completed her master’s in Chemical Engineering from IIT Kharagpur in 2014 and is currently working as a Senior Data Scientist. In her current capacity, she works on building intelligent ML-based solutions to improve business processes.
