In a significant development for the AI community, Agentica and Together AI have released an open-source AI coding model named DeepCoder-14B. Offering code generation capabilities on par with closed-source competitors like OpenAI’s o3-mini and o1, DeepCoder-14B positions itself as a formidable open-source alternative to proprietary models. Moreover, this new model ensures full transparency and developer accessibility. In this article, we will explore the features, training, and benchmark scores of DeepCoder-14B and compare its real-world performance with that of o3-mini and o1.
DeepCoder-14B is an open-source AI code generation model featuring 14 billion parameters. Unlike proprietary alternatives, it offers complete transparency while matching the capabilities and performance of OpenAI’s o3-mini and o1. DeepCoder-14B thus demonstrates that open-source AI coding models can compete with industry leaders without requiring massive computational resources.
The model utilizes innovative training techniques such as Iterative Context Lengthening and Overlong Filtering, allowing it to reason across 64K context windows despite being trained only on 32K contexts. Beyond its impressive coding capabilities, DeepCoder-14B also demonstrates strong mathematical reasoning skills in standard benchmark tests.
DeepCoder-14B advances open-source AI coding models with capabilities rivaling proprietary alternatives.
Below we present a comprehensive comparison of DeepCoder-14B against leading open-source and proprietary code generation tools. These benchmarks evaluate performance across multiple dimensions of coding capability and cross-domain problem-solving.
| Model | LiveCodeBench (8/1/24-2/1/25) | Codeforces Rating | Codeforces Percentile | HumanEval+ Pass@1 | AIME 2024 |
| --- | --- | --- | --- | --- | --- |
| DeepCoder-14B-Preview (ours) | 60.6 | 1936 | 95.3 | 92.6 | 73.8 |
| DeepSeek-R1-Distill-Qwen-14B | 53.0 | 1791 | 92.7 | 92.0 | 69.7 |
| o1-2024-12-17 (Low) | 59.5 | 1991 | 96.1 | 90.8 | 74.4 |
| o3-mini-2025-01-31 (Low) | 60.9 | 1918 | 94.9 | 92.6 | 60.0 |
| o1-Preview | 42.7 | 1658 | 88.5 | 89.0 | 40.0 |
| DeepSeek-R1 | 62.8 | 1948 | 95.4 | 92.6 | 79.8 |
| Llama-4-Behemoth | 49.4 | – | – | – | – |
| DeepCoder-1.5B-Preview | 25.1 | 963 | 28.5 | 73.0 | – |
| DeepSeek-R1-Distill-Qwen-1.5B | 16.9 | 615 | 1.9 | 58.3 | 28.8 |
DeepCoder-14B shows remarkable performance across multiple benchmarks. It scores 60.6% on LiveCodeBench, nearly matching its proprietary alternatives, reaches a 1936 Codeforces rating (95.3rd percentile), and posts 92.6% Pass@1 on HumanEval+. These results place it among top-tier models despite its comparatively modest size and training resources.
The model also excels beyond coding, scoring 73.8% on AIME 2024 math problems, a sign that its reasoning skills transfer across domains. Taken together, these benchmarks validate the team's training methodology: careful data curation and specialized fine-tuning allow an open-source AI coding model of moderate size to achieve state-of-the-art results.
DeepCoder’s remarkable performance stems from its innovative approach to code evaluation during training.
At the heart of DeepCoder’s impressive performance lies a sophisticated code execution infrastructure that enables accurate reward calculation during reinforcement learning. This system tackles one of the most challenging aspects of training code generation tools: reliably evaluating thousands of code samples against multiple test cases. Here’s how DeepCoder’s architecture and training help address this issue.
Let me explain this in detail.
DeepCoder employs two complementary sandbox environments, one cloud-based and one local, to ensure reliable code execution.
Rather than using partial rewards that could lead to “reward hacking,” DeepCoder implements a sparse Outcome Reward Model with binary outcomes: a generated solution earns a reward of 1 only if it passes every sampled test case, and 0 otherwise.
For problems with extensive test suites, the system strategically samples the 15 most challenging tests, identified by input complexity.
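As an illustration (not the team’s exact code), a sparse binary reward of this kind can be expressed in a few lines; `run_test` here is a hypothetical helper that executes the generated program against a single sampled test case:
# Illustrative sparse outcome reward: full credit only when the generated
# program passes every sampled test case; any failure or timeout yields 0.
def outcome_reward(generated_code, sampled_tests, run_test):
    for test in sampled_tests:  # e.g. the 15 hardest sampled tests
        if not run_test(generated_code, test["input"], test["expected_output"]):
            return 0.0
    return 1.0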
DeepCoder introduces the GRPO+ algorithm into its training. GRPO+ is a significant evolution of GRPO (Group Relative Policy Optimization) that incorporates key insights from DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization) research.
The team made four critical modifications to enable stable training at scale:
These algorithmic improvements work together to create DeepCoder’s distinctive learning pattern: steadily increasing response lengths, stable reward curves, and consistent token-level entropy—all contributing to its exceptional coding capabilities.
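For background, GRPO-family algorithms drop the learned value critic and instead compute group-relative advantages: several responses are sampled for the same prompt, and each response’s reward is normalized against its own group. The sketch below illustrates only that shared normalization step, not the GRPO+-specific modifications to clipping and loss terms:
import statistics

# Group-relative advantage as used by GRPO-style methods: each sampled
# response is scored relative to the other responses for the same prompt.
def group_relative_advantages(rewards, eps=1e-6):
    mean = statistics.fmean(rewards)   # group mean reward
    std = statistics.pstdev(rewards)   # group standard deviation
    return [(r - mean) / (std + eps) for r in rewards]

# Example: three sampled solutions for one prompt, only one passes all tests
print(group_relative_advantages([1.0, 0.0, 0.0]))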
Training large models is already a heavy lift, but training them to reason across long contexts is an even bigger challenge. Most models either compromise on the depth of reasoning or hit a wall when the context size increases.
DeepCoder addresses this head-on with a two-pronged training approach:
Instead of jumping to long contexts immediately, the model is trained in stages: first on 16K-token contexts, then on 32K-token contexts.
This gradual scaling allows the model to learn how to “think in longer documents” instead of simply memorizing token spans. The result speaks for itself: although the model is trained only up to 32K tokens, it generalizes to 64K-token contexts at inference time.
To avoid feeding the model noisy, excessively long samples that dilute learning, DeepCoder adopts overlong filtering, a technique inspired by DAPO. This filters out training samples that exceed optimal length and helps maintain clarity in what the model learns.
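A minimal sketch of such a length filter, assuming a tokenizer object and a token budget (both placeholders here, not the project’s actual interfaces):
# Hypothetical overlong filter: drop RL samples whose full trajectories
# exceed the current context budget so truncated text never skews the loss.
def filter_overlong(samples, tokenizer, max_tokens):
    kept = []
    for sample in samples:
        n_tokens = len(tokenizer.encode(sample["prompt"] + sample["response"]))
        if n_tokens <= max_tokens:
            kept.append(sample)
    return kept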
Together, these strategies ensure that the model doesn’t just grow — it grows smarter.
Let’s face it – coding datasets on the internet are a mess! Whether scraped from GitHub, online judges, or forums, they’re often incomplete, buggy, or inconsistent. That becomes a problem for reinforcement learning (RL), which relies on verifiable, consistent reward signals.
To solve this, the Agentica team built a custom data curation pipeline that focuses on:
The code below shows the core validation logic used in their data processing pipeline. This function checks each problem against quality standards before allowing it into the dataset:
# Simplified data processing workflow using the custom data curation pipeline
def validate_problem(problem):
    # Require a minimum number of verifiable test cases
    if len(problem.test_cases) < 5:
        return None  # reject: too few tests for a reliable reward signal
    # The reference solution must pass every test case
    if not passes_all_tests(problem.solution):
        return None  # reject: unverifiable or buggy reference solution
    # Guard against train/test contamination
    if exists_in_test_split(problem):
        return None  # reject: overlaps with the evaluation split
    return problem  # accepted into the RL training set
The result is a clean, verifiable dataset of 24,000 coding problems – perfectly suited for RL fine-tuning. This careful filtering ensures that rewards during training actually reflect correctness, not chance or overfitting.
Evaluating code is different from evaluating text. You can’t just compare token similarity – you need to run the code and test its output, ideally thousands of times across edge cases. That’s where DeepCoder’s open-source RL engine, rLLM, comes in.
Here’s what makes rLLM stand out:
This infrastructure isn’t just about speed — it makes large-scale, verifiable RL training practical. No hand-waving, no approximations; real code, real tests, real results.
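To make the idea concrete, here is a toy version of the underlying pattern: run a candidate program in a subprocess with a timeout and compare its output against the expected answer. This is an illustration of the approach, not rLLM’s actual sandbox code:
import subprocess

# Toy sandboxed test run: execute the candidate program with a test input
# on stdin and check its stdout against the expected output.
def run_test(solution_code: str, test_input: str, expected_output: str,
             timeout: float = 5.0) -> bool:
    try:
        result = subprocess.run(
            ["python", "-c", solution_code],
            input=test_input,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False  # hangs count as failures
    return result.stdout.strip() == expected_output.strip()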
Want to try it? Head to the repo: github.com/agentica-project/rllm
While DeepCoder’s performance metrics are impressive, what makes this project truly valuable to the AI community is its accessibility and reproducibility. This section walks through the practical aspects of working with this innovative model, from initial setup to advanced training configurations.
DeepCoder’s development team has optimized the codebase for Python 3.10, ensuring stability while leveraging modern language features. The installation process begins with creating a dedicated Conda environment:
conda create -n rllm python=3.10 -y
conda activate rllm
After navigating to the rllm directory, you’ll need to install both the verl reinforcement learning framework and the main package:
cd rllm
pip install -e ./verl
pip install -e .
This installation pattern reflects a modular architecture, with verl serving as the specialized reinforcement learning engine that powers DeepCoder-14B’s impressive code generation capabilities.
One of DeepCoder’s strengths lies in its meticulously curated dataset. The repository provides both the raw training data and preprocessing scripts to transform it into optimized formats for training.
To begin working with this data:
# First, download the curated datasets from GDrive
python scripts/data/download_datasets.py
# Then generate optimized parquet files for training
python scripts/data/deepcoder_dataset.py # For DeepCoder
# or
python scripts/data/deepscaler_dataset.py # For DeepScaleR
These preprocessing steps implement the rigorous data quality controls mentioned earlier, ensuring that all code examples meet the strict requirements for DeepCoder-14B reinforcement learning.
DeepCoder’s flexible training architecture accommodates various computational resources, making it accessible to both individual researchers and larger teams with significant infrastructure.
Those with access to a single high-performance machine can begin training with:
export MODEL_PATH="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
./scripts/deepcoder/train/file.sh --model $MODEL_PATH
This single-node configuration provides an excellent entry point for experimenting with the framework or fine-tuning for specific domains.
Larger experiments benefit from DeepCoder’s distributed training capabilities. The setup uses Ray for coordinating training across multiple machines:
# On the head node:
export VLLM_ATTENTION_BACKEND=XFORMERS
ray start --head

# On each worker node, pointing at the head node's address:
export VLLM_ATTENTION_BACKEND=XFORMERS
ray start --address=[HEAD_NODE_ADDRESS]

# Then launch training from the head node:
./scripts/deepcoder/train/file.sh --model [CHECKPOINT_PATH]
This scalable approach was instrumental in achieving DeepCoder’s breakthrough performance, allowing the team to effectively train on longer context lengths and larger datasets.
DeepCoder’s performance claims are backed by a comprehensive evaluation framework that automatically runs multiple instances of vLLM to test the model’s capabilities:
./scripts/eval/eval_model.sh --model [CHECKPOINT_PATH] \
--datasets [DATASET1] [DATASET2] \
--output-dir [OUTPUT_DIR] \
--n [N_PASSES] \
--tp [TENSOR_PARALLEL_SIZE] \
--max-length [MAX_CONTEXT_LENGTH]
This evaluation approach mirrors the LiveCodeBench methodology, ensuring that reported metrics accurately reflect real-world performance on challenging coding tasks.
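The `--n` flag presumably controls how many completions are sampled per problem; a pass@1 number is then typically computed by averaging per-problem success rates, as in this generic sketch (standard methodology, not the repository’s exact evaluation code):
# Generic pass@1 estimate: for each problem, the fraction of its n sampled
# completions that pass all tests, averaged across problems.
def pass_at_1(results):
    # results: one list of booleans per problem (True = passed all tests)
    per_problem = [sum(r) / len(r) for r in results]
    return sum(per_problem) / len(per_problem)

print(pass_at_1([[True, False], [True, True]]))  # 0.75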
In this section, we explore DeepCoder-14B’s capability to explain fundamental programming concepts in a clear and beginner-friendly way.
Task: Explaining a programming concept
Let’s use DeepCoder-14B to explain how a hash table works and see if it can generate a Python example for it.
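The snippets in this and the following sections assume a local `llm` object has already been created. One way to set that up, using llama-cpp-python with a hypothetical GGUF build of the model (the filename and settings below are assumptions, not part of the original walkthrough), is:
from llama_cpp import Llama

# Hypothetical local setup: load a quantized GGUF build of DeepCoder-14B.
llm = Llama(
    model_path="deepcoder-14b-preview.Q4_K_M.gguf",  # assumed local file
    n_ctx=16384,        # generous context window for long reasoning traces
    n_gpu_layers=-1,    # offload all layers to the GPU if one is available
)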
Code:
response = llm.create_chat_completion(
messages = [
{
"role": "user",
"content": "Explain how a hash table works with an example in Python."
}
]
)
print(response['choices'][0]['message']['content'])
Review:
DeepCoder-14B provided an impressively thoughtful and step-by-step conceptual breakdown of how hash tables function. Here’s what stood out:
Inference Performance Note: While the model output was conceptually strong, the latency was very high (~11 minutes total time), indicating that DeepCoder-14B may be best suited for non-realtime applications like content generation, tutoring, or documentation.
In this section, we’ll compare how DeepCoder-14B performs against OpenAI’s o1 and o3-mini on two common programming tasks – code generation and bug fixing. We’ll give the same two tasks to DeepCoder-14B, o3-mini (simulated with Phi-2), and o1 (simulated with LLaMA-2 7B) and see how model size and design affect code quality, explanation depth, and reasoning ability. From generating a simple function to identifying logic errors in recursive code, this comparison will give us a clearer picture of when bigger models really shine, and when smaller ones hold their own.
Let’s use DeepCoder-14B to generate a Python function that finds all prime numbers between 1 and 100, and compare its response with that of o3-mini.
DeepCoder-14B Code:
response = llm.create_chat_completion(
messages = [
{
"role": "user",
"content": "Write a Python function to find prime numbers between 1 and 100."
}
]
)
print("DeepCoder Output:\n", response['choices'][0]['message']['content'])
Phi-2 (Simulating o3-mini) Code:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", device_map="auto")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
prompt = "Write a Python function to find prime numbers between 1 and 100."
output = pipe(prompt, max_new_tokens=150)[0]["generated_text"]
print("Phi-2 Output:\n", output)
Review:
DeepCoder-14B provides a deeply thoughtful, step-by-step breakdown of the logic behind finding prime numbers, mimicking how a beginner might reason through the problem. While insightful, it doesn’t return actual code, which limits its usefulness for direct execution. In contrast, Phi-2 (o3-mini) delivers a clean, correct Python function without any explanation—fast, efficient, and ready to run. DeepCoder is better for educational depth, whereas Phi-2 excels at practical coding speed and clarity.
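For reference, a correct answer to the prompt is only a few lines; a straightforward trial-division version (written here for comparison, not either model’s verbatim output) looks like this:
# Return all prime numbers between 1 and 100 using simple trial division.
def find_primes(limit=100):
    primes = []
    for n in range(2, limit + 1):
        if all(n % d != 0 for d in range(2, int(n ** 0.5) + 1)):
            primes.append(n)
    return primes

print(find_primes())  # [2, 3, 5, 7, 11, ..., 97]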
Now let’s challenge DeepCoder-14B with a classic debugging task. We’ll feed it a buggy recursive factorial function and ask it to fix the code and explain what went wrong. We’ll then give the same task to OpenAI’s o1 model (simulated by LLaMA-2 7B) and compare their responses.
Buggy Code:
buggy_code = """
def factorial(n):
if n == 0:
return 0
else:
return n * factorial(n-1)
"""
DeepCoder-14B:
response = llm.create_chat_completion(
messages = [
{
"role": "user",
"content": f"This code has a bug. Fix it and explain the correction:\n{buggy_code}"
}
]
)
print("DeepCoder Output:\n", response['choices'][0]['message']['content'])
LLaMA-2 7B (simulating o1):
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf", device_map="auto")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
prompt = "This code has a bug. Fix it and explain the correction:\n" + buggy_code
output = pipe(prompt, max_new_tokens=200)[0]["generated_text"]
print("LLaMA-2 Output:\n", output)
Review:
In this task, both DeepCoder-14B and o1 (LLaMA-2 7B) correctly identified the bug in the factorial function—recognizing that the base case should return 1 instead of 0. DeepCoder-14B demonstrated strong reasoning by walking through the logic and highlighting how the incorrect base case leads to wrong results, particularly for n=1.
However, its output suffered from a critical flaw: a repetitive loop of “Wait, no,” which detracted from readability and made the response feel unstable. In contrast, o1 provided a concise, clean, and correct response, typically including both the fixed code and a brief explanation. While it lacked DeepCoder’s depth of reasoning, o1’s reliability and clarity made it more suitable for practical use, especially in deployment or educational contexts.
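For reference, the fix both models converge on is a one-line change to the base case:
def factorial(n):
    if n == 0:
        return 1  # corrected base case: 0! = 1
    else:
        return n * factorial(n - 1)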
While current results focus on coding, the team plans to:
This release marks a significant step toward democratizing advanced AI coding tools, providing researchers and developers with:
The model’s MIT license ensures unrestricted commercial and research use, fostering innovation across the AI ecosystem. With its combination of competitive performance and full transparency, DeepCoder-14B establishes a new standard for open-source AI coding model development.
Everything about DeepCoder is built around transparency and community:
This makes it a great resource for:
In an era dominated by closed walls and black-box models, DeepCoder-14B is a breath of fresh air. It shows that open-source AI coding models can scale, compete, and innovate – without hiding behind APIs or paywalls. From context scaling to math generalization, from verified datasets to high-speed sandboxes, everything about DeepCoder feels thoughtful, intentional, and community-first.
Developers looking to enhance their coding workflow can start using DeepCoder immediately. The model’s impressive performance on competition-level coding tasks makes it suitable for a wide range of applications, from automated code completion to algorithmic problem-solving. If you’re building the future of AI-assisted development, DeepCoder-14B isn’t just worth trying – it might become your new baseline.
A. DeepCoder-14B challenges o3-mini by delivering comparable coding performance (60.6% Pass@1 on LiveCodeBench) while being fully open-source. It provides full access to weights, datasets, and training frameworks, enabling developers to audit, adapt, and deploy the model without restrictive licenses.
A. The model uses innovative training strategies like Iterative Context Lengthening, scaling from 16K to 32K tokens during training while generalizing to 64K contexts. Combined with Overlong Filtering to remove noisy data and GRPO+, a refined RL algorithm, it optimizes reasoning without parameter bloat, keeping the model resource-efficient relative to o3-mini.
A. DeepCoder-14B scores 1936 on Codeforces (top 5% of human competitors) and 73.8% on AIME math problems, showing cross-domain reasoning. It matches o3-mini’s accuracy despite using half the parameters, proving smaller models can rival larger proprietary counterparts through optimized training.
A. The model’s MIT-licensed codebase, Hugging Face deployment, and reproducible rLLM training framework let developers customize it for niche tasks (e.g., legacy code modernization) or integrate it into IDEs. Transparent benchmarks and sandbox environments ensure reliable testing, unlike closed models with opaque evaluation.
A. Yes. Its dual sandbox system (cloud-based and local) validates code against rigorous test cases, and its 64K context support enables analysis of lengthy codebases. Developers report success in automating bug fixes, test generation, and algorithmic problem-solving at competition levels.
A. The 24K-problem dataset enforces ≥5 verified test cases per problem and strict train/test splits to prevent leakage. This curation ensures clean RL rewards, reducing overfitting risks common in scraped datasets.