DeepSeek R1 vs OpenAI o1 vs Sonnet 3.5: Battle of the Best LLMs

Anu Madan | Last Updated: 21 Jan, 2025
11 min read

The day OpenAI released the o1 model, there was chatter everywhere that we are now closer to AGI than ever. While AGI (Artificial General Intelligence) still looms somewhere in the future, we do have the o1 model. However, it isn’t really accessible to many, thanks to its whopping price of $200 per month. Now, what if I told you that you can get o1-level reasoning and computational capabilities completely free of cost? Yes, it’s true – with DeepSeek’s new R1 model, you can! The Chinese AI startup DeepSeek has been raining gifts since the New Year, starting with the launch of DeepSeek V3 – a model that competes with GPT-4o – and its mobile app. Their latest gift to the AI community is DeepSeek R1 – a large language model (LLM) that gives o1 a run for its money at literally a fraction of the cost! In this blog, we will compare DeepSeek R1 vs OpenAI o1 vs Sonnet 3.5 and see whether the promising metrics hold true.

What is DeepSeek R1?

DeepSeek R1 is an advanced, reasoning-focused, open-source LLM designed to revolutionize reasoning capabilities in AI systems. It introduces a novel approach to training LLMs, leveraging reinforcement learning (RL) as its cornerstone while minimizing the use of traditional supervised fine-tuning (SFT).

The model emphasizes logic, problem-solving, and interpretability, making it apt for tasks involving deep logical reasoning, such as STEM tasks, coding, and advanced Chain-of-Thought (CoT) reasoning. This makes it a direct competitor to OpenAI’s o1 and Claude’s Sonnet 3.5.

What’s even better is that DeepSeek R1’s API costs a whopping 97% less than Claude’s Sonnet 3.5 and almost 93% less than OpenAI’s o1.

Learn More: DeepSeek R1 – OpenAI o1’s Biggest Competitor is HERE!

How to Access DeepSeek R1?

You can explore DeepSeek R1 through the DeepSeek Chat interface. Here’s what to do:

  1. Head to: https://chat.deepseek.com/.
  2. Sign in to your account or sign up for one.
  3. In the middle of the screen, click on “DeepThink”.

The chat will now respond using the DeepSeek R1 model.

Now, if you wish to use the API instead:

  1. Obtain your API key from the DeepSeek Developer Portal: https://api-docs.deepseek.com/
  2. Set up your development environment with the necessary libraries, such as Python’s requests or the openai package.
  3. Configure your API client with the base URL: https://api.deepseek.com
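The steps above can be sketched in a few lines of Python. This is a minimal illustration assuming DeepSeek’s OpenAI-compatible API and a model id of “deepseek-reasoner” for R1 – verify both against the developer portal, as model names and endpoints may change:

```python
import os

# Base URL from step 3; assumed OpenAI-compatible per DeepSeek's docs.
BASE_URL = "https://api.deepseek.com"

def build_request(prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload for DeepSeek R1."""
    return {
        "model": "deepseek-reasoner",  # assumed R1 model id; check the portal
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Explain backtracking in one sentence.")

# Only hit the network when an API key from step 1 is configured:
if __name__ == "__main__" and os.environ.get("DEEPSEEK_API_KEY"):
    from openai import OpenAI  # pip install openai
    client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url=BASE_URL)
    response = client.chat.completions.create(**payload)
    print(response.choices[0].message.content)
```

Because the endpoint follows the OpenAI wire format, the familiar openai Python package works unchanged once you point its base_url at DeepSeek.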

DeepSeek R1 Vs OpenAI o1 Vs Claude Sonnet 3.5: Model Comparison

| Feature | DeepSeek-R1 | OpenAI o1 Series | Claude Sonnet 3.5 |
|---|---|---|---|
| Training Approach | Reinforcement learning (RL) with minimal supervised data | Supervised fine-tuning (SFT) + RLHF | Supervised fine-tuning + RLHF |
| Special Methods | Cold-start data, rejection sampling, and pure RL | Combines SFT and RL for general versatility | Focused on alignment and safety |
| Core Focus | Reasoning-intensive tasks (math, coding, CoT) | General-purpose LLM | Ethical and safe AI, balanced reasoning |
| Input Token Cost | $0.14 (cache hit), $0.55 (cache miss) per million | $1.50–$60 per million tokens | $1.45–$8.60 per million tokens |
| Output Token Cost | $2.19 per million tokens | $60 per million tokens | $6–$10 per million tokens |
| Affordability | Extremely cost-effective, especially for frequent use | High cost for advanced models | Moderately priced for safety applications |
| Accessibility | Fully open-source (free for hosting/customization) | Proprietary, pay-per-use API | Proprietary, pay-per-use API |

DeepSeek R1 Vs o1 Vs Sonnet 3.5: Tasks

I’m now going to test DeepSeek R1, OpenAI o1, and Sonnet 3.5 on several logical and coding-related tasks, using their chat interfaces. I’ll rank them from 1 to 3 based on the responses they generate.

Here,
1 – means the best response.
2 – means the second-best response.
3 – means the weakest response.

At the end, the model with the lowest total would be the winner!

Task 1: Logical Reasoning

Prompt: “You walk into a room and see a bed. On the bed there are two dogs, four cats, a giraffe, five cows, and a duck. There are also three chairs and a table. How many legs are on the floor?”

Result by DeepSeek R1

DeepSeek R1 - logical reasoning

Result by OpenAI o1

OpenAI o1 - logical reasoning

Result by Sonnet 3.5

Sonnet 3.5 - logical reasoning

Review:

DeepSeek R1: This model takes some time to generate its response. While its calculations were correct, the model didn’t count the legs of the table and chairs. Surprisingly though, it did count the human legs that the other two models missed.

OpenAI o1: This model also takes time to generate its response. Again, while the calculations were correct and came with a detailed explanation, it fails to include the human legs on the floor. Thus, its result is also incorrect.

Sonnet 3.5: This model is quick to generate the response and its calculations are correct. However, it fails to account for the human legs which would have been present on the floor in the room. So its final answer is incorrect.

Overall, I didn’t get a correct response from any of the models! But DeepSeek R1’s logical approach did impress me.

Result: DeepSeek R1: 1 | OpenAI o1: 3 | Sonnet 3.5: 2
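For reference, one defensible reading of the riddle can be checked with quick arithmetic: every animal is on the bed, so only the furniture and the person who walked in contribute legs on the floor. A small sketch of that count:

```python
# Animals sit on the bed, so their legs never touch the floor.
on_bed_legs = 2 * 4 + 4 * 4 + 1 * 4 + 5 * 4 + 1 * 2  # dogs, cats, giraffe, cows, duck

chair_legs = 3 * 4   # three chairs
table_legs = 1 * 4   # one table
human_legs = 2       # you walked into the room

legs_on_floor = chair_legs + table_legs + human_legs
print(legs_on_floor)  # 18 under this interpretation
```

Whether the human legs count is exactly the ambiguity the three models split on.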

Task 2: Out of the Box Thinking

Prompt: “Create a secret language between two friends who are great at maths but very poor at english.”

Result by DeepSeek R1

DeepSeek R1 Vs OpenAI o1 Vs Sonnet 3.5 - Application output 1

Result by OpenAI o1

DeepSeek R1 Vs OpenAI o1 Vs Sonnet 3.5 - Application output 2

Result by Sonnet 3.5

DeepSeek R1 Vs OpenAI o1 Vs Sonnet 3.5 - Application output 3

Review:

DeepSeek R1: The model thought through the task and gave me three possible ways to create a secret language, all of which were unique and equally secretive.

OpenAI o1: This model gave a detailed insight into the language, guiding me through all the details. But I found that the secret language was a bit too simple to decode, considering it had to be a “secret”.

Sonnet 3.5: The model was quick to generate the answer. Although its approach was slightly tedious, the result was a language that surely was super secretive.

Overall, Deepseek R1 and Sonnet 3.5 stood out for me but R1 wins because of the choices it provided in its response.

Result: DeepSeek R1: 1 | OpenAI o1: 3 | Sonnet 3.5: 2

Task 3: Scientific Reasoning

Prompt: “You have a powerful laser and a perfectly reflective mirror. How can you aim the laser at the mirror in such a way that the reflected beam never comes back to you?”

Result by DeepSeek R1

DeepSeek R1 - Scientific Reasoning

Result by OpenAI o1

OpenAI o1 - Scientific Reasoning

Result by Sonnet 3.5

Sonnet 3.5 - Scientific Reasoning

Review:

DeepSeek R1: The model generates a very well-written result with a small visualization to enhance the overall experience. It breaks down the logic in simple words and provides a well-rounded solution.

OpenAI o1: The model generates a good result with clear explanation but lacks the visual element that could further enhance the learning experience.

Sonnet 3.5: The model gives a clear response listing out the answer and the reason behind it. While the answer is well-written, it lacks the visual element that could take it a step further.

Overall, DeepSeek R1 edges ahead with its small visual flowchart.

Result: DeepSeek R1: 1 | OpenAI o1: 3 | Sonnet 3.5: 2
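The physics behind the riddle reduces to the reflection formula r = d − 2(d·n)n: the beam retraces its path only when it strikes the mirror exactly along the normal, so any off-normal aim sends it elsewhere. A quick numerical check of that formula (my own sketch, not a model’s output):

```python
def reflect(d, n):
    """Reflect direction d off a surface with unit normal n: r = d - 2(d.n)n."""
    dot = sum(di * ni for di, ni in zip(d, n))
    return tuple(di - 2 * dot * ni for di, ni in zip(d, n))

normal = (0.0, 1.0)          # mirror along the x-axis, normal pointing up

head_on = reflect((0.0, -1.0), normal)   # beam straight down the normal
angled  = reflect((1.0, -1.0), normal)   # beam at 45 degrees

print(head_on)  # (0.0, 1.0): comes straight back at you
print(angled)   # (1.0, 1.0): reflected away, never returns to the source
```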

Task 4: Coding Skills

Prompt: “Create a simple HTML page with a button that changes the background color to Pink when clicked with some confetti”

Result by DeepSeek R1

Result by OpenAI o1

Result by Sonnet 3.5

Review:

DeepSeek R1: The model writes the code and also provides an inbuilt interface to run the HTML code, making the entire process quite simple.

OpenAI o1: The model quickly writes the code but to check it you will have to run it in a separate interface.

Sonnet 3.5: The model writes the code and also provides an interface to run the code within it. However, while running the code, it encountered an error.

Overall, DeepSeek R1 and Sonnet 3.5 take the lead with their inbuilt feature to run HTML code, but since Sonnet’s code runs into an error, it’s DeepSeek R1 that prevails here.

Result: DeepSeek R1: 1 | OpenAI o1: 2 | Sonnet 3.5: 3

Task 5: Problem-Solving Skills

Prompt: “Write a program that can solve a Sudoku puzzle”

Result by DeepSeek R1

def solve_sudoku(board):
    # Find the next empty cell (denoted by 0)
    empty = find_empty(board)
    if not empty:
        return True  # Puzzle solved
    row, col = empty

    for num in range(1, 10):
        if is_valid(board, row, col, num):
            board[row][col] = num

            if solve_sudoku(board):
                return True

            # Backtrack if no valid continuation
            board[row][col] = 0

    return False

def is_valid(board, row, col, num):
    # Check row
    if num in board[row]:
        return False

    # Check column
    if num in [board[i][col] for i in range(9)]:
        return False

    # Check 3x3 subgrid
    subgrid_row = (row // 3) * 3
    subgrid_col = (col // 3) * 3
    for i in range(subgrid_row, subgrid_row + 3):
        for j in range(subgrid_col, subgrid_col + 3):
            if board[i][j] == num:
                return False

    return True

def find_empty(board):
    for i in range(9):
        for j in range(9):
            if board[i][j] == 0:
                return (i, j)
    return None

def print_board(board):
    for i in range(9):
        if i % 3 == 0 and i != 0:
            print("-" * 21)
        for j in range(9):
            if j % 3 == 0 and j != 0:
                print("|", end=" ")
            print(f"{board[i][j] or '. '}", end=" ")
        print()

# Example usage
if __name__ == "__main__":
    # Sample Sudoku puzzle (0 = empty cell)
    puzzle = [
        [5,3,0,0,7,0,0,0,0],
        [6,0,0,1,9,5,0,0,0],
        [0,9,8,0,0,0,0,6,0],
        [8,0,0,0,6,0,0,0,3],
        [4,0,0,8,0,3,0,0,1],
        [7,0,0,0,2,0,0,0,6],
        [0,6,0,0,0,0,2,8,0],
        [0,0,0,4,1,9,0,0,5],
        [0,0,0,0,8,0,0,7,9]
    ]

    print("Original puzzle:")
    print_board(puzzle)

    if solve_sudoku(puzzle):
        print("\nSolved puzzle:")
        print_board(puzzle)
    else:
        print("\nNo solution exists")

Result by OpenAI o1

def solve_sudoku(board):
    """
    Solve a 9x9 Sudoku puzzle in-place using backtracking.
    board: A 9x9 2D list of integers, where 0 denotes an empty cell.
    Returns True if the puzzle can be solved, otherwise False.
    """
    # Find an empty cell
    empty = find_empty_cell(board)
    if not empty:
        # No empty cells left, puzzle is solved
        return True
    else:
        row, col = empty

    # Try digits 1 through 9
    for num in range(1, 10):
        if is_valid(board, row, col, num):
            board[row][col] = num  # Place num

            # Recursively attempt to solve
            if solve_sudoku(board):
                return True

            # Undo move (backtrack)
            board[row][col] = 0

    return False

def find_empty_cell(board):
    """
    Find an empty cell in the board, indicated by 0.
    Returns (row, col) tuple or None if no empty cell is found.
    """
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                return (r, c)  # row, col
    return None

def is_valid(board, row, col, num):
    """
    Check if placing num at board[row][col] is valid
    according to Sudoku rules:
      - num not in current row
      - num not in current column
      - num not in current 3x3 box
    """
    # Check row
    if num in board[row]:
        return False

    # Check column
    for r in range(9):
        if board[r][col] == num:
            return False

    # Check 3x3 sub-box
    # Identify top-left corner of the 3x3 box
    box_row = (row // 3) * 3
    box_col = (col // 3) * 3
    for r in range(box_row, box_row + 3):
        for c in range(box_col, box_col + 3):
            if board[r][c] == num:
                return False

    return True

if __name__ == "__main__":
    # Example Sudoku puzzle (0 = empty)
    puzzle = [
        [7, 8, 0, 4, 0, 0, 1, 2, 0],
        [6, 0, 0, 0, 7, 5, 0, 0, 9],
        [0, 0, 0, 6, 0, 1, 0, 7, 8],
        [0, 0, 7, 0, 4, 0, 2, 6, 0],
        [0, 0, 1, 0, 5, 0, 9, 3, 0],
        [9, 0, 4, 0, 6, 0, 0, 0, 5],
        [0, 7, 0, 3, 0, 0, 0, 1, 2],
        [1, 2, 0, 0, 0, 7, 4, 0, 0],
        [0, 4, 9, 2, 0, 6, 0, 0, 7]
    ]

    print("Initial puzzle:")
    for row in puzzle:
        print(row)

    if solve_sudoku(puzzle):
        print("\nSolved puzzle:")
        for row in puzzle:
            print(row)
    else:
        print("No solution found.")

Result by Sonnet 3.5

def print_board(board):
    """Print the Sudoku board in a readable format."""
    for i in range(len(board)):
        if i % 3 == 0 and i != 0:
            print("- - - - - - - - - - - -")
        for j in range(len(board[0])):
            if j % 3 == 0 and j != 0:
                print("|", end=" ")
            if j == 8:
                print(board[i][j])
            else:
                print(str(board[i][j]) + " ", end="")

def find_empty(board):
    """Find an empty cell in the board (represented by 0)."""
    for i in range(len(board)):
        for j in range(len(board[0])):
            if board[i][j] == 0:
                return (i, j)  # row, column
    return None

def is_valid(board, num, pos):
    """Check if the number can be placed in the given position."""
    # Check row
    for j in range(len(board[0])):
        if board[pos[0]][j] == num and pos[1] != j:
            return False

    # Check column
    for i in range(len(board)):
        if board[i][pos[1]] == num and pos[0] != i:
            return False

    # Check 3x3 box
    box_x = pos[1] // 3
    box_y = pos[0] // 3
    for i in range(box_y * 3, box_y * 3 + 3):
        for j in range(box_x * 3, box_x * 3 + 3):
            if board[i][j] == num and (i, j) != pos:
                return False

    return True

def solve(board):
    """Solve the Sudoku puzzle using backtracking."""
    empty = find_empty(board)
    if not empty:
        return True

    row, col = empty

    for num in range(1, 10):
        if is_valid(board, num, (row, col)):
            board[row][col] = num

            if solve(board):
                return True

            board[row][col] = 0
    return False

# Example usage
if __name__ == "__main__":
    # 0 represents empty cells
    puzzle = [
    [0, 2, 0, 6, 0, 8, 0, 0, 0],
    [5, 8, 0, 0, 0, 9, 7, 0, 0],
    [0, 0, 0, 0, 4, 0, 0, 0, 0],
    [3, 7, 0, 0, 0, 0, 5, 0, 0],
    [6, 0, 0, 0, 0, 0, 0, 0, 4],
    [0, 0, 8, 0, 0, 0, 0, 1, 3],
    [0, 0, 0, 0, 2, 0, 0, 0, 0],
    [0, 0, 9, 8, 0, 0, 0, 3, 6],
    [0, 0, 0, 3, 0, 6, 0, 9, 0]
]

    print("Original Puzzle:")
    print_board(puzzle)
    print("\nSolving...\n")

    if solve(puzzle):
        print("Solved Puzzle:")
        print_board(puzzle)
    else:
        print("No solution exists.")

Review:

DeepSeek R1: The model generates code that is quite user-friendly, thanks to its well-structured output formatting with a dedicated print_board function. This makes it easier to visualize the puzzle before and after solving.

OpenAI o1: The model generates code with a slightly cleaner approach and better naming conventions, such as the find_empty_cell function, which can make the main solving logic more readable for some developers.

Sonnet 3.5: The model generates well-written, concise code with clear function-level divisions that make it easy to understand.

Overall, Sonnet 3.5 takes the lead with its clear and concise code.

Result: DeepSeek R1: 2 | OpenAI o1: 3 | Sonnet 3.5: 1
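Whichever of the three solvers you run, a short validity check makes it easy to confirm that a returned board actually satisfies the Sudoku constraints. This harness is my own addition (not part of any model’s answer) and works with any of the solve functions above:

```python
def is_solved(board):
    """Return True if a 9x9 board is a complete, valid Sudoku solution."""
    digits = set(range(1, 10))
    rows = all(set(row) == digits for row in board)
    cols = all({board[r][c] for r in range(9)} == digits for c in range(9))
    boxes = all(
        {board[r][c]
         for r in range(br, br + 3) for c in range(bc, bc + 3)} == digits
        for br in (0, 3, 6) for bc in (0, 3, 6)
    )
    return rows and cols and boxes
```

After solve_sudoku(puzzle) (or Sonnet’s solve(puzzle)) returns True, is_solved(puzzle) should also return True, since all three solvers fill the board in place.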

Final Result

Final Score: DeepSeek R1: 6 | OpenAI o1: 14 | Sonnet 3.5: 10

DeepSeek R1 emerges as the winner, with Sonnet 3.5 in close pursuit. But each model has key features that make it special. o1 gives detailed explanations that can be really helpful for people who wish to understand a topic in depth. Sonnet 3.5, on the other hand, is very quick, generating responses in half the time of the other two models; its answers are concise and always to the point. Meanwhile, DeepSeek R1, although it takes time to generate its responses, comes up with great results. However, its answers can contain syntax errors, which might let down the overall experience.

Conclusion

DeepSeek R1 stands out as a game-changer in the LLM space. It offers reasoning capabilities comparable to OpenAI’s o1 series and Claude’s Sonnet 3.5, at a fraction of the cost. Its reinforcement learning-based approach and focus on logic-intensive tasks make it a strong competitor for users needing help with deep reasoning, math, or coding tasks. While its output is impressive, occasional syntax errors and slower response times show that there’s room for improvement.

Thus, the choice between DeepSeek R1, o1, and Sonnet 3.5 depends on your specific task requirements—whether it’s cost, speed, detailed insights, or reasoning-focused outputs.

Frequently Asked Questions

Q1. What is DeepSeek R1?

A. DeepSeek R1 is an open-source large language model designed for logic-intensive tasks like math, coding, and Chain-of-Thought (CoT) reasoning, using reinforcement learning as its core training method.

Q2. How does DeepSeek R1 compare to OpenAI’s o1 and Claude’s Sonnet 3.5?

A. DeepSeek R1 offers reasoning and computational capabilities on par with o1 and Sonnet 3.5 but at a much lower cost, making it an affordable alternative for users.

Q3. What is the cost of using DeepSeek R1?

A. DeepSeek R1 is significantly cheaper, with input token costs starting at $0.14 per million (cache hits) and output token costs of $2.19 per million.

Q4. How can I access DeepSeek R1?

A. You can access DeepSeek R1 via the DeepSeek Chat interface at chat.deepseek.com or through its API by signing up at api-docs.deepseek.com.

Q5. What are some use cases of DeepSeek R1?

A. DeepSeek R1 is ideal for advanced reasoning, math problem-solving, coding, creating complex logic workflows, and Chain-of-Thought reasoning.

Q6. Does DeepSeek R1 support API integration?

A. Yes, you can integrate DeepSeek R1 using its API, which supports real-time requests and task execution with minimal setup.

Q7. Is DeepSeek R1 truly open source?

A. Yes, DeepSeek R1 is fully open-source, allowing users to host and customize it for specific applications.

Anu Madan has 5+ years of experience in content creation and management. Having worked as a content creator, reviewer, and manager, she has created several courses and blogs. Currently, she is working on creating and strategizing content curation and design around Generative AI and other upcoming technologies.
