The day OpenAI released its o1 model, there was chatter everywhere that we are now closer to AGI (Artificial General Intelligence) than ever. While AGI still looms somewhere in the future, we do have o1 today. However, it isn't really accessible to many, thanks to the whopping $200-per-month subscription it sits behind. Now, what if I told you that you can get o1-level reasoning and computational capabilities completely free of cost? It's true: with DeepSeek's new R1 model, you can! The Chinese AI startup DeepSeek has been raining gifts since the New Year, starting with the launch of DeepSeek V3, a model that competes with GPT-4o, and its mobile app. Its latest gift to the AI community is DeepSeek R1, a large language model (LLM) that gives o1 a run for its money at literally a fraction of the cost. In this blog, we will compare DeepSeek R1 vs OpenAI o1 vs Claude Sonnet 3.5 and see whether the promising metrics hold up in practice.
DeepSeek R1 is an advanced, reasoning-focused, open-source LLM designed to push the reasoning capabilities of AI systems. It introduces a novel approach to training LLMs, leveraging reinforcement learning (RL) as its cornerstone while minimizing the use of traditional supervised fine-tuning (SFT).
The model emphasizes logic, problem-solving, and interpretability, making it apt for tasks involving deep logical reasoning, such as STEM tasks, coding, and advanced Chain-of-Thought (CoT) reasoning. This makes it a direct competitor to OpenAI’s o1 and Claude’s Sonnet 3.5.
What's even better is that DeepSeek R1's API is a whopping 97% cheaper than Claude Sonnet 3.5's and almost 93% cheaper than OpenAI o1's.
Learn More: DeepSeek R1 – OpenAI o1's Biggest Competitor is HERE!
You can explore DeepSeek R1 through the DeepSeek Chat interface. Simply head to chat.deepseek.com and sign up or log in; the platform already serves the DeepSeek R1 model, with its reasoning mode available via the "DeepThink" option in the chat box.

If you wish to use the API instead, sign up at api-docs.deepseek.com and generate an API key.
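If you'd rather call R1 from code, the API is OpenAI-compatible, so the standard openai Python client works with just a different base URL. Here is a minimal sketch (the deepseek-reasoner model name and endpoint are as documented on api-docs.deepseek.com at the time of writing; verify them there before relying on this):

```python
# Minimal sketch: calling DeepSeek R1 via its OpenAI-compatible API.
# Assumes `pip install openai` and that DEEPSEEK_API_KEY is set in the environment.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",  # DeepSeek's OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # the DeepSeek R1 reasoning model
    messages=[{"role": "user", "content": "Explain backtracking in two sentences."}],
)
print(response.choices[0].message.content)
```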
| Feature | DeepSeek R1 | OpenAI o1 Series | Claude Sonnet 3.5 |
|---|---|---|---|
| Training Approach | Reinforcement learning (RL) with minimal supervised data | Supervised fine-tuning (SFT) + RLHF | Supervised fine-tuning + RLHF |
| Special Methods | Cold-start data, rejection sampling, and pure RL | Combines SFT and RL for general versatility | Focused on alignment and safety |
| Core Focus | Reasoning-intensive tasks (math, coding, CoT) | General-purpose LLM | Ethical and safe AI, balanced reasoning |
| Input Token Cost | $0.14 (cache hit), $0.55 (cache miss) per million tokens | $1.50–$60 per million tokens | $1.45–$8.60 per million tokens |
| Output Token Cost | $2.19 per million tokens | $60 per million tokens | $6–$10 per million tokens |
| Affordability | Extremely cost-effective, especially for frequent use | High cost for advanced models | Moderately priced for safety applications |
| Accessibility | Fully open-source (free to host and customize) | Proprietary, pay-per-use API | Proprietary, pay-per-use API |
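To make the table concrete, here is a quick back-of-the-envelope cost estimator using the list prices above. The workload (50k input tokens on a cache miss, 10k output tokens) and the mid-range o1 input price are my own illustrative assumptions, not billing figures from any provider:

```python
# Rough per-request cost from per-million-token prices; illustrative only.
def request_cost(input_tokens: int, output_tokens: int,
                 in_price: float, out_price: float) -> float:
    """Dollar cost of one request, given $/1M-token input and output prices."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Hypothetical workload: 50k input tokens (cache miss) + 10k output tokens.
print(f"DeepSeek R1:       ${request_cost(50_000, 10_000, 0.55, 2.19):.3f}")
print(f"OpenAI o1:         ${request_cost(50_000, 10_000, 15.00, 60.00):.3f}")  # $15/M input assumed
print(f"Claude Sonnet 3.5: ${request_cost(50_000, 10_000, 8.60, 10.00):.3f}")   # upper ends of the table's ranges
```

On those assumptions, the gap is stark: roughly $0.05 per request for R1 versus about $1.35 for o1 and $0.53 for Sonnet 3.5.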
I'm now going to test DeepSeek R1, OpenAI o1, and Sonnet 3.5 on several logic- and coding-related tasks, using their chat interfaces, and rank their responses from 1 to 3. Here:
1 – the best response
2 – the second-best response
3 – the weakest response
At the end, the model with the lowest total wins!
Prompt: “You walk into a room and see a bed. On the bed there are two dogs, four cats, a giraffe, five cows, and a duck. There are also three chairs and a table. How many legs are on the floor?”
DeepSeek R1: This model takes some time to generate its response. Its arithmetic is correct as far as it goes, but it doesn't count the legs of the table and chairs. Surprisingly, though, it does count the human legs that the other two models miss.
OpenAI o1: This model also takes time to generate its response. Its calculations are correct and come with a detailed explanation, but it fails to include the human legs on the floor, so its result is also incorrect.
Sonnet 3.5: This model is quick to generate its response and its calculations are correct. However, it fails to account for the human legs that would be on the floor, so its final answer is incorrect.
Overall, I didn't get a fully correct response from any model! But DeepSeek R1's logical approach did impress me.
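For reference, here is the arithmetic the riddle is testing, under the usual reading (the animals are all on the bed, so they contribute nothing; I'm assuming four-legged chairs and table, and that you, the observer, are standing):

```python
# Expected answer under the stated assumptions (four-legged furniture, standing observer).
# Animals on the bed contribute no legs to the floor.
chair_legs = 3 * 4   # three chairs, four legs each
table_legs = 1 * 4   # one table
human_legs = 2       # you, standing in the room
print(chair_legs + table_legs + human_legs)  # 18
```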
Prompt: “Create a secret language between two friends who are great at maths but very poor at english.”
DeepSeek R1: The model thought through the task and gave me three possible ways to create a secret language, all of which were unique and equally secretive.
OpenAI o1: This model gave a detailed insight into the language, guiding me through all the details. But I found the secret language a bit too simple to decode, considering it was meant to be a "secret."
Sonnet 3.5: The model was quick to generate the answer. Although its approach was slightly tedious, the result was a language that was genuinely secretive.
Overall, DeepSeek R1 and Sonnet 3.5 stood out for me, but R1 wins because of the range of options it offered in its response.
Prompt: “You have a powerful laser and a perfectly reflective mirror. How can you aim the laser at the mirror in such a way that the reflected beam never comes back to you?”
DeepSeek R1: The model generates a very well-written result with a small visualization to enhance the overall experience. It breaks the logic down in simple words and provides a well-rounded solution.
OpenAI o1: The model generates a good result with clear explanation but lacks the visual element that could further enhance the learning experience.
Sonnet 3.5: The model gives a clear response listing out the answer and the reason behind it. While the answer is well-written, it lacks the visual element that could take it a step further.
Overall, DeepSeek R1 edges out the win with its small visual flowchart.
Prompt: “Create a simple HTML page with a button that changes the background color to Pink when clicked with some confetti”
DeepSeek R1: The model writes the code and also provides an inbuilt interface to run the HTML code, making the entire process quite simple.
OpenAI o1: The model quickly writes the code but to check it you will have to run it in a separate interface.
Sonnet 3.5: The model writes the code and also provides an interface to run the code within it. However, while running the code, it encountered an error.
Overall, DeepSeek R1 and Sonnet 3.5 take the lead with their inbuilt ability to run HTML code, but since Sonnet's code runs into an error, it's DeepSeek R1 that prevails here.
Prompt: “Write a program that can solve a Sudoku puzzle”
DeepSeek R1:

```python
def solve_sudoku(board):
    # Find the next empty cell (denoted by 0)
    empty = find_empty(board)
    if not empty:
        return True  # Puzzle solved
    row, col = empty
    for num in range(1, 10):
        if is_valid(board, row, col, num):
            board[row][col] = num
            if solve_sudoku(board):
                return True
            # Backtrack if no valid continuation
            board[row][col] = 0
    return False


def is_valid(board, row, col, num):
    # Check row
    if num in board[row]:
        return False
    # Check column
    if num in [board[i][col] for i in range(9)]:
        return False
    # Check 3x3 subgrid
    subgrid_row = (row // 3) * 3
    subgrid_col = (col // 3) * 3
    for i in range(subgrid_row, subgrid_row + 3):
        for j in range(subgrid_col, subgrid_col + 3):
            if board[i][j] == num:
                return False
    return True


def find_empty(board):
    for i in range(9):
        for j in range(9):
            if board[i][j] == 0:
                return (i, j)
    return None


def print_board(board):
    for i in range(9):
        if i % 3 == 0 and i != 0:
            print("-" * 21)
        for j in range(9):
            if j % 3 == 0 and j != 0:
                print("|", end=" ")
            print(f"{board[i][j] or '. '}", end=" ")
        print()


# Example usage
if __name__ == "__main__":
    # Sample Sudoku puzzle (0 = empty cell)
    puzzle = [
        [5, 3, 0, 0, 7, 0, 0, 0, 0],
        [6, 0, 0, 1, 9, 5, 0, 0, 0],
        [0, 9, 8, 0, 0, 0, 0, 6, 0],
        [8, 0, 0, 0, 6, 0, 0, 0, 3],
        [4, 0, 0, 8, 0, 3, 0, 0, 1],
        [7, 0, 0, 0, 2, 0, 0, 0, 6],
        [0, 6, 0, 0, 0, 0, 2, 8, 0],
        [0, 0, 0, 4, 1, 9, 0, 0, 5],
        [0, 0, 0, 0, 8, 0, 0, 7, 9],
    ]
    print("Original puzzle:")
    print_board(puzzle)
    if solve_sudoku(puzzle):
        print("\nSolved puzzle:")
        print_board(puzzle)
    else:
        print("\nNo solution exists")
```
OpenAI o1:

```python
def solve_sudoku(board):
    """
    Solve a 9x9 Sudoku puzzle in-place using backtracking.
    board: A 9x9 2D list of integers, where 0 denotes an empty cell.
    Returns True if the puzzle can be solved, otherwise False.
    """
    # Find an empty cell
    empty = find_empty_cell(board)
    if not empty:
        # No empty cells left, puzzle is solved
        return True
    else:
        row, col = empty
        # Try digits 1 through 9
        for num in range(1, 10):
            if is_valid(board, row, col, num):
                board[row][col] = num  # Place num
                # Recursively attempt to solve
                if solve_sudoku(board):
                    return True
                # Undo move (backtrack)
                board[row][col] = 0
        return False


def find_empty_cell(board):
    """
    Find an empty cell in the board, indicated by 0.
    Returns (row, col) tuple or None if no empty cell is found.
    """
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                return (r, c)  # row, col
    return None


def is_valid(board, row, col, num):
    """
    Check if placing num at board[row][col] is valid
    according to Sudoku rules:
      - num not in current row
      - num not in current column
      - num not in current 3x3 box
    """
    # Check row
    if num in board[row]:
        return False
    # Check column
    for r in range(9):
        if board[r][col] == num:
            return False
    # Check 3x3 sub-box
    # Identify top-left corner of the 3x3 box
    box_row = (row // 3) * 3
    box_col = (col // 3) * 3
    for r in range(box_row, box_row + 3):
        for c in range(box_col, box_col + 3):
            if board[r][c] == num:
                return False
    return True


if __name__ == "__main__":
    # Example Sudoku puzzle (0 = empty)
    puzzle = [
        [7, 8, 0, 4, 0, 0, 1, 2, 0],
        [6, 0, 0, 0, 7, 5, 0, 0, 9],
        [0, 0, 0, 6, 0, 1, 0, 7, 8],
        [0, 0, 7, 0, 4, 0, 2, 6, 0],
        [0, 0, 1, 0, 5, 0, 9, 3, 0],
        [9, 0, 4, 0, 6, 0, 0, 0, 5],
        [0, 7, 0, 3, 0, 0, 0, 1, 2],
        [1, 2, 0, 0, 0, 7, 4, 0, 0],
        [0, 4, 9, 2, 0, 6, 0, 0, 7],
    ]
    print("Initial puzzle:")
    for row in puzzle:
        print(row)
    if solve_sudoku(puzzle):
        print("\nSolved puzzle:")
        for row in puzzle:
            print(row)
    else:
        print("No solution found.")
```
Sonnet 3.5:

```python
def print_board(board):
    """Print the Sudoku board in a readable format."""
    for i in range(len(board)):
        if i % 3 == 0 and i != 0:
            print("- - - - - - - - - - - -")
        for j in range(len(board[0])):
            if j % 3 == 0 and j != 0:
                print("|", end=" ")
            if j == 8:
                print(board[i][j])
            else:
                print(str(board[i][j]) + " ", end="")


def find_empty(board):
    """Find an empty cell in the board (represented by 0)."""
    for i in range(len(board)):
        for j in range(len(board[0])):
            if board[i][j] == 0:
                return (i, j)  # row, column
    return None


def is_valid(board, num, pos):
    """Check if the number can be placed in the given position."""
    # Check row
    for j in range(len(board[0])):
        if board[pos[0]][j] == num and pos[1] != j:
            return False
    # Check column
    for i in range(len(board)):
        if board[i][pos[1]] == num and pos[0] != i:
            return False
    # Check 3x3 box
    box_x = pos[1] // 3
    box_y = pos[0] // 3
    for i in range(box_y * 3, box_y * 3 + 3):
        for j in range(box_x * 3, box_x * 3 + 3):
            if board[i][j] == num and (i, j) != pos:
                return False
    return True


def solve(board):
    """Solve the Sudoku puzzle using backtracking."""
    empty = find_empty(board)
    if not empty:
        return True
    row, col = empty
    for num in range(1, 10):
        if is_valid(board, num, (row, col)):
            board[row][col] = num
            if solve(board):
                return True
            board[row][col] = 0
    return False


# Example usage
if __name__ == "__main__":
    # 0 represents empty cells
    puzzle = [
        [0, 2, 0, 6, 0, 8, 0, 0, 0],
        [5, 8, 0, 0, 0, 9, 7, 0, 0],
        [0, 0, 0, 0, 4, 0, 0, 0, 0],
        [3, 7, 0, 0, 0, 0, 5, 0, 0],
        [6, 0, 0, 0, 0, 0, 0, 0, 4],
        [0, 0, 8, 0, 0, 0, 0, 1, 3],
        [0, 0, 0, 0, 2, 0, 0, 0, 0],
        [0, 0, 9, 8, 0, 0, 0, 3, 6],
        [0, 0, 0, 3, 0, 6, 0, 9, 0],
    ]
    print("Original Puzzle:")
    print_board(puzzle)
    print("\nSolving...\n")
    if solve(puzzle):
        print("Solved Puzzle:")
        print_board(puzzle)
    else:
        print("No solution exists.")
```
DeepSeek R1: The model's code is the most user-friendly of the three, thanks to its well-structured output formatting with a dedicated print_board function. This makes it easy to visualize the puzzle before and after solving.
OpenAI o1: The model's code takes a slightly cleaner approach, with better naming conventions (such as find_empty_cell) and docstrings throughout the solving logic, which some developers may find more readable.
Sonnet 3.5: The model generates well-written, concise code, with the logic divided into clearly documented functions that make it easy to understand.
Overall, Sonnet 3.5 takes the lead with its clear and concise code.
Final Score: DeepSeek R1: 6 | OpenAI o1: 15 | Sonnet 3.5: 9
DeepSeek R1 emerges as the winner, with Sonnet 3.5 in close chase. But each model has its own strengths. o1 gives detailed explanations that can be really helpful for people who wish to understand a topic in depth. Sonnet 3.5, on the other hand, is very quick, generating responses in about half the time of the other two models, and its answers are concise and to the point. Meanwhile, DeepSeek R1, although it takes time to generate its responses, comes up with great results; however, its answers can contain syntax errors, which can let down the overall experience.
DeepSeek R1 stands out as a game-changer in the LLM space. It offers reasoning capabilities comparable to OpenAI’s o1 series and Claude’s Sonnet 3.5, at a fraction of the cost. Its reinforcement learning-based approach and focus on logic-intensive tasks make it a strong competitor for users needing help with deep reasoning, math, or coding tasks. While its output is impressive, occasional syntax errors and slower response times show that there’s room for improvement.
Thus, the choice between DeepSeek R1, o1, and Sonnet 3.5 depends on your specific task requirements—whether it’s cost, speed, detailed insights, or reasoning-focused outputs.
Q. What is DeepSeek R1?
A. DeepSeek R1 is an open-source large language model designed for logic-intensive tasks like math, coding, and Chain-of-Thought (CoT) reasoning, using reinforcement learning as its core training method.

Q. How does DeepSeek R1 compare to OpenAI o1 and Claude Sonnet 3.5?
A. DeepSeek R1 offers reasoning and computational capabilities on par with o1 and Sonnet 3.5 but at a much lower cost, making it an affordable alternative for users.

Q. How much does DeepSeek R1 cost to use?
A. DeepSeek R1 is significantly cheaper, with input token costs starting at $0.14 per million for cache hits and output tokens priced at $2.19 per million.

Q. How can I access DeepSeek R1?
A. You can access DeepSeek R1 via the DeepSeek Chat interface at chat.deepseek.com or through its API by signing up at api-docs.deepseek.com.

Q. What tasks is DeepSeek R1 best suited for?
A. DeepSeek R1 is ideal for advanced reasoning, math problem-solving, coding, creating complex logic workflows, and Chain-of-Thought reasoning.

Q. Can I integrate DeepSeek R1 into my own applications?
A. Yes, you can integrate DeepSeek R1 using its API, which supports real-time requests and task execution with minimal setup.

Q. Is DeepSeek R1 open-source?
A. Yes, DeepSeek R1 is fully open-source, allowing users to host and customize it for specific applications.