Sky-T1: The $450 LLM Challenging GPT-4o & DeepSeek V3

Diksha Kumari | Last Updated: 14 Jan, 2025
3 min read

The AI community was already stunned when DeepSeek V3 launched, delivering GPT-4o-level capabilities at a fraction of the cost. But now, the NovaSky team at UC Berkeley has raised the bar even higher. Meet Sky-T1-32B-Preview—a model that delivers top-tier performance for a training cost of less than $450. That’s not a typo. While others spend millions, NovaSky is proving that cutting-edge AI doesn’t need a sky-high budget.

And here’s the best part: they’ve made everything open-source. Data, code, model weights—it’s all available for anyone to use, learn from, and improve. This isn’t just about affordability; it’s about democratizing AI and empowering everyone to innovate. Let’s find out more about Sky-T1-32B-Preview.

What Makes This Project Special?

While models like o1 and Gemini 2.0 have showcased impressive reasoning capabilities, their technical details and weights remain locked behind closed doors. This creates barriers for academic and open-source communities. In response, NovaSky has built a fully open-source model that excels not just in math but also in coding – all while being trained for less than $450.

Making of Sky-T1-32B-Preview

(Image source: Sky-T1)

1. Data Preparation

  • The team collected diverse datasets (math, coding, science, and puzzles).
  • They used “rejection sampling” — sampling multiple candidate solutions per problem and discarding those whose final answers were wrong — so that only verified, high-quality data made it into training.
  • They also reformatted the data for clarity, boosting the accuracy of results.
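To make the rejection-sampling step concrete, here is a minimal sketch in Python. The `generate` callable, the `extract_answer` helper, and the per-problem sample budget are all hypothetical stand-ins — the article does not publish NovaSky's actual verification logic:

```python
# Hedged sketch of rejection sampling for data curation.
# `generate` stands in for a model call; `extract_answer` is a toy
# parser. Both are assumptions, not the Sky-T1 implementation.
from typing import Callable, List, Dict


def extract_answer(solution: str) -> str:
    """Toy extractor: assume the final line of the solution is the answer."""
    return solution.strip().splitlines()[-1]


def rejection_sample(
    problems: List[Dict[str, str]],
    generate: Callable[[str], str],
    n_samples: int = 4,
) -> List[Dict[str, str]]:
    """Keep only (problem, solution) pairs whose final answer matches
    the reference answer; reject everything else."""
    curated = []
    for p in problems:
        for _ in range(n_samples):
            solution = generate(p["question"])
            # Verify the sampled solution against the ground-truth answer.
            if extract_answer(solution) == p["answer"]:
                curated.append({"question": p["question"], "solution": solution})
                break  # one verified solution per problem is enough
    return curated
```

The key property is that correctness is checked against a known reference answer, so low-quality generations never enter the fine-tuning set.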

2. Training Process

  • NovaSky fine-tuned a large open-source model (Qwen-2.5-32B) using their curated dataset.
  • Training took just 19 hours on eight H100 GPUs, keeping the total cost under $450.
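The headline cost is easy to sanity-check with back-of-envelope arithmetic. The ~$2.9/GPU-hour rate below is an assumption (roughly in line with on-demand H100 cloud pricing), not a figure from the Sky-T1 report:

```python
# Back-of-envelope training cost estimate.
# The GPU-hour price is an assumed market rate, not an official number.
def training_cost(gpus: int, hours: float, price_per_gpu_hour: float) -> float:
    """Total cost = number of GPUs * wall-clock hours * hourly rate."""
    return gpus * hours * price_per_gpu_hour


# 8 GPUs for 19 hours at an assumed ~$2.9/GPU-hour:
cost = training_cost(gpus=8, hours=19, price_per_gpu_hour=2.9)
# 8 * 19 * 2.9 = 440.8, comfortably under the reported $450 budget.
```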

3. Balanced Approach

  • They carefully balanced the training data between math and coding tasks, ensuring the model could handle both types of reasoning effectively.

Sky-T1-32B-Preview Benchmarking

Sky-T1-32B-Preview delivers outstanding results across multiple benchmarks:

  • Math: Achieved 82.4% on Math500 and 43.3% on AIME2024, rivaling top models like o1-preview.
  • Coding: Scored 86.3% on LiveCodeBench-Easy, demonstrating its ability to tackle complex coding challenges.
  • Versatility: Outperforms several open-source models and competes with pricier closed models like o1-preview.

Key Insights

  • Data Mixture is Crucial: Balancing math and coding data was essential. Initially, adding coding data reduced math accuracy, but enriching the dataset with challenging problems from NuminaMath and TACO restored performance in both domains.
  • Model Size Matters: Smaller models (7B and 14B) showed only modest improvements, often generating repetitive content. The 32B model proved to be the sweet spot for advanced reasoning.
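The balancing idea above can be sketched as a simple mixture sampler. The 50/50 fraction and the generic `math_data`/`code_data` pools in this snippet are illustrative assumptions, not the published Sky-T1 recipe:

```python
# Hedged sketch of a balanced data mixture between two task pools.
# The target fraction is a placeholder, not NovaSky's actual ratio.
import random
from typing import List


def mix_datasets(
    math_data: List[str],
    code_data: List[str],
    math_fraction: float = 0.5,
    total: int = 1000,
    seed: int = 0,
) -> List[str]:
    """Draw a mixed training set hitting a target math/coding fraction."""
    rng = random.Random(seed)
    n_math = int(total * math_fraction)
    n_code = total - n_math
    # Sample without replacement from each pool, then shuffle together.
    mixed = rng.sample(math_data, min(n_math, len(math_data))) + \
            rng.sample(code_data, min(n_code, len(code_data)))
    rng.shuffle(mixed)
    return mixed
```

Tuning `math_fraction` is the knob the insight above describes: too much coding data hurt math accuracy until harder math problems were added back into the pool.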

The Future of Open-Source Reasoning Models

Sky-T1-32B-Preview is just the beginning. NovaSky plans to:

  • Develop more efficient models with strong reasoning capabilities.
  • Explore advanced techniques to enhance accuracy and efficiency at test time.

By making their work fully open-source, NovaSky is paving the way for a more inclusive and collaborative AI future.

End Note

AI development is often dominated by companies with huge budgets, leaving smaller organizations and researchers behind. NovaSky’s work democratizes AI by showing that top-tier models can be trained affordably. Their fully open-source approach also encourages collaboration and innovation, paving the way for more accessible AI advancements.

Stay tuned to Analytics Vidhya News for more such awesome content!

As an Instructional Designer at Analytics Vidhya, Diksha has experience creating dynamic educational content on the latest technologies and trends in data science. With a knack for crafting engaging, cutting-edge content, Diksha empowers learners to navigate and excel in the evolving tech landscape, ensuring educational excellence in this rapidly advancing field.
