New AI Model Outshines GPT-3 with Just 30B Parameters

K.C. Sabreena Basheer Last Updated : 27 Jun, 2023
3 min read

MosaicML, a provider of open-source large language models (LLMs), has unveiled its MPT-30B models: Base, Instruct, and Chat. The models were trained on NVIDIA’s latest-generation H100 accelerators and, according to MosaicML, represent a significant leap in quality over the original GPT-3.

The Unprecedented Success of MPT-7B and the Evolution to MPT-30B

Since their launch in May 2023, the MPT-7B models have amassed 3.3 million downloads. Building on that success, MosaicML has now released the larger MPT-30B models, which raise the quality bar and open up new possibilities across a range of applications.

Unmatched Features of MPT-30B

One of MPT-30B’s most notable achievements is surpassing the quality of the original GPT-3 with only 30 billion parameters, roughly one-sixth of GPT-3’s 175 billion. The smaller parameter count makes MPT-30B easier to deploy on local hardware and significantly reduces the cost of inference. Training custom models based on MPT-30B is also notably cheaper than the estimated cost of training the original GPT-3, which makes it an attractive choice for businesses.
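To make the size comparison concrete, the following is a minimal back-of-the-envelope sketch. It uses only the two parameter counts quoted above; the bytes-per-parameter figures (2 for 16-bit weights, 1 for 8-bit weights) are standard storage sizes, and the function names are illustrative. It estimates memory for the weights alone and ignores activations, caches, and framework overhead, so real requirements will be higher.

```python
# Rough arithmetic only: weight storage, not total memory needed to run a model.

GPT3_PARAMS = 175e9    # parameter count quoted in the article
MPT30B_PARAMS = 30e9   # parameter count quoted in the article


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def size_ratio(smaller: float, larger: float) -> float:
    """Fraction of the larger model's parameter count used by the smaller one."""
    return smaller / larger


if __name__ == "__main__":
    ratio = size_ratio(MPT30B_PARAMS, GPT3_PARAMS)
    print(f"MPT-30B has {ratio:.1%} of GPT-3's parameters")
    for label, nbytes in (("16-bit", 2), ("8-bit", 1)):
        mpt = weight_memory_gb(MPT30B_PARAMS, nbytes)
        gpt = weight_memory_gb(GPT3_PARAMS, nbytes)
        print(f"{label} weights: MPT-30B ~{mpt:.0f} GB, GPT-3 ~{gpt:.0f} GB")
```

Under these assumptions, the gap in weight storage alone helps explain why a 30-billion-parameter model is easier to serve on a small number of accelerators.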

MosaicML MPT-30B model fares better than GPT-3, Falcon-40B and Llama-30B.

Furthermore, MPT-30B was trained on longer sequences of up to 8,000 tokens, which helps it handle data-heavy enterprise applications. Training ran on NVIDIA’s H100 GPUs, which deliver higher throughput and shorter training times.

Exploring the Boundless Applications of MPT-30B

Several companies have already adopted MosaicML’s MPT models in their AI applications:

  • Replit, a web-based integrated development environment (IDE), used MosaicML’s training platform to build a code-generation model. By training on its proprietary data, Replit improved code quality, speed, and cost-effectiveness.
  • Scatter Lab, an AI startup specializing in chatbot development, used MosaicML’s technology to train its own MPT model. The result is a multilingual generative AI model that understands both English and Korean, improving the chat experience for its users.
  • Navan, a travel and expense management software company, is building on MPT to develop customized LLMs for applications such as virtual travel agents and conversational business intelligence agents. Ilan Twig, Co-Founder and CTO at Navan, praises MosaicML’s foundation models for combining strong language capabilities with efficient fine-tuning and inference serving at scale.

Accessing the Power of MPT-30B

Developers can access MPT-30B through the HuggingFace Hub, where it is available as an open-source model. They can fine-tune the model on their own data and deploy it for inference on their own infrastructure. Alternatively, developers can use MosaicML’s managed endpoint for MPT-30B-Instruct, which offers hosted inference at a fraction of the cost of similar endpoints. The endpoint is priced at $0.005 per 1,000 tokens.
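As a rough illustration of what the quoted price means in practice, here is a small sketch that estimates the cost of a workload from a token count. The $0.005 per 1,000 tokens figure is the one cited above; the function name is hypothetical, and the sketch assumes a single flat rate applied to every billed token, since the article does not say how prompt and completion tokens are counted.

```python
# Cost estimate at the per-token price quoted in the article.
# Assumption: one flat rate applies to all billed tokens.

PRICE_PER_1K_TOKENS_USD = 0.005


def estimate_cost_usd(num_tokens: int,
                      price_per_1k: float = PRICE_PER_1K_TOKENS_USD) -> float:
    """Return the estimated cost in US dollars for num_tokens billed tokens."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return num_tokens / 1000 * price_per_1k


if __name__ == "__main__":
    for tokens in (1_000, 100_000, 1_000_000):
        print(f"{tokens:>9,} tokens -> ${estimate_cost_usd(tokens):.3f}")
```

At this rate, a million billed tokens comes to about five dollars.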

Our Say

MosaicML’s release of the MPT-30B models marks a milestone for open-source large language models. It lets businesses use generative AI while keeping costs down and maintaining full control over their data. MPT-30B combines strong quality with cost-effectiveness, and more companies are likely to adopt it to drive innovation across industries.

Sabreena Basheer is an architect-turned-writer who's passionate about documenting anything that interests her. She's currently exploring the world of AI and Data Science as a Content Manager at Analytics Vidhya.
