Alibaba's Latest AI Model Beats OpenAI's o1-mini, On Par With DeepSeek R1


Alibaba Cloud has unveiled a new reasoning-focused AI model that manages to match the performance of much larger competitors despite being a fraction of their size. 

The latest offering from the Chinese tech giant's cloud computing division challenges the notion that bigger is always better in the AI world.

Dubbed QwQ-32B, the model is built on Alibaba's Qwen2.5-32B foundation and uses 32.5 billion parameters while delivering performance comparable to DeepSeek R1, which houses a massive 671 billion parameters.

The David versus Goliath achievement has caught the attention of AI researchers and developers globally.

"This remarkable outcome underscores the effectiveness of RL [reinforcement learning] when applied to robust foundation models pretrained on extensive world knowledge," Alibaba's Qwen team stated in their announcement blog post today.

QwQ-32B, according to the company, particularly shines in mathematical reasoning and coding tasks. 

"We find that RL training can continuously improve the performance, especially in math and coding, and we observe that the continuous scaling of RL can help a medium-size model achieve competitive performance against gigantic MoE model," Alibaba wrote in their announcement tweet.

It scored 65.2% on GPQA (a graduate-level scientific reasoning test), 50% on AIME (advanced mathematics), and an impressive 90.6% on MATH-500, which covers a wide range of mathematical problems, according to internal benchmark results.

The AI community has responded with enthusiasm. "Absolutely love it!" noted Vaibhav Srivastav, a data scientist and AI researcher, while Julien Chaumond, CTO at Hugging Face, said the model "changes everything."

And of course, there were a few funny memes too.

Ollama and Groq have also announced support for the model, meaning users can run it locally, build open-source agents on top of it, and use it in third-party apps, with Groq's infrastructure offering notably fast inference speeds.
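For readers who want to try it themselves, a minimal sketch of local use via Ollama's CLI might look like the following (assuming the model is published under the `qwq` tag in the Ollama library, and that your machine has enough memory for a 32-billion-parameter model):

```shell
# Download the QwQ-32B weights from the Ollama library (model tag assumed: "qwq")
ollama pull qwq

# Ask the model a one-off question from the command line
ollama run qwq "A train travels 120 km in 1.5 hours. What is its average speed?"
```

Ollama typically serves a quantized build of a model by default, which is part of how a model this size can fit on consumer hardware at all.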

This efficiency gain marks a potential shift in the industry, where the trend has been toward ever-larger models. QwQ-32B instead takes a similar approach to DeepSeek R1, showing that clever training techniques might be just as important as raw parameter count when it comes to AI performance.

QwQ-32B does have limitations. It sometimes struggles with language mixing and can fall into recursive reasoning loops that affect its efficiency.

 Additionally, like other Chinese AI models, it complies with local regulatory requirements that may restrict responses on politically sensitive topics and has a somewhat limited 32K token context window.

Open the sauce

Unlike many advanced AI systems, especially those from the US and other Western countries, that operate behind paywalls, QwQ-32B is available as open-source software under the Apache 2.0 license.

The release follows Alibaba's January launch of Qwen 2.5-Max, which the company claimed outperformed competitors "almost across the board." 

That earlier release came during Lunar New Year celebrations, highlighting the competitive pressure Chinese tech companies face in the rapidly evolving AI landscape.

Chinese models have become influential enough in the AI industry that, in an earlier statement on the topic, President Donald Trump described their performance as a "wake-up call" to Silicon Valley, while viewing them as "an opportunity rather than a threat."

When DeepSeek R1 was released, it triggered a significant decline in the stock market, but QwQ-32B has not affected investors in the same way.

The Nasdaq is down overall, but primarily for political reasons rather than any fear, uncertainty, or doubt attributed to Alibaba's release.

Still, Alibaba sees this release as just the beginning. 

"This marks Qwen's initial step in scaling Reinforcement Learning to enhance reasoning capabilities," the company stated in their blog post. 

"We are confident that combining stronger foundation models with RL powered by scaled computational resources will propel us closer to achieving Artificial General Intelligence (AGI)."

Edited by Sebastiaan Sinclair
