
Forget DeepSeek. Large language models are getting cheaper still
As recently as 2022, just building a large language model (LLM) was a feat at the cutting edge of artificial-intelligence (AI) engineering. Three years on, experts are harder to impress. To really stand out in the crowded marketplace, an AI lab needs not just to build a high-quality model, but to build it cheaply.
In December a Chinese firm, DeepSeek, earned itself headlines for cutting the dollar cost of training a frontier model down from $61.6m (the cost of Llama 3.1, an LLM produced by Meta, a technology company) to just $6m. In a preprint posted online in February, researchers at Stanford University and the University of Washington claim to have gone several orders of magnitude better, training their s1 LLM for just $6. Phrased another way, DeepSeek took 2.7m hours of computer time to train; s1 took just under seven hours.
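The scale of those gaps is easy to check with back-of-the-envelope arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and seven hours is taken as an upper bound for s1 because the article says only "just under seven".

```python
import math

# Figures quoted in the article.
deepseek_cost_usd = 6_000_000
s1_cost_usd = 6
deepseek_hours = 2_700_000
s1_hours = 7  # upper bound: the article says "just under seven hours"

# How many times cheaper, and how many powers of ten that is.
cost_ratio = deepseek_cost_usd / s1_cost_usd
orders = math.log10(cost_ratio)

# Because s1_hours is an upper bound, this ratio is a lower bound.
hours_ratio = deepseek_hours / s1_hours

print(f"cost ratio: {cost_ratio:,.0f}x (about {orders:.0f} orders of magnitude)")
print(f"compute-time ratio: at least {hours_ratio:,.0f}x")
```

On these numbers the dollar cost falls by about six orders of magnitude, while computer time falls by between five and six.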
The figures are eye-popping, but the comparison is not exactly like-for-like. Where DeepSeek’s v3 chatbot was trained from scratch—accusations of data theft from OpenAI, an American competitor, and peers notwithstanding—s1 is instead “fine-tuned” on the pre-existing Qwen2.5 LLM, produced by Alibaba, China’s other top-tier AI lab. Before s1’s training began, in other words, the model could already write, ask questions, and produce code.
Piggybacking of this kind can lead to savings, but can’t cut costs down to single digits on its own. To do that, the American team had to break free of the dominant paradigm in AI research, wherein increasing the amount of data and computing power used to train a language model is thought to improve its performance. They instead hypothesised that a smaller amount of data, of high enough quality, could do the job just as well. To test that proposition, they gathered a selection of 59,000 questions covering everything from standardised English tests to graduate-level problems in probability, with the intention of narrowing them down to the most effective training set possible.
To work out how to do that, the questions on their own aren’t enough. Answers are needed, too. So the team asked another AI model, Google’s Gemini, to tackle the questions using what is known as a reasoning approach, in which the model’s “thought process” is shared alongside the answer. That gave them three datasets to use to train s1: 59,000 questions; the accompanying answers; and the “chains of thought” used to connect the two.
They then threw almost all of it away. As s1 was based on Alibaba’s Qwen AI, anything that model could already solve was unnecessary. Anything poorly formatted was also tossed, as was anything that Google’s model had solved without needing to think too hard. If a given problem didn’t add to the overall diversity of the training set, it was out too. The end result was a streamlined 1,000 questions that the researchers showed could train a model just as high-performing as one trained on all 59,000, and for a fraction of the cost.
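The filtering described above can be sketched in a few lines of Python. This is an illustration, not the researchers' actual pipeline: the `Example` fields, the use of chain-of-thought length as a stand-in for "needing to think hard", and the round-robin pass over topics as a stand-in for diversity are all assumptions made for the sake of the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    question: str
    topic: str               # hypothetical subject label
    well_formatted: bool     # hypothetical result of a formatting check
    base_model_solved: bool  # could the base model already answer it?
    trace_tokens: int        # length of the teacher model's chain of thought


def select_training_set(pool, target_size, min_trace_tokens):
    """Return up to target_size examples that pass all the filters."""
    # Drop badly formatted, already-solved and "easy" examples.
    kept = [
        ex for ex in pool
        if ex.well_formatted
        and not ex.base_model_solved
        and ex.trace_tokens >= min_trace_tokens
    ]

    # Group by topic, longest chain of thought first within each topic.
    by_topic = defaultdict(list)
    for ex in kept:
        by_topic[ex.topic].append(ex)
    for items in by_topic.values():
        items.sort(key=lambda ex: ex.trace_tokens, reverse=True)

    # Take one example per topic in turn, so no single topic dominates.
    selected = []
    while len(selected) < target_size and any(by_topic.values()):
        for topic in sorted(by_topic):
            if by_topic[topic] and len(selected) < target_size:
                selected.append(by_topic[topic].pop(0))
    return selected
```

Calling `select_training_set(pool, target_size=1000, min_trace_tokens=...)` on a pool of 59,000 such records would mirror the shape of the process, though the threshold and the diversity rule here are placeholders.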
Such tricks abound. Like all reasoning models, s1 “thinks” before answering, working through the problem before announcing it has finished and presenting a final answer. But lots of reasoning models give better answers if they’re allowed to think for longer, an approach called “test-time compute”. And so the researchers hit upon the simplest possible approach to get the model to carry on reasoning: when it announces that it has finished thinking, just delete that message and add in the word “Wait” instead.
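A minimal sketch of that trick is below. It assumes a hypothetical `generate` function that takes the text so far and returns the model's continuation, and a hypothetical end-of-thinking marker; the real delimiter and generation interface depend on the model being used.

```python
END_OF_THINKING = "</think>"  # hypothetical marker; varies by model


def think_longer(generate, prompt, extra_rounds):
    """Extend the model's reasoning by swapping its stop marker for "Wait".

    generate is a hypothetical callable: it takes the text so far and
    returns the model's continuation as a string.
    """
    transcript = prompt + generate(prompt)
    for _ in range(extra_rounds):
        if not transcript.endswith(END_OF_THINKING):
            break  # the model stopped for some other reason
        # Delete the "I'm finished" marker and nudge the model onwards.
        transcript = transcript[: -len(END_OF_THINKING)] + "Wait"
        transcript += generate(transcript)
    return transcript
```

Each extra round costs another call to the model, which is why thinking for longer raises the cost of every answer.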
The tricks also work. Thinking four times as long allows the model to score over 20 percentage points higher on maths tests as well as scientific ones. Being forced to think for 16 times as long takes the model from being unable to earn a single mark on a hard maths exam to getting a score of 60%. Thinking harder is more expensive, of course, and the inference costs increase with each extra “wait”. But with training available so cheaply, the added expense may be worth it.
The researchers say their new model already beats OpenAI’s first effort in the space, September’s o1-preview, on measures of maths ability. The efficiency drive is the new frontier.
© 2025, The Economist Newspaper Limited. All rights reserved. From The Economist, published under licence. The original content can be found on www.economist.com