
Forget DeepSeek. Large language models are getting cheaper still
As recently as 2022, just building a large language model (LLM) was a feat at the cutting edge of artificial-intelligence (AI) engineering. Three years on, experts are harder to impress. To really stand out in the crowded marketplace, an AI lab needs not just to build a high-quality model, but to build it cheaply.
In December a Chinese firm, DeepSeek, earned itself headlines for cutting the dollar cost of training a frontier model down from $61.6m (the cost of Llama 3.1, an LLM produced by Meta, a technology company) to just $6m. In a preprint posted online in February, researchers at Stanford University and the University of Washington claim to have gone several orders of magnitude better, training their s1 LLM for just $6. Phrased another way, DeepSeek took 2.7m hours of computer time to train; s1 took just under seven hours.
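To put the size of those gaps in perspective, the short calculation below works out how many orders of magnitude separate the quoted figures. It is only an illustrative sketch: the numbers are the ones cited above, "just under seven hours" is rounded up to seven, and the helper name `orders_of_magnitude` is invented for this example.

```python
import math

# Figures as quoted in the article.
deepseek_cost_usd = 6e6   # "just $6m"
s1_cost_usd = 6           # "just $6"
deepseek_hours = 2.7e6    # "2.7m hours of computer time"
s1_hours = 7              # "just under seven hours", rounded up (assumption)


def orders_of_magnitude(larger, smaller):
    """Return log10 of the ratio between two positive quantities."""
    return math.log10(larger / smaller)


cost_gap = orders_of_magnitude(deepseek_cost_usd, s1_cost_usd)
hours_gap = orders_of_magnitude(deepseek_hours, s1_hours)

print(f"cost gap: {cost_gap:.1f} orders of magnitude")
print(f"compute-time gap: {hours_gap:.1f} orders of magnitude")
```

On these figures the dollar gap is about six orders of magnitude, and the gap in computer time is a little smaller.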
The figures are eye-popping, but the comparison is not exactly like-for-like. Where DeepSeek’s v3 chatbot was trained from scratch—accusations of data theft from OpenAI, an American competitor, and peers notwithstanding—s1 is instead “fine-tuned” on the pre-existing Qwen2.5 LLM, produced by Alibaba, China’s other top-tier AI lab. Before s1’s training began, in other words, the model could already write, ask questions, and produce code.
Piggybacking of this kind can lead to savings, but can’t cut costs down to single digits on its own. To do that, the American team had to break free of the dominant paradigm in AI research, in which more training data and more computing power are thought to yield a better-performing language model. They instead hypothesised that a smaller amount of data, of high enough quality, could do the job just as well. To test that proposition, they gathered a selection of 59,000 questions covering everything from standardised English tests to graduate-level problems in probability, with the intention of narrowing them down to the most effective training set possible.
To work out how to do that, the questions on their own aren’t enough. Answers are needed, too. So the team asked another AI model, Google’s Gemini, to tackle the questions using what is known as a reasoning approach, in which the model’s “thought process” is shared alongside the answer. That gave them three datasets to use to train s1: 59,000 questions; the accompanying answers; and the “chains of thought” used to connect the two.
They then threw almost all of it away. As s1 was based on Alibaba’s Qwen AI, anything that model could already solve was unnecessary. Anything poorly formatted was also tossed, as was anything that Google’s model had solved without needing to think too hard. If a given problem didn’t add to the overall diversity of the training set, it was out too. The end result was a streamlined set of 1,000 questions that, the researchers showed, could train a model just as high-performing as one trained on all 59,000, and for a fraction of the cost.
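The filtering described above can be sketched as a simple pipeline. This is not the researchers' code: the `Example` fields, the word-count threshold standing in for "thinking hard", and the round-robin over topics as a stand-in for diversity are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Example:
    question: str
    answer: str
    reasoning: str           # chain of thought produced by the teacher model
    topic: str               # hypothetical label, used here as a diversity proxy
    base_model_solved: bool  # hypothetical flag: the base model already gets it right


def well_formatted(ex):
    # Hypothetical quality check: every text field must be non-empty.
    return all(s.strip() for s in (ex.question, ex.answer, ex.reasoning))


def select_training_set(pool, min_reasoning_words=50, target_size=1000):
    # Drop what the base model already solves, what is badly formatted,
    # and what the teacher solved with very little reasoning.
    kept = [
        ex for ex in pool
        if not ex.base_model_solved
        and well_formatted(ex)
        and len(ex.reasoning.split()) >= min_reasoning_words
    ]

    # Group the survivors by topic, then pick round-robin so that no
    # single topic dominates the final set.
    by_topic = {}
    for ex in kept:
        by_topic.setdefault(ex.topic, []).append(ex)

    selected = []
    while len(selected) < target_size and any(by_topic.values()):
        for topic in list(by_topic):
            if by_topic[topic] and len(selected) < target_size:
                selected.append(by_topic[topic].pop(0))
    return selected
```

Called on a pool of 59,000 candidates with `target_size=1000`, a function of this shape would return at most 1,000 examples spread across the available topics.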
Such tricks abound. Like all reasoning models, s1 “thinks” before answering, working through the problem before announcing it has finished and presenting a final answer. But lots of reasoning models give better answers if they’re allowed to think for longer, an approach called “test-time compute”. And so the researchers hit upon the simplest possible approach to get the model to carry on reasoning: when it announces that it has finished thinking, just delete that message and add in the word “Wait” instead.
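The "Wait" trick can be sketched in a few lines. This is a minimal illustration rather than the researchers' implementation: the `generate` callable, the `END_OF_THINKING` marker and the `extra_rounds` parameter are all hypothetical stand-ins for whatever a real model and its tokeniser would provide.

```python
END_OF_THINKING = "</think>"  # hypothetical marker the model emits when it stops reasoning


def think_with_budget(generate, prompt, extra_rounds=2):
    """Extend a model's reasoning by overriding its decision to stop.

    `generate(text)` is assumed to return the model's continuation of
    `text`, ending with END_OF_THINKING once the model considers itself done.
    """
    trace = generate(prompt)
    for _ in range(extra_rounds):
        if not trace.endswith(END_OF_THINKING):
            break  # the model stopped for some other reason, such as a length limit
        # Delete the "finished thinking" message and add "Wait" instead,
        # then let the model carry on from there.
        trace = trace[: -len(END_OF_THINKING)] + "Wait"
        trace += generate(prompt + trace)
    return trace
```

Each extra round costs another call to the model, which is why longer thinking raises the cost of answering a question.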
The tricks also work. Thinking four times as long allows the model to score over 20 percentage points higher on maths tests as well as scientific ones. Being forced to think for 16 times as long takes the model from being unable to earn a single mark on a hard maths exam to getting a score of 60%. Thinking harder is more expensive, of course, and the inference costs increase with each extra “wait”. But with training available so cheaply, the added expense may be worth it.
The researchers say their new model already beats OpenAI’s first effort in the space, September’s o1-preview, on measures of maths ability. The efficiency drive is the new frontier.
© 2025, The Economist Newspaper Limited. All rights reserved. From The Economist, published under licence. The original content can be found on www.economist.com