AI is a particularly well-suited tech trajectory for India: Cerebras’ Feldman
How is this large chip (CS-3) helping train AI models faster, and how do you see businesses, academic institutions and governments leveraging it?
One of the fundamental challenges in AI right now is to distribute a single model over hundreds or thousands of GPUs. You can’t fit the big matrix multipliers (matrix multiplication is a big part of the math done in deep learning models, and it requires significant computing power) on a single GPU. But we can fit this on a single wafer, and so we can bring to the enterprise and the academic researcher the power of tens of thousands of GPUs with the programming simplicity of a single GPU, helping them do work that they wouldn’t otherwise be able to do.
We are able to tie together dozens or hundreds of these (chips) into supercomputers and make it easy to train big models. The GPT-4 paper cited 240 contributors, of whom 35 were mostly doing distributed computing. Which enterprise company has 35 supercomputer jockeys whose job it is to do distributed computing? The answer is, very few. That means it’s very difficult for them (most companies) to do big AI work. We eliminate that need (with this big chip).
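To make the single-GPU memory constraint Feldman describes concrete, here is a rough, illustrative sketch (not Cerebras’ or Nvidia’s methodology): it assumes fp16 weights and an 80 GB accelerator, and it ignores optimizer state, gradients and activations, which enlarge the training footprint considerably.

```python
# Rough, illustrative estimate (not Cerebras' or Nvidia's methodology):
# can a model's weights alone fit in a single accelerator's memory?
# Assumes fp16 weights (2 bytes per parameter) and an 80 GB device, and
# ignores optimizer state, gradients and activations, which enlarge the
# training footprint several times over.

GPU_MEMORY_GB = 80            # assumed memory of one high-end accelerator
BYTES_PER_PARAM_FP16 = 2

def weights_footprint_gb(n_params: float) -> float:
    """Gigabytes needed just to hold the model weights."""
    return n_params * BYTES_PER_PARAM_FP16 / 1e9

for n_params in (7e9, 70e9, 175e9):
    gb = weights_footprint_gb(n_params)
    verdict = "fits on one device" if gb <= GPU_MEMORY_GB else "must be sharded"
    print(f"{n_params / 1e9:.0f}B parameters -> ~{gb:.0f} GB of weights ({verdict})")
```

Even on these generous assumptions, models beyond a few tens of billions of parameters have to be split across devices, which is the distributed-computing burden the wafer-scale approach aims to remove.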
Please share some examples of how companies across sectors are working with you to leverage this big chip.
Companies like GlaxoSmithKline Pharmaceuticals are using us for genomic research and in the drug design workflow. Companies like Mayo Clinic, one of the leading medical institutions in the world, have multiple projects. Some of them are looking at using genetic data to predict which rheumatoid arthritis drug would be best for a given individual. Others are doing hospital administration, such as predicting how long a patient will stay in a hospital based on their medical history.
Customers like Total (TotalEnergies), the giant French oil and gas company, are using us to do AI work in oil exploration. We also have government customers and those who are doing research on the Covid virus. We have government researchers who include our system in giant physics simulations, using what’s called simulation plus AI, or HPC (high-performance computing) plus AI, where the AI does some training work and recommends starting points for the simulator.
How’s your partnership with Abu Dhabi-based Group 42 Holding panning out for the Arabic LLM and supercomputers you’re building with them?
G42 is our strategic partner. We’ve completed two supercomputers in the US, four exaflops each, and we’ve just started the third supercomputer in Dallas, Texas. We announced that we will build nine supercomputers with them. We also saw the opportunity to train an Arabic LLM to cater to the 400 million native Arabic speakers. G42 had the data, and we both had researchers whom we brought together to train what is, head and shoulders, the best Arabic model in the world.
We have many projects underway with them. We also trained one of the best coding models called Crystal Coder. We have also worked with M42, a JV between G42 Healthcare and Mubadala Health, and trained a medical assistant. The aspirations in the Emirates are extraordinary, and the vision and desire to be a leader in AI, exceptional.
What about India, where you have already had talks with certain companies and government officials, too?
We had lots of conversations with data centre owners, cloud providers, and with government officials in New Delhi. We have a team of about 40 engineers in Bangalore (Bengaluru) and we’re growing as fast as we can there. India has some of the great university systems—the IITs and NITs of the world. And many of the researchers working on big compute problems around the world were trained in India.
Obviously, India is one of the most exciting markets in the world, but it does not have enough supercomputers for the talent it has. So, it’s both important for sovereignty and for a collection of national issues to have better infrastructure in India to create an opportunity to keep some of its world-class talent that wants to work on the biggest supercomputers.
I think AI is a particularly well-suited technology trajectory for India. It builds on a strength in CS (computer science) and in statistics that you’ve had in your university system for generations.
Talking about your big chip, companies now appear to be focused more on fine-tuning large language models (LLMs), building smaller language models (SLMs), and doing inference (using a pre-trained AI model to make predictions or decisions on new, unseen data), rather than building large multimodal models (LMMs) and foundational models. Many such customers could make do with fewer GPUs. Wouldn’t your big chip prove overkill and too expensive for such clients?
The amount of compute you need is approximately the product of the size of the model and the number of tokens you train on. Now, the cost to do inference is a function of the size of the model. And so, as we’re thinking about how to deploy these models in production, there’s a preference for smaller models. However, there isn’t a preference for less accuracy. So, while the models might be 7 billion (b), 13b or 30b parameters, the number of tokens they’re trained on is a trillion, two trillion (and more). So, the amount of compute you need hasn’t changed. In fact, in many instances, it’s gone up.
Hence, you still need huge amounts of compute, even though the models are smaller, because you’re running so many tokens of data through them. In fact, you’re trading off parameter size against data. And you’re not using less compute, you’re just allocating it differently, because that has different ramifications for the cost of inference.
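Feldman’s rule of thumb, that training compute scales with the product of model size and token count, can be sketched with the common approximation of roughly 6 floating-point operations per parameter per token. That constant comes from the scaling-law literature and is an assumption here, not a figure from the interview; the model and token counts below are hypothetical.

```python
# Sketch of the "compute ~ parameters x tokens" rule of thumb.
# The factor of 6 FLOPs per parameter per token is a common approximation
# from the scaling-law literature, not a figure quoted in the interview.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute in floating-point operations."""
    return 6 * n_params * n_tokens

# Hypothetical configurations: a smaller model trained on more tokens
# ends up needing roughly as much compute as a larger one.
for label, n, d in [
    ("30B params, 0.6T tokens", 30e9, 0.6e12),
    ("13B params, 1.5T tokens", 13e9, 1.5e12),
    (" 7B params, 2.5T tokens", 7e9, 2.5e12),
]:
    print(f"{label}: ~{training_flops(n, d):.1e} FLOPs")
```

The toy numbers all land near the same total: shrinking the parameter count while multiplying the token count leaves training compute roughly where it was, which is the trade-off Feldman describes.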
I also do not subscribe to the view that as inference grows, there will be less training. As inference grows, the importance of accuracy and training increases. And so, there will be more training. If you have a good model, say, for reading pathology slides, and it’s 93% accurate, and you don’t retrain, and someone else comes up with one that’s 94% accurate, who’s going to use your model? And so, as these models are deployed, there will be tremendous pressure on you to be better, and better, and better. Training will continue for years and years to come.
Inference will come in many flavours: there will be easy batch inference, and then there will be real-time inference in which latency matters a huge amount. The obvious example is self-driving cars, which are making inference decisions in near real time. And as we move inference to harder problems, and we include it in a control system, then the inference challenge is much harder. Those are problems we’re interested in. Most of the inference problems today are pretty straightforward, and we’ve partnered with Qualcomm, because they have an excellent offering. And we wanted to be sure we could show up with a solution that did not include Nvidia.
But what about the cost comparison with GPUs?
Inference is on the rise, and our systems today are being used for what I call real-time, very hard inference problems, predominantly for defence and security. I think there will be more of those over time. But in the next 9-12 months, the market will be dominated by much easier problems.
That said, CS-3 costs about the same as three DGX H100s (Nvidia’s AI system capable of handling demanding tasks such as generative AI, natural language processing, and deep learning recommendation models), and gives you the performance of seven or 10. And so you have a dramatic price performance advantage.
But if you want one GPU, we’re not a good choice. We begin being sort of equivalent to 40 or 50 GPUs. So, we have a higher entry point, but we’re designed for the AI practitioner who’s doing real work—you have to be interested in training models of some size, or some complexity on interesting data sets. That’s where we enter.
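Taking the figures above at face value (one CS-3 priced like three DGX H100s and delivering the work of seven to ten), the implied price-performance advantage is simple arithmetic on the quoted numbers, not an independent benchmark:

```python
# Arithmetic on the figures quoted above; not an independent benchmark.
CS3_COST_IN_DGX_UNITS = 3          # one CS-3 priced like three DGX H100s
PERFORMANCE_RANGE = (7, 10)        # claimed equivalent performance, in DGX H100s

for perf in PERFORMANCE_RANGE:
    ratio = perf / CS3_COST_IN_DGX_UNITS
    print(f"Work of {perf} DGX H100s at the price of {CS3_COST_IN_DGX_UNITS}: "
          f"~{ratio:.1f}x price-performance")
```

On those claims the advantage works out to roughly 2.3x to 3.3x, with the caveat that it only applies above the 40-50 GPU entry point Feldman describes.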
But Nvidia is estimated to have 80-95% global market share in AI chips, and it will be very difficult to compete in this space.
I don’t think we have to take share. I think the market is growing so fast. I mean, Nvidia added $40 billion last year to their revenue. And that market is growing unbelievably quickly. The universe of AI is expanding at an extraordinary rate. And there’ll be many winners. We did big numbers last year. We’re going to do bigger numbers this year. We’ve raised about $750 million in total to date, and the last round’s valuation was $4.1 billion.
How water and energy efficient are your big chips?
There are several interesting elements of a big chip. Each chip uses about 18 kilowatts, but it replaces 40 or 50 chips that use 600 watts each. Moreover, when you build one chip that does a lot of work, you can afford more efficient cooling. GPUs use air, and air is an inefficient cooler. We use water, because we can amortize a more efficient and more expensive cooling system over more compute on a wafer. And so, per unit of compute, we generally run at somewhere between a third and half the power draw. Why is that? The big chip allows us to be more efficient in our compute, to keep information on the chip rather than move it and spend power in switches (electronic switches are the basic building blocks of microchips), and to use a more efficient cooling mechanism.
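On the quoted power figures (one wafer-scale system at about 18 kW versus the 40-50 GPUs at roughly 600 W each that it is said to replace), the raw comparison works out as below. Note that the “third to half” figure Feldman cites is a per-unit-of-compute claim that also credits the wafer’s efficiency, so it is not simply this ratio.

```python
# Raw power arithmetic on the interview's figures; these are quoted numbers,
# not measurements. Per-unit-of-compute comparisons would also factor in
# how much work each configuration completes.
WAFER_SYSTEM_KW = 18.0
GPU_WATTS = 600.0

for gpu_count in (40, 50):
    cluster_kw = gpu_count * GPU_WATTS / 1000.0
    print(f"{gpu_count} GPUs x {GPU_WATTS:.0f} W = {cluster_kw:.0f} kW vs "
          f"{WAFER_SYSTEM_KW:.0f} kW for one wafer-scale system "
          f"({WAFER_SYSTEM_KW / cluster_kw:.0%} of the cluster's draw)")
```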
You’ve also said that this big chip has broken Moore’s law. What exactly do you mean?
Moore’s law said the number of transistors on a single chip would double every 18 months at lower cost. But, first, that required the shrinking of fab geometries. Second, the chips themselves got bigger. But the reticle limit, which constrains everybody but Cerebras, was about 815-820 square millimetres. We obliterated the reticle limit and went to 46,000 square millimetres. So, in a single chip, we were able to use more silicon to break Moore’s law. That was the insight for this workload: the cost of moving data off the chip, the cost of all these switches, the cost that forced Nvidia to buy Mellanox (a company Nvidia acquired in March 2019 to optimize datacentre-scale workloads), could be avoided with a big chip. While everybody else is working with 60 billion, 80 billion, 120 billion transistors, we’re at 4 trillion.
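Putting the quoted die-size and transistor figures side by side gives a sense of scale; the ratios below are derived purely from the numbers in the answer above.

```python
# Ratios derived from the figures quoted in the answer above.
RETICLE_LIMIT_MM2 = 820            # conventional single-die limit cited
WAFER_SCALE_MM2 = 46_000           # wafer-scale die area cited

CONVENTIONAL_TRANSISTORS = 80e9    # middle of the 60-120 billion range cited
WAFER_SCALE_TRANSISTORS = 4e12

print(f"Silicon area: ~{WAFER_SCALE_MM2 / RETICLE_LIMIT_MM2:.0f}x a reticle-limited die")
print(f"Transistors:  ~{WAFER_SCALE_TRANSISTORS / CONVENTIONAL_TRANSISTORS:.0f}x a conventional GPU")
```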
Some people believe that GenAI is being overhyped and are becoming disillusioned, given its limitations, including hallucinations, lack of accuracy, and copyright, trademark and IP violations. The other school believes that GenAI models will iron out these kinds of problems over a period of time and achieve maturity by 2027 or 2028. What’s your view?
These arguments only exist if there are kernels of truth on both sides. AI is not a silver bullet. It allows us to attack a class of problems with computers that have historically been foreclosed to us, like images, like text. It allows us to find insight in data in a new and different way. And generally, the first step is to make existing work better: better summarization, replacing people who do character recognition, replacing analysts who looked at satellite imagery with machines. And you get a modest performance gain, a sort of societal benefit in GDP (gross domestic product) growth. But it typically takes 3-7 years, following which you begin to reorganize things around the new technology, and get the massive bump.
For instance, computers first replaced ledgers, then assistants, and then typewriters. And we got a little bump in productivity. But when we moved to the cloud, and reorganized the delivery of software so that you could gain access to compute anywhere, we suddenly got a huge jump in labour productivity.
So, there are kernels of truth in both arguments. But to people who say this is the answer to everything, you’re clearly going to be wrong. To people who say there are obviously large opportunities to have substantial impact, you’re right.
In my view, AI is the most important technology trajectory of our generation, bar none. But it’s not going to solve every problem—it will give us answers to many problems. It solves protein folding—a problem that humans had not been able to solve until then. It has made games like chess and poker, which had been interesting to people for hundreds and hundreds of years, trivial. It will change the way wars are fought. It will change the way drugs are discovered.
But will it make me a better husband? Probably not. Will it help my friendships? Will it help my dreams and aspirations? No. Sometimes we go crazy thinking about a new technology.
Talking about crazy, what are your quick thoughts on artificial general intelligence (AGI)?
I think we will definitely have machines that can do pretty thoughtful reasoning. But I don’t think that’s AGI. That’s data-driven logical learning. I am not optimistic about AGI, as most people conceive of it, in the next 5-10 years. I think we will get better and better at extracting insight from data, extracting logic from data, and reasoning. But I don’t think we’re close to some notion of self-awareness.
In this context, what should CXOs keep in mind when implementing AI and GenAI projects?
There are three fundamental elements when dealing with AI: the algorithm, data, and computing power. And you have to decide where you have an advantage. Many CXOs have data, and they are sitting on a gold mine if they’ve invested in curated data. And AI is a mechanism to extract insight from data. This can help them think about the partnerships they need.
Consider the case of OpenAI, which had algorithms; they partnered with Microsoft for compute, and they used open-source data. GlaxoSmithKline had data, partnered with us for compute, and had internal algorithm expertise. These three parts will shape your strategy, and your data will be enormously important for the construction of models that solve your problems.
Published: 08 Apr 2024, 06:30 AM IST