AI is a particularly well-suited tech trajectory for India: Cerebras’ Feldman
How is this large chip (CS-3) helping train AI models faster, and how do you see businesses, academic institutions and governments leveraging it?
One of the fundamental challenges in AI right now is to distribute a single model over hundreds or thousands of GPUs. You can’t fit the big matrix multipliers (matrix multiplication is a big part of the math done in deep learning models, and it requires significant computing power) on a single GPU. But we can fit this on a single wafer, and so we can bring to the enterprise and the academic researcher the power of tens of thousands of GPUs with the programming simplicity of a single GPU, helping them do work that they wouldn’t otherwise be able to do.
We are able to tie together dozens or hundreds of these (chips) into supercomputers and make it easy to train big models. The GPT-4 paper cited 240 contributors, of whom 35 were mostly doing distributed computing. Which enterprise company has 35 supercomputer jockeys whose job it is to do distributed computing? The answer is, very few. That means it’s very difficult for them (most companies) to do big AI work. We eliminate that need (with this big chip).
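To make the single-GPU memory constraint Feldman describes concrete, here is a rough, illustrative sketch (not Cerebras’ or Nvidia’s methodology): it assumes fp16 weights and an 80 GB accelerator, and it ignores optimizer state, gradients and activations, which enlarge the training footprint considerably.

```python
# Rough, illustrative estimate (not Cerebras' or Nvidia's methodology):
# can a model's weights alone fit in a single accelerator's memory?
# Assumes fp16 weights (2 bytes per parameter) and an 80 GB device, and
# ignores optimizer state, gradients and activations, which enlarge the
# training footprint several times over.

GPU_MEMORY_GB = 80            # assumed memory of one high-end accelerator
BYTES_PER_PARAM_FP16 = 2

def weights_footprint_gb(n_params: float) -> float:
    """Gigabytes needed just to hold the model weights."""
    return n_params * BYTES_PER_PARAM_FP16 / 1e9

for n_params in (7e9, 70e9, 175e9):
    gb = weights_footprint_gb(n_params)
    verdict = "fits on one device" if gb <= GPU_MEMORY_GB else "must be sharded"
    print(f"{n_params / 1e9:.0f}B parameters -> ~{gb:.0f} GB of weights ({verdict})")
```

Even on these generous assumptions, models beyond a few tens of billions of parameters have to be split across devices, which is the distributed-computing burden the wafer-scale approach aims to remove.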
Please share some examples of how companies across sectors are working with you to leverage this big chip.
Companies like GlaxoSmithKline Pharmaceuticals are using us for genomic research and in the drug design workflow. Companies like Mayo Clinic, one of the leading medical institutions in the world, have multiple projects. Some of them are looking at using genetic data to predict which rheumatoid arthritis drug would be best for a given individual. Others are doing hospital administration, such as predicting how long a patient will stay in a hospital based on their medical history.
Customers like Total (TotalEnergies), the giant French oil and gas company, are using us to do AI work in oil exploration. We also have government customers and those who are doing research on the Covid virus. We have government researchers who include our system in giant physics simulations, using what’s called simulation plus AI, or HPC (high-performance computing) plus AI, where the AI does some training work and recommends starting points for the simulator.
How’s your partnership with Abu Dhabi-based Group 42 Holding panning out for the Arabic LLM and supercomputers you’re building with them?
G42 is our strategic partner. We’ve completed two supercomputers in the US, four exaflops each, and we’ve just started the third supercomputer in Dallas, Texas. We announced that we will build nine supercomputers with them. We also saw the opportunity to train an Arabic LLM to cater to the 400 million native Arabic speakers. G42 had the data, and we both had researchers whom we brought together to train what is, head and shoulders, the best Arabic model in the world.
We have many projects underway with them. We also trained one of the best coding models called Crystal Coder. We have also worked with M42, a JV between G42 Healthcare and Mubadala Health, and trained a medical assistant. The aspirations in the Emirates are extraordinary, and the vision and desire to be a leader in AI, exceptional.
What about India, where you have already had talks with certain companies and government officials, too?
We had lots of conversations with data centre owners, cloud providers, and with government officials in New Delhi. We have a team of about 40 engineers in Bangalore (Bengaluru) and we’re growing as fast as we can there. India has some of the great university systems—the IITs and NITs of the world. And many of the researchers working on big compute problems around the world were trained in India.
Obviously, India is one of the most exciting markets in the world, but it does not have enough supercomputers for the talent it has. So, it’s both important for sovereignty and for a collection of national issues to have better infrastructure in India to create an opportunity to keep some of its world-class talent that wants to work on the biggest supercomputers.
I think AI is a particularly well-suited technology trajectory for India. It builds on a strength in CS (computer science) and in statistics that you’ve had in your university system for generations.
Talking about your big chip, companies now appear to be focused more on fine-tuning large language models (LLMs), building smaller language models (SLMs), and doing inference (using a pre-trained AI model to make predictions or decisions on new, unseen data), rather than building large multimodal models (LMMs) and foundational models. Many such customers could make do with fewer GPUs. Wouldn’t your big chip prove overkill and too expensive for such clients?
The amount of compute you need is approximately the product of the size of the model and the number of tokens you train on. Now, the cost to do inference is a function of the size of the model. And so, as we’re thinking about how to deploy these models in production, there’s a preference for smaller models. However, there isn’t a preference for less accuracy. So, while the models might be 7 billion (b), 13b or 30b parameters, the number of tokens they’re trained on is a trillion, two trillion (and more). So, the amount of compute you need hasn’t changed. In fact, in many instances, it’s gone up.
Hence, you still need huge amounts of compute, even though the models are smaller, because you’re running so many tokens of data through them. In fact, you’re trading off parameter size against data. And you’re not using less compute, you’re just allocating it differently, because that has different ramifications for the cost of inference.
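Feldman’s rule of thumb, that training compute scales with the product of model size and token count, can be sketched with the common approximation of roughly 6 floating-point operations per parameter per token. That constant comes from the scaling-law literature and is an assumption here, not a figure from the interview; the model and token counts below are hypothetical.

```python
# Sketch of the "compute ~ parameters x tokens" rule of thumb.
# The factor of 6 FLOPs per parameter per token is a common approximation
# from the scaling-law literature, not a figure quoted in the interview.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute in floating-point operations."""
    return 6 * n_params * n_tokens

# Hypothetical configurations: a smaller model trained on more tokens
# ends up needing roughly as much compute as a larger one.
for label, n, d in [
    ("30B params, 0.6T tokens", 30e9, 0.6e12),
    ("13B params, 1.5T tokens", 13e9, 1.5e12),
    (" 7B params, 2.5T tokens", 7e9, 2.5e12),
]:
    print(f"{label}: ~{training_flops(n, d):.1e} FLOPs")
```

The toy numbers all land near the same total: shrinking the parameter count while multiplying the token count leaves training compute roughly where it was, which is the trade-off Feldman describes.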
I also do not subscribe to the view that as inference grows, there will be less training. As inference grows, the importance of accuracy and training increases. And so, there will be more training. If you have a good model, say, for reading pathology slides, and it’s 93% accurate, and you don’t retrain, and someone else comes up with one that’s 94% accurate, who’s going to use your model? And so, as these models are deployed, there will be tremendous pressure on you to be better, and better, and better. Training will continue for years and years to come.
Inference will come in many flavours: there will be easy batch inference, and then there will be real-time inference in which latency matters a huge amount. The obvious example is self-driving cars, which are making inference decisions in near real time. And as we move inference to harder problems, and we include it in a control system, then the inference challenge is much harder. Those are problems we’re interested in. Most of the inference problems today are pretty straightforward, and we’ve partnered with Qualcomm, because they have an excellent offering. And we wanted to be sure we could show up with a solution that did not include Nvidia.
But what about the cost comparison with GPUs?
Inference is on the rise, and our systems today are being used for what I call real-time, very hard inference problems, predominantly for defence and security. I think there will be more of those over time. But in the next 9-12 months, the market will be dominated by much easier problems.
That said, CS-3 costs about the same as three DGX H100s (Nvidia’s AI system capable of handling demanding tasks such as generative AI, natural language processing, and deep learning recommendation models), and gives you the performance of seven or 10. And so you have a dramatic price performance advantage.
But if you want one GPU, we’re not a good choice. We begin being sort of equivalent to 40 or 50 GPUs. So, we have a higher entry point, but we’re designed for the AI practitioner who’s doing real work—you have to be interested in training models of some size, or some complexity on interesting data sets. That’s where we enter.
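Taking the figures above at face value (one CS-3 priced like three DGX H100s and delivering the work of seven to ten), the implied price-performance advantage is simple arithmetic on the quoted numbers, not an independent benchmark:

```python
# Arithmetic on the figures quoted above; not an independent benchmark.
CS3_COST_IN_DGX_UNITS = 3          # one CS-3 priced like three DGX H100s
PERFORMANCE_RANGE = (7, 10)        # claimed equivalent performance, in DGX H100s

for perf in PERFORMANCE_RANGE:
    ratio = perf / CS3_COST_IN_DGX_UNITS
    print(f"Work of {perf} DGX H100s at the price of {CS3_COST_IN_DGX_UNITS}: "
          f"~{ratio:.1f}x price-performance")
```

On those claims the advantage works out to roughly 2.3x to 3.3x, with the caveat that it only applies above the 40-50 GPU entry point Feldman describes.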
But Nvidia is estimated to have 80-95% global market share in AI chips, and it will be very difficult to compete in this space.
I don’t think we have to take share. I think the market is growing so fast. I mean, Nvidia added $40 billion last year to their revenue. And that market is growing unbelievably quickly. The universe of AI is expanding at an extraordinary rate. And there’ll be many winners. We did big numbers last year. We’re going to do bigger numbers this year. We’ve raised about $750 million in total to date, and the last round’s valuation was $4.1 billion.
How water and energy efficient are your big chips?
There are several interesting elements of a big chip. Each chip uses about 18 kilowatts, but it replaces 40 or 50 chips that use 600 watts each. Moreover, when you build one chip that does a lot of work, you can afford more efficient cooling. GPUs use air, and air is an inefficient cooler. We use water, because we can amortize a more efficient and more expensive cooling system over more compute on a wafer. And so, per unit of compute, we generally run at somewhere between a third and half the power draw. Why is that? The big chip allows us to be more efficient in our compute, to keep information on the chip rather than move it and spend power in switches (electronic switches are the basic building blocks of microchips), and to use a more efficient cooling mechanism.
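On the quoted power figures (one wafer-scale system at about 18 kW versus the 40-50 GPUs at roughly 600 W each that it is said to replace), the raw comparison works out as below. Note that the “third to half” figure Feldman cites is a per-unit-of-compute claim that also credits the wafer’s efficiency, so it is not simply this ratio.

```python
# Raw power arithmetic on the interview's figures; these are quoted numbers,
# not measurements. Per-unit-of-compute comparisons would also factor in
# how much work each configuration completes.
WAFER_SYSTEM_KW = 18.0
GPU_WATTS = 600.0

for gpu_count in (40, 50):
    cluster_kw = gpu_count * GPU_WATTS / 1000.0
    print(f"{gpu_count} GPUs x {GPU_WATTS:.0f} W = {cluster_kw:.0f} kW vs "
          f"{WAFER_SYSTEM_KW:.0f} kW for one wafer-scale system "
          f"({WAFER_SYSTEM_KW / cluster_kw:.0%} of the cluster's draw)")
```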
You’ve also said that this big chip has broken Moore’s law. What exactly do you mean?
Moore’s law said the number of transistors on a single chip would double every 18 months at lower cost. But, first, that required the shrinking of fab geometries. Second, the chips themselves got bigger. But the reticle limit, which constrains everybody but Cerebras, was about 815-820 square millimetres. We obliterated the reticle limit and went to 46,000 square millimetres. So, in a single chip, we were able to use more silicon to break Moore’s law. That was the insight for this workload: the cost of moving data off the chip, the cost of all these switches, the cost that forced Nvidia to buy Mellanox (a company Nvidia acquired in March 2019 to optimize datacentre-scale workloads), could be avoided with a big chip. While everybody else is working with 60 billion, 80 billion, 120 billion transistors, we’re at 4 trillion.
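Putting the quoted die-size and transistor figures side by side gives a sense of scale; the ratios below are derived purely from the numbers in the answer above.

```python
# Ratios derived from the figures quoted in the answer above.
RETICLE_LIMIT_MM2 = 820            # conventional single-die limit cited
WAFER_SCALE_MM2 = 46_000           # wafer-scale die area cited

CONVENTIONAL_TRANSISTORS = 80e9    # middle of the 60-120 billion range cited
WAFER_SCALE_TRANSISTORS = 4e12

print(f"Silicon area: ~{WAFER_SCALE_MM2 / RETICLE_LIMIT_MM2:.0f}x a reticle-limited die")
print(f"Transistors:  ~{WAFER_SCALE_TRANSISTORS / CONVENTIONAL_TRANSISTORS:.0f}x a conventional GPU")
```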
Some people believe that GenAI is being overhyped and are becoming disillusioned, given its limitations, including hallucinations, lack of accuracy, and copyright, trademark and IP violations. The other school believes that GenAI models will iron out these kinds of problems over a period of time and achieve maturity by 2027 or 2028. What’s your view?
These arguments only exist if there are kernels of truth on both sides. AI is not a silver bullet. It allows us to attack a class of problems with computers that have historically been foreclosed to us, like images, like text. It allows us to find insight in data in a new and different way. And generally, the first step is to make existing work better: better summarization, replacing people who do character recognition, replacing analysts who looked at satellite imagery with machines. And you get a modest performance gain, a sort of societal benefit in GDP (gross domestic product) growth. But it typically takes 3-7 years, following which you begin to reorganize things around the new technology, and get the massive bump.
For instance, computers first replaced ledgers, then assistants, and then typewriters. And we got a little bump in productivity. But when we moved to the cloud, and reorganized the delivery of software so that you could gain access to compute anywhere, we suddenly got a huge jump in labour productivity.
So, there are kernels of truth in both arguments. But to people who say this is the answer to everything, you’re clearly going to be wrong. To people who say there are obviously large opportunities to have substantial impact, you’re right.
In my view, AI is the most important technology trajectory of our generation, bar none. But it’s not going to solve every problem—it will give us answers to many problems. It solves protein folding—a problem that humans had not been able to solve until then. It has made games like chess and poker, which had been interesting to people for hundreds and hundreds of years, trivial. It will change the way wars are fought. It will change the way drugs are discovered.
But will it make me a better husband? Probably not. Will it help my friendships? Will it help my dreams and aspirations? No. Sometimes we go crazy thinking about a new technology.
Talking about crazy, what are your quick thoughts on artificial general intelligence (AGI)?
I think we will definitely have machines that can do pretty thoughtful reasoning. But I don’t think that’s AGI. That’s data-driven logical learning. I am not optimistic about AGI, as most people conceive of it, in the next 5-10 years. I think we will get better and better at extracting insight from data, extracting logic from data, and reasoning. But I don’t think we’re close to some notion of self-awareness.
In this context, what should CXOs keep in mind when implementing AI and GenAI projects?
There are three fundamental elements when dealing with AI: the algorithm, data, and computing power. And you have to decide where you have an advantage. Many CXOs have data, and they are sitting on a gold mine if they’ve invested in curated data. And AI is a mechanism to extract insight from data. This can help them think about the partnerships they need.
Consider the case of OpenAI, which had algorithms; they partnered with Microsoft for compute, and they used open-source data. GlaxoSmithKline had data, partnered with us for compute, and had internal algorithm expertise. These three parts will shape your strategy, and your data will be enormously important for the construction of models that solve your problems.
Published: 08 Apr 2024, 06:30 AM IST