

Technology
Mint Explainer: The mercurial rise of India-focused LLMs – Crypto News
Aggarwal, thus, joins the growing ranks of Indian organisations that are building large language models (LLMs) trained on Indian languages. These efforts include Bhashini, a unit of the national language translation mission by the ministry of electronics and information technology (Meity); Tech Mahindra’s Indus project; AI4Bharat at IIT-Madras; Project Vaani, part of the Bhasha AI project of ARTPARK and the Indian Institute of Science’s pan-India language initiatives; Sarvam AI’s OpenHathi series; and CoRover.ai’s BharatGPT.
Generative AI, or GenAI, refers to the ability of LLM-powered chatbots such as ChatGPT to create new content, including audio, code, images, text, simulations, and videos (hence the term, multimodal). GenAI systems fall under the broad category of machine learning, but unlike traditional ML, which analyses data patterns to make predictions, these systems create entirely new content with the help of ‘prompts’.
That said, can Ola’s Aggarwal do a Google Gemini or OpenAI’s GPT-4? And why is the group behind electric vehicle maker Ola Electric and ride-sharing startup Ola Cabs dabbling with foundational models, data centres, and silicon chips that require a lot of investment?
What’s Krutrim got to do with Ola?
Aggarwal’s Krutrim announcement comes at a time when the government is set to unveil its AI policy under the India AI programme on 10 January, which will include a policy framework for public-private partnership models on development of AI databases in Indic languages, as well as indigenous compute capacities, according to Union minister of state for IT, Rajeev Chandrasekhar. But the release of Krutrim’s base foundation model also comes at a time when Ola Electric is gearing up to file for an IPO.
Backed by SoftBank, Ola Electric is targeting a valuation of $7-8 billion by early 2024. While that figure’s much higher than the company’s current estimated worth of about $3.6 billion, it’s closer to Ola Electric’s estimated valuation of $7.3 billion as at the end of 2021.
Ola Electric plans to use the funds raised from the IPO for expanding its electric vehicle business and establishing a dedicated lithium-ion cell manufacturing unit.
Aggarwal has clarified that Krutrim is a “separate business altogether”, and will not “be integrated at a transactional level”.
“There are some entities that I own 100%—this is under my company, and not part of Ola or Ola Electric’s corporate structure,” he said. Aggarwal did say Krutrim had “some investments into (Ola Electric)”, but did not disclose any details.
Further, in a presentation, Aggarwal said that all Ola group companies were “already using Krutrim for a lot of their internal workloads, be it customer support, voice and chat, customer sales calls, and for other processes…”
This clearly implies that Krutrim’s products and services will be cross-sold to enhance the offerings of the group companies.
How’s GenAI used in vehicles?
The use of generative AI in the auto sector is not new. Mercedes-Benz, for instance, recently used ChatGPT to power voice assistants in a beta program available to more than 900,000 vehicles.
Also consider the example of a Formula E electric race car, the GENBETA, an enhanced GEN3 race car. The GEN3 is the fastest, lightest electric race car, with a top speed of more than 322 kmph, and is used by the 11 teams and 22 drivers in the ABB FIA Formula E World Championship.
Google Cloud provided generative AI to analyse the drivers’ runs. Additionally, experts from McKinsey & Co.’s AI arm, called QuantumBlack, built data and analytics components to create the driver interface that analysed and queried data in real-time using generative AI.
According to Nvidia, generative AI is also enabling new breakthroughs in autonomous vehicle development in research areas including the use of neural radiance field technology to turn recorded sensor data into fully interactive 3D simulations. These digital twin environments, as well as synthetic data generation, can be used to develop, test and validate autonomous vehicles at incredible scale.
Aggarwal’s AI ambitions, however, appear to go far beyond just the auto sector, given that the Ola group’s businesses extend beyond mobility to financial services, including payment systems and insurance, as well as cloud kitchens.
What’s the plan with Krutrim?
Krutrim’s AI model, according to the company, has been trained on more than 2 trillion tokens. Loosely, tokens are numerical representations of the words and sub-words that an LLM can process: banana, for instance, may be a single token, while homework can be split into two, home and work. While Aggarwal compared Krutrim to GPT-4, the latter has been trained on more than 13 trillion tokens.
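To make the idea of tokens concrete, here is a minimal Python sketch of sub-word splitting. It is a toy, not Krutrim’s or OpenAI’s tokenizer: the four-entry vocabulary, the ID numbers and the function names `tokenize` and `encode` are all assumptions made up for illustration, and real LLM tokenizers learn vocabularies of tens of thousands of pieces from data.

```python
# Toy vocabulary: hypothetical pieces and IDs, chosen only for illustration.
VOCAB = {"banana": 0, "home": 1, "work": 2, "<unk>": 3}


def tokenize(word: str) -> list[str]:
    """Split a word into the longest vocabulary pieces, scanning left to right."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in VOCAB:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No known piece starts here: emit an unknown marker for one character.
            pieces.append("<unk>")
            start += 1
    return pieces


def encode(word: str) -> list[int]:
    """Map each piece to its numeric ID, the form a model actually consumes."""
    return [VOCAB[piece] for piece in tokenize(word)]


print(tokenize("banana"), encode("banana"))      # ['banana'] [0]
print(tokenize("homework"), encode("homework"))  # ['home', 'work'] [1, 2]
```

In this sketch, banana stays whole because it is in the vocabulary, while homework is broken into home and work. A training corpus measured in trillions of tokens is simply a count of such pieces across all the text a model has seen.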
That said, the strength of Krutrim may lie in its ability to understand 20 Indian languages and generate content in 10 of them, including Marathi, Hindi, Telugu, Kannada, and Odia. Aggarwal said Krutrim has been “trained on 20 times more Indic tokens than any other model, ensuring a deep understanding of Indian culture, values, and aspirations”.
While those who register for the base LLM at OlaKrutrim are placed on a waitlist, Aggarwal plans to make the “whole platform” available for developers to build application programming interfaces, or APIs, for enterprise applications, in February. Ola also plans to launch Krutrim Pro in the next quarter.
Can Ola afford a Krutrim?
Building a foundational model from scratch is an expensive affair. OpenAI’s GPT was in the works for more than six years, cost upwards of $100 million, and used an estimated 30,000 graphics processing units (GPUs). Aggarwal has not disclosed any details of his investments, or the costs, in Krutrim so far.
In FY22, Ola derived about 61% of its revenue, or ₹1,208.6 crore, from its ride-hailing business in India, while posting a loss of ₹101 crore. Financial services comprised a small part of the revenue. The group posted a consolidated operating revenue of ₹1,970.4 crore in FY22, rising from ₹983.2 crore in the year before. Ola’s net losses, though, widened in FY22 to ₹1,522.33 crore from ₹1,116.6 crore in the previous year.
That said, since Krutrim is a separate business, Aggarwal may be bootstrapping the venture, given that he has a personal net worth of a little over $1.4 billion. One, however, will have to wait till Aggarwal discloses more details about his investment plans in designing silicon chips and building the LLM ecosystem.
How can GenAI work with regional languages?
The fact remains that even though India is home to more than 400 languages, making it one of the most linguistically diverse countries in the world, most foundation models and LLMs are trained primarily using internet data, which is predominantly English. As per Statista, English was the most popular language for web content, representing nearly 59% of websites as of January this year. Russian ranked second with 5.3% of web content, followed by Spanish with 4.3%.
While one can only laud the contribution of India’s Centre for Development of Advanced Computing (C-DAC) in developing the country’s multilingual ecosystem over the past three decades, AI models need to be trained using regional languages to bridge the digital divide in countries like India, which is why efforts such as Krutrim make a lot of sense.
Krutrim, on its part, says it will tap Bhashini, whose technology comprises automatic speech recognition, optical character recognition (OCR), natural language understanding, machine translation, and text-to-speech. The Bhashini platform, for instance, uses OCR to extract text from printed materials such as brochures to train AI models in 14 languages.
But getting local datasets is a challenge, according to the CEO of Bhashini, Amitabh Nag, who pointed out that many of the 22 official Indian languages do not have digital data, which makes it difficult to build and train an AI model. Bhashini has so far spent $6-7 million to collect data from different sources and employed more than 200 people to collect data (text as well as speech) and feed it into the system, following which the data is curated, annotated, and labelled.
What other Indic LLMs are in the works?
- The ‘Nilekani Center at AI4Bharat’ (named after Nandan Nilekani), launched at the Indian Institute of Technology-Madras in July last year, is building open-source language AI for Indian languages, including datasets, models, and applications. The project is supported by EkStep Foundation, Microsoft’s Research Lab, and the India Development Center.
- Sarvam AI, a generative AI startup founded by Vivek Raghavan and Pratyush Kumar (both co-founders of AI4Bharat), is developing LLMs specifically for India–the OpenHathi Series. The startup will focus on training AI models to support the diverse set of Indian languages and voice-first interfaces. It will work with Indian enterprises to co-build domain-specific AI models on their data, and also plans to use GenAI atop the India stack (Aadhaar, UPI, Account Aggregator, etc.) “specifically for public-good applications”. Sarvam AI is partnering with AI4Bharat, which has “contributed language resources and benchmarks”.
- Bangalore-based AI and Robotics Technology Park (ARTPARK) and the Indian Institute of Science are partnering with Google India to launch a large language model called Project Vaani. This is part of the Bhasha AI project of ARTPARK and IISc’s pan-India language initiatives, which include SYSPIN (Synthesizing Speech in Indian languages) and RESPIN (Recognizing Speech in Indian languages). While Google plans to collect speech samples from 773 districts, the initiative is currently focused on 80 districts of 10 states. It is expected to expand over the next couple of years, with over 150,000 hours of curated speech and 100 million sentences of text in Indian scripts.
- Cloud-based communications startup Ozonetel, too, recently partnered with Swecha Telangana at the Indian Institute of Information Technology-Hyderabad to compile a Telugu stories dataset, aimed at building a Telugu LLM. About 8,000 students from 20 colleges participated to create 40,000 pages of Telugu content.
- CoRover has launched its own indigenous LLM called BharatGPT, which is available in more than 12 Indian languages in partnership with Bhashini. CoRover Pvt. Ltd currently offers AI Virtual Assistants (chatbots, voicebots, videobots) to organisations including IRCTC, LIC, the Indian Navy (GRSE), Max Life Insurance, and NPCI. The company is hosted on the Google Cloud Platform (GCP), and Google’s Vertex AI is integrated with CoRover’s conversational AI platform, allowing organisations to utilise Google’s AI services.
- And in another effort in the auto sector, the Mahindra Group said in August that it aimed to construct an indigenous LLM specifically designed to converse in a multitude of Indic languages. In the first phase, the Indus Project targets the inclusion of a remarkable 40 Hindi dialects, paving the way for an ever-expanding roster. Tech Mahindra acknowledges it has “drawn inspiration from ‘Bhashini’… to amass datasets on Indic languages”.