‘We want to fix the language gap in AI language models’, says Two Platforms’ Pranav Mistry
Backed by billionaire Mukesh Ambani’s Jio Platforms and South Korea’s Naver Corp, the artificial reality startup also plans to soon release an artificial intelligence (AI)-powered messaging and social app called Zappy in India, according to Two Platforms’ founder and CEO, Pranav Mistry.
“Sutra is our mission to fix the language gap in AI language models. We are committed to pioneering AI solutions for non-English markets. We believe our Sutra models will unlock AI growth opportunities in large economies such as India, Korea, Japan, and the MEA (Middle East and Africa) region,” Mistry said in an interview with Mint.
But there are some basic differences “in our approach to building these models”, he insisted. First, unlike most other startups and companies that are building ‘local’ or ‘Indic’ LLMs for India by fine-tuning global LLMs, “we have built a foundational, and not a fine-tuned model,” he said.
General-purpose foundational models such as Google’s BERT and Gemini, OpenAI’s generative pre-trained transformer (GPT) variants, and Meta’s Llama series have been pre-trained on huge amounts of data from the internet, books, media articles, and other sources. But most of this training data is in English.
A Transformational Approach
Most companies in India are building their Indic LLMs atop these foundational models (hence they’re called ‘wrappers’) by fine-tuning the general-purpose LLMs on smaller, task-specific datasets (for example, text in regional languages such as Hindi, Marathi, Gujarati, Tamil, Telugu, and Malayalam, and their dialects), which allows the models to learn the nuances of each language and improves their performance.
Sutra, instead, uses two different transformer architectures. Introduced by Google researchers in 2017, transformers predict the next word in a sequence of text after being trained on large, complex data sets. Because they process the words in a sequence together while modeling their relationships with each other, transformers are very effective for tasks like translating languages.
The multilingual LLM Sutra, according to Mistry, combines an LLM architecture with a Neural Machine Translation (NMT) one. The reason: LLMs may struggle to translate specific language pairs because they lack specialized training data, whereas NMT systems are typically better equipped to translate idiomatic expressions and colloquial language.
Second, while “GPT-4 is great in Korean or Hindi, too, its size and cost make it more expensive for a country like India”, argued Mistry. The Sutra architecture “decouples concept learning (we learn concepts by associating new information with existing knowledge, such as learning that both apples and oranges are fruits) from language learning. So, when you use Sutra, the number of the tokens used are similar to using English tokens. This saves almost five to eight times in costs,” he explained.
Third, “our specialized NMT models are significantly smaller in parameter size, requiring much less data for training”, Mistry said. When you add more data, say in Korean or an Indian language, you also increase the number of tokens (loosely, the words and pieces of words that an LLM processes; for example, banana can be a single token, while homework can be split into two, home and work). This makes the model bigger and slower. It increases costs, too, since content that carries similar information in English needs three to four times more tokens when expressed in a language such as Hindi.
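The cost arithmetic here can be made concrete with a minimal sketch. It assumes cost scales linearly with token count, which the article implies but does not state. The function name, the 1,000-token prompt, and the price per 1,000 tokens are hypothetical values chosen only for illustration; the three-to-four-times multiplier is the figure quoted above.

```python
def processing_cost(english_tokens: int, token_multiplier: float,
                    price_per_1k_tokens: float) -> float:
    """Cost of text that needs `token_multiplier` times the English token count.

    Assumes cost grows linearly with the number of tokens.
    """
    tokens = english_tokens * token_multiplier
    return tokens / 1000 * price_per_1k_tokens


PRICE_PER_1K = 0.01    # hypothetical price, for illustration only
ENGLISH_TOKENS = 1000  # hypothetical prompt size in English tokens

english_cost = processing_cost(ENGLISH_TOKENS, 1.0, PRICE_PER_1K)
hindi_cost_low = processing_cost(ENGLISH_TOKENS, 3.0, PRICE_PER_1K)   # "three times"
hindi_cost_high = processing_cost(ENGLISH_TOKENS, 4.0, PRICE_PER_1K)  # "four times"

print(f"English baseline: {english_cost:.4f}")
print(f"Hindi, 3x to 4x tokens: {hindi_cost_low:.4f} to {hindi_cost_high:.4f}")
```

Under this linear assumption, the cost ratio equals the token ratio, so the same content costs three to four times more in Hindi whatever the actual price per token.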
“Besides, in this approach, the quality of, say Hindi, can never surpass that of English in the original,” Mistry added. For instance, about 80% of a general-purpose foundational model’s pre-training data would typically come from sources such as the internet, books, and media articles, which are mostly in English.
Innovation, Not Fine-tuning
However, if you’re fine-tuning this model with data in Hindi from India, for instance, “most of the data would be about cricket, data found on Twitter, or from people discussing news articles, etc., in Hindi. Hence, a Hindi language model built atop a foundational model that has pre-trained mostly on English will not be able to do full justice to the output in Hindi”.
“As an example, if you want to translate Gujarati to Tamil, most models first translate from Gujarati to English and then from English to Tamil, because that’s the data they have trained on. Our model does not do that, so we also require fewer tokens, which also lowers the cost of running the model,” he explained. Mistry added that Two Platforms’ model is also aligned to human values, a process technically known as ‘AI alignment’.
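The difference between translating through English and translating directly can be illustrated with a toy sketch. It does not reproduce how Sutra or any other model works internally; it only counts translation hops. The `translate` stand-in, the helper names, and the language codes are hypothetical, and the "translation" is a placeholder string.

```python
from typing import Callable, List, Tuple

Translator = Callable[[str, str, str], str]


def make_recording_translator(log: List[Tuple[str, str]]) -> Translator:
    """Return a stand-in translator that records each (source, target) hop."""
    def translate(text: str, src: str, tgt: str) -> str:
        log.append((src, tgt))
        return f"[{src}->{tgt}] {text}"  # placeholder, not a real translation
    return translate


def pivot_translate(text: str, src: str, tgt: str,
                    translate: Translator, pivot: str = "en") -> str:
    """Translate via an intermediate language: two hops."""
    intermediate = translate(text, src, pivot)
    return translate(intermediate, pivot, tgt)


def direct_translate(text: str, src: str, tgt: str,
                     translate: Translator) -> str:
    """Translate in a single hop."""
    return translate(text, src, tgt)


pivot_log: List[Tuple[str, str]] = []
direct_log: List[Tuple[str, str]] = []

# "gu" = Gujarati, "ta" = Tamil, "en" = English
pivot_translate("sample text", "gu", "ta", make_recording_translator(pivot_log))
direct_translate("sample text", "gu", "ta", make_recording_translator(direct_log))

print("pivot hops:", pivot_log)
print("direct hops:", direct_log)
```

Each hop is a separate pass that consumes and produces tokens, which is consistent with Mistry's point that skipping the English step requires fewer tokens.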
Sutra is currently available in three versions: Light (56 billion parameters), Online (an internet-connected multilingual model with 56 billion parameters), and Pro (150 billion parameters). It supports more than 50 languages, “of which 31 are fully tested”, according to Mistry. He emphasized that Sutra’s architecture and its use of “synthetically translated data” not only lower the computing costs of running these models, but also make the model more efficient.
“Sutra maintains an impressive performance in English of 77% on the MMLU (massive multitask language understanding) benchmark. It also demonstrates superior and consistent performance in the range of 65-75% across languages. In contrast, many leading language models score closer to 25% on non-English MMLU tasks,” Mistry said.
Two Platforms uses its “in-house GPU (graphics processing unit) cluster and rents top-tier cloud GPUs when needed”. “As we expand, the rising costs of training will require us to create specialized models for different areas like images and video,” Mistry added. His company is also in the process of raising a Series A round “to accelerate the development of Sutra into a model-as-a-service (MaaS)” platform. In February 2022, Jio Platforms had invested $15 million in Two Platforms for a 25% equity stake, while a Naver Corp unit, Snow Corp., had invested $5 million.
Other than Sutra, India has Sarvam AI, a generative AI (GenAI) startup that has launched the OpenHathi series; Tech Mahindra’s Indus Project; the ‘Hanooman’ model that was jointly released this month by SML India and 3AI Holding, an Abu Dhabi-based investment firm; CoRover’s BharatGPT LLM-based chatbot; and Ola Cabs and Ola Electric co-founder Bhavish Aggarwal’s Krutrim AI. Meanwhile, the ‘Nilekani Center at AI4Bharat’ at IIT Madras, too, released ‘Airavata’, an open-source LLM for Indian languages.
In a wider context, the LLM market is projected to grow from $6.4 billion in 2024 to $36.1 billion by 2030, according to a research report released by MarketsandMarkets in March. Moreover, India-specific LLMs are certainly the need of the hour but “we need faster, more affordable, multilingual, and energy-efficient LLMs that can bridge the existing market gaps”, concluded Mistry, who hopes Sutra will be one of those models that “fills this gap”.
