

Meta’s Llama 2: Why open-source LLMs are the joker in the generative AI pack
Leslie D’Monte
The generative AI race just got hotter with Meta releasing the second version of its free, open-source large language model (LLM), Llama 2, for research and commercial use. The release provides an alternative to pricey proprietary offerings such as OpenAI’s ChatGPT Plus and Google’s Bard, while giving a boost to open-source LLMs.
Developers had already flocked to LLaMA, Meta’s open-source LLM that was released in February (https://ai.meta.com/blog/large-language-model-llama-meta-ai/). Researchers made more than 100,000 requests for Llama 1, according to Meta, which says LLaMA requires “far less computing power and resources to test new approaches, validate others’ work, and explore new use cases”. Meta made LLaMA available in several sizes (7B, 13B, 33B, and 65B parameters, where B stands for billion) and also shared a LLaMA model card detailing how it built the model, in contrast to the lack of transparency at OpenAI.
GPT-3, from OpenAI’s Generative Pre-trained Transformer series, on the other hand, has 175 billion parameters, while GPT-4 was rumored to have been launched with 100 trillion parameters, a claim that was dismissed by OpenAI CEO Sam Altman. Foundation models train on a large set of unlabelled data, which makes them ideal for fine-tuning on a variety of tasks. For instance, ChatGPT, based on GPT-3.5, was trained on 570GB of text data from the internet containing hundreds of billions of words, harvested from books, articles, and websites, including social media.
However, according to Meta, smaller models trained on more tokens (pieces of words) are easier to re-train and fine-tune for specific potential product use cases. Meta says it trained LLaMA 65B and LLaMA 33B on 1.4 trillion tokens, and its smallest model, LLaMA 7B, on one trillion tokens. Like other LLMs, LLaMA takes a sequence of words as input and predicts the next word, repeating the step to generate text recursively. Meta says it chose text from the 20 languages with the most speakers, focusing on those with Latin and Cyrillic alphabets, to train LLaMA.
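To put these parameter counts in perspective, here is a rough back-of-the-envelope sketch of how much storage the weights alone would need. It assumes 2 bytes per parameter (16-bit weights), which is an assumption of this example and not a figure from Meta or OpenAI; activations, optimizer state and runtime overhead are ignored, and the function and variable names are hypothetical.

```python
# Rough weight-storage estimate: parameters x bytes per parameter.
# Assumption (not from the article): 2 bytes per parameter, i.e. 16-bit weights.
# Activations, optimizer state and runtime overhead are deliberately ignored.
BYTES_PER_PARAM_16BIT = 2


def weight_gigabytes(num_params: float, bytes_per_param: int = BYTES_PER_PARAM_16BIT) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts as quoted in the article.
MODEL_SIZES = {
    "LLaMA 7B": 7e9,
    "LLaMA 13B": 13e9,
    "LLaMA 33B": 33e9,
    "LLaMA 65B": 65e9,
    "GPT-3 175B": 175e9,
}

if __name__ == "__main__":
    for name, params in MODEL_SIZES.items():
        print(f"{name}: ~{weight_gigabytes(params):.0f} GB of weights")
```

Under that assumption, a 7B-parameter model needs roughly 14 GB for its weights, against roughly 350 GB for a 175B-parameter model, which is one way to read the claim that smaller models are cheaper to experiment with.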
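The predict-one-word-then-repeat loop can be illustrated with a toy sketch. This is not how LLaMA works internally: the example swaps the neural network for a simple table of word-pair counts, looks only at the last word and not the whole sequence, and always picks the most frequent follower. The corpus and all function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text: str) -> dict:
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def generate(model: dict, prompt: str, max_new_words: int = 5) -> str:
    """Greedy autoregressive loop: predict one word, append it, repeat."""
    words = prompt.split()
    for _ in range(max_new_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the toy model has never seen this word, so it stops
        next_word = candidates.most_common(1)[0][0]
        words.append(next_word)  # the output becomes part of the next input
    return " ".join(words)


# Hypothetical miniature training corpus.
corpus = "the model predicts the next word and the next word extends the text"
model = train_bigram(corpus)

if __name__ == "__main__":
    print(generate(model, "the model", max_new_words=4))
```

With this corpus, the prompt "the model" is extended to "the model predicts the next word". A real LLM follows the same outer loop but replaces the lookup table with a network that scores every token in its vocabulary given the full preceding text.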
The newly-released Llama 2, according to Meta, is a collection of pretrained and fine-tuned LLMs, ranging from 7 billion to 70 billion parameters. Meta has also released Llama 2-Chat, a fine-tuned version of Llama 2 that is optimized for dialogue with the same parameter ranges. Meta claims that these models “have demonstrated their competitiveness with existing open-source chat models, as well as competence that is equivalent to some proprietary models on evaluation sets we examined” but acknowledges that they still lag other models like OpenAI’s GPT-4.
One may note, though, that scraping of data has become a thorny issue and the reason for many class-action suits. In a 157-page class action lawsuit filed on June 28 in the US District Court, Northern District of California, the plaintiffs alleged that the defendants have used “unlawful and harmful conduct in developing, marketing, and operating their AI products including ChatGPT-3.5, ChatGPT-4.0, Dall-E, and Vall-E”, which use “stolen private information” from hundreds of millions of internet users, including children of all ages, without their informed consent or knowledge, and continue to do so to develop and train the products (https://www.livemint.com/news/india/why-is-musk-angry-and-why-is-openai-being-sued-11688448802931.html).
Meta says Llama 2 has been trained on a mix of data from publicly-available sources, which does not include data from Meta’s products or services. The company adds that it has made an effort to remove data from certain sites known to contain a high volume of personal information about private individuals. Llama 2 was trained on 2 trillion tokens of data “as this provides a good performance–cost trade-off, up-sampling the most factual sources in an effort to increase knowledge and dampen hallucinations”, according to Meta. It, however, adds that since the training corpus was mostly in English, the model may not be suitable for use in other languages.
Llama 2 will be distributed by Microsoft through its Azure cloud service and will run on the Windows operating system (https://www.livemint.com/ai/artificial-intelligence/meta-joins-hands-with-microsoft-for-its-latest-ai-model-llama-2-likely-to-beat-chatgpt-and-bard-11689698965564.html). It is also available through Amazon Web Services (AWS), Hugging Face and other providers, Yann LeCun, chief scientist at Meta, tweeted soon after the release.
According to Jim Fan, senior AI scientist at Nvidia, Llama 2 likely cost a little over $20 million to train. He believes that Meta has done “an incredible service to the community” by releasing the model with a commercially-friendly license. “AI researchers from big companies were wary of Llama-1 due to licensing issues, but now I think many of them will jump on the ship and contribute their firepower,” Fan tweeted after Llama 2’s release.
Fan also complimented Meta on its human evaluation: the team ran a study on 4,000 prompts to assess Llama-2’s helpfulness. “I trust these real human ratings more than academic benchmarks, because they typically capture the ‘in-the-wild vibe’ better,” said Fan. He added, though, that Llama-2 is not yet as good as GPT-3.5, mainly because of its weak coding abilities. But he noted that “Meta’s team goes above and beyond on AI safety issues. In fact, almost half of the paper is talking about safety guardrails, red-teaming, and evaluations. A round of applause for such responsible efforts!” According to Fan, Llama-2 will dramatically boost multimodal AI and robotics research.
In my earlier column titled ‘Five trends that may change the course of Generative AI models’ (https://www.livemint.com/mint-top-newsletter/techtalk12052023.html), I had spoken about the rise of smaller open-source large language models (LLMs). Big tech companies like Microsoft and Oracle were once strongly opposed to open-source technologies but embraced them after realizing that they couldn’t survive without doing so. Open-source language models are demonstrating this once again.
A couple of months back, a Google employee claimed in a leaked document accessed by SemiAnalysis that “Open-source models are faster, more customizable, more private, and pound-for-pound more capable. They are doing things with $100 and 13B params (parameters) that we struggle with at $10M (million) and 540B (billion). And they are doing so in weeks, not months.” The employee believes that people will not pay for a restricted model when free, unrestricted alternatives are comparable in quality. He opined that “giant models are slowing us down. In the long run, the best models are the ones which can be iterated upon quickly. We should make small variants more than an afterthought now that we know what is possible in the < 20B parameter regime”.
Google may or may not subscribe to this point of view, but the fact is that open-source LLMs have not only come of age but are providing developers with a lighter and much more flexible option.
As an example, Low-Rank Adaptation of Large Language Models (LoRA) reduces the number of trainable parameters, which lowers the storage requirement for LLMs adapted to specific tasks and enables efficient task-switching during deployment without adding inference latency. Its developers say that “LoRA also outperforms several other adaptation methods, including adapter, prefix-tuning, and fine-tuning”. In simple terms, developers can use LoRA to fine-tune LLaMA.
Pythia (from EleutherAI, which itself is likened to an open-source version of OpenAI) comprises 16 LLMs that have been trained on public data and range in size from 70M to 12B parameters.
Databricks Inc. released its LLM called Dolly in March, which it “trained for less than $30 to exhibit ChatGPT-like human interactivity”. A month later, it released Dolly 2.0, a 12B parameter language model based on the EleutherAI Pythia model family “and fine-tuned exclusively on a new, high-quality human-generated instruction following dataset, crowdsourced among Databricks employees”. The company has open-sourced Dolly 2.0 in its entirety, including the training code, dataset and model weights for commercial use, enabling any organization to create, own, and customize powerful LLMs without paying for API access or sharing data with third parties.
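A minimal sketch of the idea behind LoRA may help here. The pretrained weight matrix W is left frozen and only two small matrices, B and A, are trained, so the layer computes W x + (alpha / r) * B (A x). The 4096 x 4096 layer size and rank 8 below are illustrative assumptions, not figures from the article, and the helper names are hypothetical; real implementations use tensor libraries instead of Python lists.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable values when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable values with LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha: float, rank: int):
    """Compute W x + (alpha / rank) * B (A x); W stays frozen, A and B are trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    # Illustrative (assumed) layer size and rank.
    d, r = 4096, 8
    print("full fine-tuning:", full_finetune_params(d, d))
    print("LoRA, rank 8:    ", lora_params(d, d, r))

    # Tiny 2 x 2 example with rank 1.
    W = [[1.0, 0.0], [0.0, 1.0]]
    A = [[1.0, 1.0]]
    B = [[1.0], [2.0]]
    print(lora_forward(W, A, B, [2.0, 3.0], alpha=1.0, rank=1))
```

For the assumed 4096 x 4096 layer, full fine-tuning updates 16,777,216 values while rank-8 LoRA trains 65,536, a 256-fold reduction. Because the product of B and A has the same shape as W, it can be added into W once training is done, which is why the adapted model need not be slower at inference.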
Hugging Face’s BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) has 176 billion parameters and is able to generate text in 46 natural languages and 13 programming languages. Researchers can download, run and study BLOOM to investigate the performance and behavior of recently-developed LLMs.
Falcon, a family of LLMs developed by the Technology Innovation Institute in Abu Dhabi and released under the Apache 2.0 license, comprises two models: the Falcon-40B and the smaller Falcon-7B. According to Hugging Face, “The Falcon models still include some curated sources in their training (such as conversational data from Reddit), but significantly less so than has been common for state-of-the-art LLMs like GPT-3 or PaLM.”
The open-source LLM march has only begun.
Updated: 19 Jul 2023, 02:21 PM IST