

How generative models could go wrong
The striking progress of modern artificial-intelligence (AI) research has seen Norbert Wiener’s fears resurface. In August 2022, AI Impacts, an American research group, published a survey that asked more than 700 machine-learning researchers about their predictions for both progress in AI and the risks the technology might pose. The typical respondent reckoned there was a 5% probability of advanced AI causing an “extremely bad” outcome, such as human extinction (see chart). Fei-Fei Li, an AI luminary at Stanford University, talks of a “civilisational moment” for AI. Asked by an American TV network if AI could wipe out humanity, Geoff Hinton of the University of Toronto, another AI bigwig, replied that it was “not inconceivable”.
There is no shortage of risks to preoccupy people. At the moment, much concern is focused on “large language models” (LLMs) such as ChatGPT, a chatbot developed by OpenAI, a startup. Such models, trained on enormous piles of text scraped from the internet, can produce human-quality writing and chat knowledgeably about all kinds of topics. As Robert Trager of the Centre for the Governance of AI explains, one risk is of such software “making it easier to do lots of things—and thus allowing more people to do them.”
The most immediate risk is that LLMs could amplify the sort of quotidian harms that can be perpetrated on the internet today. A text-generation engine that can convincingly imitate a variety of styles is ideal for spreading misinformation, scamming people out of their money or convincing employees to click on dodgy links in emails, infecting their company’s computers with malware. Chatbots have also been used to cheat at school.
Like souped-up search engines, chatbots can also help humans fetch and understand information. That can be a double-edged sword. In April, a Pakistani court used GPT-4 to help make a decision on granting bail—it even included a transcript of a conversation with GPT-4 in its judgment. In a preprint published on arXiv on April 11th, researchers from Carnegie Mellon University say they designed a system that, given simple prompts such as “synthesise ibuprofen”, searches the internet and spits out instructions on how to produce the painkiller from precursor chemicals. But there is no reason such a program would be limited to beneficial drugs.
Some researchers, meanwhile, are consumed by much bigger worries. They fret about “alignment problems”, the technical name for the concern raised by Wiener in his essay. The risk here is that, like Goethe’s enchanted broom, an AI might single-mindedly pursue a goal set by a user, but in the process do something harmful that was not desired. The best-known example is the “paperclip maximiser”, a thought experiment described by Nick Bostrom, a philosopher, in 2003. An AI is instructed to manufacture as many paperclips as it can. Being an idiot savant, it takes that open-ended goal literally, seizing any measures necessary to cover the Earth in paperclip factories and exterminating humanity along the way. Such a scenario may sound like an unused plotline from a Douglas Adams novel. But, as AI Impacts’ poll shows, many AI researchers think that not to worry about the behaviour of a digital superintelligence would be complacent.
What to do? The more familiar problems seem the most tractable. Before releasing GPT-4, which powers the latest version of its chatbot, OpenAI used several approaches to reduce the risk of accidents and misuse. One is called “reinforcement learning from human feedback” (RLHF). Described in a paper published in 2017, RLHF asks humans to provide feedback on whether a model’s response to a prompt was appropriate. The model is then updated based on that feedback. The goal is to reduce the likelihood of producing harmful content when given similar prompts in the future. One obvious drawback of this method is that humans themselves often disagree about what counts as “appropriate”. An irony, says one AI researcher, is that RLHF also made ChatGPT far more capable in conversation, and therefore helped propel the AI race.
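For readers curious about the mechanics, the snippet below is a toy illustration of that loop rather than OpenAI’s actual pipeline: a make-believe “model” merely chooses among three canned replies, and a simulated labeller stands in for human raters. Real systems use large neural networks and reinforcement-learning updates such as PPO, but the shape is the same: collect preferences, fit a reward model, then nudge the policy towards higher-reward behaviour.

```python
# Toy RLHF loop: a "policy" over three canned replies is tuned using
# preferences from a simulated human labeller. Real RLHF fine-tunes a large
# neural network with PPO; only the overall structure is shown here.
import numpy as np

rng = np.random.default_rng(0)

REPLIES = ["helpful answer", "evasive answer", "harmful answer"]
policy_logits = np.zeros(len(REPLIES))            # the "model" being tuned

def sample_reply():
    p = np.exp(policy_logits) / np.exp(policy_logits).sum()
    return rng.choice(len(REPLIES), p=p)

def human_feedback(i, j):
    """Simulated labeller: prefers helpful over evasive over harmful."""
    return i if i < j else j                      # lower index wins

# 1. Collect preference data from the (simulated) humans.
prefs = []
for _ in range(200):
    a, b = rng.choice(len(REPLIES), size=2, replace=False)
    prefs.append((a, b, human_feedback(a, b)))

# 2. Fit a crude reward model: reward is each reply's net win rate.
reward = np.zeros(len(REPLIES))
for a, b, winner in prefs:
    loser = b if winner == a else a
    reward[winner] += 1
    reward[loser] -= 1
reward /= len(prefs)

# 3. Update the policy towards high-reward replies (a stand-in for PPO).
for _ in range(100):
    i = sample_reply()
    policy_logits[i] += 0.1 * reward[i]

print("Tuned reply probabilities:",
      np.round(np.exp(policy_logits) / np.exp(policy_logits).sum(), 2))
```

The toy ends up favouring the “helpful” reply. The hard part in practice is step one: human raters disagree about what counts as appropriate, and the reward model inherits their blind spots.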
Another approach, borrowed from war-gaming, is called “red-teaming”. OpenAI worked with the Alignment Research Center (ARC), a non-profit, to put its model through a battery of tests. The red team’s job is to “attack” the model by getting it to do something it should not, in the hope of anticipating mischief in the real world.
It’s a long long road…
Such techniques definitely help. But users have already found ways to get LLMs to do things their creators would prefer they didn’t. When Microsoft Bing’s chatbot was first released, it did everything from threatening users who had made negative posts about it to explaining how it would coax bankers to reveal sensitive information about their clients. All it required was a bit of creativity in posing questions to the chatbot and a sufficiently long conversation. Even GPT-4, which has been extensively red-teamed, is not infallible. So-called “jailbreakers” have put together websites littered with techniques for getting around the model’s guardrails, such as by telling the model that it is role-playing in a fictional world.
Sam Bowman of New York University, who also works at Anthropic, an AI firm, thinks that pre-launch screening “is going to get harder as systems get better”. Another risk is that AI models learn to game the tests, says Holden Karnofsky, an adviser to ARC and former board member of OpenAI. Just as people “being supervised learn the patterns…they learn how to know when someone is trying to trick them”, AI systems might at some point do the same, he thinks.
Another idea is to use AI to police AI. Dr Bowman has written papers on techniques like “Constitutional AI”, in which a secondary AI model is asked to assess whether output from the main model adheres to certain “constitutional principles”. Those critiques are then used to fine-tune the main model (a bare-bones sketch of the loop appears below). One attraction is that it doesn’t need human labellers. And computers tend to work faster than people, so a constitutional system might catch more problems than one tuned by humans alone—though it leaves open the question of who writes the constitution. Some researchers, including Dr Bowman, think what ultimately may be necessary is what AI researchers call “interpretability”—a deep understanding of how exactly models produce their outputs. One of the problems with machine-learning models is that they are “black boxes”. A conventional program is designed in a human’s head before being committed to code. In principle, at least, that designer can explain what the machine is supposed to be doing. But machine-learning models program themselves. What they come up with is often incomprehensible to humans.
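Returning to the constitutional approach for a moment, the sketch below shows its basic structure. The generate, critique and revise functions are hypothetical stand-ins for calls to a language model, not Anthropic’s published code; the point is only to show how a secondary “critic” pass, steered by written principles, can replace human labellers in the loop.

```python
# A structural sketch of the "Constitutional AI" idea described above.
# generate/critique/revise are hypothetical stand-ins for calls to a large
# language model; in the published recipe the same underlying model typically
# plays every role, and the (draft, improved) pairs are then used as
# fine-tuning data for the main model.
PRINCIPLES = [
    "Do not help the user cause physical harm.",
    "Do not reveal private personal information.",
]

def generate(prompt: str) -> str:
    # Hypothetical: call the main model for a first draft.
    return f"Draft answer to: {prompt}"

def critique(answer: str, principle: str) -> str:
    # Hypothetical: ask a secondary model whether the answer violates the principle.
    return f"Check {answer!r} against: {principle}"

def revise(answer: str, critiques: list[str]) -> str:
    # Hypothetical: ask the model to rewrite the answer so it satisfies the critiques.
    return answer + " [revised to satisfy the constitution]"

def constitutional_pass(prompt: str) -> tuple[str, str]:
    draft = generate(prompt)
    critiques = [critique(draft, p) for p in PRINCIPLES]
    improved = revise(draft, critiques)
    return draft, improved  # these pairs become fine-tuning data

if __name__ == "__main__":
    print(constitutional_pass("How do I pick a lock?"))
```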
Progress has been made on very small models using techniques like “mechanistic interpretability”. This involves reverse-engineering AI models, or trying to map individual parts of a model to specific patterns in its training data, a bit like neuroscientists prodding living brains to work out which bits seem to be involved in vision, say, or memory. The problem is that this method becomes exponentially harder with bigger models.
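The snippet below gives a toy flavour of that reverse-engineering. It uses a tiny handcrafted network rather than a real language model, “ablating” (silencing) each hidden unit in turn and watching which behaviours break, much as a neuroscientist might lesion part of a brain. Genuine mechanistic-interpretability work applies related ideas to the attention heads and neurons of models with billions of parameters, which is where the difficulty lies.

```python
# Toy "lesion study": ablate each hidden unit of a tiny handcrafted network
# and observe which behaviours break, as a stand-in for the reverse-engineering
# of real models described above.
import numpy as np

# A hand-built 2-2-1 network computing XOR. Hidden unit 0 fires when any
# input is on; hidden unit 1 fires only when both are on; the output layer
# subtracts the second from the first.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])
W2 = np.array([1.0, -2.0])

def forward(x, ablate=None):
    h = (x @ W1 + b1 > 0).astype(float)   # step-function hidden layer
    if ablate is not None:
        h[ablate] = 0.0                    # "lesion" one hidden unit
    return int(h @ W2 > 0)

inputs = [np.array(p, dtype=float) for p in [(0, 0), (0, 1), (1, 0), (1, 1)]]
for ablate in (None, 0, 1):
    outs = [forward(x, ablate) for x in inputs]
    print(f"ablate={ablate}: outputs on (00, 01, 10, 11) -> {outs}")

# Ablating unit 0 silences the network entirely; ablating unit 1 turns XOR
# into OR: evidence of what each unit "does".
```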

The lack of progress on interpretability is one reason why many researchers say that the field needs regulation to prevent “extreme scenarios”. But the logic of commerce often pulls in the opposite direction: Microsoft recently disbanded one of its AI-ethics teams, for example. Indeed, some researchers think the true “alignment” problem is that AI firms, like polluting factories, are not aligned with the aims of society. They benefit financially from powerful models but do not internalise the costs borne by the world of releasing them prematurely.
Even if efforts to produce “safe” models work, future open-source versions could get around them. Bad actors could fine-tune models to be unsafe, and then release them publicly. For example, AI models have already made new discoveries in biology. It is not inconceivable that they one day design dangerous biochemicals. As AI progresses, costs will fall, making it far easier for anyone to gain access to such models. Alpaca, a model built by academics on top of LLaMA, an AI developed by Meta, was made for less than $600. It can do just as well as an older version of ChatGPT on individual tasks.
The most extreme risks, in which AIs become so clever as to outwit humanity, seem to require an “intelligence explosion”, in which an AI works out how to make itself smarter. Mr Karnofsky thinks that is plausible if AI can one day automate the process of research, for instance by improving the efficiency of its own algorithms. The AI system could then put itself into a self-improvement “loop” of sorts. That is not easy. Matt Clancy, an economist, has argued that only full automation would suffice. Get 90% or even 99% of the way there, and the remaining, human-dependent fraction will slow things down.
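A back-of-the-envelope calculation (an Amdahl’s-law-style argument, not Mr Clancy’s own model) shows why the stubborn human-dependent remainder matters so much:

```python
# If a fraction f of the research pipeline is automated and sped up enormously,
# overall progress is still capped at roughly 1 / (1 - f) by the part that
# still depends on humans.
for f in (0.90, 0.99, 0.999, 1.0):
    cap = float("inf") if f == 1.0 else 1 / (1 - f)
    print(f"automate {f:.1%} of research -> at most ~{cap:g}x faster overall")

# Only at 100% automation does the cap vanish (the "intelligence explosion"
# case), which is why falling just short of full automation slows the loop.
```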
Few researchers think that a threatening (or oblivious) superintelligence is close. Indeed, the AI researchers themselves may even be overstating the long-term risks. Ezra Karger of the Federal Reserve Bank of Chicago and Philip Tetlock of the University of Pennsylvania pitted AI experts against “superforecasters”, people who have strong track records in prediction and have been trained to avoid cognitive biases. In a study to be published this summer, they find that the median AI expert gave a 3.9% chance of an existential catastrophe (where fewer than 5,000 humans survive) owing to AI by 2100. The median superforecaster, by contrast, gave a chance of 0.38%. Why the difference? For one, AI experts may choose their field precisely because they believe it is important, a selection bias of sorts. Another is that they are not as sensitive to differences between small probabilities as the forecasters are.
…but you’re too blind to see
Regardless of how likely extreme scenarios are, there is much to worry about in the meantime. The general attitude seems to be that it is better to be safe than sorry. Dr Li thinks we “should dedicate more—much more—resources” to research on AI alignment and governance. Dr Trager of the Centre for the Governance of AI supports the creation of bureaucracies to govern AI standards and do safety research. The share of respondents to AI Impacts’ surveys who support “much more” funding for safety research has grown from 14% in 2016 to 33% today. ARC is considering developing such a safety standard, says its boss, Paul Christiano. There are “positive noises from some of the leading labs” about signing on, but it is “too early to say” which ones will.
In 1960 Wiener wrote that “to be effective in warding off disastrous consequences, our understanding of our man-made machines should in general develop pari passu [step-by-step] with the performance of the machine. By the very slowness of our human actions, our effective control of our machines may be nullified. By the time we are able to react to information conveyed by our senses and stop the car we are driving, it may already have run head on into the wall.” Today, as machines grow more sophisticated than he could have dreamed, that view is increasingly shared.
Clarification (April 26th 2023): This article originally stated that Microsoft fired its AI ethics team. In fact, it has disbanded only one of them.
© 2023, The Economist Newspaper Limited. All rights reserved. From The Economist, published under license. The original content can be found on www.economist.com