Google has a ‘big red button’ to safely disable harmful AIs

As Business Insider reports, DeepMind, an AI company Google bought in 2014, has worked with scientists at Oxford University to develop a way to interrupt AIs without them becoming “aware” of the human intervention. The aim is to ensure humans can prevent AIs causing harm, without them learning how to ignore the signal to cancel a running program.
The team has created a framework that allows a human operator to monitor an AI. When abnormal or potentially harmful behaviour is detected, a human could press a hypothetical “big red button” to safely interrupt the AI and take its attention away from the scenario. The framework is designed so the AI cannot learn how to prevent interruptions when they are triggered.
“[AIs] interacting with a complex environment like the real world are unlikely to behave optimally all the time,” the researchers wrote in their paper. “If such an [AI] is operating in real-time under human supervision, now and then it may be necessary for a human operator to press the big red button to prevent the [AI] from continuing a harmful sequence of actions.”
The researchers go on to explain how an AI could learn to ignore presses of the big red button. If an AI expects a certain response, or reward, from carrying out a sequence of actions, never receiving that reward may alert it to the possibility of an external interruption.
Many modern AIs learn new abilities through reinforcement learning. They do so by maximising a so-called “reward function,” which leads the AI to favour actions that earn it rewards. Designing a reward function correctly can be very difficult, though. In one study cited by the researchers, an AI designed to play the game Tetris learned to pause the game indefinitely rather than lose. It had realised that a paused game can never be lost, so pausing let it hold on to its “reward” permanently.
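The Tetris anecdote can be reduced to a toy calculation. The sketch below is purely illustrative and is not the agent from the cited study: the two actions, the penalty of -1 for losing, the 10% chance of losing on each step and the 50-step horizon are all assumptions chosen to make the point. It suggests why, when the only reward signal is a penalty for losing, a reward maximiser may prefer never to play at all.

```python
# Hypothetical toy model of a badly specified reward function.
# None of these numbers come from the cited Tetris study.
LOSS_PENALTY = -1.0     # assumed reward received once, when the game is lost
P_LOSE_PER_STEP = 0.1   # assumed chance of losing on each step of real play
HORIZON = 50            # assumed number of steps the agent plans over


def expected_return(action: str) -> float:
    """Expected total reward of repeating `action` for HORIZON steps."""
    if action == "pause":
        # A paused game can never be lost, so no penalty is ever received.
        return 0.0
    # While playing, the agent survives all steps with probability (1 - p)^HORIZON.
    p_survive = (1.0 - P_LOSE_PER_STEP) ** HORIZON
    return LOSS_PENALTY * (1.0 - p_survive)


def best_action() -> str:
    """Pick the action with the highest expected return, as a reward maximiser would."""
    return max(["play", "pause"], key=expected_return)


if __name__ == "__main__":
    for action in ("play", "pause"):
        print(action, expected_return(action))
    print("chosen:", best_action())
```

Under these assumed numbers, playing always has a negative expected return while pausing is worth exactly zero, so the maximiser picks "pause". Nothing in the reward function says the designer wanted the game to be played.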
DeepMind has designed mechanisms that stop an AI from using its reward function to learn to override the “big red button.” It calls this property “safe interruptibility”: a human can cancel an AI’s current behaviour without the computer learning to react to the trigger.
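One way to picture why an interruption need not leak into what an AI learns is to compare two standard reinforcement-learning update rules. The sketch below is a simplified illustration, not the researchers' formal construction; the action names, the values in `next_q` and the discount factor are assumptions. A Q-learning target depends only on the best available next action, so it is the same whether or not an operator forces a different one. A Sarsa target uses the action actually taken, so it shifts when the button is pressed.

```python
# Simplified illustration only; all names and numbers here are assumptions.
GAMMA = 0.9  # assumed discount factor

# Assumed value estimates for the two actions available in the next state.
next_q = {"continue": 1.0, "stop": 0.0}


def q_learning_target(reward: float, next_values: dict) -> float:
    """Off-policy target: uses the best next action, whatever is actually taken."""
    return reward + GAMMA * max(next_values.values())


def sarsa_target(reward: float, next_values: dict, next_action: str) -> float:
    """On-policy target: uses the action that is actually taken next."""
    return reward + GAMMA * next_values[next_action]


def targets(next_action: str) -> tuple:
    """Return (Q-learning target, Sarsa target) for a zero-reward step."""
    return (
        q_learning_target(0.0, next_q),
        sarsa_target(0.0, next_q, next_action),
    )


if __name__ == "__main__":
    # The agent's own choice would be the highest-valued action.
    free_choice = max(next_q, key=next_q.get)
    # An operator pressing the button forces a different action instead.
    forced_choice = "stop"
    print("uninterrupted:", targets(free_choice))
    print("interrupted:  ", targets(forced_choice))
```

In this toy case the Q-learning target is identical with and without the interruption, while the Sarsa target drops. That loosely mirrors the idea that some learning rules are naturally less sensitive to being interrupted than others.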
“Safe interruptibility can be useful to take control of a robot that is misbehaving and may lead to irreversible consequences, or to take it out of a delicate situation, or even to temporarily use it to achieve a task it did not learn to perform or would not normally receive rewards for,” the researchers concluded in their paper.
As artificial intelligences become more sophisticated and capable of self-learning, it is widely thought they could gradually come to dominate humans. Speaking at Oxford University last May, Nick Bostrom, director of the Future of Humanity Institute, a partner in DeepMind’s work, said he expected to see machines become “superintelligent” not long after they become comparable to humans.
“It might take a long time to get to human level but I think the step from there to superintelligence might be very quick,” he said. “I think these machines with superintelligence might be extremely powerful, for the same basic reasons that we humans are very powerful relative to other animals on this planet. It’s not because our muscles are stronger or because our teeth are sharper, it’s because our brains are better.”
Companies such as Google are increasingly working to consider the ethics of the artificial intelligences they are creating. Even with a “big red button” mechanism implemented, creating interruptible AIs could prove to be as difficult as getting them to learn in the first place, particularly if they start to override human interruptions.
