Google and Oxford Are Working on an A.I. Kill Switch That A.I. Won’t Learn to Turn Off

Google isn't going to let robots take over the world. That's Google's job.

Google wants to make sure that if anyone is going to take over the world and subjugate humanity, it’s going to be Google the company, not Google, the inevitable self-chosen name of the sentient AI that will one day destroy us all. That’s why they’re working on an off switch for artificial intelligence—or just for the safety of humans working alongside less ambitious but still potentially dangerous AI—which is a trickier proposition than it may seem.

After all, the point of AI is, eventually, to help make advances in computing (and everything else) faster and better than humans can. We’re not quite at the point where AI is equal to or better than humans just yet, but once we are, AI will be able to improve on itself even faster than we can, which makes it difficult to come up with a human-designed don’t-murder-us button that won’t immediately become obsolete if AI so chooses.

In the nearer term, though, such a thing will also be necessary as robots working in, say, factories make use of AI. You’d want to make sure the AI doesn’t lead the robot to do something that isn’t exactly helpful to the manufacturing process, so you’d need to make them what Google and Oxford University’s aptly named Future of Humanity Institute call “safely interruptible agents”—or, as they put it in layman’s terms in their paper: give AI a “big red button.” It would theoretically allow a person to remove a robot—or themselves—from whatever dangerous situation arose and then safely reactivate the AI at a later time.

They explain, “This paper explores a way to make sure a learning agent will not learn to prevent (or seek!) being interrupted by the environment or a human operator. We provide a formal definition of safe interruptibility and exploit the off-policy learning property to prove that either some agents are already safely interruptible, like Q-learning, or can easily be made so, like Sarsa. We show that even ideal, uncomputable reinforcement learning agents for (deterministic) general computable environments can be made safely interruptible.”
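(For the technically inclined: the “off-policy” property the paper exploits is the main difference between the two algorithms it names. Below is a minimal, purely illustrative Python sketch of the standard update rules, not code from the paper. Q-learning’s update targets the best available next action rather than the action the agent actually ends up taking, which is roughly why being interrupted doesn’t bias what it learns.)

    ALPHA = 0.1   # learning rate
    GAMMA = 0.99  # discount factor

    def q_learning_update(Q, s, a, r, s_next, actions):
        # Off-policy: the target uses the best action in the next state,
        # no matter what the agent (or an interrupting operator) actually does next.
        best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
        Q[(s, a)] = Q.get((s, a), 0.0) + ALPHA * (r + GAMMA * best_next - Q.get((s, a), 0.0))

    def sarsa_update(Q, s, a, r, s_next, a_next):
        # On-policy: the target uses the action actually taken next, so
        # interruptions can leak into the learned values unless the algorithm is tweaked.
        Q[(s, a)] = Q.get((s, a), 0.0) + ALPHA * (
            r + GAMMA * Q.get((s_next, a_next), 0.0) - Q.get((s, a), 0.0)
        )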

There may just be hope for us yet.

(via Gizmodo, image via Futurama)


