Andrew Barto and Richard Sutton, who received computing's highest honor this week for their foundational work on reinforcement learning, didn't waste any time using their new platform to sound alarms about unsafe AI development practices in the industry.
The pair were announced Wednesday as recipients of the 2024 ACM A.M. Turing Award, often dubbed the "Nobel Prize of Computing," which carries a $1 million prize funded by Google.
Rather than simply celebrating their achievement, they immediately criticized what they see as dangerously rushed deployment of AI technologies.
"Releasing software to millions of people without safeguards is not good engineering practice," Barto told The Financial Times. "Engineering practice has evolved to try to mitigate the negative consequences of technology, and I don't see that being practiced by the companies that are developing."
Their assessment likened current AI development practices to "building a bridge and testing it by having people use it" without proper safety checks in place, arguing that AI companies prioritize business incentives over responsible innovation.
The duo's journey began in the late 1970s when Sutton was Barto's student at the University of Massachusetts. Throughout the 1980s, they developed reinforcement learning—a technique where AI systems learn through trial and error by receiving rewards or penalties—when few believed in the approach.
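The core idea fits in a few lines of code. Below is a minimal, illustrative sketch of tabular Q-learning, one of the canonical algorithms in the field they founded: an agent repeatedly tries actions in a toy environment, collects rewards, and nudges its value estimates toward whatever worked. The five-state chain environment, reward scheme, and hyperparameters here are invented for illustration and are not drawn from the award citation or the authors' textbook.

```python
import random

# Toy "chain" environment: states 0..4, start at state 0.
# Reaching the rightmost state pays reward 1; everything else pays 0.
# This environment is a made-up example for illustration only.
N_STATES = 5
ACTIONS = [-1, +1]  # step left, step right

def step(state, action):
    next_state = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

# Tabular Q-learning: learn action values from trial and error.
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # illustrative hyperparameters
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward, done = step(state, action)
        # Nudge the estimate toward reward plus discounted future value.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# After training, the greedy policy should step right from every non-terminal state.
print([max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)])
```

Trial and error plus a reward signal is the entire learning loop; no labeled examples are needed, which is what set the approach apart when few believed in it.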
Their work culminated in their seminal 1998 textbook "Reinforcement Learning: An Introduction," which has been cited nearly 80,000 times and became the bible for a generation of AI researchers.
"Barto and Sutton’s work demonstrates the immense potential of applying a multidisciplinary approach to longstanding challenges in our field," ACM President Yannis Ioannidis said in an announcement. “Reinforcement learning continues to grow and offers great potential for further advances in computing and many other disciplines.”
The $1 million Turing Award comes as reinforcement learning continues to drive innovation across robotics, chip design, and large language models, with reinforcement learning from human feedback (RLHF) becoming a critical training method for systems like ChatGPT.
Industry-wide safety concerns
Still, the pair's warnings echo growing concerns from other big names in the field of computer science.
Yoshua Bengio, himself a Turing Award recipient, publicly supported their stance on Bluesky.
"Congratulations to Rich Sutton and Andrew Barto on receiving the Turing Award in recognition of their significant contributions to ML," he said. "I also stand with them: Releasing models to the public without the right technical and societal safeguards is irresponsible."
Their position aligns with criticisms from Geoffrey Hinton, another Turing Award winner known as the godfather of AI, as well as a 2023 statement from top AI researchers and executives, including OpenAI CEO Sam Altman, that called for mitigating extinction risks from AI as a global priority.
Former OpenAI researchers have raised similar concerns.
Jan Leike, who recently resigned as head of OpenAI's alignment initiatives and joined rival AI company Anthropic, pointed to an inadequate safety focus, writing that "building smarter-than-human machines is an inherently dangerous endeavor."
"Over the past years, safety culture and processes have taken a backseat to shiny products," Leike said.
Leopold Aschenbrenner, another former OpenAI safety researcher, called security practices at the company "egregiously insufficient." At the same time, Paul Christiano, who also previously led OpenAI's language model alignment team, suggested there might be a "10-20% chance of AI takeover, [with] many [or] most humans dead."
Despite their warnings, Barto and Sutton maintain a cautiously optimistic outlook on AI's potential.
In an interview with Axios, both suggested that current fears about AI might be overblown, though they acknowledged that significant social upheaval is possible.
"I think there's a lot of opportunity for these systems to improve many aspects of our life and society, assuming sufficient caution is taken," Barto told Axios.
Sutton sees artificial general intelligence as a watershed moment, framing it as an opportunity to introduce new "minds" into the world without them developing through biological evolution—essentially opening the gates for humanity to interact with sentient machines in the future.
Edited by Sebastian Sinclair