Google Has a Plan to Stop Its New AI From Being Dirty and Rude
Silicon Valley CEOs usually focus on the positives when announcing their company’s next big thing. In 2007, Apple’s Steve Jobs lauded the first iPhone’s “revolutionary user interface” and “breakthrough software.” Google CEO Sundar Pichai took a different tack at his company’s annual conference Wednesday when he announced a beta test of Google’s “most advanced conversational AI yet.”
Pichai said the chatbot, known as LaMDA 2, can converse on any topic and had performed well in tests with Google employees. He announced a forthcoming app called AI Test Kitchen that will make the bot available for outsiders to try. But Pichai added a stark warning. “While we have improved safety, the model might still generate inaccurate, inappropriate or offensive responses,” he said.
Pichai’s vacillating pitch illustrates the mixture of excitement, puzzlement, and concern swirling around a string of recent breakthroughs in the capabilities of machine learning software that processes language.
The technology has already improved the power of auto-complete and web search. It has also created new categories of productivity apps that help workers by generating fluent text or programming code. And when Pichai first disclosed the LaMDA project last year he said it could eventually be put to work inside Google’s search engine, virtual assistant, and workplace apps. Yet despite all that dazzling promise, it’s unclear how to reliably control these new AI wordsmiths.
Google’s LaMDA, or Language Model for Dialogue Applications, is an example of what machine learning researchers call a large language model. The term is used to describe software that builds up a statistical feeling for the patterns of language by processing huge volumes of text, usually sourced online. LaMDA, for example, was initially trained with more than a trillion words from online forums, Q&A sites, Wikipedia, and other webpages. This vast trove of data helps the algorithm perform tasks like generating text in different styles, interpreting new text, or functioning as a chatbot. And these systems, if they work, won’t be anything like the frustrating chatbots you use today. Right now Google Assistant and Amazon’s Alexa can only perform certain pre-programmed tasks and deflect when presented with something they don’t understand. What Google is now proposing is a computer you can actually talk to.
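To make the idea of a “statistical feeling for the patterns of language” more concrete, here is a minimal Python sketch of a far simpler technique, a bigram model, which does nothing more than count which word tends to follow which. It is not how LaMDA works; large language models use neural networks trained on vastly more text. The function names and the one-sentence corpus are invented purely for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, length=8, seed=0):
    """Build a phrase by repeatedly sampling a next word, weighted by its count."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = follows.get(out[-1])
        if not options:  # the last word was never followed by anything
            break
        candidates, counts = zip(*options.items())
        out.append(rng.choices(candidates, weights=counts, k=1)[0])
    return " ".join(out)


# Hypothetical toy corpus; a real model would see billions of words.
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)

print(model["the"].most_common(1))  # [('cat', 2)]
print(generate(model, "the"))
```

Even this toy picks up that “cat” is the likeliest word after “the” in its tiny corpus. It also shows why the training text matters: the model can only echo patterns it has seen, whatever they happen to be.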
Chat logs released by Google show LaMDA can—at least at times—be informative, thought-provoking, or even funny. Testing the chatbot prompted Google vice president and AI researcher Blaise Agüera y Arcas to write a personal essay last December arguing the technology could provide new insights into the nature of language and intelligence. “It can be very hard to shake the idea that there’s a ‘who,’ not an ‘it’, on the other side of the screen,” he wrote.
Pichai made clear when he announced the first version of LaMDA last year, and again on Wednesday, that he sees it as a potential path to voice interfaces far more capable than the often frustratingly limited Alexa, Google Assistant, and Apple’s Siri. Now Google’s leaders appear to be convinced they may have finally found the path to creating computers you can genuinely talk with.
At the same time, large language models have proven fluent in talking dirty, nasty, and plain racist. Scraping billions of words of text from the web inevitably sweeps in a lot of unsavory content. OpenAI, the company behind language generator GPT-3, has reported that its creation can perpetuate stereotypes about gender and race, and asks customers to implement filters to screen out unsavory content.