Google has recently announced the launch of PaLM 2, a state-of-the-art multimodal AI system that can translate between languages, write computer code, and analyze and respond to images. Like OpenAI’s GPT, PaLM 2 is a general-purpose AI model that can power chatbots like ChatGPT, among other applications. The system can even perform complex tasks such as finding a restaurant in Bulgaria, searching the web for Bulgarian responses, translating the answer into English, adding a picture of the location, and providing a code snippet to create a database entry for the place.
According to Slav Petrov, the co-lead of the PaLM 2 project, the neural network revolution now under way started around a decade ago, in part at Google. Petrov added that AI breakthroughs including the transformer, the T in GPT, came from the company’s research.
“We’re really excited to make these models available broadly externally because we want to see what people can do with them,” Petrov said. “We believe that they will open up a lot of opportunities to do things that were previously thought magic and really out of reach, but that now can be accomplished thanks to the amazing progress in machine learning that we’ve seen over the last years.”
PaLM 2’s Multilingual Capabilities
The most obvious way to interact with PaLM 2 is through Google’s chatbot, Bard, which is opening up to the general public for the first time and rolling out globally. Bard draws on PaLM 2’s multilingual capabilities and is available in Japanese and Korean, as well as English, with the company intending to support 40 languages in time.
Bard’s Multimodal Capabilities
Chatbot users can now send Bard photos for the first time, with the company giving an example of sending a picture of a kitchen shelf and asking for a recipe using the ingredients. This replicates a feature promised by OpenAI alongside the launch of its most recent and powerful AI model, GPT-4, but not yet made available to the general public, leaving Google leading the way on so-called “multimodal” capabilities.
In a new feature called “Duet AI,” users of Google’s “Workspace” apps, such as Gmail, Docs, Slides, and Sheets, can use PaLM 2 AI as a co-author of text, spreadsheets, and slides. An image generator built into Google Slides lets you task an AI with visualizing your ideas, while a “help me write” button in Google Docs can generate whole swathes of text automatically.
In one example, the prompt “job post for a regional sales rep” was rapidly expanded to a full job description, replete with clear spaces to enter specific details such as company name and location. However, a disclaimer that the tool “is a creative writing aid and is not intended to be factual” underscores the dilemma for Google: rushing the technology out to beat the competition carries the risk of AI software misbehaving.
The Risks of AI Software Misbehaving
In its preliminary research, Google warned that systems built on PaLM 2 “continue to produce toxic language harms.” In some languages, the model gave “toxic” responses to queries about black people in almost a fifth of all tests, which is part of the reason the Bard chatbot is only available in three languages at launch.
Although the launch of the AI system was put together at the last minute, with detailed updates sent to reporters just hours before the company’s chief executive, Sundar Pichai, took the stage at the conference, it is clear that Google is committed to making PaLM 2 broadly available outside the company.