Artificial intelligence company OpenAI this week unveiled GPT-4, the latest incarnation of the large language model that powers its popular chatbot ChatGPT. The company says GPT-4 contains big improvements; the system has already wowed people with its ability to produce human-like text and to generate images and computer code from almost any prompt.
Researchers say these capabilities have the potential to transform science, but some are frustrated that they cannot yet access the technology, its underlying code or information about how it was trained. That secrecy raises concerns about the technology's safety and makes it less useful for research, they say.
One of the updates to GPT-4, which was released on March 14th, is that it can now work with both images and text. And as a demonstration of its language prowess, OpenAI, which is based in San Francisco, California, says the model passed the US bar exam with scores in the 90th percentile, compared with the 10th percentile for the previous version of ChatGPT. However, the technology is not yet widely available: for now, only paid ChatGPT subscribers have access.
GPT-4 used to create computer code
“At the moment, there is a waiting list, so you can’t use it right now,” says Evi-Anne van Dis, a psychologist at the University of Amsterdam. But she has seen the GPT-4 demos. “We’ve watched a few videos of them demonstrating its abilities, and it’s amazing,” she says.
One example, she recounts, was a hand-drawn doodle of a website, which GPT-4 used to generate the computer code needed to build that site, a demonstration of its ability to handle images as input.
However, there is frustration in the scientific community with OpenAI’s secrecy about how and on what data the model was trained and how it actually works. “All these closed-source models are basically dead ends in science,” says Sasha Luccioni, a climate scientist at HuggingFace, an open-source AI community. “They [OpenAI] can continue to build on their research, but it’s a dead end for the community as a whole.”
“Red team” testing for OpenAI
Andrew White, a chemical engineer at the University of Rochester, has had privileged access to GPT-4 as a “red-teamer”: a person paid by OpenAI to test the platform and try to make it do something wrong. He has had access to GPT-4 for the past six months, he says. “Early in the process, it didn’t seem that different” from previous iterations.
He put questions to the bot about what chemical reaction steps were needed to make a compound, and asked it to predict the reaction yield and to choose a catalyst. “At first, I was actually not that impressed,” says White. “It was really surprising because it would look so realistic, but it would hallucinate an atom here. It would skip a step there,” he adds. But when, as part of his red-team work, he gave GPT-4 access to scientific papers, things changed dramatically.
“We realized that these models might not be that great just on their own. But when you start connecting them to the Internet, to tools such as a retrosynthesis planner or a calculator, all of a sudden new kinds of abilities emerge.”
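The tool connection White describes can be sketched as a simple dispatch loop: run the model, intercept any tool request in its output, execute the tool, and feed the result back into the prompt. Everything below is an illustrative assumption; the stub model, the `TOOL[...]` tag format and the tool names are invented for this sketch and are not OpenAI's actual interface.

```python
# Minimal sketch of connecting a language model to external tools.
# The "model" here is a hard-coded stub, and the TOOL[NAME: arg] tag
# format is an invented convention, not OpenAI's API.

import re

def calculator(expression: str) -> str:
    """Evaluate a simple arithmetic expression (digits and + - * / ( ) only)."""
    if not re.fullmatch(r"[\d+\-*/(). ]+", expression):
        raise ValueError("unsupported expression")
    return str(eval(expression))

TOOLS = {"CALC": calculator}

def fake_model(prompt: str) -> str:
    """Stand-in for the LLM: first emits a tool request, then uses the result."""
    if "Observation:" not in prompt:
        return "TOOL[CALC: 12*7]"      # model asks for a calculation
    return "The reaction needs 84 g."  # model continues with the result

def run_with_tools(prompt: str, max_steps: int = 5) -> str:
    """Dispatch loop: run the model, execute any tool call, feed the result back."""
    output = prompt
    for _ in range(max_steps):
        output = fake_model(prompt)
        match = re.search(r"TOOL\[(\w+): (.+?)\]", output)
        if match is None:
            return output              # no tool call: final answer
        name, arg = match.groups()
        result = TOOLS[name](arg)
        prompt += f"\nObservation: {result}"
    return output
```

The design point is that the loop, not the model, runs the tools: the model only emits text, and the surrounding code decides when that text is a request for a calculator or a retrosynthesis planner.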
And with those abilities come fears. For example, could GPT-4 enable the production of dangerous chemicals? With input from people such as White, OpenAI's engineers fed changes back into their model to discourage GPT-4 from creating unsafe, illegal or harmful content, White says.
Fake facts from models like GPT-4
Another problem is the output of false information. Luccioni says that models such as GPT-4, which exist to predict the next word in a sentence, cannot be cured of coming up with fake facts, a phenomenon known as hallucination. “You can’t rely on these kinds of models because there are so many hallucinations,” she says. And that remains a concern in the latest version, she says, although OpenAI says it has improved safety in GPT-4.
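The “predict the next word” objective Luccioni refers to can be illustrated with a toy model. A real LLM is a deep neural network trained on vast text corpora; the bigram counter below is a stand-in chosen for brevity, but it shows the same principle, and why hallucination follows from it: the model ranks plausible continuations without any notion of whether they are true.

```python
# Toy illustration of next-word prediction. Simple bigram counts over a
# tiny corpus stand in for the neural network a real LLM uses.

from collections import Counter, defaultdict

def train_bigrams(corpus: str) -> dict:
    """Count, for each word, which word follows it and how often."""
    words = corpus.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(model: dict, word: str) -> str:
    """Return the most frequent continuation of `word` seen in training."""
    return model[word.lower()].most_common(1)[0][0]

corpus = (
    "the model predicts the next word "
    "the model can hallucinate false facts"
)
model = train_bigrams(corpus)
```

Whatever continuation was most common in training gets emitted, whether or not it is factually right; scaling this idea up does not add a truth check, which is why hallucinations persist.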
Without access to the data used for training, OpenAI's safety assurances fall short for Luccioni. “You don’t know what the data are. So you can’t improve it. I mean, it’s completely impossible to do science with a model like this,” she says.