Neural networks have driven innovation in artificial intelligence, including the large-scale language models now used in a wide range of applications, from finance to human resources to healthcare. However, these networks remain black boxes that are difficult to understand, even for the engineers and scientists who design them. Now, a team led by data and computer scientists at the University of California, San Diego, has given neural networks the equivalent of an X-ray to find out how they actually learn.
The researchers found that a formula used in statistical analysis provides a streamlined mathematical description of how neural networks, such as ChatGPT's predecessor GPT-2, learn relevant patterns in data, known as features. The formula also describes how a neural network uses these relevant patterns to make predictions.
“We are trying to understand neural networks from first principles,” said Daniel Beaglehole, a graduate student in the UC San Diego Department of Computer Science and Engineering and a co-author of the study. “Our formula gives a simple way to interpret the features the network uses to make predictions.”
The research team published their findings in the journal Science on March 7.
Why is this important? AI-based tools are now pervasive in our daily lives. Banks use them to approve loans. Hospitals use them to analyze medical data, such as X-rays and MRIs. Companies use them to screen job applicants. However, it is currently difficult to understand the mechanisms neural networks use to make decisions, or the biases in the training data that may affect those decisions.
“If you don’t understand how neural networks learn, it’s very difficult to establish whether they are producing reliable, accurate, and appropriate responses,” said Mikhail Belkin, corresponding author on the paper and a professor at the UC San Diego Halicioglu Data Science Institute. “This is especially important given the recent rapid growth of machine learning and neural network technologies.”
The study is part of a larger effort by Belkin's research group to develop a mathematical theory that explains how neural networks work. “The technology has moved far ahead of the theory,” he said. “We have to catch up.”
The team also showed that the statistical formula they used to understand how neural networks learn, known as Average Gradient Outer Product (AGOP), can be applied to improve the performance and efficiency of other types of machine learning architectures that do not involve neural networks.
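For readers who want the formula itself, the AGOP of a trained predictor is, roughly speaking, the outer product of the model's input gradients averaged over the training data. Below is a minimal sketch of that definition; the paper's exact notation and normalization may differ:

```latex
% Average Gradient Outer Product (AGOP) of a predictor f,
% averaged over training inputs x_1, ..., x_n (sketch; notation assumed)
\mathrm{AGOP}(f) \;=\; \frac{1}{n} \sum_{i=1}^{n} \nabla f(x_i)\, \nabla f(x_i)^{\top}
```

Directions in input space along which the model's predictions change sharply receive large weight in this matrix, which is how it captures the features the model relies on.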
“If we understand the fundamental mechanisms that drive neural networks, we will be able to build simpler, more efficient and easier to interpret machine learning models,” Belkin said. “We hope this will help democratize AI.”
The machine learning systems Belkin envisions would require less computing power to operate, and therefore less power from the grid. They would also be less complex and easier to understand.
Explaining the new findings with an example
(Artificial) neural networks are computational tools for learning relationships between data characteristics, such as identifying specific objects or faces in images. One example task is determining whether a person in a new image is wearing glasses. Machine learning tackles this problem by giving a neural network many example (training) images labeled as images of “people with glasses” or “people without glasses.” The network learns the relationship between the images and their labels and extracts the data patterns, or features, that it focuses on to make a decision. One reason AI systems are considered black boxes is that it is often difficult to describe mathematically what criteria the system is actually using to make its predictions, including potential biases. The new research provides a simple mathematical explanation of how the system learns these features.
Features are relevant patterns in the data. In the example above, there are various features that the neural network learns and then uses to determine whether the person in a photo is in fact wearing glasses. One feature it needs to pay attention to for this task is the upper part of the face. Other features include the areas around the eyes and nose, where glasses typically sit. The network selectively attends only to the features it has learned are relevant and discards the other parts of the image, such as the lower part of the face and the hair.
Feature learning is the ability to recognize relevant patterns in data and then use those patterns to make predictions. In the glasses example, the network learns to pay attention to the upper part of the face. In the new Science paper, the researchers identified a statistical formula that describes how a neural network learns features.
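To give a flavor of what the formula looks like in code, the sketch below estimates the AGOP of a small trained model by averaging the outer products of its input gradients; the diagonal of the resulting matrix then ranks input pixels by how strongly the model relies on them (the upper-face pixels in the glasses example). The model, data, and sizes here are synthetic placeholders for illustration, not anything from the study:

```python
import torch

torch.manual_seed(0)

d = 28 * 28                          # flattened "image" size (placeholder)
model = torch.nn.Sequential(         # stand-in for an already-trained classifier
    torch.nn.Linear(d, 128), torch.nn.ReLU(), torch.nn.Linear(128, 1)
)

X = torch.randn(200, d)              # synthetic inputs standing in for training images

def agop(model, X):
    """Average Gradient Outer Product: mean over inputs of grad f(x) grad f(x)^T."""
    G = torch.zeros(X.shape[1], X.shape[1])
    for x in X:
        x = x.clone().requires_grad_(True)
        y = model(x).sum()                       # scalar prediction for this input
        (grad,) = torch.autograd.grad(y, x)      # gradient of the prediction w.r.t. the input
        G += torch.outer(grad, grad)
    return G / X.shape[0]

M = agop(model, X)
importance = M.diagonal()            # large entries mark input pixels the model leans on
print(importance.topk(5).indices)    # e.g. upper-face pixels in the glasses example
```

On a real trained classifier, inspecting this matrix is one way to read off which parts of the input the network has learned to attend to.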
Alternative Neural Network Architectures: The researchers showed that inserting this formula into computing systems that do not rely on neural networks allows those systems to learn faster and more efficiently.
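As a rough illustration of that idea, the toy sketch below plugs an AGOP-style update into an ordinary kernel ridge regression loop: fit the model, average the outer products of its input gradients, and use the resulting matrix to reweight the distance metric before refitting. This is a sketch of the general approach described in the article, not the authors' published algorithm; the kernel choice, hyperparameters, and synthetic task are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic task: only the first two of 20 input features actually matter.
n, d = 300, 20
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + X[:, 1] ** 2

sigma = 2.0                          # kernel bandwidth (placeholder)

def kernel(A, B, M):
    """Gaussian kernel with a learned metric M: exp(-(a-b)^T M (a-b) / (2 sigma^2))."""
    diff = A[:, None, :] - B[None, :, :]
    dist2 = np.einsum("ijk,kl,ijl->ij", diff, M, diff)
    return np.exp(-dist2 / (2 * sigma**2))

M = np.eye(d)                        # start with an unweighted metric
for _ in range(5):                   # alternate model fitting and AGOP updates
    K = kernel(X, X, M)
    alpha = np.linalg.solve(K + 1e-3 * np.eye(n), y)    # kernel ridge regression fit

    # Gradient of the predictor at each training point:
    # grad f(x_i) = -(1 / sigma^2) * sum_j alpha_j * K[i, j] * M @ (x_i - x_j)
    diff = X[:, None, :] - X[None, :, :]
    grads = -np.einsum("ij,ijk,kl->il", alpha[None, :] * K, diff, M) / sigma**2

    M = grads.T @ grads / n          # AGOP: average outer product of the gradients
    M *= d / np.trace(M)             # rescale so the kernel bandwidth stays sensible

print(np.round(np.diag(M), 2))       # the learned metric should now emphasize features 0 and 1
```

The non-neural model here never computes backpropagation through layers; it simply reuses the averaged gradient information to decide which input directions deserve attention on the next fit.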
“How do you ignore something you don't need? Humans are good at this,” Belkin said. “Machines are doing the same thing. Large-scale language models, for example, implement this ‘selective attention,’ but we didn’t know how they do it. In the Science paper, we present a mechanism that explains at least part of how neural networks ‘selectively attend.’”
Funders for the study included the National Science Foundation and the Simons Foundation for Collaboration on Theoretical Foundations of Deep Learning. Belkin is part of The Institute for Learning-enabled Optimization at Scale (TILOS), funded by NSF and led by UC San Diego.