Neural networks have driven innovation in artificial intelligence, including the large-scale language models now used in a wide range of applications, from finance to human resources to healthcare. However, these networks remain black boxes that are difficult to understand, even for the engineers and scientists who design them. Now, a team led by data and computer scientists at the University of California, San Diego, has given neural networks the equivalent of an X-ray to find out how they actually learn.
The researchers found that a formula used in statistical analysis provides a streamlined mathematical description of how neural networks, such as ChatGPT's predecessor GPT-2, learn relevant patterns in data, known as features. The formula also describes how a neural network uses these relevant patterns to make predictions.
“We are trying to understand neural networks from first principles,” said Daniel Beaglehole, a graduate student in the UC San Diego Department of Computer Science and Engineering and a co-author of the study. “Our formula gives a simple way to interpret the features the network uses to make predictions.”
The research team published their findings in the journal Science on March 7.
Why is this important? AI-based tools are now pervasive in our daily lives. Banks use them to approve loans. Hospitals use them to analyze medical data, such as X-rays and MRIs. Companies use them to screen job applicants. However, it is currently difficult to understand the mechanisms neural networks use to make decisions, or the biases in the training data that may affect those decisions.
“If you don’t understand how neural networks learn, it’s very difficult to establish whether they are producing reliable, accurate, and appropriate responses,” said Mikhail Belkin, corresponding author on the paper and a professor at the UC San Diego Halicioglu Data Science Institute. “This is especially important given the recent rapid growth of machine learning and neural network technologies.”
The study is part of a larger effort by Belkin's research group to develop a mathematical theory that explains how neural networks work. “The technology has moved far ahead of the theory,” he said. “We have to catch up.”
The team also showed that the statistical formula they used to understand how neural networks learn, known as Average Gradient Outer Product (AGOP), can be applied to improve the performance and efficiency of other types of machine learning architectures that do not involve neural networks.
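For readers who want the formula itself, the AGOP of a trained predictor is, roughly speaking, the outer product of the model's input gradients averaged over the training data. Below is a minimal sketch of that definition; the paper's exact notation and normalization may differ:

```latex
% Average Gradient Outer Product (AGOP) of a predictor f,
% averaged over training inputs x_1, ..., x_n (sketch; notation assumed)
\mathrm{AGOP}(f) \;=\; \frac{1}{n} \sum_{i=1}^{n} \nabla f(x_i)\, \nabla f(x_i)^{\top}
```

Directions in input space along which the model's predictions change sharply receive large weight in this matrix, which is how it captures the features the model relies on.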
“If we understand the fundamental mechanisms that drive neural networks, we will be able to build simpler, more efficient and easier to interpret machine learning models,” Belkin said. “We hope this will help democratize AI.”
The machine learning systems Belkin envisions would require less computing power to operate, and therefore less power from the grid. They would also be less complex and easier to understand.
Explaining the new findings with an example
(Artificial) neural networks are computational tools for learning relationships between data characteristics, such as identifying specific objects or faces in images. One example task is determining whether a person in a new image is wearing glasses. Machine learning tackles this problem by giving a neural network many example (training) images labeled as images of “people with glasses” or “people without glasses.” The network learns the relationship between the images and their labels and extracts the data patterns, or features, that it focuses on to make a decision. One reason AI systems are considered black boxes is that it is often difficult to describe mathematically what criteria the system is actually using to make its predictions, including potential biases. The new research provides a simple mathematical explanation of how the system learns these features.
Features are relevant patterns in the data. In the example above, there are various features that the neural network learns and then uses to determine whether the person in a photo is in fact wearing glasses. One feature it needs to pay attention to for this task is the upper part of the face. Other features include the areas around the eyes and nose, where glasses typically sit. The network selectively attends only to the features it has learned are relevant and discards the other parts of the image, such as the lower part of the face and the hair.
Feature learning is the ability to recognize relevant patterns in data and then use those patterns to make predictions. In the glasses example, the network learns to pay attention to the upper part of the face. In the new Science paper, the researchers identified a statistical formula that describes how a neural network learns features.
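To give a flavor of what the formula looks like in code, the sketch below estimates the AGOP of a small trained model by averaging the outer products of its input gradients; the diagonal of the resulting matrix then ranks input pixels by how strongly the model relies on them (the upper-face pixels in the glasses example). The model, data, and sizes here are synthetic placeholders for illustration, not anything from the study:

```python
import torch

torch.manual_seed(0)

d = 28 * 28                          # flattened "image" size (placeholder)
model = torch.nn.Sequential(         # stand-in for an already-trained classifier
    torch.nn.Linear(d, 128), torch.nn.ReLU(), torch.nn.Linear(128, 1)
)

X = torch.randn(200, d)              # synthetic inputs standing in for training images

def agop(model, X):
    """Average Gradient Outer Product: mean over inputs of grad f(x) grad f(x)^T."""
    G = torch.zeros(X.shape[1], X.shape[1])
    for x in X:
        x = x.clone().requires_grad_(True)
        y = model(x).sum()                       # scalar prediction for this input
        (grad,) = torch.autograd.grad(y, x)      # gradient of the prediction w.r.t. the input
        G += torch.outer(grad, grad)
    return G / X.shape[0]

M = agop(model, X)
importance = M.diagonal()            # large entries mark input pixels the model leans on
print(importance.topk(5).indices)    # e.g. upper-face pixels in the glasses example
```

On a real trained classifier, inspecting this matrix is one way to read off which parts of the input the network has learned to attend to.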
Alternative Neural Network Architectures: The researchers showed that inserting this formula into computing systems that do not rely on neural networks allows those systems to learn faster and more efficiently.
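As a rough illustration of that idea, the toy sketch below plugs an AGOP-style update into an ordinary kernel ridge regression loop: fit the model, average the outer products of its input gradients, and use the resulting matrix to reweight the distance metric before refitting. This is a sketch of the general approach described in the article, not the authors' published algorithm; the kernel choice, hyperparameters, and synthetic task are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic task: only the first two of 20 input features actually matter.
n, d = 300, 20
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + X[:, 1] ** 2

sigma = 2.0                          # kernel bandwidth (placeholder)

def kernel(A, B, M):
    """Gaussian kernel with a learned metric M: exp(-(a-b)^T M (a-b) / (2 sigma^2))."""
    diff = A[:, None, :] - B[None, :, :]
    dist2 = np.einsum("ijk,kl,ijl->ij", diff, M, diff)
    return np.exp(-dist2 / (2 * sigma**2))

M = np.eye(d)                        # start with an unweighted metric
for _ in range(5):                   # alternate model fitting and AGOP updates
    K = kernel(X, X, M)
    alpha = np.linalg.solve(K + 1e-3 * np.eye(n), y)    # kernel ridge regression fit

    # Gradient of the predictor at each training point:
    # grad f(x_i) = -(1 / sigma^2) * sum_j alpha_j * K[i, j] * M @ (x_i - x_j)
    diff = X[:, None, :] - X[None, :, :]
    grads = -np.einsum("ij,ijk,kl->il", alpha[None, :] * K, diff, M) / sigma**2

    M = grads.T @ grads / n          # AGOP: average outer product of the gradients
    M *= d / np.trace(M)             # rescale so the kernel bandwidth stays sensible

print(np.round(np.diag(M), 2))       # the learned metric should now emphasize features 0 and 1
```

The non-neural model here never computes backpropagation through layers; it simply reuses the averaged gradient information to decide which input directions deserve attention on the next fit.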
“How do you ignore something you don't need? Humans are good at this,” Belkin said. “Machines are doing the same thing. Large-scale language models, for example, implement this ‘selective attention,’ but we didn’t know how they do it. In the Science paper, we present a mechanism that explains at least part of how neural networks ‘selectively attend.’”
Funders for the study included the National Science Foundation and the Simons Foundation for Collaboration on Theoretical Foundations of Deep Learning. Belkin is part of The Institute for Learning-enabled Optimization at Scale (TILOS), funded by NSF and led by UC San Diego.