Leonid Berlyand and his research group apply mathematical principles to better understand and improve deep learning artificial intelligence. Credit: Michelle Bixby / Penn State
Artificial intelligence (AI) is increasingly prevalent, integrated into phone apps, search engines and social media platforms as well as supporting myriad research applications. Of particular interest in recent decades is a type of AI machine learning called deep learning, which has a structure inspired by the neural networks of the human brain.
Deep learning is at the core of the large language models used by OpenAI's ChatGPT and Microsoft Copilot, for example. More specialized deep learning models have supported a wide range of scientific research, including the 2024 Nobel Prize-winning work in chemistry on predicting complex protein structures.
One of the benefits of deep learning is its ability to recognize patterns or features without explicit human programming, but this process can be opaque. This “black box” quality of deep learning raises questions about how exactly the models operate and makes them challenging to validate and optimize.
In the following Q&A, Penn State Professor of Mathematics Leonid Berlyand and graduate student Oleksii Krupchytskyi speak about how they are applying mathematical principles to elucidate the black box nature of deep learning.
What is deep learning?
Berlyand: Deep learning is a type of machine learning that uses artificial neural networks to learn from data, similar to the way humans learn. These networks, also called ANNs, were originally developed by computer scientists and are inspired by the structure of the human brain. An ANN consists of nodes connected by edges that are typically arranged in layers.
Loosely speaking, these nodes are “artificial neurons” and the edges mimic synapses that connect neurons in the brain. Learning takes place during the training process, in which data is introduced into the network and the ANN iteratively adjusts the weights of the connections to reduce errors in its predictions.
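The training loop described above can be sketched in a few lines. This is a toy illustration, not the researchers' code: a single artificial neuron learns the logical AND function, with the learning rate, step count, and target chosen purely for demonstration.

```python
import numpy as np

# Toy example (illustrative only): one artificial neuron trained by
# gradient descent to learn the logical AND function.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([0, 0, 0, 1], dtype=float)                      # targets (AND)

w = rng.normal(size=2)   # connection weights, the "synapses"
b = 0.0                  # bias term
lr = 0.5                 # learning rate (illustrative choice)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    pred = sigmoid(X @ w + b)        # forward pass: current predictions
    err = pred - y                   # prediction error
    w -= lr * (X.T @ err) / len(y)   # adjust weights to reduce the error
    b -= lr * err.mean()             # adjust bias the same way

print(np.round(sigmoid(X @ w + b)))  # → [0. 0. 0. 1.]
```

Real networks repeat exactly this adjust-weights-to-reduce-error cycle, just with millions of weights and far more data.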
What is deep learning used for?
Berlyand: Deep learning drastically changed many areas of science and technology, including speech and voice recognition, computer vision, and natural language processing. A simple example is a classification problem, like your phone deciding whether a face is yours or not, or classifying images such as handwritten digits 0 through 9. In the latter case, the input is an image whose pixels are converted into a vector, with each component giving the intensity of one pixel. The output classifies the image as a digit: 0, 1, 2 and so on.
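The pixels-to-vector step can be shown concretely. This sketch uses a made-up 4x4 grayscale "image" and an untrained random weight matrix purely for illustration; real digit datasets such as MNIST use 28x28-pixel images.

```python
import numpy as np

# Hypothetical 4x4 grayscale image (0 = black, 255 = white).
image = np.array([
    [0,   0, 200,   0],
    [0, 180, 255,   0],
    [0,   0, 210,   0],
    [0,   0, 190,   0],
], dtype=float)

# Flatten the pixel grid into a vector whose components are the
# intensities of the pixels, scaled to [0, 1].
x = image.flatten() / 255.0
print(x.shape)  # → (16,)

# A classifier maps this vector to 10 scores, one per digit class 0-9.
# Here the weights are random (untrained), so the prediction is arbitrary.
rng = np.random.default_rng(1)
W = rng.normal(size=(10, 16))
scores = W @ x
predicted_digit = int(np.argmax(scores))  # index of the largest score
```

Training, as described earlier, consists of adjusting `W` so that the largest score lands on the correct digit for the training images.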
Recently, ANN-based large language models have become universally popular due to their excellent performance in a wide variety of applications, including education, health care and scientific research. In fact, so far this year, ChatGPT hit about 700 million weekly users.
Krupchytskyi: Deep learning networks are particularly good at analyzing large amounts of unstructured data, like images and text. They're widely used in chatbots, in image recognition like that required for self-driving cars, and in recommendation services like those used by video streaming platforms.
What makes it 'deep'?
Berlyand: Between the input and output layers, artificial neural networks have many hidden layers. For example, if you have a model that is classifying the digits 0 to 9, one layer might focus on the edges of the image, another might focus on the darkness of certain pixels, with each layer identifying increasingly complex features. It was observed empirically that adding more and more layers improves the accuracy of ANNs and allows us to answer more complex questions. A model with more layers is considered “deeper,” hence “deep learning.”
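The stacking of layers can be sketched as a simple composition of functions. This is a minimal illustration, not the researchers' model: the layer sizes are assumed for a 28x28 digit image, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # A common nonlinearity applied between layers.
    return np.maximum(0.0, z)

# Input pixels → two hidden layers → 10 output scores (digits 0-9).
layer_sizes = [784, 128, 64, 10]
weights = [rng.normal(scale=0.1, size=(m, n))
           for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    # "Depth" is just this repeated composition: each hidden layer
    # transforms the previous layer's output, extracting features.
    for W in weights[:-1]:
        x = relu(W @ x)
    return weights[-1] @ x  # output layer: one score per digit class

x = rng.random(784)   # a flattened 28x28 image
scores = forward(x)
print(scores.shape)   # → (10,)
```

Adding more entries to `layer_sizes` makes the network "deeper" without changing the structure of the code at all, which is why depth scales so naturally.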
Krupchytskyi: Deep learning models can have hundreds of such layers and millions or even trillions of parameters. With deep learning, humans don't explicitly program every connection between the layers; the model learns these connections itself, automatically discovering relevant features. This type of model is often called a "black box," because we don't know exactly what is going on inside. One of our goals is to apply mathematical tools to better understand what these models are actually doing so we can ensure their robustness and ultimately improve their performance.
What can we gain by applying mathematical foundations to deep learning?
Berlyand: Deep learning was developed and advanced largely by computer scientists and engineers. My Penn State colleague Pierre-Emmanuel Jabin, distinguished professor of mathematics, and I wanted to provide a rigorous mathematical underpinning for various performance criteria of ANNs, such as stability and convergence of training algorithms, or when algorithms can be considered "trained." This motivation led us to write a simple introductory textbook for undergraduate mathematics students, where definitions and concepts from deep learning are presented in a precise mathematical framework.
What I tell my students is that you can be a race car driver and know how to operate the car, but if you don't know what is inside, you can't improve it or design a new one. Similarly, a mathematical understanding of deep learning will lead to better prediction accuracy and improved performance of ANNs.
Krupchytskyi: There are so many different use cases for deep learning, but the underlying mathematics is the same for all of them. Having a fundamental understanding of deep learning is important to creating reliable, interpretable and robust networks.
Computer scientists and engineers have many tools to improve the performance of ANNs, but these are largely based on empirical observations. We bring rich mathematical theories that have been developed over decades or even centuries and have been applied to, and improved, various fields, like physics, materials science and the life sciences. Using mathematics in deep learning helps us understand which types of problems are most appropriate for ANNs, how to best structure the networks and how long they should train, and it can generally help improve stability.
Citation: Q&A: How mathematics can reveal the depth of deep learning AI (2025, November 5) retrieved 5 November 2025 from https://phys.org/news/2025-11-qa-mathematics-reveal-depth-deep.html