Bao Pham

[Photo: Bao Pham looks at the camera against a white background]

Bao Pham, a PhD student in Computer Science, focuses his research on developing efficient, interpretable modern generative neural networks and on understanding how they work. Today we are seeing a rise in large language models (LLMs), and humanity is experiencing rapid technological and social change driven by sophisticated deep learning (DL) systems. That change can be seen as beneficial (humanity advancing itself to a higher level) or detrimental (humanity “feeding itself” to its own creation). Regardless of one’s view of these technologies, the behavior of most DL systems, especially LLMs, is difficult to understand.
While we know much about the performance of DL systems during training, many questions remain about their internal behavior. We do not know what information is stored in their artificial neurons, what the networks’ internal dynamics are, or whether, once deployed, their predictions remain reliable and accurate on unseen inputs. These gaps may prevent us from developing and maintaining safe AI systems.
While scaling up DL models often improves performance, these systems still fall far short of the efficiency of the human brain, which is smaller and needs very little power to function and learn. These models also face a variety of challenges that researchers are now trying to understand, explain, and mitigate. Hallucination, when a model produces false facts, and memorization, when a model reproduces its training data, are two prominent problems. Bao is excited to be collaborating with Dmitry Krotov, a physicist at the MIT-IBM Watson AI Lab, where Bao spends his summers as an intern. Dr. Krotov was mentored by John Hopfield, the recent Nobel Prize in Physics recipient whose work is foundational to the approaches Bao and Dr. Krotov are employing. One of Bao’s cherished possessions is a chocolate Nobel Prize coin that Dmitry brought back for him from the ceremony. The team’s efforts were recently highlighted in Quanta Magazine.
Bao and his team are now applying modern formulations of Associative Memory, or Hopfield models, to help us understand the “black box” behaviors of neural networks. “We can describe and interpret the dynamics of artificial neural networks through the usage (and perspective) of an energy function. The interpretation of the network via energy enables us to understand what the model is learning and doing, and we can track its behaviors with the energy function even after training, although it takes some thinking to solve for the respective energy function,” he says. Hopfield networks were the first models to introduce the concept of energy-based modeling. They are designed to store information as local minima of their energy function, which makes them a unique type of recurrent neural network (RNN) that naturally exhibits both interpretability and memory recall.
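To make that idea concrete, here is a minimal sketch of a classical Hopfield network in Python. It is illustrative only, not the modern formulation Bao’s team works with: patterns are stored with a simple Hebbian rule, and recall is a descent on the energy surface toward the nearest stored minimum.

```python
# Minimal classical Hopfield network: Hebbian storage, energy-descent recall.
# Illustrative sketch only; modern associative memories use richer energies,
# but the core "memory = local minimum of energy" picture is the same.
import numpy as np

rng = np.random.default_rng(0)

N, K = 64, 3                                  # neurons, stored patterns
patterns = rng.choice([-1, 1], size=(K, N))   # binary patterns to memorize

# Hebbian rule: each pattern carves a local minimum into the energy landscape.
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0)                        # no self-connections

def energy(s):
    """Hopfield energy E(s) = -1/2 * s^T W s; recall descends this surface."""
    return -0.5 * s @ W @ s

def recall(s, sweeps=5):
    """Asynchronous updates: flip each neuron toward lower energy."""
    s = s.copy()
    for _ in range(sweeps):
        for i in rng.permutation(N):
            s[i] = 1 if W[i] @ s >= 0 else -1
    return s

# Corrupt a stored pattern, then let the dynamics pull it back to its minimum.
probe = patterns[0].copy()
flipped = rng.choice(N, size=10, replace=False)
probe[flipped] *= -1

print("energy before:", energy(probe))
restored = recall(probe)
print("energy after: ", energy(restored))
print("recovered stored pattern:", np.array_equal(restored, patterns[0]))
```

Running the sketch shows the energy dropping as the corrupted pattern falls back into its basin of attraction, which is exactly the “information stored as a local minimum” behavior described above.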
Bao studies the behavior of artificial neural networks through the lens of energy, leading to novel, efficient neural architectures and new interpretations of how such networks learn to generalize. For example, one can mathematically craft a global energy function that drives the dynamics of the model, as done in his work on the Energy Transformer. Alternatively, one can use the perspective of Associative Memory to recast generalization in generative models as a failure of memory recall, showing that generative models do indeed transition from associative memory systems to genuinely generative ones, behaving similarly to how humans learn.
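To make the phrase “global energy function” concrete: one well-known modern (dense) associative memory energy, in the spirit of the Krotov–Hopfield models Bao builds on (the exact form in his papers may differ), scores a network state \(x\) against stored patterns \(\xi_1, \dots, \xi_K\) as

```latex
E(x) \;=\; -\frac{1}{\beta}\,\log \sum_{\mu=1}^{K} \exp\!\bigl(\beta\, \xi_{\mu}^{\top} x\bigr)
```

The network’s dynamics follow the negative gradient of this energy, so each stored pattern carves out its own basin of attraction; when the number of patterns outgrows what the energy surface can keep separate, minima begin to merge, and “failed” recall starts to behave like generalization.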
Having developed a strong background in high-performance computing at Harrisburg University, Bao was attracted to the work of RPI’s Professors Alex Gittens, Sibel Adali, and Tomek Strzalkowski, along with access to AiMOS, our Artificial Intelligence Multiprocessing Optimized System GPU-based supercomputer. Advised by Prof. Mohammed J. Zaki for his PhD thesis, Bao intends to go into industry after his doctorate to continue tackling problems of explainability, interpretability, and efficiency while designing newer, more performant neural networks. Commenting on Bao’s research, Zaki adds, “Bao’s thesis combines aspects of three promising paradigms in machine learning, namely, associative memories, energy-based models, and diffusion models, to further our understanding of modern AI systems, and to make them more efficient and robust.”
When not working, Bao can be found running long distances alone or with friends to prepare for events. No matter how experienced or strong a runner is, Bao recommends bringing plenty of water, a lesson he learned on an enjoyable but challenging seventeen-mile journey between Latham and Saratoga Springs.
