Zhengqi He is a research scientist at the RIKEN Center for Brain Science, collaborating with Professor Taro Toyoizumi. His research focuses on computational theories for natural language processing in both the brain and AI, higher-order cognitive functions, and the mining of biological big data. Previously, he worked as a research associate at the Facility for Rare Isotope Beams, where he led the development of the online beam-tuning model FLAME.
PhD in Engineering Physics, 2014
Tsinghua University
Graduate Program Candidate in Physics, 2013
Michigan State University
B.E. in Engineering Physics, 2010
Tsinghua University
Minor in Applied Computer Science, 2010
Tsinghua University
Understanding how humans process natural language has long been a vital research direction. The field of natural language processing (NLP) has recently seen a surge in the development of powerful language models. These models have proven to be invaluable tools for studying another complex system known to process human language: the brain. Previous studies have demonstrated that the features of language models can be mapped to fMRI recordings of brain activity. This raises the question: is there a commonality between information processing in language models and the human brain? To estimate information-flow patterns in a language model, we examined the causal relationships between its layers. Drawing inspiration from the workspace framework for consciousness, we hypothesized that features integrating more information would more accurately predict activity higher in the cortical hierarchy. To test this hypothesis, we classified language-model features into two categories based on causal network measures, "low in-degree" and "high in-degree", and compared the brain prediction accuracy maps for the two groups. Our results reveal that the difference in prediction accuracy follows a hierarchical pattern, consistent with the cortical hierarchy map revealed by intrinsic time constants. This finding suggests a parallel between how language models and the human brain process linguistic information.
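For concreteness, the analysis pipeline can be sketched roughly as below; this is a minimal illustration with synthetic data, in which a random causal graph and a ridge-regression encoding model stand in for the paper's actual layer-wise causal estimates and fMRI recordings.

```python
# Sketch: split features by causal in-degree, then compare how well each
# group predicts (here, synthetic) brain responses. All sizes are placeholders.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_features, n_samples, n_voxels = 64, 500, 10   # hypothetical sizes

# Binary causal adjacency matrix: causal[i, j] = 1 means feature i drives j,
# so the in-degree of feature j is the j-th column sum.
causal = (rng.random((n_features, n_features)) < 0.1).astype(int)
in_degree = causal.sum(axis=0)

# Split features into "low in-degree" and "high in-degree" at the median.
median = np.median(in_degree)
low_idx = np.where(in_degree <= median)[0]
high_idx = np.where(in_degree > median)[0]

X = rng.standard_normal((n_samples, n_features))   # stand-in LM features
Y = rng.standard_normal((n_samples, n_voxels))     # stand-in fMRI responses

def prediction_accuracy(idx):
    """Cross-validated ridge-regression score, averaged over voxels."""
    model = Ridge(alpha=1.0)
    return np.mean([cross_val_score(model, X[:, idx], Y[:, v], cv=5).mean()
                    for v in range(n_voxels)])

# The quantity of interest is the difference map between the two groups
# (meaningless here because the data are random).
diff = prediction_accuracy(high_idx) - prediction_accuracy(low_idx)
print(f"high-minus-low in-degree accuracy difference: {diff:.3f}")
```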
A deep neural network is a good task solver, but it is difficult to make sense of its operation, and there is little consensus on how such an interpretation should be formed. We approach this problem from a new perspective, in which the interpretation of task solving is synthesized by quantifying how much and what previously unused information is exploited in addition to the information used to solve earlier tasks. First, after learning several tasks, the network acquires an information partition related to each task. We propose that the network then learns the minimal information partition that supplements the previously learned partitions so as to represent the input more accurately. This extra partition is associated with un-conceptualized information that has not been used in previous tasks. We identify what un-conceptualized information is used and quantify its amount. To interpret how the network solves a new task, we quantify as meta-information how much information is extracted from each partition. We implement this framework with the variational information bottleneck technique and test it on the MNIST and CLEVR datasets. The framework is shown to compose information partitions and to synthesize experience-dependent interpretations in the form of meta-information. The system progressively improves the resolution of its interpretations with new experience by converting part of the un-conceptualized information partition into a task-related partition. It can also provide a visual interpretation by imaging which part of the previously un-conceptualized information is needed to solve a new task.
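The variational information bottleneck at the core of this framework can be sketched in a few lines of PyTorch; the partition bookkeeping and meta-information measures described above are omitted, and all network sizes are placeholders.

```python
# Minimal variational information bottleneck (VIB): a stochastic encoder with
# a KL penalty that limits how much information the bottleneck z carries.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VIB(nn.Module):
    def __init__(self, in_dim=784, bottleneck=32, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, 2 * bottleneck))
        self.decoder = nn.Linear(bottleneck, n_classes)

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        return self.decoder(z), mu, logvar

def vib_loss(logits, y, mu, logvar, beta=1e-3):
    # Task term plus KL(q(z|x) || N(0, I)); beta trades accuracy for compression.
    ce = F.cross_entropy(logits, y)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    return ce + beta * kl

# Usage with random MNIST-shaped stand-in data:
model = VIB()
x, y = torch.randn(8, 784), torch.randint(0, 10, (8,))
logits, mu, logvar = model(x)
vib_loss(logits, y, mu, logvar).backward()
```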
Recent advances in the development of large language models have led to substantial performance gains across an array of downstream tasks. Remarkably, these models, trained with straightforward end-to-end objectives, have demonstrated an inherent ability to handle language tasks that, not long ago, depended heavily on an in-depth understanding of language. The convergence of these trends provides an excellent opportunity to examine their relationship. Specifically, we pose the question: can contemporary deep neural network (DNN) based end-to-end language-modeling paradigms provide us with insights into language? In this paper, we focus on a long-standing linguistic debate: can syntax and semantics be separated? We argue that by incorporating an inductive bias for division of labor, a separation between syntax and semantics emerges naturally in the English language. To demonstrate this, we employ a two-tower language-model setup, in which two language models with identical configurations are trained collaboratively in parallel. Intriguingly, this configuration gives rise to a spontaneous preference in which specific tokens are consistently better predicted by one tower and others by the second tower. This pattern remains qualitatively consistent across different model structures and reflects the separation of syntax and semantics. Our findings show the potential of DNN-based, end-to-end trained language models to deepen our comprehension of the properties of natural language.
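As an illustration, a toy version of the two-tower setup might look as follows; the winner-take-all token routing here is one plausible reading of the division-of-labor inductive bias, not necessarily the paper's exact objective, and the tiny LSTM towers stand in for the actual model configurations.

```python
# Two identical language models trained in parallel; each token's loss is
# charged to whichever tower already predicts it better, so the towers can
# specialize in different kinds of tokens.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLM(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, tokens):
        h, _ = self.rnn(self.emb(tokens))
        return self.out(h)   # next-token logits, shape (batch, seq, vocab)

tower_a, tower_b = TinyLM(), TinyLM()   # identical configurations
opt = torch.optim.Adam(list(tower_a.parameters()) + list(tower_b.parameters()))

tokens = torch.randint(0, 1000, (4, 32))        # stand-in token batch
inputs, targets = tokens[:, :-1], tokens[:, 1:]

# Per-token cross-entropy for each tower, shape (batch, seq).
loss_a = F.cross_entropy(tower_a(inputs).transpose(1, 2), targets, reduction="none")
loss_b = F.cross_entropy(tower_b(inputs).transpose(1, 2), targets, reduction="none")

# Winner-take-all routing: train each tower only on the tokens it handles best.
mask_a = (loss_a < loss_b).float()
loss = (mask_a * loss_a + (1 - mask_a) * loss_b).mean()

opt.zero_grad()
loss.backward()
opt.step()
# mask_a then records which tower "owns" each token; in the paper, this
# assignment is what turns out to track a syntax/semantics split.
```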
Lab for Neural Computation and Adaptation, RIKEN Center for Brain Science