EXTENDED ABSTRACT: The resurgence and widespread application of artificial intelligence rely largely on the combination of big data and deep learning algorithms. “Attention is all you need”. The Transformer algorithms now popular in large language models require large amounts of data to train their multi-head "self-attention" mechanism. However, data in materials science research are often scarce, incomplete, and highly uncertain, which poses severe challenges to search and design within the vast materials parameter space. To enable materials design based on small data, we propose a machine learning (ML) method built on a new concept, the "pre-attention" mechanism. Following the principles of feature engineering and drawing on domain knowledge in materials science, we construct a "Center-Environment" (CE) feature model that reflects the core-shell structural characteristics and the basic properties of the constituent elements. The CE model realizes pre-attention by applying the limited data solely to a predefined, focused feature model with physical significance. “What you need is pre-attention”. The CE pre-attention mechanism shifts the expensive, big-data-driven search for attention away from complex black-box machine learning algorithms and into explicit feature models with clear physical meaning, which reduces the need for big data while improving the transparency and interpretability of the resulting models. We combine CE features with kernel functions or deep learning algorithms to construct machine learning models, and have applied them successfully to bulk materials[1-3], surfaces[4-5], and local doping systems[6-8], in areas such as new material discovery, surface catalysis, and alloying effects. In this talk, I will mainly introduce the ML-CE modeling of Nb/NbSi alloys, focusing on the effects of alloying elements on both stability and mechanical properties. Comparative studies show that in small-data scenarios, our CE machine learning model exhibits higher accuracy and broader applicability than traditional deep learning models based on graph features. Since CE can be used to describe features of any complex crystal structure, machine learning based on CE features can become an effective and general method for data-driven materials design oriented towards small datasets.
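As a rough illustration of the idea of a center-environment feature vector, the following minimal Python sketch concatenates the elemental properties of a center atom with a distance-weighted average of the same properties over its neighbors, and then compares two sites with a Gaussian kernel. This is not the authors' implementation: the inverse-distance weighting, the choice of two properties per element, the kernel width, and the names `ELEMENT_PROPS`, `ce_features`, and `gaussian_kernel` are all assumptions made for this example.

```python
import numpy as np

# Illustrative elemental properties: [atomic number, Pauling electronegativity].
# Which properties to include is an assumption made for this sketch.
ELEMENT_PROPS = {
    "Nb": np.array([41.0, 1.60]),
    "Si": np.array([14.0, 1.90]),
    "Ti": np.array([22.0, 1.54]),
}

def ce_features(center, neighbors):
    """Build a center-environment feature vector for one atomic site.

    center:    element symbol of the center atom
    neighbors: list of (element symbol, distance) pairs for the environment
    """
    center_part = ELEMENT_PROPS[center]
    # Assumed weighting: closer neighbors count more (inverse distance, normalized).
    weights = np.array([1.0 / dist for _, dist in neighbors])
    weights = weights / weights.sum()
    neighbor_props = np.array([ELEMENT_PROPS[sym] for sym, _ in neighbors])
    env_part = weights @ neighbor_props  # weighted average of neighbor properties
    return np.concatenate([center_part, env_part])

def gaussian_kernel(x, y, gamma=0.01):
    """Similarity between two feature vectors, usable inside a kernel method."""
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

# Two hypothetical Nb sites with different environments (distances are made up).
site_a = ce_features("Nb", [("Si", 2.0), ("Si", 2.0)])
site_b = ce_features("Nb", [("Si", 1.0), ("Ti", 3.0)])
print(site_a, site_b, gaussian_kernel(site_a, site_b))
```

Because the "attention" to the local environment is fixed by the feature definition rather than learned, a model built on such vectors only has to fit the mapping from features to the target property, which is the part that a small dataset can plausibly support.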
Keywords: Machine learning, Pre-attention mechanism, Center-Environment features, Materials design based on small data
REFERENCES:
[1] LI Yi-hang, XIAO Bin, TANG Yu-chao, LIU Fu, WANG Xiao-meng, YAN Fei-nan, LIU Yi. Center-Environment Feature Model for Machine Learning Study of Spinel Oxides Based on First-Principles Computations. The Journal of Physical Chemistry C, 2020, 124(52): 28458–28468.
[2] LI Yi-hang, ZHU Rui-jie, WANG Yuan-qing, FENG Ling-yan, LIU Yi. Center-environment deep transfer machine learning across crystal structures: from spinel oxides to perovskite oxides. npj Computational Materials, 2023, 9(1): 109.
[3] LI Yi-hang, ZHANG Xin-ying, LI Tao, CHEN Ying-ying, LIU Yi, FENG Ling-yan. Accelerating materials discovery for electrocatalytic water oxidation via center-environment deep learning in spinel oxides. Journal of Materials Chemistry A, 2024, 12: 19362–19377.
[4] WANG Xiao-meng, XIAO Bin, LI Yi-hang, TANG Yu-chao, LIU Fu, CHEN Jian-hui, LIU Yi. First-principles based machine learning study of oxygen evolution reactions of perovskite oxides using a surface center-environment feature model. Applied Surface Science, 2020, 531: 147323.
[5] CHEN Rong, LIU Fu, TANG Yu-chao, LIU Yan-jie, DONG Zi-qiang, DENG Zhen-yan, ZHAO Xin-luo, LIU Yi. Combined first-principles and machine learning study of the initial growth of carbon nanomaterials on metal surfaces. Applied Surface Science, 2022, 586: 152762.
[6] GUO Jing, XIAO Bin, LI Yi-hang, ZHAI Dong, TANG Yu-chao, DU Wan, LIU Yi. Machine learning aided first-principles studies of structure stability of Co3(Al, X) doped with transition metal elements. Computational Materials Science, 2021, 200: 110787.
[7] GUO Jing, XIAO Bin, TANG Yu-chao, LI Yi-hang, ZHAI Dong, FAN Xue, LIU Yi. Element-configuration dependent first-principles machine learning studies of multiple alloying effects on the structure stability of Co3(Al, W). Computational Materials Science, 2024, 233: 112767.
[8] TANG Yu-chao, XIAO Bin, CHEN Jian-hui, CHEN Shui-zhou, LI Yi-hang, LIU Fu, DU Wan, SHEN Yi-heng, FAN Xue, QIAN Quan, LIU Yi. Machine Learning with Center-Environment Attention Mechanism for Multi-Component Nb Alloys. Transactions of Nonferrous Metals Society of China, 2024 (in press).
Prof. Yi LIU obtained his Ph.D. degree in Materials Science and Engineering from the Institute of Metal Research, China, in 1997. He then worked in the field of computational materials science at Nagoya University, Japan (1997-2002); Juelich Research Center, Germany (2002-2003); the University of Western Ontario, Canada (2003-2005); and the California Institute of Technology, US (2006-2012). After working at the School of Materials Science and Engineering, University of Shanghai for Science and Technology (2012-2015), he became a professor at the Materials Genome Institute and the Department of Physics at Shanghai University (2015-present). His current research interests focus on multi-paradigm materials design for advanced alloys, energy materials, and nanomaterials, combining computation (density functional theory and reactive force field molecular dynamics simulations), AI/machine learning, and high-throughput autonomous (self-driving) experimental approaches.