Xiangying Meng1,2*, Zixi Jia3, Xiaofeng Liu4, Gaowu Qin2*
1. College of Sciences; 2. Key Laboratory for Anisotropy and Texture of Materials (Ministry of Education), School of Materials Science and Engineering; 3. Faculty of Robot Science and Engineering; 4. Information Construction and Network Security Office, Northeastern University, Shenyang 110819, China
Abstract: Constructing conformation-performance relationships (CPRs) is the core issue in efficient material design. In this paper, we propose a generalized low-dimensional CPRs (GLD-CPRs) strategy for material design based on data processing technology. The strategy narrows the search space and thus accelerates materials discovery. From a purely computational perspective, it comprises three parts: data generation, data management, and data utilization. Within these parts, the critical issues of data processing technology, including correlation analysis, attribute construction, data cleaning and classification, weight evaluation, and regression model training, are illustrated from the viewpoint of materials science, providing a general solution for efficient, target-oriented material design. The solution is demonstrated with a case study of semiconductor band gap engineering. Based on more than 20,000 items of semiconductor band gap and attribute data, the GLD-CPRs strategy generates a set of CPRs for regulating the band gap of an arbitrary semiconductor of the same kind. Comparison with reported studies confirms the reliability of the strategy. Finally, the future improvements needed by GLD-CPRs are addressed. This work highlights the construction of CPRs using materials informatics methods and adds depth and breadth to our understanding of how material genes affect performance. Addressing the challenges of GLD-CPRs is expected to broaden the research field of materials science.
Keywords: Machine learning; Material design; Band gap engineering; Material database; High throughput calculations

Figure 1. Implementation of the GLD-CPRs strategy in band gap design.

Figure 2. Weight evaluation of band gap genes for binary semiconductors.

Figure 3. Regression model prediction of band gap changes with composition in rutile-TiO2.
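Graduated from the Technical Institute of Physics and Chemistry, Chinese Academy of Sciences; recruited talent at Northeastern University, visiting scholar at the California Institute of Technology, professor, and doctoral supervisor. Long engaged in computational materials science, building a theoretical framework for material "structure-performance" relationships at the electronic and atomic scales through first-principles high-throughput calculations combined with intelligent algorithms. Has published more than 30 SCI papers as first (corresponding) author in journals including Energy Environ. Sci., Chem. Mater., J. Phys. Chem. C, and Appl. Phys. Lett., holds 3 software copyrights, and has led more than 10 national and provincial/ministerial-level projects, including a National Natural Science Foundation of China (Young Scientists) grant, an 863 Program sub-project, and a National Key R&D Program sub-project.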
Email: Xiangying Meng (x_y_meng@mail.neu.edu.cn); Gaowu Qin (qingw@smm.neu.edu.cn)