MGE Database and Application in Materials Science
Yanjing Su*
Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing 100083, China
University of Science and Technology Beijing, Beijing 100083, China
ABSTRACT: A Materials Genome Engineering (MGE) database should support flexible
storage and retrieval of complex, heterogeneous materials data. It should also
serve high-throughput calculations and experiments by automatically collecting
and archiving large volumes of data and by supporting data analysis and mining
applications. A novel MGE database system has been developed based on dynamic
data containers and NoSQL database technologies. More than 10 field types,
including string, file, image, table, and container, are provided so that users
can build personalized data templates. Users define their own data schemas by
dragging and dropping components in a graphical user interface, which allows
data to be described flexibly and exchanged conveniently. A high-throughput
first-principles computational tool has been developed that generates and
submits high-throughput calculation jobs and automatically processes and
archives the results. A materials data mining platform and an automatic
feature-parameter screening system have been developed with more than ten basic
algorithm toolboxes, including pattern recognition, support vector machines,
and artificial neural networks. They also include several materials-oriented
toolboxes: machine-learning-based performance optimization design and phase
classification of multi-component alloys, symbolic regression of materials data
based on genetic programming, and materials text mining. Through system
integration, these components form a scalable MGE database system framework
(www.MGEdata.cn) that integrates data collection, database, data mining, and
materials design. This talk introduces the main progress in the research and
development of database technology based on the MGE concept and the main
functions of the MGEdata database.
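The idea of a user-defined data template built from typed fields and nested containers can be illustrated with a minimal Python sketch. This is not the MGEdata implementation or its API: the `Field` class, the `validate` function, the `number` field type, and the sample record are all hypothetical and are introduced only to show how a nested template might constrain a document-style record of the kind a NoSQL store typically holds.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical primitive field types. The abstract names string, file and
# image; "number" is an assumption added for this illustration.
PRIMITIVE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "file": lambda v: isinstance(v, str),   # assumed to be stored as a path or URL
    "image": lambda v: isinstance(v, str),  # assumed to be stored as a path or URL
}


@dataclass
class Field:
    """One component of a data template (hypothetical structure)."""
    name: str
    type: str
    # Sub-fields of a container, or column definitions of a table.
    children: List["Field"] = field(default_factory=list)


def validate(template: List[Field], record: Dict[str, Any], path: str = "") -> List[str]:
    """Return error messages; an empty list means the record fits the template."""
    errors: List[str] = []
    for f in template:
        where = f"{path}{f.name}"
        if f.name not in record:
            errors.append(f"{where}: missing")
            continue
        value = record[f.name]
        if f.type == "container":
            # A container holds a nested record described by its children.
            if isinstance(value, dict):
                errors.extend(validate(f.children, value, where + "."))
            else:
                errors.append(f"{where}: expected a nested object")
        elif f.type == "table":
            # A table is a list of rows; each row follows the column fields.
            if not isinstance(value, list):
                errors.append(f"{where}: expected a list of rows")
                continue
            for i, row in enumerate(value):
                if isinstance(row, dict):
                    errors.extend(validate(f.children, row, f"{where}[{i}]."))
                else:
                    errors.append(f"{where}[{i}]: expected a row object")
        elif f.type in PRIMITIVE_CHECKS:
            if not PRIMITIVE_CHECKS[f.type](value):
                errors.append(f"{where}: expected {f.type}")
        else:
            errors.append(f"{where}: unknown field type {f.type!r}")
    return errors


# Illustrative template and record; all names and values are made up.
template = [
    Field("composition", "string"),
    Field("micrograph", "image"),
    Field("tensile_test", "container", [
        Field("temperature_K", "number"),
        Field("curve", "table", [Field("strain", "number"),
                                 Field("stress_MPa", "number")]),
    ]),
]

record = {
    "composition": "Al0.5CoCrFeNi",
    "micrograph": "images/sample_01.png",
    "tensile_test": {
        "temperature_K": 298,
        "curve": [{"strain": 0.0, "stress_MPa": 0.0},
                  {"strain": 0.01, "stress_MPa": "n/a"}],
    },
}

errors = validate(template, record)
print(errors)
```

Because containers and tables carry their own child fields, a template of this kind can be nested to arbitrary depth without changing the validation logic, which is one plausible reading of what "dynamic containers" offer over a fixed relational schema.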
Keywords: Materials Genome Engineering; database; high-throughput computation; machine learning
Yanjing Su is a professor in the Department of Material Physics, University of Science and Technology Beijing (USTB). He received his Ph.D. from USTB, in association with the Hong Kong University of Science and Technology, in 2000. He then worked at the National Institute of Materials Japan as a JSPA fellow. He has published ~150 journal papers, including in Nano Lett., Acta Mater., and npj Comput. Mater. His research interests in materials physics and chemistry include (1) computer simulations, including DFT, MD and EF; (2) nanomechanical characterization of nanomaterials; (3) deformation and fracture under mechanical load and chemical environments; and (4) materials informatics, including materials databases and materials data science.