Matgen: A Material Design Platform
Integrating High-throughput Calculation, Automated Workflow and Repository
Pin Chen, Hui Yan, Sen Gao, Qing Mo, Zexin Xu, Yu Wang,
GeChuanqi Pan, Han Chen & Yunfei Du*
National Supercomputer Center in Guangzhou,
School of Data and Computer Science, Sun Yat-sen University, Guangzhou, 510006,
China
ABSTRACT: The goal of “Materials Genetic Engineering” is to establish a new
research and development (R&D) method based on “theoretical prediction and
experimental verification”. This method integrates high-throughput calculation,
experiments, and data science, and plays an important role in the discovery of
novel materials. Supported by the Key-Area Research and Development Program of
Guangdong Province, we have independently developed a new materials platform,
named Matgen, that integrates high-throughput calculation, automated workflow,
and a repository to support front-end modeling, material property calculation,
data processing, etc. (available at https://matgen.nscc-gz.cn/).
The main features of Matgen are as follows.
1. Matgen is delivered as an easy-to-access web service. The calculation,
workflow, and repository modules are tightly integrated. In addition, an
extensible microservice design is used, so each module is developed and
maintained independently.
2. The calculation module contains platform support software and scientific
computing software. The former includes the repository construction software
Matgen-toolkit, the data access and analysis software Matgen-API, and the
online visualization software 3DStructGen. The scientific computing software
currently supported includes VASP, CP2K, Quantum ESPRESSO, RASPA, etc.
3. The workflow module uses in-house software, named Matflow, for task
submission, monitoring, result analysis, and visualization. As a result, data
can be generated in batches with little user interaction.
4. The repository module currently contains data on molten salts, porous
materials, polymers, biological materials, and other materials, added as
applications are developed. The total stored data exceeds 90 TB, and the
volume is still increasing. The core data include 94,000 deduplicated
experimental crystal structures, and calculated electrical, optical, magnetic,
and elastic properties for 67,000 experimental crystal structures.
5. Multi-level access control. Ordinary (unregistered) users can query and
view data, and registered users can query, view, and obtain data. In addition,
registered users who have linked a Tianhe-2 supercomputer account can use
calculation input files and workflow calculation templates to submit jobs.
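The access levels in feature 5 can be read as a small permission table. The following is only an illustrative sketch of that reading, not Matgen's actual implementation: the names Role, Action, PERMISSIONS, and is_allowed are hypothetical, and treating "obtain data" as a download action is an assumption.

```python
from enum import Enum, auto


class Role(Enum):
    """Hypothetical user levels inferred from feature 5."""
    VISITOR = auto()         # ordinary user, not registered
    REGISTERED = auto()      # registered, no Tianhe-2 account linked
    REGISTERED_HPC = auto()  # registered, Tianhe-2 account linked


class Action(Enum):
    """Hypothetical actions a user may request."""
    QUERY = auto()
    VIEW = auto()
    DOWNLOAD = auto()        # assumed meaning of "obtain data"
    SUBMIT_JOB = auto()


# Assumed permission table; each level extends the previous one.
PERMISSIONS = {
    Role.VISITOR: {Action.QUERY, Action.VIEW},
    Role.REGISTERED: {Action.QUERY, Action.VIEW, Action.DOWNLOAD},
    Role.REGISTERED_HPC: {
        Action.QUERY,
        Action.VIEW,
        Action.DOWNLOAD,
        Action.SUBMIT_JOB,
    },
}


def is_allowed(role: Role, action: Action) -> bool:
    """Return True if the given role may perform the given action."""
    return action in PERMISSIONS[role]
```

Under these assumptions, is_allowed(Role.REGISTERED, Action.SUBMIT_JOB) is False until a Tianhe-2 account is linked, which matches the condition stated in feature 5.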
Keywords: Platform, High-throughput calcula
Professor Yunfei Du is a doctoral supervisor in the School of Data and Computer Science, Sun Yat-sen University (SYSU), and a graduate of the National University of Defense Technology (NUDT). He has long been engaged in the development of system software for domestic high-performance computers (HPC). As a core technical contributor, he participated in the development of two generations of Tianhe supercomputers (Tianhe-1, Tianhe-1A, Tianhe-2). As the chief engineer of the National Supercomputer Center in Guangzhou, he is fully responsible for the construction and application of the Tianhe-2 system. Under his direction, application software platforms for astrophysics, atmosphere and ocean, biomedicine, and industrial design and manufacturing have been constructed, the Tianhe Starlight platform has been developed and put into production use, and a cloud computing and big data platform based on supercomputing has been established. He has led a series of projects funded by the national science foundation and by the Department of Science and Technology of Guangdong Province. He has published more than 50 papers and holds 16 patents, and he has given talks at major international supercomputing conferences. He is a member of the HPC Committee of the China Computer Federation (CCF) and the Deputy Chairman of the Supercomputing Committee of the Guangdong Computer Society.