S-1-11 A Cloud-based Platform and Infrastructure Integrating Materials Simulation, Data, HPC, and AI

Xiaoyu Yang1*, Xinjie Ma2, Lifang Xu2, Xiaoneng Ran2, Zhongxin Yu2, Zhiyao Bo2, Bo Liu2

1 Computer Network Information Center, Chinese Academy of Sciences

2 Beijing Maigao Matcloud Technology Co. Ltd

 

ABSTRACT: A key challenge in applying artificial intelligence to the design of photoelectron catalysis materials is the lack of data. Quantum mechanical simulation can produce materials data, but the practical difficulties of running it remain a barrier for many users. For example, users must be familiar with Linux and must spend time finding computational and storage resources. Once a simulation completes, the core materials data must be extracted from the simulation results, and storing the extracted data in a database for sharing and proper management requires further work.

To address this need, we developed MatCloud, a cloud-based computational infrastructure for the integrated management of materials simulation, data, and computing resources. It is directly connected to a computing cluster and a materials simulation database, and it integrates computing facilities, data, scripts, and simulation codes to automatically manage the creation and running of simulation jobs, the subsequent extraction of core output information, and the longer-term archival of materials property data. One of the important novelties of MatCloud is a graphical user interface that lets end users create customised workflows for running materials simulations. Once a simulation completes, the required materials properties are extracted and preserved in the materials property database. The more MatCloud is used, the more simulation data it accumulates. To use licensed software such as VASP, users must provide a license. Users who do not want their data to be public can set it to be not searchable by others.
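The job lifecycle described above (create and run a job, extract the core output, archive it, and optionally hide it from public search) can be illustrated with a minimal sketch. This is not MatCloud's implementation or API: every name here (Job, run_simulation, extract_properties, archive, public_search) is hypothetical, the "simulation" is a placeholder that returns a made-up energy, and an in-memory SQLite table stands in for the materials property database.

```python
import re
import sqlite3
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Job:
    """Hypothetical description of one simulation job."""
    structure: str  # e.g. a chemical formula
    code: str       # name of the simulation code to run


def make_database() -> sqlite3.Connection:
    """Create an in-memory stand-in for a materials property database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE properties "
        "(structure TEXT, name TEXT, value REAL, searchable INTEGER)"
    )
    return conn


def run_simulation(job: Job) -> str:
    """Placeholder for submitting a job to a cluster; returns raw output text.

    The energy is a dummy number derived from the formula length, not physics.
    """
    energy = -1.5 * len(job.structure)
    return (
        f"job for {job.structure} using {job.code}\n"
        f"TOTAL ENERGY = {energy:.3f} eV\n"
        "done\n"
    )


def extract_properties(raw_output: str) -> Dict[str, float]:
    """Pull the core property out of the raw simulation output."""
    match = re.search(r"TOTAL ENERGY\s*=\s*(-?\d+\.\d+)", raw_output)
    if match is None:
        raise ValueError("no total energy found in simulation output")
    return {"total_energy_eV": float(match.group(1))}


def archive(conn: sqlite3.Connection, job: Job,
            props: Dict[str, float], searchable: bool) -> None:
    """Store the extracted properties, recording the owner's visibility choice."""
    for name, value in props.items():
        conn.execute(
            "INSERT INTO properties VALUES (?, ?, ?, ?)",
            (job.structure, name, value, int(searchable)),
        )
    conn.commit()


def run_workflow(conn: sqlite3.Connection, jobs: List[Job],
                 searchable: bool = True) -> None:
    """Run, extract, and archive each job in turn."""
    for job in jobs:
        raw = run_simulation(job)
        props = extract_properties(raw)
        archive(conn, job, props, searchable)


def public_search(conn: sqlite3.Connection,
                  structure: str) -> List[Tuple[str, float]]:
    """Return only the properties whose owners left them searchable."""
    rows = conn.execute(
        "SELECT name, value FROM properties "
        "WHERE structure = ? AND searchable = 1",
        (structure,),
    )
    return rows.fetchall()
```

In this sketch, a job archived with searchable=False is still stored but is omitted from public_search results, which mirrors the option of keeping data private.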

This talk illustrates the challenges of high-throughput simulation, the development of MatCloud, and how MatCloud supports the integrated management of materials simulation, data, and computing resources.

 

Keywords: High-throughput Simulation Infrastructure; MatCloud+; Materials Informatics; Integration of Materials Simulation, Data, HPC and AI.

Brief Introduction of Speaker
Xiaoyu Yang

Prof. Xiaoyu Yang currently works at the Computer Network Information Center, Chinese Academy of Sciences (CAS). He joined CAS in 2012 and was awarded the "100 Talent Program" fellowship of the Chinese Academy of Sciences. Prof. Yang's research currently focuses on the Materials Genome Initiative, including materials informatics, materials genome initiative informatics, and materials simulation and data infrastructure. Prof. Yang completed his post-doctoral research at the University of Cambridge, UK, in 2008, where he was a Research Associate in the Department of Earth Sciences and an affiliated software engineer in the Cambridge e-Science Centre. He joined the School of Electronics and Computer Science, University of Southampton, UK, in 2008 as a Research Engineer. In 2010, he worked in the Reading e-Science Center, University of Reading. Prof. Yang earned an MSc degree in IT (2001) and a PhD degree in Systems Engineering (2006) from the Faculty of Computing Science and Engineering at De Montfort University, UK.