EXTENDED ABSTRACT: The research and development (R&D) of high-performance materials is an extremely tedious and time-consuming process. Advances in natural language processing (NLP) and text mining make it possible to automatically and efficiently mine valuable data and domain knowledge from the massive scientific literature. However, the complexity of entity representation and the ambiguity of semantic relations in materials science corpora make it challenging to apply NLP models across tasks. Focusing on NLP-assisted materials knowledge discovery, this presentation introduces our research group's solutions and implementation schemes for extracting specific materials knowledge using pipeline and joint methods. In particular, it explores the role of large language models (LLMs) throughout the knowledge extraction process, including how to use LLMs for data quality governance, addressing the poor data quality caused by inconsistencies in manual labeling, and how to use LLMs to handle complex tasks that rely heavily on diverse external knowledge, so as to promote multi-objective materials knowledge discovery. Finally, the presentation summarizes the contributions of LLMs to materials knowledge extraction and discusses future prospects.
Keywords: Large Language Model; Knowledge Extraction; Data Quality; Domain Knowledge; Materials R&D
Yue Liu received her B.S. and M.S. degrees in computer science from Jiangxi Normal University in 1997 and 2000, respectively. She completed her Ph.D. in control theory and control engineering at Shanghai University (SHU) in 2005. She has been with the School of Computer Engineering and Science at SHU since July 2000. During that time, she served as a curriculum R&D manager at the Sybase-SHU IT Institute of Sybase Inc. from July 2003 to July 2004 and was a visiting scholar at the University of Melbourne from Sep. 2012 to Sep. 2013. She is currently a professor at SHU. Her current research interests include data mining, machine learning, and AI for materials science.