4-10. Tensile properties prediction for reduced activation steel based on feature engineering

Chi Zhang1*, Chenchong Wang2, Chunguang Shen2

1. Department of Materials Science and Engineering, School of Materials, Tsinghua University

2. State Key Laboratory of Rolling Technology and Continuous Rolling Automation, Northeastern University

Abstract: With the development of materials genome engineering and integrated computational materials design, machine learning and artificial intelligence algorithms are attracting growing attention from materials researchers as a way to overcome the limitations of traditional physical metallurgy models, which are constrained by their underlying mechanisms and sensitive to their parameters. Supervised and unsupervised algorithms are widely used to predict the performance, composition and structure of candidate materials, to identify new phases, and to study phase transformations. Although training most machine learning models takes considerable time, a trained model offers high prediction accuracy, a wide range of application, and calculation speeds several orders of magnitude faster than traditional methods. Supervised machine learning algorithms applied to materials include random forest (RF), support vector regression (SVR), and multi-layer perceptron artificial neural networks (ANN).

In this work, random forest was used to predict the performance of reduced activation ferritic/martensitic (RAFM) steels. Accurate prediction of tensile properties is important for the service life assessment and alloy design of RAFM steels, and the calculated process window that balances strength and plasticity can guide their further design and development. Machine learning was used to establish general models for predicting the yield strength and total elongation of RAFM steels. A database covering a wide range of compositions and treatment processes was first established. Feature engineering methods were then used to select highly correlated features. With a reasonable choice of machine learning algorithm and training/test set partitioning strategy, random forest regressors were trained on the selected features. The results showed that, compared with traditional physical metallurgy models, the feature-engineering-guided random forest regressors achieved higher accuracy and generality in predicting the yield strength and total elongation of RAFM steels.
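The train-then-predict step described above can be illustrated with a minimal, dependency-free sketch of random forest regression (bootstrap sampling, a random feature subset at each split, and averaging over trees). Everything in it is an assumption made for illustration: the three input features, the synthetic data and its coefficients, the hyperparameters, and the function names are hypothetical and are not taken from the authors' database or code. In practice a library implementation such as scikit-learn's RandomForestRegressor would normally be used instead.

```python
import random
from math import sqrt
from statistics import fmean


def sse(values):
    """Sum of squared errors around the mean (the split criterion)."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(X, y, rng, max_features, depth=0, max_depth=5, min_leaf=3):
    """Grow one regression tree; every split looks at a random feature subset."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return fmean(y)  # leaf node: predict the mean target
    best = None
    for j in rng.sample(range(len(X[0])), max_features):
        for t in sorted({row[j] for row in X})[1:]:
            left = [i for i, row in enumerate(X) if row[j] < t]
            right = [i for i, row in enumerate(X) if row[j] >= t]
            if min(len(left), len(right)) < min_leaf:
                continue
            score = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:
        return fmean(y)  # no admissible split: fall back to a leaf
    _, j, t, left, right = best
    children = [
        fit_tree([X[i] for i in side], [y[i] for i in side], rng,
                 max_features, depth + 1, max_depth, min_leaf)
        for side in (left, right)
    ]
    return (j, t, children[0], children[1])


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] < t else right
    return node


def fit_forest(X, y, n_trees=30, max_features=2, seed=0):
    """Train each tree on a bootstrap sample of the training set."""
    rng = random.Random(seed)
    n = len(y)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(fit_tree([X[i] for i in idx], [y[i] for i in idx],
                               rng, max_features))
    return forest


def predict_forest(forest, row):
    """The forest prediction is the average of the tree predictions."""
    return fmean([predict_tree(tree, row) for tree in forest])


def rmse(pred, true):
    return sqrt(fmean([(p - t) ** 2 for p, t in zip(pred, true)]))


# Synthetic stand-in data (NOT real RAFM measurements): hypothetical features
# [carbon wt%, tempering temperature, test temperature] and an invented
# linear "yield strength" with Gaussian noise.
data_rng = random.Random(42)
X, y = [], []
for _ in range(120):
    carbon = data_rng.uniform(0.08, 0.15)
    temper = data_rng.uniform(650, 780)
    test_t = data_rng.uniform(20, 600)
    X.append([carbon, temper, test_t])
    y.append(500 + 1000 * carbon - 0.5 * (temper - 650) - 0.4 * test_t
             + data_rng.gauss(0, 10))

# Simple hold-out split; the rows are already in random order.
X_train, y_train = X[:90], y[:90]
X_test, y_test = X[90:], y[90:]

forest = fit_forest(X_train, y_train)
rmse_forest = rmse([predict_forest(forest, row) for row in X_test], y_test)
rmse_baseline = rmse([fmean(y_train)] * len(y_test), y_test)
print(f"forest RMSE: {rmse_forest:.1f}, mean-baseline RMSE: {rmse_baseline:.1f}")
```

Comparing the held-out error with a predict-the-mean baseline is only one simple sanity check; the abstract does not specify which partitioning strategy or error metric the authors used.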

This study is a preliminary attempt to improve the performance of machine learning algorithms for RAFM steels through feature engineering. Part of the tensile property prediction results is shown in Fig. 2. The main idea of feature engineering is to select and obtain the most useful features: not only removing useless features, as done in this work, but also introducing critical ones. The properties of steels depend critically on microstructure, yet most previous machine learning studies used only composition and processing parameters as input features. It is therefore reasonable to expect that performance will improve if more microstructure information is used as feature input. In future work, microstructure information (such as phase fraction, driving force, and growth rate) calculated by various thermodynamic simulation methods will be fed into the machine learning model as features. Building on the results of this work, feature-engineering-guided machine learning could be extended to other steels and other target properties.
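Removing weakly related features can be sketched as a simple correlation filter. The abstract does not state which selection criterion the authors used, so the Pearson-correlation ranking, the 0.3 threshold, the column names, and all numbers below are assumptions invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sx * sy)


def select_features(columns, target, threshold=0.3):
    """Keep features with |r| >= threshold, ordered from strongest to weakest."""
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    kept = [name for name in ranked if scores[name] >= threshold]
    return kept, scores


# Toy table with invented values (NOT real RAFM data).
columns = {
    "test_temperature_C": [20, 200, 300, 400, 500, 600],
    "chromium_wt_pct": [9.0, 8.5, 9.0, 8.0, 8.5, 8.0],
    "heat_number": [4, 1, 6, 2, 5, 3],  # bookkeeping label, expected to be dropped
}
yield_strength_MPa = [650, 600, 560, 520, 450, 330]

kept, scores = select_features(columns, yield_strength_MPa)
print(kept)
```

A linear filter like this only removes features; introducing new ones, such as calculated phase fractions, would mean adding further columns to the table before the filter is applied.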

Keywords: Performance prediction; Machine learning; Feature engineering

Fig. 1 Workflow for predicting the tensile properties of reduced activation steels based on machine learning


Fig. 2 Prediction results for the tensile properties of reduced activation steels based on machine learning




Prediction of the tensile properties of reduced activation steel based on feature engineering

Chi Zhang (张弛)1*, Chenchong Wang (王晨充)2, Chunguang Shen (沈春光)2

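1. Department of Materials Science and Engineering, School of Materials, Tsinghua University; 2. State Key Laboratory of Rolling Technology and Continuous Rolling Automation, Northeastern University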

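Abstract: With the development of integrated computational materials design and materials genome engineering, machine learning and artificial intelligence algorithms are receiving increasing attention from researchers in materials development as a way to overcome the constraints that physical mechanisms and parameter sensitivity impose on traditional physical metallurgy models. In recent years, supervised and unsupervised machine learning algorithms have been used more and more widely to calculate material properties, explore composition distributions, identify new material structures, discover quantum phases, and recognize phases and phase transformations. Although training most machine learning models is time-consuming, an established model offers high accuracy, wide applicability, and coverage of a large range of scales, and can be several orders of magnitude faster than traditional calculation methods. Supervised machine learning algorithms applied to materials include random forest, support vector regression, and multi-layer perceptron artificial neural networks.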

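In this study, the random forest algorithm was applied to predict the performance of reduced activation steel, providing a theoretical basis for its subsequent development. Accurate prediction of the tensile properties of RAFM steels is important for service life estimation and alloy design, and calculation of a process window that takes both strength and plasticity into account provides guidance for the design and development of RAFM steels. To go beyond the limitations of traditional physical metallurgy models, this study used machine learning algorithms to build prediction models for the yield strength and elongation of RAFM steels. A RAFM steel database covering a wide range of compositions and heat treatment processes was established, feature engineering methods were used to select highly correlated features, and random forest regressors were trained on these features. The prediction results show that, compared with traditional physical metallurgy models, feature-engineering-guided random forest regression achieves higher accuracy and better generality in predicting the yield strength and elongation of RAFM steels.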

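This study is a preliminary attempt to use feature engineering to improve the performance of machine learning algorithms in the field of RAFM steels; part of the tensile property prediction results is shown in Fig. 2. The main idea of feature engineering is to select and obtain the most useful features: this includes not only removing useless features, as shown in this study, but also introducing other critical features. The microstructure of a steel determines its properties, yet most current machine learning approaches consider only composition and processing as feature inputs, so it is reasonable to expect that feeding in more microstructure information will improve performance. In future research, microstructure information (including phase fraction, driving force, and growth rate) calculated by various thermodynamic and kinetic simulation methods will be used as feature input to machine learning models that predict steel properties. Feature-engineering-guided machine learning can therefore be extended, on the basis of this study, beyond the present steel grades and target properties to carry out further meaningful work.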

Keywords: Performance prediction; Machine learning; Feature engineering

Brief Introduction of Speaker
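Chi Zhang (张弛)

Work experience:

2010.03 - present: School of Materials, Tsinghua University; Deputy Party Secretary, responsible for faculty and student affairs

2007.03 - 2010.03: Department of Materials Science and Engineering, Tsinghua University; Director of the Academic Affairs Office, responsible for departmental teaching affairs

2004.07 - 2007.03: Department of Materials Science and Engineering, Tsinghua University; Head of the Student Affairs Group, responsible for undergraduate affairs

2001.10 - 2003.12: Department of Materials Science, Ibaraki University, Japan; postdoctoral researcher on the nano-metal project organized by the Japan Research and Development Center for Metals

1996.09 - 1999.07: Department of Materials Science and Engineering, Tsinghua University; counselor for the undergraduate class enrolled in 1996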

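2018.12 - present: Department of Materials Science and Engineering, Tsinghua University; Professor

2005.12 - 2018.12: Department of Materials Science and Engineering, Tsinghua University; Associate Professor

2004.01 - 2005.12: Department of Materials Science and Engineering, Tsinghua University; Assistant Professor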

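Research overview: Has carried out in-depth theoretical and experimental research on the mechanisms of solid-state phase transformations in metals and on the interfaces, structure and properties of precipitates in steels. Current research directions include reduced activation steels for fusion reactors, the evolution of microstructure and properties of heat-resistant steels during service, surface treatment technologies for steels, and the oxidation and corrosion behavior of high-temperature materials. Currently leads one National Natural Science Foundation of China project and has participated in several national 973 and 863 program projects. Has also led a number of domestic and international industry-sponsored collaborative projects and has published about 140 SCI-indexed papers.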

Email: chizhang@tsinghua.edu.cn