EXTENDED ABSTRACT: Retrosynthesis is at the core of organic chemistry and process design. The rapid development of artificial intelligence (AI) [1,2,3] has recently produced a range of novel machine learning methods for comprehensive data-driven planning. However, capturing chemical process design knowledge with a data-driven approach for practical synthesis planning has not yet been adequately achieved and remains a challenging problem. Model performance still leaves room for improvement for process design purposes, and important factors such as cost, yield, and experimental time are rarely considered in current retrosynthesis planning frameworks. In this work, we propose a new data-driven framework, ChemPro, which incorporates design parameters such as ligand price score, ASScore, SCScore, and conditional score into one-step and multi-step retrosynthesis predictions. Specifically, we design and train the data-driven model and use Bayesian optimization to adjust the design parameters so as to optimize the ranking of the predictions. The experimental results show that ChemPro successfully predicts synthetic routes based on the specified process design parameters, and that routes predicted with the design parameters are preferred over those predicted without them. The new framework significantly improves one-step models on the standard USPTO-50k benchmark dataset, achieving 92.21% Top-10 accuracy. The framework is available at https://chempro.hourstec.com/.
Keywords: computer-aided synthesis planning, retrosynthesis, artificial intelligence
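The re-ranking idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not ChemPro's actual implementation: the feature names, the weighted-sum form of the composite score, the convention that every score is scaled so that higher is better, and the toy data. Plain random search stands in for Bayesian optimization to keep the example dependency-free.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical feature names. Assumption: every score is pre-scaled so that
# higher means "more desirable"; the abstract does not specify the scaling.
FEATURES = ("model_score", "ligand_price_score", "as_score",
            "sc_score", "condition_score")


@dataclass
class Candidate:
    reactants: str             # SMILES of the proposed precursors
    scores: Dict[str, float]   # one value per name in FEATURES
    is_truth: bool             # True if it matches the recorded reaction


def composite(cand: Candidate, weights: Dict[str, float]) -> float:
    """Weighted sum of the candidate's scores (assumed functional form)."""
    return sum(weights[f] * cand.scores[f] for f in FEATURES)


def rerank(cands: List[Candidate], weights: Dict[str, float]) -> List[Candidate]:
    """Sort candidates from highest to lowest composite score."""
    return sorted(cands, key=lambda c: composite(c, weights), reverse=True)


def top_k_accuracy(queries: List[List[Candidate]],
                   weights: Dict[str, float], k: int) -> float:
    """Fraction of queries whose true precursors appear in the top k."""
    hits = sum(any(c.is_truth for c in rerank(cands, weights)[:k])
               for cands in queries)
    return hits / len(queries)


def random_search(queries: List[List[Candidate]], k: int,
                  trials: int = 200, seed: int = 0
                  ) -> Tuple[Dict[str, float], float]:
    """Stand-in for Bayesian optimization: sample weights, keep the best."""
    rng = random.Random(seed)
    # Start from the baseline that ranks by the model score alone.
    best_w = {f: (1.0 if f == "model_score" else 0.0) for f in FEATURES}
    best_acc = top_k_accuracy(queries, best_w, k)
    for _ in range(trials):
        w = {f: rng.uniform(0.0, 1.0) for f in FEATURES}
        acc = top_k_accuracy(queries, w, k)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


def toy_scores(model: float, price: float) -> Dict[str, float]:
    """Made-up scores; the three remaining features are held constant."""
    return {"model_score": model, "ligand_price_score": price,
            "as_score": 0.5, "sc_score": 0.5, "condition_score": 0.5}


# One made-up query: the model alone prefers the wrong candidate.
toy_queries = [[
    Candidate("CCO.CC(=O)Cl", toy_scores(0.9, 0.2), False),
    Candidate("CCO.CC(=O)O", toy_scores(0.8, 0.9), True),
]]
```

Calling `random_search(toy_queries, k=1)` returns a weight dictionary and the Top-1 accuracy it reaches on the toy data. Because the search starts from the model-only baseline, the returned accuracy is never lower than that baseline. A Bayesian optimizer would replace the sampling loop while keeping the same objective.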
REFERENCES:
[1] Segler MH, Preuss M, Waller MP (2018) Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555:604–610
[2] Coley CW, Rogers L, Green WH, Jensen KF (2018) SCScore: synthetic complexity learned from a reaction corpus. J Chem Inf Model 58:252–261
[3] Lan Z, Zeng Z, Hong B, Liu Z, Ma F (2023) RCsearcher: Reaction Center Identification in Retrosynthesis via Deep Q-Learning. arXiv e-prints, arXiv-2301
Zuo Zeng is the founder and CEO of HOURS technology. He completed a Master's degree at Carnegie Mellon University in 2015 and a PhD at Auburn University in 2020, followed by a postdoctoral appointment at Georgia Institute of Technology in 2021. His work focuses on data-driven modelling for chemical synthesis process design. He has published over 30 papers in these fields and is the recipient of the PSE2018 Young Researcher Award (2018), Suzhou Leading Talents (2021), and Jiangsu Double Creation Plan (2023).