787. A Framework for Selecting Machine Learning Models for Real-Time Process Monitoring with Inline FTIR, UV-Vis, and NMR Spectroscopy
An Su, Yingying Cheng, Jingyi Wu, Yuxi Zhan, Jiahui Li, Hongliang Duan, Yuanyuan Xie, Guijun Zhang, Weike Su, ChemRxiv (2026), 10.26434/chemrxiv.10001876/v1
The integration of real-time process analytical technology (PAT) with machine learning is critical for advancing automated chemical manufacturing. However, a systematic framework for selecting the optimal analytical model for a given spectroscopic technique is currently lacking. This study addresses this gap by establishing and validating a reaction, instrument, and model matching strategy. We systematically evaluated twelve machine learning models, including linear, tree-based, and neural network architectures, across three complementary inline spectroscopic techniques: Fourier Transform Infrared (FTIR), Ultraviolet-Visible (UV-Vis), and Nuclear Magnetic Resonance (NMR) spectroscopy. Our results demonstrate that the best-performing models are those whose algorithmic strengths align with the inherent physical properties of the spectral data. For the high-dimensional, collinear data from FTIR, dimensionality-reduction models such as Partial Least Squares (PLS) excelled. For the sparse, linear data from UV-Vis, both linear models and a two-dimensional Convolutional Neural Network (CNN) acting as a peak detector achieved near-perfect predictions. For the structurally rich, high-resolution NMR data, regularization methods and advanced neural networks proved most effective. This work provides a clear, interpretable, and transferable framework for developing robust, data-driven quantitative methods, paving the way for more reliable and intelligent process control in the chemical and pharmaceutical industries.
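The model-matching idea in the abstract can be illustrated with a minimal sketch: fit several candidate models to the same spectra and keep the one with the lowest held-out error. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic two-band "FTIR-like" spectra, the function names (`simulate_spectra`, `fit_pls1`, `fit_ridge`, `compare_models`), the component count, the ridge penalty, and the 40/40 train/test split. The PLS routine is a plain NIPALS-style PLS1, not the authors' implementation.

```python
import numpy as np

def simulate_spectra(n_samples, n_channels=200, noise=0.01, seed=0):
    """Synthetic collinear spectra: two overlapping Gaussian bands mixed linearly."""
    rng = np.random.default_rng(seed)
    axis = np.arange(n_channels)

    def band(center, width):
        return np.exp(-0.5 * ((axis - center) / width) ** 2)

    pure = np.vstack([band(90, 12), band(110, 12)])  # analyte, interferent
    conc = rng.uniform(0.0, 1.0, size=(n_samples, 2))
    X = conc @ pure + rng.normal(0.0, noise, size=(n_samples, n_channels))
    return X, conc[:, 0]  # target: analyte concentration

def fit_pls1(X, y, n_components=2):
    """NIPALS-style PLS1; returns a prediction function."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)          # weight vector
        t = Xk @ w                      # scores
        p = Xk.T @ t / (t @ t)          # X loadings
        q_k = (yk @ t) / (t @ t)        # y loading
        Xk = Xk - np.outer(t, p)        # deflate X
        yk = yk - q_k * t               # deflate y
        W.append(w); P.append(p); q.append(q_k)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return lambda X_new: (X_new - x_mean) @ coef + y_mean

def fit_ridge(X, y, alpha=1e-3):
    """Ridge regression solved in dual form (cheap when channels >> samples)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    dual = np.linalg.solve(Xc @ Xc.T + alpha * np.eye(len(y)), y - y_mean)
    coef = Xc.T @ dual
    return lambda X_new: (X_new - x_mean) @ coef + y_mean

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def compare_models(candidates, X_train, y_train, X_test, y_test):
    """Fit each candidate on the training set and score it on the held-out set."""
    return {name: rmse(y_test, fit(X_train, y_train)(X_test))
            for name, fit in candidates.items()}

X, y = simulate_spectra(80)
X_train, y_train, X_test, y_test = X[:40], y[:40], X[40:], y[40:]
scores = compare_models({"pls": fit_pls1, "ridge": fit_ridge},
                        X_train, y_train, X_test, y_test)
best = min(scores, key=scores.get)  # candidate with the lowest held-out RMSE
print(scores, "-> selected:", best)
```

On real inline data the candidate list, preprocessing, and validation scheme would need to reflect the instrument and reaction in question; the sketch only shows the selection loop itself.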