Analytical Sciences


Abstract − Analytical Sciences, 24(5), 647 (2008).

Random Subspace Regression Ensemble for Near-Infrared Spectroscopic Calibration of Tobacco Samples
Chao TAN,*,** Menglong LI,* and Xin QIN**
*College of Chemistry, Sichuan University, Chengdu, Sichuan 610064, People's Republic of China
**Department of Chemistry and Chemical Engineering, Yibin University, Yibin, Sichuan 644007, People's Republic of China
An ensemble, a model-independent technique that combines several models for classification or regression tasks, can achieve an accuracy that is often unattainable with a single model. Such combinations have gained increasing attention in many fields. This paper proposes a random subspace (RS)-based regression ensemble as an alternative method for near-infrared (NIR) spectroscopic calibration of tobacco samples. Because each random subspace contains considerably fewer variables than the full spectrum, multiple linear regression (MLR) is used as the base algorithm, and the method is therefore referred to as RS-MLR. The overall performance of the proposed RS-MLR method is compared with that of partial least squares regression (PLSR), kernel principal component regression (KPCR), and kernel partial least squares regression (KPLSR). The results reveal that the RS-MLR method is not only conceptually simple but also produces a more parsimonious and more accurate calibration model than PLSR, KPCR, and KPLSR, at a lower computational cost. We also found that the RS-MLR method is well suited to so-called small-sample problems and that calibration models built by RS-MLR are less prone to overfitting.
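The random subspace idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the member predictions are combined, how many members are used, or how large each subspace is, so the simple averaging and the names `fit_rs_mlr`, `predict_rs_mlr`, `n_models`, and `subspace_size` are assumptions chosen for illustration, and the data are synthetic.

```python
import numpy as np


def fit_rs_mlr(X, y, n_models=50, subspace_size=5, seed=0):
    """Fit an ensemble of MLR models, each on a random subset of variables.

    Hypothetical helper: n_models and subspace_size are illustrative
    defaults, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_vars = X.shape
    members = []
    for _ in range(n_models):
        # Draw a random subspace: a subset of variables without replacement.
        idx = rng.choice(n_vars, size=subspace_size, replace=False)
        # Ordinary least squares with an intercept on the selected variables.
        design = np.column_stack([np.ones(n_samples), X[:, idx]])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        members.append((idx, coef))
    return members


def predict_rs_mlr(members, X):
    """Combine member predictions by simple averaging (an assumption)."""
    preds = [coef[0] + X[:, idx] @ coef[1:] for idx, coef in members]
    return np.mean(preds, axis=0)


# Synthetic stand-in for spectra: 40 samples, 200 variables.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(40, 200))
y_demo = X_demo[:, :10].sum(axis=1) + 0.1 * rng.normal(size=40)

ensemble = fit_rs_mlr(X_demo, y_demo, n_models=100, subspace_size=8)
y_hat = predict_rs_mlr(ensemble, X_demo)
```

Keeping `subspace_size` well below the number of calibration samples is what makes plain least squares usable as the base learner here: each member solves a small, well-posed regression even though the full variable set is far larger than the sample count.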