Strategies for Constructing Near-Infrared Spectral Libraries for the Identification of Drug Substances

Abstract
The suitability of near-infrared reflectance analysis (NIRA) as a noninvasive method for “real-time” spectral verification of drug substances is evaluated. Strategies for developing and optimizing reference libraries by different mathematical means are discussed. To compare data preprocessing techniques, factor-based libraries were constructed from raw and second-derivative spectra of 17 benzodiazepines collected over the wavelength range from 1100 to 2500 nm. Validation of the models by predicting the identity of test set samples with the correlation coefficient showed that the second-derivative transformation dramatically enhanced the library's selectivity: recognition rates increased from 25% to 100%. Two pattern recognition methods, correlation coefficient and distance, were applied to confirm the identity of 117 drugs by using libraries based on full-range second-derivative spectra. A recognition rate of 99.2% was obtained from a factor-based library with the use of the correlation coefficient. Even structural analogues could be reliably classified among highly dissimilar drugs. The construction of sub-libraries consisting solely of similar drugs offered no advantages. Considerable data reduction by principal component analysis (PCA) made NIRA a rapid and effective method of analysis providing a high degree of reliability.
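
As a minimal sketch of the workflow summarized in the abstract (second-derivative preprocessing, data reduction by PCA, and identification by correlation coefficient or distance), the following Python example uses synthetic Gaussian bands in place of measured spectra. The function names, the finite-difference derivative, the number of factors, and the choice to evaluate both criteria on PCA scores are illustrative assumptions, not the procedure or software used in the study.

```python
import numpy as np

def second_derivative(spectra):
    # Finite-difference second derivative along the wavelength axis
    # (in units of grid index). It cancels constant offsets and linear
    # baseline slopes. Assumption: a smoothing derivative filter would
    # normally be preferred for noisy measured spectra.
    return np.gradient(np.gradient(spectra, axis=1), axis=1)

def build_library(reference_spectra, n_factors):
    # Factor-based library: PCA via SVD of the mean-centered
    # second-derivative reference spectra.
    d2 = second_derivative(reference_spectra)
    mean = d2.mean(axis=0)
    _, _, vt = np.linalg.svd(d2 - mean, full_matrices=False)
    loadings = vt[:n_factors]            # shape (n_factors, n_wavelengths)
    scores = (d2 - mean) @ loadings.T    # shape (n_substances, n_factors)
    return mean, loadings, scores

def identify(spectrum, library):
    # Project an unknown spectrum onto the factors and compare its scores
    # with every library entry using both criteria.
    mean, loadings, scores = library
    d2 = second_derivative(spectrum[np.newaxis, :])[0]
    s = (d2 - mean) @ loadings.T
    corr = np.array([np.corrcoef(s, ref)[0, 1] for ref in scores])
    dist = np.linalg.norm(scores - s, axis=1)
    # Best match by highest correlation and by smallest Euclidean distance.
    return int(np.argmax(corr)), int(np.argmin(dist))

# Synthetic stand-ins for reference spectra (hypothetical data):
# six "substances", each the sum of three Gaussian bands.
rng = np.random.default_rng(0)
wavelengths = np.arange(1100.0, 2500.0, 2.0)   # nm
centers = rng.uniform(1200.0, 2400.0, size=(6, 3))
reference = np.array([
    np.exp(-0.5 * ((wavelengths[:, None] - c) / 30.0) ** 2).sum(axis=1)
    for c in centers
])
library = build_library(reference, n_factors=5)

# "Unknown" samples: the same substances on an offset, sloping baseline.
baseline = 0.2 + 1e-4 * (wavelengths - wavelengths[0])
results = [identify(spec + baseline, library) for spec in reference]
print(results)  # one (correlation match, distance match) pair per sample
```

Because the added baseline is linear in wavelength, the second derivative removes it, so each synthetic sample is expected to match its own library entry under both criteria. Measured spectra also contain noise and scatter effects, which this sketch does not model.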