Application of Feature Extraction Methods for Chemical Risk Classification in the Pharmaceutical Industry
Open Access
- 26 August 2021
- Vol. 21 (17), 5753
- https://doi.org/10.3390/s21175753
Abstract
The features used in the classification process are acquired from sensor data at the production site (associated with toxic and physicochemical properties) and from a cybersecurity dataset that may affect the chemical risk. These are large datasets, so it is important to reduce them. The author's motivation was to develop a method of assessing the dimensionality of features, based on correlation measures and the discriminant power of features, that allows a more accurate reduction of their dimensions than the classical Kaiser criterion and scree plot assessment. The method proved to be promising. The results obtained in the experiments demonstrate that the quality of classification after extraction is better than when classical criteria are used to estimate the number of components and features. Experiments were carried out for various extraction methods, demonstrating that, in this classification task, rotating the factors according to the class centroids gives the best risk assessment of chemical threats. The classification quality increased by about 7% compared to a model without feature extraction, and by 4% compared to the classical PCA method with the Kaiser criterion and scree plot evaluation. Furthermore, it has been shown that there is a certain subspace of cybersecurity features which, complemented with the features describing the concentration of volatile substances, affects the risk assessment of chemical hazards. The identified cybersecurity factors are the number of packets lost, incorrect logins, incorrect sensor responses, increased email spam, and excessive traffic in the computer network. To visualize the speed of classification in real time, simulations were carried out for various systems used in Industry 4.0.
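For readers unfamiliar with the classical baseline named in the abstract, the following is a minimal sketch of the Kaiser criterion: keep the principal components whose eigenvalue of the feature correlation matrix exceeds 1. It is not the author's proposed method. The function name `kaiser_components` and the synthetic data (six features driven by two latent factors) are assumptions made purely for illustration.

```python
import numpy as np


def kaiser_components(X):
    """Count components retained by the Kaiser criterion (eigenvalue > 1).

    X is an (n_samples, n_features) array. Returns the number of retained
    components and the eigenvalues of the correlation matrix, largest first.
    """
    corr = np.corrcoef(X, rowvar=False)       # feature correlation matrix
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh is ascending; reverse
    n_keep = int(np.sum(eigvals > 1.0))
    return n_keep, eigvals


# Hypothetical data: two latent factors, each driving three noisy features.
rng = np.random.default_rng(0)
n_samples = 500
latent = rng.normal(size=(n_samples, 2))
loadings = np.array([
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
])
X = latent @ loadings + 0.3 * rng.normal(size=(n_samples, 6))

n_keep, eigvals = kaiser_components(X)
print("components retained:", n_keep)
print("eigenvalues:", np.round(eigvals, 3))
```

Because the eigenvalues of a correlation matrix sum to the number of features, a threshold of 1 retains only components that explain more variance than a single standardized feature. The abstract reports that the proposed correlation- and discriminant-power-based assessment gave better classification quality than this rule.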
Funding Information
- Narodowe Centrum Nauki (2017/27/B/ST6/01325)