Predicting Anatomical Therapeutic Chemical Drug Classes from 17 molecules’ Properties of Drugs by Multi-Label Binary Relevance Approach with MLSMOTE
- 26 December 2021
- conference paper
- Published by Association for Computing Machinery (ACM)
Abstract
Predicting Anatomical Therapeutic Chemical (ATC) classes is a prominent step in the costly and tedious drug discovery pipeline, and machine learning plays an important role by reducing the cost and time of prediction. Most existing research predicts ATC classes from chemical-chemical associations, side effects, target proteins, gene expressions, chemical structures, drug targets, and textual information about drugs. However, the ability of 17 molecular properties of drugs to predict ATC classes has not yet been explored. The current work proposes a methodology for predicting drug ATC classes from these 17 molecular properties. ATC class prediction is a multi-label classification task: a single drug can belong to more than one ATC class. A binary relevance strategy is therefore employed, which trains one independent binary classifier per class, using four basic machine learning classifiers: K-Nearest Neighbour (KNN), Extra Tree Classifier (ETC), Random Forest (RF), and Decision Tree (DT). Class imbalance, a common problem in multi-label datasets, is addressed using MLSMOTE (Multi-Label Synthetic Minority Over-Sampling Technique). The proposed methodology shows promising results, achieving accuracy ranging from 96.90% to 98.06%, which indicates that the 17 molecular properties are sufficient for efficient prediction of ATC classes.
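The binary relevance strategy named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy feature vectors, the three label columns, the helper names `knn_predict` and `binary_relevance_predict`, and the choice of k = 3 are all assumptions made for illustration, and the MLSMOTE oversampling step is not shown. Only the Python standard library is used.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, x, k=3):
    """Predict one binary label for x by majority vote of its k nearest neighbours."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def binary_relevance_predict(train_X, train_Y, x, k=3):
    """Binary relevance: treat every label column as an independent binary problem."""
    n_labels = len(train_Y[0])
    return [
        knn_predict(train_X, [row[j] for row in train_Y], x, k)
        for j in range(n_labels)
    ]


# Hypothetical toy data: 2 numeric properties per drug, 3 ATC-like labels.
train_X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
train_Y = [[1, 0, 0], [1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0], [0, 1, 1]]

# One prediction per label; a drug may receive several labels at once.
pred = binary_relevance_predict(train_X, train_Y, [0.05, 0.1])
print(pred)
```

Because each label is predicted separately, any binary classifier (for example a decision tree or random forest) could replace the KNN vote without changing the outer loop; the trade-off is that correlations between labels are ignored.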