Predicting Anatomical Therapeutic Chemical Drug Classes from 17 molecules’ Properties of Drugs by Multi-Label Binary Relevance Approach with MLSMOTE

26 December 2021

conference paper
Published by Association for Computing Machinery (ACM)

https://doi.org/10.1145/3512452.3512453

Abstract

Predicting Anatomical Therapeutic Chemical (ATC) classes is a prominent step in the costly and tedious drug discovery pipeline, where machine learning plays an important role by reducing the cost and time of prediction. Most existing research predicts ATC classes from chemical-chemical associations, side effects, target proteins, gene expressions, chemical structures, drug targets, and textual information about drugs. However, the capability of 17 molecules’ properties to predict drug ATC classes has not yet been explored. The current work proposes a methodology for predicting drug ATC classes using these 17 molecules’ properties. ATC class prediction is a multi-label classification task; therefore, a binary relevance strategy is employed with four basic machine learning classifiers, namely K-Nearest Neighbour (KNN), Extra Tree Classifier (ETC), Random Forest (RF), and Decision Tree (DT). Class imbalance, a common problem in multi-label datasets, is addressed using MLSMOTE (Multi-Label Synthetic Minority Over-Sampling Technique). The proposed methodology exhibits promising results, achieving accuracy ranging from 96.90% to 98.06%, which indicates that the 17 molecules’ properties are sufficient for efficient prediction of ATC classes.
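The binary relevance strategy named in the abstract can be illustrated with a minimal sketch: the multi-label problem is split into one independent binary problem per label, and a separate classifier is fitted for each. The code below is not the authors' implementation. The class names `KNNBinary` and `BinaryRelevance`, the toy feature vectors, and the two labels are all hypothetical, chosen only to show the decomposition. A small pure-Python k-nearest-neighbour voter stands in for the paper's classifiers, and MLSMOTE oversampling is omitted.

```python
from collections import Counter
from math import dist


class KNNBinary:
    """Tiny k-nearest-neighbour classifier for one binary label (illustrative only)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for x in X:
            # Rank training points by Euclidean distance and vote among the k closest.
            nearest = sorted(range(len(self.X)), key=lambda i: dist(x, self.X[i]))[: self.k]
            votes = Counter(self.y[i] for i in nearest)
            preds.append(votes.most_common(1)[0][0])
        return preds


class BinaryRelevance:
    """Fit one independent binary classifier per label column."""

    def __init__(self, make_classifier):
        self.make_classifier = make_classifier

    def fit(self, X, Y):
        n_labels = len(Y[0])
        # Each label column becomes its own binary target.
        self.models = [
            self.make_classifier().fit(X, [row[j] for row in Y])
            for j in range(n_labels)
        ]
        return self

    def predict(self, X):
        per_label = [m.predict(X) for m in self.models]
        # Transpose from per-label columns back to per-sample rows.
        return [list(row) for row in zip(*per_label)]


# Hypothetical toy data: two made-up features and two made-up labels per sample.
X_train = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
Y_train = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]

model = BinaryRelevance(lambda: KNNBinary(k=3)).fit(X_train, Y_train)
predictions = model.predict([[0.05, 0.05], [1.0, 1.0]])
print(predictions)
```

Because each label gets its own model, any binary classifier exposing `fit` and `predict` could be passed as `make_classifier`. The trade-off of this design is that correlations between labels are ignored.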
