Model selection for logistic regression via association rules analysis
- 1 August 2013
- journal article
- research article
- Published by Taylor & Francis Ltd in Journal of Statistical Computation and Simulation
- Vol. 83 (8), 1415-1428
- https://doi.org/10.1080/00949655.2012.662231
Abstract
Interactions among variables are very common in practice, but have received little attention in the logistic regression literature. This is especially true for higher-order interactions. In conventional logistic regression, interactions are typically ignored. We propose a model selection procedure based on association rules analysis. The procedure (1) explores the combinations of input variables that have a significant impact on the response (via association rules analysis); (2) selects the potential (low- and high-order) interactions; (3) converts these potential interactions into new dummy variables; and (4) performs variable selection among all the input variables and the newly created dummy variables (interactions) to build the optimal logistic regression model. Our model selection procedure establishes the optimal combination of main effects and potential interactions. Comparisons with existing methods are made through extensive simulations, in which the proposed method outperforms the existing methods in all cases considered. A real-life example is discussed in detail to demonstrate the proposed method.
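Steps (1) to (3) of the procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the rule-selection criteria, so the support and confidence thresholds, the restriction to 0/1 input variables, the toy data, and the function names `mine_interactions` and `add_dummies` are all assumptions made for illustration. Step (4) is left out; the augmented rows would be passed to whatever variable selection routine is used to fit the logistic regression.

```python
from itertools import combinations


def mine_interactions(rows, response, max_order=3,
                      min_support=0.25, min_confidence=0.8):
    """Find variable combinations whose joint presence predicts response == 1.

    rows     : list of dicts mapping variable name -> 0/1 (assumed binary)
    response : list of 0/1 outcomes, aligned with rows
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(rows)
    names = sorted(rows[0])
    selected = []
    for order in range(2, max_order + 1):
        for combo in combinations(names, order):
            # Rows where every variable in the combination equals 1
            covered = [i for i, r in enumerate(rows)
                       if all(r[v] == 1 for v in combo)]
            if not covered:
                continue
            hits = sum(response[i] for i in covered)
            support = hits / n                  # P(combo and y = 1)
            confidence = hits / len(covered)    # P(y = 1 | combo)
            if support >= min_support and confidence >= min_confidence:
                selected.append((combo, support, confidence))
    return selected


def add_dummies(rows, combos):
    """Append one 0/1 dummy variable per selected combination (step 3)."""
    augmented = []
    for r in rows:
        new_row = dict(r)
        for combo in combos:
            new_row["*".join(combo)] = int(all(r[v] == 1 for v in combo))
        augmented.append(new_row)
    return augmented


# Hypothetical toy data: the response is 1 exactly when A and B are both 1.
rows = [
    {"A": 1, "B": 1, "C": 0},
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 0, "C": 0},
    {"A": 0, "B": 1, "C": 1},
    {"A": 0, "B": 0, "C": 1},
    {"A": 1, "B": 0, "C": 1},
    {"A": 0, "B": 1, "C": 0},
    {"A": 1, "B": 1, "C": 0},
]
response = [1, 1, 0, 0, 0, 0, 0, 1]

selected = mine_interactions(rows, response)
augmented = add_dummies(rows, [combo for combo, _, _ in selected])
```

On this toy data only the pair (A, B) passes both thresholds, so each augmented row gains a single dummy column named `A*B` alongside the original main effects.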