Gradient lasso for Cox proportional hazards model
Open Access
- 15 May 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 25 (14), 1775-1781
- https://doi.org/10.1093/bioinformatics/btp322
Abstract
Motivation: There has been increasing interest in expressing a survival phenotype (e.g. time to cancer recurrence or death), or its distribution, in terms of the expression data of a subset of genes. Because gene expression data are high-dimensional, however, collinearity is a serious problem when fitting a prediction model such as Cox's proportional hazards model. Several methods based on penalized Cox proportional hazards models have been proposed to avoid the collinearity problem. Those methods, however, suffer from severe computational problems, such as slow or even failed convergence, because model fitting requires high-dimensional matrix inversions. We propose to implement penalized Cox regression with a lasso penalty via the gradient lasso algorithm, which converges to the global optimum faster than other algorithms. Moreover, the gradient lasso algorithm is guaranteed to converge to the optimum under mild regularity conditions. Hence, our gradient lasso algorithm can be a useful tool for developing a prediction model based on high-dimensional covariates, including gene expression data.

Results: Simulation studies showed that the prediction model fitted by gradient lasso recovers the prognostic genes. In addition, results from diffuse large B-cell lymphoma datasets and the Norway/Stanford breast cancer dataset indicate that our method is very competitive with the popular existing methods of Park and Hastie and of Goeman in computational time, prediction and selectivity.

Availability: R package glcoxph is available at http://datamining.dongguk.ac.kr/R/glcoxph.

Contact: park463@uos.ac.kr
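The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea: minimizing the negative log partial likelihood of the Cox model under the lasso constraint ||beta||_1 <= lam using gradient information alone, with no matrix inversion. It uses a plain Frank-Wolfe-style step with a crude grid line search as a simplified stand-in; it is not the glcoxph implementation and omits any refinements the authors' method may include. The function names `cox_loss_and_grad` and `l1_constrained_cox`, the parameter `lam`, and the assumption of no tied event times are all illustrative choices, not taken from the paper.

```python
import numpy as np


def cox_loss_and_grad(beta, X, time, event):
    """Negative log partial likelihood of the Cox model and its gradient.

    Assumption: no tied event times (ties would need Breslow/Efron handling).
    """
    order = np.argsort(-time)              # sort subjects by descending time
    Xo, d = X[order], event[order].astype(float)
    eta = Xo @ beta
    # After sorting, the risk set of subject i is rows 0..i, so running
    # sums give the risk-set totals.
    w = np.exp(eta - eta.max())            # shifted for numerical stability
    cum_w = np.cumsum(w)
    cum_wx = np.cumsum(w[:, None] * Xo, axis=0)
    log_risk = np.log(cum_w) + eta.max()
    loss = -np.sum(d * (eta - log_risk))
    grad = -np.sum(d[:, None] * (Xo - cum_wx / cum_w[:, None]), axis=0)
    return loss, grad


def l1_constrained_cox(X, time, event, lam, n_iter=200):
    """Frank-Wolfe-style iterations over the L1 ball {beta: ||beta||_1 <= lam}."""
    beta = np.zeros(X.shape[1])
    alphas = np.linspace(0.0, 1.0, 51)     # crude grid line search
    for _ in range(n_iter):
        _, grad = cox_loss_and_grad(beta, X, time, event)
        j = np.argmax(np.abs(grad))        # coordinate with the largest gradient
        vertex = np.zeros_like(beta)
        vertex[j] = -lam * np.sign(grad[j])
        # Move toward the chosen vertex; a convex combination stays feasible.
        losses = [
            cox_loss_and_grad((1 - a) * beta + a * vertex, X, time, event)[0]
            for a in alphas
        ]
        a = alphas[int(np.argmin(losses))]
        beta = (1 - a) * beta + a * vertex
    return beta
```

Each iteration touches a single coordinate and needs only the gradient, which is why this family of methods sidesteps the high-dimensional matrix inversions mentioned above. Because the step size grid includes zero, the loss never increases from one iteration to the next, and the iterate always satisfies the L1 constraint.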
This publication has 19 references indexed in Scilit:
- A Gradient-Based Optimization Algorithm for LASSO. Journal of Computational and Graphical Statistics, 2008
- A note on path-based variable selection in the penalized proportional hazards model. Biometrika, 2008
- Adaptive Lasso for Cox's proportional hazards model. Biometrika, 2007
- CASPAR: a hierarchical bayesian approach to predict survival times in cancer from gene expression data. Bioinformatics, 2006
- Prediction by Supervised Principal Components. Journal of the American Statistical Association, 2006
- Microarray gene expression data with linked survival phenotypes: diffuse large-B-cell lymphoma revisited. Biostatistics, 2005
- Semi-Supervised Methods to Predict Patient Survival from Gene Expression Data. PLoS Biology, 2004
- Least angle regression. The Annals of Statistics, 2004
- Repeated observation of breast tumor subtypes in independent gene expression data sets. Proceedings of the National Academy of Sciences of the United States of America, 2003
- Exploring the new world of the genome with DNA microarrays. Nature Genetics, 1999