A comparison of several regression models for analysing cost of CABG surgery
- 19 August 2003
- journal article
- research article
- Published by Wiley in Statistics in Medicine
- Vol. 22 (17), 2799-2815
- https://doi.org/10.1002/sim.1442
Abstract
Investigators in clinical research are often interested in determining the association between patient characteristics and cost of medical or surgical treatment. However, there is no uniformly agreed upon regression model with which to analyse cost data. The objective of the current study was to compare the performance of linear regression, linear regression with log‐transformed cost, generalized linear models with Poisson, negative binomial and gamma distributions, median regression, and proportional hazards models for analysing costs in a cohort of patients undergoing CABG surgery. The study was performed on data comprising 1959 patients who underwent CABG surgery in Calgary, Alberta, between June 1994 and March 1998. Ten of 21 patient characteristics were significantly associated with cost of surgery in all seven models. Eight variables were not significantly associated with cost of surgery in all seven models. Using mean squared prediction error as a loss function, proportional hazards regression and the three generalized linear models were best able to predict cost in independent validation data. Using mean absolute error, linear regression with log‐transformed cost, proportional hazards regression, and median regression to predict median cost, were best able to predict cost in independent validation data. Since the models demonstrated good consistency in identifying factors associated with increased cost of CABG surgery, any of the seven models can be used for identifying factors associated with increased cost of surgery. However, the magnitude of, and the interpretation of, the coefficients vary across models. Researchers are encouraged to consider a variety of candidate models, including those better known in the econometrics literature, rather than begin data analysis with one regression model selected a priori. 
The final choice of regression model should be made after a careful assessment of how best to assess predictive ability and should be tailored to the particular data in question. Copyright © 2003 John Wiley & Sons, Ltd.
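The abstract reports that the ranking of models changes with the loss function: mean squared prediction error favoured one set of models, mean absolute error another. A minimal sketch of why this can happen with right-skewed cost data is given here, assuming entirely hypothetical cost figures (not taken from the study) and using constant mean and median predictors as stand-ins for models that target the mean and the median of cost. The function names `mspe` and `mae` are illustrative.

```python
from statistics import mean, median


def mspe(actual, predicted):
    """Mean squared prediction error."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical right-skewed costs (illustrative only, not study data).
train_costs = [9000, 9500, 10000, 10500, 11000,
               12000, 14000, 18000, 30000, 60000]
valid_costs = [9200, 9800, 10200, 11500, 13000, 16000, 25000, 55000]

# Two constant predictors fitted on the derivation sample:
# one targets mean cost, the other targets median cost.
predictors = {
    "mean": mean(train_costs),
    "median": median(train_costs),
}

# Score each predictor on the independent validation sample.
scores = {}
for name, value in predictors.items():
    predicted = [value] * len(valid_costs)
    scores[name] = {
        "mspe": mspe(valid_costs, predicted),
        "mae": mae(valid_costs, predicted),
    }

for name, result in scores.items():
    print(f"{name:>6}: MSPE={result['mspe']:.1f}  MAE={result['mae']:.1f}")
```

In this toy setting the mean predictor has the lower squared error and the median predictor has the lower absolute error, so neither is uniformly best. This is consistent with the abstract's advice to decide how predictive ability will be assessed before settling on a model.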