Statistical Methods in Medical Research
ISSN / EISSN : 0962-2802 / 1477-0334
Published by: SAGE Publications (10.1177)
Total articles ≅ 2,494
Latest articles in this journal
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211041759
The accelerated failure time model is an alternative to the Cox proportional hazards model in survival analysis. However, conclusions regarding the associations of prognostic factors with event times are valid only if the underlying modeling assumptions are met. In contrast to several flexible methods for relaxing the proportional hazards and linearity assumptions in the Cox model, formal investigation of the constant-over-time time ratio and linearity assumptions in the accelerated failure time model has been limited. Yet, in practice, prognostic factors may have time-dependent and/or nonlinear effects. Furthermore, parametric accelerated failure time models require correct specification of the baseline hazard function, which is treated as a nuisance parameter in the Cox proportional hazards model, and is rarely known in practice. To address these challenges, we propose a flexible extension of the accelerated failure time model where unpenalized regression B-splines are used to model (i) the baseline hazard function of arbitrary shape, (ii) the time-dependent covariate effects on the hazard, and (iii) nonlinear effects for continuous covariates. Simulations evaluate the accuracy of the time-dependent and/or nonlinear estimates, and of the resulting survival functions, in multivariable settings. The proposed flexible extension of the accelerated failure time model is applied to re-assess the effects of prognostic factors on mortality after septic shock.
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211037076
Non-proportional hazards data are routinely encountered in randomized clinical trials. In such cases, the classic Cox proportional hazards model can suffer from severe power loss, and the estimated hazard ratio is difficult to interpret because the treatment effect varies over time. We propose CauchyCP, an omnibus test of change-point Cox regression models, to overcome both challenges while detecting signals of non-proportional hazards patterns. Extensive simulation studies demonstrate that, compared to existing treatment comparison tests under non-proportional hazards, the proposed CauchyCP test (a) controls the type I error better at small [Formula: see text] levels ([Formula: see text]); (b) increases the power of detecting time-varying effects; and (c) is more computationally efficient than popular methods like MaxCombo for large-scale data analysis. The superior performance of CauchyCP is further illustrated using retrospective analyses of two randomized clinical trial datasets and a pharmacogenetic biomarker study dataset. The R package CauchyCP is publicly available on CRAN.
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211037071
Ultrahigh-dimensional gene features are often collected in modern cancer studies in which the number of gene features [Formula: see text] far exceeds the sample size [Formula: see text]. While gene expression patterns have been shown to be related to patients’ survival in microarray-based gene expression studies, one has to deal with the challenges that ultrahigh-dimensional genetic predictors pose for survival prediction and for genetic understanding of the disease in precision medicine. The problem becomes more complicated when two types of survival endpoints, distant metastasis-free survival and overall survival, are of interest in the study and the outcome data are subject to semi-competing risks, because distant metastasis-free survival may be censored by overall survival but not vice versa. Our focus in this paper is to extract important features, which have great impacts on both distant metastasis-free survival and overall survival jointly, from massive gene expression data in the semi-competing risks setting. We propose a model-free screening method based on the ranking of the correlation between gene features and the joint survival function of two endpoints. The method accounts for the relationship between two endpoints in a simply defined utility measure that is easy to understand and calculate. We show its favorable theoretical properties, such as sure screening and ranking consistency, and evaluate its finite sample performance through extensive simulation studies. Finally, an application to classifying breast cancer data clearly demonstrates the utility of the proposed method in practice.
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211037073
Sample size calculations for cluster-randomised trials require inclusion of an inflation factor that takes into account the intra-cluster correlation coefficient. Often, estimates of the intra-cluster correlation coefficient are taken from pilot trials, and such estimates are known to be uncertain. Given that the value of the intra-cluster correlation coefficient has a considerable influence on the calculated sample size for a main trial, the uncertainty in the estimate can have a large impact on the ultimate sample size and, consequently, the power of a main trial. As such, it is important to account for the uncertainty in the estimate of the intra-cluster correlation coefficient. While a commonly adopted approach is to utilise the upper confidence limit in the sample size calculation, this is a largely inefficient method which can result in overpowered main trials. In this paper, we present a method of estimating the sample size for a main cluster-randomised trial with a continuous outcome, using numerical methods to account for the uncertainty in the intra-cluster correlation coefficient estimate. Despite limitations with this initial study, the findings and recommendations in this paper can help to improve sample size estimations for cluster-randomised controlled trials by accounting for uncertainty in the estimate of the intra-cluster correlation coefficient. We recommend this approach be applied to all trials where there is uncertainty in the intra-cluster correlation coefficient estimate, in conjunction with additional sources of information to guide the estimation of the intra-cluster correlation coefficient.
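The inflation factor mentioned in the abstract above is commonly written as the design effect 1 + (m - 1) * ICC for a common cluster size m. The Python sketch below is not the numerical method of the paper; it only illustrates, with hypothetical inputs and illustrative function names, how strongly the required number of clusters depends on the assumed intra-cluster correlation coefficient, for instance when an upper confidence limit is plugged in instead of a point estimate.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Standard design effect for equal cluster sizes: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def clusters_per_arm(n_individual: int, cluster_size: int, icc: float) -> int:
    """Clusters needed per arm, given the per-arm sample size that an
    individually randomised trial would need (n_individual)."""
    inflated = n_individual * design_effect(cluster_size, icc)
    return math.ceil(inflated / cluster_size)


# Hypothetical inputs: 64 participants per arm under individual
# randomisation, clusters of 20, ICC point estimate 0.05 versus an
# assumed upper confidence limit of 0.15.
for icc in (0.05, 0.15):
    print(icc, design_effect(20, icc), clusters_per_arm(64, 20, icc))
```

With these made-up numbers the required clusters per arm nearly double when the upper limit replaces the point estimate, which is the kind of overpowering the abstract warns about.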
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211031615
Missing data is a common issue in epidemiological databases. Among the different ways of dealing with missing data, multiple imputation has become more available in common statistical software packages. However, the incompatibility between the imputation and substantive model, which can arise when the associations between variables in the substantive model are not taken into account in the imputation models or when the substantive model is itself nonlinear, can lead to invalid inference. With the aim of analysing population-based cancer survival data, we extended the multiple imputation substantive model compatible-fully conditional specification (SMC-FCS) approach, proposed by Bartlett et al. in 2015, to accommodate excess hazard regression models. The proposed approach was compared with the standard fully conditional specification multiple imputation procedure and with the complete-case analysis using a simulation study. The SMC-FCS approach produced unbiased estimates in both scenarios tested, while the fully conditional specification produced biased estimates and poor empirical coverage probabilities. The SMC-FCS algorithm was then used for handling missing data in the evaluation of socioeconomic inequalities in survival of colorectal cancer patients diagnosed in the North Region of Portugal. The analysis using SMC-FCS showed a clearer trend towards higher excess hazards for patients coming from more deprived areas. The proposed algorithm was implemented in R software and is presented as Supplementary Material.
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211037078
Social distancing is an important public health intervention to reduce or interrupt the sustained community transmission of emerging infectious pathogens, such as severe acute respiratory syndrome-coronavirus-2 during the coronavirus disease 2019 pandemic. Contact matrices are typically used when evaluating such public health interventions to account for the heterogeneity in social mixing of individuals, but the surveys used to obtain the number of contacts often lack detailed information on the time individuals spend on daily activities. The present work addresses this problem by combining the large-scale empirical data of a social contact survey and a time-use survey to estimate contact matrices by age group (0–15, 16–24, 25–44, 45–64, 65+ years) and daily activity (work, schooling, transportation, and four leisure activities: social visits, bar/cafe/restaurant visits, park visits, and non-essential shopping). This augmentation allows exploring the impact of fewer contacts when individuals reduce the time they spend on selected daily activities as well as when lifting such restrictions again. For illustration, the derived matrices were then applied to an age-structured dynamic-transmission model of coronavirus disease 2019. Findings show how contact matrices can be successfully augmented with time-use data to inform the relative reductions in contacts by activity, which allows for more fine-grained mixing patterns and infectious disease modelling.
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211037074
In many imaging studies, each case is reviewed by human readers and characterized according to one or more features. Often, the inter-reader agreement of the feature indications is of interest in addition to their diagnostic accuracy or association with clinical outcomes. Complete designs in which all participating readers review all cases maximize efficiency and guarantee estimability of agreement metrics for all pairs of readers but often involve a heavy reading burden. Assigning readers to cases using balanced incomplete block designs substantially reduces reading burden by having each reader review only a subset of cases, while still maintaining estimability of inter-reader agreement for all pairs of readers. Methodology for data analysis and for power and sample size calculations under balanced incomplete block designs is presented and applied to simulation studies and an actual example. Simulation study results suggest that such designs may reduce reading burdens by >40% while in most scenarios incurring a <20% increase in the standard errors and a <8% and <20% reduction in power to detect between-modality differences in diagnostic accuracy and [Formula: see text] statistics, respectively.
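To make the estimability property described above concrete, the Python sketch below checks a small, textbook balanced incomplete block design (the 7-point, 7-block Fano plane). Treating the points as readers and each block as the set of readers assigned to one batch of cases is an assumption made here for illustration; it is not the design used in the paper, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Each block lists the readers (numbered 0-6) assigned to one batch of cases.
BLOCKS = [
    (0, 1, 2), (0, 3, 4), (0, 5, 6),
    (1, 3, 5), (1, 4, 6),
    (2, 3, 6), (2, 4, 5),
]


def pair_counts(blocks):
    """Count how many batches each pair of readers reviews in common."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts


def reader_loads(blocks):
    """Count how many batches each reader is assigned."""
    return Counter(reader for block in blocks for reader in block)


pairs = pair_counts(BLOCKS)
loads = reader_loads(BLOCKS)
# Balanced: every one of the 21 reader pairs shares exactly one batch.
print(len(pairs), set(pairs.values()))
# Incomplete: every reader reviews 3 of the 7 batches.
print(sorted(loads.items()))
```

In this toy layout each reader reviews 3 of 7 batches, yet every pair of readers still overlaps on one batch, so pairwise agreement remains estimable.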
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211037075
Propensity score matching is widely used to determine the effects of treatments in observational studies. Competing risk survival data are common in medical research. However, there is a paucity of propensity score matching studies related to competing risk survival data with missing causes of failure. In this study, we provide guidelines for estimating the treatment effect on the cumulative incidence function when using propensity score matching on competing risk survival data with missing causes of failure. We examined the performance of different methods for imputing the data with missing causes. We then evaluated the gain from the missing cause imputation in an extensive simulation study and applied the proposed data imputation method to the data from a study on the risk of hepatocellular carcinoma in patients with chronic hepatitis B and chronic hepatitis C.
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211038754
Machine learning algorithms are increasingly used in the clinical literature, with claimed advantages over logistic regression. However, they are generally designed to maximize the area under the receiver operating characteristic curve. While area under the receiver operating characteristic curve and other measures of accuracy are commonly reported for evaluating binary prediction problems, these metrics can be misleading. We aim to give clinical and machine learning researchers a realistic medical example of the dangers of relying on a single measure of discriminatory performance to evaluate binary prediction questions. Prediction of medical complications after surgery is a frequent but challenging task because many post-surgery outcomes are rare. We predicted post-surgery mortality among patients in a clinical registry who received at least one aortic valve replacement. Estimation incorporated multiple evaluation metrics and algorithms typically regarded as performing well with rare outcomes, as well as an ensemble and a new extension of the lasso for multiple unordered treatments. Results demonstrated high accuracy for all algorithms with moderate measures of cross-validated area under the receiver operating characteristic curve. False positive rates were [Formula: see text]1%; however, true positive rates were [Formula: see text]7%, even when paired with a 100% positive predictive value, and graphical representations of calibration were poor. Similar results were seen in simulations, with the addition of high area under the receiver operating characteristic curve ([Formula: see text]90%) accompanying low true positive rates. Clinical studies should not report only area under the receiver operating characteristic curve or accuracy.
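The warning in the abstract above can be illustrated with a small confusion-matrix calculation. The Python sketch below uses entirely made-up counts (1,000 hypothetical patients, 30 deaths, a classifier that flags only 2 of them) and a hypothetical helper name; it does not reproduce the registry data or the algorithms of the paper.

```python
def confusion_metrics(y_true, y_pred):
    """Return accuracy, true/false positive rates and positive predictive
    value for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(y_true),
        "tpr": tp / (tp + fn) if (tp + fn) else nan,
        "fpr": fp / (fp + tn) if (fp + tn) else nan,
        "ppv": tp / (tp + fp) if (tp + fp) else nan,
    }


# Made-up rare-outcome data: 30 events among 1,000 patients.
y_true = [1] * 30 + [0] * 970
# Hypothetical classifier: flags 2 true events and nothing else.
y_pred = [1] * 2 + [0] * 998

print(confusion_metrics(y_true, y_pred))
```

Here accuracy is above 97%, the false positive rate is zero and the positive predictive value is 100%, while 28 of the 30 events are missed, which is why a single summary metric can mislead when outcomes are rare.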
Statistical Methods in Medical Research; https://doi.org/10.1177/09622802211037070
The area under the receiver operating characteristic curve is a widely used measure for evaluating the performance of a diagnostic test. Common approaches for inference on area under the receiver operating characteristic curve are usually based upon approximation. For example, inference based on the normal approximation tends to have low accuracy for small sample sizes. Frequentist empirical likelihood based approaches for area under the receiver operating characteristic curve estimation may perform better, but are usually conducted through approximation in order to reduce the computational burden, so the inference is not exact. By contrast, we propose an exact inferential procedure by adapting the empirical likelihood into a Bayesian framework and draw inference from the posterior samples of the area under the receiver operating characteristic curve obtained via a Gibbs sampler. The full conditional distributions within the Gibbs sampler only involve empirical likelihoods with linear constraints, which greatly simplifies the computation. To further enhance the applicability and flexibility of the Bayesian empirical likelihood, we extend our method to the estimation of partial area under the receiver operating characteristic curve, comparison of multiple tests, and the doubly robust estimation of area under the receiver operating characteristic curve in the presence of missing test results. Simulation studies confirm the desirable performance of the proposed methods, and a real application is presented to illustrate their usefulness.
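For readers unfamiliar with the quantity being estimated, the Python sketch below computes the usual nonparametric point estimate of the area under the receiver operating characteristic curve: the proportion of diseased/healthy pairs in which the diseased subject has the higher test result, with ties counted as one half. It is not the Bayesian empirical likelihood procedure proposed in the abstract above, and the data and function name are hypothetical.

```python
def empirical_auc(diseased, healthy):
    """Mann-Whitney estimate of P(X > Y) + 0.5 * P(X = Y), where X is a
    test result from a diseased subject and Y from a healthy subject."""
    total = 0.0
    for x in diseased:
        for y in healthy:
            if x > y:
                total += 1.0
            elif x == y:
                total += 0.5
    return total / (len(diseased) * len(healthy))


# Hypothetical test results for 3 diseased and 4 healthy subjects.
diseased = [0.9, 0.8, 0.4]
healthy = [0.5, 0.3, 0.2, 0.8]
print(empirical_auc(diseased, healthy))
```

With samples this small the point estimate rests on only 12 pairs, which gives a sense of why approximation-based inference can be unreliable in small studies.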