Multilevel clustering approach driven by continuous glucose monitoring data for further classification of type 2 diabetes

Open Access

24 February 2021

journal article
research article
Published by BMJ in BMJ Open Diabetes Research & Care

Vol. 9 (1), e001869
https://doi.org/10.1136/bmjdrc-2020-001869

Abstract

Introduction Mining knowledge from continuous glucose monitoring (CGM) data to classify highly heterogeneous patients with type 2 diabetes according to their characteristics remains unaddressed. A refined clustering method that retrieves hidden information from CGM data could provide a viable method to identify patients with different degrees of dysglycemia and clinical phenotypes.

Research design and methods From Shanghai Jiao Tong University Affiliated Sixth People’s Hospital, we selected 908 patients with type 2 diabetes (18–83 years) who wore blinded CGM sensors (iPro2, Medtronic, California, USA). Participants were clustered based on CGM data during a 24-hour period by our method. The first level extracted the knowledge-based and statistics-based features to describe CGM signals from multiple perspectives. The Fisher score and variables cluster analysis were applied to fuse features into low dimensions at the second level. The third level divided subjects into subgroups with different clinical phenotypes. The four subgroups of patients were determined by clinical phenotypes.

Results Four subgroups of patients with type 2 diabetes with significantly different statistical features and clinical phenotypes were identified by our method. In particular, individuals in cluster 1 were characterized by the lowest glucose level factor and glucose fluctuation factor, and the highest negative glucose factor and C peptide index. By contrast, cluster 2 had the highest glucose level factor and the lowest C peptide index. Cluster 4 was characterized by the greatest degree of glucose fluctuation factor, was the most insulin-sensitive, and had the lowest insulin resistance. Cluster 3 ranked in the middle concerning the CGM-derived metrics and clinical phenotypes compared with those of the other three groups.

Conclusion A novel multilevel clustering approach for knowledge mining from CGM data in type 2 diabetes is presented. The results demonstrate that subgroups are adequately distinguished with notable statistical and clinical differences.
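
As a rough illustration of the first two levels described in the abstract, the sketch below computes a few commonly used CGM summary features from one 24-hour trace and then ranks features by Fisher score. It is not the authors' implementation: the abstract does not list the features used, so the feature set, the 3.9–10.0 mmol/L target range, and the function names `cgm_features` and `fisher_score` are assumptions made for illustration. The Fisher score also needs group labels, and the abstract does not say how these were obtained, so the labels here are hypothetical.

```python
import numpy as np


def cgm_features(glucose):
    """Summarize one CGM trace (mmol/L) with a few illustrative metrics.

    The feature choice and the 3.9-10.0 mmol/L range are assumptions,
    not taken from the abstract.
    """
    g = np.asarray(glucose, dtype=float)
    mean = g.mean()
    sd = g.std(ddof=1)  # sample standard deviation
    return {
        "mean": mean,
        "sd": sd,
        "cv": sd / mean,                            # coefficient of variation
        "tir": np.mean((g >= 3.9) & (g <= 10.0)),   # fraction of readings in range
        "tar": np.mean(g > 10.0),                   # fraction above range
        "tbr": np.mean(g < 3.9),                    # fraction below range
    }


def fisher_score(X, y):
    """Fisher score per feature column of X given (hypothetical) labels y.

    Score = sum_k n_k (mu_k - mu)^2 / sum_k n_k var_k, computed per feature,
    where var_k is the population variance within group k.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xk = X[y == label]
        between += len(Xk) * (Xk.mean(axis=0) - overall_mean) ** 2
        within += len(Xk) * Xk.var(axis=0)
    return between / within
```

A feature with a higher score separates the labeled groups more clearly relative to its within-group spread, which is why such a score can be used to decide which features to keep before clustering. A feature that is constant within every group gives a zero denominator, so a real pipeline would need to handle that case.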

Funding Information

National Natural Science Foundation of China (61903071, 61973067)
National Key R&D Program of China (2018YFC2001004)
Shanghai Municipal Education Commission—Gaofeng Clinical Medicine Grant Support (20161430)

