Abstract
In this paper we review methods of cluster analysis in the context of classifying patients on the basis of clinical and/or laboratory-type observations. Both hierarchical and non-hierarchical methods of clustering are considered, although the emphasis is on the latter type, with particular attention devoted to the mixture likelihood-based approach. To divide a given data set into g clusters, this approach fits a mixture model of g components by the method of maximum likelihood. It thus provides a sound statistical basis for clustering. The important but difficult question of how many clusters there are in the data can be addressed within the framework of standard statistical theory, although theoretical and computational difficulties still remain. Two case studies, involving the cluster analysis of some haemophilia and diabetes data respectively, are reported to demonstrate the mixture likelihood-based approach to clustering.
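
As an illustration of the mixture likelihood-based approach, the following is a minimal sketch under stated assumptions, not the procedure used in the paper. It assumes univariate normal components and uses the EM algorithm to carry out the maximum likelihood fit; the abstract specifies neither. The function names (`normal_pdf`, `e_step`, `fit_normal_mixture`) and the synthetic data are hypothetical and chosen only for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution (assumed component form)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def e_step(data, weights, means, variances):
    """Return posterior membership probabilities and the mixture log-likelihood."""
    posteriors, loglik = [], 0.0
    for x in data:
        dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = max(sum(dens), 1e-300)  # guard against numerical underflow
        posteriors.append([d / total for d in dens])
        loglik += math.log(total)
    return posteriors, loglik


def fit_normal_mixture(data, g, n_iter=100):
    """Fit a g-component normal mixture by EM and assign each point to a cluster."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic start: component means at evenly spaced sample quantiles.
    means = [ordered[((2 * k + 1) * n) // (2 * g)] for k in range(g)]
    grand_mean = sum(data) / n
    variances = [sum((x - grand_mean) ** 2 for x in data) / n] * g
    weights = [1.0 / g] * g

    for _ in range(n_iter):
        posteriors, _loglik = e_step(data, weights, means, variances)
        # M-step: re-estimate mixing proportions, means and variances.
        for k in range(g):
            nk = sum(p[k] for p in posteriors)
            weights[k] = nk / n
            means[k] = sum(p[k] * x for p, x in zip(posteriors, data)) / nk
            var_k = sum(p[k] * (x - means[k]) ** 2 for p, x in zip(posteriors, data)) / nk
            variances[k] = max(var_k, 1e-6)  # keep variances away from zero

    posteriors, loglik = e_step(data, weights, means, variances)
    # Each observation goes to the component with the highest posterior probability.
    labels = [max(range(g), key=lambda k: p[k]) for p in posteriors]
    return weights, means, variances, labels, loglik


# Synthetic data for illustration only (not the haemophilia or diabetes data).
rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(50)] + [rng.gauss(10.0, 1.0) for _ in range(50)]
weights, means, variances, labels, loglik = fit_normal_mixture(sample, g=2)
```

The returned log-likelihood is the quantity being maximized. Comparing it across fits with different values of g is one starting point for the question of how many clusters there are, subject to the theoretical difficulties the abstract notes.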
