The Simultaneous Decision(s) about the Number of Lower- and Higher-Level Classes in Multilevel Latent Class Analysis

1 August 2010

journal article
Published by SAGE Publications in Sociological Methodology

Vol. 40 (1), 247-283
https://doi.org/10.1111/j.1467-9531.2010.01231.x

Abstract

Recently, several types of extensions of the latent class (LC) model have been developed for the analysis of data sets having a multilevel structure. The most popular variant is the multilevel LC model with finite mixture distributions at multiple levels of a hierarchical structure; that is, with LCs for both lower-level units (e.g. individuals, citizens, or patients) and higher-level units (e.g. groups, regions, or hospitals). A problem in the application of this model is that determining the number of LCs is much more complicated than in standard (single-level) LC analysis because it involves multiple, nonindependent decisions. We propose a three-step model-fitting procedure for deciding about the number of higher- and lower-level classes. We also investigate the performance of information criteria (BIC, AIC, CAIC, and AIC3) in the context of multilevel LC analysis, with different types of response variables. A specific difficulty associated with using BIC and CAIC in any type of multilevel analysis is that these measures contain the sample size in their formulae, and we investigate whether this should be the number of groups, the number of individuals, or either the number of groups or individuals depending on whether one has to decide about model features concerning the higher or lower level. The three main conclusions of our simulation studies are that (1) the proposed three-step model-fitting strategy works rather well, (2) the number of higher-level units (K) is the preferred sample size for BIC and CAIC, both for decisions about higher- and lower-level classes, and (3) with categorical indicators, AIC3 and BIC based on the higher-level sample size are the preferred measures for deciding about the number of LCs at both the higher and lower level. With continuous indicators, BIC(K) performs better than AIC3. AIC performs best in very specific situations, namely with poorly separated classes and categorical indicators.
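The sample-size question raised in the abstract can be illustrated with a minimal Python sketch. It uses the standard textbook definitions of the four criteria (AIC = -2LL + 2p, AIC3 = -2LL + 3p, BIC = -2LL + p ln n, CAIC = -2LL + p (ln n + 1)); the function name `information_criteria` and all numeric values are hypothetical and are not taken from the article.

```python
import math


def information_criteria(log_lik, n_params, sample_size):
    """Return AIC, AIC3, BIC, and CAIC for one fitted model.

    log_lik     -- maximized log-likelihood of the model
    n_params    -- number of free parameters (p)
    sample_size -- the n used in the BIC/CAIC penalty, e.g. the number
                   of groups (K) or the number of individuals
    """
    deviance = -2.0 * log_lik
    return {
        "AIC": deviance + 2 * n_params,
        "AIC3": deviance + 3 * n_params,
        "BIC": deviance + n_params * math.log(sample_size),
        "CAIC": deviance + n_params * (math.log(sample_size) + 1),
    }


# Hypothetical fit: the values below are made up for illustration only.
log_lik = -5230.4      # assumed maximized log-likelihood
n_params = 17          # assumed number of free parameters
n_groups = 50          # assumed number of higher-level units (K)
n_individuals = 1000   # assumed number of lower-level units

by_groups = information_criteria(log_lik, n_params, n_groups)
by_individuals = information_criteria(log_lik, n_params, n_individuals)

for name in ("AIC", "AIC3", "BIC", "CAIC"):
    print(f"{name:5s} n=K: {by_groups[name]:10.2f}   "
          f"n=individuals: {by_individuals[name]:10.2f}")
```

AIC and AIC3 do not depend on the sample size, so they are identical in both columns. BIC and CAIC carry a smaller penalty per parameter when the number of groups is used, which is why the choice of n can change which number of classes is selected when candidate models are compared.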

