Surname lists to identify South Asian and Chinese ethnicity from secondary data in Ontario, Canada: a validation study
Open Access
- 15 May 2010
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Medical Research Methodology
- Vol. 10 (1), 42
- https://doi.org/10.1186/1471-2288-10-42
Abstract
Surname lists are useful for identifying cohorts of ethnic minority patients from secondary data sources. This study sought to develop and validate lists to identify people of South Asian and Chinese origin. Comprehensive lists of South Asian and Chinese surnames were reviewed to identify those that uniquely belonged to the ethnic minority group. Surnames that were common in other populations, communities or ethnic groups were specifically excluded. These surname lists were applied to the Registered Persons Database, a registry of the health card numbers assigned to all residents of the Canadian province of Ontario, so that all residents were assigned to South Asian ethnicity, Chinese ethnicity or the General Population. Ethnic assignment was validated against self-identified ethnicity through linkage with responses to the Canadian Community Health Survey. The final surname lists included 9,950 South Asian surnames and 1,133 Chinese surnames. All 16,688,384 current and former residents of Ontario were assigned to South Asian ethnicity, Chinese ethnicity or the General Population based on their surnames. Among 69,859 respondents to the Canadian Community Health Survey, both lists performed extremely well when compared against self-identified ethnicity: positive predictive value was 89.3% for the South Asian list, and 91.9% for the Chinese list. Because surnames shared with other ethnic groups were deliberately excluded from the lists, sensitivity was lower (50.4% and 80.2%, respectively). These surname lists can be used to identify cohorts of people with South Asian and Chinese origins from secondary data sources with a high degree of accuracy. These cohorts could then be used in epidemiologic and health service research studies of populations with South Asian and Chinese origins.
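The assignment and validation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the case-insensitive exact-match rule, and the placeholder surnames (`ALPHA`, `BETA`, `GAMMA`) are assumptions made for illustration, and the toy records are not study data. The sketch assigns each person to a group by surname lookup, then computes positive predictive value (the share of people assigned to a group who self-identify with it) and sensitivity (the share of people who self-identify with a group that the list captures).

```python
def assign_ethnicity(surname, south_asian_list, chinese_list):
    """Assign a group by exact surname lookup (hypothetical matching rule)."""
    key = surname.strip().upper()
    if key in south_asian_list:
        return "South Asian"
    if key in chinese_list:
        return "Chinese"
    return "General Population"


def validate(assigned, self_identified, group):
    """Return (PPV, sensitivity) for one group against self-identified ethnicity."""
    pairs = list(zip(assigned, self_identified))
    tp = sum(1 for a, s in pairs if a == group and s == group)
    fp = sum(1 for a, s in pairs if a == group and s != group)
    fn = sum(1 for a, s in pairs if a != group and s == group)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return ppv, sensitivity


# Placeholder lists: these tokens are NOT taken from the study's surname lists.
south_asian_list = {"ALPHA"}
chinese_list = {"BETA"}

# Toy linked records of (surname, self-identified ethnicity); illustrative only.
records = [
    ("Alpha", "South Asian"),  # on the list, correctly assigned
    ("Alpha", "Other"),        # on the list, but self-identifies otherwise
    ("Gamma", "South Asian"),  # not on the list, so the list misses this person
    ("Beta", "Chinese"),       # on the list, correctly assigned
    ("Gamma", "Other"),        # General Population
]

assigned = [assign_ethnicity(name, south_asian_list, chinese_list) for name, _ in records]
self_identified = [label for _, label in records]

sa_ppv, sa_sensitivity = validate(assigned, self_identified, "South Asian")
cn_ppv, cn_sensitivity = validate(assigned, self_identified, "Chinese")
```

In this toy data the third record shows the trade-off the abstract reports: a surname left off the list lowers sensitivity without affecting positive predictive value.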