Head/Tail Breaks: A New Classification Scheme for Data with a Heavy-Tailed Distribution

1 August 2013

journal article
research article
Published by Informa UK Limited in The Professional Geographer

Vol. 65 (3), 482-494
https://doi.org/10.1080/00330124.2012.700499

Abstract

This article introduces a new classification scheme, head/tail breaks, to find groupings or hierarchy for data with a heavy-tailed distribution. Heavy-tailed distributions are heavily right skewed, with a minority of large values in the head and a majority of small values in the tail, and are commonly characterized by a power law, a lognormal, or an exponential function. For example, a country's population is often distributed in such a heavy-tailed manner, with a minority of people (e.g., 20 percent) in the countryside and the vast majority (e.g., 80 percent) in urban areas. The scheme partitions all of the data values around the mean into two parts and repeats the process iteratively for the values above the mean (the head) until the head values are no longer heavy-tailed. Thus, the number of classes and the class intervals are both naturally determined. I therefore claim that the new classification scheme is more natural than natural breaks in finding the groupings or hierarchy for data with a heavy-tailed distribution. I demonstrate the advantages of the head/tail breaks method over Jenks's natural breaks in capturing the underlying hierarchy of the data.
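
The iterative partitioning described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the article's own implementation. The abstract does not say how to decide that the head is "no longer heavy-tailed", so the sketch assumes a simple stand-in rule: stop once the head stops being a clear minority of the values being split. The function names and the `head_limit` parameter (default 0.4) are hypothetical choices made for this example.

```python
from bisect import bisect_left


def head_tail_breaks(values, head_limit=0.4):
    """Return class breaks found by repeatedly splitting around the mean.

    ASSUMPTION: the head counts as "still heavy-tailed" while it holds at
    most `head_limit` of the values being split. The abstract does not
    state this threshold; it is a stand-in for the stopping test.
    """
    breaks = []
    current = list(values)
    while len(current) > 1:
        mean = sum(current) / len(current)
        head = [v for v in current if v > mean]
        # Stop when nothing lies above the mean or the head is not a minority.
        if not head or len(head) / len(current) > head_limit:
            break
        breaks.append(mean)
        current = head  # continue with the values above the mean
    return breaks


def classify(value, breaks):
    """Return the class index of `value`: 0 is the lowest (tail) class."""
    # bisect_left places a value equal to a break in the lower class,
    # matching the strict "above the mean" rule used for the head.
    return bisect_left(breaks, value)


# Toy heavy-tailed data: many small values, few large ones.
data = [1] * 100 + [10] * 10 + [100]
breaks = head_tail_breaks(data)
classes = [classify(v, breaks) for v in (1, 10, 100)]
```

With this toy data the first split falls at the overall mean and the second at the mean of the head, giving three classes. Evenly spread data such as `range(1, 11)` yields no breaks under the assumed rule, because half of the values lie above the mean.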
