Comparison and benchmark of name-to-gender inference services
Open Access
- 16 July 2018
- journal article
- research article
- Published by PeerJ in PeerJ Computer Science
- Vol. 4, e156
- https://doi.org/10.7717/peerj-cs.156
Abstract
The increased interest in analyzing and explaining gender inequalities in tech, media, and academia highlights the need for accurate inference methods to predict a person’s gender from their name. Several such services exist that provide access to large databases of names, often enriched with information from social media profiles, culture-specific rules, and insights from sociolinguistics. We compare and benchmark five name-to-gender inference services by applying them to the classification of a test data set consisting of 7,076 manually labeled names. The compiled names are analyzed and characterized according to their geographical and cultural origin. We define a series of performance metrics to quantify various types of classification errors, and define a parameter tuning procedure to search for optimal values of the services’ free parameters. Finally, we perform benchmarks of all services under study regarding several scenarios where a particular metric is to be optimized.
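The abstract does not spell out the performance metrics, so the following is only a minimal sketch of how classification errors of this kind might be quantified. It assumes true labels in {'f', 'm'} and service responses in {'f', 'm', 'u'}, where 'u' stands for a name the service could not classify. The function name `error_metrics` and the four metric names are hypothetical, chosen for illustration, and should not be read as the paper's own definitions.

```python
from collections import Counter


def error_metrics(true_labels, predicted_labels):
    """Illustrative error metrics for name-to-gender inference.

    Assumption (not taken from the abstract): true labels are 'f' or 'm',
    predictions are 'f', 'm', or 'u' (unknown / not classified).
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    # Confusion counts keyed by (true, predicted).
    c = Counter(zip(true_labels, predicted_labels))
    ff, fm, fu = c[("f", "f")], c[("f", "m")], c[("f", "u")]
    mf, mm, mu = c[("m", "f")], c[("m", "m")], c[("m", "u")]

    total = ff + fm + fu + mf + mm + mu
    classified = ff + fm + mf + mm  # names that received 'f' or 'm'
    if total == 0 or classified == 0:
        raise ValueError("need at least one classified name")

    return {
        # Misclassifications and unknowns both count as errors.
        "error_with_unknowns": (fm + mf + fu + mu) / total,
        # Misclassifications among names that were actually classified.
        "error_without_unknowns": (fm + mf) / classified,
        # Share of names the service left unclassified.
        "unknown_rate": (fu + mu) / total,
        # Signed imbalance: positive if more men are labeled female
        # than women are labeled male.
        "gender_bias": (mf - fm) / classified,
    }


if __name__ == "__main__":
    truth = ["f", "f", "f", "m", "m", "m", "m", "f"]
    preds = ["f", "m", "u", "m", "m", "f", "u", "f"]
    for name, value in error_metrics(truth, preds).items():
        print(f"{name}: {value:.3f}")
```

Separating the unknown rate from the misclassification rate makes the trade-off visible: a service can lower one by raising the other, which is why different scenarios may call for optimizing different metrics.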
Funding Information
- Grants Programme of the International Council for Science (ICSU)