Deep Learning Meets Private Talk: Conversational AI Can Predict Speaker Traits by Eavesdropping for Only 30 Seconds

5 September 2021

conference paper
Published by Association for Computing Machinery (ACM) in Mensch und Computer 2021

https://doi.org/10.1145/3473856.3474012

Abstract

Conversational AI such as smart speakers placed in home environments can accidentally activate and record people’s talk for a short time. What can such devices learn about people by listening in on ongoing conversations? Taking two commonly used speaker traits as an example, we present the results of an experiment that simulates Conversational AI eavesdropping on ongoing talk using transcriptions of naturalistic conversations in private settings. We show that a currently popular type of deep learning-based system can reliably predict if a speaker is “young”, “old”, “female” or “male” (age=99%, gender=82%) based on what they say in around 30 seconds. Our results exemplify how powerful current big data language models are when it comes to data-driven predictions of personal information based on how people talk, even when listening only for a short time. We conclude the experiment with a critical comment on the increasingly pervasive use of such user modeling technology to compute speaker traits, touching upon some potential ethical concerns, bias, and privacy issues.
