Privacy-preserving speech processing: cryptographic and string-matching frameworks show promise

Abstract
Speech is one of the most private forms of communication. People do not like to be eavesdropped on, and will frequently even object to being recorded; in fact, in many places it is illegal to record people speaking in public, even when it is acceptable to capture their images on video [1]. Yet, when a person uses a speech-based service such as a voice authentication system or a speech recognition service, they must grant the service complete access to their voice recordings. This exposes the user to abuse, with security, privacy, and economic implications. For instance, the service could extract information such as gender, ethnicity, and even the emotional state of the user from the recording (factors the user never intended to expose) and use it for undesired purposes. The recordings could also be edited to fabricate utterances the user never spoke, or to impersonate the user to other services. Even derivatives of the voice are risky to expose: a voice-authentication service could, for example, make unauthorized use of the models or voice prints it holds for its users to detect their presence in other media, such as YouTube videos.