Research on End-to-end Voiceprint Recognition Model Based on Convolutional Neural Network

26 August 2021

journal article
research article
Published by River Publishers in Journal of Web Engineering

Vol. 20 (5), 1573-1585
https://doi.org/10.13052/jwe1540-9589.20511

Abstract

Speech is a time-varying signal that is strongly affected by the individual speaker and the environment. To improve the end-to-end voiceprint recognition rate, the original speech signal must first be preprocessed. This paper proposes an end-to-end voiceprint recognition algorithm based on a convolutional neural network. The algorithm uses the convolution and down-sampling operations of the convolutional neural network to preprocess the speech signals. One-dimensional and two-dimensional convolution operations are established to extract Mel-frequency cepstral coefficient (MFCC) feature parameters from the preprocessed signals, and the classical universal background model is used to build the voiceprint recognition model. The study first analyzes the principle of end-to-end voiceprint recognition and examines the recognition process, the recognition features, and the Res-FD-CNN network structure. It then constructs the convolutional neural network recognition model, preprocesses the data to form the frequency-domain convolutional layer, and tests the algorithm.
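
The convolution and down-sampling steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's Res-FD-CNN: the helper names `conv1d` and `max_pool1d`, the fixed smoothing kernel, the pooling size, and the synthetic 440 Hz frame are all assumptions chosen for illustration, and a trained network would learn its kernel weights rather than use fixed ones.

```python
import math

def conv1d(signal, kernel):
    """Valid-mode 1-D cross-correlation, the operation CNN layers call convolution."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def max_pool1d(signal, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(signal[i:i + size])
        for i in range(0, len(signal) - size + 1, size)
    ]

# Synthetic stand-in for one speech frame: a 440 Hz tone sampled at 16 kHz.
# (Assumption: real input would be recorded speech, not a pure tone.)
sample_rate = 16000
frame = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(400)]

# Hypothetical fixed smoothing kernel; a CNN would learn these weights.
kernel = [0.25, 0.5, 0.25]

features = conv1d(frame, kernel)   # length 400 - 3 + 1 = 398
pooled = max_pool1d(features, 2)   # length 398 // 2 = 199
print(len(frame), len(features), len(pooled))
```

Under these assumptions the convolution shortens the frame by the kernel length minus one, and pooling with a window of two halves the result. This is the sense in which the two operations together act as a down-sampling preprocessing stage.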

