Optimized Features Extraction from Spectral and Temporal Features for Identifying the Telugu Dialects by Using GMM and HMM

Abstract
Telugu is one of the historical languages of the Dravidian family and has three dialects: Telangana, Coastal Andhra, and Rayalaseema. This paper addresses the identification of these dialects from speech. Mel-frequency cepstral coefficients (MFCC), together with their Delta and Delta-Delta coefficients, form a 39-dimensional feature vector for dialect identification. The zero-crossing rate (ZCR) is also evaluated as a feature, and finally the MFCC and ZCR features are combined. A standard database of the Telugu dialects is created for this purpose. Statistical models, namely the hidden Markov model (HMM) and the Gaussian mixture model (GMM), are used for classification. To improve accuracy, principal component analysis (PCA) is applied as a dimensionality reduction technique, reducing the number of features extracted from the speech signal before they are passed to the models. With dimensionality reduction, an increase in the accuracy of the models is observed.
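As an illustration of how spectral and temporal features can be combined, the following minimal sketch stacks static coefficients with their Delta and Delta-Delta values and appends a ZCR value per frame. It is not the implementation used in this work: the 13 static coefficients per frame are an assumption (the common choice that gives 39 features), the regression-based delta formula and its window width are assumptions, the function names are hypothetical, and the input values are toy numbers rather than real MFCCs. PCA and the GMM/HMM classifiers are not shown.

```python
# Illustrative sketch only: toy data, not the paper's pipeline or database.

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def delta(features, width=2):
    """Regression-based delta over time; edge frames are repeated as padding."""
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(features) - 1
    out = []
    for t in range(len(features)):
        row = []
        for k in range(len(features[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = features[min(t + n, last)][k]
                behind = features[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(static, zcr):
    """Concatenate static, Delta, Delta-Delta and ZCR values for each frame."""
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b + [z] for s, a, b, z in zip(static, d1, d2, zcr)]


# Hypothetical stand-in for 13 MFCCs per frame; a real system would
# compute these from audio.
static = [[float(t * (k + 1)) for k in range(13)] for t in range(6)]
# Toy ZCR values: the same alternating-sign frame reused for all 6 frames.
zcr = [zero_crossing_rate([1.0, -1.0, 1.0, -1.0])] * 6
frames = stack_features(static, zcr)
print(len(frames), len(frames[0]))
```

Under these assumptions each frame carries 39 MFCC-derived values plus one ZCR value, which is the kind of combined vector that a dimensionality reduction step such as PCA would then compress before classification.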