Turkish Dialect Recognition Using Acoustic and Phonotactic Features in Deep Learning Architectures
- 31 July 2020
- journal article
- Published by International Journal of Informatics Technologies in Bilişim Teknolojileri Dergisi
- Vol. 13 (3), 207-216
- https://doi.org/10.17671/gazibtd.668023
Abstract
Dialects are local forms of speech that differ to some degree from a standard language. Dialect recognition is a popular topic in speech recognition research. In particular, identifying the spoken dialect first is desirable in order to improve the performance of large-scale speech recognition systems. The phonetic differences between dialects can be determined by examining the acoustic properties of speech at the physical level. Features such as log mel-spectrograms are used for this purpose. In addition, the term phonotactics refers to the rules that govern how phonemes are arranged in a language or dialect. Phoneme sequences and their frequencies vary from dialect to dialect. Phoneme sequences are obtained with phoneme recognizers. Another topic that has become popular in recent years is deep neural networks. Convolutional Neural Networks (CNN), a special kind of deep neural network, are often used in image and speech recognition. Long Short-Term Memory (LSTM) neural networks are a deep neural network model that produces more successful results than n-gram models in language modeling. In this study, the classification of Turkish dialects with CNN and LSTM neural networks is discussed in terms of acoustic and phonotactic features. LSTM neural networks are also used for language modeling in the phonotactic approach. In the experimental study, the proposed approaches were tested and interpreted on the Turkish Dialects Dataset that we collected. The study shows that the approaches used reach an accuracy of 85.1% for Turkish dialect recognition.
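The phonotactic idea in the abstract (phoneme sequences and their frequencies differ between dialects) can be illustrated with a minimal sketch. This is not the method of the study: a smoothed phoneme bigram model stands in for the LSTM language model, the phoneme sequences are invented toy data rather than phoneme recognizer output, and the names `train_bigram_model`, `log_likelihood`, `classify`, `dialect_A` and `dialect_B` are hypothetical. One model is trained per dialect, and a new sequence is assigned to the dialect whose model gives it the highest likelihood.

```python
import math
from collections import Counter


def train_bigram_model(sequences):
    """Count phoneme bigrams and their left contexts for one dialect."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab


def log_likelihood(seq, model, vocab_size):
    """Log-likelihood of a phoneme sequence under an add-one smoothed bigram model."""
    bigrams, contexts, _ = model
    padded = ["<s>"] + list(seq) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total += math.log(prob)
    return total


def classify(seq, models):
    """Return the dialect whose model scores the sequence highest."""
    shared_vocab = set().union(*(m[2] for m in models.values()))
    # Assumption: reserve one extra slot for phonemes never seen in training.
    vocab_size = len(shared_vocab) + 1
    return max(models, key=lambda name: log_likelihood(seq, models[name], vocab_size))


# Invented toy data; real input would come from a phoneme recognizer.
toy_training = {
    "dialect_A": [list("geliyor"), list("gidiyor")],
    "dialect_B": [list("geliyi"), list("gidiyi")],
}
models = {name: train_bigram_model(seqs) for name, seqs in toy_training.items()}
predicted = classify(list("biliyor"), models)
print(predicted)
```

Replacing the bigram counts with a neural language model would change only how `log_likelihood` is computed; the per-dialect scoring and the final comparison would stay the same.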