Statistical Machine Translation from the Lampung Language (Api Dialect) to Indonesian

Abstract
This research applies the statistical machine translation (SMT) approach to automatic translation from the Api dialect of the Lampung language into Indonesian. Lampung text can be translated into Indonesian with a dictionary; an alternative is to use a parallel corpus of Lampung sentences and their Indonesian translations with the SMT approach. The SMT approach is carried out in four phases. The pre-processing phase prepares the parallel corpus. The training phase processes the parallel corpus to obtain a language model and a translation model. The testing phase follows, and the evaluation phase completes the process. SMT testing uses four sets of test sentences: 25 single sentences without out-of-vocabulary (OOV) words, 25 single sentences with OOV words, 25 compound sentences without OOV words, and 25 compound sentences with OOV words. Translating the Lampung test sentences into Indonesian yields Bilingual Evaluation Understudy (BLEU) scores of 77.07% for the single sentences without OOV words, 72.29% for the single sentences with OOV words, 79.84% for the compound sentences without OOV words, and 80.84% for the compound sentences with OOV words.
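As an illustration of the evaluation phase, the following is a minimal sketch of a sentence-level BLEU computation (modified n-gram precision up to 4-grams with a brevity penalty). It assumes a single reference translation, whitespace tokenization, and no smoothing; the abstract does not state which BLEU implementation was used, so the function names `ngrams` and `bleu` and the example sentences are hypothetical and not taken from the corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference (illustrative only)."""
    cand = candidate.split()  # assumption: whitespace tokenization
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions with uniform weights.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "saya pergi ke pasar hari ini"
    print(bleu("saya pergi ke pasar hari ini", reference))
    print(bleu("saya pergi ke pasar", reference))
```

A candidate identical to the reference scores 1.0 (100%), while a correct but shorter candidate is reduced by the brevity penalty. Reported corpus-level scores are normally computed over all test sentences together rather than by averaging per-sentence scores, so this sketch would not be expected to reproduce the figures above.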