Verse Search System for Sound Differences in the Qur’an Based on the Text of Phonetic Similarities

Abstract
The Qur’an is a long text, so finding a particular verse manually is difficult and a verse search system is needed. One such system, designed around Indonesian pronunciation, is Lafzi, which searches for verse fragments using keywords written in Latin characters. Lafzi has been extended into Lafzi+, which can find verses whose sound differs at stop signs. However, Lafzi+ handles only the sound differences at stop signs and cannot be applied throughout the Qur’an. This research therefore develops a system that also handles sound differences in the middle of a verse and can be applied throughout the Qur’an. Verse search uses the N-gram method, specifically trigrams. The text is first normalized through phonetic coding and then tokenized into trigrams; the trigrams of the query are matched against those of the corpus, and the matches enter a ranking process that produces the output candidates. In the ranking process, the LIS (Longest Increasing Subsequence) method is used to obtain an ordered, tightly spaced trigram sequence. The candidate with the highest score is placed at the top of the output. The system achieved a recall of 100% and a MAP (mean average precision) of 87%.
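
The pipeline summarized in the abstract (phonetic normalization, trigram tokenization, trigram matching, LIS-based ranking) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the PHONETIC_RULES table is a hypothetical toy mapping rather than the phonetic coding used by Lafzi or by this system, the three-verse corpus is a made-up transliteration sample, and the score (LIS length divided by the number of query trigrams) is one plausible choice that does not model how tightly the matches are spaced.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical toy rules that merge similar-sounding Latin spellings.
# This is NOT the actual phonetic coding table of Lafzi or of the paper.
PHONETIC_RULES = [("sy", "s"), ("sh", "s"), ("kh", "h"), ("q", "k"), ("z", "s")]


def normalize(text):
    """Lowercase, keep letters only, apply the toy rules, collapse doubled letters."""
    code = "".join(ch for ch in text.lower() if ch.isalpha())
    for src, dst in PHONETIC_RULES:
        code = code.replace(src, dst)
    out = []
    for ch in code:
        if not out or out[-1] != ch:
            out.append(ch)
    return "".join(out)


def trigrams(code):
    """Return every overlapping three-character window of the phonetic code."""
    return [code[i:i + 3] for i in range(len(code) - 2)]


def lis_length(seq):
    """Length of the longest strictly increasing subsequence (patience method)."""
    tails = []
    for x in seq:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


def build_index(corpus):
    """Map each trigram to the (verse id, position) pairs where it occurs."""
    index = defaultdict(list)
    for verse_id, text in corpus.items():
        for pos, tri in enumerate(trigrams(normalize(text))):
            index[tri].append((verse_id, pos))
    return index


def search(query, index):
    """Rank verses by how many query trigrams they contain in the same order."""
    q_tris = trigrams(normalize(query))
    if not q_tris:
        return []
    hits = defaultdict(list)
    for tri in q_tris:
        # Visit positions in descending order so that a strictly increasing
        # subsequence can use at most one position per query trigram.
        for verse_id, pos in reversed(index.get(tri, [])):
            hits[verse_id].append(pos)
    # Assumed score: share of query trigrams found in order within the verse.
    scored = [(lis_length(p) / len(q_tris), v) for v, p in hits.items()]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


# Illustrative sample only: three transliterated verses keyed by surah:ayah.
corpus = {
    "1:1": "bismillahirrahmanirrahim",
    "1:2": "alhamdulillahi rabbil alamin",
    "112:1": "qul huwallahu ahad",
}
index = build_index(corpus)
results = search("kul huwallohu ahad", index)
```

Here the query spells the verse differently in the middle ("huwallohu" instead of "huwallahu"), so only some of its trigrams match, but the ones that do match appear in the same order as in verse 112:1, which is what the LIS step rewards. An exact spelling of the same verse would match every trigram in order and receive the maximum score under this assumed scoring.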