Cyberbullying Detection on Twitter using Support Vector Machine Classification Method

Abstract
Bullying is the repeated attacking of a person or a group of individuals. With the advancement of the internet, it has become very easy for people to engage in bullying by attacking a person or group in ways that hurt the victim; this is known as cyberbullying. Twitter is a social media platform that people use to share information, but it can also be used to perpetrate cyberbullying by sending messages (tweets) addressed to the victims. This final project developed a system to detect cyberbullying on Twitter. The system uses the Support Vector Machine method to classify whether shared tweets contain cyberbullying or not. In addition, this research uses Term Frequency-Inverse Document Frequency (TF-IDF) and N-gram feature extraction on data that has gone through the pre-processing stage. To collect data, the author crawled tweets based on the keywords 'jelek', 'bodoh', 'goblok', 'brengsek', 'bangsat', 'memalukan', 'laknat', 'bacot' and 'pelacur'. The best performance obtained in this research is 76.2% accuracy, 73.2% precision, 78.2% recall and 75.6% F1-Score, produced by the RBF kernel with a total of n=1
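
The TF-IDF and N-gram feature extraction mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names `word_ngrams` and `tfidf`, the whitespace tokenization, the weighting variant (relative term frequency multiplied by log(N/df)), and the three example tweets are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_ngrams(tokens, n):
    """Return the contiguous word n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Compute one {n-gram: weight} dictionary per document.

    Assumed variant (hypothetical, not taken from the paper):
      tf  = count of the n-gram / number of n-grams in the document
      idf = log(number of documents / documents containing the n-gram)
    """
    # Lowercase and split on whitespace; real pre-processing would do more.
    grams = [word_ngrams(doc.lower().split(), n) for doc in docs]

    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for g in grams:
        df.update(set(g))

    n_docs = len(docs)
    weights = []
    for g in grams:
        counts = Counter(g)
        total = len(g)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical example tweets, invented for illustration only.
docs = ["kamu bodoh sekali", "kamu baik sekali", "cuaca bagus hari ini"]
unigram_weights = tfidf(docs, n=1)
bigram_weights = tfidf(docs, n=2)
```

In this sketch, 'bodoh' receives a higher weight in the first tweet than 'kamu', because 'kamu' also occurs in another document. Weight dictionaries of this kind would then be converted to fixed-length vectors and passed to a classifier such as an SVM with an RBF kernel.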