TWITTER BUZZER DETECTION SYSTEM USING TWEET SIMILARITY FEATURE AND SUPPORT VECTOR MACHINE

Ahmad Mustofa, Fitrah Maharani Humaira, Myrna Ermawati, Peni Sriwahyu Natasari, Akhmad Arif Kurdianto, Aries Alfian Prasetyo, A Labib Fardany Faisal

Abstract


Over the past few years, social media has made it easy for people to get and share information. Some of that information is false, created by buzzer accounts that intend to steer people toward a specific opinion. Politicians often use buzzer accounts on social media to maintain a good public image. The main characteristic of a buzzer account is that it posts the same content repeatedly within a certain period. Before analyzing data taken from social media such as Twitter, we therefore need a buzzer detection system to filter out data from buzzer users. This research builds a buzzer detection system using text processing and classification methods. We use the similarity of an account's tweets as a feature, computed by applying Cosine Similarity to the Term Frequency - Inverse Document Frequency (TF-IDF) representation of the tweets. We also use the number of followers, the number of followings, the intensity of tweets, the ratio of retweets, and the ratio of tweets that contain links as additional features. These features are the inputs to a Support Vector Machine model that determines whether an account is a buzzer or not. The system achieves promising results: 89% accuracy, 86.67% precision, 70.91% recall, and a 78% F1-score.
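The tweet similarity feature described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the use of the mean over all tweet pairs as the per-account score, and the function names (tfidf_vectors, cosine_similarity, mean_pairwise_similarity) are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(tweets):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tweet."""
    # Assumption: simple lowercase whitespace tokenization.
    tokenized = [tweet.lower().split() for tweet in tweets]
    n_docs = len(tokenized)
    # Document frequency: number of tweets that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Assumption: smoothed IDF; the paper's exact weighting is not stated.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(tweets):
    """Average cosine similarity over all pairs of an account's tweets."""
    if len(tweets) < 2:
        return 0.0
    vectors = tfidf_vectors(tweets)
    sims = [cosine_similarity(a, b) for a, b in combinations(vectors, 2)]
    return sum(sims) / len(sims)


# Hypothetical accounts: one repeats itself, the other does not.
repetitive = ["vote for candidate x now", "vote for candidate x now",
              "vote for candidate x today"]
varied = ["traffic is bad this morning", "trying a new noodle recipe",
          "the match last night was great"]
print(mean_pairwise_similarity(repetitive) > mean_pairwise_similarity(varied))
```

An account that repeats near-identical tweets scores close to 1, while an account with varied tweets scores close to 0. In a full pipeline this score would be one column of the feature vector, alongside the follower, following, intensity, retweet, and link features, that is passed to the Support Vector Machine.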

Keywords


social media; buzzer detection; text processing; support vector machine


DOI: http://dx.doi.org/10.36564/njca.v8i1.306

DOI (PDF): http://dx.doi.org/10.36564/njca.v8i1.306.g110



Copyright (c) 2023 Ahmad Mustofa

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

NJCA (Nusantara Journal of Computers and Its Applications)
Published by Computer Society of Nahdlatul Ulama, Indonesia.
Office: PO Box 1, Paiton, Probolinggo, postal code 67291, Jawa Timur, Indonesia
