PARAMETRIZED EVENT ANALYSIS FROM SOCIAL NETWORKS

Abstract
The growth of data in social networks drives demand for data analysis, and event detection is of increasing interest to researchers. Real-life events are actively discussed in the virtual space, and event detection results can be used in a variety of applications, from digital marketing to collecting data about natural disasters. Consequently, new algorithms continue to emerge alongside improvements to existing solutions in the event detection field. This paper proposes improvements to the SEDTWik (Segment-based Event Detection from Tweets using Wikipedia) algorithm. SEDTWik is designed to detect events without contextual guidance: its detection process does not accommodate a topic-guided or multi-topic-guided (semi-supervised) approach. As a result, some interesting, narrowly focused events go undetected because they are only weakly relevant in a broader context (e.g., Wikipedia), even though they are relevant within a conditioned context. There is therefore a need for an adaptive perspective in which data is analysed against a set of narrower topics of interest. This paper shows that SEDTWik gains expressive power when extended with multi-topic semi-supervision. The proposal is evaluated on Events2012, a well-known corpus of labeled events. The Events2012 dataset assigns each event a category, meaning that events are grouped by topic. SEDTWik with topic dictionaries was evaluated across all categories. The main part of the article also explains how topic dictionaries are constructed from the labeled tweets of Events2012. At this stage of the research, unigrams were used in all tasks. SEDTWik with dictionaries showed improved accuracy and found more events within a given category.