Identifying Topical Shifts in Twitter Streams: An Integration of Non-negative Matrix Factorisation, Sentiment Analysis and Structural Break Models for Large Scale Data
Abstract: We propose an integration of Non-negative Matrix Factorisation, Sentiment Analysis and Structural Break Models to identify significant topical shifts on the social media platform Twitter. For the topic modelling, we compare Latent Dirichlet Allocation and Non-negative Matrix Factorisation in terms of their applicability to short text documents. Sentiment is extracted with the rule-based VADER model. Structural breaks in the relative frequency and daily sentiment of topics over time are identified with the Bai-Perron model. Combining these methods, we provide a valuable and easy-to-use exploratory tool for social scientists to study the discourse on Twitter over time. Detecting statistically significant shifts in topics over time enables researchers to perform statistical inference and test hypotheses about the discourse on Twitter. The framework is implemented efficiently so that it can be run on average consumer hardware in a reasonable amount of time. A case study with COVID-19 related tweets in the UK is provided. We validate the method by using the timestamps of the COVID-19 related tweets to link the detected topical shifts to real-world events.
Keywords: Twitter / Social media / Topic model / Non-negative Matrix Factorisation / Sentiment analysis / Structural Break Models
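The structural break step described in the abstract can be illustrated with a minimal sketch. It covers only the least-squares break-date estimation that underlies Bai-Perron-type models, for a pure mean-shift series (for example, the daily relative frequency of one topic) with a fixed number of breaks. It omits the significance tests and the selection of the number of breaks, and it is not the authors' implementation. The names `segment_ssr`, `estimate_breaks` and `min_size` are hypothetical and chosen for illustration.

```python
from itertools import accumulate


def segment_ssr(prefix, prefix_sq, i, j):
    """Sum of squared residuals of y[i:j] around its own mean."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def estimate_breaks(y, n_breaks, min_size=2):
    """Return the indices at which new regimes start, chosen to minimise
    the total within-segment sum of squared residuals.

    Assumption: each regime has a constant mean; min_size is the minimum
    number of observations per regime (a trimming parameter).
    """
    T = len(y)
    if (n_breaks + 1) * min_size > T:
        raise ValueError("series too short for the requested number of breaks")

    # Prefix sums make the cost of any segment an O(1) lookup.
    prefix = [0.0] + list(accumulate(y))
    prefix_sq = [0.0] + list(accumulate(v * v for v in y))

    inf = float("inf")
    # cost[k][j]: minimal SSR for y[:j] split into k + 1 segments.
    cost = [[inf] * (T + 1) for _ in range(n_breaks + 1)]
    back = [[0] * (T + 1) for _ in range(n_breaks + 1)]

    for j in range(min_size, T + 1):
        cost[0][j] = segment_ssr(prefix, prefix_sq, 0, j)

    # Dynamic programme over the position of the last break.
    for k in range(1, n_breaks + 1):
        for j in range((k + 1) * min_size, T + 1):
            for i in range(k * min_size, j - min_size + 1):
                c = cost[k - 1][i] + segment_ssr(prefix, prefix_sq, i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    back[k][j] = i

    # Walk the back-pointers from the end of the series.
    breaks = []
    j = T
    for k in range(n_breaks, 0, -1):
        j = back[k][j]
        breaks.append(j)
    return sorted(breaks)


if __name__ == "__main__":
    # Synthetic daily topic shares with level shifts at days 5 and 10.
    shares = [0.1] * 5 + [0.5] * 5 + [0.2] * 5
    print(estimate_breaks(shares, n_breaks=2))
```

The dynamic programme searches all admissible partitions in O(m T^2) time for m breaks and T observations, which keeps the search cheap for daily series. In practice, the estimated break dates would still need to be tested for significance before being interpreted as topical shifts.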