LogMine
- 24 October 2016
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 1573-1582
- https://doi.org/10.1145/2983323.2983358
Abstract
Modern engineering incorporates smart technologies in all aspects of our lives. Smart technologies are generating terabytes of log messages every day to report their status. It is crucial to analyze these log messages and present usable information (e.g., patterns) to administrators, so that they can manage and monitor these technologies. Patterns minimally represent large groups of log messages and enable the administrators to do further analysis, such as anomaly detection and event prediction. Although patterns exist commonly in automated log messages, recognizing them in massive sets of log messages from heterogeneous sources without any prior information is a significant undertaking. We propose a method, named LogMine, that extracts high quality patterns for a given set of log messages. Our method is fast, memory efficient, accurate, and scalable. LogMine is implemented in a map-reduce framework for distributed platforms to process millions of log messages in seconds. LogMine is a robust method that works for heterogeneous log messages generated in a wide variety of systems. Our method exploits algorithmic techniques to minimize the computational overhead based on the fact that log messages are always automatically generated. We evaluate the performance of LogMine on massive sets of log messages generated in industrial applications. LogMine has successfully generated patterns which are as good as the patterns generated by an exact and unscalable method, while achieving a 500X speedup. Finally, we describe three applications of the patterns generated by LogMine in monitoring large scale industrial systems.
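To make the notion of a "pattern" in the abstract concrete, here is a minimal illustrative sketch. It is not the LogMine algorithm, which the abstract does not specify; it assumes whitespace tokenization, messages with equal token counts, and a `*` wildcard, and the function name `merge_into_pattern` and the sample log lines are hypothetical.

```python
from typing import List

WILDCARD = "*"  # assumed placeholder for a variable field; not taken from the paper


def merge_into_pattern(messages: List[str]) -> str:
    """Collapse equal-length, whitespace-tokenized messages into one pattern.

    Tokens that are identical across all messages are kept; positions where
    the messages disagree are replaced by WILDCARD.
    """
    token_rows = [m.split() for m in messages]
    if not token_rows:
        raise ValueError("need at least one message")
    length = len(token_rows[0])
    if any(len(row) != length for row in token_rows):
        raise ValueError("this toy sketch only handles equal token counts")

    pattern = []
    for column in zip(*token_rows):
        # A column with a single distinct token is a constant field.
        pattern.append(column[0] if len(set(column)) == 1 else WILDCARD)
    return " ".join(pattern)


# Hypothetical log lines, invented for illustration only.
logs = [
    "user alice logged in from 10.0.0.1",
    "user bob logged in from 10.0.0.7",
]
print(merge_into_pattern(logs))  # prints: user * logged in from *
```

One pattern such as `user * logged in from *` stands in for every message in its group, which is the sense in which patterns "minimally represent" large sets of log messages.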
This publication has 15 references indexed in Scilit:
- LogDiver. Published by Association for Computing Machinery (ACM), 2015
- Searching and mining trillions of time series subsequences under dynamic time warping. Published by Association for Computing Machinery (ACM), 2012
- The unified logging infrastructure for data analytics at Twitter. Proceedings of the VLDB Endowment, 2012
- LogBase. Proceedings of the VLDB Endowment, 2012
- Parallel data processing with MapReduce. ACM SIGMOD Record, 2012
- Exact Discovery of Time Series Motifs. Published by Society for Industrial & Applied Mathematics (SIAM), 2009
- MapReduce. Communications of the ACM, 2008
- Log-based indexing to improve web site search. Published by Association for Computing Machinery (ACM), 2007
- OPTICS. ACM SIGMOD Record, 1999
- Fast subsequence matching in time-series databases. ACM SIGMOD Record, 1994