A Hybrid Approach to Retrieve Knowledge from a Document
- 1 January 2020
- journal article
- research article
- Published by IGI Global in International Journal of Knowledge Management
- Vol. 16 (1), 83-100
- https://doi.org/10.4018/ijkm.2020010104
Abstract
Retrieving the theme of a document and presenting it to the user in a form shorter than the original text is a challenging task. This article presents a hybrid approach to extracting knowledge from a text document, in which three key sentence-level relationships are used together with the Markov clustering algorithm to cluster the sentences in the document. After clustering, the sentences in each cluster are ranked and the highest-ranked sentences from each cluster are merged. Finally, to obtain the theme of the document, the gradient boosting technique XGBoost is used to compress the newly generated sentence. The proposed system is evaluated on the DUC-2002 data set, and its performance is observed to be better than that of other existing systems.
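The clustering step mentioned in the abstract can be illustrated with a minimal sketch of the general Markov clustering procedure (alternating expansion and inflation of a column-stochastic matrix). This is not the authors' implementation: the abstract does not name the three sentence-level relationships, so the `similarity` matrix, the function names, and the parameter values below are assumptions made for illustration only.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster sentence indices from a symmetric, non-negative similarity matrix.

    `similarity[i][j]` is a hypothetical score between sentences i and j;
    how the paper derives it is not stated in the abstract.
    """
    n = len(similarity)
    # Add self-loops on the diagonal, as is usual for Markov clustering.
    m = [[1.0 if i == j else float(similarity[i][j]) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: take two random-walk steps
        # Inflation: raise entries to a power, then renormalize the columns.
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries mass groups the columns it attracts.
    clusters = {tuple(j for j in range(n) if row[j] > 1e-6) for row in m}
    return sorted(list(c) for c in clusters if c)


# Toy example: sentences 0-2 are mutually similar, as are sentences 3-4.
toy_similarity = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
print(markov_cluster(toy_similarity))
```

In a full pipeline of the kind the abstract outlines, the sentences in each resulting cluster would then be ranked and the top-ranked ones merged before compression; those later steps are not sketched here because the abstract does not specify them.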
This publication has 14 references indexed in Scilit:
- Big Data, the Internet of Things, and the Revised Knowledge Pyramid. ACM SIGMIS Database: the DATABASE for Advances in Information Systems, 2017
- XGBoost. Published by Association for Computing Machinery (ACM), 2016
- A Hybrid Approach to Multi-document Summarization of Opinions in Reviews. Published by Association for Computational Linguistics (ACL), 2014
- Abstractive Summarization of Product Reviews Using Discourse Structure. Published by Association for Computational Linguistics (ACL), 2014
- A Revised Knowledge Pyramid. International Journal of Knowledge Management, 2013
- Multi-document Summarization of Evaluative Text. Computational Intelligence, 2012
- Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinformatics, 2006
- Sentence Fusion for Multidocument News Summarization. Computational Linguistics, 2005
- Learning to paraphrase. Published by Association for Computational Linguistics (ACL), 2003
- Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 2001