A Hybrid Approach to Retrieve Knowledge from a Document

1 January 2020

journal article
research article
Published by IGI Global in International Journal of Knowledge Management

Vol. 16 (1), 83-100
https://doi.org/10.4018/ijkm.2020010104

Abstract

Retrieving the theme of a document and presenting it to the user in a form shorter than the original text is a challenging task. This article presents a hybrid approach to extracting knowledge from a text document, in which three key sentence-level relationships are used together with the Markov clustering algorithm to cluster the sentences of the document. After clustering, the sentences in each cluster are ranked, and the highest-ranked sentences from each cluster are merged. Finally, to obtain the theme of the document, the gradient boosting technique XGBoost is used to compress the newly generated sentence. The proposed system is evaluated on the DUC-2002 data set, where it is observed to perform better than existing systems.
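The clustering and ranking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the three sentence-level relationships, so a simple word-overlap (Jaccard) similarity stands in for them, and ranking by within-cluster similarity is likewise an assumption. The function names (`jaccard`, `similarity_matrix`, `markov_cluster`, `top_sentence`) and the inflation value are hypothetical choices for illustration, and the XGBoost compression step is left out.

```python
import numpy as np


def jaccard(a, b):
    """Word-overlap similarity; an assumed stand-in for the paper's relationships."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def similarity_matrix(sentences):
    """Symmetric sentence-by-sentence similarity matrix with a zero diagonal."""
    n = len(sentences)
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                sim[i, j] = jaccard(sentences[i], sentences[j])
    return sim


def markov_cluster(sim, inflation=2.0, max_iter=100, tol=1e-6):
    """Markov clustering: alternate expansion and inflation until the matrix settles."""
    m = sim + np.eye(len(sim))               # add self-loops
    m = m / m.sum(axis=0, keepdims=True)     # make columns sum to 1
    for _ in range(max_iter):
        prev = m
        m = m @ m                            # expansion: two-step random walk
        m = np.power(m, inflation)           # inflation: sharpen strong flows
        m = m / m.sum(axis=0, keepdims=True)
        if np.abs(m - prev).max() < tol:
            break
    # Each row that still carries mass defines a cluster: the columns it attracts.
    clusters = []
    for row in m:
        members = frozenset(np.flatnonzero(row > 1e-6).tolist())
        if members and members not in clusters:
            clusters.append(members)
    return [sorted(c) for c in clusters]


def top_sentence(sim, cluster):
    """Assumed ranking: the member most similar to the rest of its cluster."""
    return max(cluster, key=lambda i: sum(sim[i, j] for j in cluster if j != i))


sentences = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "stock prices fell sharply today",
    "stock prices rose sharply today",
]
sim = similarity_matrix(sentences)
clusters = markov_cluster(sim)
summary = [sentences[top_sentence(sim, c)] for c in clusters]
print(clusters)
print(summary)
```

With these four toy sentences, the two topics share no words, so the similarity graph splits into two disconnected blocks and the procedure yields one representative sentence per block. A real system would replace the Jaccard measure with the relationships defined in the article and would then fuse and compress the selected sentences.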
