Improved Term Weighting Technique for Automatic Web Page Classification
Open Access
- 1 January 2016
- journal article
- Published by Scientific Research Publishing, Inc. in Journal of Intelligent Learning Systems and Applications
- Vol. 08 (04), 63-76
- https://doi.org/10.4236/jilsa.2016.84006
Abstract
Automatic web page classification has become a necessity for web directories because of the multitude of pages on the World Wide Web. In this paper, an improved term weighting technique is proposed for automatic and effective classification of web pages. Web documents are represented as sets of features. The proposed method selects and extracts the most prominent features, reducing the high-dimensionality problem of the classifier. Proper selection of features from the large set improves the performance of the classifier. The proposed algorithm is implemented and tested on a benchmark dataset. The results show better performance than most existing term weighting techniques.
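As a rough illustration of the general idea in the abstract (weight terms, then keep only the most prominent ones as features), the sketch below uses plain TF-IDF weighting followed by a top-k cut. This is a standard baseline, not the improved technique proposed in the paper, which the abstract does not specify. The function names `tfidf_weights` and `top_k_terms` and the tokenized-list input format are assumptions made for this example.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Return one {term: weight} dict per tokenized document (plain TF-IDF).

    Assumption: this is the textbook baseline, not the paper's method.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

def top_k_terms(weights, k):
    """Keep the k terms with the highest total weight across all documents."""
    totals = Counter()
    for w in weights:
        totals.update(w)  # adds the weights term by term
    return [term for term, _ in totals.most_common(k)]
```

A term that occurs in every document receives weight zero under this scheme, so it is dropped first when the vocabulary is cut to k features. That is the dimensionality reduction the abstract refers to, here in its simplest form.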
This publication has 12 references indexed in Scilit:
- Hybridized term-weighting method for Dark Web classification. Neurocomputing, 2016
- Hybrid dimension reduction by integrating feature selection with feature extraction method for text clustering. Expert Systems with Applications, 2015
- A novel approach for effective web page classification. International Journal of Data Mining, Modelling and Management, 2013
- A study of term weighting schemes using class information for text classification. Association for Computing Machinery (ACM), 2012
- A semantic term weighting scheme for text categorization. Expert Systems with Applications, 2011
- Improved web page identification method using neural networks. International Journal of Computational Intelligence and Applications, 2011
- Supervised and Traditional Term Weighting Methods for Automatic Text Categorization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008
- Web page feature selection and classification using neural networks. Information Sciences, 2004
- Supervised term weighting for automated text categorization. Association for Computing Machinery (ACM), 2003
- Training algorithms for linear text classifiers. Association for Computing Machinery (ACM), 1996