Review of Web Page Classification and Web Content Mining

Chaithra, Dr. G.M. Lingaraju and Dr. S. Jagannatha

The Internet is a vast store of information. Users typically search for a product and try to obtain the required information about it by searching across all the available servers that hold relevant information sources. Web content mining extracts features of a product and labels them with attributes in the result. Labelling, the process of naming and identifying the attributes, is done after information retrieval. Once extraction and labelling are complete, the information gained can be used for a thorough analysis of the product. Web content mining integrates data or information collected from various data sources worldwide by analysing customers' views. This paper surveys web content mining methods and gives a few applications of web content mining. It also describes the hypertext features used to classify web pages, such as text, title, HTML, and anchor text. Initial experimental results reported in many studies indicate that combining one or two of these features improves classifier performance.
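As a rough illustration of the hypertext features named above (title, anchor text, and remaining page text), the following minimal Python sketch separates them from an HTML page and then merges them into one field-tagged token list that a classifier could consume. It is not taken from the paper: the names `FeatureExtractor`, `extract_features`, and `combined_tokens`, the lowercase whitespace tokenisation, and the `field:word` tagging scheme are all assumptions made for this example.

```python
from html.parser import HTMLParser


class FeatureExtractor(HTMLParser):
    """Hypothetical helper: splits page words into title, anchor, and other text."""

    def __init__(self):
        super().__init__()
        self.features = {"title": [], "anchor": [], "text": []}
        self._open = []  # currently open <title> / <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "a"):
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            self._open.remove(tag)

    def handle_data(self, data):
        # Assumed tokenisation: lowercase and split on whitespace.
        words = data.lower().split()
        if not words:
            return
        if "title" in self._open:
            self.features["title"].extend(words)
        elif "a" in self._open:
            self.features["anchor"].extend(words)
        else:
            self.features["text"].extend(words)


def extract_features(html):
    """Return a dict mapping each feature field to its list of words."""
    parser = FeatureExtractor()
    parser.feed(html)
    parser.close()
    return parser.features


def combined_tokens(features):
    """Combine fields by tagging each word with the field it came from."""
    return [f"{field}:{word}"
            for field, words in features.items()
            for word in words]


page = ("<html><head><title>Cheap Phones</title></head>"
        "<body><p>Buy now</p><a href='/x'>phone deals</a></body></html>")
features = extract_features(page)
tokens = combined_tokens(features)
```

Tagging each word with its field keeps, for example, a word in the title distinct from the same word in the body, so a downstream classifier can weight the fields differently. Whether this helps on a given data set is an empirical question.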

Volume 11 | Issue 10

Pages: 142-147

DOI: 10.5373/JARDCS/V11I10/20193017