Volume : II, Issue : IX, September - 2013

Template Detection Technique From Assorted Web Pages

Marriboyina Rajendra, S. Suresh Babu

Abstract :

Automatically extracting structured information from unstructured or semi-structured machine-readable documents now plays a major role in web applications, and the World Wide Web is the main source of such documents. To improve publishing productivity, many websites populate common templates with content. Template detection has therefore received considerable attention as a way to improve the performance of search engines and of the clustering and classification of web documents, because irrelevant template terms degrade the performance and accuracy of web applications that process pages by machine. In this paper, we present novel algorithms for extracting templates from a large number of web documents that are generated from heterogeneous templates. We cluster the web documents by the similarity of their underlying template structures, so that the template for each cluster is extracted simultaneously. We argue that our algorithm is the most efficient among existing template detection algorithms.
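The idea of clustering pages by structural similarity and reading a template off each cluster can be illustrated with a minimal sketch. This is not the algorithm of the paper, which the abstract does not specify; it assumes that a page is represented by its set of root-to-node tag paths (with text nodes as leaves), that similarity is the Jaccard coefficient, and that clustering is a greedy single pass with a hypothetical `threshold` parameter. The names `PathCollector`, `page_paths`, `jaccard` and `cluster_pages` are illustrative only.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag (assumed subset, for illustration).
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class PathCollector(HTMLParser):
    """Collects the set of root-to-node paths (tags and text) in a page."""

    def __init__(self):
        super().__init__()
        self.stack = []     # currently open tags, outermost first
        self.paths = set()  # e.g. "html/body/div/p"

    def handle_starttag(self, tag, attrs):
        self.paths.add("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to (and including) the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.paths.add("/".join(self.stack) + "/#" + text)


def page_paths(html):
    """Return the set of paths that occur in one HTML document."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return frozenset(collector.paths)


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_pages(pages, threshold=0.5):
    """Greedy one-pass clustering (an assumption, not the paper's method).

    Each cluster keeps the paths shared by all of its members so far;
    that shared set is treated as the cluster's template.
    """
    clusters = []
    for index, html in enumerate(pages):
        paths = page_paths(html)
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = jaccard(paths, cluster["template"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"template": set(paths), "members": [index]})
        else:
            best["template"] &= paths  # keep only what every member shares
            best["members"].append(index)
    return clusters
```

In this sketch the template of a cluster is simply the intersection of its members' path sets, so page-specific text drops out as more pages join the cluster, while shared markup and boilerplate text remain.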

DOI : 10.36106/ijsr

Cite This Article:

Marriboyina Rajendra, S. Suresh Babu, Template Detection Technique From Assorted Web Pages, International Journal of Scientific Research, Vol : 2, Issue : 9, September 2013
