Paper

Web Crawling Algorithms


Authors:
Aviral Nigam
Abstract
As the Internet grows rapidly, it has become important to make the search for content faster and more accurate. Without efficient search engines, it would be impractical to obtain accurate results. Search engines address this problem with software called a “Web Crawler”, which uses various algorithms to traverse the Web. These algorithms rely on heuristic functions to increase the efficiency of the crawlers. A* and Adaptive A* are among the best path-finding algorithms. A* is a best-first search that finds the least-cost path from a given initial node to a goal node. In this work, we study some of the existing Web Crawler algorithms and modify the A* and Adaptive A* methods for use in this domain. Because they are heuristic approaches, A* and Adaptive A* can be used to find desired results in web-like weighted environments. We create a virtual web environment using graphs and compare, across various web crawling algorithms, the time taken to reach the desired node from a random starting node.
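To illustrate the A* search that the abstract refers to, the following is a minimal sketch on a small weighted graph. It is not the modified algorithm from the paper: the graph `toy_web`, its link weights, and the heuristic table `estimate` are hypothetical values chosen only for illustration, and the function name `a_star` is an assumption.

```python
import heapq

def a_star(graph, start, goal, heuristic):
    """Return (cost, path) of a least-cost path, or (None, []) if unreachable.

    graph maps each node to a dict {neighbor: edge_weight}.
    heuristic(node) estimates the remaining cost to the goal and is
    assumed never to overestimate it (admissible).
    """
    # Frontier entries are (f = g + h, g, node, path so far).
    frontier = [(heuristic(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for neighbor, weight in graph.get(node, {}).items():
            new_g = g + weight
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + heuristic(neighbor), new_g, neighbor, path + [neighbor]),
                )
    return None, []

# Hypothetical toy "web": pages are nodes, link weights are made-up costs.
toy_web = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
# Hypothetical admissible estimates of the remaining cost to page "D".
estimate = {"A": 3, "B": 2, "C": 1, "D": 0}

cost, path = a_star(toy_web, "A", "D", lambda page: estimate[page])
print(cost, path)  # 4 ['A', 'B', 'C', 'D']
```

With a heuristic that always returns 0, the same function behaves like uniform-cost (Dijkstra) search, which is one way to see what the heuristic contributes.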
Keywords
Web Crawler; A*; Adaptive A*
Pages
63-67
DOI
10.5963/IJCSAI0403001