Not only have 'eight-legged freaks' been a hit at the box office recently; they also infest the World Wide Web. There are numerous species of spiders (alias 'bots' and 'wanderers') to be found scuttling along its threads. But these itsy-bitsy virtual critters should neither be screamed at nor squashed. For, just like real spiders, they have a valuable role to play in the information jungle. They are the unsung heroes of online data gathering, enslaved by an all-powerful entity that exploits them mercilessly - the search engine.

This digital despot was created to accelerate searches within the Internet. The father of all search engines, who was given the pipe-and-slippers name of Archie, arrived in 1990 to search the evolving Internet for files on FTP (File Transfer Protocol) sites. The next major advance came in the shape of a tool with the equally stuffy name of Veronica, which searched Gopher servers containing text databases.

Then along came a close forerunner of present-day search engines, the bohemian-sounding World Wide Web Wanderer, which rocked all over the Web, tracking its growth. This program also succeeded in causing widespread irritation because, when it visited a Web site, it slowed down the performance of the network on which the site resided. The wayward Wanderer eventually gave way to more sophisticated searchers such as the alarmingly named WWW Worm, Jumpstation and Nasa's Repository Based Software Engineering project.

The Web's explosion in the mid-1990s spawned a new generation of search engines that remain active today. All these engines rely on spiders, so called because they usually visit many sites in parallel, their spindly 'legs' spanning a large area of the Web. Google has GoogleBot, AltaVista has Scooter and Lycos has T-Rex.

Spiders reach out and grab pages from the Internet. If they find a new page, they will take a copy of the data - if they find it, that is. What brings a spider to a page in the first place? The answer is that either an author has begged a search engine to index it, or the spider has found the site by following a link from another page. Just like their biological counterparts, online spiders are paragons of industry - the specimens employed by AltaVista, for instance, will snatch about 10 million pages a day.

After a spider gets its pages, it distributes them to another computer program for indexing. This process identifies the text, links and other content in the page and stores it in the search engine's database files, so that the database can be searched by keyword and by the other, more sophisticated methods on offer (a toy sketch of the whole routine appears below). As a result, the page will be found if your search mirrors its content. However, if the author has neglected to register a Web site with the engines and there are no outside links to it, none of the pages on that site will be found. Bad news if the author is touting for business.

On the other hand, the author might want to stop spiders visiting certain pages on a site because they are 'under construction', over-personal or just rubbish. Realising this, the elusive figures who create search engines have devised a solution called the Robot Exclusion Standard, which allows an author to tell the robots not to index some pages or not to follow certain links - a short example appears below. Unobtainable pages are referred to as the Invisible Web, which is estimated to be two to three or more times bigger than the visible Web.
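In practice, the standard usually takes the form of a small text file called robots.txt sitting at the top of a Web site. A minimal, purely illustrative version (the directory names here are made up) looks like this:

User-agent: *
Disallow: /under-construction/
Disallow: /private/

The first line addresses every spider; the two that follow ask them to keep away from the named directories. Well-behaved spiders check this file before fetching anything else, although the standard is a polite convention rather than a locked door.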
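For readers who like to peer under the bonnet, the spider's basic routine described above - fetch a page, note the links it could follow next and file the page's words in a keyword index - can be sketched in a few lines of Python. This is a toy, not how any real search engine is built: the start page is merely a placeholder, and genuine spiders juggle millions of pages with far cleverer indexing.

# A toy spider and keyword index, for illustration only.
from html.parser import HTMLParser
from urllib.request import urlopen

class PageParser(HTMLParser):
    """Collects the links and the visible text found in one HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Every <a href="..."> is a thread the spider could follow next.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Plain text between the tags becomes raw material for the index.
        self.words.extend(data.lower().split())

def crawl(url, index):
    """Fetch one page, record its words in the index, return its links."""
    html = urlopen(url).read().decode("utf-8", errors="ignore")
    parser = PageParser()
    parser.feed(html)
    for word in parser.words:
        # The index maps each word to the set of pages containing it,
        # so a keyword search is just a dictionary lookup.
        index.setdefault(word, set()).add(url)
    return parser.links

if __name__ == "__main__":
    index = {}
    links = crawl("https://example.com/", index)  # placeholder start page
    print(len(index), "distinct words indexed;", len(links), "links to follow")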
So next time you run a search and your spider fails to come up with more than a smattering of hits, do not blame your furry friend. Some relevant information has probably just been placed beyond the reach of its flailing legs.

Confused by computer jargon? E-mail firstname.lastname@example.org with your questions.