The first popular Internet search engine was Wide Area Information Servers (WAIS), developed by Brewster Kahle at Thinking Machines Corporation in 1988. bruceclay.com has published a flow chart, dated 2001, showing the relationships among the different search engines.
Search engines let you search the Internet for information that interests you, using queries such as “home AND garden AND tomatoes” or “exercise AND arthritis”. Different engines combine criteria such as those listed below in various scoring algorithms to decide which sites to return first in response to each query:
- The number of other sites that link to the site.
- The search words are in the Meta tag.
- The search words are in the URL.
- The search words are in the Title.
- The search words are near the top of the page.
- The search words are in the Keyword tag.
- There are repeated occurrences of the search words.
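The way such criteria might be combined into a single score can be sketched as follows. This is only an illustration: the `Page` fields, the `WEIGHTS` values, and the `TOP_WORDS` and `MAX_REPEATS` cutoffs are all hypothetical, and real engines use their own (usually unpublished) formulas.

```python
import re
from dataclasses import dataclass

# Hypothetical weights, chosen only for illustration.
WEIGHTS = {
    "inbound_link": 1.0,  # per site linking to the page
    "meta": 2.0,          # word appears in the meta description
    "url": 3.0,           # word appears in the URL
    "title": 4.0,         # word appears in the title
    "near_top": 2.0,      # word appears near the top of the page
    "keywords": 1.5,      # word appears in the keyword tag
    "repeat": 0.5,        # per occurrence in the body, up to MAX_REPEATS
}
TOP_WORDS = 50   # assumed meaning of "near the top": the first 50 words
MAX_REPEATS = 5  # assumed cap so that repetition alone cannot dominate


@dataclass
class Page:
    url: str
    title: str
    meta_description: str
    meta_keywords: str
    body: str
    inbound_links: int = 0


def words_of(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, query_words):
    """Return a relevance score for one page against the query words."""
    body = words_of(page.body)
    matched = 0.0
    for word in (w.lower() for w in query_words):
        if word in words_of(page.meta_description):
            matched += WEIGHTS["meta"]
        if word in page.url.lower():
            matched += WEIGHTS["url"]
        if word in words_of(page.title):
            matched += WEIGHTS["title"]
        if word in body[:TOP_WORDS]:
            matched += WEIGHTS["near_top"]
        if word in words_of(page.meta_keywords):
            matched += WEIGHTS["keywords"]
        matched += WEIGHTS["repeat"] * min(body.count(word), MAX_REPEATS)
    if matched == 0:
        return 0.0  # links do not help a page that matches nothing
    return matched + WEIGHTS["inbound_link"] * page.inbound_links


def rank(pages, query_words):
    """Order pages from highest to lowest score."""
    return sorted(pages, key=lambda p: score(p, query_words), reverse=True)
```

Changing the weights changes the ordering, which is one reason different engines can return different results for the same query.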
Search engines don’t have time to search the whole web every time you make a query, so they first build up a large database. Automatic computer programs called robots browse the web continuously, twenty-four hours a day, to find new sites and old sites that have been updated. The robots read the text on each page and add it to the search engine’s database. Some search engines record just the displayed text, while others also index picture tags, link names, and all other textual content.
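The idea of building a database ahead of time and answering queries from it can be sketched with a small in-memory index. This is a minimal sketch, not how any particular engine works: the `pages` dictionary stands in for text a robot has already fetched, and the function names `build_index` and `search_all` are made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it.

    `pages` is a dict of URL -> page text, standing in for the text
    that a robot would have collected from the web.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search_all(index, query):
    """Return the URLs that contain every word of an AND query.

    The token "and" is treated as the operator and dropped, so in this
    simplified sketch the word "and" itself cannot be searched for.
    """
    words = [w for w in tokenize(query) if w != "and"]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


pages = {
    "a": "Home and garden: growing tomatoes",
    "b": "Garden furniture for your home",
    "c": "Exercise tips for arthritis",
}
index = build_index(pages)
print(search_all(index, "home AND garden AND tomatoes"))
```

Because the index is built once and only looked up at query time, answering a query costs a few set lookups and intersections rather than a fresh scan of every page.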
Mathematically, the data structures that these sites use for storage of their databases are special types of “trees” that are particularly compact and can be searched very quickly. The cost of storage and access time for these trees is usually related to the number of consecutive characters indexed, so most search engines limit the length of any given search word to a defined number of characters, such as 32, although the whole query can usually be much longer.
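One tree of this kind is a trie, in which each node represents one character of an indexed word. The sketch below assumes a trie purely for illustration, since the text does not say which tree any given engine uses; the class names and the `MAX_WORD_LEN` constant are hypothetical, with 32 taken from the example limit above.

```python
MAX_WORD_LEN = 32  # example limit from the text; real engines vary


class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> child node
        self.urls = set()   # pages containing the word that ends here


class Trie:
    def __init__(self, max_len=MAX_WORD_LEN):
        self.root = TrieNode()
        self.max_len = max_len

    def insert(self, word, url):
        """Index `word` for `url`, keeping only the first max_len characters."""
        node = self.root
        for ch in word[: self.max_len]:
            node = node.children.setdefault(ch, TrieNode())
        node.urls.add(url)

    def lookup(self, word):
        """Return the URLs indexed under `word`, truncated the same way."""
        node = self.root
        for ch in word[: self.max_len]:
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.urls)


trie = Trie()
trie.insert("tomatoes", "a")
trie.insert("tomato", "b")
print(trie.lookup("tomatoes"))
```

A lookup visits at most one node per character, so capping the word length also caps the depth of the tree and the cost of each search. The trade-off is that two words sharing their first 32 characters become indistinguishable.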
Resources. The following resources provide more information about search engines.
- The searching section provides tips and techniques to help get the best possible results from search engines in the minimum possible time.