Google Search Engine

PageRank is defined as follows: We assume page A has pages T1…Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))

– Sergey Brin, Lawrence Page; The Anatomy of a Large-Scale Hypertextual Web Search Engine; 1998.
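The formula above can be tried out on a toy link graph. The following is a minimal sketch, not Google's implementation: it assumes a small graph held in memory, starts every page at a PageRank of 1.0, and simply re-applies the formula a fixed number of times. The function name `pagerank`, the `links` dictionary, and the three-page example graph are hypothetical and chosen only for illustration.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(T)/C(T)) over a small graph.

    links maps each page to the list of pages it links to (hypothetical data).
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # C(T): number of distinct links going out of each page.
    out_count = {page: len(set(links.get(page, []))) for page in pages}

    # For each page A, the pages T1..Tn that point to it.
    inbound = {page: [] for page in pages}
    for source, targets in links.items():
        for target in set(targets):
            inbound[target].append(source)

    # Assumption: every page starts with a PageRank of 1.0.
    pr = {page: 1.0 for page in pages}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[t] / out_count[t] for t in inbound[page])
            for page in pages
        }
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(example_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this example page C is pointed to by both other pages and ends up with the highest value, while B, which receives only half of A's vote, ends up lowest.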

Google is one of the smartest search engines. It was originally created by Stanford University students Sergey Brin and Larry Page, and retains a well-thought-out structure and methodology. The site runs on thousands of linked PCs distributed in centres around the world, and usually provides relevant results with very fast response times.

Google uses a page ranking system to determine which sites are returned at the top in response to a search query. In Google’s words:

In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote.

This method ensures that the pages most highly recommended by other pages are returned closest to the top of a search listing. This technique is really a kind of centralized analysis of peer-to-peer data, and works quite well.

The Google advanced search page provides a menu-driven way to search the Internet. Text equivalents of these features are listed below:

Search Features

Function   Query Example                 Results
Boolean    +space +mars -venus           “space” and “mars” but not “venus”
           space OR mars                 “space” or “mars”. Note the OR must be capitalized.
           space and (mars OR venus)     “space” and either “mars” or “venus”. Always put brackets around OR clauses connected to an AND.
Phrases    +electric +"fastest car"      Includes the phrase “fastest car”
Fields     allinurl:garden rose          “garden” and “rose” in the URL
           gardens filetype:pdf          Searches only PDF format files
           inurl:garden                  “garden” in the URL
           link:livinginternet.com       Pages that link to “livinginternet.com”
           related:livinginternet.com    Pages that are related to “livinginternet.com”
           site:garden                   “garden” in the site domain name
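The query syntax listed above is plain text, so queries can also be assembled by a program. The following is a rough sketch under stated assumptions: the helper names `build_query` and `search_url` are hypothetical, the operators are written exactly as they appear in this article, and the results page is assumed to accept the query text in a `q` parameter at `https://www.google.com/search`.

```python
from urllib.parse import urlencode


def build_query(required=(), excluded=(), phrases=(), any_of=(), fields=None):
    """Assemble a query string using the operators described in this article."""
    parts = [f"+{term}" for term in required]            # terms that must appear
    parts += [f'+"{phrase}"' for phrase in phrases]      # exact phrases
    if any_of:
        # OR must be capitalized, and the OR clause is wrapped in brackets.
        parts.append("(" + " OR ".join(any_of) + ")")
    parts += [f"-{term}" for term in excluded]           # terms that must not appear
    parts += [f"{name}:{value}" for name, value in (fields or {}).items()]
    return " ".join(parts)


def search_url(query):
    # Assumption: the search page reads the query from a "q" URL parameter.
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(required=["space", "mars"], excluded=["venus"])
print(query)
print(search_url(query))
```

Keeping the query builder separate from the URL builder means the same query text can be pasted into the search box by hand or sent in a URL.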

Other Google search options are described below.

  • Languages. You can set your page display preferences to a wide range of languages, and search in international language sets.
  • SafeSearch. Provides a safe search option where most adult content pages are filtered out and not returned in search results.
  • Wildcards. Google doesn’t support wildcards, so, for example, instead of searching for “comed*” you need to search for “+comedy +comedian”.

Resources. Additional Google-related resources are listed below:

  • johnny.ihackstuff.com — Google searches that display sensitive data mistakenly posted to the web.