Thanks Internet!
http://school.discoveryeducation.com/clipart/clip/stk-fgr6.html
http://listsoplenty.com/pix/tag/cartoon
https://www.facebook.com/ProgrammersJokes
http://www.feld.com/wp/archives/2009/02/unix-time-1234567890-on-valentines-day.html
Vannevar Bush
“wholly new forms of encyclopedias will appear, ready made with a mesh of associative trails running through them, ready to be dropped into the memex and there amplified”
-- “As We May Think”, The Atlantic, July 1945
Sir Tim Berners-Lee
“We should work towards a universal linked information system … to allow a place for any information or reference one felt was important and a way of finding it afterwards.”
-- Founding proposal for “the mesh”, 1989
… the mesh became the web … the web became a mess … “finding it afterwards”? Hah!
Larry Page and Sergey Brin
• Grad students at Stanford
• Worked with Terry Winograd (artificial intelligence)
• Created a web-search algorithm called “backrub”
• Spun off a company, “Google” (named after “googol”)
• Worth about $20 billion each
A cartoon websearch primer
1. Crawl webpages
2. Analyze webpage text (information retrieval)
3. Analyze webpage links
4. Fit measures to human evaluations
5. Produce rankings
6. Continuously update
Andrei Markov
• Studied sequences of random variables.
• The probability that the random variable takes a particular next value depends only on its current value.
• The “page id” is the “random variable” in the Markov chain!
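A minimal sketch of the Markov property for a surfer who only follows links. The four-page graph `links` and the helper `transition_row` are hypothetical, made up for illustration and not taken from the slides. The point is that the distribution of the next page id is computed from the current page alone, with no reference to earlier history.

```python
from fractions import Fraction

# Hypothetical four-page web: page id -> pages it links to (illustrative only).
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}

def transition_row(page):
    """Distribution of the next page id, given only the current page id."""
    out = links[page]
    # Each outgoing link is followed with equal probability 1/deg[page].
    return {j: Fraction(out.count(j), len(out)) for j in set(out)}

# One row per page: together these rows define the Markov chain.
rows = {i: transition_row(i) for i in links}
```

Each row sums to 1, so every row is a probability distribution over the next page.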
Oskar Perron and Georg Frobenius
• Simultaneously discovered when a Markov chain has an “average”
• The “average” of the web? It’s the probability of finding the random surfer at a page.
• In 1907
What pages are important?
Perron and Frobenius proved the following algorithm always converges to a solution …
set prob[i] = 0 for all pages
set p to a random page
for t = 1 to ...
    increment prob[p]
    if rand() < alpha, set p to a random neighbor of p
    else, set p to a random page
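The random-surfer loop above can be sketched in Python roughly as follows. Several details are assumptions rather than part of the slides: the value `alpha=0.85`, the fixed step count, the tiny example graph `links`, and the rule that a page with no outgoing links always jumps to a random page. Visit counts are divided by the number of steps at the end so that they read as probabilities.

```python
import random

def random_surfer(links, alpha=0.85, steps=100_000, seed=0):
    """Estimate page importance by simulating one random surfer."""
    rng = random.Random(seed)          # seeded so the run is repeatable
    pages = list(links)
    visits = {p: 0 for p in pages}     # prob[i] = 0 for all pages
    p = rng.choice(pages)              # start at a random page
    for _ in range(steps):
        visits[p] += 1                 # increment prob[p]
        # Assumption: a page without outgoing links always jumps.
        if links[p] and rng.random() < alpha:
            p = rng.choice(links[p])   # follow a random link
        else:
            p = rng.choice(pages)      # jump to a random page
    # Normalize visit counts into visit frequencies.
    return {q: count / steps for q, count in visits.items()}

# Hypothetical four-page web: page id -> pages it links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
prob = random_surfer(links)
```

In this example graph nothing links to page 3, so the surfer reaches it only through random jumps and it should end up with the smallest estimate.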
Richard von Mises
• Created “the power method”
• An efficient algorithm to “average” a Markov chain
• It updates the probabilities of all pages at once.
“Praktische Verfahren der Gleichungsauflösung”
R. von Mises and H. Pollaczek-Geiringer, 1929
What pages are important?
Using the von Mises method …
set prob[i] = 1/n for all pages
for t = 1 to about 80
    set newprob[i] = 0 for all pages
    for all links from page i to page j
        set newprob[j] += prob[i]/deg[i]
    for all pages i
        set prob[i] = alpha*newprob[i] + (1-alpha)/n
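A hedged Python sketch of the power-method pseudocode above. The slides give no value for alpha, so `alpha=0.85` is an assumption, as is the small example graph `links`. Like the pseudocode, this sketch assumes every page has at least one outgoing link; otherwise some probability would be lost at each iteration.

```python
def pagerank_power(links, alpha=0.85, iterations=80):
    """Power iteration: update the probabilities of all pages at once."""
    pages = list(links)
    n = len(pages)
    prob = {p: 1.0 / n for p in pages}           # prob[i] = 1/n
    for _ in range(iterations):
        newprob = {p: 0.0 for p in pages}        # newprob[i] = 0
        for i in pages:
            for j in links[i]:
                # Page i splits its probability evenly over its deg[i] links.
                newprob[j] += prob[i] / len(links[i])
        # Mix link-following with a uniform random jump.
        prob = {p: alpha * newprob[p] + (1 - alpha) / n for p in pages}
    return prob

# Hypothetical four-page web: page id -> pages it links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
prob = pagerank_power(links)
```

If alpha is 0.85, the error shrinks by roughly that factor per iteration, and 0.85 to the 80th power is on the order of 1e-6, which may be why about 80 iterations are enough.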
The algorithm underlying Google’s analysis of the web is from 1929!
A new status index (1953)
Leo Katz
A paper about how information spreads in groups …
“For example, the information that the new high-school principal is unmarried and handsome might occasion a violent reaction in a ladies garden club and hardly a ripple of interest in a luncheon group of the local chamber of commerce. On the other hand, the luncheon group might be anything but apathetic in its response to information concerning a fractional change in credit buying restrictions announced by the federal government.”
Gene Golub
Popularized numerical computing with matrices via the informal “Golub thesis”: “anything worth computing can be stated as a matrix problem”
William Kahan
Formalized IEEE-754 floating point arithmetic. Made it possible to compute with probabilities as “real numbers” instead of discrete counts.
Credits
Most pictures taken from Google image search.
Original idea from Massimo Franceschet, “PageRank: Standing on the shoulders of giants”.