Basics of Search Engines and Algorithms

Web Trainings Academy presents Part 1 of the SEO Training Series. Learn the concepts of search engines, their architecture, the SERP, and search algorithm updates. Presented by Mohammed Azharuddin.

  1. 1. Search Engine Optimization: How Search Engines Work. Presented by Mohammed Azharuddin
  2. 2. Contact Info • Facebook: Md Azharuddin Barkati • Twitter : mdazhar01 • Gmail : azhar.itguy@gmail.com
  3. 3. History of Search • 1990 – Archie Query Form – FTP-based file search engine • Feb 1993 – Excite.com – General word-relation-based search • Oct 1993 – AliWeb – Manual submission engine • Jan 1994 – AltaVista – First natural language search engine
  4. 4. • Jan 1996 – BackRub – Started by Larry Page and Sergey Brin • Sep 15 1997 – Google.com – First search engine with PageRank technology • 1997 – Yandex.com – Russia-based search engine • 1998 – MSN Search – Microsoft's rival to Google
  5. 5. • 2000 – Baidu.com – China-based search engine • 2008 – duckduckgo.com – Non-tracking search engine • 2009 – Bing.com – Microsoft's rival to Google • 2010 – Blekko.com – Spam- and virus-free search • Sources: http://www.searchenginehistory.com/ http://www.google.co.in/about/company/history/ http://www.wordstream.com/articles/internet-search-engines-history
  6. 6. The Google Story
  7. 7. Search Engine Architecture • Every search engine is based on the following: – Crawling – Indexing – Algorithms – Results – Fighting spam
  8. 8. Google Architecture http://infolab.stanford.edu/~backrub/google.html
  9. 9. Search Engine Architecture • Diagram labels: WWW (60 trillion pages, or 60 lakh crore) – Crawler – Store – Indexer – Indexes (100 million GB) – Algorithms (programs) – Search Interface – Trash – Results sorted based on content / factors • Live Google example
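The crawl, store, index, and search stages named on this slide can be illustrated with a toy inverted index. This is only a minimal sketch under stated assumptions: the pages are a hypothetical in-memory dictionary standing in for crawled content, and the function names `build_index` and `search` are illustrative, not part of any real search engine.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query, sorted by URL."""
    words = query.lower().split()
    if not words:
        return []
    # Start from the pages matching the first word, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


# Hypothetical "crawled" pages (assumption: real crawlers fetch these over HTTP).
pages = {
    "example.com/a": "search engines crawl the web",
    "example.com/b": "engines index pages",
}
index = build_index(pages)
print(search(index, "crawl engines"))
```

A real engine would also rank the matching pages with its algorithms; here the results are simply sorted by URL.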
  10. 10. Algorithms • Programs and formulas used to return relevant results – PageRank – Spell check – Synonym check – Autocomplete – Query understanding – SafeSearch – User context
  11. 11. PageRank Algorithm • One of Google's first algorithms, which looks at links between pages to determine their relevance. • PR is a number generated for each page in the Google index. • Toolbar PR range: N/A to 10 (best rank), based on a log scale of 0 – 10. • Real PageRank is calculated based on the number of pages in the index and can range from 0.15 to trillions.
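  12. 12. Toolbar vs. Real PR • Toolbar 0: real PR 0 – 10 • Toolbar 1: real PR 100 – 1,000 • Toolbar 2: real PR 1,000 – 10,000 • Toolbar 3: real PR 10,000 – 100,000 • Toolbar 4: real PR 100,000 – 1,000,000 • Toolbar 5: real PR 1,000,000 – 10,000,000 • http://www.webworkshop.net/pagerank_calculator.php3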
  13. 13. PR Formula • Updated Formula • Old Formula • D = damping factor • PR(N) = PR of the linking site • L(N) = number of outbound links on the linking site
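The legend on this slide matches the commonly cited form of the formula from the original PageRank paper: PR(A) = (1 - D) + D × Σ PR(N) / L(N), summed over the pages N that link to A. The slide's own formula images are not reproduced in this transcript, so treat that form as an assumption. The sketch below iterates it on a small hypothetical link graph; the function name `pagerank`, the damping value 0.85, and the iteration count are illustrative choices.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate PR(A) = (1 - D) + D * sum(PR(N) / L(N)) over all pages.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    ranks = {page: 1.0 for page in pages}  # starting guess for every page
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Each linking page N passes on PR(N) split across its L(N) links.
            inbound = sum(
                ranks[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - damping) + damping * inbound
        ranks = new_ranks
    return ranks


# Hypothetical graph: A and B link to each other, C links only to A.
ranks = pagerank({"A": ["B"], "B": ["A"], "C": ["A"]})
for page in sorted(ranks):
    print(page, round(ranks[page], 4))
```

In this form, a page with no inbound links settles at 1 - D, which is 0.15 when D is 0.85.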
  14. 14. Example http://en.wikipedia.org/wiki/PageRank http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm
  15. 15. Fighting Spam • Spam refers to websites that use unethical practices to gain search rankings. • To fight spam, Google frequently releases updates called “Algorithm Updates”. • Google changes its search algorithm around 500 – 600 times every year. • Some of these are major updates and some are minor.
  16. 16. Major Updates
  17. 17. • Panda Update – February 23, 2011 – This algorithm targets sites with thin content, content farms, duplicate content, high ad-to-content ratios, and a number of other quality issues. – Affected 12% of queries on launch – Recent update: Panda 4 – May 19, 2014
  18. 18. • Penguin Update – April 24, 2012 – This algorithm targets sites that over-optimize, for example by using excessive links. – Affected 3% of queries on launch – Recent update: Penguin 2.1 – Oct 4, 2013
  19. 19. Hummingbird Update – August 2013 • This algorithm understands the context of a query by analyzing the words in it. • It can automatically rewrite the query internally based on certain words such as “near”, “vs”, “how to”, “where”, “who is”, etc. • Many queries are answered with “One Box Answers” that give quick answers.
  20. 20. How It Works • User Query → Query Translator → Modified Query → Index
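As a rough illustration of the first step a query translator might take, the sketch below only detects the trigger words listed on the Hummingbird slide. It does not rewrite the query, and it is not how Google implements this step; the names `TRIGGERS` and `detect_triggers` are hypothetical.

```python
# Trigger phrases taken from the Hummingbird slide (lower-cased).
TRIGGERS = ["how to", "who is", "where", "near", "vs"]


def detect_triggers(query):
    """Return the trigger phrases that appear as whole words in the query."""
    # Normalize case, drop periods ("Vs." -> "vs"), and collapse whitespace.
    words = query.lower().replace(".", "").split()
    padded = " " + " ".join(words) + " "
    return [trigger for trigger in TRIGGERS if f" {trigger} " in padded]


print(detect_triggers("Banana Vs. Apple"))
print(detect_triggers("Who is wife of Bill Gates"))
```

A fuller translator would use the detected phrase to choose how the query is rewritten before it is looked up in the index.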
  21. 21. One Box Answers Queries • When is Independence of India • Time in India or Time in Toronto • 1$ to INR • 1Mile to Kms • Banana Vs. Apple • Who is wife of Bill Gates • What is my IP • who invented www • Show me pictures of taj mahal
  22. 22. Search Engine Results Page (SERP)
  23. 23. Types of Results • Paid Results – PPC Ads – Comparison Ads – Shopping Ads • Non-Paid Results – Organic Web – News Results – Image Results – Local Results – Video Results – Site Links – Schema Data
  24. 24. Click Through Rate (CTR) • CTR measures how many users click through to the site from the SERP. • CTR helps to understand user response. • The top four positions, which are “above the fold” for many desktop users, receive 83% of first-page organic clicks. • CTR = (No. of Clicks / No. of Impressions) × 100
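The CTR formula on this slide translates directly into code. This is a minimal sketch; the function name `click_through_rate` and the sample numbers are illustrative and not taken from the slides.

```python
def click_through_rate(clicks, impressions):
    """Return CTR as a percentage: (clicks / impressions) x 100."""
    if impressions <= 0:
        raise ValueError("impressions must be a positive number")
    return 100 * clicks / impressions


# Hypothetical example: 25 clicks out of 1,000 impressions.
print(click_through_rate(25, 1000))
```

Rejecting zero impressions avoids a division by zero for a result that was never shown.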
  25. 25. 2011
  26. 26. 2012 CTR Results
  27. 27. Branded vs. Unbranded
  28. 28. Thank you. Give us your feedback.
