SEARCH ENGINE DEPENDENCY
   AND ITS INFLUENCE ON
      DATA QUALITY
           By Ronan CHARDONNEAU

Search Engine Dependency Conference

Conference slides about search engine dependency and its influence on data quality

Published in: Technology, Design

  1. SEARCH ENGINE DEPENDENCY AND ITS INFLUENCE ON DATA QUALITY
     By Ronan CHARDONNEAU
  2. Index
     I - Introduction to the world of search engines
     II - Risks of search engine dependency
     III - How to solve the equation?
     IV - Future of Google and information research
     V - Conclusion
     I-Introduction II-Risks III-Solutions IV-Future V-Conclusion
  3. The World of Search Engines
  4. Market configuration
     TOP 10 search websites in the world for August 2007
     Target: users over 15 years old, at home and at work
     Source: comScore qSearch 2.0
  5. Leaders per country
     Source: map made using data from « Alexa the Web information company (2008) ».
  6. A win or lose market
  7. Approximation of language contents available on the Internet
     Source: Internet World Stats
  8. What has already been proved
     • Studies show that the Internet is the main information provider (at least in Europe and America);
     • Search engines are the most used websites when surfing the Internet;
     • People trust search engine results;
     • When researching on the Internet, people mainly use one single search engine;
  9. Brief summary
     • Google is the market leader; its followers are far behind;
     • Eight leading search engines, and probably eight "continents" on the Internet;
     • A market defined by the adoption of standards (<50%) for search;
     • Content is mainly in English; Chinese is important; there is quality content in Japanese, German and Korean;
     • Internet users cannot live without search engines and are loyal to a specific one;
  10. Risks of search engine dependency and its influence on data quality
  11. Definition
      The behaviour of not questioning the results coming from one single search engine.
      It normally starts when you hear sentences such as:
      - "Why should I bother using other search engines when I find everything I want with Google?"
      - "Do I really run any risk when I use Google?"
      - Google ranks in the top 100 websites of every country in the world;
      - Google has been recognized as the most powerful brand;
  12. • Who is Google? Well... it is our friend;
      • We can carry it everywhere; it is relevant and convenient (quick display, associated services);
      • But:
        – You have to know how to deal with it;
        – You have to know its limits;
        – You have to know its potential;
  13. Consequences
      • If you don't know how to deal with it:
        - You will never use its true capabilities;
        - You will probably take the first information displayed;
      • If you don't know its limits:
        - When you cannot find a piece of information, you may think that it does not exist;
        - You may even think that the technology does not exist elsewhere;
      • If you don't know its potential:
        - You will not get better at searching;
  14. Advertisement
      • The economic model of search engines is based on advertising (99% of Google revenues come from it);
      • However, studies show that some categories of adults (non-Internet generations) do not distinguish between commercial and non-commercial links;
      • Some search engines are more commercial than others;
      • The better you know a search engine (Google), the better you can practise Search Engine Optimization;
  15. Google is not an isolated case
      • Baidu dependency in China and Yandex dependency in Russia;
      • Seznam dependency in the Czech Republic;
      • Naver dependency in South Korea;
      • Yahoo dependency in Japan and many other Asian countries;
  16. Brief summary
      • Search engine dependency is comfortable and therefore understandable;
      • But for many reasons it favours mass-consumption information (blog phenomenon, advertising…), which is not the best kind;
      • In our countries it is Google dependency, but keep in mind that Europe and the Americas are not the center of the world;
  17. How to solve the equation?
  18. First point
      • If an answer exists... we should look for it;
      • At the moment there is no miracle solution for lazy search;
      • But there are ways to get closer to the answer;
  19. Three pillars
      • Learn how to use the technology
      • Technological awareness
      • Breaking the habits
  20. Concrete case: Google
      Learn how to use the technology:
      • Perform advanced searches:
        – Simple operators («  », link:, define:, ?, *, ~,…);
        – Complex request: ?intitle:index.of? "" -filetype:html -filetype:asp -wiki -ringtone -filetype:htm -posts -lyrics -filetype:shtml -filetype:php -filetype:doc -filetype:pdf -filetype:txt mpeg wma avi wmv
        – Google Advanced Search;
      • Use other Google services such as Google Alerts;
      • Use Google sub-search engines such as Google Scholar;
      Breaking the habits:
      - Get used to practising what you have learnt and force yourself to do so;
      - Results come and you get used to it;
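
The operators listed on slide 20 can be combined programmatically. The following is only a minimal sketch: the helper names `build_query` and `search_url` are hypothetical, the operators (`intitle:`, `-filetype:`, `-word`, quoted phrases) are the ones named on the slide, and the base URL is an assumption about the public search endpoint rather than a documented API.

```python
from urllib.parse import urlencode


def build_query(keywords, phrase=None, intitle=None,
                excluded_filetypes=(), excluded_words=()):
    """Compose a query string from the operators named on the slide.

    All parameter names are illustrative; none come from the original deck.
    """
    parts = []
    if intitle:
        parts.append(f"intitle:{intitle}")        # restrict to page titles
    if phrase:
        parts.append(f'"{phrase}"')               # exact phrase
    parts.extend(f"-filetype:{ext}" for ext in excluded_filetypes)
    parts.extend(f"-{word}" for word in excluded_words)
    parts.extend(keywords)                        # plain keywords last
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query; the base URL is an assumption."""
    return f"{base}?{urlencode({'q': query})}"


query = build_query(
    keywords=["mpeg", "wma"],
    intitle="index.of",
    excluded_filetypes=["html", "pdf"],
    excluded_words=["wiki", "lyrics"],
)
print(query)
print(search_url(query))
```

Building the query from named parts makes it easier to see which operator excludes what, which is the habit the slide recommends practising.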
  21. Concrete case: Google
      Technological awareness:
      By performing better at search you will discover new technologies that you will have to learn. For example: Google Alerts tells you that a new search engine is coming up, and then you try it;
  22. Technological awareness: Google
      • Google Advanced Search
      • Do you know iGoogle?
      • Google Ads
      When Google promotes its own technology, chances are good that it is worthwhile
  23. Technological awareness: How to select the best
      • The search engine market is a world of buzz:
      • Where every search engine wants to beat Google;
      • But are they really providing a technical revolution?
  24. Start to look at what Google does not have
      • Real-time information: the Twitter example
      When Google starts to take an interest in someone's technology, it is probably a good one
  25. Start to look at what Google does not have
      • Finding similar websites: Who is like it?
      Unfortunately it works only for popular websites
  26. Start to look at what Google does not have
      Another way of searching for information: Social bookmarking
      Advantages: you find unindexed websites;
      Disadvantages: rubbish websites, advertisement?
  27. Start to look at what Google does not have
      Graphical display
  28. Start to look where Google is not the best
      Look for specialized search engines
      - People: 123 People, CV gadget, Pipl…
      - Jobs: Indeed, JobiJoba…
      - Tutorials: Tutosearch, …
      - Torrent: Toorgle, …
      - Scientific information: Scirus, …
      - Information in a specific language: Yandex for Russian, Baidu for Chinese…
  29. How to improve data quality on the Internet?
      • Triangle method: locating three independent sources that point to the same answer;
      • Recent events in Tibet showed how important it is to look at different sources of information, even from outside your own country;
      Source 1: Washington Post
      Source 2: Le Parisien
      Source 3: AntiCnn.com
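
The triangle method on slide 29 can be sketched as a small check. This is an illustration under stated assumptions, not the author's procedure: "independent" is modelled here as "having a different owner", the function name `triangulate` is hypothetical, and the sample reports are placeholder data.

```python
from collections import defaultdict


def triangulate(reports, required=3):
    """Return the answers backed by at least `required` independent sources.

    `reports` is an iterable of (source_name, owner, answer) tuples.
    Assumption: two sources with the same owner count as one source.
    """
    owners_by_answer = defaultdict(set)
    for _source, owner, answer in reports:
        # Normalise the answer so trivial spelling differences still match.
        owners_by_answer[answer.strip().lower()].add(owner)
    return sorted(answer for answer, owners in owners_by_answer.items()
                  if len(owners) >= required)


# Placeholder data: six reports, but only "claim x" has three distinct owners.
reports = [
    ("Source A", "owner-1", "Claim X"),
    ("Source B", "owner-2", "claim x"),
    ("Source C", "owner-3", "Claim X"),
    ("Source D", "owner-1", "Claim Y"),
    ("Source E", "owner-1", "Claim Y"),
    ("Source F", "owner-2", "Claim Y"),
]
print(triangulate(reports))
```

Counting owners rather than sources reflects the slide's point: three articles that all depend on the same origin do not confirm each other.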
  30. Brief summary
      • Learn how to use the technology, change your habits, be aware;
      • Be curious;
      • Think about another way to look for information;
      • Three independent sources of information;
  31. Future of Google and information research
  32. Semantic search
      • You get a feed instead of entering your request;
      • Everyone is talking about semantic search;
      • But it is not mature yet; a world of buzz again (there are not a lot of suggestions);
      • Poor results whether developed from scratch (poor index) or by huge companies (few suggestions);
  33. Some issues to fix
      • How to index pictures well? Are solutions such as Google labeler the best?
      • How to index videos?
      • How to index sounds?
  34. A Google which will have to change
      • Too much information on the Internet;
      • A Google which is collapsing and providing more and more sub-search engines;
      • The development of high-bandwidth connections, which means graphical interfaces;
      • A technological awareness which is difficult to transmit;
  35. But a Google more and more present in our life
      • Forecasts point in that direction;
      • Development of OS on cell phones, Web browser, Web software applications (Google slides, Google « excel »...)
  36. The question is just how they will do it?
      Google in 1998
      Google 11 years later
  37. Brief summary
      • Google will be with us in the future and we have to get used to it;
      • Information research will be more and more assisted, but you will still lag behind if you do not perform advanced searches;
      • In the near future some issues will still be there (indexing of pictures…)
  38. Conclusion
  39. What you have to keep in mind
      • If you are dependent, at least be dependent in the right way;
      • Apply the triangle method;
      • Reconsider the information process each time (think differently);
  40. Recommendations
      Master thesis about search engine dependency:
      - http://www.pandia.com/index.html
      List of search engines:
      - http://www.pandia.com/powersearch/index.html
      - http://www.philb.com/whichengine.htm
      To know more about search engines:
      Pandia search:
      - www.pandiasearch.com
      Documentaries:
      - Google: Behind the screen by IJsbrand van Veelen http://www.youtube.com/watch?v=TBNDYggyesc&hl=fr
      - The Great Firewall of China http://www.youtube.com/watch?v=IWsXhNJFj78&hl=fr
  41. Thank you for your attention
      http://moteurs-de-recherches-alternatifs.blogspot.com