Search Engine Dependency Conference


Conference slides about search engine dependency and its influence on data quality

Transcript

  • 1. SEARCH ENGINE DEPENDENCY AND ITS INFLUENCE ON DATA QUALITY By Ronan CHARDONNEAU
  • 2. Index: I - Introduction to the world of search engines II - Risks of search engine dependency III - How to solve the equation? IV - Future of Google and information research V - Conclusion I-Introduction II-Risks III-Solutions IV-Future V-Conclusion
  • 3. The World of Search engines
  • 4. Market configuration • Top 10 search websites in the world for August 2007 • Target: users over 15 years old, at home and at work • Source: comScore qSearch 2.0
  • 5. Leaders per country • Source: map made using data from Alexa, the Web Information Company (2008).
  • 6. A win or lose market
  • 7. Approximation of language contents available on the Internet • Source: Internet World Stats
  • 8. What has already been proved • Studies show that the Internet is the main information provider (at least in Europe and America); • When surfing the Internet, search engines are the most used websites; • People trust search engine results; • When searching on the Internet, people mainly use one single search engine;
  • 9. Brief summary • Google is the market leader; followers are far behind; • 8 search engine leaders and probably eight continents on the Internet; • A market defined by the adoption of standards (<50%) to search; • Contents are mainly in English, importance of Chinese, quality contents in Japanese, German and Korean; • Internet users cannot live without search engines and are loyal to a specific one;
  • 10. Risks of search engine dependency and its influence on data quality
  • 11. Definition: The behaviour of not reconsidering the results coming from one single search engine. It normally starts when you hear sentences such as: - "Why should I bother using other search engines when I find everything I want with Google?" - "Do I really run any risk when I am using Google?" - Google ranks in the top 100 websites of every country in the world; - Google has been recognized as the most powerful brand;
  • 12. • Who is Google? Well... it is our friend; • We can carry it everywhere; it is relevant and convenient (quick display, associated services); • But: – You have to know how to deal with it; – You have to know its limits; – You have to know its potential;
  • 13. Consequences • If you don't know how to deal with it: - You will never use its true capabilities; - You will probably take the first information that is displayed; • If you don't know its limits: - When you cannot find the information, you may think that it does not exist; - You may even think that the technology does not exist elsewhere; • If you don't know its potential: - You will not improve at performing research;
  • 14. Advertisement • The search engines' economic model is based on advertising (99% of Google's revenue comes from it); • However, studies show that some categories of adults (non-Internet generations) do not distinguish between commercial and non-commercial links; • Some search engines are more commercial than others; • The better you know a search engine (Google), the more you can practise Search Engine Optimization;
  • 15. Google is not an isolated case • Baidu dependency in China and Yandex dependency in Russia; • Seznam dependency in the Czech Republic; • Naver dependency in South Korea; • Yahoo dependency in Japan and many other Asian countries;
  • 16. Brief summary • Search engine dependency is comfortable and therefore understandable; • But for many reasons it favours mass-consumption information (blog phenomenon, advertising…), which is not the best kind; • In our countries it is Google dependency, but keep in mind that Europe and the Americas are not the center of the world;
  • 17. How to solve the equation?
  • 18. First point • If an answer exists... we should look for it; • At the moment there is no miracle solution for lazy search; • But there are ways to get closer to the answer;
  • 19. Three pillars • Learn how to use the technology • Technological awareness • Breaking the habits
  • 20. Concrete case: Google. Learn how to use the technology: • Make advanced searches: – Simple Boolean operators (« », link:, define:, ?, *, ~, …); – Complex request: ?intitle:index.of? "" -filetype:html -filetype:asp -wiki -ringtone -filetype:htm -posts -lyrics -filetype:shtml -filetype:php -filetype:doc -filetype:pdf -filetype:txt mpeg wma avi wmv – Google Advanced Search; • Use other Google services such as Google Alerts; • Use Google sub search engines such as Google Scholar; Breaking the habits: - Get used to practising what you have learnt and force yourself to do so; - Results will come and you will get used to it;
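The complex request on slide 20 chains several operators: a title filter, a quoted phrase, excluded words, excluded file types, and plain keywords. The Python sketch below is one possible way to assemble such a request programmatically. The helper names `build_query` and `search_url` are hypothetical, the phrase "artist name" is a placeholder for the empty quotes on the slide, and the URL shape is simply the public search page address, not an official API.

```python
from urllib.parse import urlencode


def build_query(keywords, exact_phrase=None, in_title=None,
                exclude_terms=(), exclude_filetypes=()):
    """Assemble an advanced search request from its parts (hypothetical helper)."""
    parts = []
    if in_title:
        # intitle: restricts matches to the page title
        parts.append(f"intitle:{in_title}")
    if exact_phrase:
        # Double quotes ask for the exact phrase
        parts.append(f'"{exact_phrase}"')
    # A leading minus excludes a word or a file type from the results
    parts.extend(f"-{term}" for term in exclude_terms)
    parts.extend(f"-filetype:{ext}" for ext in exclude_filetypes)
    # Plain keywords come last, as on the slide
    parts.extend(keywords)
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Percent-encode the request into a URL that can be opened in a browser."""
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    query = build_query(
        keywords=["mpeg", "wma", "avi", "wmv"],
        exact_phrase="artist name",  # placeholder, not from the slides
        in_title="index.of",
        exclude_terms=["wiki", "ringtone", "posts", "lyrics"],
        exclude_filetypes=["html", "asp", "htm", "shtml", "php"],
    )
    print(query)
    print(search_url(query))
```

Building the request from named parts makes it easier to practise one operator at a time, which fits the "breaking the habits" advice: change a single argument, rerun, and compare the results.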
  • 21. Concrete case: Google. Technological awareness: By getting better at search you will discover new technologies that you will have to learn. For example: Google Alerts tells you that a new search engine is coming up, and then you try it;
  • 22. Technological awareness: Google • Google Advanced Search • Do you know iGoogle? • Google Ads • When Google promotes its own technology, there is a good chance that it is worthwhile
  • 23. Technological awareness: How to select the best • The search engine market is a world of buzz; • Every search engine wants to beat Google; • But are they really providing a technical revolution?
  • 24. Start to look at what Google does not have • Real-time information: the Twitter example • When Google starts to be interested in someone's technology, it is probably a good one
  • 25. Start to look at what Google does not have • Finding similar websites: Who is like it? Unfortunately it works only for popular websites
  • 26. Start to look at what Google does not have • Another way of searching for information: social bookmarking • Advantages: you find unindexed websites; • Disadvantages: rubbish websites, advertisement?
  • 27. Start to look at what Google does not have • Graphical display
  • 28. Start to look where Google is not the best: look for specialized search engines - People: 123 People, CV gadget, Pipl… - Jobs: Indeed, JobiJoba… - Tutorials: Tutosearch, … - Torrent: Toorgle, … - Scientific information: Scirus,… - Information in a specific language: Yandex for Russian, Baidu for Chinese…
  • 29. How to improve data quality on the Internet? • Triangle method: locating three independent sources that point to the same answer; • Recent events in Tibet showed how important it is to look at different sources of information, even from outside your own country; Source 1: Washington Post Source 2: Le Parisien Source 3: AntiCnn.com
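The triangle method on slide 29 can be read as a simple rule: accept an answer only when at least three independent sources point to it. The Python sketch below illustrates that rule under one assumption the slides do not spell out, namely that two outlets repeating the same original material count as a single source. The `Report` record, its `origin` field, and the outlet names are hypothetical placeholders.

```python
from collections import defaultdict
from typing import NamedTuple


class Report(NamedTuple):
    source: str  # outlet where the claim was read
    origin: str  # where the outlet got it (assumed marker of independence)
    claim: str   # the answer the source points to


def corroborated_claims(reports, required=3):
    """Return the claims backed by at least `required` independent origins."""
    origins_by_claim = defaultdict(set)
    for report in reports:
        # Normalise spacing and case so identical answers are grouped together
        key = report.claim.strip().lower()
        # A set keeps each origin once, so copies of one story do not add up
        origins_by_claim[key].add(report.origin)
    return {claim for claim, origins in origins_by_claim.items()
            if len(origins) >= required}


if __name__ == "__main__":
    reports = [
        Report("outlet_a", "wire_1", "Event happened on Monday"),
        Report("outlet_b", "wire_1", "Event happened on Monday"),  # same wire story
        Report("outlet_c", "own_reporting_c", "event happened on monday"),
        Report("outlet_d", "own_reporting_d", "Event happened on Monday"),
    ]
    print(corroborated_claims(reports))
```

Counting origins rather than outlets is a design choice: four articles that all repeat one press release would otherwise pass the test while offering only a single point of view.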
  • 30. Brief summary • Learn how to use, change your habits, be aware; • Be curious; • Think about another way to look for information; • Three independent sources of information;
  • 31. Future of Google and information research
  • 32. Semantic search • You are fed information instead of entering your request; • Everyone is talking about semantic search; • But it is not mature yet, a buzzword again (there are not a lot of suggestions); • Poor results if developed from scratch (poor index) or if developed by huge companies (few suggestions);
  • 33. Some issues to fix • How to index pictures well? Are solutions such as Google Image Labeler the best? • How to index videos? • How to index sounds?
  • 34. A Google which will have to change • Too much information on the Internet; • A Google which is collapsing and providing more and more sub search engines; • The development of high-bandwidth connections, which means graphical interfaces; • A technological awareness which is difficult to transmit;
  • 35. But a Google more and more present in our lives • Forecasts point in that direction; • Development of an OS for cell phones, a Web browser, Web software applications (Google slides, Google « excel »…)
  • 36. The question is just how they will do it • Google in 1998 • Google 11 years later
  • 37. Brief summary • Google will be with us in the future and we have to get used to it; • Information research will be more and more assisted, but you will still lag behind if you do not perform advanced research; • In the near future some issues will still be there (indexing of pictures…)
  • 38. Conclusion
  • 39. What you have to keep in mind • If you are dependent, you should at least be dependent wisely; • Apply the triangle method; • Reconsider the information process each time (think differently);
  • 40. Recommendations • Master thesis about search engine dependency: - http://www.pandia.com/index.html • List of search engines: - http://www.pandia.com/powersearch/index.html - http://www.philb.com/whichengine.htm • To know more about search engines: Pandia search: - www.pandiasearch.com • Documentaries: - Google: Behind the screen by IJsbrand van Veelen http://www.youtube.com/watch?v=TBNDYggyesc&hl=fr - The Great Firewall of China http://www.youtube.com/watch?v=IWsXhNJFj78&hl=fr
  • 41. Thank you for your attention http://moteurs-de-recherches-alternatifs.blogspot.com