Why we need an independent index of the Web
Dirk Lewandowski
dirk.lewandowski@haw-hamburg.de
http://www.bui.haw-hamburg.de/lewandowski.html
@Dirk_Lew
Society of the Query Conference, Amsterdam, 7 November 2013
The “local copy” of the Web
•  Web Indexing
–  New, changed, and deleted documents
–  “Holy grail” of keeping the index complete and current
Risvik, K. M., & Michelsen, R. (2002). Search engines and web dynamics. Computer Networks, 39(3), 289–302.
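One way to picture the task of tracking new, changed, and deleted documents is a comparison between what the index has stored and what a fresh crawl returns. The following is a minimal sketch under stated assumptions, not how any particular search engine works: the function names, the use of SHA-256 fingerprints, and the example URLs are all illustrative.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Hash page content so a change can be detected without storing the full text."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def diff_crawl(index: dict[str, str], crawl: dict[str, str]) -> dict[str, set[str]]:
    """Compare stored fingerprints (url -> hash) with a fresh crawl (url -> content)."""
    fresh = {url: fingerprint(text) for url, text in crawl.items()}
    return {
        "new": set(fresh) - set(index),
        "deleted": set(index) - set(fresh),
        "changed": {u for u in fresh.keys() & index.keys() if fresh[u] != index[u]},
    }


# Hypothetical local copy (the index) and a hypothetical fresh crawl.
index = {
    "http://example.org/a": fingerprint("old text"),
    "http://example.org/b": fingerprint("same text"),
    "http://example.org/d": fingerprint("page that has since disappeared"),
}
crawl = {
    "http://example.org/a": "new text",
    "http://example.org/b": "same text",
    "http://example.org/c": "brand new page",
}
result = diff_crawl(index, crawl)
```

In this sketch the index is only as current as the last crawl, which is the difficulty the slide refers to: every document changed since then is stale in the local copy.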
Representation of documents in a search engine
Referring documents → Document → Metadata (examples)
heading1
heading 2
Anchor text
Anchor text
Anchor text
From the source code
- Title
- Description
- Keywords
- Author
From the document
(document info)
- Length
- Date
- Decay
- Name of the author
From the Web
- PageRank
- Number of citations
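The representation outlined on this slide could be sketched as a simple record. This is an illustrative data structure only: the class name, field names, and types are assumptions, and it covers a subset of the signals listed above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class DocumentRepresentation:
    """Hypothetical record for one indexed document; all field names are illustrative."""
    url: str
    # From the source code (HTML metadata)
    title: str = ""
    description: str = ""
    keywords: list[str] = field(default_factory=list)
    author: str = ""
    # From the document itself
    length: int = 0
    last_modified: Optional[date] = None
    # From referring documents
    anchor_texts: list[str] = field(default_factory=list)
    # From the Web (link-based signals)
    pagerank: float = 0.0
    citation_count: int = 0

    def searchable_text(self) -> str:
        """Join the non-empty textual fields that a query could be matched against."""
        parts = [self.title, self.description, *self.keywords, *self.anchor_texts]
        return " ".join(p for p in parts if p)


doc = DocumentRepresentation(
    url="http://example.org/paper",
    title="Web search",
    anchor_texts=["search engines", "independent index"],
    citation_count=2,
)
text = doc.searchable_text()
```

The point of the sketch is that part of the representation (anchor texts, link-based signals) comes from outside the document, so it can only be built by whoever holds an index of the referring pages.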
The User’s Perspective
•  Everyone uses search engines (Purcell, Brenner & Rainie, 2012; van Eimeren &
Frees, 2012)
•  Market is dominated by Google (ComScore data)
•  Users rely on
–  Google’s method of ordering results
–  Google’s method of collecting data
à If Google hasn’t seen it — and indexed it — or kept it up to date, it
can’t be found with a search query.
Freshness of Web search engines
(see Lewandowski, Wahlig & Meyer-Bautor, 2006; Lewandowski, 2008)
Original (as of yesterday) Google's copy (as of yesterday)
What about the alternatives to Google?
•  Many services only seem to be search engines
–  Accessing the data of another search engine
–  Representing nothing more than an alternative user interface to one of the more
well-known engines
–  In many cases, that turns out to be Google
–  E.g., in Germany, we can see that the major internet portals T-Online, GMX,
AOL, and web.de all display results obtained from Google
Why is one search engine not enough?
•  We need more than one search engine to ensure that a broad range of
opinions are represented in the search market.
•  Users should have the choice between different worldviews which originate
as a product of algorithm-based search result generation
•  Ideology-free search algorithms are simply not possible
Alternative Search Engine Indexes
•  There are only a handful of search engines that operate their own indexes,
due to costs and technical complexity
•  Search engine start-ups
–  Use an existing external index
–  Focus on a specialised topic (which requires only a small index)
–  Aggregate data from different search engines (meta search engine)
•  Actual search engine startups like Blekko and Duck Duck Go are more the
exception than the rule
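To illustrate what aggregating data from different search engines can look like, here is a small sketch of merging two ranked result lists. It uses reciprocal rank fusion, a common merging technique chosen here for illustration and not named in the slides; the function name, the constant k=60, and the example URLs are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked URL lists: each URL scores 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda u: (-scores[u], u))


# Hypothetical top results returned by two source engines.
engine_a = ["u1", "u2", "u3"]
engine_b = ["u2", "u4", "u1"]
merged = reciprocal_rank_fusion([engine_a, engine_b])
```

Note that the meta search engine in this sketch only sees the ranked lists the source engines hand over. It has no index of its own and no access to the underlying document representations.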
Partner model
•  “Real” search engine providers such as Google and Bing operate their own
search engines but also provide their search results to partners
•  All the major web portals have now embraced this model.
•  Income through ads; revenue-sharing
•  Attractiveness of the model
–  The search engine provider encounters only minimal costs
–  The operator of the portal no longer needs to go to the great expense of running
its own search engine.
–  The partner index model has served to thin out the competition in the search
industry.
Access to Search Engine Indexes
•  Application programming interfaces (APIs)
–  No direct access to the search engine index
–  Limited number of top results which have already been ranked by the search
engine provider
–  Access via APIs is similar to what is occurring at the meta-search engines
–  The representation of the document in the source search engine is also not
included
Alternative Search Engines
•  What constitutes an “alternative search engine”?
–  All search engines that are not Google? (“Google Killers”, e.g., Cuil)
–  Some alternatives are not perceived as such because they are considered to be
simply the same as Google (e.g., Bing)
–  Search engines which explicitly position themselves as an alternative to Google
through a regional approach (e.g., Seekport)
–  New approaches to search / “Real alternatives”: Alternative approaches to
gathering and representing web content
Public Support for Search Engine Technology?
•  Quaero/Theseus: Funding a “Google Killer”?
–  Quaero: Technologies for multimedia searching.
–  Theseus: Semantic technologies for business-to-business applications (without
focusing exclusively on search).
•  The proposal to provide government funding for search engine technology
has been subject to intense criticism in the past
•  Establish a single alternative?
•  A number of factors could cause it to fail
–  Poor marketing
–  Graphic design of the user interface
–  ...
•  Regardless of the reason, a failure of the new search engine would result in
the entire publicly funded initiative failing.
Economic perspective
•  Only the largest internet companies are able to afford large indexes.
•  Microsoft is the only company besides Google to possess a comprehensive
search engine index.
•  Yahoo gave up on its own index several years ago
•  It appears as though operating a dedicated index is attractive to practically
no one — and there are hardly any candidates with the necessary financial
resources in any case
The Solution
•  Create the conditions that will make establishing alternative search engines
possible
•  We can expect that the possibilities an open index presents would benefit a
number of different companies, individuals, and institutions.
•  The result will be fair competition to develop the best concepts for using the
data provided by the index.
Vision
•  “An index of the web that can be accessed at fair conditions for
everyone”
–  “Everyone” means that anyone who is interested can access the index.
–  “Fair conditions” does not mean that access to the index must be free of
charge for everyone. A certain number of document requests per day
should be available at no cost in order to promote non-profit projects.
–  “Access” to the index can be defined as the ability to automatically
query the index with ease.
–  The concept “index of the web” is intended to cover as much of the web
as possible
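The idea of a certain number of free document requests per day could be sketched as a per-client daily counter. This is only an illustration of the accounting, not a proposal from the slides: the class name, the method name, and the quota value are assumptions.

```python
from collections import defaultdict
from datetime import date


class DailyQuota:
    """Hypothetical accounting of free document requests per client and day."""

    def __init__(self, free_requests_per_day: int) -> None:
        self.free_requests_per_day = free_requests_per_day
        self._used: dict[tuple[str, date], int] = defaultdict(int)

    def record_request(self, client_id: str, day: date) -> bool:
        """Count one request and return True if it still falls within the free quota."""
        key = (client_id, day)
        self._used[key] += 1
        return self._used[key] <= self.free_requests_per_day


quota = DailyQuota(free_requests_per_day=2)
first_day = date(2013, 11, 7)
outcomes = [quota.record_request("nonprofit-project", first_day) for _ in range(3)]
next_day_outcome = quota.record_request("nonprofit-project", date(2013, 11, 8))
```

Requests beyond the free quota are not rejected in this sketch; they are only flagged, which leaves open how paid access would be handled.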
Funding and operation
•  Funding
–  This type of project cannot be supported by any one country alone. The only
feasible option is a pan-European initiative.
•  Who would operate the index?
–  Existing research institution or newly-founded institution
–  The operator of the index should not obtain the exclusive right to determine the
way in which the documents are used or made available (→ Board of trustees)
Conclusion: Advantages of an independent index of the web
•  Motivate companies, institutions, and developers pursuing personal projects
to create their own search applications.
•  The data available on the web is so boundless that it lends itself to
countless applications in a broad range of fields.
•  Enable applications we are not yet capable of even imagining.
•  An open structure, transparency with respect to access, and the assurance
of permanent availability thanks to state sponsorship would lay the
groundwork for innovation.
Thank you
Prof. Dr. Dirk Lewandowski
Hochschule für Angewandte Wissenschaften
Hamburg
dirk.lewandowski@haw-hamburg.de
Twitter: Dirk_Lew
http://www.bui.haw-hamburg.de/lewandowski.html
http://www.searchstudies.org

More Related Content

What's hot

Call for Papers - International Journal of Data Mining & Knowledge Management...
Call for Papers - International Journal of Data Mining & Knowledge Management...Call for Papers - International Journal of Data Mining & Knowledge Management...
Call for Papers - International Journal of Data Mining & Knowledge Management...
IJDKP
 
MOVING presentation at JSI
MOVING presentation at JSIMOVING presentation at JSI
MOVING presentation at JSI
MOVING Project
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
General Introduction to the Oxford e-Research Centre
General Introduction to the Oxford e-Research CentreGeneral Introduction to the Oxford e-Research Centre
General Introduction to the Oxford e-Research Centre
David Wallom
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
albert ca
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
albert ca
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
Understanding Open Access
Understanding Open AccessUnderstanding Open Access
Understanding Open Access
Sanjaya Mishra
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 

What's hot (12)

Call for Papers - International Journal of Data Mining & Knowledge Management...
Call for Papers - International Journal of Data Mining & Knowledge Management...Call for Papers - International Journal of Data Mining & Knowledge Management...
Call for Papers - International Journal of Data Mining & Knowledge Management...
 
MOVING presentation at JSI
MOVING presentation at JSIMOVING presentation at JSI
MOVING presentation at JSI
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
General Introduction to the Oxford e-Research Centre
General Introduction to the Oxford e-Research CentreGeneral Introduction to the Oxford e-Research Centre
General Introduction to the Oxford e-Research Centre
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
Understanding Open Access
Understanding Open AccessUnderstanding Open Access
Understanding Open Access
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 

Viewers also liked

Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (2)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (2)Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (2)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (2)Dirk Lewandowski
 
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (1)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (1)Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (1)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (1)Dirk Lewandowski
 
Perspektiven eines Open Web Index
Perspektiven eines Open Web IndexPerspektiven eines Open Web Index
Perspektiven eines Open Web Index
Dirk Lewandowski
 
Neue Trends: Google, SEO und Co.?
Neue Trends: Google, SEO und Co.?Neue Trends: Google, SEO und Co.?
Neue Trends: Google, SEO und Co.?Dirk Lewandowski
 
Wie Suchmaschinen die Inhalte des Web interpretieren
Wie Suchmaschinen die Inhalte des Web interpretierenWie Suchmaschinen die Inhalte des Web interpretieren
Wie Suchmaschinen die Inhalte des Web interpretieren
Dirk Lewandowski
 
Wie entwickeln sich Suchmaschinen heute, was kommt morgen?
Wie entwickeln sich Suchmaschinen heute, was kommt morgen?Wie entwickeln sich Suchmaschinen heute, was kommt morgen?
Wie entwickeln sich Suchmaschinen heute, was kommt morgen?
Dirk Lewandowski
 
Suchmaschinen verstehen
Suchmaschinen verstehenSuchmaschinen verstehen
Suchmaschinen verstehen
Dirk Lewandowski
 

Viewers also liked (7)

Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (2)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (2)Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (2)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (2)
 
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (1)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (1)Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (1)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (1)
 
Perspektiven eines Open Web Index
Perspektiven eines Open Web IndexPerspektiven eines Open Web Index
Perspektiven eines Open Web Index
 
Neue Trends: Google, SEO und Co.?
Neue Trends: Google, SEO und Co.?Neue Trends: Google, SEO und Co.?
Neue Trends: Google, SEO und Co.?
 
Wie Suchmaschinen die Inhalte des Web interpretieren
Wie Suchmaschinen die Inhalte des Web interpretierenWie Suchmaschinen die Inhalte des Web interpretieren
Wie Suchmaschinen die Inhalte des Web interpretieren
 
Wie entwickeln sich Suchmaschinen heute, was kommt morgen?
Wie entwickeln sich Suchmaschinen heute, was kommt morgen?Wie entwickeln sich Suchmaschinen heute, was kommt morgen?
Wie entwickeln sich Suchmaschinen heute, was kommt morgen?
 
Suchmaschinen verstehen
Suchmaschinen verstehenSuchmaschinen verstehen
Suchmaschinen verstehen
 

Similar to Why we need an independent index of the Web

Alternatives to Google
Alternatives to GoogleAlternatives to Google
Alternatives to Google
Dirk Lewandowski
 
Exploring Search Engines and their usage online
Exploring Search Engines and their usage onlineExploring Search Engines and their usage online
Exploring Search Engines and their usage online
Mohammad Usman
 
Web-Oriented Architecture (WOA)
Web-Oriented Architecture (WOA)Web-Oriented Architecture (WOA)
Web-Oriented Architecture (WOA)
thetechnicalweb
 
Search Engines
Search EnginesSearch Engines
Search Engines
Chidanand Byahatti
 
Design Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A ReviewDesign Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A Review
IOSR Journals
 
Google Case Analysis
Google Case AnalysisGoogle Case Analysis
Google Case Analysis
Lior Agassi
 
Social shopping with semantic power
Social shopping with semantic powerSocial shopping with semantic power
Social shopping with semantic power
Jesse Wang
 
Maruti gollapudi cv
Maruti gollapudi cvMaruti gollapudi cv
Maruti gollapudi cv
Maruti Gollapudi
 
Optus improves customer experience
Optus improves customer experienceOptus improves customer experience
Optus improves customer experience
Sushant Arora
 
Google Whitepaper - Project Border
Google Whitepaper - Project BorderGoogle Whitepaper - Project Border
Google Whitepaper - Project Border
Amit Rampurkar
 
PPT 3 Web Analytics (1).pptx
PPT 3 Web Analytics (1).pptxPPT 3 Web Analytics (1).pptx
PPT 3 Web Analytics (1).pptx
DevChaudhari15
 
Digital Marketing Course Week 6: Search Engine Optimization (SEO)
Digital Marketing Course Week 6: Search Engine Optimization (SEO)Digital Marketing Course Week 6: Search Engine Optimization (SEO)
Digital Marketing Course Week 6: Search Engine Optimization (SEO)
Ayca Turhan
 
FIWARE Wednesday Webinars - Cities as Enablers of the Data Economy: Smart Dat...
FIWARE Wednesday Webinars - Cities as Enablers of the Data Economy: Smart Dat...FIWARE Wednesday Webinars - Cities as Enablers of the Data Economy: Smart Dat...
FIWARE Wednesday Webinars - Cities as Enablers of the Data Economy: Smart Dat...
FIWARE
 
Keyword research tools for Search Engine Optimisation (SEO)
Keyword research tools for Search Engine Optimisation (SEO)Keyword research tools for Search Engine Optimisation (SEO)
Keyword research tools for Search Engine Optimisation (SEO)
Duncan MacGruer
 
A machine learning approach to web page filtering using ...
A machine learning approach to web page filtering using ...A machine learning approach to web page filtering using ...
A machine learning approach to web page filtering using ...
butest
 
A machine learning approach to web page filtering using ...
A machine learning approach to web page filtering using ...A machine learning approach to web page filtering using ...
A machine learning approach to web page filtering using ...
butest
 
The Enterprise Search Market in a Nutshell
The Enterprise Search Market in a NutshellThe Enterprise Search Market in a Nutshell
The Enterprise Search Market in a Nutshell
Dr. Haxel Consult
 
Google Analytics SDDU Seminar
Google Analytics SDDU SeminarGoogle Analytics SDDU Seminar
Google Analytics SDDU Seminar
James Little
 
KB Seminars: Working with Technology - Product Management; 10/13
KB Seminars: Working with Technology - Product Management; 10/13KB Seminars: Working with Technology - Product Management; 10/13
KB Seminars: Working with Technology - Product Management; 10/13
MDIF
 
talk for HK SME center about web3.0 , AI, mobile apps
talk for HK SME center about web3.0 , AI, mobile appstalk for HK SME center about web3.0 , AI, mobile apps
talk for HK SME center about web3.0 , AI, mobile apps
Alex Hung
 

Similar to Why we need an independent index of the Web (20)

Alternatives to Google
Alternatives to GoogleAlternatives to Google
Alternatives to Google
 
Exploring Search Engines and their usage online
Exploring Search Engines and their usage onlineExploring Search Engines and their usage online
Exploring Search Engines and their usage online
 
Web-Oriented Architecture (WOA)
Web-Oriented Architecture (WOA)Web-Oriented Architecture (WOA)
Web-Oriented Architecture (WOA)
 
Search Engines
Search EnginesSearch Engines
Search Engines
 
Design Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A ReviewDesign Issues for Search Engines and Web Crawlers: A Review
Design Issues for Search Engines and Web Crawlers: A Review
 
Google Case Analysis
Google Case AnalysisGoogle Case Analysis
Google Case Analysis
 
Social shopping with semantic power
Social shopping with semantic powerSocial shopping with semantic power
Social shopping with semantic power
 
Maruti gollapudi cv
Maruti gollapudi cvMaruti gollapudi cv
Maruti gollapudi cv
 
Optus improves customer experience
Optus improves customer experienceOptus improves customer experience
Optus improves customer experience
 
Google Whitepaper - Project Border
Google Whitepaper - Project BorderGoogle Whitepaper - Project Border
Google Whitepaper - Project Border
 
PPT 3 Web Analytics (1).pptx
PPT 3 Web Analytics (1).pptxPPT 3 Web Analytics (1).pptx
PPT 3 Web Analytics (1).pptx
 
Digital Marketing Course Week 6: Search Engine Optimization (SEO)
Digital Marketing Course Week 6: Search Engine Optimization (SEO)Digital Marketing Course Week 6: Search Engine Optimization (SEO)
Digital Marketing Course Week 6: Search Engine Optimization (SEO)
 
FIWARE Wednesday Webinars - Cities as Enablers of the Data Economy: Smart Dat...
FIWARE Wednesday Webinars - Cities as Enablers of the Data Economy: Smart Dat...FIWARE Wednesday Webinars - Cities as Enablers of the Data Economy: Smart Dat...
FIWARE Wednesday Webinars - Cities as Enablers of the Data Economy: Smart Dat...
 
Keyword research tools for Search Engine Optimisation (SEO)
Keyword research tools for Search Engine Optimisation (SEO)Keyword research tools for Search Engine Optimisation (SEO)
Keyword research tools for Search Engine Optimisation (SEO)
 
A machine learning approach to web page filtering using ...
A machine learning approach to web page filtering using ...A machine learning approach to web page filtering using ...
A machine learning approach to web page filtering using ...
 
A machine learning approach to web page filtering using ...
A machine learning approach to web page filtering using ...A machine learning approach to web page filtering using ...
A machine learning approach to web page filtering using ...
 
The Enterprise Search Market in a Nutshell
The Enterprise Search Market in a NutshellThe Enterprise Search Market in a Nutshell
The Enterprise Search Market in a Nutshell
 
Google Analytics SDDU Seminar
Google Analytics SDDU SeminarGoogle Analytics SDDU Seminar
Google Analytics SDDU Seminar
 
KB Seminars: Working with Technology - Product Management; 10/13
KB Seminars: Working with Technology - Product Management; 10/13KB Seminars: Working with Technology - Product Management; 10/13
KB Seminars: Working with Technology - Product Management; 10/13
 
talk for HK SME center about web3.0 , AI, mobile apps
talk for HK SME center about web3.0 , AI, mobile appstalk for HK SME center about web3.0 , AI, mobile apps
talk for HK SME center about web3.0 , AI, mobile apps
 

More from Dirk Lewandowski

The Need for and fundamentals of an Open Web Index
The Need for and fundamentals of an Open Web IndexThe Need for and fundamentals of an Open Web Index
The Need for and fundamentals of an Open Web Index
Dirk Lewandowski
 
In a World of Biased Search Engines
In a World of Biased Search EnginesIn a World of Biased Search Engines
In a World of Biased Search Engines
Dirk Lewandowski
 
EIN ANDERER BLICK AUF GOOGLE: Wie interpretieren Nutzer/innen die Suchergebni...
EIN ANDERER BLICK AUF GOOGLE: Wie interpretieren Nutzer/innen die Suchergebni...EIN ANDERER BLICK AUF GOOGLE: Wie interpretieren Nutzer/innen die Suchergebni...
EIN ANDERER BLICK AUF GOOGLE: Wie interpretieren Nutzer/innen die Suchergebni...
Dirk Lewandowski
 
Künstliche Intelligenz bei Suchmaschinen
Künstliche Intelligenz bei SuchmaschinenKünstliche Intelligenz bei Suchmaschinen
Künstliche Intelligenz bei Suchmaschinen
Dirk Lewandowski
 
Analysing search engine data on socially relevant topics
Analysing search engine data on socially relevant topicsAnalysing search engine data on socially relevant topics
Analysing search engine data on socially relevant topics
Dirk Lewandowski
 
Google Assistant, Alexa & Co.: Wie sich die Welt der Suche verändert
Google Assistant, Alexa & Co.: Wie sich die Welt der Suche verändertGoogle Assistant, Alexa & Co.: Wie sich die Welt der Suche verändert
Google Assistant, Alexa & Co.: Wie sich die Welt der Suche verändert
Dirk Lewandowski
 
Suchverhalten und die Grenzen von Suchdiensten
Suchverhalten und die Grenzen von SuchdienstenSuchverhalten und die Grenzen von Suchdiensten
Suchverhalten und die Grenzen von Suchdiensten
Dirk Lewandowski
 
Können Nutzer echte Suchergebnisse von Werbung in Suchmaschinen unterscheiden?
Können Nutzer echte Suchergebnisse von Werbung in Suchmaschinen unterscheiden?Können Nutzer echte Suchergebnisse von Werbung in Suchmaschinen unterscheiden?
Können Nutzer echte Suchergebnisse von Werbung in Suchmaschinen unterscheiden?
Dirk Lewandowski
 
Are Ads on Google search engine results pages labeled clearly enough?
Are Ads on Google search engine results pages labeled clearly enough?Are Ads on Google search engine results pages labeled clearly enough?
Are Ads on Google search engine results pages labeled clearly enough?
Dirk Lewandowski
 
Search Engine Bias - sollen wir Googles Suchergebnissen vertrauen?
Search Engine Bias - sollen wir Googles Suchergebnissen vertrauen?Search Engine Bias - sollen wir Googles Suchergebnissen vertrauen?
Search Engine Bias - sollen wir Googles Suchergebnissen vertrauen?
Dirk Lewandowski
 
Internet-Suchmaschinen: Aktueller Stand und Entwicklungsperspektiven
Internet-Suchmaschinen: Aktueller Stand und EntwicklungsperspektivenInternet-Suchmaschinen: Aktueller Stand und Entwicklungsperspektiven
Internet-Suchmaschinen: Aktueller Stand und EntwicklungsperspektivenDirk Lewandowski
 
Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...
Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...
Ordinary Search Engine Users Assessing Difficulty, Effort and Outcome for Sim...
Dirk Lewandowski
 
Verwendung von Skalenbewertungen in der Evaluierung von Suchmaschinen
Verwendung von Skalenbewertungen in der Evaluierung von SuchmaschinenVerwendung von Skalenbewertungen in der Evaluierung von Suchmaschinen
Verwendung von Skalenbewertungen in der Evaluierung von SuchmaschinenDirk Lewandowski
 
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (3)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (3)Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (3)
Neue Entwicklungen bei Suchmaschinen und deren Relevanz für Bibliotheken (3)Dirk Lewandowski
 
Medientage 2013: Die Zukunft der Suche
Medientage 2013: Die Zukunft der SucheMedientage 2013: Die Zukunft der Suche
Medientage 2013: Die Zukunft der SucheDirk Lewandowski
 
Suchmaschinen: Googlerisierung der Gesellschaft
Suchmaschinen: Googlerisierung der GesellschaftSuchmaschinen: Googlerisierung der Gesellschaft
Suchmaschinen: Googlerisierung der GesellschaftDirk Lewandowski
 
Wie beeinflussen Suchmaschinen den Informationsmarkt?
Wie beeinflussen Suchmaschinen den Informationsmarkt?Wie beeinflussen Suchmaschinen den Informationsmarkt?
Wie beeinflussen Suchmaschinen den Informationsmarkt?Dirk Lewandowski
 
Warum wir Alternativen zu Google benötigen
Warum wir Alternativen zu Google benötigenWarum wir Alternativen zu Google benötigen
Warum wir Alternativen zu Google benötigenDirk Lewandowski
 

More from Dirk Lewandowski (20)

The Need for and fundamentals of an Open Web Index
The Need for and fundamentals of an Open Web IndexThe Need for and fundamentals of an Open Web Index
The Need for and fundamentals of an Open Web Index
 
In a World of Biased Search Engines
In a World of Biased Search EnginesIn a World of Biased Search Engines
In a World of Biased Search Engines
 
EIN ANDERER BLICK AUF GOOGLE: Wie interpretieren Nutzer/innen die Suchergebni...
EIN ANDERER BLICK AUF GOOGLE: Wie interpretieren Nutzer/innen die Suchergebni...EIN ANDERER BLICK AUF GOOGLE: Wie interpretieren Nutzer/innen die Suchergebni...
EIN ANDERER BLICK AUF GOOGLE: Wie interpretieren Nutzer/innen die Suchergebni...
 
Künstliche Intelligenz bei Suchmaschinen
Künstliche Intelligenz bei SuchmaschinenKünstliche Intelligenz bei Suchmaschinen
Künstliche Intelligenz bei Suchmaschinen
 
Analysing search engine data on socially relevant topics
Analysing search engine data on socially relevant topicsAnalysing search engine data on socially relevant topics
Analysing search engine data on socially relevant topics
 
Why we need an independent index of the Web

  • 1. Why we need an independent index of the Web Dirk Lewandowski dirk.lewandowski@haw-hamburg.de http://www.bui.haw-hamburg.de/lewandowski.html @Dirk_Lew Society of the Query Conference, Amsterdam, 7/11/2013
  • 2. The “local copy” of the Web • Web Indexing – New, changed, deleted documents – “Holy grail” of keeping the index complete and current Risvik, K. M., & Michelsen, R. (2002). Search engines and web dynamics. Computer Networks, 39(3), 289–302.
  • 3. Representation of documents in a search engine Referring documents → Document → Metadata (examples). Diagram labels: heading 1, heading 2, anchor text. From the source code: title, description, keywords, author. From the document (document info): length, date, decay, name of the author. From the Web: PageRank, number of citations.
  • 4. The User’s Perspective • Everyone uses search engines (Purcell, Brenner & Rainie, 2012; van Eimeren & Frees, 2012) • Market is dominated by Google (ComScore data) • Users rely on – Google’s method of ordering results – Google’s method of collecting data → If Google hasn’t seen and indexed a document, or hasn’t kept its copy up to date, it can’t be found with a search query.
  • 5. Freshness of Web search engines (see Lewandowski, Wahlig & Meyer-Bautor, 2006; Lewandowski, 2008) Original (as of yesterday) vs. Google’s copy (as of yesterday)
  • 6. What about the alternatives to Google? •  Many “seems to be” search engines –  Accessing the data of another search engine –  Representing nothing more than an alternative user interface to one of the more well-known engines –  In many cases, that turns out to be Google –  E.g., in Germany, we can see that the major internet portals T-Online, GMX, AOL, and web.de all display results obtained from Google
  • 7. Why is one search engine not enough? • We need more than one search engine to ensure that a broad range of opinions is represented in the search market. • Users should have the choice between different worldviews which originate as a product of algorithm-based search result generation. • Ideology-free search algorithms are simply not possible.
  • 8. Alternative Search Engine Indexes • There are only a handful of search engines that operate their own indexes, due to costs and technical complexity • Search engine start-ups – Use an existing external index – Focus on a specialised topic (which requires only a small index) – Aggregate data from different search engines (meta search engine) • Actual search engine start-ups like Blekko and DuckDuckGo are more the exception than the rule
  • 9. Partner model •  “Real” search engine providers such as Google and Bing operate their own search engines but also provide their search results to partners •  All the major web portals have now embraced this model. •  Income through ads; revenue-sharing •  Attractiveness of the model –  The search engine provider encounters only minimal costs –  The operator of the portal no longer needs to go to the great expense of running its own search engine. –  The partner index model has served to thin out the competition in the search industry.
  • 10. Access to Search Engine Indexes •  Application programming interfaces (APIs) –  No direct access to the search engine index –  Limited number of top results which have already been ranked by the search engine provider –  Access via APIs is similar to what is occurring at the meta-search engines –  The representation of the document in the source search engine is also not included
  • 11. Alternative Search Engines • What constitutes an “alternative search engine”? – All search engines that are not Google? (“Google Killers”, e.g., Cuil) – Some alternatives are not perceived as such because they are considered to be simply the same as Google (e.g., Bing) – Search engines which explicitly position themselves as an alternative to Google through a regional approach (e.g., Seekport) – New approaches to search / “Real alternatives”: Alternative approaches to gathering and representing web content
  • 12. Public Support for Search Engine Technology? • Quaero/Theseus: Funding a “Google Killer”? – Quaero: Technologies for multimedia searching. – Theseus: Semantic technologies for business-to-business applications (without focusing exclusively on search). • The proposal to provide government funding for search engine technology has been subject to intense criticism in the past • Establish a single alternative? • A number of factors could cause a single alternative to fail – Poor marketing – Graphic design of the user interface – ... • Regardless of the reason, a failure of the new search engine would result in the entire publicly funded initiative failing.
  • 13. Economic perspective •  Only the largest internet companies are able to afford large indexes. •  Microsoft is the only company besides Google to possess a comprehensive search engine index. •  Yahoo gave up on its own index several years ago •  It appears as though operating a dedicated index is attractive to practically no one — and there are hardly any candidates with the necessary financial resources in any case
  • 14. The Solution • Create the conditions that will make establishing alternative search engines possible • We can expect that the possibilities such an index presents would benefit a number of different companies, individuals, and institutions. • The result will be fair competition to develop the best concepts for using the data provided by the index.
  • 15. Vision •  “An index of the web that can be accessed at fair conditions for everyone” –  “Everyone” means that anyone who is interested can access the index. –  “Fair conditions” does not mean that access to the index must be free of charge for everyone. A certain number of document requests per day should be available at no cost in order to promote non-profit projects. –  “Access” to the index can be defined as the ability to automatically query the index with ease. –  The concept “index of the web” is intended to cover as much of the web as possible
  • 16. Funding and operation • Funding – This type of project cannot be supported by any one country alone. The only feasible option is a pan-European initiative. • Who would operate the index? – Existing research institution or newly-founded institution – The operator of the index should not obtain the exclusive right to determine the way in which the documents are used or made available (→ Board of trustees)
  • 17. Conclusion: Advantages of an independent index of the web •  Motivate companies, institutions, and developers pursuing personal projects to create their own search applications. •  The data available on the web is so boundless that it lends itself to countless applications in a broad range of fields. •  Enable applications we are not yet capable of even imagining. •  An open structure, transparency with respect to access, and the assurance of permanent availability thanks to state sponsorship would lay the groundwork for innovation.
  • 18. Thank you Prof. Dr. Dirk Lewandowski Hochschule für Angewandte Wissenschaften Hamburg dirk.lewandowski@haw-hamburg.de Twitter: Dirk_Lew http://www.bui.haw-hamburg.de/lewandowski.html http://www.searchstudies.org
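
As a rough illustration of the document representation described on slide 3, the sketch below groups the listed signals into one record. The class name, the field names, and the `searchable_text` helper are assumptions made for illustration only; the slides do not prescribe any data structure, and a real index would store far more.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DocumentRepresentation:
    """Hypothetical record for one indexed page, grouped as on slide 3."""

    url: str
    # From the source code (metadata supplied by the page author)
    title: str = ""
    description: str = ""
    keywords: List[str] = field(default_factory=list)
    author: str = ""
    # From the document itself (document info)
    length: int = 0
    date: Optional[str] = None
    # From referring documents: anchor texts of links pointing at this page
    anchor_texts: List[str] = field(default_factory=list)
    # From the Web as a whole (link-based signals)
    pagerank: float = 0.0
    citation_count: int = 0

    def searchable_text(self) -> str:
        """Join the textual fields into one string a ranker could match against."""
        parts = [self.title, self.description, *self.keywords, *self.anchor_texts]
        return " ".join(p for p in parts if p)


# Example values are made up.
doc = DocumentRepresentation(
    url="https://example.org/",
    title="Example",
    anchor_texts=["an example site"],
)
print(doc.searchable_text())
```

The point of the sketch is that much of this record (anchor texts, link-based scores) comes from outside the page itself, which is why a results-only API, as described on slide 10, does not expose it.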