http://www.dkd.de




Friday, 15 June 2012
dkd
                       development
                       kommunikation
                       design




Welcome
         TYPO3 Conference
         Quebec Canada

         Olivier Dobberkau, Founder and CIO dkd
         Member of the Expert Advisory Board TYPO3 Assoc.
         Twitter @T3RevNeverend
         olivier.dobberkau@dkd.de




Everything You Always Wanted to
         Know About Search in TYPO3.
         But Were Afraid to Ask




Woody Allen

        Inspiration for this Talk:

        Woody Allen movie: "Everything You Always
        Wanted to Know About Sex * But Were Afraid to
        Ask"

        Internet Movie Database:
        http://www.imdb.com/title/tt0068555/

Agenda

               A short history of Search
               Slang
               The need to Search
               Who is searching and what is (s)he searching for?
               Search in TYPO3 with Apache Solr
               Questions & Answers




History

        A short trip through the history of search solutions
        in IT.


        Really short, with lots of missing facts, and not
        scientific at all.



Scratch your own itch, IBM.

               In the beginning was the mainframe
               In 1969 IBM develops STAIRS (Storage and Information Retrieval System)
               Full-text search for terminal applications
               Performance: "far below anyone's expectations"
               First used in the DOJ case against IBM
               Source: A History of Online Information Services, 1963-1976, by Charles P. Bourne and Trudi Bellardo Hahn


Internet years are dog years

               The Internet changes what full-text search has to deliver.
               Search sites such as Lycos, AlltheWeb, Infoseek, Excite and AltaVista compete to answer the question "How do I find something on the Internet?"
               In 1995 it's a race for the love of searching Internet users.
               Yahoo tries to be the directory of websites




And then came GOOGLE

               Who does not know Google's secret?
               The Anatomy of a Large-Scale Hypertextual Web Search Engine
               http://infolab.stanford.edu/~backrub/google.html
               Visionary paper
               The technologies and principles that grew out of it are industry standards and are still changing our IT industry (MapReduce, big data & PageRank)
               A must-read!


Slang




It's all about words!

               Information Retrieval (IR)
               Term versus Query
               Index
               Recall & Precision
               Relevancy
               Index, Inverted Index & Posting List
               Recency & Authority
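
A minimal sketch of three of the terms above: an inverted index with its posting lists, plus precision and recall. The three documents and all function names are invented for illustration; a real engine such as Lucene stores far more per posting (positions, frequencies).

```python
from collections import defaultdict

# Hypothetical three-document corpus, invented for illustration.
DOCS = {
    1: "search in typo3 with solr",
    2: "typo3 conference in quebec",
    3: "solr search components",
}

def build_inverted_index(docs):
    """Map each term to its posting list: the sorted ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, query):
    """Return ids of documents that contain every term of the query (AND semantics)."""
    postings = [set(index.get(term, [])) for term in query.split()]
    return sorted(set.intersection(*postings)) if postings else []

def precision_recall(retrieved, relevant):
    """Precision: share of retrieved docs that are relevant.
    Recall: share of relevant docs that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

INDEX = build_inverted_index(DOCS)
```

Here the query is the whole string a user types, while a term is one entry in the index; `search(INDEX, "typo3 search")` intersects the posting lists of two terms.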




The need to Search



        What leads us when we search?
        How do we search?
        How does what we find change us?




People are like Bears
         (only less fur)
               How do we search?
               Marcia Bates, 1989
               The Design of Browsing and Berrypicking Techniques for the Online Search Interface
               http://pages.gseis.ucla.edu/faculty/bates/berrypicking.html
               Every search can be described with this model




Marcia J. Bates: Berrypicking techniques for the online search interface (1989)

Carrots & Sticks

               Search Behavior Patterns, John Ferrara
               http://www.boxesandarrows.com/view/search-behavior
                       Domain Expertise
                       Search Expertise
                       Cognitive Style
                       Goal Type
                       Mode of seeking
                       Situational idiosyncrasies

Neo: The Matrix

               Matrix of Scope/Style of information needs




                               Scope & Type: Tyler Tate; Sohn et al.; Church & Smyth
                       http://twigkit.com/blog/2011/12/06/mobile-information-needs.html



Search = Success for your Website

               Benefits for your Visitors & Users
                       They will find it on your Website
                       Serendipity
                       Better and faster knowledge transfer
               Business benefits
                       ROI
                       Agility
                       Awareness and Enablement


TYPO3 & Search

        Shameless plug: Apache Solr for TYPO3

        I still have some "I love Indexed Search" buttons to
        give away.




Solr-Components

               Indexing
               Query
               Results
               Analysis
               Additional Components

Indexing




What can be indexed?

               TYPO3 Content
               TYPO3 Databases (TCA Tables)
               External Websites
               RSS-Feeds
               Files
               ...




Indexing Features

               Synonyms
               Stopwords
               Protected words
               External Content
                       RSS
                       Microsites
                       Application Data
                       ...
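
A rough sketch of how stopwords, synonyms and protected words could interact at index time. The word lists and the toy stemmer are assumptions made up for this example; they are not the defaults of Solr or of the TYPO3 extension. The sketch assumes that protected words are simply exempt from stemming.

```python
# All word lists are hypothetical examples, not shipped defaults.
STOPWORDS = {"the", "a", "of", "in"}      # dropped before indexing
SYNONYMS = {"tv": "television"}           # variant mapped to one canonical term
PROTECTED = {"news"}                      # never passed to the stemmer

def naive_stem(token):
    """Toy stemmer: strip one trailing 's' from longer words. Real stemmers do much more."""
    return token[:-1] if token.endswith("s") and len(token) > 3 else token

def analyze(text):
    """Lowercase, drop stopwords, map synonyms, then stem everything that is not protected."""
    terms = []
    for token in text.lower().split():
        if token in STOPWORDS:
            continue
        token = SYNONYMS.get(token, token)
        if token not in PROTECTED:
            token = naive_stem(token)
        terms.append(token)
    return terms
```

Without the protected list, "news" would be stemmed to "new" and start matching unrelated documents; the same `analyze` step would normally run on queries too, so both sides end up with the same terms.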


Query




Query Options

               Operators
                       “+” and “-” to add or exclude terms
                       Soon “and” and “or” to combine terms
                       Quotes to tie words together
                       e.g. “This is a Search with many Terms”

               Diacritical Characters
                       cuvée = cuvee
                       Søren = Sören = Soeren = Sœren = Soren
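
One possible way to make the spellings above match is to fold every term to a plain form on both the index and the query side. This is a sketch only: the `EXTRA` table and the final "oe" rule are assumptions chosen to reproduce the Søren example, not what Solr or the extension actually ships.

```python
import unicodedata

# Letters that Unicode decomposition leaves untouched (hypothetical, incomplete table).
EXTRA = {"ø": "o", "œ": "oe", "æ": "ae", "ß": "ss"}

def fold(text):
    """Lowercase, replace special letters, strip accents, then collapse 'oe' to 'o'."""
    text = text.lower()
    text = "".join(EXTRA.get(ch, ch) for ch in text)
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumed language-specific rule; it over-folds words such as "poet".
    return stripped.replace("oe", "o")
```

With this, `fold("cuvée")` gives `"cuvee"`, and all five spellings of Søren fold to `"soren"`. The price is lower precision, which is why such rules are usually configured per language.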



Query

               Takes care of Access Control Rights
               Autocomplete
               Did you mean?
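
Autocomplete and "Did you mean?" can both be driven by the terms already in the index. A small sketch using only the Python standard library; the term list is invented, and a real setup would use the search server's own suggester and spell checker rather than this code.

```python
import bisect
import difflib

# Hypothetical list of indexed terms, kept sorted for prefix lookups.
TERMS = sorted(["search", "searching", "server", "solr", "typo3", "typoscript"])

def autocomplete(prefix, terms=TERMS, limit=5):
    """Return up to `limit` indexed terms that start with the prefix."""
    start = bisect.bisect_left(terms, prefix)
    matches = []
    for term in terms[start:]:
        if not term.startswith(prefix) or len(matches) == limit:
            break
        matches.append(term)
    return matches

def did_you_mean(word, terms=TERMS):
    """Suggest the closest indexed term for a word that is not in the index."""
    if word in terms:
        return None
    candidates = difflib.get_close_matches(word, terms, n=1)
    return candidates[0] if candidates else None
```

`autocomplete("typo")` returns `["typo3", "typoscript"]`, and `did_you_mean("serch")` suggests `"search"`.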




Results




Results

               Search results link to the found item
               Page browser
               Sorting
                       Relevancy (score)
                       Author
                       Date (cr_date of TYPO3 page)
                       Your own criteria




Results

               View-Helper to display additional Information like
               Custom Prices & Preview images.
               Preset Filters so that Facets are activated with a
               Query




Results

               Field boosting (terms in certain fields score higher; the weights can be set freely)
               Boost functions (functions on values of documents, e.g. newer documents are ranked higher)
               Query manipulation (queries can be changed before they hit Solr)
               Elevation (e.g. paid content)
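
To make the ranking ideas above concrete, here is a toy scorer. It is not Lucene's scoring formula: the field weights, the reciprocal recency function and the document layout are all assumptions for illustration.

```python
from datetime import date

# Hypothetical field weights: a hit in the title counts three times a hit in the content.
FIELD_BOOSTS = {"title": 3.0, "content": 1.0}

def field_score(doc, term):
    """Field boosting: add up the weight of every field that contains the term."""
    return sum(boost for field, boost in FIELD_BOOSTS.items()
               if term in doc[field].lower().split())

def recency_boost(doc, today):
    """Boost function: 1.0 for a document created today, decaying with age in years."""
    age_days = (today - doc["created"]).days
    return 1.0 / (1.0 + age_days / 365)

def score(doc, term, today):
    return field_score(doc, term) * (1.0 + recency_boost(doc, today))

def rank(docs, term, today, elevated_ids=()):
    """Order matches by score; elevated documents are pinned to the top."""
    matches = [d for d in docs if field_score(d, term) > 0]
    matches.sort(key=lambda d: score(d, term, today), reverse=True)
    pinned = [d for d in matches if d["id"] in elevated_ids]
    rest = [d for d in matches if d["id"] not in elevated_ids]
    return [d["id"] for d in pinned + rest]

# Invented sample documents.
SAMPLE = [
    {"id": 1, "title": "Solr search", "content": "intro", "created": date(2010, 6, 15)},
    {"id": 2, "title": "News", "content": "solr is fast", "created": date(2012, 6, 15)},
    {"id": 3, "title": "Solr facets", "content": "solr facets explained", "created": date(2012, 6, 15)},
]
```

Query manipulation would sit in front of `rank`, rewriting `term` before it reaches the scorer.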




Results

               Template Engine: flexible Template to customize
               your results listing fast and easily
               Search word highlighting
               Spell-Checking: "Did you mean?"
               Common Searches
               Recent Searches




Facets

               Type-Facets
                       Author
                       Type of Document (Pages, News, Files & many more)
               Range-Facets (work in progress)
               (e.g. $1-10 or a slider)

               Hierarchical Facets
               (great if you have lots of categorized data, as in news or file repositories)

               Facets can be combined with each other
               (e.g. show me all red & blue shoes)
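
A facet is, at its core, a count of results per field value, and combining facets means filtering on several fields at once. A minimal sketch with invented product data; the rule "values within a field are OR-ed, fields are AND-ed" is an assumption that matches the shoe example.

```python
from collections import Counter

# Invented sample documents.
ITEMS = [
    {"id": 1, "type": "shoe", "color": "red"},
    {"id": 2, "type": "shoe", "color": "blue"},
    {"id": 3, "type": "shoe", "color": "green"},
    {"id": 4, "type": "hat", "color": "red"},
]

def facet_counts(docs, field):
    """Count how many documents carry each value of the field."""
    return Counter(doc[field] for doc in docs)

def apply_filters(docs, filters):
    """Keep documents that match every field; several values per field act as OR."""
    return [doc for doc in docs
            if all(doc[field] in values for field, values in filters.items())]

# "Show me all red & blue shoes"
RED_AND_BLUE_SHOES = apply_filters(ITEMS, {"type": {"shoe"}, "color": {"red", "blue"}})
```

Running `facet_counts` on the filtered list again gives the counts to display next to the remaining facet options.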




Facets

               Geo-Search (work in progress)
               (e.g. if you want to search for and display the locations of a certain type of data: stores, service points, bus stops)

               Geo-IP based on IP of your visitor
               (e.g. where is the nearest sales point for your products?)

               Facets are TYPO3 content objects
               (can be manipulated with TypoScript, e.g. GIFBUILDER)

               Filters can be preset
               (you can preset certain facets)

               ...
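
The "nearest sales point" case comes down to a distance calculation between the visitor and each indexed location. A sketch using the haversine formula; the store list and its rounded coordinates are illustrative assumptions, and the visitor position is taken as given (how Geo-IP resolves it is out of scope here).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical stores with approximate city coordinates.
STORES = [
    {"name": "Quebec City", "lat": 46.81, "lon": -71.21},
    {"name": "Montreal", "lat": 45.50, "lon": -73.57},
]

def nearest(lat, lon, places=STORES):
    """Return the place with the smallest distance to the given position."""
    return min(places, key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))
```

A search server would normally do this filtering and sorting itself; the sketch only shows what "distance" means in that context.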




Analysis




Analysis

               Query Logging
               Stats on queries (work in progress)
               User-based ranking (work in progress)
               Integration with analytics tools possible
               Roll your own
               There might be a Solr Server feature coming
               up ...
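
If you do roll your own, two numbers are usually worth pulling out of a query log first: the most frequent queries and the queries that found nothing. A sketch over an invented log of (query, hit count) pairs; the log format is an assumption, not the extension's actual logging schema.

```python
from collections import Counter

# Invented log entries: (query string, number of results returned).
LOG = [
    ("solr", 12), ("typo3", 40), ("solr", 12), ("serch", 0),
    ("facets", 3), ("serch", 0), ("solr", 12),
]

def top_queries(log, n=3):
    """Most frequent queries, the raw material for a 'common searches' list."""
    return Counter(query for query, _ in log).most_common(n)

def zero_result_queries(log):
    """Queries that found nothing: candidates for synonyms or new content."""
    return sorted({query for query, hits in log if hits == 0})
```

Zero-result queries such as the misspelled "serch" are often the cheapest wins: a synonym or spelling rule fixes them without touching the content.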




Additional Components




Additional Components

               More Like This component on the detail page can show related documents
               It's possible to access indexed data
               Nutch crawler to index non-TYPO3 websites
               Data Import Handler




dkd
                                development
                                kommunikation
                                design




                       Thank You! Merci.

Sources

                       Lucene Scoring for Dummies: http://www.supermind.org/blog/378/lucene-scoring-for-dummies
                       Photos: Søren Schaffstein




Freitag, 15. Juni 12

More Related Content

Similar to Everything you always wanted to know about search in typo3

You, me and Opendata - v2
You, me and Opendata - v2You, me and Opendata - v2
You, me and Opendata - v2
Thiago Rondon
 
Big Data Is Not the Insight: The Language Of Discovery:
Big Data Is Not the Insight: The Language Of Discovery: Big Data Is Not the Insight: The Language Of Discovery:
Big Data Is Not the Insight: The Language Of Discovery:
Joe Lamantia
 
Data Science: Expediting Use of Data by Business Users with Self-service Disc...
Data Science: Expediting Use of Data by Business Users with Self-service Disc...Data Science: Expediting Use of Data by Business Users with Self-service Disc...
Data Science: Expediting Use of Data by Business Users with Self-service Disc...
Denodo
 
Matt Bailey
Matt BaileyMatt Bailey
Matt Bailey
YouToo Social Media
 
Startds9.19.17sd
Startds9.19.17sdStartds9.19.17sd
Startds9.19.17sd
Thinkful
 
Where the Rubber Hits the Road: Real-World Stories of Force Ranking Stuff
Where the Rubber Hits the Road: Real-World Stories of Force Ranking StuffWhere the Rubber Hits the Road: Real-World Stories of Force Ranking Stuff
Where the Rubber Hits the Road: Real-World Stories of Force Ranking Stuff
Todd Olson
 
Who is a data scientist
Who is a data scientist  Who is a data scientist
Who is a data scientist
prateek kumar
 
D92-198gstindspdx
D92-198gstindspdxD92-198gstindspdx
D92-198gstindspdx
Thinkful
 
Data sci sd-11.6.17
Data sci sd-11.6.17Data sci sd-11.6.17
Data sci sd-11.6.17
Thinkful
 
Search Analytics: Diagnosing what ails your site
Search Analytics:  Diagnosing what ails your siteSearch Analytics:  Diagnosing what ails your site
Search Analytics: Diagnosing what ails your site
Louis Rosenfeld
 
Search Analytics: Conversations with Your Customers
Search Analytics: Conversations with Your CustomersSearch Analytics: Conversations with Your Customers
Search Analytics: Conversations with Your Customers
richwig
 
Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018
mark madsen
 
Search Analytics: Diagnosing what ails your site
Search Analytics:  Diagnosing what ails your siteSearch Analytics:  Diagnosing what ails your site
Search Analytics: Diagnosing what ails your site
Louis Rosenfeld
 
Search Analytics: Powerful diagnostics for your site
Search Analytics:  Powerful diagnostics for your siteSearch Analytics:  Powerful diagnostics for your site
Search Analytics: Powerful diagnostics for your site
Louis Rosenfeld
 
Search Analytics for Fun and Profit
Search Analytics for Fun and ProfitSearch Analytics for Fun and Profit
Search Analytics for Fun and Profit
Louis Rosenfeld
 
Evaluating And Downloading Images (Graham Turnbull) Scran
Evaluating And Downloading Images (Graham Turnbull) ScranEvaluating And Downloading Images (Graham Turnbull) Scran
Evaluating And Downloading Images (Graham Turnbull) Scran
The 4C Initiative
 
Deck 92-146 (3)
Deck 92-146 (3)Deck 92-146 (3)
Deck 92-146 (3)
Thinkful
 
Making things findable
Making things findableMaking things findable
Making things findable
Peter Mika
 
Getstarteddssd12717sd
Getstarteddssd12717sdGetstarteddssd12717sd
Getstarteddssd12717sd
Thinkful
 
Big data in the web
Big data in the webBig data in the web
Big data in the web
caise2013
 

Similar to Everything you always wanted to know about search in typo3 (20)

You, me and Opendata - v2
You, me and Opendata - v2You, me and Opendata - v2
You, me and Opendata - v2
 
Big Data Is Not the Insight: The Language Of Discovery:
Big Data Is Not the Insight: The Language Of Discovery: Big Data Is Not the Insight: The Language Of Discovery:
Big Data Is Not the Insight: The Language Of Discovery:
 
Data Science: Expediting Use of Data by Business Users with Self-service Disc...
Data Science: Expediting Use of Data by Business Users with Self-service Disc...Data Science: Expediting Use of Data by Business Users with Self-service Disc...
Data Science: Expediting Use of Data by Business Users with Self-service Disc...
 
Matt Bailey
Matt BaileyMatt Bailey
Matt Bailey
 
Startds9.19.17sd
Startds9.19.17sdStartds9.19.17sd
Startds9.19.17sd
 
Where the Rubber Hits the Road: Real-World Stories of Force Ranking Stuff
Where the Rubber Hits the Road: Real-World Stories of Force Ranking StuffWhere the Rubber Hits the Road: Real-World Stories of Force Ranking Stuff
Where the Rubber Hits the Road: Real-World Stories of Force Ranking Stuff
 
Who is a data scientist
Who is a data scientist  Who is a data scientist
Who is a data scientist
 
D92-198gstindspdx
D92-198gstindspdxD92-198gstindspdx
D92-198gstindspdx
 
Data sci sd-11.6.17
Data sci sd-11.6.17Data sci sd-11.6.17
Data sci sd-11.6.17
 
Search Analytics: Diagnosing what ails your site
Search Analytics:  Diagnosing what ails your siteSearch Analytics:  Diagnosing what ails your site
Search Analytics: Diagnosing what ails your site
 
Search Analytics: Conversations with Your Customers
Search Analytics: Conversations with Your CustomersSearch Analytics: Conversations with Your Customers
Search Analytics: Conversations with Your Customers
 
Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018
 
Search Analytics: Diagnosing what ails your site
Search Analytics:  Diagnosing what ails your siteSearch Analytics:  Diagnosing what ails your site
Search Analytics: Diagnosing what ails your site
 
Search Analytics: Powerful diagnostics for your site
Search Analytics:  Powerful diagnostics for your siteSearch Analytics:  Powerful diagnostics for your site
Search Analytics: Powerful diagnostics for your site
 
Search Analytics for Fun and Profit
Search Analytics for Fun and ProfitSearch Analytics for Fun and Profit
Search Analytics for Fun and Profit
 
Evaluating And Downloading Images (Graham Turnbull) Scran
Evaluating And Downloading Images (Graham Turnbull) ScranEvaluating And Downloading Images (Graham Turnbull) Scran
Evaluating And Downloading Images (Graham Turnbull) Scran
 
Deck 92-146 (3)
Deck 92-146 (3)Deck 92-146 (3)
Deck 92-146 (3)
 
Making things findable
Making things findableMaking things findable
Making things findable
 
Getstarteddssd12717sd
Getstarteddssd12717sdGetstarteddssd12717sd
Getstarteddssd12717sd
 
Big data in the web
Big data in the webBig data in the web
Big data in the web
 

More from Olivier Dobberkau

Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
Olivier Dobberkau
 
Apache Solr for TYPO3: More than a search engine
Apache Solr for TYPO3: More than a search engineApache Solr for TYPO3: More than a search engine
Apache Solr for TYPO3: More than a search engine
Olivier Dobberkau
 
TYPO3 v8 LTS in the cloud
TYPO3 v8 LTS in the cloudTYPO3 v8 LTS in the cloud
TYPO3 v8 LTS in the cloud
Olivier Dobberkau
 
With a little help from my friends (english)
With a little help  from my friends (english)With a little help  from my friends (english)
With a little help from my friends (english)
Olivier Dobberkau
 
With a little help from my friends
With a little help from my friendsWith a little help from my friends
With a little help from my friends
Olivier Dobberkau
 
TYPO3 & You
TYPO3 & YouTYPO3 & You
TYPO3 & You
Olivier Dobberkau
 
Sonnenschein für ihre Website
Sonnenschein für ihre WebsiteSonnenschein für ihre Website
Sonnenschein für ihre Website
Olivier Dobberkau
 
Apache Solr Revisited 2015
Apache Solr Revisited 2015Apache Solr Revisited 2015
Apache Solr Revisited 2015
Olivier Dobberkau
 
TYPO3 Camp Poznan - Solr Usecases with Hosted Solr
TYPO3 Camp Poznan - Solr Usecases with Hosted SolrTYPO3 Camp Poznan - Solr Usecases with Hosted Solr
TYPO3 Camp Poznan - Solr Usecases with Hosted Solr
Olivier Dobberkau
 
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Olivier Dobberkau
 
TYPO3 and CMIS
TYPO3 and CMISTYPO3 and CMIS
TYPO3 and CMIS
Olivier Dobberkau
 
ForgetIT: Beyond the page: Giving content a meaning and value
ForgetIT: Beyond the page: Giving content a meaning and valueForgetIT: Beyond the page: Giving content a meaning and value
ForgetIT: Beyond the page: Giving content a meaning and value
Olivier Dobberkau
 
ForgetIT Project TYPO3Camp Milano 2014
ForgetIT Project TYPO3Camp Milano 2014ForgetIT Project TYPO3Camp Milano 2014
ForgetIT Project TYPO3Camp Milano 2014
Olivier Dobberkau
 
Explain TYPO3 Association March 2014
Explain TYPO3 Association March 2014Explain TYPO3 Association March 2014
Explain TYPO3 Association March 2014
Olivier Dobberkau
 
Apache Solr for TYPO3 CMS 101
Apache Solr for TYPO3 CMS 101Apache Solr for TYPO3 CMS 101
Apache Solr for TYPO3 CMS 101
Olivier Dobberkau
 
EXPLAIN #t3a
EXPLAIN #t3aEXPLAIN #t3a
EXPLAIN #t3a
Olivier Dobberkau
 
Outside the Box - Panel on CMS at TYPO3 Camp Mallorca
Outside the Box - Panel on CMS at TYPO3 Camp MallorcaOutside the Box - Panel on CMS at TYPO3 Camp Mallorca
Outside the Box - Panel on CMS at TYPO3 Camp Mallorca
Olivier Dobberkau
 
Status & Outlook on EXT:solr for TYPO3 CMS
Status & Outlook on EXT:solr for TYPO3 CMSStatus & Outlook on EXT:solr for TYPO3 CMS
Status & Outlook on EXT:solr for TYPO3 CMS
Olivier Dobberkau
 
The future of CMS @T3UNI 2013 Annecy France
The future of CMS @T3UNI 2013 Annecy FranceThe future of CMS @T3UNI 2013 Annecy France
The future of CMS @T3UNI 2013 Annecy France
Olivier Dobberkau
 
Digital dark age - Are we doing enough to preserve our website heritage?
Digital dark age - Are we doing enough to preserve our website heritage?Digital dark age - Are we doing enough to preserve our website heritage?
Digital dark age - Are we doing enough to preserve our website heritage?
Olivier Dobberkau
 

More from Olivier Dobberkau (20)

Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
Meet TYPO3 Vienna - Solr die Suchmachine für TYPO3
 
Apache Solr for TYPO3: More than a search engine
Apache Solr for TYPO3: More than a search engineApache Solr for TYPO3: More than a search engine
Apache Solr for TYPO3: More than a search engine
 
TYPO3 v8 LTS in the cloud
TYPO3 v8 LTS in the cloudTYPO3 v8 LTS in the cloud
TYPO3 v8 LTS in the cloud
 
With a little help from my friends (english)
With a little help  from my friends (english)With a little help  from my friends (english)
With a little help from my friends (english)
 
With a little help from my friends
With a little help from my friendsWith a little help from my friends
With a little help from my friends
 
TYPO3 & You
TYPO3 & YouTYPO3 & You
TYPO3 & You
 
Sonnenschein für ihre Website
Sonnenschein für ihre WebsiteSonnenschein für ihre Website
Sonnenschein für ihre Website
 
Apache Solr Revisited 2015
Apache Solr Revisited 2015Apache Solr Revisited 2015
Apache Solr Revisited 2015
 
TYPO3 Camp Poznan - Solr Usecases with Hosted Solr
TYPO3 Camp Poznan - Solr Usecases with Hosted SolrTYPO3 Camp Poznan - Solr Usecases with Hosted Solr
TYPO3 Camp Poznan - Solr Usecases with Hosted Solr
 
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
Your Content hides a treasure (and you might have not found it) - ForgetIT Pr...
 
TYPO3 and CMIS
TYPO3 and CMISTYPO3 and CMIS
TYPO3 and CMIS
 
ForgetIT: Beyond the page: Giving content a meaning and value
ForgetIT: Beyond the page: Giving content a meaning and valueForgetIT: Beyond the page: Giving content a meaning and value
ForgetIT: Beyond the page: Giving content a meaning and value
 
ForgetIT Project TYPO3Camp Milano 2014
ForgetIT Project TYPO3Camp Milano 2014ForgetIT Project TYPO3Camp Milano 2014
ForgetIT Project TYPO3Camp Milano 2014
 
Explain TYPO3 Association March 2014
Explain TYPO3 Association March 2014Explain TYPO3 Association March 2014
Explain TYPO3 Association March 2014
 
Apache Solr for TYPO3 CMS 101
Apache Solr for TYPO3 CMS 101Apache Solr for TYPO3 CMS 101
Apache Solr for TYPO3 CMS 101
 
EXPLAIN #t3a
EXPLAIN #t3aEXPLAIN #t3a
EXPLAIN #t3a
 
Outside the Box - Panel on CMS at TYPO3 Camp Mallorca
Outside the Box - Panel on CMS at TYPO3 Camp MallorcaOutside the Box - Panel on CMS at TYPO3 Camp Mallorca
Outside the Box - Panel on CMS at TYPO3 Camp Mallorca
 
Status & Outlook on EXT:solr for TYPO3 CMS
Status & Outlook on EXT:solr for TYPO3 CMSStatus & Outlook on EXT:solr for TYPO3 CMS
Status & Outlook on EXT:solr for TYPO3 CMS
 
The future of CMS @T3UNI 2013 Annecy France
The future of CMS @T3UNI 2013 Annecy FranceThe future of CMS @T3UNI 2013 Annecy France
The future of CMS @T3UNI 2013 Annecy France
 
Digital dark age - Are we doing enough to preserve our website heritage?
Digital dark age - Are we doing enough to preserve our website heritage?Digital dark age - Are we doing enough to preserve our website heritage?
Digital dark age - Are we doing enough to preserve our website heritage?
 

Recently uploaded

It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...
Zilliz
 
Tailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer InsightsTailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer Insights
SynapseIndia
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
Matthias Neugebauer
 
Zaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdfZaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdf
AmandaCheung15
 
Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024
siddu769252
 
The History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal EmbeddingsThe History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal Embeddings
Zilliz
 
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and DisadvantagesBLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
BLOCKCHAIN TECHNOLOGY - Advantages and Disadvantages
SAI KAILASH R
 
Uncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in LibrariesUncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in Libraries
Brian Pichman
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
Steven Carlson
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
shanihomely
 
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Zilliz
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
bellared2
 
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
Zilliz
 
Google I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged SlidesGoogle I/O Extended Harare Merged Slides
Google I/O Extended Harare Merged Slides
Google Developer Group - Harare
 
Communications Mining Series - Zero to Hero - Session 3
Communications Mining Series - Zero to Hero - Session 3Communications Mining Series - Zero to Hero - Session 3
Communications Mining Series - Zero to Hero - Session 3
DianaGray10
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
sunilverma7884
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
ssuser1915fe1
 
UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...
UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...
UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...
FIDO Alliance
 
Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17
Bhajan Mehta
 

Recently uploaded (20)

It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...
 
Tailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer InsightsTailored CRM Software Development for Enhanced Customer Insights
Tailored CRM Software Development for Enhanced Customer Insights
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
 
Zaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdfZaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdf
 
Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024Generative AI Reasoning Tech Talk - July 2024

Everything You Always Wanted to Know About Search in TYPO3

  • 2. dkd development kommunikation design
  • 3. Welcome. TYPO3 Conference, Quebec, Canada. Olivier Dobberkau, Founder and CIO, dkd; Member of the Expert Advisory Board, TYPO3 Assoc. Twitter: @T3RevNeverend. olivier.dobberkau@dkd.de
  • 4. Everything You Always Wanted to Know About Search in TYPO3. But Were Afraid to Ask
  • 5. Woody Allen. Inspiration for this talk: the Woody Allen movie "Everything You Always Wanted to Know About Sex * But Were Afraid to Ask"
  • 6. Woody Allen. Inspiration for this talk: the Woody Allen movie "Everything You Always Wanted to Know About Sex * But Were Afraid to Ask". Internet Movie Database: http://www.imdb.com/title/tt0068555/
  • 7. Agenda: A short history of search; Slang; The need to search; Who is searching and what is (s)he searching for?; Search in TYPO3 with Apache Solr; Questions & Answers
  • 8. History: A short trip through the history of search solutions in IT. Really short, with lots of missing facts, and not scientific at all.
  • 9. Scratch your own itch, IBM. In the beginning was the mainframe. In 1969 IBM develops STAIRS (Storage and Information Retrieval System): full-text search for terminal applications. Performance: "far below anyone's expectations". First used in the DOJ case against IBM. Source: A History of Online Information Services, 1963-1976, by Charles P. Bourne and Trudi Bellardo Hahn
  • 10. Internet years are dog years. The Internet changes the needs in full-text search. With Lycos, AlltheWeb, Infoseek, Excite and AltaVista, search sites compete to answer the question "How do I find something on the Internet?" It's a race for the love of searching Internet users in 1995. Yahoo tries to be the directory of websites.
  • 11. And then came GOOGLE. Who does not know about Google's secret? "The Anatomy of a Large-Scale Hypertextual Web Search Engine": http://infolab.stanford.edu/~backrub/google.html A visionary paper. The technologies and principles it names are industry standard and are still changing our IT industry (MapReduce, big data & PageRank). A must-read!
  • 13. It's all about words! Information Retrieval (IR); Term versus Query; Index; Recall & Precision; Relevancy; Index, Inverted Index & Posting List; Recency & Authority
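The slang on slide 13 can be made concrete with a small sketch. The code below is not taken from the talk and is not how Solr or Lucene are implemented; it is a minimal illustration, assuming whitespace tokenization and hypothetical sample documents, of an inverted index with posting lists and of how precision and recall are computed for a result set.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted posting list of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    postings = [set(index.get(term, [])) for term in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []


def precision_recall(retrieved, relevant):
    """Precision: share of retrieved documents that are relevant.
    Recall: share of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "search in TYPO3",
    2: "TYPO3 content management",
    3: "full text search history",
}
index = build_inverted_index(docs)
hits = search(index, "search")
```

Here the term is the unit stored in the index, the query is what the user types, and the posting list is the list of document ids kept per term. If a user looking for TYPO3 material considers documents 1 and 2 relevant, `precision_recall(hits, [1, 2])` compares the returned list with that judgment.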
  • 14. The need to search. What leads us when we search? How do we search? How does what we find change us?
  • 15. People are like bears (only less fur). How do we search? Marcia Bates, 1989: "The Design of Browsing and Berrypicking Techniques for the Online Search Interface", http://pages.gseis.ucla.edu/faculty/bates/berrypicking.html Every search can be described with this
  • 16. Marcia J. Bates: Berrypicking techniques for the online search interface (1989)
  • 17. Carrots & Sticks. Search Behavior Patterns, John Ferrara: http://www.boxesandarrows.com/view/search-behavior Domain expertise; Search expertise; Cognitive style; Goal type; Mode of seeking; Situational idiosyncrasies
  • 18. Neo: The Matrix. Matrix of scope/style of information needs. Scope & Type: Tyler Tate; Sohn et al.; Church & Smyth. http://twigkit.com/blog/2011/12/06/mobile-information-needs.html
  • 19. Search = success for your website. Benefits for your visitors & users: they will find it on your website; serendipity; better and faster knowledge transfer. Business benefits: ROI; agility; awareness and enablement
  • 20. TYPO3 & Search. Shameless plug: Apache Solr for TYPO3. I still have some "I love Indexed Search" buttons to give away.
  • 21. Solr components: Query; Indexing; Analysis; Results; Additional Components
  • 23. What can be indexed? TYPO3 content; TYPO3 databases (TCA tables); external websites; RSS feeds; files; ...
  • 24. Indexing features: synonyms; stopwords; protected words; external content; RSS; microsites; application data; ...
  • 26. Query options: Operators "+" and "-" to add or exclude terms. Soon "and" and "or" to combine terms. Quotes to tie words together, e.g. "This is a Search with many Terms". Diacritical characters: cuvée = cuvee; Søren = Sören = Soeren = Sœren = Soren
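The diacritics equivalences on slide 26 can be sketched as a folding step applied to both indexed terms and query terms. This is only an illustration of the idea, not the filter chain that Solr or the TYPO3 extension actually uses; the `SPECIAL` table and the rule that collapses "oe" to "o" are assumptions chosen so that the slide's examples match.

```python
import unicodedata

# Characters that Unicode does not decompose into base letter + accent.
# Hypothetical, deliberately tiny table for this sketch.
SPECIAL = {"ø": "o", "œ": "oe", "ß": "ss"}


def fold(term):
    """Reduce a term to a crude, accent-insensitive matching key."""
    term = term.lower()
    term = "".join(SPECIAL.get(ch, ch) for ch in term)
    # NFKD splits "é" into "e" plus a combining accent, which is then dropped.
    decomposed = unicodedata.normalize("NFKD", term)
    term = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Treat the transliteration "oe" like a plain "o". This is crude:
    # it also alters unrelated words such as "poet".
    return term.replace("oe", "o")


variants = ["Søren", "Sören", "Soeren", "Sœren", "Soren"]
keys = {fold(name) for name in variants}
```

Because every variant folds to the same key, a query for any spelling finds documents indexed under any other. The trade-off is lower precision, since distinct words can collapse onto one key.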
  • 27. Query: takes care of access control rights; autocomplete; "Did you mean?"
  • 29. Results: search results linking to a result; page browser; sorting by relevancy (score), author, date (cr_date of the TYPO3 page), or your own criteria
  • 30. Results: view helpers to display additional information like custom prices & preview images. Preset filters so that facets are activated with a query
  • 31. Results: Field boosting (terms in certain fields score higher; can be freely set). Boost functions (functions on values of documents, e.g. newer documents are ranked higher). Query manipulation (queries can be changed before they hit Solr). Elevation (paid content)
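Field boosting and boost functions from slide 31 can be illustrated with a toy scorer. This is a sketch under stated assumptions, not Lucene's scoring formula: the boost values in `FIELD_BOOSTS`, the one-year half-life and the sample documents are all hypothetical.

```python
from datetime import date

# Hypothetical boosts: a hit in the title counts three times as much as one in the body.
FIELD_BOOSTS = {"title": 3.0, "content": 1.0}


def field_score(doc, term):
    """Sum the boost of every field whose text contains the term."""
    term = term.lower()
    return sum(
        boost
        for field, boost in FIELD_BOOSTS.items()
        if term in doc.get(field, "").lower().split()
    )


def recency_boost(doc, today, half_life_days=365):
    """Boost function: 1.0 for a document created today, halving every half_life_days."""
    age_days = (today - doc["created"]).days
    return 0.5 ** (age_days / half_life_days)


def score(doc, term, today):
    """Combine the field match score with the recency boost."""
    return field_score(doc, term) * recency_boost(doc, today)


today = date(2012, 6, 15)
docs = [
    {"id": "a", "title": "Solr search", "content": "intro", "created": date(2012, 6, 15)},
    {"id": "b", "title": "News", "content": "search tips", "created": date(2012, 6, 15)},
    {"id": "c", "title": "Solr search", "content": "old", "created": date(2011, 6, 16)},
]
ranked = sorted(docs, key=lambda d: score(d, "search", today), reverse=True)
ranked_ids = [d["id"] for d in ranked]
```

In this example the fresh title match ranks first, the year-old title match second, and the fresh body-only match last, which shows how the two kinds of boost interact.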
  • 32. Results: Template engine: flexible templates to customize your results listing fast and easily. Search word highlighting. Spell-checking: "Did you mean?" Common searches. Recent searches
  • 33. Facets: Type facets: author, type of document (pages, news, files & many more). Range facets (work in progress), e.g. 1-10 $ or a slider. Hierarchical facets (great if you have lots of categorized data, as in news or file repositories). Facets can be combined with each other (e.g. show me all red & blue shoes)
  • 34. Facets: Geo search (work in progress), e.g. if you want to search and display the location of your data of a certain type: stores, service points, bus stops. Geo-IP based on the IP of your visitor (e.g. where is the nearest point of sale for your products?). Facets are TYPO3 content objects (can be manipulated with TypoScript, e.g. GIFBUILDER). Filters can be preset (you can preset certain facets). ...
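The facet behavior described on slides 33 and 34 can be sketched in a few lines. The example is an in-memory illustration with hypothetical documents and field names, not the Solr faceting API. It assumes that "red & blue shoes" means values within one facet are OR-ed while different facets are AND-ed.

```python
from collections import Counter

# Hypothetical result documents with two facet fields, "type" and "color".
docs = [
    {"title": "Sneaker", "type": "product", "color": "red"},
    {"title": "Boot", "type": "product", "color": "blue"},
    {"title": "Sandal", "type": "product", "color": "green"},
    {"title": "Shoe care tips", "type": "news", "color": None},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return Counter(doc[field] for doc in results if doc.get(field) is not None)


def apply_filters(results, filters):
    """Keep documents matching every facet; values within one facet are OR-ed."""
    return [
        doc
        for doc in results
        if all(doc.get(field) in allowed for field, allowed in filters.items())
    ]


# A preset filter is simply a filter applied before the user selects anything.
shoes = apply_filters(docs, {"type": {"product"}, "color": {"red", "blue"}})
shoe_titles = [doc["title"] for doc in shoes]
```

`facet_counts(docs, "type")` gives the numbers shown next to each facet value, and `apply_filters` narrows the result list once the visitor selects values.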
  • 36. Analysis: Query logging. Stats on queries (work in progress). User-based ranking (work in progress). Integration with analytics tools possible. Roll your own. There might be a Solr Server feature coming up ...
  • 38. Additional components: The More Like This component on the details page can show related additional documents. It's possible to access indexed data. Nutch crawler to index non-TYPO3 websites. Data Import Handler
  • 39. dkd development kommunikation design. Thank You! Merci.
  • 40. Sources: Lucene Scoring for Dummies: http://www.supermind.org/blog/378/lucene-scoring-for-dummies Photos: Søren Schaffstein