SlideShare a Scribd company logo
1 of 23
Download to read offline
Search, APIs,
Capability Management
           and
  the Sensis Journey


      Craig Rees
•    Project background

•    Platform selection

•    Search capability

•    Relevance

•    Architecture

•    Quality management

•    Hurdles

•    What’s next



 Today’s menu
•  Sensis helps Australians find,
          buy and sell

         •  From print directories to a
          cross-platform lead generator

         •  Sensis publishes over 1.8
          Million business listings

         •  Two of the top 10 visited online
          sites in Australia
          (WhitePages.com.au and
          YellowPages.com.au)


Sensis
Business objectives
•    Drive presence in the local
     search market place
•    Open up the largest database of
     business listings in Australia
•    Reduce the effort required from
     local search developers           Technology objectives
•    Free to use, we are after the     •    Develop a total search platform
     reporting                         •    Relevancy testing as part of the
                                            development lifecycle
                                       •    A framework to identify problem
                                            spaces
                                       •    Manageable platform
                                       •    Continuous deployments


Project background
Developer portal
•    Support for the search
     capability team

•    Structured vs non
     structured data

•    Deterministic vs black
     box

•    Non propriety code base

•    Community backing




     Platform selection
•  A/B testing
                                                      •  Machine learning
Optimized                                     Lvl 5   •  External collaboration
                                                      •  Multiple contexts


                                                                   •  Online dashboards
                                                                   •  Test environments
Managed                                       Lvl 4                •  Dynamic search refinements
                                                                   •  Targets and metrics


                                                                             •  Defined team
                                                                             •  Regular monitoring
Monitored                                     Lvl 3                          •  Static autosuggest
                                                                             •  Basic linguistics


                                                                                  •  Adhoc processes
                                                                                  •  Part time team
Adhoc                                         Lvl 2                               •  Static dictionaries
                                                                                  •  Individual led innovation

                                                                                      •  No resources
                                                                                      •  No reporting
Unmanaged                                     Lvl 1                                   •  Out of the box
                                                                                         features




The Sensis Search capability maturity model
*Courtesy of Pete Crawford & Craig Lonsdale
Location



                 Intent       Chronology
                 •  Name
                 •  Type
                              Social Graph
                 •  Product
                 •  Spatial

                                Device




                               Individual



Context is key
Business                         Geo Service
    Data



                                       Solr                     Mashery
  Business                             Name Query
    Data                                                         Search
               MongoDB                   Handler                 Service
                           Index                      API                   Publisher
                                                                Reporting
                                       Type Query
                                                                 Service
                                         Handler

  Historical
   search
    Data

                                                    Reporting
                                                     Events

                         Ontologies




Our architecture
Business                         Geo Service
    Data



                                       Solr                     Mashery
  Business                             Name Query
    Data                                                         Search
               MongoDB                   Handler                 Service
                           Index                      API                   Publisher
                                                                Reporting
                                       Type Query
                                                                 Service
                                         Handler

  Historical
   search
    Data

                                                    Reporting
                                                     Events

                         Ontologies




Data staging
Business                          Geo Service
   Data



                                       Solr                     Mashery
 Business                              Name Query
   Data                                                          Search
               MongoDB                   Handler                 Service
                           Index                      API                   Publisher
                                                                Reporting
                                       Type Query
                                                                 Service
                                         Handler

  Historical
   search
    Data

                                                    Reporting
                                                     Events

                         Ontologies




Search
Business                          Geo Service
   Data



                                       Solr                     Mashery
 Business                              Name Query
   Data                                                          Search
               MongoDB                   Handler                 Service
                           Index                      API                   Publisher
                                                                Reporting
                                       Type Query
                                                                 Service
                                         Handler

  Historical
   search
    Data

                                                    Reporting
                                                     Events

                         Ontologies




API
Business                         Geo Service
    Data



                                       Solr                     Mashery
 Business                              Name Query
   Data                                                          Search
               MongoDB                   Handler                 Service
                           Index                      API                   Publisher
                                                                Reporting
                                       Type Query
                                                                 Service
                                         Handler

  Historical
   search
    Data

                                                    Reporting
                                                     Events

                         Ontologies




API proxy
•  Moved from a black box solution    Yesterday   Today   Tomorrow
   to a manageable platform
•  Deliver search improvements
   without major code changes
•  Understand how results were
   calculated
•  Identity problems scientifically
•  Continuously tune and test
   relevance




  Evolution of search management
Specific gold sets for each
       Path Analysis         problem space:
       used to identify       Ø    Intent
                                    Spelling & stemming
       problems
                              Ø 
                              Ø    Location
       spaces                 Ø    Phrase parsing




                             Features signed off
       “Gold Sets”           only when they make
       used to define        a positive impact to
       overall quality       quality score
       score (TREC)



Problem spaces, quality management & tuning
Search quality analysis and testing
Results examiner
Score analysis
Tuning
Lather, rinse, repeat
•    Data redundancy and homogeneity
                   •    Solr ranking of rare terms
                   •    Intent differentiation
                   •    Contextual synonyms




Hurdles along the way
•    Query engine
              •    Facets / autosuggest
              •    Real time tuning
              •    Machine learning
              •    Multi term queries
              •    Scoring thresholds
              •    Content Value




Where next?
Email: craig.rees@sensis.com.au
             www: developers.sensis.com.au
             Twitter: @SensisAPI
                      @ablebagel




Questions?

More Related Content

Viewers also liked

Jonh Lennon
Jonh LennonJonh Lennon
Jonh Lennontanica
 
Tv ролики
Tv роликиTv ролики
Tv роликиtarodnova
 
Creating Custom Finishes
Creating Custom FinishesCreating Custom Finishes
Creating Custom Finishesguest0a3c64a
 
Mujer, pajaro y estrella
Mujer, pajaro y estrellaMujer, pajaro y estrella
Mujer, pajaro y estrellaguest986e5ae
 
Impact of open source search on the intelligence community
Impact of open source search on the intelligence communityImpact of open source search on the intelligence community
Impact of open source search on the intelligence communityLucidworks (Archived)
 
Amazing grace[1]
Amazing grace[1]Amazing grace[1]
Amazing grace[1]tanica
 
20101023 ie9 cache
20101023 ie9 cache20101023 ie9 cache
20101023 ie9 cache彰 村地
 
Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"Lucidworks (Archived)
 
Understanding Lucene Search Performance
Understanding Lucene Search PerformanceUnderstanding Lucene Search Performance
Understanding Lucene Search PerformanceLucidworks (Archived)
 
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrMinneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrLucidworks (Archived)
 
Lucene rev preso busch realtime search lr1010
Lucene rev preso busch realtime search lr1010Lucene rev preso busch realtime search lr1010
Lucene rev preso busch realtime search lr1010Lucidworks (Archived)
 
Practical Search with Solr: Beyond just Looking it Up
Practical Search with Solr: Beyond just Looking it UpPractical Search with Solr: Beyond just Looking it Up
Practical Search with Solr: Beyond just Looking it UpLucidworks (Archived)
 
Network Forensics Puzzle Contest に挑戦 #2
Network Forensics Puzzle Contest に挑戦 #2Network Forensics Puzzle Contest に挑戦 #2
Network Forensics Puzzle Contest に挑戦 #2彰 村地
 
Descritores de linguagem
Descritores de linguagemDescritores de linguagem
Descritores de linguagemgindri
 
Webテクノロジー@2012
Webテクノロジー@2012Webテクノロジー@2012
Webテクノロジー@2012彰 村地
 
Already, just, still, yet
Already, just, still, yetAlready, just, still, yet
Already, just, still, yettanica
 

Viewers also liked (20)

Jonh Lennon
Jonh LennonJonh Lennon
Jonh Lennon
 
Tv ролики
Tv роликиTv ролики
Tv ролики
 
Creating Custom Finishes
Creating Custom FinishesCreating Custom Finishes
Creating Custom Finishes
 
Mujer, pajaro y estrella
Mujer, pajaro y estrellaMujer, pajaro y estrella
Mujer, pajaro y estrella
 
All Data Big and Small
All Data Big and SmallAll Data Big and Small
All Data Big and Small
 
Impact of open source search on the intelligence community
Impact of open source search on the intelligence communityImpact of open source search on the intelligence community
Impact of open source search on the intelligence community
 
Real Time Search at Yammer
Real Time Search at YammerReal Time Search at Yammer
Real Time Search at Yammer
 
Amazing grace[1]
Amazing grace[1]Amazing grace[1]
Amazing grace[1]
 
20101023 ie9 cache
20101023 ie9 cache20101023 ie9 cache
20101023 ie9 cache
 
Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"
 
Understanding Lucene Search Performance
Understanding Lucene Search PerformanceUnderstanding Lucene Search Performance
Understanding Lucene Search Performance
 
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrMinneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
 
корея
кореякорея
корея
 
Lucene rev preso busch realtime search lr1010
Lucene rev preso busch realtime search lr1010Lucene rev preso busch realtime search lr1010
Lucene rev preso busch realtime search lr1010
 
Practical Search with Solr: Beyond just Looking it Up
Practical Search with Solr: Beyond just Looking it UpPractical Search with Solr: Beyond just Looking it Up
Practical Search with Solr: Beyond just Looking it Up
 
Network Forensics Puzzle Contest に挑戦 #2
Network Forensics Puzzle Contest に挑戦 #2Network Forensics Puzzle Contest に挑戦 #2
Network Forensics Puzzle Contest に挑戦 #2
 
What’s new in apache solr 1.4
What’s new in apache solr 1.4What’s new in apache solr 1.4
What’s new in apache solr 1.4
 
Descritores de linguagem
Descritores de linguagemDescritores de linguagem
Descritores de linguagem
 
Webテクノロジー@2012
Webテクノロジー@2012Webテクノロジー@2012
Webテクノロジー@2012
 
Already, just, still, yet
Already, just, still, yetAlready, just, still, yet
Already, just, still, yet
 

Similar to "Search, APIs,Capability Management and the Sensis Journey"

Hadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected IntelligenceHadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected IntelligenceTed Dunning
 
Crowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over HadoopCrowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over HadoopDataWorks Summit
 
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrGrant Ingersoll
 
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrGrant Ingersoll
 
Kuali update v4 - mw
Kuali update   v4 - mwKuali update   v4 - mw
Kuali update v4 - mwsarnoa
 
10 Things I Like in SharePoint 2013 Search
10 Things I Like in SharePoint 2013 Search10 Things I Like in SharePoint 2013 Search
10 Things I Like in SharePoint 2013 SearchSPC Adriatics
 
SPCAdriatics - 10 Things I Like In SharePoint 2013 Search
SPCAdriatics - 10 Things I Like In SharePoint 2013 SearchSPCAdriatics - 10 Things I Like In SharePoint 2013 Search
SPCAdriatics - 10 Things I Like In SharePoint 2013 SearchAgnes Molnar
 
Large-Scale Search Discovery Analytics with Hadoop, Mahout, Solr
Large-Scale Search Discovery Analytics with Hadoop, Mahout, SolrLarge-Scale Search Discovery Analytics with Hadoop, Mahout, Solr
Large-Scale Search Discovery Analytics with Hadoop, Mahout, SolrDataWorks Summit
 
Business intelligence-solutions 2012-english
Business intelligence-solutions 2012-englishBusiness intelligence-solutions 2012-english
Business intelligence-solutions 2012-englishStratebi
 
Oracle Application Management Suite
Oracle Application Management SuiteOracle Application Management Suite
Oracle Application Management SuiteOracleVolutionSeries
 
Information architecture strategic process
Information architecture strategic processInformation architecture strategic process
Information architecture strategic processKerry Dirks MCPS MS
 
Leveraging Solr and Mahout
Leveraging Solr and MahoutLeveraging Solr and Mahout
Leveraging Solr and MahoutGrant Ingersoll
 
A Behind the Scenes Look at the Force.com Platform
A Behind the Scenes Look at the Force.com PlatformA Behind the Scenes Look at the Force.com Platform
A Behind the Scenes Look at the Force.com PlatformSalesforce Developers
 
2010 10-building-global-listening-platform-with-solr
2010 10-building-global-listening-platform-with-solr2010 10-building-global-listening-platform-with-solr
2010 10-building-global-listening-platform-with-solrLucidworks (Archived)
 
Excellence in asset information management sid snitkin arc 2008
Excellence in asset information management sid snitkin arc 2008Excellence in asset information management sid snitkin arc 2008
Excellence in asset information management sid snitkin arc 2008ARC Advisory Group
 
MetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
MetaVis Webinar - 10 Things I Like in SharePoint 2013 SearchMetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
MetaVis Webinar - 10 Things I Like in SharePoint 2013 SearchAgnes Molnar
 

Similar to "Search, APIs,Capability Management and the Sensis Journey" (20)

Project Implenmentation Methodology
Project Implenmentation MethodologyProject Implenmentation Methodology
Project Implenmentation Methodology
 
Hadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected IntelligenceHadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected Intelligence
 
Crowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over HadoopCrowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over Hadoop
 
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
 
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
 
Kuali update v4 - mw
Kuali update   v4 - mwKuali update   v4 - mw
Kuali update v4 - mw
 
10 Things I Like in SharePoint 2013 Search
10 Things I Like in SharePoint 2013 Search10 Things I Like in SharePoint 2013 Search
10 Things I Like in SharePoint 2013 Search
 
SPCAdriatics - 10 Things I Like In SharePoint 2013 Search
SPCAdriatics - 10 Things I Like In SharePoint 2013 SearchSPCAdriatics - 10 Things I Like In SharePoint 2013 Search
SPCAdriatics - 10 Things I Like In SharePoint 2013 Search
 
Large-Scale Search Discovery Analytics with Hadoop, Mahout, Solr
Large-Scale Search Discovery Analytics with Hadoop, Mahout, SolrLarge-Scale Search Discovery Analytics with Hadoop, Mahout, Solr
Large-Scale Search Discovery Analytics with Hadoop, Mahout, Solr
 
32 cc 3_a_l-drumheller
32 cc 3_a_l-drumheller32 cc 3_a_l-drumheller
32 cc 3_a_l-drumheller
 
SharePoint Development
SharePoint DevelopmentSharePoint Development
SharePoint Development
 
Business intelligence-solutions 2012-english
Business intelligence-solutions 2012-englishBusiness intelligence-solutions 2012-english
Business intelligence-solutions 2012-english
 
Oracle Application Management Suite
Oracle Application Management SuiteOracle Application Management Suite
Oracle Application Management Suite
 
SEALS @ WWW2012
SEALS @ WWW2012SEALS @ WWW2012
SEALS @ WWW2012
 
Information architecture strategic process
Information architecture strategic processInformation architecture strategic process
Information architecture strategic process
 
Leveraging Solr and Mahout
Leveraging Solr and MahoutLeveraging Solr and Mahout
Leveraging Solr and Mahout
 
A Behind the Scenes Look at the Force.com Platform
A Behind the Scenes Look at the Force.com PlatformA Behind the Scenes Look at the Force.com Platform
A Behind the Scenes Look at the Force.com Platform
 
2010 10-building-global-listening-platform-with-solr
2010 10-building-global-listening-platform-with-solr2010 10-building-global-listening-platform-with-solr
2010 10-building-global-listening-platform-with-solr
 
Excellence in asset information management sid snitkin arc 2008
Excellence in asset information management sid snitkin arc 2008Excellence in asset information management sid snitkin arc 2008
Excellence in asset information management sid snitkin arc 2008
 
MetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
MetaVis Webinar - 10 Things I Like in SharePoint 2013 SearchMetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
MetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
 

More from Lucidworks (Archived)

Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Lucidworks (Archived)
 
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and SolrLucidworks (Archived)
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessLucidworks (Archived)
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceLucidworks (Archived)
 
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineChicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineLucidworks (Archived)
 
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with SearchChicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with SearchLucidworks (Archived)
 
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchMinneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchLucidworks (Archived)
 
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...Lucidworks (Archived)
 
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...Lucidworks (Archived)
 
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Lucidworks (Archived)
 
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DCBig Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DCLucidworks (Archived)
 
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DCWhat's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DCLucidworks (Archived)
 
Solr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DCSolr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DCLucidworks (Archived)
 
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCLucidworks (Archived)
 
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DCTest Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DCLucidworks (Archived)
 
Building a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLKBuilding a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLKLucidworks (Archived)
 
Introducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinarIntroducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinarLucidworks (Archived)
 

More from Lucidworks (Archived) (20)

Integrating Hadoop & Solr
Integrating Hadoop & SolrIntegrating Hadoop & Solr
Integrating Hadoop & Solr
 
The Data-Driven Paradigm
The Data-Driven ParadigmThe Data-Driven Paradigm
The Data-Driven Paradigm
 
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
 
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
 
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineChicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
 
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with SearchChicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
 
What's new in solr june 2014
What's new in solr june 2014What's new in solr june 2014
What's new in solr june 2014
 
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchMinneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
 
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
 
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
 
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
 
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DCBig Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
 
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DCWhat's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
 
Solr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DCSolr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DC
 
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
 
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DCTest Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
 
Building a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLKBuilding a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLK
 
Introducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinarIntroducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinar
 

Recently uploaded

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 

Recently uploaded (20)

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 

"Search, APIs,Capability Management and the Sensis Journey"

  • 1. Search, APIs, Capability Management and the Sensis Journey Craig Rees
  • 2. •  Project background •  Platform selection •  Search capability •  Relevance •  Architecture •  Quality management •  Hurdles •  What’s next Today’s menu
  • 3. •  Sensis helps Australians find, buy and sell •  From print directories to a cross-platform lead generator •  Sensis publishes over 1.8 Million business listings •  Two of the top 10 visited online sites in Australia (WhitePages.com.au and YellowPages.com.au) Sensis
  • 4. Business objectives •  Drive presence in the local search market place •  Open up the largest database of business listings in Australia •  Reduce the effort required from local search developers Technology objectives •  Free to use, we are after the •  Develop a total search platform reporting •  Relevancy testing as part of the development lifecycle •  A framework to identify problem spaces •  Manageable platform •  Continuous deployments Project background
  • 6. •  Support for the search capability team •  Structured vs non structured data •  Deterministic vs black box •  Non propriety code base •  Community backing Platform selection
  • 7. •  A/B testing •  Machine learning Optimized Lvl 5 •  External collaboration •  Multiple contexts •  Online dashboards •  Test environments Managed Lvl 4 •  Dynamic search refinements •  Targets and metrics •  Defined team •  Regular monitoring Monitored Lvl 3 •  Static autosuggest •  Basic linguistics •  Adhoc processes •  Part time team Adhoc Lvl 2 •  Static dictionaries •  Individual led innovation •  No resources •  No reporting Unmanaged Lvl 1 •  Out of the box features The Sensis Search capability maturity model *Courtesy of Pete Crawford & Craig Lonsdale
  • 8. Location Intent Chronology •  Name •  Type Social Graph •  Product •  Spatial Device Individual Context is key
  • 9. Business Geo Service Data Solr Mashery Business Name Query Data Search MongoDB Handler Service Index API Publisher Reporting Type Query Service Handler Historical search Data Reporting Events Ontologies Our architecture
  • 10. Business Geo Service Data Solr Mashery Business Name Query Data Search MongoDB Handler Service Index API Publisher Reporting Type Query Service Handler Historical search Data Reporting Events Ontologies Data staging
  • 11. Business Geo Service Data Solr Mashery Business Name Query Data Search MongoDB Handler Service Index API Publisher Reporting Type Query Service Handler Historical search Data Reporting Events Ontologies Search
  • 12. Business Geo Service Data Solr Mashery Business Name Query Data Search MongoDB Handler Service Index API Publisher Reporting Type Query Service Handler Historical search Data Reporting Events Ontologies API
  • 13. Business Geo Service Data Solr Mashery Business Name Query Data Search MongoDB Handler Service Index API Publisher Reporting Type Query Service Handler Historical search Data Reporting Events Ontologies API proxy
  • 14. •  Moved from a black box solution Yesterday Today Tomorrow to a manageable platform •  Deliver search improvements without major code changes •  Understand how results were calculated •  Identity problems scientifically •  Continuously tune and test relevance Evolution of search management
  • 15. Specific gold sets for each Path Analysis problem space: used to identify Ø  Intent Spelling & stemming problems Ø  Ø  Location spaces Ø  Phrase parsing Features signed off “Gold Sets” only when they make used to define a positive impact to overall quality quality score score (TREC) Problem spaces, quality management & tuning
  • 16. Search quality analysis and testing
  • 21. •  Data redundancy and homogeneity •  Solr ranking of rare terms •  Intent differentiation •  Contextual synonyms Hurdles along the way
  • 22. •  Query engine •  Facets / autosuggest •  Real time tuning •  Machine learning •  Multi term queries •  Scoring thresholds •  Content Value Where next?
  • 23. Email: craig.rees@sensis.com.au www: developers.sensis.com.au Twitter: @SensisAPI @ablebagel Questions?