Search, APIs, capability management and Sensis's journey

My talk at Lucene Revolution 2011 about how Sensis is using Solr to deliver its API strategy and some of the drivers that define how search is managed at Sensis

Search, APIs, capability management and Sensis's journey Presentation Transcript

  • 1. Search, APIs, Capability Management and the Sensis Journey
      • Craig Rees
  • 2. Today’s menu
      • Project background
      • Platform selection
      • Search capability
      • Relevance
      • Architecture
      • Quality management
      • Hurdles
      • What’s next
  • 3. Sensis
      • Sensis helps Australians find, buy and sell
      • From print directories to a cross-platform lead generator
      • Sensis publishes over 1.8 million business listings
      • Two of the top 10 visited online sites in Australia (WhitePages.com.au and YellowPages.com.au)
  • 4. Project background
      • Business objectives
          • Drive presence in the local search marketplace
          • Open up the largest database of business listings in Australia
          • Reduce the effort required from local search developers
          • Free to use, we are after the reporting
      • Technology objectives
          • Develop a total search platform
          • Relevancy testing as part of the development lifecycle
          • A framework to identify problem spaces
          • Manageable platform
          • Continuous deployments
  • 5. Developer portal
  • 6. Platform selection
      • Support for the search capability team
      • Structured vs non-structured data
      • Deterministic vs black box
      • Non-proprietary code base
      • Community backing
  • 7. The Sensis Search capability maturity model (courtesy of Pete Crawford & Craig Lonsdale)
      • Levels: Unmanaged (Lvl 1), Adhoc (Lvl 2), Monitored (Lvl 3), Managed (Lvl 4), Optimized (Lvl 5)
      • Characteristics, in slide order from Lvl 5 down to Lvl 1: A/B testing; Machine learning; External collaboration; Multiple contexts; Online dashboards; Test environments; Dynamic search refinements; Targets and metrics; Defined team; Regular monitoring; Static autosuggest; Basic linguistics; Adhoc processes; Part time team; Static dictionaries; Individual led innovation; No resources; No reporting; Out of the box features
  • 8. Context is key
      • Location, Intent, Chronology, Social Graph, Device, Individual
      • Name, Type, Product, Spatial
  • 9. Our architecture
      • [Architecture diagram. Recoverable labels include business and geo data, Solr, Mashery, MongoDB, name and type query handlers, a search service, an index publisher, an API, reporting, historical search data and ontologies.]
  • 10. Data staging [architecture diagram]
  • 11. Search [architecture diagram]
  • 12. API [architecture diagram]
  • 13. API proxy [architecture diagram]
  • 14. Evolution of search management
      • Timeline: Yesterday, Today, Tomorrow
      • Moved from a black box solution to a manageable platform
      • Deliver search improvements without major code changes
      • Understand how results were calculated
      • Identify problems scientifically
      • Continuously tune and test relevance
  • 15. Problem spaces, quality management & tuning
      • Path Analysis used to identify problem spaces
      • Specific gold sets for each problem space:
          • Intent
          • Spelling & stemming
          • Location
          • Phrase parsing
      • “Gold Sets” used to define overall quality score (TREC)
      • Features signed off only when they make a positive impact to quality score
  • 16. Search quality analysis and testing
  • 17. Results examiner
  • 18. Score analysis
  • 19. Tuning
  • 20. Lather, rinse, repeat
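
The quality loop described on slides 15 to 20 (gold sets per problem space, a quality score, sign-off only on a positive impact) can be sketched roughly as follows. This is a minimal illustration, not the Sensis tooling: the slides only say the score is TREC-style, so precision@k is an assumed stand-in metric, and the function names, gold-set contents and result lists are all hypothetical.

```python
from statistics import mean


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k result slots filled by gold-set (relevant) ids."""
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids) / k


def quality_score(gold_set, search, k):
    """Mean precision@k over every query in one gold set."""
    return mean(
        precision_at_k(search(query), relevant, k)
        for query, relevant in gold_set.items()
    )


def sign_off(gold_sets, baseline_search, candidate_search, k=2):
    """Compare a candidate tuning against the baseline, per problem space.

    Returns (approved, deltas); approved is True only when the mean change
    across problem spaces is positive.
    """
    deltas = {
        space: quality_score(gold, candidate_search, k)
        - quality_score(gold, baseline_search, k)
        for space, gold in gold_sets.items()
    }
    return mean(deltas.values()) > 0, deltas


# Hypothetical gold sets: query -> ids of listings judged relevant.
gold_sets = {
    "spelling": {"plumer melbourne": {"b1", "b2"}},
    "location": {"cafe carlton": {"b7"}},
}

# Hypothetical ranked results before and after a tuning change.
baseline_results = {"plumer melbourne": ["b9", "b1"], "cafe carlton": ["b7", "b3"]}
candidate_results = {"plumer melbourne": ["b1", "b2"], "cafe carlton": ["b7", "b3"]}

approved, deltas = sign_off(
    gold_sets,
    lambda q: baseline_results[q],
    lambda q: candidate_results[q],
)
print(approved, deltas)
```

Keeping a separate delta per problem space makes it visible when a change helps one space (say, spelling) while hurting another, which a single overall number would hide.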
  • 21. Hurdles along the way
      • Data redundancy and homogeneity
      • Solr ranking of rare terms
      • Intent differentiation
      • Contextual synonyms
  • 22. Where next?
      • Query engine
      • Facets / autosuggest
      • Real time tuning
      • Machine learning
      • Multi term queries
      • Scoring thresholds
      • Content Value
  • 23. Questions?
      • Email: craig.rees@sensis.com.au
      • www: developers.sensis.com.au
      • Twitter: @SensisAPI @ablebagel
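
One plausible reading of the "Solr ranking of rare terms" hurdle is the inverse document frequency (idf) factor in Lucene's classic TF-IDF scoring, which rewards terms that occur in few documents. The sketch below uses the classic idf formula, 1 + ln(numDocs / (docFreq + 1)); the document frequencies are made-up numbers, and the slides do not say this was the exact cause.

```python
import math


def classic_idf(doc_freq, num_docs):
    """idf as in Lucene's classic TF-IDF similarity: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


NUM_LISTINGS = 1_800_000  # "over 1.8 million business listings" per slide 3

# Hypothetical document frequencies for a common and a rare term.
common_term_idf = classic_idf(20_000, NUM_LISTINGS)
rare_term_idf = classic_idf(3, NUM_LISTINGS)

print(f"common term idf: {common_term_idf:.2f}")
print(f"rare term idf:   {rare_term_idf:.2f}")
```

In a corpus of short, homogeneous business listings, a token that appears in only a handful of records (an unusual word or a misspelling, for instance) gets a much larger idf than a mainstream category word, so it can dominate the ranking of a multi-term query.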