Implementing and designing search solutions

  Gothenburg University – Gothenburg – 2012-03-08



                                                © FINDWISE 2012
Agenda

    •    Introduction to Findwise
    •    Technical approach
    •    DIY UX design
    •    Research
About Findwise




•   Founded in 2005

•   Offices in Sweden, Denmark,
    Norway and Poland

•   72 employees (February 2012)

•   Our objective is to be a leading provider of Findability solutions utilising
    the full potential of search technology to create customer business value
Technology independent
Creating search-driven Findability solutions based on market-leading
commercial and open source search technology platforms:

   Autonomy IDOL
   Microsoft (SharePoint and FAST Search products)
   Google GSA
   IBM ICA/OmniFind
   LucidWorks
   Apache Lucene/Solr (Open source)
   and more…
Findability Challenges

 Employee productivity (DN article, March 2011):
 “The effort to find the right information costs an average company 80,000
 SEK per employee per year”


 Customer Service quality and efficiency (Accenture report, March 2011):
 “69% of agents don't have answers to help service customers”


 E-commerce conversion rate (Google survey, December 2010):
 “77% of those surveyed used search within an e-commerce website to find
 products”
Information overload?
A search engine alone is not enough
Technical approach
RE-USE
STANDARD
Standard architecture
Search core
Search core - overview
Documents

  Document 1
    Title: Brown fox
    Content: The quick brown fox jumps over the lazy dog
    Author: Tobias Berg

  Document 2
    Title: My dog
    Content: My old dog cannot jump anymore
    Author: Svetoslav Marinov

Processing: Tokenization, Stemming, Stop-word, …

Inverted index

  Term      Documents
  …         …
  fox       1
  jump      1,2
  lazy      1
  dog       1,2
  tobias    1
  berg      1
  …         …
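
The indexing steps on this slide can be sketched in a few lines of Java. This is a minimal illustration, not how any particular search engine implements it: the class name, the stop-word list and the one-entry stem table are all assumptions chosen so that the two example documents produce the terms shown above.

```java
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Minimal inverted index sketch: term -> sorted set of document ids. */
class InvertedIndexSketch {
    // Hypothetical stop-word list, just large enough for the two example documents.
    static final Set<String> STOP_WORDS = Set.of("the", "my", "over", "cannot", "anymore");
    // Hypothetical stem table standing in for a real stemmer.
    static final Map<String, String> STEMS = Map.of("jumps", "jump");

    final Map<String, SortedSet<Integer>> postings = new TreeMap<>();

    /** Tokenize, drop stop words, stem, and record the document id for each term. */
    void add(int docId, String text) {
        for (String token : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            postings.computeIfAbsent(stem(token), k -> new TreeSet<>()).add(docId);
        }
    }

    static String stem(String token) {
        return STEMS.getOrDefault(token, token);
    }

    /** Returns the ids of the documents containing the term, or an empty set. */
    SortedSet<Integer> lookup(String term) {
        return postings.getOrDefault(stem(term.toLowerCase()), new TreeSet<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Brown fox. The quick brown fox jumps over the lazy dog. Tobias Berg");
        index.add(2, "My dog. My old dog cannot jump anymore. Svetoslav Marinov");
        System.out.println("fox  -> " + index.lookup("fox"));
        System.out.println("jump -> " + index.lookup("jump"));
    }
}
```

A query is answered by looking up each query term in the map instead of scanning the documents, which is what makes full-text search fast.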
Relevancy



  [Diagram: retrieved documents vs. relevant documents]




  • Precision – how many of the retrieved documents are relevant?
  • Recall – how many of the relevant documents were retrieved?
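
Both measures are ratios over the same overlap between the two sets. The sketch below computes them for sets of document ids; the class and method names are illustrative, and returning 0.0 for an empty set is a convention assumed here rather than something stated on the slide.

```java
import java.util.HashSet;
import java.util.Set;

/** Precision and recall over sets of document ids. */
class PrecisionRecallSketch {

    /** Precision: share of the retrieved documents that are relevant. */
    static double precision(Set<Integer> retrieved, Set<Integer> relevant) {
        return retrieved.isEmpty() ? 0.0 : (double) overlap(retrieved, relevant) / retrieved.size();
    }

    /** Recall: share of the relevant documents that were retrieved. */
    static double recall(Set<Integer> retrieved, Set<Integer> relevant) {
        return relevant.isEmpty() ? 0.0 : (double) overlap(retrieved, relevant) / relevant.size();
    }

    /** Number of documents that are both retrieved and relevant. */
    private static int overlap(Set<Integer> retrieved, Set<Integer> relevant) {
        Set<Integer> both = new HashSet<>(retrieved);
        both.retainAll(relevant);
        return both.size();
    }

    public static void main(String[] args) {
        Set<Integer> retrieved = Set.of(1, 2, 3, 4);       // what the engine returned
        Set<Integer> relevant = Set.of(3, 4, 5, 6, 7, 8);  // what the user actually needed
        System.out.println("precision = " + precision(retrieved, relevant)); // 2 of 4 retrieved
        System.out.println("recall    = " + recall(retrieved, relevant));    // 2 of 6 relevant
    }
}
```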
Relevancy


  Recall: find everything related to the query
    - lemmatization
    - synonyms
    - wildcards
    - anti-phrasing
    - or-operator

  Precision: find only entities related to the query
    - exact word matching
    - exact phrase matching
    - and-operator

  Goal: Improve precision, without sacrificing recall
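
The difference between the or-operator and the and-operator is easy to see on posting lists: OR takes the union of the document ids, AND takes the intersection. The sketch below uses the postings for "fox" and "dog" from the inverted index slide; the class and method names are illustrative.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Boolean operators over posting lists (sorted sets of document ids). */
class BooleanQuerySketch {

    /** or-operator: a document matches if it contains either term (favours recall). */
    static SortedSet<Integer> or(SortedSet<Integer> a, SortedSet<Integer> b) {
        SortedSet<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    /** and-operator: a document matches only if it contains both terms (favours precision). */
    static SortedSet<Integer> and(SortedSet<Integer> a, SortedSet<Integer> b) {
        SortedSet<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        SortedSet<Integer> fox = new TreeSet<>(List.of(1));     // fox -> 1
        SortedSet<Integer> dog = new TreeSet<>(List.of(1, 2));  // dog -> 1,2
        System.out.println("fox OR dog  -> " + or(fox, dog));   // [1, 2]
        System.out.println("fox AND dog -> " + and(fox, dog));  // [1]
    }
}
```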
Search core – relevance score

 • TF/IDF
 • Field length
 • Field weight
         • Title *2
         • Author *4
         • Content *1
 • Freshness
 • …
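
How these factors might combine into one number can be sketched as follows. The formula is an assumption made for illustration (term frequency times inverse document frequency, divided by the square root of the field length, multiplied by the field weight); real engines use their own variants. Only the field weights come from the slide, and freshness is left out to keep the sketch short.

```java
import java.util.List;
import java.util.Map;

/** Illustrative relevance score for a single query term. */
class FieldWeightedScoreSketch {
    // Field weights from the slide: Title *2, Author *4, Content *1.
    static final Map<String, Double> FIELD_WEIGHTS =
            Map.of("title", 2.0, "author", 4.0, "content", 1.0);

    /** Inverse document frequency: rarer terms count more. One common variant, assumed here. */
    static double idf(int totalDocs, int docsWithTerm) {
        return Math.log((double) totalDocs / Math.max(1, docsWithTerm)) + 1.0;
    }

    /** Scores one document (field name -> tokens) for one query term. */
    static double score(String term, Map<String, List<String>> doc, double idf) {
        double score = 0.0;
        for (Map.Entry<String, List<String>> field : doc.entrySet()) {
            List<String> tokens = field.getValue();
            if (tokens.isEmpty()) {
                continue;
            }
            long tf = tokens.stream().filter(term::equals).count();  // term frequency
            double lengthNorm = 1.0 / Math.sqrt(tokens.size());      // shorter fields score higher
            double weight = FIELD_WEIGHTS.getOrDefault(field.getKey(), 1.0);
            score += weight * tf * idf * lengthNorm;
        }
        return score;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc1 = Map.of(
                "title", List.of("brown", "fox"),
                "content", List.of("quick", "brown", "fox", "jump", "lazy", "dog"));
        Map<String, List<String>> doc2 = Map.of(
                "title", List.of("dog"),
                "content", List.of("old", "dog", "jump"));
        double foxIdf = idf(2, 1); // "fox" occurs in 1 of the 2 documents
        System.out.println("doc1: " + score("fox", doc1, foxIdf));
        System.out.println("doc2: " + score("fox", doc2, foxIdf));
    }
}
```

With these weights a hit in the title counts twice as much as the same hit in a content field of equal length, which is the kind of tuning the field-weight bullet refers to.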
Search Core

  {query} -> Find matching documents -> Score documents -> {result}

  •   Optimized for full-text search
  •   Sub-second responses
  •   Tunable relevance
  •   Scalable
  •   Configurable & Extendable
Standard architecture
Connectors
Connectors – fetch data


 Id   Product name   Description                          Price
 1    Wheel          Makes the bus go round round round   45
 2    Window         A shield of glass                    12

 -> Database connector


 Id   Book name              Abstract        Author
 1    Ulysses                Irish novel     James Joyce
 2    Crime and Punishment   Russian novel   Dostoevsky, Fyodor

 -> Database connector
Connector framework – code example

 public void execute() {
       //Insert code to fetch content
 }

 public void interrupt() {
       //Insert code to handle interrupt signal
 }

 public void init() {
       //Insert code to initialize connector
 }
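
To show how the three callbacks might fit together, here is a small self-contained sketch. The `Connector` interface, the `RowConnectorSketch` class and the in-memory rows are assumptions made for illustration; they are not the API of any of the connector frameworks named in this deck, and a real connector would read from an actual data source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical connector contract mirroring the three methods on the slide. */
interface Connector {
    void init();
    void execute();
    void interrupt();
}

/** Sketch: turns table rows into documents (field name -> value) and hands them on. */
class RowConnectorSketch implements Connector {
    private final List<Map<String, String>> rows;      // stand-in for a database table
    private final Consumer<Map<String, String>> sink;  // stand-in for the pipeline or indexer
    private volatile boolean interrupted;

    RowConnectorSketch(List<Map<String, String>> rows, Consumer<Map<String, String>> sink) {
        this.rows = rows;
        this.sink = sink;
    }

    @Override
    public void init() {
        interrupted = false; // a real connector would open its data source here
    }

    @Override
    public void execute() {
        for (Map<String, String> row : rows) {
            if (interrupted) {
                return; // stop cleanly between rows
            }
            sink.accept(new LinkedHashMap<>(row)); // one row becomes one document
        }
    }

    @Override
    public void interrupt() {
        interrupted = true;
    }

    public static void main(String[] args) {
        List<Map<String, String>> products = List.of(
                Map.of("id", "1", "name", "Wheel", "price", "45"),
                Map.of("id", "2", "name", "Window", "price", "12"));
        List<Map<String, String>> fetched = new ArrayList<>();
        Connector connector = new RowConnectorSketch(products, fetched::add);
        connector.init();
        connector.execute();
        System.out.println(fetched.size() + " documents fetched");
    }
}
```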
Connector Frameworks


  •   Existing connectors
  •   Re-usable
  •   Configuration interfaces
  •   Standardized implementation

  http://incubator.apache.org/connectors/
  http://code.google.com/p/google-enterprise-connector-manager/
Standard architecture
Pipeline
Pipeline - overview




         •   PDF/Office -> Text
         •   Lemmatization
         •   Language identification
         •   NER
         •   Phonetic search
         •   Keyword extraction
         •   External calls
         •   …
Pipeline framework – code example

protected void addAction(Document doc) throws PipelineException {
        //Insert code
        doc.addField("Title", "Hello world!");
}

protected void updateAction(Document doc) throws PipelineException {
        //Insert code
        addAction(doc);
}

protected void deleteAction(Document doc) throws PipelineException {
        //Insert code
}
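
The idea of chaining small, re-usable stages can be sketched without any framework. Everything below is hypothetical: `PipelineSketch`, its nested `Document` and `Stage` types, and the two toy stages are illustrations only, not the API of the pipeline frameworks mentioned in this deck. The language identification stage in particular just checks for one marker word per language.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a document processing pipeline built from independent stages. */
class PipelineSketch {

    /** Hypothetical document: a bag of named fields. */
    static class Document {
        private final Map<String, String> fields = new LinkedHashMap<>();
        void addField(String name, String value) { fields.put(name, value); }
        String getField(String name) { return fields.get(name); }
    }

    /** Hypothetical stage contract: each stage reads and enriches the document. */
    interface Stage {
        void process(Document doc);
    }

    private final List<Stage> stages = new ArrayList<>();

    PipelineSketch add(Stage stage) {
        stages.add(stage);
        return this;
    }

    /** Runs every stage in the order it was added. */
    void run(Document doc) {
        for (Stage stage : stages) {
            stage.process(doc);
        }
    }

    /** Stage 1: collapse runs of whitespace in the content field. */
    static final Stage NORMALIZE = doc ->
            doc.addField("content", doc.getField("content").trim().replaceAll("\\s+", " "));

    /** Stage 2: toy language identification from a single marker word per language. */
    static final Stage IDENTIFY_LANGUAGE = doc -> {
        List<String> tokens = Arrays.asList(doc.getField("content").toLowerCase().split(" "));
        String language = tokens.contains("the") ? "en" : tokens.contains("och") ? "sv" : "unknown";
        doc.addField("language", language);
    };

    public static void main(String[] args) {
        Document doc = new Document();
        doc.addField("content", "  The quick   brown fox ");
        new PipelineSketch().add(NORMALIZE).add(IDENTIFY_LANGUAGE).run(doc);
        System.out.println(doc.getField("content") + " [" + doc.getField("language") + "]");
    }
}
```

Because each stage only sees the document, stages can be reordered, re-used across projects, or swapped for a real component without touching the others.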
NLP tools and approaches

    • Open source:
             GATE, OpenNLP, UIMA, StanfordNLP, Mallet,
             Apache Mahout
    •   Proprietary:
             IBM LanguageWare
    •   Own components:
             e.g. KeywordExtraction Service; LanguageIdentify
    •   POS taggers – Hunpos, OpenNLP, Mallet
    •   Dependency Parsers – MaltParser, StanfordParser
    •   NER – rule-based + statistical models
    •   Document summarization
    •   Document clustering
Pipeline – configuration example
Pipeline frameworks


  • Re-usable stages
  • Configuration interface
  • Focus on task

  http://www.openpipeline.com/
  Findwise Hydra
  http://www.pypes.org/
Putting it all together
What the frell is UX design?
What the frell is UX design?

     •   Interaction design
     •   Usability Engineering
     •   Information Architecture
     •   Visual Design
Findwise UX design principles

     Users want results

     Dialogue not monologue
     Participation builds trust
     Answer frequent questions
     Simple but powerful
Users want results
Dialogue not monologue
Participation builds trust
Answer frequent questions
Simple but powerful
Findwise UX design principles

     Users want results

     Dialogue not monologue
     Participation builds trust
     Answer frequent questions
     Simple but powerful
DIY UX design
DIY UX design


           Design research
                Analytics
            Usability tests
                Iterate!
Design research

     •   Be easy to reach – keep contact
     •   Let users' requests guide you when prioritizing new features
     •   Listen & try to discover the underlying problem
     •   Try to find out what the user needs, not what they say they want
Analytics

     •   Web analytics
     •   Search analytics
     •   A/B testing
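
One simple form of search analytics is counting which queries users type most often. The sketch below does that for an in-memory query log; the class name, the log format (one query string per entry) and the lower-casing of queries are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Counts the most frequent queries in a query log. */
class TopQueriesSketch {

    /** Returns the n most frequent queries, most frequent first; ties are ordered alphabetically. */
    static List<Map.Entry<String, Long>> topQueries(List<String> queryLog, int n) {
        Map<String, Long> counts = queryLog.stream()
                .map(q -> q.trim().toLowerCase())   // treat "Vacation" and "vacation " as the same query
                .filter(q -> !q.isEmpty())
                .collect(Collectors.groupingBy(q -> q, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> log = List.of("Vacation", "vacation ", "salary", "lunch menu", "salary", "vacation");
        System.out.println(topQueries(log, 2));
    }
}
```

The same counts, filtered to queries that returned no results, would give a list of content gaps to look at first.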
Usability tests

     •   Test early - test often
     •   Use sketches, paper prototypes, static prototypes and working prototypes!
     •   Create real tasks or problems
     •   Don’t ask them how they would want it
     •   Test on friends and family or colleagues
Iterate!
Why UX design?

 • Improved requirements
 • Better feedback
 • Eliminate bias
 • Less development time
Summary

   •   Listen & try to discover the underlying problem
   •   Search analytics – Top queries
   •   Do usability tests early & often
   •   Iterate!
Research

    •   Collaboration with Universities
            GU, Borås, KTH, Copenhagen U.
    •   EU projects
            RUSHES
    •   Master’s Thesis supervision
            Chalmers, KTH, Lund
Master’s Thesis projects

     •   A way to test ideas
     •   A way to recruit people
     •   A way to cooperate with Universities


     •   Keyword Extraction
     •   Document Clustering
     •   NER
     •   Document summarization
     •   Extracting structural information from text
     •   Query log analysis
Resources - books

    •   The Design of Everyday Things
    •   Don’t Make Me Think
    •   Search Analytics for Your Site
    •   ManifoldCF in Action
    •   Taming Text
Shameless plug


    twitter.com/findwise
    slideshare.net/findwise
    findabilityblog.se
    findwise.com
Thanks!

          Tobias Berg
          tobias.berg@findwise.com

          Björn Klockljung Johansson
          bjorn.klockljung.johansson@findwise.com

          Svetoslav Marinov
          svetoslav.marinov@findwise.com
