SlideShare a Scribd company logo
Search + Big Data:
It’s (still) All About the User
Grant Ingersoll, Chief Scientist – Lucid Imagination
           grant@lucidimagination.com
                 October 19, 2011
Promise and Reality

  “Data is increasingly digital air: the oxygen we
 breathe and the carbon dioxide that we exhale. It
can be a source of both sustenance and pollution.”
            Six Provocations for Big Data
          by Danah Boyd and Kate Crawford



  “The truth is, I spend most of my time trying to
reduce the size of my data so it can be analyzed.”
      Hilary Mason, Chief Scientist, Bitly @ Strata
Pragmatism
Evolution
                           Documents
                           • Models
                           • Feature Selection




                                                 User
                                                 Interaction
       Content
                                                 • Clicks
       Relationships                             • Ratings/
       • Page Rank, etc.                          Reviews
       • Organization                            • Learning to
                                                  Rank
                                                 • Social Graph




                               Queries
                               • Phrases
                               • NLP
Minding the Intersection


                   Search




       Analytics            Discovery
Benefits
§  End users
   •  Better relevance/conversion
   •  Serendipity
   •  Better/faster insight


§  Business:
   •    ROI
   •    Awareness across organization
   •    Enablement
   •    Agility
Needs
§  Fast, efficient, scalable search
§  Large scale, cost effective storage
§  Processing Power:
   •  Large scale distributed for whole data consumption
   •  Streaming/In Memory for real time needs
   •  Ability to learn


§  Willingness to ask questions
The Good News
Search
§  Good scalable, search a given
   •  Talks: Chitouras, Sturlese, Binns, Miller


§  Custom Relevancy via function queries, boosts
§  Explore other relevance models
   •  Talks: Muir, Pugh
   •  Lucene/Solr trunk has pluggable scoring (BM25, etc.)


§  NRT for timeliness
   •  Talks: Busch
Discovery
Facets
  •  Talks: Yonik
  •  Classification, Taxonomy
Clustering
  •  Talk: Frank S.
Suggestions
  •  Auto-suggest, Spelling,
     More Like This,
     Related Searches, search trails
Visualization
Analytics
Analytics for End Users
Offline                         Online
   •    Popularity/Click          •  Trends/Stats
   •    Link Analysis
   •    Search Trails             •  Social/Personal
   •    Recommendations
   •    Spellchecking weights     •  Location
   •    Collocations


                                         STORM
Analytics for Internal Users
Offline                         Online
   •    Top X                     •  Trends
   •    Zero results
   •    MRR, MAP                  •  Operational alerts
   •    User segmentation            (QPS,
   •    Location, conversions        DPS, etc)
   •    Ad hoc Analysis


                                   GIRAPH
What’s Missing?
§  The glue is up to you (us?)
   •    Lucene Index -> Pig/Others
   •    Mahout -> Pig/Others
   •    Mahout -> Lucene/Solr
   •    Logs -> Pig/Others


§  Nice to have:
   •  More in-index functionality (that performs)
         §  Aggregations
         §  Arbitrary stats
         §  Complex Joins
What’s Next?

“I can have all the data I want to have – but I still
  have to communicate it to our players. It has to
  get into their minds. And they have to utilize it. ”
        Brad Stevens, Head Basketball Coach,
     Butler University in Oct. ‘11 McKinsey Quarterly
Thanks!


§  http://www.lucidimagination.com

§  @gsingers

§  grant@lucidimagination.com

§  stump@lucene-eurocon.com
Lucene Ecosystem




               Spark   Storm
              Giraph
Lucene Ecosystem




               Spark   Storm
              Giraph

More Related Content

Similar to Search + Big Data: It's (still) All About the User- Grant Ingersoll

IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting FindingsIxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
Jieyun Yang
 
Introduction to Information Architecture & Design - SVA Workshop 10/04/14
Introduction to Information Architecture & Design - SVA Workshop 10/04/14Introduction to Information Architecture & Design - SVA Workshop 10/04/14
Introduction to Information Architecture & Design - SVA Workshop 10/04/14
Robert Stribley
 
Introduction to Information Architecture & Design - 12/06/14
Introduction to Information Architecture & Design - 12/06/14Introduction to Information Architecture & Design - 12/06/14
Introduction to Information Architecture & Design - 12/06/14
Robert Stribley
 
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive
 
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
Yandex
 
Building Effective Frameworks for Social Media Analysis
Building Effective Frameworks for Social Media AnalysisBuilding Effective Frameworks for Social Media Analysis
Building Effective Frameworks for Social Media AnalysisOpen Analytics
 
Introduction to Information Architecture & Design - 3/19/16
Introduction to Information Architecture & Design - 3/19/16Introduction to Information Architecture & Design - 3/19/16
Introduction to Information Architecture & Design - 3/19/16
Robert Stribley
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
Roi Blanco
 
Introduction to Information Architecture & Design - 6/25/16
Introduction to Information Architecture & Design - 6/25/16Introduction to Information Architecture & Design - 6/25/16
Introduction to Information Architecture & Design - 6/25/16
Robert Stribley
 
Warming Up to Analytics
Warming Up to AnalyticsWarming Up to Analytics
Warming Up to Analytics
Lewandog, Inc,
 
Introduction to Information Architecture & Design - 2/13/16
Introduction to Information Architecture & Design - 2/13/16Introduction to Information Architecture & Design - 2/13/16
Introduction to Information Architecture & Design - 2/13/16
Robert Stribley
 
Introduction to Information Architecture & Design - 6/24/17
Introduction to Information Architecture & Design - 6/24/17Introduction to Information Architecture & Design - 6/24/17
Introduction to Information Architecture & Design - 6/24/17
Robert Stribley
 
Share point 2013 the way to go...
Share point 2013 the way to go...Share point 2013 the way to go...
Share point 2013 the way to go...
K.Mohamed Faizal
 
Alla ricerca della User Story perduta
Alla ricerca della User Story perdutaAlla ricerca della User Story perduta
Alla ricerca della User Story perdutaEdoardo Schepis
 
Alla ricerca della user story perduta
Alla ricerca della user story perdutaAlla ricerca della user story perduta
Alla ricerca della user story perduta
Better Software
 
ASA conference Feb 2013
ASA conference Feb 2013ASA conference Feb 2013
ASA conference Feb 2013
mrkwr
 
IA breakfast briefing apr12 upload
IA breakfast briefing apr12 uploadIA breakfast briefing apr12 upload
IA breakfast briefing apr12 uploadRoss Philip
 
UXD v. Analytics - eMetrics 2013 San Francisco
UXD v. Analytics - eMetrics 2013 San FranciscoUXD v. Analytics - eMetrics 2013 San Francisco
UXD v. Analytics - eMetrics 2013 San Francisco
Chris Farnum
 
Evaluating search engines
Evaluating search enginesEvaluating search engines
Evaluating search engines
Phil Bradley
 

Similar to Search + Big Data: It's (still) All About the User- Grant Ingersoll (20)

Duncan product tank
Duncan product tankDuncan product tank
Duncan product tank
 
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting FindingsIxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
IxDA UX Research Mentoring Circle - 4. Analysing Data and Presenting Findings
 
Introduction to Information Architecture & Design - SVA Workshop 10/04/14
Introduction to Information Architecture & Design - SVA Workshop 10/04/14Introduction to Information Architecture & Design - SVA Workshop 10/04/14
Introduction to Information Architecture & Design - SVA Workshop 10/04/14
 
Introduction to Information Architecture & Design - 12/06/14
Introduction to Information Architecture & Design - 12/06/14Introduction to Information Architecture & Design - 12/06/14
Introduction to Information Architecture & Design - 12/06/14
 
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
 
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
Чираг Шах «Коллективный поиск, взаимодействие пользователей: подходы к изучен...
 
Building Effective Frameworks for Social Media Analysis
Building Effective Frameworks for Social Media AnalysisBuilding Effective Frameworks for Social Media Analysis
Building Effective Frameworks for Social Media Analysis
 
Introduction to Information Architecture & Design - 3/19/16
Introduction to Information Architecture & Design - 3/19/16Introduction to Information Architecture & Design - 3/19/16
Introduction to Information Architecture & Design - 3/19/16
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
 
Introduction to Information Architecture & Design - 6/25/16
Introduction to Information Architecture & Design - 6/25/16Introduction to Information Architecture & Design - 6/25/16
Introduction to Information Architecture & Design - 6/25/16
 
Warming Up to Analytics
Warming Up to AnalyticsWarming Up to Analytics
Warming Up to Analytics
 
Introduction to Information Architecture & Design - 2/13/16
Introduction to Information Architecture & Design - 2/13/16Introduction to Information Architecture & Design - 2/13/16
Introduction to Information Architecture & Design - 2/13/16
 
Introduction to Information Architecture & Design - 6/24/17
Introduction to Information Architecture & Design - 6/24/17Introduction to Information Architecture & Design - 6/24/17
Introduction to Information Architecture & Design - 6/24/17
 
Share point 2013 the way to go...
Share point 2013 the way to go...Share point 2013 the way to go...
Share point 2013 the way to go...
 
Alla ricerca della User Story perduta
Alla ricerca della User Story perdutaAlla ricerca della User Story perduta
Alla ricerca della User Story perduta
 
Alla ricerca della user story perduta
Alla ricerca della user story perdutaAlla ricerca della user story perduta
Alla ricerca della user story perduta
 
ASA conference Feb 2013
ASA conference Feb 2013ASA conference Feb 2013
ASA conference Feb 2013
 
IA breakfast briefing apr12 upload
IA breakfast briefing apr12 uploadIA breakfast briefing apr12 upload
IA breakfast briefing apr12 upload
 
UXD v. Analytics - eMetrics 2013 San Francisco
UXD v. Analytics - eMetrics 2013 San FranciscoUXD v. Analytics - eMetrics 2013 San Francisco
UXD v. Analytics - eMetrics 2013 San Francisco
 
Evaluating search engines
Evaluating search enginesEvaluating search engines
Evaluating search engines
 

More from lucenerevolution

Text Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and LuceneText Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and Lucene
lucenerevolution
 
State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here!
lucenerevolution
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solr
lucenerevolution
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
lucenerevolution
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloud
lucenerevolution
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
lucenerevolution
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
lucenerevolution
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs
lucenerevolution
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchlucenerevolution
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
lucenerevolution
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?
lucenerevolution
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST API
lucenerevolution
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
lucenerevolution
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
lucenerevolution
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
lucenerevolution
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenal
lucenerevolution
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside down
lucenerevolution
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
lucenerevolution
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - finallucenerevolution
 

More from lucenerevolution (20)

Text Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and LuceneText Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and Lucene
 
State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here!
 
Search at Twitter
Search at TwitterSearch at Twitter
Search at Twitter
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solr
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloud
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST API
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenal
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside down
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - final
 

Recently uploaded

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 

Search + Big Data: It's (still) All About the User- Grant Ingersoll

  • 1. Search + Big Data: It’s (still) All About the User Grant Ingersoll, Chief Scientist – Lucid Imagination grant@lucidimagination.com October 19, 2011
  • 2. Promise and Reality “Data is increasingly digital air: the oxygen we breathe and the carbon dioxide that we exhale. It can be a source of both sustenance and pollution.” Six Provocations for Big Data by Danah Boyd and Kate Crawford “The truth is, I spend most of my time trying to reduce the size of my data so it can be analyzed.” Hilary Mason, Chief Scientist, Bitly @ Strata
  • 4. Evolution Documents • Models • Feature Selection User Interaction Content • Clicks Relationships • Ratings/ • Page Rank, etc. Reviews • Organization • Learning to Rank • Social Graph Queries • Phrases • NLP
  • 5. Minding the Intersection Search Analytics Discovery
  • 6. Benefits §  End users •  Better relevance/conversion •  Serendipity •  Better/faster insight §  Business: •  ROI •  Awareness across organization •  Enablement •  Agility
  • 7. Needs §  Fast, efficient, scalable search §  Large scale, cost effective storage §  Processing Power: •  Large scale distributed for whole data consumption •  Streaming/In Memory for real time needs •  Ability to learn §  Willingness to ask questions
  • 9. Search §  Good scalable, search a given •  Talks: Chitouras, Sturlese, Binns, Miller §  Custom Relevancy via function queries, boosts §  Explore other relevance models •  Talks: Muir, Pugh •  Lucene/Solr trunk has pluggable scoring (BM25, etc.) §  NRT for timeliness •  Talks: Busch
  • 10. Discovery Facets •  Talks: Yonik •  Classification, Taxonomy Clustering •  Talk: Frank S. Suggestions •  Auto-suggest, Spelling, More Like This, Related Searches, search trails Visualization
  • 12. Analytics for End Users Offline Online •  Popularity/Click •  Trends/Stats •  Link Analysis •  Search Trails •  Social/Personal •  Recommendations •  Spellchecking weights •  Location •  Collocations STORM
  • 13. Analytics for Internal Users Offline Online •  Top X •  Trends •  Zero results •  MRR, MAP •  Operational alerts •  User segmentation (QPS, •  Location, conversions DPS, etc) •  Ad hoc Analysis GIRAPH
  • 14. What’s Missing? §  The glue is up to you (us?) •  Lucene Index -> Pig/Others •  Mahout -> Pig/Others •  Mahout -> Lucene/Solr •  Logs -> Pig/Others §  Nice to have: •  More in-index functionality (that performs) §  Aggregations §  Arbitrary stats §  Complex Joins
  • 15. What’s Next? “I can have all the data I want to have – but I still have to communicate it to our players. It has to get into their minds. And they have to utilize it. ” Brad Stevens, Head Basketball Coach, Butler University in Oct. ‘11 McKinsey Quarterly
  • 16. Thanks! §  http://www.lucidimagination.com §  @gsingers §  grant@lucidimagination.com §  stump@lucene-eurocon.com
  • 17. Lucene Ecosystem Spark Storm Giraph
  • 18. Lucene Ecosystem Spark Storm Giraph