design for interaction
   Daniel Tunkelang
   Chief Scientist, Endeca

      © 2009 Endeca Technologies, Inc. All rights reserved.
about me




    Organizing SIGIR ’09 Industry Track in Boston on July 22nd!


about endeca


     leading provider of search applications

     • 250M+ end users per month
     • 600+ customers
     • $100M+ annual sales




what i hope you learn from this talk




     the db and ir perspectives have a common thread



              convergence may be upon us



         but we need interaction to make it work



overview




          don't put all your eggs in one basket



                 design for interaction



         human-computer information retrieval



don’t put all your eggs in one basket




              Still Life with Basket and Broken Eggs by Michael Edwards, 2008




the db approach: perfection in, perfection out




              http://www.storeitfoodsblog.com/category/food-preparation/meat-grinder/




db usability researchers recognize the pain




sql is hard


    Making Database Systems Usable
    [Jagadish et al., SIGMOD 2007]


    • labor-intensive query construction

    • lengthy query evaluation

    • high query reformulation cost




data sucks and users are lazy


     Extracting Problems for Database
     and IR Researchers
     [Naughton, Spring 2008 North East DB/IR Day]

     • real data is
        – incomplete
        – inconsistent
        – incorrect


     • users don’t want to learn
        – data schemas
        – structured query languages

     we’re not gonna take it!



the ir way: don’t worry, be happy




                http://adsoftheworld.com/media/print/mcdonalds_burger_mysteries



ir for db people: what would google do?


     SYSTEM: rank using IR model (tf-idf, PageRank)

     USER: information need → query → select from results
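
A minimal sketch of the one-shot ranked-retrieval loop, assuming a toy corpus and a plain smoothed tf-idf score. The slide names tf-idf and PageRank but gives no formula, so the documents, the `rank` function, and the idf smoothing below are illustrative assumptions rather than anything from the talk.

```python
import math
from collections import Counter

# Hypothetical toy corpus; not from the talk.
DOCS = {
    "d1": "database systems the complete book",
    "d2": "introduction to information retrieval",
    "d3": "readings in database systems",
}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def rank(query, docs):
    """Return document ids ordered by a simple tf-idf score, best first."""
    n = len(docs)
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))

    scores = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term in tf:
                # Smoothed idf (an assumption; many variants exist).
                idf = math.log((1 + n) / (1 + df[term])) + 1.0
                score += tf[term] * idf
        scores[doc_id] = score

    # Sort by descending score, then by id so ties are deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


ranking = rank("database book", DOCS)
```

The caller gets back only an ordering: nothing in the output says why one document beat another, or how to steer the result.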


assumptions of relevance-centric ir approach



                                              • self-awareness

                                              • self-expression

                                              • model knows best

                                              • answer is a document

                                              • one-shot query


life is not a batch


     • db approach expects too much of user
     • ir approach expects too much of system



              both approaches act as if it all
              comes down to a single query




                  is that your final answer (or rather, question)?


design for interaction




                   The Future of Social Interaction by Jim Stoten




changes assumptions about what to optimize




          complexity                    precision, recall, relevance




                            communication


how do we optimize communication?




           transparency

                                                                                  guidance




                  control

ir offers a black box




           this is the crate. the sheep you want is inside.




db / set retrieval offers 2 out of 3




            transparency

                                                                                   guidance




                   control

but we need it all!


     • set retrieval is a failure in the ir world
        – though quite successful in the db world!


     • but ranked retrieval is inherently crippled
        – no transparency, control, or guidance!




        how do we optimize for communication?




human-computer information retrieval



                          “Toward Human-Computer Information Retrieval”

                          Gary Marchionini


     • don’t just guess the user’s intent
     • increase user responsibility and control
     • require and reward human intellectual effort




great idea




                                  how?




treat query construction as a process


     A Case for Interaction
     [Koenemann and Belkin, 1996]

     • used term feedback to improve alerting queries

     • users select from suggested terms

     • 17 – 34% improvement in precision @ 30

     • users liked the feedback interface
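
A rough sketch of query construction as a process: the system suggests terms, the user picks some, and the query grows. This is not the system used in the cited study; the stopword list, the frequency-based suggestion rule, and the function names are all assumptions made for illustration. `precision_at_k` is the standard definition behind a figure such as precision @ 30.

```python
from collections import Counter

# Hypothetical stopword list, kept tiny on purpose.
STOPWORDS = {"the", "a", "of", "in", "to", "and", "for"}


def suggest_terms(query, top_docs, k=5):
    """Suggest up to k terms frequent in top_docs but absent from the query."""
    query_terms = set(query.lower().split())
    counts = Counter()
    for text in top_docs:
        for term in text.lower().split():
            if term not in query_terms and term not in STOPWORDS:
                counts[term] += 1
    # Most frequent first; alphabetical order breaks ties deterministically.
    return sorted(counts, key=lambda t: (-counts[t], t))[:k]


def expand_query(query, chosen_terms):
    """Append only the terms the user explicitly selected."""
    return " ".join([query, *chosen_terms])


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top k results that are relevant."""
    return sum(1 for d in ranked_ids[:k] if d in relevant_ids) / k


# Hypothetical top-ranked documents for the query "database".
top_docs = [
    "database systems concepts",
    "database design and tuning",
    "systems for database tuning",
]
suggestions = suggest_terms("database", top_docs, k=3)
expanded = expand_query("database", ["tuning"])  # the user picked "tuning"
```

The point of the design is that the system proposes and the user disposes: suggestions never enter the query unless the user selects them.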


expose the facets of semistructured content




success in the lab and the field


     • favored in user studies by Marti Hearst
        – http://flamenco.berkeley.edu/


     • ubiquitous in ecommerce
        – amazon.com
        – eBay
        – endeca powers 42 of top 100 online retailers


     • taking over media, libraries, enterprise, etc.




even a few db folks have drunk the kool-aid


     DataGuides
     [Goldman and Widom, VLDB 1997]
     • user-friendly schema summaries


     Magnet
     [Sinha and Karger, SIGMOD 2005]
     • navigation and refinement options

             common theme: semistructured


what is semistructured data?




                                             • one universe

                                             • self-describing

                                             • blends data / meta-data




data modeling flexibility


     • no a-priori schema
        – integrate sources without up-front schema design


     • richer modeling capabilities tame data complexity
        – hierarchy, multi-valued fields, sparse fields


     • schema flexibility eases schema evolution
        – new entity types, new data sources




     sources: Databases, WWW / Internet, File Systems, SOA / ESB / Web Service,
     Groupware and Collaboration, ERP, Content Management
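
One way to picture schema-free integration is to load records from different sources as they come and derive a field profile afterwards. The sketch below assumes records are plain dictionaries with lists for multi-valued fields; the two sources, their contents, and `profile_fields` are hypothetical.

```python
def profile_fields(records):
    """Summarise which fields occur, how often, and whether any are multi-valued."""
    profile = {}
    for record in records:
        for field, value in record.items():
            entry = profile.setdefault(field, {"count": 0, "multi_valued": False})
            entry["count"] += 1
            if isinstance(value, list) and len(value) > 1:
                entry["multi_valued"] = True
    return profile


# Hypothetical records from two sources with no shared schema.
catalog = [{"sku": "A1", "color": ["Blue", "White"], "price": 10.0}]
wiki = [
    {"title": "Care guide", "keyword": ["wash", "iron"]},
    {"title": "Returns", "owner": "support"},
]

all_records = catalog + wiki
profile = profile_fields(all_records)

# A field is sparse here if at least one record lacks it.
sparse = sorted(f for f, e in profile.items() if e["count"] != len(all_records))
```

Adding a third source with new fields needs no migration in this model: the new fields simply show up in the profile.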




semantically direct queries


     which on-sale items are available in blue?

     which attributes characterize on-sale blue items?
        price, sleeve, color, salePrice, brand, fabric, …

     <shirt>
           <sku>1234</sku>
           <sleeve>Long</sleeve>
           <desc>Classic end-on-end shirt</desc>
           <price>39.99</price>
           <salePrice>29.99</salePrice>
           <color>Blue</color>
           <color>Yellow</color>
           <color>White</color>
           ...
     </shirt>

     <buyingGuide>
           <title>Selecting the right ski coat for you.</title>
           <file>skiguide.pdf</file>
           <keyword>ski</keyword>
           <keyword>coat</keyword>
           ...
     </buyingGuide>

     <trousers>
           <sku>1579</sku>
           <price>59.99</price>
           <color>Khaki</color>
           ...
     </trousers>

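
The two questions on this slide can be sketched against the three records shown, assuming each record is loaded as a dictionary with repeated elements collected into a list. The `type` key (holding the element name), the helper names, and the omission of the elided `...` fields are assumptions of this sketch, not part of the slide.

```python
from collections import Counter

# The slide's three records; fields elided with "..." are left out.
RECORDS = [
    {"type": "shirt", "sku": "1234", "sleeve": "Long",
     "desc": "Classic end-on-end shirt", "price": 39.99,
     "salePrice": 29.99, "color": ["Blue", "Yellow", "White"]},
    {"type": "buyingGuide", "title": "Selecting the right ski coat for you.",
     "file": "skiguide.pdf", "keyword": ["ski", "coat"]},
    {"type": "trousers", "sku": "1579", "price": 59.99, "color": ["Khaki"]},
]


def values(record, attr):
    """Return the attribute's values as a list (empty if the field is absent)."""
    value = record.get(attr)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def on_sale_in(records, color):
    """Which on-sale items are available in the given color?"""
    return [r for r in records
            if "salePrice" in r and color in values(r, "color")]


def characterizing_attributes(records):
    """Which attributes do the given records carry, and how many records each?"""
    counts = Counter()
    for r in records:
        counts.update(k for k in r if k != "type")
    return counts


blue_on_sale = on_sale_in(RECORDS, "Blue")
attrs = characterizing_attributes(blue_on_sale)
```

Both questions run over the same records: the first returns items, the second returns the attributes that could be offered as refinements for that result set. Sparse fields need no special handling, since a record without `salePrice` simply does not match.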

but let’s make this concrete


                         Uh oh, I’m presenting at
                        SIGMOD! Better find a good
                          book about databases!




quick, to the goog-mobile!




                                                                         not quite…




i know, i’ll go to the library!




                                                                               #%@$!




let’s try a little hcir…




hcir works for news too




life in a semistructured world


     • search is a great starting point
        – users can’t / won’t initiate structured queries


     • ranked lists are an inadequate ending point
        – search queries are lossy projections of intent


     • hcir leads users down a garden path to structure




lots of trade-offs


     “everything should be made as simple
      as possible, but no simpler”

     “speed of thought” vs. “going nowhere quickly”

     “to err is human, but to really foul
      things up requires a computer”

                   simple interfaces don’t
                  always yield satisfaction


users want the triumvirate


     • transparency
     • control
     • guidance



           transparency and control are easy

              guidance requires cleverness




in closing




      all of us want to help people access information



        the best help is to help them help themselves



                design for interaction through
              transparency, control, guidance


thank you…and come to SIGIR!


                communication 1.0
               email: dt@endeca.com

                 communication 2.0
          blog: http://thenoisychannel.com
        twitter: http://twitter.com/dtunkelang

            SIGIR: July 19-23 in Boston
            Industry Track on July 22nd!



More Related Content

What's hot

Intel Corporation - BA401
Intel Corporation - BA401Intel Corporation - BA401
Intel Corporation - BA401guest3ea4529f
 
Emc - Journey to the Cloud - Business Agility Seminar
Emc - Journey to the Cloud - Business Agility SeminarEmc - Journey to the Cloud - Business Agility Seminar
Emc - Journey to the Cloud - Business Agility SeminarExponential_e
 
Novell Tour Europe and South Africa 2012
Novell Tour Europe and South Africa 2012Novell Tour Europe and South Africa 2012
Novell Tour Europe and South Africa 2012Werner Luetkemeier
 
Bbx Biz Plan Presentation
Bbx Biz Plan PresentationBbx Biz Plan Presentation
Bbx Biz Plan PresentationPaul Brisson
 
MDD: Models, frameworks, & code generation
MDD: Models, frameworks, & code generationMDD: Models, frameworks, & code generation
MDD: Models, frameworks, & code generationPedro J. Molina
 
Cloud Communications: Top 5 Advantages for Your Enterprise
Cloud Communications: Top 5 Advantages for Your EnterpriseCloud Communications: Top 5 Advantages for Your Enterprise
Cloud Communications: Top 5 Advantages for Your EnterpriseXO Communications
 
Discovering Computers: Chapter 03
Discovering Computers: Chapter 03Discovering Computers: Chapter 03
Discovering Computers: Chapter 03Anna Stirling
 
Fun and games for profit
Fun and games for profitFun and games for profit
Fun and games for profitVenu Vasudevan
 
Code Generation for Conceptual User Interface Patterns
Code Generation for Conceptual User Interface PatternsCode Generation for Conceptual User Interface Patterns
Code Generation for Conceptual User Interface PatternsPedro J. Molina
 
Irfan Ur Rehman
Irfan Ur RehmanIrfan Ur Rehman
Irfan Ur Rehmanmrcool2002
 
Dispelling the mystery around resource planning revc
Dispelling the mystery around resource planning revcDispelling the mystery around resource planning revc
Dispelling the mystery around resource planning revckdelcol
 
HP Open Stack Keynote 4 18_2012 final
HP Open Stack Keynote 4 18_2012 finalHP Open Stack Keynote 4 18_2012 final
HP Open Stack Keynote 4 18_2012 finallaurabeckcahoon
 
Business made Social - How social technologies and behaviour are changing the...
Business made Social - How social technologies and behaviour are changing the...Business made Social - How social technologies and behaviour are changing the...
Business made Social - How social technologies and behaviour are changing the...Stefan Pfeiffer
 
Exploring the future of the IT industry and the next generation CIO
Exploring the future of the IT industry and the next generation CIOExploring the future of the IT industry and the next generation CIO
Exploring the future of the IT industry and the next generation CIOJessvin Thomas
 
Congressional it reform-roadmap_2011
Congressional it reform-roadmap_2011Congressional it reform-roadmap_2011
Congressional it reform-roadmap_2011John Weiler
 
Mobile advisor zenprise-pitch - lars
Mobile advisor zenprise-pitch - larsMobile advisor zenprise-pitch - lars
Mobile advisor zenprise-pitch - larsLars Bodenhoff
 

What's hot (19)

Intel Corporation - BA401
Intel Corporation - BA401Intel Corporation - BA401
Intel Corporation - BA401
 
Emc - Journey to the Cloud - Business Agility Seminar
Emc - Journey to the Cloud - Business Agility SeminarEmc - Journey to the Cloud - Business Agility Seminar
Emc - Journey to the Cloud - Business Agility Seminar
 
Emc expoesymposium
Emc expoesymposiumEmc expoesymposium
Emc expoesymposium
 
Novell Tour Europe and South Africa 2012
Novell Tour Europe and South Africa 2012Novell Tour Europe and South Africa 2012
Novell Tour Europe and South Africa 2012
 
Bbx Biz Plan Presentation
Bbx Biz Plan PresentationBbx Biz Plan Presentation
Bbx Biz Plan Presentation
 
MDD: Models, frameworks, & code generation
MDD: Models, frameworks, & code generationMDD: Models, frameworks, & code generation
MDD: Models, frameworks, & code generation
 
Cloud Communications: Top 5 Advantages for Your Enterprise
Cloud Communications: Top 5 Advantages for Your EnterpriseCloud Communications: Top 5 Advantages for Your Enterprise
Cloud Communications: Top 5 Advantages for Your Enterprise
 
Discovering Computers: Chapter 03
Discovering Computers: Chapter 03Discovering Computers: Chapter 03
Discovering Computers: Chapter 03
 
Curated Computing
Curated Computing Curated Computing
Curated Computing
 
Fun and games for profit
Fun and games for profitFun and games for profit
Fun and games for profit
 
Code Generation for Conceptual User Interface Patterns
Code Generation for Conceptual User Interface PatternsCode Generation for Conceptual User Interface Patterns
Code Generation for Conceptual User Interface Patterns
 
EMC Overview
EMC OverviewEMC Overview
EMC Overview
 
Irfan Ur Rehman
Irfan Ur RehmanIrfan Ur Rehman
Irfan Ur Rehman
 
Dispelling the mystery around resource planning revc
Dispelling the mystery around resource planning revcDispelling the mystery around resource planning revc
Dispelling the mystery around resource planning revc
 
HP Open Stack Keynote 4 18_2012 final
HP Open Stack Keynote 4 18_2012 finalHP Open Stack Keynote 4 18_2012 final
HP Open Stack Keynote 4 18_2012 final
 
Business made Social - How social technologies and behaviour are changing the...
Business made Social - How social technologies and behaviour are changing the...Business made Social - How social technologies and behaviour are changing the...
Business made Social - How social technologies and behaviour are changing the...
 
Exploring the future of the IT industry and the next generation CIO
Exploring the future of the IT industry and the next generation CIOExploring the future of the IT industry and the next generation CIO
Exploring the future of the IT industry and the next generation CIO
 
Congressional it reform-roadmap_2011
Congressional it reform-roadmap_2011Congressional it reform-roadmap_2011
Congressional it reform-roadmap_2011
 
Mobile advisor zenprise-pitch - lars
Mobile advisor zenprise-pitch - larsMobile advisor zenprise-pitch - lars
Mobile advisor zenprise-pitch - lars
 

Similar to Design for Interaction

Cloud Technology to Facilitate Growth
Cloud Technology to Facilitate GrowthCloud Technology to Facilitate Growth
Cloud Technology to Facilitate GrowthIconnyx
 
Cloud Computing, Business Models, Geilo April 2009
Cloud Computing, Business Models, Geilo April 2009Cloud Computing, Business Models, Geilo April 2009
Cloud Computing, Business Models, Geilo April 2009Francis D'Silva
 
"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companies"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companiesData Science Milan
 
Ibm software network2012 claudio cinquepalmi #ibmsocialbiz
Ibm software network2012 claudio cinquepalmi  #ibmsocialbiz Ibm software network2012 claudio cinquepalmi  #ibmsocialbiz
Ibm software network2012 claudio cinquepalmi #ibmsocialbiz Claudio Cinquepalmi
 
The Elements Of User Experience
The Elements Of User ExperienceThe Elements Of User Experience
The Elements Of User ExperienceJohn Chen, Jun
 
Day of data: skills for the future
Day of data: skills for the futureDay of data: skills for the future
Day of data: skills for the futureSteven Miller
 
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin BreitmanRio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin BreitmanRio Info
 
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdfBrunoAtti1
 
Cw13 cloud meets big data by ibrahim alloub-emc
Cw13 cloud meets big data by ibrahim alloub-emcCw13 cloud meets big data by ibrahim alloub-emc
Cw13 cloud meets big data by ibrahim alloub-emcinevitablecloud
 
Adobe flash platform java
Adobe flash platform javaAdobe flash platform java
Adobe flash platform javaCh'ti JUG
 
Adobe flash platform java
Adobe flash platform javaAdobe flash platform java
Adobe flash platform javaMichael Chaize
 
Who Made This Mess?
Who Made This Mess?Who Made This Mess?
Who Made This Mess?mmiddaugh
 
Ds roi tc_world
Ds roi tc_worldDs roi tc_world
Ds roi tc_worldvsrtwin
 
Mobile Monday - WebServices on the iPhone - 05/2008
Mobile Monday - WebServices on the iPhone - 05/2008Mobile Monday - WebServices on the iPhone - 05/2008
Mobile Monday - WebServices on the iPhone - 05/2008Roland Tritsch
 
Php In The Enterprise 01 24 2010
Php In The Enterprise 01 24 2010Php In The Enterprise 01 24 2010
Php In The Enterprise 01 24 2010phptechtalk
 
Support as a Leader in Innovation: A Case Study with Cisco
Support as a Leader in Innovation: A Case Study with CiscoSupport as a Leader in Innovation: A Case Study with Cisco
Support as a Leader in Innovation: A Case Study with CisconoHold, Inc.
 

Similar to Design for Interaction (20)

Jobs in the Cloud
 Jobs in the Cloud Jobs in the Cloud
Jobs in the Cloud
 
Cloud Technology to Facilitate Growth
Cloud Technology to Facilitate GrowthCloud Technology to Facilitate Growth
Cloud Technology to Facilitate Growth
 
Cloud Computing, Business Models, Geilo April 2009
Cloud Computing, Business Models, Geilo April 2009Cloud Computing, Business Models, Geilo April 2009
Cloud Computing, Business Models, Geilo April 2009
 
"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companies"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companies
 
Ibm software network2012 claudio cinquepalmi #ibmsocialbiz
Ibm software network2012 claudio cinquepalmi  #ibmsocialbiz Ibm software network2012 claudio cinquepalmi  #ibmsocialbiz
Ibm software network2012 claudio cinquepalmi #ibmsocialbiz
 
The Elements Of User Experience
The Elements Of User ExperienceThe Elements Of User Experience
The Elements Of User Experience
 
101 ab 1445-1515
101 ab 1445-1515101 ab 1445-1515
101 ab 1445-1515
 
101 ab 1445-1515
101 ab 1445-1515101 ab 1445-1515
101 ab 1445-1515
 
Day of data: skills for the future
Day of data: skills for the futureDay of data: skills for the future
Day of data: skills for the future
 
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin BreitmanRio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
 
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
 
Cw13 cloud meets big data by ibrahim alloub-emc
Cw13 cloud meets big data by ibrahim alloub-emcCw13 cloud meets big data by ibrahim alloub-emc
Cw13 cloud meets big data by ibrahim alloub-emc
 
Adobe flash platform java
Adobe flash platform javaAdobe flash platform java
Adobe flash platform java
 
Adobe flash platform java
Adobe flash platform javaAdobe flash platform java
Adobe flash platform java
 
Who Made This Mess?
Who Made This Mess?Who Made This Mess?
Who Made This Mess?
 
Ds roi tc_world
Ds roi tc_worldDs roi tc_world
Ds roi tc_world
 
Mobile Monday - WebServices on the iPhone - 05/2008
Mobile Monday - WebServices on the iPhone - 05/2008Mobile Monday - WebServices on the iPhone - 05/2008
Mobile Monday - WebServices on the iPhone - 05/2008
 
Php In The Enterprise 01 24 2010
Php In The Enterprise 01 24 2010Php In The Enterprise 01 24 2010
Php In The Enterprise 01 24 2010
 
EMC & Techno Vision
EMC & Techno VisionEMC & Techno Vision
EMC & Techno Vision
 
Support as a Leader in Innovation: A Case Study with Cisco
Support as a Leader in Innovation: A Case Study with CiscoSupport as a Leader in Innovation: A Case Study with Cisco
Support as a Leader in Innovation: A Case Study with Cisco
 

More from Daniel Tunkelang

Query Understanding and Ecommerce
Query Understanding and EcommerceQuery Understanding and Ecommerce
Query Understanding and EcommerceDaniel Tunkelang
 
Semantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesSemantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesDaniel Tunkelang
 
Helping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingHelping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingDaniel Tunkelang
 
Query Understanding: A Manifesto
Query Understanding: A ManifestoQuery Understanding: A Manifesto
Query Understanding: A ManifestoDaniel Tunkelang
 
Where should you put your data scientists?
Where should you put your data scientists?Where should you put your data scientists?
Where should you put your data scientists?Daniel Tunkelang
 
Data Science: A Mindset for Productivity
Data Science: A Mindset for ProductivityData Science: A Mindset for Productivity
Data Science: A Mindset for ProductivityDaniel Tunkelang
 
My Three Ex’s: A Data Science Approach for Applied Machine Learning
My Three Ex’s: A Data Science Approach for Applied Machine LearningMy Three Ex’s: A Data Science Approach for Applied Machine Learning
My Three Ex’s: A Data Science Approach for Applied Machine LearningDaniel Tunkelang
 
Web science - How is it different?
Web science - How is it different?Web science - How is it different?
Web science - How is it different?Daniel Tunkelang
 
Better Search Through Query Understanding
Better Search Through Query UnderstandingBetter Search Through Query Understanding
Better Search Through Query UnderstandingDaniel Tunkelang
 
Social Search in a Professional Context
Social Search in a Professional ContextSocial Search in a Professional Context
Social Search in a Professional ContextDaniel Tunkelang
 
Find and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedInFind and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedInDaniel Tunkelang
 
Search as Communication: Lessons from a Personal Journey
Search as Communication: Lessons from a Personal JourneySearch as Communication: Lessons from a Personal Journey
Search as Communication: Lessons from a Personal JourneyDaniel Tunkelang
 
Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?Daniel Tunkelang
 
Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem Daniel Tunkelang
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data ScientistDaniel Tunkelang
 
Information, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of NeedsInformation, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of NeedsDaniel Tunkelang
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The PeopleDaniel Tunkelang
 
Content, Connections, and Context
Content, Connections, and ContextContent, Connections, and Context
Content, Connections, and ContextDaniel Tunkelang
 

More from Daniel Tunkelang (20)

Query Understanding and Ecommerce
Query Understanding and EcommerceQuery Understanding and Ecommerce
Query Understanding and Ecommerce
 
Semantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesSemantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce Queries
 
Helping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingHelping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query Understanding
 
MMM, Search!
MMM, Search!MMM, Search!
MMM, Search!
 
Enterprise Intelligence
Enterprise IntelligenceEnterprise Intelligence
Enterprise Intelligence
 
Query Understanding: A Manifesto
Query Understanding: A ManifestoQuery Understanding: A Manifesto
Query Understanding: A Manifesto
 
Where should you put your data scientists?
Where should you put your data scientists?Where should you put your data scientists?
Where should you put your data scientists?
 
Design for Interaction

  • 1. design for interaction Daniel Tunkelang Chief Scientist, Endeca © 2009 Endeca Technologies, Inc. All rights reserved.
  • 2. about me
    Organizing SIGIR ’09 Industry Track in Boston on July 22nd!
  • 3. about endeca
    leading provider of search applications
      • 250M+ end users per month
      • 600+ customers
      • $100M+ annual sales
  • 4. what i hope you learn from this talk
      • the db and ir perspectives have a common thread
      • convergence may be upon us
      • but we need interaction to make it work
  • 5. overview
      • don't put all your eggs in one basket
      • design for interaction
      • human-computer information retrieval
  • 6. don’t put all your eggs in one basket
    [image: Still Life with Basket and Broken Eggs by Michael Edwards, 2008]
  • 7. the db approach: perfection in, perfection out
    [image: http://www.storeitfoodsblog.com/category/food-preparation/meat-grinder/]
  • 8. db usability researchers recognize the pain
  • 9. sql is hard
    Making Database Systems Usable [Jagadish et al., SIGMOD 2007]
      • labor-intensive query construction
      • lengthy query evaluation
      • high query reformulation cost
  • 10. data sucks and users are lazy
    Extracting Problems for Database and IR Researchers [Naughton, Spring 2008 North East DB/IR Day]
      • real data is
          – incomplete
          – inconsistent
          – incorrect
      • users don’t want to learn
          – data schemas
          – structured query languages
    we’re not gonna take it!
  • 11. the ir way: don’t worry, be happy
    [image: http://adsoftheworld.com/media/print/mcdonalds_burger_mysteries]
  • 12. ir for db people: what would google do?
    [diagram]
      • USER: information need, query, select from results
      • SYSTEM: rank using IR model (tf-idf, PageRank)
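To make "rank using IR model" on slide 12 concrete, here is a minimal sketch of tf-idf ranking. It assumes whitespace tokenization and the raw `tf * log(N / df)` weighting, which is only one of several common variants; the function name `tf_idf_rank` is hypothetical, and nothing here is claimed to match what Google or Endeca actually run.

```python
import math
from collections import Counter


def tf_idf_rank(query, docs):
    """Return document indices ordered by summed tf-idf of the query terms.

    Assumption: docs is a list of plain strings, tokenized on whitespace.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scored = []
    for index, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if df[term] > 0:
                # Terms that appear in fewer documents get a larger weight.
                score += tf[term] * math.log(n_docs / df[term])
        scored.append((score, index))

    # Highest score first; ties broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]
```

The user's only input is the query string, and the only output is an ordering, which is the one-shot, model-knows-best interaction the following slides question.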
  • 13. assumptions of relevance-centric ir approach
      • self-awareness
      • self-expression
      • model knows best
      • answer is a document
      • one-shot query
  • 14. life is not a batch
      • db approach expects too much of user
      • ir approach expects too much of system
    both approaches act as if it all comes down to a single query
    is that your final answer question?
  • 15. design for interaction
    [image: The Future of Social Interaction by Jim Stoten]
  • 16. changes assumptions about what to optimize
    [diagram labels: precision, recall, complexity, relevance, communication]
  • 17. how do we optimize communication?
      • transparency
      • guidance
      • control
  • 18. ir offers a black box
    “Ça, c’est la caisse. Le mouton que tu veux est dedans.” (from Le Petit Prince: “This is the box. The sheep you want is inside.”)
  • 19. db / set retrieval offers 2 out of 3
      • transparency
      • guidance
      • control
  • 20. but we need it all!
      • set retrieval is a failure in the ir world
          – though quite successful in the db world!
      • but ranked retrieval is inherently crippled
          – no transparency, control, or guidance!
    how do we optimize for communication?
  • 21. human-computer information retrieval
    “Toward Human-Computer Information Retrieval” by Gary Marchionini
      • don’t just guess the user’s intent
      • increase user responsibility and control
      • require and reward human intellectual effort
  • 22. great idea
    how?
  • 23. treat query construction as a process
    A Case for Interaction [Koenemann and Belkin, 1996]
      • used term feedback to improve alerting queries
      • users select from suggested terms
      • 17 – 34% improvement in precision @ 30
      • users liked the feedback interface
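The "precision @ 30" figure on slide 23 refers to the fraction of the top 30 ranked results that are relevant. A minimal sketch of that metric follows; the function name `precision_at_k` and the choice to divide by `k` even when fewer than `k` results come back are assumptions for illustration, not details taken from the cited study.

```python
def precision_at_k(ranked_ids, relevant_ids, k=30):
    """Fraction of the top-k ranked results that are relevant.

    Assumption: a result list shorter than k is still scored against k,
    so missing results count the same as non-relevant ones.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / k
```

Comparing this value for the same information need before and after the user picks suggested terms is one way to quantify what term feedback buys.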
  • 24. expose the facets of semistructured content
  • 25. success in the lab and the field
      • favored in user studies by Marti Hearst
          – http://flamenco.berkeley.edu/
      • ubiquitous in ecommerce
          – amazon.com
          – eBay
          – endeca powers 42 of top 100 online retailers
      • taking over media, libraries, enterprise, etc.
  • 26. even a few db folks have drunk the kool-aid
    DataGuides [Goldman and Widom, VLDB 1997]
      • user-friendly schema summaries
    Magnet [Sinha and Karger, SIGMOD 2005]
      • navigation and refinement options
    common theme: semistructured
  • 27. what is semistructured data?
      • one universe
      • self-describing
      • blends data / meta-data
  • 28. data modeling flexibility
      • no a-priori schema
          – integrated sources without up-front schema design
      • richer modeling capabilities tame data complexity
          – hierarchy, multi-valued fields, sparse fields
      • schema flexibility eases schema evolution
          – new entity types, new data source
    [figure labels, data sources: WWW, Internet, SOA, ESB, Web Service, Groupware and Collaboration, Content Management, Databases, ERP, File Systems]
  • 29. semantically direct queries
      • which attributes characterize on-sale blue items?
          – price, sleeve, color, salePrice, brand, fabric, …
      • which on-sale items are available in blue?

    <shirt>
      <sku>1234</sku>
      <sleeve>Long</sleeve>
      <desc>Classic end-on-end shirt</desc>
      <price>39.99</price>
      <salePrice>29.99</salePrice>
      <color>Blue</color>
      <color>Yellow</color>
      <color>White</color>
      ...
    </shirt>

    <buyingGuide>
      <title>Selecting the right ski coat for you.</title>
      <file>skiguide.pdf</file>
      <keyword>ski</keyword>
      <keyword>coat</keyword>
      ...
    </buyingGuide>

    <trousers>
      <sku>1579</sku>
      <price>59.99</price>
      <color>Khaki</color>
      ...
    </trousers>
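The two questions on slide 29 can be sketched over schema-free records. In this minimal sketch each record is a Python dict mapping a field name to a list of values, which mirrors the multi-valued, sparse fields of the slide's XML; the names `RECORDS`, `on_sale_in_color`, and `characterizing_attributes` are hypothetical, and the elided `...` fields from the slide are simply left out. It is an illustration of the idea, not Endeca's engine.

```python
from collections import Counter

# Records transcribed from the slide; "_type" holds the XML element name.
RECORDS = [
    {
        "_type": ["shirt"],
        "sku": ["1234"],
        "sleeve": ["Long"],
        "desc": ["Classic end-on-end shirt"],
        "price": ["39.99"],
        "salePrice": ["29.99"],
        "color": ["Blue", "Yellow", "White"],
    },
    {
        "_type": ["buyingGuide"],
        "title": ["Selecting the right ski coat for you."],
        "file": ["skiguide.pdf"],
        "keyword": ["ski", "coat"],
    },
    {
        "_type": ["trousers"],
        "sku": ["1579"],
        "price": ["59.99"],
        "color": ["Khaki"],
    },
]


def on_sale_in_color(records, color):
    """Which on-sale items are available in the given color?"""
    return [
        record
        for record in records
        if "salePrice" in record and color in record.get("color", [])
    ]


def characterizing_attributes(records):
    """Which attributes occur in these records, and in how many of them?"""
    counts = Counter()
    for record in records:
        # Skip bookkeeping fields such as "_type".
        counts.update(name for name in record if not name.startswith("_"))
    return counts
```

Neither function needs a schema declared up front: a record that lacks `salePrice` or `color` just fails the filter, and the attribute counts fall out of whatever fields the matching records happen to carry.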
  • 30. but let’s make this concrete
    Uh oh, I’m presenting at SIGMOD! Better find a good book about databases!
  • 31. quick, to the goog-mobile!
    not quite…
  • 32. i know, i’ll go to the library!
    #%@$!
  • 33. let’s try a little hcir…
  • 34. hcir works for news too
  • 35. life in a semistructured world
      • search is a great starting point
          – users can’t / won’t initiate structured queries
      • ranked lists are an inadequate ending point
          – search queries are lossy projections of intent
      • hcir leads users down a garden path to structure
  • 36. lots of trade-offs
      • “everything should be made as simple as possible, but no simpler”
      • “speed of thought” vs. “going nowhere quickly”
      • “to err is human, but to really foul things up requires a computer”
    simple interfaces don’t always yield satisfaction
  • 37. users want the triumvirate
      • transparency
      • control
      • guidance
    transparency and control are easy
    guidance requires cleverness
  • 38. in closing
    all of us want to help people access information
    the best help is to help them help themselves
    design for interaction through transparency, control, guidance
  • 39. thank you…and come to SIGIR!
    communication 1.0
      • email: dt@endeca.com
    communication 2.0
      • blog: http://thenoisychannel.com
      • twitter: http://twitter.com/dtunkelang
    SIGIR: July 19-23 in Boston
    Industry Track on July 22nd!