More Related Content Similar to Design for Interaction (20) More from Daniel Tunkelang (20) Design for Interaction1. design for interaction
Daniel Tunkelang
Chief Scientist, Endeca
© 2009 Endeca Technologies, Inc. All rights reserved.
2. about me
Organizing SIGIR ’09 Industry Track in Boston on July 22nd!
2 © 2009 Endeca Technologies, Inc. All rights reserved.
3. about endeca
leading provider of
search applications
250M+
end users
per month
600+ customers
$100M+ annual sales
3 © 2009 Endeca Technologies, Inc. All rights reserved.
4. what i hope you learn from this talk
the db and ir perspectives have a common thread
convergence may be upon us
but we need interaction to make it work
4 © 2009 Endeca Technologies, Inc. All rights reserved.
5. overview
don't put all your eggs in one basket
design for interaction
human-computer information retrieval
5 © 2009 Endeca Technologies, Inc. All rights reserved.
6. don’t put all your eggs in one basket
Still Life with Basket and Broken Eggs by Michael Edwards, 2008
6 © 2009 Endeca Technologies, Inc. All rights reserved.
7. the db approach: perfection in, perfection out
http://www.storeitfoodsblog.com/category/food-preparation/meat-grinder/
7 © 2009 Endeca Technologies, Inc. All rights reserved.
9. sql is hard
Making Database Systems Usable
[Jagadish et al., SIGMOD 2007]
__
sql
• labor-intensive query construction
• lengthy query evaluation
• high query reformulation cost
9 © 2009 Endeca Technologies, Inc. All rights reserved.
10. data sucks and users are lazy
Extracting Problems for Database
and IR Researchers
[Naughton, Spring 2008 North East DB/IR Day]
• real data is
– incomplete
– inconsistent
– incorrect
• users don’t want to learn
– data schemas
– structured query languages we’re not gonna take it!
10 © 2009 Endeca Technologies, Inc. All rights reserved.
11. the ir way: don’t worry, be happy
http://adsoftheworld.com/media/print/mcdonalds_burger_mysteries
11 © 2009 Endeca Technologies, Inc. All rights reserved.
12. ir for db people: what would google do?
tf-idf PageRank
SYSTEM:
rank using IR model
USER:
information Need query select from results
12 © 2009 Endeca Technologies, Inc. All rights reserved.
13. assumptions of relevance-centric ir approach
• self-awareness
• self-expression
• model knows best
• answer is a document
• one-shot query
13 © 2009 Endeca Technologies, Inc. All rights reserved.
14. life is not a batch
• db approach expects too much of user
• ir approach expects too much of system
both approaches act as if it all
comes down to a single query
is that your final answer question?
14 © 2009 Endeca Technologies, Inc. All rights reserved.
15. design for interaction
The Future of Social Interaction by Jim Stoten
15 © 2009 Endeca Technologies, Inc. All rights reserved.
16. changes assumptions about what to optimize
precision
recall
complexity relevance
communication
16 © 2009 Endeca Technologies, Inc. All rights reserved.
17. how do we optimize communication?
transparency
guidance
control
17 © 2009 Endeca Technologies, Inc. All rights reserved.
18. ir offers a black box
ca c'est la caisse. le mouton que tu veux est dedans.
18 © 2009 Endeca Technologies, Inc. All rights reserved.
19. db / set retrieval offers 2 out of 3
transparency
guidance
control
19 © 2009 Endeca Technologies, Inc. All rights reserved.
20. but we need it all!
• set retrieval is a failure in the ir world
– though quite successful in the db world!
• but ranked retrieval is inherently crippled
– no transparency, control, or guidance!
how do we optimize for communication?
20 © 2009 Endeca Technologies, Inc. All rights reserved.
21. human-computer information retrieval
“Toward Human-Computer
Information Retrieval”
Gary Marchionini
• don’t just guess the user’s intent
• increase user responsibility and control
• require and reward human intellectual effort
21 © 2009 Endeca Technologies, Inc. All rights reserved.
22. great idea
how?
22 © 2009 Endeca Technologies, Inc. All rights reserved.
23. treat query construction as a process
A Case for Interaction
[Koenemann and Belkin, 1996]
• used term feedback to improve alerting queries
• users select from suggested terms
• 17 – 34% improvement in precision @ 30
• users liked the feedback interface
23 © 2009 Endeca Technologies, Inc. All rights reserved.
24. expose the facets of semistructured content
24 © 2009 Endeca Technologies, Inc. All rights reserved.
25. success in the lab and the field
• favored in user studies by Marti Hearst
– http://flamenco.berkeley.edu/
• ubiquitous in ecommerce
– amazon.com
– eBay
– endeca powers 42 of top 100 online retailers
• taking over media, libraries, enterprise, etc.
25 © 2009 Endeca Technologies, Inc. All rights reserved.
26. even a few db folks have drunk the kool-aid
DataGuides
[Goldman and Widom, VLDB 1997]
• user-friendly schema summaries
Magnet
[Sinha and Karger, SIGMOD 2005]
• navigation and refinement options
common theme: semistructured
26 © 2009 Endeca Technologies, Inc. All rights reserved.
27. what is semistructured data?
• one universe
• self-describing
• blends data / meta-data
27 © 2009 Endeca Technologies, Inc. All rights reserved.
28. data modeling flexibility
• no a-priori schema
– integrated sources without up-front schema design
• richer modeling capabilities tame data complexity
– hierarchy, multi-valued fields, sparse fields
• schema flexibility eases schema evolution
– new entity types, new data source
WWW SOA, ESB, Groupware and Content
Databases ERP
Internet File Systems Web Service Collaboration Management
28 © 2009 Endeca Technologies, Inc. All rights reserved.
29. semantically direct queries
which attributes
which on-sale items characterize on-sale
are available in blue? blue items?
price, sleeve,
color, salePrice,
brand, fabric, …
<shirt>
<buyingGuide>
<sku>1234</sku>
<title>Selecting the right
<sleeve>Long</sleeve>
ski coat for you.</title>
<desc>Classic end-on-end shirt</desc>
<file>skiguide.pdf</file>
<price>39.99</price>
<keyword>ski</keyword>
<salePrice>29.99</salePrice>
<keyword>coat</keyword>
<color>Blue</color>
...
<color>Yellow</color>
</buyingGuide>
<color>White</color>
...
</shirt> <trousers>
<sku>1579</sku>
<price>59.99</price>
<color>Khaki</color>
...
</trousers>
29 © 2009 Endeca Technologies, Inc. All rights reserved.
30. but let’s make this concrete
Uh oh, I’m presenting at
SIGMOD! Better find a good
book about databases!
30 © 2009 Endeca Technologies, Inc. All rights reserved.
31. quick, to the goog-mobile!
not quite…
31 © 2009 Endeca Technologies, Inc. All rights reserved.
32. i know, i’ll go to the library!
#%@$!
32 © 2009 Endeca Technologies, Inc. All rights reserved.
33. let’s try a little hcir…
33 © 2009 Endeca Technologies, Inc. All rights reserved.
34. hcir works for news too
34 © 2009 Endeca Technologies, Inc. All rights reserved.
35. life in a semistructured world
• search is a great starting point
– users can’t / won’t initiate structured queries
• ranked lists are an inadequate ending point
– search queries are lossy projections of intent
• hcir leads users down a garden path to structure
35 © 2009 Endeca Technologies, Inc. All rights reserved.
36. lots of trade-offs
“everything should be made as simple
as possible, but no simpler”
“speed of thought” vs. “going nowhere quickly”
“to err is human, but to really foul
things up requires a computer”
simple interfaces don’t
always yield satisfaction
36 © 2009 Endeca Technologies, Inc. All rights reserved.
37. users want the triumvirate
• transparency
• control
• guidance
transparency and control are easy
guidance requires cleverness
37 © 2009 Endeca Technologies, Inc. All rights reserved.
38. in closing
all of us want to help people access information
the best help is to help them help themselves
design for interaction though
transparency, control, guidance
38 © 2009 Endeca Technologies, Inc. All rights reserved.
39. thank you…and come to SIGIR!
communication 1.0
email: dt@endeca.com
communication 2.0
blog: http://thenoisychannel.com
twitter: http://twitter.com/dtunkelang
SIGIR: July 19-23 in Boston
Industry Track on July 22nd!
39 © 2009 Endeca Technologies, Inc. All rights reserved.