Enterprise Search: How Do We Get There From Here?
by Daniel Tunkelang (Head of Query Understanding, LinkedIn)
Keynote at 2013 Enterprise Search Summit
We've been tackling the challenges of enterprise and site search for at least 3 decades. We've succeeded to the point that search is the gateway to many of our information repositories. Nonetheless, users of enterprise search systems are frustrated with these systems' shortcomings. We see this frustration in surveys, but, more importantly, most of us experience it personally in our daily work life. We all dream of a world where searching any information repository is as effective as searching the web—perhaps even more so. A world where we find what we're looking for, or quickly determine that it doesn't exist. Is this Utopia possible? If so, how do we get there from here? Or at least somewhere close? In this talk, Tunkelang reviews the track record of enterprise search. He talks about what's worked and what hasn't, especially as compared to web search. Finally, he proposes some paths to bring us closer to our dream.
--
Daniel Tunkelang is Head of Query Understanding at LinkedIn. Educated at MIT and CMU, he has spent his career working on big data, addressing key challenges in search, data mining, user interfaces, and network analysis. He co-founded enterprise search and business intelligence pioneer Endeca, where he spent a decade as its Chief Scientist. In 2011, Endeca was acquired by Oracle for over $1B. Before joining LinkedIn, he led a team at Google working on local search quality. Daniel has authored fifteen patents, written a textbook on faceted search, and created the annual symposium on human-computer interaction and information retrieval.
9. Meta-utopia or Metacrap?
Cory Doctorow’s seven straw-men of meta-utopia:
1. People lie.
2. People are lazy.
3. People are stupid.
4. Mission: Impossible -- know thyself.
5. Schemas aren't neutral.
6. Metrics influence results.
7. There's more than one way to describe something.
15.
for i in [1..n]
    B[i] ← {}
    s ← w[1] w[2] … w[i]
    if Pc(s) > 0
        a ← new Segment()
        a.segs ← {s}
        a.prob ← Pc(s)
        B[i] ← {a}
    for j in [1..i-1]
        for b in B[j]
            s ← w[j+1] w[j+2] … w[i]
            if Pc(s) > 0
                a ← new Segment()
                a.segs ← b.segs ∪ {s}
                a.prob ← b.prob * Pc(s)
                B[i] ← B[i] ∪ {a}
    sort B[i] by prob
    truncate B[i] to size k
Long tail? Structure and segment your queries.
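The segmentation pseudocode on this slide can be sketched in Python as a beam search over query prefixes. This is a minimal illustration, not the production implementation: the names `Segmentation`, `segment_query`, and `toy_pc` are hypothetical, the probability table is invented for the example, and the sketch assumes that B[j] holds segmentations of the first j words, so each appended segment starts at word j+1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Segmentation:
    segs: Tuple[str, ...]  # segments chosen so far, in query order
    prob: float            # product of the segment probabilities


def segment_query(
    words: Sequence[str], pc: Callable[[str], float], k: int = 3
) -> List[Segmentation]:
    """Return up to k highest-probability segmentations of `words`.

    `pc` stands in for the slide's Pc(s): the probability that the
    string s is a meaningful segment (0 if it is not one).
    """
    n = len(words)
    # beams[i] holds the best segmentations of the first i words.
    beams: Dict[int, List[Segmentation]] = {i: [] for i in range(1, n + 1)}
    for i in range(1, n + 1):
        # Case 1: the whole prefix w[1..i] is a single segment.
        s = " ".join(words[:i])
        if pc(s) > 0:
            beams[i].append(Segmentation((s,), pc(s)))
        # Case 2: extend a segmentation of w[1..j] with segment w[j+1..i].
        for j in range(1, i):
            s = " ".join(words[j:i])
            p = pc(s)
            if p > 0:
                for b in beams[j]:
                    beams[i].append(Segmentation(b.segs + (s,), b.prob * p))
        # Keep only the k most probable candidates for this prefix.
        beams[i].sort(key=lambda a: a.prob, reverse=True)
        del beams[i][k:]
    return beams.get(n, [])


# Hypothetical segment probabilities, invented for illustration only.
TOY_PC = {
    "software": 0.02,
    "engineer": 0.03,
    "software engineer": 0.01,
    "new": 0.05,
    "york": 0.01,
    "new york": 0.02,
}


def toy_pc(s: str) -> float:
    return TOY_PC.get(s, 0.0)


best = segment_query("software engineer new york".split(), toy_pc, k=2)
print(best[0].segs)  # ('software engineer', 'new york')
```

Truncating each beam to size k bounds the work per prefix, at the cost of possibly discarding a segmentation that would have won later. In practice the segment probabilities would come from query logs or a dictionary of known entities rather than a hand-written table.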
19. Use analytics to drive triage.
“Sorry, no results containing all your search terms were found.”
Analyzed representative random sample of name searches.
Leading causes:
1) Misspelled names.
2) Correctly spelled name of someone not on site.
Combine automated analysis with human judgment.
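As a rough illustration of the automated half of this triage, the sketch below labels a zero-result name search by its likely cause. It assumes fuzzy string matching from Python's standard `difflib` as the misspelling signal, which is a stand-in and not necessarily what was used in the analysis. The function name, labels, and similarity cutoff are hypothetical, and anything without a near match is only flagged for human review, since code alone cannot confirm that a name is spelled correctly.

```python
import difflib
from collections import Counter
from typing import Iterable, List


def triage_name_search(
    query: str, member_names: Iterable[str], cutoff: float = 0.8
) -> str:
    """Guess why a name search found nothing (hypothetical labels)."""
    q = query.strip().lower()
    names: List[str] = [name.lower() for name in member_names]
    if q in names:
        return "has_match"
    # A near match suggests the searcher misspelled an existing name.
    if difflib.get_close_matches(q, names, n=1, cutoff=cutoff):
        return "likely_misspelled"
    # No near match: possibly a correct name of someone not on the site.
    # This bucket still needs human judgment to confirm.
    return "possibly_not_on_site"


members = ["Daniel Tunkelang", "Cory Doctorow"]
sample = ["Daniel Tunkelnag", "Jane Q. Nobody", "Cory Doctorrow"]
causes = Counter(triage_name_search(q, members) for q in sample)
print(causes.most_common())
```

Counting labels over a random sample of failed searches gives a ranked list of causes, which is what makes the triage actionable.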
20. Triage drives and validates agile development.
Misspelled name?
Correctly spelled name of someone not on site?