Introduction to Enterprise Search

Kristian Norling

Introduction
• Who is here?
• Your expectations?
• Kristian?
• 2 hours, one break
• Lifetime answer guarantee on this class

Agenda
• Problem
• History of (web) search
• How we search and find
• Current state of Enterprise Search + stats
• Technical concept
• Information quality
• Feedback cycle
• Five dimensions of Findability

The Problems
• Growing amounts of Information
• Changing patterns of information
consumption
• Information silos
• Web-like behaviour > Information filters
• Internal information use is still in the
Digital Stone Age

History of Search
In academia, search is called Information
Retrieval.
It is an old discipline, dating back
thousands of years...
Basic concepts in Information Retrieval:
Recall and Precision, more later...

Directories vs. Search Engines
• Directories are manually compiled taxonomies of
websites
• Directories are far more costly and time-intensive to
maintain
• Directories lack coverage, although they provide an
important alternative, especially for novice surfers
• Search engines rely mainly on automated search
algorithms
• Search engines rank pages by popularity on the web:
the more referrals (links), the more relevant the page

Early days of Web Search
Yahoo – searchable directory (1994, ~10000 websites)
• Integrates search over its directory. Organized by
subject matters. Sites can be suggested, but human
editors control quality of directory (~100 dedicated
editors)
Ask – natural language search engine (1998)
• Used human editors to match popular queries. Tried
different algorithms to rank pages by popularity
Google – searchable index (1998)
• Developed PageRank, popularity algorithm that hides
bad content. Set standards (spellchecking, query
suggestion, search results page design)

Web Search - evolution
First generation (1995-97) – AltaVista, Excite, WebCrawler
Uses mostly on-page data (text and formatting).
Informational queries.
Second generation (1998-2010) – Google, Yahoo
Use off-page, web-specific data: link analysis, anchor-text,
click-through data. Informational and navigational queries.
Third generation (2010-present) – Google, Wolfram Alpha,
Bing
Blends data from many sources and tries to answer ‘‘the need
behind the query’’: semantic analysis, context determination,
dynamic database selection etc. Informational, navigational, and
transactional queries.

Seeking information modes:
Informational
Find information assumed to be available
on the web in a static form.

Navigational
Reach a particular site that the user has in
mind, either because they visited it in the
past or because they assume that such a
site exists. Such queries usually have only one
"right" result.

Transactional
Reach a site where further interaction will happen. This
interaction constitutes the transaction defining these
queries. The main categories for such queries are
shopping, finding various web-mediated services,
downloading various types of files (images, songs, etc.),
accessing certain databases (e.g. Yellow Pages type data),
finding servers (e.g. for gaming), etc.

Four modes of seeking information

Finding something when I
know what I want and have
words to describe it.


Exploring when I only have
some idea of what I want and
may lack the words to
articulate it.


Finding relevant items when I
don’t know what I need.


Finding something I have seen
before, but can’t remember
where.

The State of Enterprise Search
• The amount of information is growing
every day
• What to Search for?
• Where to Search?
• How to Search?
• Search is simple, complex and powerful
• Findability Dimensions

STATS FROM THE
“ENTERPRISE SEARCH AND
FINDABILITY SURVEY 2012”
SIGN-UP

HOW CRITICAL IS FINDING
THE RIGHT INFORMATION
TO BUSINESS GOALS AND
SUCCESS?

EUROPE
76.5%
IMPERATIVE/SIGNIFICANT

IS IT EASY TO FIND THE
RIGHT INFORMATION
WITHIN YOUR
ORGANISATION TODAY?

EUROPE
77%
MODERATELY/VERY HARD

EUROPE
18.5%
MOSTLY/VERY SATISFIED

WHAT ARE THE OBSTACLES
TO FINDING THE RIGHT
INFORMATION?

Globally
63.4% POOR SEARCH FUNCTIONALITY
52.1% DON'T KNOW WHERE TO LOOK
51.4% INCONSISTENCY IN HOW WE TAG
CONTENT
50.0% LACK OF ADEQUATE TAGS
33.1% DON’T KNOW WHAT TO LOOK FOR

Wikipedia Definition
“Enterprise search is the practice of
making content from multiple
enterprise-type sources, such as
databases and intranets, searchable to a
defined audience.”
http://en.wikipedia.org/wiki/Enterprise_search

The Concept of Enterprise
Search: Precision
In the field of information retrieval, precision is the
fraction of retrieved documents that are relevant to the
search.

Precision takes all retrieved documents into account,
but it can also be evaluated at a given cut-off rank,
considering only the topmost results returned by the
system. This measure is called precision at n or P@n.
Source: Wikipedia

The Concept of Enterprise
Search: Recall
Recall in information retrieval is the fraction of the
documents that are relevant to the query that are
successfully retrieved.

For example, for text search on a set of documents, recall
is the number of correct results divided by the number
of results that should have been returned.
Source: Wikipedia

Precision and Recall

M = number of relevant documents
N = number of retrieved documents
R = number of retrieved documents that are also relevant

Precision and Recall
Recall = R / M =
Number of retrieved documents that are
also relevant / Total number of relevant
documents.
Precision = R / N =
Number of retrieved documents that are
also relevant / Total number of retrieved
documents.
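
The two ratios above, plus P@n, can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the document ids are made up for the example, and it assumes relevance judgements are available as a simple set of ids.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: document ids returned by the search engine (N of them)
    relevant:  document ids judged relevant to the query (M of them)
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    r = len(retrieved_set & relevant_set)  # R: retrieved and also relevant
    precision = r / len(retrieved_set) if retrieved_set else 0.0
    recall = r / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def precision_at_n(ranked, relevant, n):
    """P@n: precision computed over the top n ranked results only."""
    if n <= 0:
        raise ValueError("n must be positive")
    relevant_set = set(relevant)
    hits = sum(1 for doc in ranked[:n] if doc in relevant_set)
    return hits / n


# Hypothetical data: N = 4 retrieved, M = 5 relevant, R = 2 in common.
ranked = ["d1", "d7", "d3", "d9"]
relevant = {"d1", "d2", "d3", "d4", "d5"}

precision, recall = precision_recall(ranked, relevant)
print(precision, recall)                    # 0.5 0.4
print(precision_at_n(ranked, relevant, 2))  # 0.5
```

With these numbers, Precision = R / N = 2 / 4 and Recall = R / M = 2 / 5.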

Relevance
...enterprises typically have to use other query-
independent factors, such as a document's recency or
popularity, along with query-dependent factors
traditionally associated with information retrieval
algorithms. Also, the rich functionality of enterprise
search UIs, such as clustering and faceting, diminish
reliance on ranking as the means to direct the user's
attention.
Source: Wikipedia
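
One way to picture the mix of query-dependent and query-independent factors is a weighted sum. The sketch below is an assumption-heavy illustration, not how any particular search platform scores documents: the weights, the half-life decay for recency, the view-count measure of popularity and the sample documents are all hypothetical.

```python
from datetime import date

# Hypothetical weights; a real deployment would tune these over time.
WEIGHTS = {"text": 0.6, "recency": 0.25, "popularity": 0.15}


def text_score(query, text):
    """Query-dependent: fraction of query terms found in the document text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms) if terms else 0.0


def recency_score(modified, today, half_life_days=365):
    """Query-independent: 1.0 for a document changed today, halving per half-life."""
    age_days = max((today - modified).days, 0)
    return 0.5 ** (age_days / half_life_days)


def popularity_score(views, max_views):
    """Query-independent: views relative to the most-viewed document."""
    return views / max_views if max_views else 0.0


def relevance(query, doc, today, max_views):
    """Weighted sum of one query-dependent and two query-independent factors."""
    return (WEIGHTS["text"] * text_score(query, doc["text"])
            + WEIGHTS["recency"] * recency_score(doc["modified"], today)
            + WEIGHTS["popularity"] * popularity_score(doc["views"], max_views))


# Hypothetical documents: same text match, different age and popularity.
docs = [
    {"id": "old-policy", "text": "travel expense policy",
     "modified": date(2009, 1, 1), "views": 900},
    {"id": "new-policy", "text": "travel expense policy update",
     "modified": date(2012, 6, 1), "views": 300},
]
today = date(2012, 9, 1)
max_views = max(d["views"] for d in docs)

ranked = sorted(docs,
                key=lambda d: relevance("expense policy", d, today, max_views),
                reverse=True)
print([d["id"] for d in ranked])
```

Both documents match the query equally well, so the ordering here is decided by recency and popularity alone.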

Relevance
We do not have PageRank...
...but we have social!
Social Reconnects Enterprise Search
Emails, People Catalogues, Connections,
Tagging, Sharing etc.

The Concept of Enterprise Search

Search based Solutions
Examples of implementations:
- People Search
- Product Search
- Document Search
- Intranet and Website Search
- E-commerce
- Dashboard / Search as a Service

Information / Content
• Good Data/Information hygiene
• Crap in = Crap out
• Metadata is very important!
• Taxonomy and Metadata demystified
• TetraPak example (video)
• SimCorp example
• VGR example (video)

HCE (SWEDEN)
DEWEY DECIMAL CLASSIFICATION

Author: Douglas Coupland
Title: Hej Nostradamus!
Publisher: Norstedts
Year: 2003
Printed by: Smedjebacken
Printed: 2004

Metadata
Semantic

ESEO: Actionable activities
Example: Ernst & Young
• Metadata
• Titles

• Content Quality
• Information Life Cycle Management

Show me the Money
But, an average Search budget is 100K Euro
• TCO
• ROI
• KPI

Search Analytics is key

Search Analytics
Important, delivers actionable to-dos quickly
• 0-results
• Top Terms Searched for

Video: Search Analytics in Practice

User Satisfaction
• Feedback form
• KPI from Search Analytics
• Session time x number of sessions = Time spent
on search; Time spent on search x hourly price =
Cost per “answer”
• Add search refinements + exit page (=is
the right answer)
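
The cost calculation is simple arithmetic and can be sketched as below. All figures are invented for illustration. Dividing the total by the number of sessions that ended on an exit page is an assumption made here to get a per-answer figure; the slide only gives the total.

```python
def search_cost(avg_session_minutes, sessions, hourly_price):
    """Session time x number of sessions x hourly price."""
    hours_spent = avg_session_minutes * sessions / 60
    return hours_spent * hourly_price


def cost_per_answer(avg_session_minutes, sessions, hourly_price, answered_sessions):
    """Total search cost divided by sessions assumed to have found an answer.

    Assumption: a session that ends on an exit page counts as one answer.
    """
    if answered_sessions <= 0:
        raise ValueError("answered_sessions must be positive")
    total = search_cost(avg_session_minutes, sessions, hourly_price)
    return total / answered_sessions


# Hypothetical figures: 3-minute sessions, 10,000 sessions, 40 euro per hour.
total = search_cost(avg_session_minutes=3, sessions=10_000, hourly_price=40)
per_answer = cost_per_answer(3, 10_000, 40, answered_sessions=8_000)
print(total)       # 20000.0
print(per_answer)  # 2.5
```

Tracking this number over time shows whether improvements to search actually reduce the time people spend looking for answers.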

Findability by Findwise

1. BUSINESS
Build solutions to support your business processes and goals
2. INFORMATION
Prepare information to make it findable
3. USERS
Build usable solutions based on user needs
4. ORGANISATION
Govern and improve your solution over time
5. SEARCH TECHNOLOGY
Build solutions based on state-of-the-art search technology

Business
• Analyze how your business goals and
strategies can be met by improved
information access
• Set Findability goals. Examples: increase the
revenue on sales, raise productivity, improve
knowledge sharing, better collaboration
• Specify your requirements
• Define KPIs and measure the success of your
investments

Information
• Clean up and archive or delete outdated/
irrelevant information
• Ensure good quality of information by
adding structured and suitable metadata
• Create and use information models and
taxonomies
• Tagging?

Users
• Get to know your users and their needs
• Make sure your solution is easy to use
• Perform continuous usability evaluations,
like usage tests and expert evaluations
• Make sure users find what they are looking
for
• Enable feedback loops for complaints,
feedback and praise

Organisation
• Resources!
• Define processes, roles and routines to
govern the solution
• Perform Search Analytics
• Create easy to use administration
interfaces
• Perform training, technical and editorial
• Help publishers get started with processes
for better findability

Search Technology
• Select a suitable search platform or make
the most of your current solution
• Design your architecture with search-as-a-
service in mind
• Utilise the full potential of the selected
technology

Kristian Norling
LinkedIn
@kristiannorling
@findwise
findwise.com
Findability Blog
Slideshare
Vimeo
Newsroom
