
Introduction to Enterprise Search


Introduction to Enterprise Search. A two-hour class that introduces Enterprise Search. It covers:
The problems enterprise search can solve
History of (web) search
How we search and find?
Current state of Enterprise Search + stats
Technical concept
Information quality
Feedback cycle
Five dimensions of Findability

Published in: Technology

Introduction to Enterprise Search

  1. 1. INTRODUCTION TO ENTERPRISE SEARCH Kristian Norling
  2. 2. Introduction • Who is here? • Your expectations? • Kristian? • 2 hours, one break • Lifetime answer guarantee on this class
  3. 3. Hikingartist
  4. 4. Agenda • Problem • History of (web) search • How we search and find? • Current state of Enterprise Search + stats • Technical concept • Information quality • Feedback cycle • Five dimensions of Findability
  5. 5. •List mrflip
  6. 6. nathansnider
  7. 7. erikref
  8. 8. The Problems • Growing amounts of information • Changing patterns of information consumption • Information silos • Web-like behaviour > Information filters • Internal information use is still in the Digital Stone Age
  9. 9. History of Search: In academia, search is called Information Retrieval. It is an old discipline, dating back thousands of years... Basic concepts in Information Retrieval: Recall and Precision, more later...
  10. 10. Directories vs. Search Engines • Directories are manually compiled taxonomies of websites • Directories are far more costly and time-intensive to maintain • Directories lack coverage, although they provide an important alternative, especially for novice surfers • Search engines rely mainly on automated search algorithms • Search engines rank pages by popularity on the web: the more referrals (links), the more relevant
  11. 11. Early days of Web Search: Yahoo – searchable directory (1994, ~10000 websites) • Integrates search over its directory. Organized by subject matters. Sites can be suggested, but human editors control quality of directory (~100 dedicated editors). Ask – natural language search engine (1998) • Used human editors to match popular queries. Tried different algorithms to rank pages by popularity. Google – searchable index (1998) • Developed PageRank, a popularity algorithm that hides bad content. Set standards (spellchecking, query suggestion, search results page design)
  12. 12. Web Search - evolution: First generation (1995-97) – AltaVista, Excite, WebCrawler. Uses mostly on-page data (text and formatting). Informational queries. Second generation (1998-2010) – Google, Yahoo. Uses off-page, web-specific data: link analysis, anchor text, click-through data. Informational and navigational queries. Third generation (2010-present) – Google, Wolfram-Alpha, Bing. Blends data from many sources and tries to answer “the need behind the query”: semantic analysis, context determination, dynamic database selection etc. Informational, navigational, and transactional queries.
  13. 13. Seeking information modes: Informational. Find information assumed to be available on the web in a static form.
  14. 14. Seeking information modes: Navigational. Reach a particular site that the user has in mind, either because they visited it in the past or because they assume that such a site exists. Such queries usually have only one "right" result.
  15. 15. Seeking information modes: Transactional. Reach a site where further interaction will happen. This interaction constitutes the transaction defining these queries. The main categories for such queries are shopping, finding various web-mediated services, downloading various types of files (images, songs, etc.), accessing certain databases (e.g. Yellow Pages type data), finding servers (e.g. for gaming) etc.
  16. 16. Four modes of seeking information Finding something when I know what I want and have words to describe it.
  17. 17. Four modes of seeking information Exploring when I only have some idea of what I want and may lack the words to articulate it.
  18. 18. Four modes of seeking information Finding relevant items when I don’t know what I need.
  19. 19. Four modes of seeking information Finding something I have seen before, but can’t remember where.
  20. 20. The State of Enterprise Search • Amount of information is growing every day • What to Search for? • Where to Search? • How to Search? • Search is simple, complex and powerful • Findability Dimensions
  21. 21. STATS FROM THE “ENTERPRISE SEARCH AND FINDABILITY SURVEY 2012” SIGN-UP
  22. 22. HOW CRITICAL IS FINDING THE RIGHT INFORMATION TO BUSINESS GOALS AND SUCCESS?
  23. 23. EUROPE 76.5% IMPERATIVE/SIGNIFICANT
  24. 24. Zoom Zoom
  25. 25. IS IT EASY TO FIND THE RIGHT INFORMATION WITHIN YOUR ORGANISATION TODAY?
  26. 26. EUROPE 77% MODERATELY/VERY HARD
  27. 27. LEVEL OF SATISFACTION?
  28. 28. proimos
  29. 29. EUROPE 18.5% MOSTLY/VERY SATISFIED
  30. 30. WHAT ARE THE OBSTACLES TO FINDING THE RIGHT INFORMATION?
  31. 31. Globally: 63.4% POOR SEARCH FUNCTIONALITY; 52.1% DON’T KNOW WHERE TO LOOK; 51.4% INCONSISTENCY IN HOW WE TAG CONTENT; 50.0% LACK OF ADEQUATE TAGS; 33.1% DON’T KNOW WHAT TO LOOK FOR
  32. 32. Wikipedia Definition: “Enterprise search is the practice of making content from multiple enterprise-type sources, such as databases and intranets, searchable to a defined audience.” http://en.wikipedia.org/wiki/Enterprise_search
  33. 33. The Concept of Enterprise Search: Precision. In the field of information retrieval, precision is the fraction of retrieved documents that are relevant to the search. Precision takes all retrieved documents into account, but it can also be evaluated at a given cut-off rank, considering only the topmost results returned by the system. This measure is called precision at n or P@n. Source: Wikipedia
  34. 34. The Concept of Enterprise Search: Recall Recall in information retrieval is the fraction of the documents that are relevant to the query that are successfully retrieved. For example for text search on a set of documents recall is the number of correct results divided by the number of results that should have been returned. Source: Wikipedia
  35. 35. Precision and Recall: R = number of retrieved documents that are also relevant. M = number of relevant documents. N = number of retrieved documents.
  36. 36. Precision and Recall: Recall = R / M = Number of retrieved documents that are also relevant / Total number of relevant documents. Precision = R / N = Number of retrieved documents that are also relevant / Total number of retrieved documents.
  37. 37. Relevance: ...enterprises typically have to use other query-independent factors, such as a document's recency or popularity, along with query-dependent factors traditionally associated with information retrieval algorithms. Also, the rich functionality of enterprise search UIs, such as clustering and faceting, diminishes reliance on ranking as the means to direct the user's attention. Source: Wikipedia
  38. 38. PageRank
  39. 39. Relevance: We do not have PageRank... ...but we have social! Social Reconnects Enterprise Search: Emails, People Catalogues, Connections, Tagging, Sharing etc.
  40. 40. The Concept of Enterprise Search
  41. 41. Search-based Solutions. Examples of implementations: - People Search - Product Search - Document Search - Intranet and Website Search - E-commerce - Dashboard / Search as a Service
  42. 42. Information / Content • Good Data/Information hygiene • Crap in = Crap out • Metadata is very important! • Taxonomy and Metadata demystified • TetraPak example (video) • SimCorp example • VGR example (video)
  43. 43. •List yeraze
  44. 44. svenwerk
  45. 45. HCE (SWEDEN) DEWEY DECIMAL CLASSIFICATION
  46. 46. KristianNorling
  47. 47. Author: Douglas Coupland. Title: Hej Nostradamus! Publisher: Norstedts. Year: 2003. Printed by: Smedjebacken. Printed: 2004. KristianNorling
  48. 48. Metadata Semantic KristianNorling
  49. 49. ESEO: Actionable activities. Example: Ernst & Young • Metadata • Titles • Content Quality • Information Life Cycle Management
  50. 50. Show me the Money: But, an average Search budget is 100K Euro • TCO • ROI • KPI. Search Analytics is key
  51. 51. Search Analytics: Important, delivers actionable to-dos quickly • 0-results • Top Terms Searched for. Video: Search Analytics in Practice
  52. 52. User Satisfaction • Feedback form • KPI from Search Analytics • Session time x no. of sessions = Time spent on search; Time spent on search x hourly price = Cost per “answer” • Add search refinements + exit page (= is the right answer)
  53. 53. Findability by Findwise: 1. BUSINESS: Build solutions to support your business processes and goals 2. INFORMATION: Prepare information to make it findable 3. USERS: Build usable solutions based on user needs 4. ORGANISATION: Govern and improve your solution over time 5. SEARCH TECHNOLOGY: Build solutions based on state-of-the-art search technology
  54. 54. Business • Analyze how your business goals and strategies can be met by improved information access • Set Findability goals. Examples: increase sales revenue, raise productivity, improve knowledge sharing, better collaboration • Specify your requirements • Define KPIs and measure the success of your investments
  55. 55. Information • Clean up and archive or delete outdated/irrelevant information • Ensure good quality of information by adding structured and suitable metadata • Create and use information models and taxonomies • Tagging?
  56. 56. Users • Get to know your users and their needs • Make sure your solution is easy to use • Perform continuous usability evaluations, like usage tests and expert evaluations • Make sure users find what they are looking for • Enable feedback loops for complaints, feedback and praise
  57. 57. Organisation • Resources! • Define processes, roles and routines to govern the solution • Perform Search Analytics • Create easy-to-use administration interfaces • Perform training, technical and editorial • Help publishers get started with processes for better findability
  58. 58. Search Technology • Select a suitable search platform or make the most of your current solution • Design your architecture with search-as-a-service in mind • Utilise the full potential of the selected technology
  59. 59. Kristian Norling Kristian Norling LinkedIn @kristiannorling @findwise findwise.com Findability Blog Slideshare Vimeo Newsroom
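
The precision and recall definitions on slides 33 to 36 can be sketched in a few lines of Python. This is a minimal illustration and not part of the original deck: the function names and the document IDs are hypothetical, and documents are assumed to be identified by unique strings.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) using the R, M, N notation from the slides.

    R = retrieved documents that are also relevant
    M = relevant documents
    N = retrieved documents
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    r = len(retrieved_set & relevant_set)
    n = len(retrieved_set)
    m = len(relevant_set)
    precision = r / n if n else 0.0  # R / N
    recall = r / m if m else 0.0     # R / M
    return precision, recall


def precision_at_n(ranked, relevant, n):
    """Precision over only the top n results of a ranked list (P@n).

    Assumption: the denominator is n even if fewer than n results exist.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    relevant_set = set(relevant)
    hits = sum(1 for doc in ranked[:n] if doc in relevant_set)
    return hits / n


# Hypothetical example: four results returned, three documents are relevant.
ranked_results = ["d1", "d2", "d3", "d4"]
relevant_docs = {"d1", "d3", "d5"}

precision, recall = precision_recall(ranked_results, relevant_docs)
print(f"precision={precision:.2f} recall={recall:.2f}")
print(f"P@2={precision_at_n(ranked_results, relevant_docs, 2):.2f}")
```

In this example two of the four retrieved documents are among the three relevant ones, so R = 2, N = 4 and M = 3, which gives a precision of 2/4 and a recall of 2/3.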

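
The search analytics ideas on slides 51 and 52 (zero-result queries, top terms searched for, and time spent on search multiplied by an hourly price) could be computed roughly as follows. The log format, a list of (query, result count) pairs, and all function names are assumptions made for illustration; a real search platform will expose its logs differently.

```python
from collections import Counter


def normalise(query):
    """Lower-case and trim a query so that trivial variants are counted together."""
    return query.strip().lower()


def zero_result_queries(log):
    """Queries that returned no hits, most frequent first."""
    counts = Counter(normalise(q) for q, hits in log if hits == 0)
    return counts.most_common()


def top_terms(log, k=10):
    """The k most frequent queries in the log."""
    counts = Counter(normalise(q) for q, _ in log)
    return counts.most_common(k)


def search_cost(avg_session_minutes, sessions, hourly_price):
    """Session time x number of sessions x hourly price, as on slide 52."""
    hours_spent = avg_session_minutes * sessions / 60
    return hours_spent * hourly_price


# Hypothetical log entries: (query text, number of results returned).
log = [
    ("vacation policy", 12),
    ("Vacation Policy", 8),
    ("expense report", 0),
    ("expence report", 0),
    ("expense report", 0),
]

print(zero_result_queries(log))
print(top_terms(log, k=2))
print(search_cost(avg_session_minutes=3, sessions=1000, hourly_price=40))
```

Zero-result queries are usually the quickest source of to-dos: a frequent query with no hits points either at missing content or, as with the misspelled "expence report" here, at a need for synonyms or spelling suggestions.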