  • Opportunity to fix up bad or missing metadata. FS4SP: the custom stage runs just before the mapper; C# by default; safe, has a timeout. ESP: any CLI code via stdin/stdout, even a .bat file (newsletter example). Why a pipeline: mark up documents with look-aside data (stock quotes); extend with sentiment analysis, auto-classification, etc. A very powerful capability for fixing content problems.
  • Format: filters, Outside In. Vector: similarity searching.
  • Which SharePoint Search is Right for You?

    SharePoint 101: Which SharePoint Search is Right for You?
      Miles Kehoe
      New Idea Engineering Inc.
      miles.kehoe@ideaeng.com
    SharePoint and FAST Search
      • Introduction
      • The Products
          • Grand Tour
          • Tech Details
          • What Really Counts
      • Mapping Business Requirements to Technology
          • Data
          • Capacities
          • Price
      • Search Resources
    New Idea Engineering Inc.
      Company Background
        • Founded in 1996
        • Headquarters: Santa Clara, California
        • Customers in Europe and North America
        • Vendor-neutral approach to search
        • Focus is on what is best for our clients
      Products and Consulting Services
        • Evaluation/Selection/Implementing Enterprise Search
        • Search Best Practices
        • SearchTrack Reporting & Analytics
        • Search Data Quality Toolkit
        • Enterprise Search Newsletter
    The Book
      • Published Fall 2010
      • Covers Search: Business, SharePoint, FS4SP, ESP 5.3, Trends
    SharePoint Search Products: The Grand Tour
    Naming Conventions
      Columns: Business Productivity (server/(e)CAL licensing) | Internet Business (server licensing)
      Integrated with SharePoint
        • High End: FAST Search Server 2010 for SharePoint | SharePoint Server 2010 for Internet Sites, Enterprise
        • Infrastructure: SharePoint Server 2010 | SharePoint Server 2010 for Internet Sites, Standard
        • Entry Level: SharePoint Foundation 2010
      Stand-Alone
        • High End: FAST Search Server 2010 for Internal Applications | FAST Search Server 2010 for Internet Sites
        • Infrastructure: Microsoft Search Server 2010
        • Entry Level: Microsoft Search Server 2010, Express
    The ‘SharePoint’ Codebase
      (Same product matrix as the Naming Conventions slide.)
    Product Names
    The Marketing Fantasy
      Columns: Solutions for Internet Business | Solutions for Business Productivity
      Integrated with SharePoint
        • FAST Search for SharePoint Internet Sites | FAST Search for SharePoint
        • SharePoint Server for Internet Sites | SharePoint Server
      Stand-alone
        • FAST Search for Internet Sites | FAST Search for Internal Applications
        • Search Server
        • Search Server Express
      Entry-Level Solutions
        • SharePoint Foundation
    In Summary
      • Two entry level (SPF, MSS-X), three infrastructure-tier (SP, SP-FIS, MSS), four high end (FS4SP, FS4SP-IS, FSIA, FSIS)
      • Four stand-alone (MSS-X, MSS, FSIA, FSIS), five integrated with SharePoint (SPF, SP, SP-FIS, FS4SP, FS4SP-IS)
      • Three intended/licensed for internally facing applications (SP, FSIA, FS4SP), three intended/licensed for externally facing applications (SP-FIS, FS4SP-IS, FSIS)
      • Six different images/media sets (SPF, MSS/MSS-X, SP/SP-FIS/SP-FIS-E, FS4SP/FS4SP-IS, FSIA, FSIS)
    Confused Yet?
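One way to keep the classification from the In Summary slide straight is to write it down as data and recount it. The sketch below is only an illustration: the `Product` record and its field names are assumptions made here, the abbreviations are the slide's own, and products the slide does not label as internally or externally facing are left as `None`.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Product:
    abbrev: str
    tier: str              # "entry", "infrastructure" or "high end"
    integrated: bool       # True = integrated with SharePoint, False = stand-alone
    facing: Optional[str]  # "internal", "external", or None where the slide does not say


# Classification copied from the "In Summary" slide.
PRODUCTS = [
    Product("SPF", "entry", True, None),
    Product("MSS-X", "entry", False, None),
    Product("SP", "infrastructure", True, "internal"),
    Product("SP-FIS", "infrastructure", True, "external"),
    Product("MSS", "infrastructure", False, None),
    Product("FS4SP", "high end", True, "internal"),
    Product("FS4SP-IS", "high end", True, "external"),
    Product("FSIA", "high end", False, "internal"),
    Product("FSIS", "high end", False, "external"),
]

# Recount each grouping named on the slide.
by_tier = Counter(p.tier for p in PRODUCTS)
by_integration = Counter("integrated" if p.integrated else "stand-alone" for p in PRODUCTS)
by_facing = Counter(p.facing for p in PRODUCTS if p.facing)

print(dict(by_tier))
print(dict(by_integration))
print(dict(by_facing))
```

The counts reproduce the slide's two/three/four, four/five and three/three splits across nine products.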
    The Products: SP and FS4SP Platforms
    High Level Overview: SP vs FS4SP
    SharePoint Server and FS4SP
      SP/FS4SP: two different codebases united by (mostly) common features
        • SharePoint environment
        • PowerShell
        • iFilters
        • OOB experience
      FS4SP adds
        • Index Pipeline
        • Deep facets
        • Geo/Location Search
        • FAST XRANK operator
        • Predictable scalability …
    SP and FS4SP: End users
    SP and FS4SP: IT
    SP and FS4SP: Developers
    Powerful FS4SP Features
      IT
        • Content processing pipeline
        • Entity extraction
        • Tunable relevance ranking
      Developers
        • Add custom property extractors
        • Extend content processing
        • Include external data in relevance
        • Build multiple relevance profiles
      User
        • Deep refiners with counts
        • Context-based user profile
        • Multiple relevance profiles
        • Sorting on any property
        • Similarity Search
        • Broader, better language support
        • Richer query language
    Indexing Pipeline
      Diagram labels: …, Format Conversion, Language Detection, Entity Extraction, Lemmatization, Mapper, FS4SP Index
      FS4SP
        • Stages coded in .NET
        • Configure via UI or PowerShell
        • Custom stages before ‘Mapper’
        • Runs in sandbox w/ timeout
      ESP
        • Stages coded in Python (any CLI language OK)
        • Configure via XML config file
        • Custom stages allowed anywhere
        • Runs in-line
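The idea on the Indexing Pipeline slide, a custom stage that runs just before the mapper and is cut off if it takes too long, can be sketched in a few lines. This is a toy model and not the FS4SP or ESP API: every function name here is hypothetical, the stages are stand-ins, and a worker thread with a timeout only approximates a real sandbox.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as StageTimeout


def format_conversion(doc):
    # Stand-in for a real format filter: just normalise whitespace.
    doc["text"] = " ".join(doc["raw"].split())
    return doc


def language_detection(doc):
    # Hypothetical placeholder; a real stage would inspect the text.
    doc["language"] = "en"
    return doc


def mapper(doc):
    # Final stage: keep only the fields destined for the index.
    return {k: doc[k] for k in ("text", "language", "title") if k in doc}


def fix_missing_title(doc):
    # Custom stage: repair bad or missing metadata before mapping.
    if not doc.get("title"):
        doc["title"] = doc["text"][:40]
    return doc


def slow_stage(doc):
    # Custom stage that takes too long; used to show the timeout path.
    time.sleep(0.3)
    doc["title"] = "late"
    return doc


def run_pipeline(doc, stages, custom_stage=None, timeout_s=1.0):
    """Run stages in order; the custom stage runs just before the last stage."""
    *early, last = stages
    for stage in early:
        doc = stage(doc)
    if custom_stage is not None:
        pool = ThreadPoolExecutor(max_workers=1)
        # Hand the stage a copy so a timed-out stage cannot alter the document.
        future = pool.submit(custom_stage, dict(doc))
        try:
            doc = future.result(timeout=timeout_s)
        except StageTimeout:
            pass  # Skip the slow custom stage and keep the unmodified document.
        finally:
            pool.shutdown(wait=False)
    return last(doc)


STAGES = [format_conversion, language_detection, mapper]
print(run_pipeline({"raw": "  Hello   world  "}, STAGES, fix_missing_title))
print(run_pipeline({"raw": "a  b"}, STAGES, slow_stage, timeout_s=0.05))
```

The first call fills in a missing title before the mapper runs; the second shows the document passing through unchanged when the custom stage exceeds its time budget.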
    Typical Content Pipeline Stages
      Column headings: Default, Optional
        • Format Conversion
        • Language and encoding detection
        • Lemmatizer
        • Tokenizer
        • Entity Extraction
        • DateTimeNormalizer
        • Vectorizer
        • WebAnalyzer
        • PropertiesMapper
        • PropertiesReporter
        • XML Properties mapper
        • Offensive Content Filter
        • Verbatim (whole word) extractor: loads dictionary for custom extraction, e.g. product names
        • Field Collapsing
        • Entity Extraction: ‘Persons1’
    The Role of the Pipeline
      The Content Pipeline: processing & refinement
      Extracted properties shown: Date, Lookaside, Location, Company
      Sample document:
        REDMOND, Wash., and OSLO, Norway - Jan. 8, 2008
        Microsoft Corp. (Nasdaq “MSFT”) today announced that it will make an offer to acquire Fast Search & Transfer ASA (OSE: “FAST”), a leading provider of enterprise search solutions, through a cash tender offer for 19.00 Norwegian kroner (NOK) per share. This offer represents a 42 percent premium to the closing share price on Jan. 4, 2008 (the last trading day prior to this announcement), …
      Diagram labels: Format Conversion, Language Detection, Entity Extraction, Configurable Stages, …, Mapper
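To make the Role of the Pipeline slide concrete, here is a minimal sketch of pulling Date, Location and Company properties out of the press-release text. It assumes small hand-built dictionaries and a simple date pattern; the names `DICTIONARIES`, `DATE_RE` and `extract_entities` are invented for this example, the press release is abridged, and the matching is plain substring search rather than anything FS4SP or ESP actually does.

```python
import re

# Abridged excerpt of the press release shown on the slide.
PRESS_RELEASE = (
    "REDMOND, Wash., and OSLO, Norway - Jan. 8, 2008 "
    "Microsoft Corp. today announced that it will make an offer to acquire "
    "Fast Search & Transfer ASA, a leading provider of enterprise search solutions. "
    "This offer represents a 42 percent premium to the closing share price on Jan. 4, 2008."
)

# Hypothetical dictionaries; a real deployment would load these from files.
DICTIONARIES = {
    "company": ["Microsoft Corp.", "Fast Search & Transfer ASA"],
    "location": ["REDMOND, Wash.", "OSLO, Norway"],
}

# Matches dates written like "Jan. 8, 2008".
DATE_RE = re.compile(
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\.? \d{1,2}, \d{4}\b"
)


def extract_entities(text, dictionaries=DICTIONARIES, blacklist=frozenset()):
    """Return dictionary terms found in the text, minus blacklisted ones, plus dates."""
    found = {}
    for entity_type, terms in dictionaries.items():
        found[entity_type] = [t for t in terms if t in text and t not in blacklist]
    found["date"] = DATE_RE.findall(text)
    return found


print(extract_entities(PRESS_RELEASE))
print(extract_entities(PRESS_RELEASE, blacklist={"OSLO, Norway"}))
```

The extracted values would then travel with the document as extra properties, which is what lets them show up later as refiners or sort keys.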
    The ‘Virtual Document’
      The initial document text content plus:
        • Explicit metadata (Title/Author/etc.)
        • Implicit metadata (Path/repository/filename)
        • Look-aside content from pipeline (synonyms/taxonomies/other mark-up)
        • Anything you can add to ‘make the needle bigger’
    Entity Extraction
      FS4SP
        • Create/Edit config files
        • Update files (Persons/Places/Things)
        • Wait for system to update (5 minutes)
        • Voila!!
      ESP
        • Create/Edit dictionary file
        • Compile dictionary w/ ‘dictupdate’
        • Names, Companies, Job titles
        • Samples provided in many languages
        • Whitelist and blacklist
        • Bazinga!
    Deep Refiners
      SP
        • Shallow refiners only
        • No count provided
        • Only managed properties/metadata
        • Uses top 50 results to populate refiners
      FS4SP
        • Provides document counts with refiners
        • All refiners shown (based on config)
    FS4SP Result List
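The difference between shallow and deep refiners described on the Deep Refiners slide is easy to see with a small model. This is a sketch only: the top-50 sample size comes from the slide, while the function names, the `filetype` property and the sample data are made up for illustration.

```python
from collections import Counter


def shallow_refiner(results, prop, sample_size=50):
    """Distinct values seen in the top results only, with no counts."""
    seen = []
    for doc in results[:sample_size]:
        value = doc.get(prop)
        if value is not None and value not in seen:
            seen.append(value)
    return seen


def deep_refiner(results, prop):
    """Every value across the full result set, with document counts."""
    return Counter(doc[prop] for doc in results if prop in doc)


# Hypothetical ranked result set: 50 Word documents followed by 10 PDFs.
results = [{"filetype": "docx"}] * 50 + [{"filetype": "pdf"}] * 10

print(shallow_refiner(results, "filetype"))
print(dict(deep_refiner(results, "filetype")))
```

Because the PDFs all rank below position 50, the shallow refiner never offers them as a choice, whereas the deep refiner lists both values along with their counts.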
    ESP Platforms
    FAST ESP
      FAST ESP 5.3
        • Multiple platform support
        • Extreme scalability
        • Petabytes of content
        • >10K QPS
      FSIA = ESP
        • Identical to FAST ESP 5.3
        • ‘Internal Applications’
        • License based on ECAL + server
      FSIS = FAST ESP 5.3 + IMS/CTS
        • IMS/CTS: Windows only (Visual Studio)
        • ‘Internet Sites’
        • License based on servers
    ESP Platform
      Search development environment
        • ‘Some assembly required’
        • Not based on SharePoint
        • Connectors and WebParts available
        • No OOB end-user UI
        • Separate user and IT consoles
        • Fully customizable pipeline
      Development Tools
        • Java/Python/PostgreSQL
        • GUI and CLI (no PowerShell support)
        • Petabytes of content
        • 10K+ QPS
    ESP Admin Console
    ESP Search Business Console
    Search View: Results
    Search View: Debugging
    Search View: Debug Output
    New FSIS Products
      Content Transformation Services (CTS)
        • Design workflows for indexing from multiple sources
      Interaction Management Services (IMS)
        • Manage query/results processing
      CTS and IMS
        • Based on Visual Studio: Windows only!
        • Licensed/intended for FSIS ONLY (for now)
    Content Transformation Services
    Interaction Management Services
    ESP Customer: Best Buy
    ESP Customer: Financial Times
    But… What’s really important?
      What does your customer expect?
    Mapping Business Requirements to the Technology
    Identify business rules for facets/refiners
      If refiners are a business need:
        • Choose to index the appropriate metadata fields; or
        • Upgrade your data
      If refiners are not a business need:
        • Push the business rules into the 21st century
        • Use what you have; or
        • Update your content
    The Data Audit
      Understand the data
      Repositories
        • Where does the content live?
        • Is there security involved?
      Documents & document structure
        • Do documents have good metadata?
        • Do you need to extract data?
        • Are there recognizable blocks of content?
      Lightweight publishing content
        • Look at email/wikis/blogs/support calls…
    Great search doesn’t just happen
      Do search owners understand?
      Staffing Expectations
        • Is there a search manager? An SCOE?
        • How many people will be involved day to day?
      How to manage search?
        • Review activity logs
        • Update and manage best bets, new content
        • Evangelize
      User Skills
        • Are users knowledge workers or casual searchers?
        • Is search business critical?
    So what is the right answer?
      It depends
        • Internal or external facing search?
        • SharePoint or stand-alone?
        • Casual users or knowledge workers?
        • Resources for managing search?
        • Search box or search-derived application?
        • What’s the risk of missing content?
      That’s why your customer needs you!
    Resources
      • Search Dev Newsgroup: www.SearchDev.org
      • Newsletter & Whitepapers: www.ideaeng.com/current, www.ideaeng.com/wp
      • Blog: EnterpriseSearchBlog.com
    Questions/Follow-Up
      Miles Kehoe
      mbk@ideaeng.com
      www.ideaeng.com
