Examining the new search core in SharePoint 2013

Much has been written about Search in SharePoint 2013, and rightly so. But if you think the airtight FAST integration is all there is… think again! New cutting-edge technology has made its way into the platform, promising more flexibility, better performance and richer functionality.
But how is it different? What can we use it for? And how does all of this affect capacity planning, deployment, and day-to-day maintenance like resizing, coping with HW failures and keeping up with the seemingly always-changing business requirements? In this session we'll explore all of this, and more!

Published in: Technology

Transcript of "Examining the new search core in SharePoint 2013"

  1. Examining the new Search core in SP2013
     Marcus Johansson
     OSLO STOCKHOLM LONDON BOSTON SINGAPORE
  2. Marcus Johansson
     • Senior Consultant, Comperio
     • V-TSP Enterprise Search, Microsoft
     Email: marcus.johansson@comperiosearch.com
     Twitter: @marcjoha
     Blog: http://blog.comperiosearch.com
     LinkedIn: http://www.linkedin.com/in/marcusjohansson
  3. End of an era, birth of a New age
     • FAST now “fully integrated”
       – True, but there’s more!
     • No longer a “FAST license”
       – SP2013 contains everything
       – Enterprise version
     • Migration from FS4SP?
       – Brr…
     1997 – 2013
  4. The evolution of FAST
     [Diagram labels: FDS, ESP, FSIS, FSIA, FS4SP, Search in SP2010, Search in SP2013; “Secret sauce (incl. Mars)”]
  5. All this talk about the new Sheriff…
     • Search in SP2013 gets a lot of attention
       – Revamped user/admin interface
       – Hover panels, previews
       – Query rules, result blocks
       – Result types, display templates
       – “You’ve seen this result before”
       – Query Builder
       – Content Search web part
       – Etc.
     • Notice the pattern?
  6. …what Search in SP13 really is
     [Diagram boxes:]
     Empowering the whole SharePoint experience
     Better, more powerful extensibility
     Major user experience overhaul
     Finally a single search architecture
     Vastly improved search core
     • How come most of the buzz is about the UX?
  7. For the first time, Search isn’t defined by the nuts and bolts, but by the User Experience and high-level tools around it.
  8. Examining SharePoint 2013’s NEW SEARCH CORE
  9. Search architecture
     [Architecture diagram. Legend: Public API; Extensibility Points; Unit of scale/role boundary. Other labels: Crawl, Link, Analytics Reporting, Admin]
  10. The search components
      • A “node” is an instance of a component
      • Scale by adding nodes
  11. RESTful interfaces
      • Directly interact with SharePoint artifacts by using any technology supporting REST
      • Also:
        – CSOM: JavaScript, Silverlight
        – SSOM: Managed code
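As an illustration of the REST bullet above, here is a minimal sketch that builds a GET URL for the SharePoint 2013 Search REST endpoint (`/_api/search/query`) with its `querytext`, `rowlimit` and `selectproperties` parameters. The host name and the helper name `build_search_url` are assumptions made for the example; authentication and the actual HTTP call are left out.

```python
from urllib.parse import quote, urlencode


def build_search_url(site_url, query_text, row_limit=10, select_properties=None):
    """Build a GET URL for the Search REST endpoint (hypothetical helper)."""
    # String parameters are wrapped in single quotes; an embedded single
    # quote is doubled, following the OData string literal convention.
    escaped = query_text.replace("'", "''")
    params = {"querytext": f"'{escaped}'", "rowlimit": str(row_limit)}
    if select_properties:
        params["selectproperties"] = "'" + ",".join(select_properties) + "'"
    # Keep quotes and commas readable; spaces become %20.
    query = urlencode(params, quote_via=quote, safe="',")
    return site_url.rstrip("/") + "/_api/search/query?" + query


# Example with an assumed host name:
url = build_search_url(
    "https://intranet.example.com", "search core",
    row_limit=5, select_properties=["Title", "Path"],
)
print(url)
```

Any client that can issue an authenticated HTTP GET against such a URL can query the index, which is the point the slide makes about "any technology supporting REST".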
  12. The new Search Service Application
  13. Keeping it all together
      [Diagram labels: Services, Processes]
      Process name                Description
      hostcontrollerservice.exe   Process controller. Monitors and restarts children.
      noderunner.exe              A search component (except the crawl component).
      mssearch.exe                The crawl component.
  14. Crawl component
      [Diagram labels: mssearch.exe, Crawl]
      • Changes from SP2010
        – Only crawling
          • No indexing
        – Continuous crawl
          • Improves freshness
        – Crawl Log
          • More details
          • Document removal
        – Crawl Health Report
          • Huge improvement!
  15. Continuous crawls
      • Not event-driven indexing
      • Starts crawl regardless of prior crawl session
      • Large change sets no longer bad for freshness
      [Timeline diagram comparing “Full and incremental” with “Continuous” crawls; default interval 15 min; axis: time]
      • Only available for SharePoint content sources
        – Possible to crawl SP 2010 and 2007
  16. Crawl health reports
      [Screenshots: “Crawl rate per type”, “Crawl load”]
      Reports: Rate, Latency, Freshness, CPU and memory load, Content Processing activity, Etc.
  17. Crawl component performance
      • Anecdotal: feels faster, more stable
      • Bound by CPU and network
        – Documents per second
        – Link discovery
      • Some I/O – files temporarily stored on disk
      • Adjust performance by:
        – Crawler impact rules
        – Performance level (number of threads)
          Set-SPEnterpriseSearchService -PerformanceLevel X
  18. Content processing component
      [Diagram label: Link]
      • Schema mapping
        – Crawled → Managed properties
      • Entity extraction
        – Companies and custom
      • Advanced Filter Pack is gone
        – Though PDFs are out of the box
      • Extensible through web service
      • Internally: processing flows
        – Replaces pipeline in FS4SP
        – Based on FSIS/CTS. Hidden
  19. Content processing flows
      • Hidden in SP2013. In FSIS, flows could be created, modified and debugged in real-time.
      • Why on earth was this not included in SP2013!?
      [Screenshot caption: The flow designer in FSIS, not available in SP2013.]
  20. Processing flow execution
  21. Index component
      • Disk-based and atomic(!)
      • Divided into partitions
      • One partition per 10M docs
      • 1 partition contains 1+ replicas
        – fault-tolerance
        – query volume
      • 1 replica, 1 server
      • Indexing partially in-memory
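The sizing rules on this slide (one partition per 10M docs, one server per replica) can be turned into a rough back-of-the-envelope calculation. The sketch below is only that: the function name `index_topology` and the default of two replicas per partition are assumptions for illustration, not figures from the talk.

```python
DOCS_PER_PARTITION = 10_000_000  # "One partition per 10M docs" (from the slide)


def index_topology(total_docs, replicas_per_partition=2):
    """Return (partitions, index servers) for a given corpus size.

    replicas_per_partition is an assumed input: more replicas buy
    fault tolerance and query volume, at one server per replica.
    """
    if total_docs < 0 or replicas_per_partition < 1:
        raise ValueError("need total_docs >= 0 and at least one replica")
    # Integer ceiling division; even an empty index needs one partition.
    partitions = max(1, (total_docs + DOCS_PER_PARTITION - 1) // DOCS_PER_PARTITION)
    servers = partitions * replicas_per_partition  # "1 replica, 1 server"
    return partitions, servers


partitions, servers = index_topology(25_000_000, replicas_per_partition=2)
print(f"{partitions} partitions, {servers} index servers")
```

Real capacity planning also has to account for item size, query load and the other search components, so treat the result as a starting point only.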
  22. Example: Partitions and replicas
      [Diagram labels: “Same content” (replicas), “Different content” (partitions)]
  23. Query processing component
      • Prepares the queries
        – Query rules
        – Result sources
        – Linguistics/dictionaries
        – Etc.
      • Manipulates the results
        – Display templates
        – Late security trimming
        – Etc.
      • Internally: processing flows
        – Derived from FSIS/IMS. Again, this is hidden
        – Still MAJOR improvement compared to FS4SP
  24. Query rules
      • For a certain term → trigger certain action:
        – Add/change query terms
        – Use alternate sorting/relevance
        – Hybrid search (or other federated results)
        – Etc.
      • Replaces search keywords in SP2010
      • Configure at farm, site collection or site-level
      • Warning: Triggering the query rules engine comes with a penalty
        – Anecdotal tests: ~70ms + excl. parallel queries
  25. Query builder
      • Easily builds KQL
        – CSWP, result sources, query rules, etc.
  26. Query client types
      • Adjust throttling per client type
  27. Query health reports
      [Screenshot: “Latency per processing node in SharePoint flow”]
      Reports: Trend, Overall, Latency in main flow, Latency in each subflow, Index times, Etc.
  28. Analytics processing component
      [Diagram labels: Link, Analytics Reporting]
      • Analyzes crawled items and search usage
      • Updates index without re-indexing documents
      • Result: relevance becomes self-learning
        – Also: search reports and recommendations
  29. Type 1: Search analytics
      Influences relevance
      Type                Description
      Anchor processing   Comparable to Google PageRank.
      Click Distance      Number of clicks to an authoritative page.
      Search clicks       Keeps track of how users click in the results.
      Used in search center
      Type                Description
      Social tags         Tags that users apply to content. Not used by default, but could be integrated as e.g. refiners.
      Social distance     Used for sorting in People search.
      Deep links          Subsites that users click on are added as deep links on the top-site result.
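The slide describes click distance as the number of clicks to an authoritative page. One straightforward way to compute such a value is a breadth-first search over the link graph, starting from the authoritative pages. The sketch below illustrates that idea on a made-up site structure; it is not SharePoint's actual implementation, and the function name and sample URLs are hypothetical.

```python
from collections import deque


def click_distance(links, authoritative):
    """Breadth-first search: fewest clicks from any authoritative page.

    links maps a page to the pages it links to (illustrative data model).
    Pages that cannot be reached are simply absent from the result.
    """
    distance = {page: 0 for page in authoritative}
    queue = deque(authoritative)
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in distance:  # first visit is the shortest path
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance


# Hypothetical intranet link graph
links = {
    "/": ["/hr", "/it"],
    "/hr": ["/hr/policies"],
    "/it": ["/hr/policies", "/it/faq"],
}
print(click_distance(links, ["/"]))
```

A ranking model could then favor pages with a smaller distance, on the assumption that content close to an authoritative page is itself more likely to be relevant.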
  30. Type 2: SP usage analytics
      • Usage counts
        – Opened and viewed items
        – From all of SharePoint, not just search results
        – Improves relevance
      • Activity ranking
        – Looks for trends and boosts “hot” items
      • Recommendations
        – Looks for usage patterns within a site
        – “People who viewed this also viewed…”
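To make the "People who viewed this also viewed…" idea concrete, here is a minimal co-view counting sketch. It assumes usage events have already been grouped into per-user lists of viewed items; the grouping, the function name `also_viewed` and the sample data are all assumptions, and SharePoint's own recommendation logic is not documented in this talk.

```python
from collections import Counter, defaultdict
from itertools import permutations


def also_viewed(sessions, top_n=3):
    """Recommend items most often viewed together with each item."""
    co_views = defaultdict(Counter)
    for viewed in sessions:
        # Count every ordered pair of distinct items seen by the same user.
        for item, other in permutations(set(viewed), 2):
            co_views[item][other] += 1
    return {
        item: [other for other, _ in counts.most_common(top_n)]
        for item, counts in co_views.items()
    }


# Hypothetical usage data: one list of viewed items per user
sessions = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["a", "b"]]
recommendations = also_viewed(sessions)
print(recommendations["a"])
```

Plain co-occurrence counts favor popular items; a production system would normally normalize for popularity, which this sketch deliberately leaves out.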
  31. Search reports
      • Self-learning relevance aside, never underestimate manual effort!
        – Query rules, synonyms, boosts, etc.
      • Automatic reports:
        – Number of queries
        – Top queries
        – Abandoned queries
        – No-result queries
        – Query rule usage
  32. Search administration component
      [Diagram label: Admin]
      • Provisions other search components
      • Talks to Admin database on behalf of: Crawl, Content and Query processing components
      • In previous FAST products, it was impossible to make the admin component redundant
        – Not the case in SP2013!
        – Scale appropriately
  33. Hardware properties
      Component               CPU      Memory   Disk I/O   Network
      Crawl                   Medium   Medium   Medium     High
      Content processing      High, High, Medium [one cell blank on the slide; column not recoverable]
      Index                   High     High     High       Medium
      Query processing        Low, Medium, Medium [one cell blank on the slide; column not recoverable]
      Analytics processing    Medium   Medium   Medium     High
      Search administration   Low      Low      Low        Low
      • Special cases
        – Crawler temporarily stores files on disk
        – Memory usage of admin component increases with topology size
  34. Changes in HW requirements
      [Left column]
      • I/O bound, lots of IOPS!
      • VMs not recommended
      • Often issues with SANs
      • Thresholds:
        – 15M items/server
        – Tested at 500M items
      [Right column]
      • Still I/O-bound, but:
        – VMs are fine!
        – SANs are fine!
      • More RAM required, but:
        – Lower indexing latency
        – Lower search times
      • Thresholds:
        – 10M items/server
        – Tested at 500M items
  35. A note on RAM consumption
      • Search is a BIG thief of RAM in SP13
      • Memory limit configurable in:
        <15 hive>\Search\Runtime\1.0\noderunner.exe.config
        – Warning: Components may crash at limit
      • Safer options:
        – Decrease memory limit for the Distributed Cache service.
        – Tell your boss: “RAM is cheap. I’m not!”
  36. Questions?
      Email: marcus.johansson@comperiosearch.com
      Twitter: @marcjoha
      Blog: http://blog.comperiosearch.com
      LinkedIn: http://www.linkedin.com/in/marcusjohansson
      Thank you!
