Search Topology and Optimization
April 12, 2013
Mike Maadarani
SharePoint Architect

SharePoint 2013 Search Topology and Optimization


In this presentation, I explain the details of all search components, how to properly configure the search topology, and the options to extend the search farm in a hybrid “cloud/on-premises” scenario. The presentation explains what you need to consider when designing your search to handle your organization's needs. We will dive into scripting a high-availability search topology, keeping it healthy, and managing your day-to-day search operations.

Learn how to optimize your search for the best performance and search relevancy, to support reliable search applications.

Published in: Technology
  • On-premises SharePoint Server 2013 Enterprise Search portal: Local and remote search results are available. SharePoint Online search portal: Local search results are available.
  • Reverse proxy devices play a role in the secure configuration of a hybrid SharePoint Server 2013 deployment when inbound traffic from SharePoint Online needs to be relayed to your on-premises SharePoint Server 2013 farm. Windows Server 2012 with Web Application Proxy; F5 BIG-IP.
  • Reverse proxy devices play a role in the secure configuration of a hybrid SharePoint Server 2013 deployment when inbound traffic from SharePoint Online needs to be relayed to your on-premises SharePoint Server 2013 farm. Windows Server 2012 with Web Application Proxy; F5 BIG-IP. Two-way trust is needed.
  • Transcript of "SharePoint 2013 Search Topology and Optimization"

    1. 1. Search Topology and Optimization. April 12, 2013. Mike Maadarani, SharePoint Architect
    2. 2. Bio: Mike Maadarani. App Dev and Architecture for over 18 years (15 years Microsoft, 3 years with the “Other Guys”). Business focused on Enterprise Content Management, Publishing Sites, & Search. Technology focused on SharePoint, SQL Server and SharePoint Integration. Architect, trainer, and presenter. Blog: www.maadarani.com; mike@maadarani.com; @mikemaadarani
    3. 3. Agenda: SharePoint 2013 Search Overview; Architecture and Resource Utilization; Configuring SSA and PS; Topology Scenarios; Hybrid… Say What?; Relevancy, Query Builder, & Optimization; Closing and Q&A
    4. 4. Search in 2010: SharePoint 2010 Search Service Application. Crawl Component; Query Component; Query Engine; Property Store (SQL)
    5. 5. FAST Search for SharePoint 2010: FAST Content SSA; FAST Query SSA; FAST back-end components (managed separately). Extensibility: • Sandbox • Entity Extraction
    6. 6. … In SharePoint 2013: SharePoint 2013 Search Service Application. Components: Crawl Component; Content Processing Component; Index Component; Query Processing Component; Analytics Processing Component; Search Admin Component. Engines and stores: Content Pipeline; Query Pipeline; Indexing Engine; Query Engine; Analysis Engine; Property Store (SQL). Callouts: Entire index on local disk; Separate crawl and indexing; Link/query analysis & recommendations. Extensibility: • Web callout • Entity Extraction
    7. 7. SharePoint 2013 Search Architecture. Clients: SharePoint; SP Apps; Devices; Non-SP UX. Content sources: HTTP; File shares; SharePoint; User profiles; Lotus Notes; Documentum; Exchange folders; Custom - BCS. Public API. Search topology components.
    8. 8. Why is Search so important? “I just uploaded a document. Make it searchable, quick!” FAST
    9. 9. Why is Search so important? EASY
    10. 10. Why is Search so important? EASY
    11. 11. Why is Search so important? Search Driven Applications
    12. 12. Why is Search so important? Search Everything. “I can find ALL of Rob Ford’s hidden videos!”
    13. 13. Where does Search live in the farm? Windows services: SharePoint Search Host Controller service (hostcontrollerservice.exe): runtime/lifecycle control of search components (except crawler); SharePoint Server Search service (mssearch.exe, mssdmn.exe): Crawl Component. Processes: noderunner.exe: runtime environment for search components (except crawler), hosting the Admin Component, Query Processing Component, Content Processing Component, Index Component, and Analytics Processing Component; mssearch.exe and mssdmn.exe: still there, but only for the Crawl Component; hostcontrollerservice.exe: Host Controller. SharePoint app server: Search Service Instance: provisioning of the search service on each box. Search Service Application: SharePoint configuration entity.
    14. 14. Where do I host my components?
    15. 15. Query processing component (QPC). CPU load driving factors: QPS; query transformations. Network load driving factors: number of index partitions; size of queries and results. Example: 20 index partitions @ 20 QPS => 200/100 Mbit/s in/outbound. Chart: relative load impact on CPU, network, and disk by item count, DPS, and QPS. Source: http://social.technet.microsoft.com/wiki/contents/articles/16002.sharepoint-2013-capacity-planning-sizing-and-high-availability-for-search-in-spc172.aspx
    16. 16. Index component. CPU load driving factors: QPS and item count. Guidelines per index component @ 2 GHz CPU: 1M items: 5 QPS per CPU core; 5M items: 2 QPS per CPU core; 10M items: 1 QPS per CPU core. Disk load driving factors: QPS and item count; new content invalidates caches. Disk size: 500 GB @ 10M items per index component. Chart: relative load impact on CPU, network, and disk by item count, DPS, and QPS.
    17. 17. Crawl component. CPU load driving factors: documents per second; link discovery; crawl management. Network load driving factors: downloading items from content sources; passing items on to the CPC. Disk load: all documents are temporarily stored in the data folder. Chart: relative load impact on CPU, network, and disk by item count, DPS, and QPS.
    18. 18. Content processing component (CPC). CPU load driving factors: documents per second; document size and complexity; feature extraction. Estimate: 5-10 DPS per CPU core. Network load driving factors: documents per second; document size. Chart: relative load impact on CPU, network, and disk by item count, DPS, and QPS.
    19. 19. Analytics processing component (APC). CPU load driving factors: number of items; site activity. Disk load: local disk used for temporary storage; bulk load, so the primary concern is load isolation. Network load: same driving factors as for CPU load, plus network traffic increases when distributing the APC across multiple machines. Chart: relative load impact on CPU, network, and disk by item count, DPS, and QPS.
    20. 20. Search administration component. Low CPU and network load. Load increases with more components in the search topology. Chart: relative load impact on CPU, network, and disk by item count, DPS, and QPS.
    21. 21. Create your SSA
    22. 22. Small Search Topology
    23. 23. Fault tolerant small search topology: two hosts, each with one VM running Index and QPC and one VM running Admin, Crawl, CPC, and APC
    24. 24. Small search farm (up to 10M items). Diagram labels: Other SharePoint applications; Web front end; Admin, Crawl, CPC, APC; Index, QPC
    25. 25. Scaling from small to medium search topology
    26. 26. Extend your SSA
    27. 27. Medium Search Topology
    28. 28. Hybrid Search
    29. 29. Why Hybrid Search? • Hybrid SharePoint environment • Pieces of content distributed across multiple environments • Complexity due to multiple locations • Many top level domains requiring knowledge of where to go to locate the most relevant content • No single Enterprise Search Center for finding content • Lost user productivity and added frustration while trying to locate relevant content
    30. 30. Benefits • Provide integrated search results allowing for a single place to find content • One Enterprise Search center to reduce User Interface complexity • Query all of your SharePoint content at the same time • Allow O365 and On-Premises solutions to coexist • Provides a solution allowing customers to move to the cloud on their own terms • Reduce operational costs • Take advantage of newer SharePoint feature updates in O365 • Hybrid search solves many problems as data is moving from on-premises to O365
    31. 31. One-way outbound topology. On-premises SharePoint Server can query SharePoint Online. SharePoint Online can NOT query SharePoint on-prem. Diagram labels: WFE; SharePoint Server 2013 Farm; Hybrid search results; SharePoint Online; Office365 tenant; Site collection; Local search results only; Outbound; Inbound; Internet; Microsoft data center
    32. 32. One-way inbound topology. SharePoint Online can query SharePoint on-prem. On-premises SharePoint Server can NOT query SharePoint Online. Diagram labels: WFE; SharePoint Server 2013 Farm; Hybrid search results; SharePoint Online; Office365 tenant; Site collection; Local search results only; Outbound; Inbound; Internet; Microsoft data center; Reverse Proxy; DMZ
    33. 33. Two-way (bidirectional) topology. SharePoint Online can query SharePoint on-prem. On-premises SharePoint Server can query SharePoint Online. Diagram labels: WFE; SharePoint Server 2013 Farm; Hybrid search results; SharePoint Online; Office365 tenant; Site collection; Local search results only; Outbound; Inbound; Internet; Microsoft data center; Reverse Proxy; DMZ
    34. 34. Tweaking Your results
    35. 35. Challenges: Intent. “Where is my talk Project Plan?” “Are Documents held at the same place?” “I wonder if there are references from previous projects?” Different people have different intents. Query Rules help you handle intents. There is rarely a single right answer. Infrastructure Project
    36. 36. Authorities: SSA-level configuration. Sites that are important; sites with low intrinsic relevance. Takes ~24 hrs to propagate.
    37. 37. Authorities: Connected
    38. 38. Authorities: Connected. Setting an authority affects all sites connected through hyperlinks. Sites are weighted by distance to the authority.
    39. 39. Query Rules: Tune Search Results. Created at the SSA, Tenant, Site Collection or Site level.
    40. 40. Query Rules. Condition: when do I apply the rule? Action: what to do when the rule is matched? Publishing: when should the rule be active?
    41. 41. Query Rules • Exact match, beginning or end • Ad-hoc or term store dictionary • Match a regex (advanced) • Is this query more likely aimed at the following source…? • Do people mostly click on results of the following type…? • Show a promoted result • Show a block of results • Replace the core results with a different query
    42. 42. Query Builder: Dynamic ranking change; part of the query; results ranking
    43. 43. Query Builder
    44. 44. Configuration in the Conceptual Relevance Flow. Query: HR Employment quarterly report. Stages: Search Web Part; Query Processing Engine; Document Collection. For all queries: Authorities: Level 1: http://employment; Ranking model: {incorporate user ratings}. Thesaurus: HR → Human Resources. Best bets: HR Employment /HR/employment. (WORDS HR, Human Resources) AND (WORDS employees, employed) AND (WORDS quarterly, quarterlies) AND (WORDS report, reports, reported). Query Rule: {Terms} Quarterly Report → {Terms} ContentType=“reports”. Dynamic Reordering Rules: Quarterly Report → {prefer docs from http://reports}. Mixed Results for: • HR Employment best bet • HR Employment quarterly report • HR Employment ContentType=reports
    45. 45. Create a Query Rule – Hybrid. From the Result Source drop-down list, select the specified result source. Under “Query is performed on these sources”, if you select “One of these sources”, make sure to select the result source you created.
    46. 46. Hybrid Results: results from SharePoint Online; results from SharePoint Server
    47. 47. Session Objective and Takeaways: High Availability and Performance; Better Search Quality; Better management; Friendly results and tools
    48. 48. Thank You! www.maadarani.com, mike@maadarani.com , @mikemaadarani www.slideshare.net/maadarani
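
The sizing guidelines on the index component and content processing component slides (QPS per CPU core by item count, 500 GB of disk at 10M items per index component, 5-10 DPS per CPU core) lend themselves to a quick back-of-the-envelope calculation. The Python sketch below is only an illustration of that arithmetic and is not part of the deck. All function and constant names are hypothetical. It also assumes that every query is evaluated by every index partition, that item counts between guideline rows are rounded up to the next row rather than interpolated, and that disk need scales linearly with item count.

```python
import math
from typing import Tuple

# Guideline rows from the index component slide (per index component @ 2 GHz CPU):
# (max items in the component, QPS one CPU core can sustain).
QPS_PER_CORE_GUIDELINE = [(1_000_000, 5.0), (5_000_000, 2.0), (10_000_000, 1.0)]

# From the slides: up to 10M items and 500 GB of disk per index component.
MAX_ITEMS_PER_INDEX_COMPONENT = 10_000_000
DISK_GB_AT_MAX_ITEMS = 500.0

# From the CPC slide: an estimated 5-10 documents per second per CPU core.
CPC_DPS_PER_CORE_LOW = 5.0
CPC_DPS_PER_CORE_HIGH = 10.0


def index_partitions(total_items: int) -> int:
    """Partitions needed so that no index component holds more than 10M items."""
    return max(1, math.ceil(total_items / MAX_ITEMS_PER_INDEX_COMPONENT))


def qps_per_core(items_in_component: int) -> float:
    """Assumption: round up to the next guideline row instead of interpolating."""
    for max_items, qps in QPS_PER_CORE_GUIDELINE:
        if items_in_component <= max_items:
            return qps
    raise ValueError("guideline covers at most 10M items per index component")


def index_cores_per_component(total_items: int, target_qps: float) -> int:
    """Cores per index component, assuming every query hits every partition."""
    per_component = math.ceil(total_items / index_partitions(total_items))
    return math.ceil(target_qps / qps_per_core(per_component))


def index_disk_gb(items_in_component: int) -> float:
    """Assumption: disk need scales linearly from 500 GB at 10M items."""
    return DISK_GB_AT_MAX_ITEMS * items_in_component / MAX_ITEMS_PER_INDEX_COMPONENT


def cpc_cores(target_dps: float) -> Tuple[int, int]:
    """(optimistic, conservative) core counts for content processing."""
    return (
        math.ceil(target_dps / CPC_DPS_PER_CORE_HIGH),
        math.ceil(target_dps / CPC_DPS_PER_CORE_LOW),
    )
```

Under these assumptions, a 25M-item corpus needs 3 index partitions, and index_cores_per_component(25_000_000, 10) gives 10 cores per index component for a 10 QPS target. A crawl target of 20 DPS gives cpc_cores(20) == (2, 4). Treat the numbers as a starting point for load testing, not as a capacity guarantee.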