November 23rd, 2013

Search Topology and Optimization

Mike Maadarani
SharePoint Architect
mike@maadarani.com
SharePoint Search Topology and Optimization

This presentation covers the architecture of the SharePoint search topology, how to extend search, and how to optimize your search farm for better results. It describes how to build your search topology with PowerShell commands and explains how to use Query Rules and the Query Builder to get great search results.


  1. November 23rd, 2013. Search Topology and Optimization. Mike Maadarani, SharePoint Architect, mike@maadarani.com
  2. Thank you to all of our Sponsors!!
  3. Bio: Mike Maadarani. App Dev and Architecture for over 18 years (15 years Microsoft, 3 years with the “Other Guys”). Business focused on Enterprise Content Management & Publishing Sites. Technology focused on SharePoint, SQL Server and SharePoint Integration. Architect, trainer, and presenter. Blog: www.maadarani.com; mike@maadarani.com; @mikemaadarani
  4. Agenda: Architecture and Resource Utilization; Configuring SSA and PS; SharePoint 2013 Search Overview; Topology Scenarios; Relevancy, Query Builder, & Optimization; Closing and Q&A
  5. Search in 2010. SharePoint 2010 Search Service Application: Query Component, Query Engine, Property Store (SQL), Crawl Component
  6. FAST Search for SharePoint 2010. FAST Content SSA; FAST Query SSA; FAST back-end components (managed separately). Extensibility: Sandbox, Entity Extraction
  7. … In SharePoint 2013. SharePoint 2013 Search Service Application. Components: Crawl Component, Content Processing Component, Index Component, Query Processing Component, Analytics Processing Component, Admin Component. Diagram labels: Crawl, Content Pipeline, Indexing Engine, Query Engine, Query Pipeline, Analysis Engine, Search Admin, Property Store (SQL). Extensibility: Web callout, Entity Extraction. Notes: separate crawl and indexing; link/query analysis & recommendations; entire index on local disk
  8. SharePoint 2013 Search Architecture. Public API. Search topology components. Content sources: HTTP, File shares, SharePoint, User profiles, Lotus Notes, Documentum, Exchange folders, Custom - BCS. Search clients: SharePoint, SP Apps, Devices, Non-SP UX
  9. Why is Search so important? FAST: “I just uploaded a document. Make it searchable, quick!”
  10. Why is Search so important? EASY
  11. Why is Search so important? EASY
  12. Why is Search so important? Search Driven Applications
  13. Why is Search so important? Search Everything: “I can find ALL of Rob Ford’s hidden videos!”
  14. Where does Search live in the farm? Windows services: the SharePoint Search Host Controller service (hostcontrollerservice.exe) provides runtime/lifecycle control of search components (except crawler); the SharePoint Server Search service (mssearch.exe, mssdmn.exe) is still there, but only for the Crawl Component. Processes: noderunner.exe is the runtime environment for search components (except crawler), with one noderunner.exe each for the Admin Component, Query Processing Component, Content Processing Component, Index Component and Analytics Processing Component. Search Service Instance: provisioning of the search service on each box. Search Service Application: SharePoint configuration entity
  15. Where do I host my components?
  16. Query processing component (QPC). CPU load driving factors: QPS; query transformations. Network load driving factors: number of index partitions; size of queries and results. Example: 20 index partitions @ 20 QPS => 200/100 Mbit/s in/outbound. (Relative load impact chart over: item count, CPU, DPS, network, QPS, disk.)
  17. Index component. CPU load driving factors: QPS and item count. Guidelines per index component @ 2 GHz CPU: 1M items, 5 QPS per CPU core; 5M items, 2 QPS per CPU core; 10M items, 1 QPS per CPU core. Disk load driving factors: QPS and item count; new content invalidates caches. Disk size: 500 GB @ 10M items per index component. (Relative load impact chart over: item count, CPU, DPS, network, QPS, disk.)
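The per-core guidelines on slide 17 can be turned into a rough sizing estimate. The following is a minimal sketch, not part of the presentation: the function names are hypothetical, the lookup simply picks the smallest guideline tier that covers the item count, and the linear disk scaling is an assumption extrapolated from the single 500 GB @ 10M items figure.

```python
import math

# Guideline tiers from slide 17 (per index component, 2 GHz CPU):
# (maximum items per index component, sustainable QPS per CPU core)
QPS_PER_CORE_TIERS = [
    (1_000_000, 5.0),
    (5_000_000, 2.0),
    (10_000_000, 1.0),
]

DISK_GB_AT_10M_ITEMS = 500  # slide 17: 500 GB at 10M items per index component


def qps_per_core(items: int) -> float:
    """Return the guideline QPS per core for the smallest tier covering `items`."""
    for max_items, qps in QPS_PER_CORE_TIERS:
        if items <= max_items:
            return qps
    # The slide gives no guideline beyond 10M items per index component.
    raise ValueError("more than 10M items per index component")


def cores_needed(items: int, target_qps: float) -> int:
    """CPU cores one index component needs to serve `target_qps`, rounded up."""
    return math.ceil(target_qps / qps_per_core(items))


def disk_gb(items: int) -> float:
    """Disk estimate, ASSUMING the 500 GB @ 10M items guideline scales linearly."""
    return DISK_GB_AT_10M_ITEMS * items / 10_000_000
```

For instance, `cores_needed(5_000_000, 10)` uses the 2 QPS per core tier, so ten queries per second would call for five cores under these guidelines.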
  18. Crawl component. CPU load driving factors: documents per second; link discovery; crawl management. Network load driving factors: downloading items from content sources; passing items on to CPC. Disk load: all documents are temporarily stored in the data folder. (Relative load impact chart over: item count, CPU, DPS, network, QPS, disk.)
  19. Content processing component (CPC). CPU load driving factors: documents per second; document size and complexity; feature extraction. Estimate: 5-10 DPS per CPU core. Network load driving factors: documents per second; document size. (Relative load impact chart over: item count, CPU, DPS, network, QPS, disk.)
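The 5-10 DPS per CPU core estimate on slide 19 gives a quick way to bound how long a full crawl might take. This is a back-of-the-envelope sketch only: the function name is hypothetical, and it assumes content processing is the bottleneck and that throughput scales linearly with cores, neither of which the slides state.

```python
def full_crawl_hours(documents, cores, dps_per_core_range=(5.0, 10.0)):
    """Return (fastest, slowest) full-crawl duration in hours.

    ASSUMPTIONS: content processing is the only bottleneck and
    documents-per-second scales linearly with the number of CPU cores.
    The default range is the 5-10 DPS per core estimate from slide 19.
    """
    low_dps, high_dps = dps_per_core_range
    slowest = documents / (low_dps * cores) / 3600   # pessimistic end of the range
    fastest = documents / (high_dps * cores) / 3600  # optimistic end of the range
    return fastest, slowest
```

With 1.8 million documents and 10 cores, the estimate is between 5 and 10 hours; real crawls also depend on the content sources and network, so treat the result as an order of magnitude.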
  20. Analytics processing component (APC). CPU load driving factors: number of items; site activity. Disk load: local disk used for temporary storage; bulk load, primary concern is load isolation. Network load: same as for CPU load, PLUS network traffic increases when distributing APC across multiple machines. (Relative load impact chart over: item count, CPU, DPS, network, QPS, disk.)
  21. Search administration component. Low CPU and network load. Load increases with more components in the search topology. (Relative load impact chart over: item count, CPU, DPS, network, QPS, disk.)
  22. Create your SSA
  23. Small Search Topology
  24. Fault tolerant small search topology. Diagram labels, in extraction order: Admin, CPC, Crawl, Index, APC, VM, Host, Admin, CPC, Crawl, VM, QPC, Index, APC, VM, QPC, VM, Host
  25. Small search farm (up to 10M items). Diagram labels: Other SharePoint applications, Web front end, Admin, CPC, Index, Crawl, APC, QPC
  26. Scaling from small to medium search topology. Diagram labels: Adm, Adm
  27. Extend your SSA
  28. Medium Search Topology
  29. Tweaking Your results
  30. Challenges: Intent. Query: Infrastructure Project. What the user might mean: “Where is my talk Project Plan?”; “Are Documents held at the same place?”; “I wonder if there are references from previous projects?” There is rarely a single right answer. Different people have different intents. Query Rules help you handle intents.
  31. Configuration in the Conceptual Relevance Flow. Query: HR Employment quarterly report. Search Web Part. Query Processing: Thesaurus: HR → Human Resources; Best bets: HR Employment → /HR/employment; Dynamic Reordering Rules: Quarterly Report → {prefer docs from http://reports}; Query Rule: {Terms} Quarterly Report → {Terms} ContentType=“reports”. Expanded query: (WORDS HR, Human Resources) AND (WORDS employees, employed) AND (WORDS quarterly, quarterlies) AND (WORDS report, reports, reported). Mixed results for: HR Employment best bet; HR Employment quarterly report; HR Employment ContentType=reports. Engine, Document Collection. For all queries: Authorities: Level 1: http://employment; Ranking model: {incorporate user ratings}
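The thesaurus step on slide 31 can be pictured as a rewrite of each query term into a `WORDS(...)` group of equivalent terms. The sketch below only illustrates that idea as a string builder; it is not how SharePoint implements query processing, and the function name, the dictionary layout and the sample synonyms are assumptions made for the example.

```python
def expand_query(query: str, thesaurus: dict[str, list[str]]) -> str:
    """Rewrite each query term into a WORDS(...) group when synonyms exist.

    `thesaurus` maps a lower-cased term to its synonyms (hypothetical layout).
    Multi-word synonyms are quoted so they stay together as a phrase.
    """
    clauses = []
    for term in query.split():
        variants = [term] + thesaurus.get(term.lower(), [])
        if len(variants) > 1:
            quoted = [f'"{v}"' if " " in v else v for v in variants]
            clauses.append("WORDS(" + ", ".join(quoted) + ")")
        else:
            clauses.append(term)  # no synonyms: keep the term as typed
    return " AND ".join(clauses)


# Sample thesaurus entries taken from the expansions shown on the slide.
SAMPLE_THESAURUS = {
    "hr": ["Human Resources"],
    "report": ["reports", "reported"],
}
```

Calling `expand_query("HR quarterly report", SAMPLE_THESAURUS)` yields `WORDS(HR, "Human Resources") AND quarterly AND WORDS(report, reports, reported)`, mirroring the expanded form on the slide.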
  32. Authorities: SSA-level configuration. Takes ~24 hrs to propagate. Sites that are important. Sites with low intrinsic relevance
  33. Authorities: Connected
  34. Authorities: Connected. Setting an authority affects all sites connected through hyperlinks. Sites are weighted by distance to the authority
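"Weighted by distance to the authority" can be read as a shortest-path computation over the hyperlink graph. The sketch below shows that idea with a breadth-first search; it is an illustration under assumptions, not SharePoint's ranking code. The function name and the dictionary-of-links input are hypothetical, and how a distance is turned into a ranking weight is left out because the slides do not say.

```python
from collections import deque


def click_distance(links, authorities):
    """Return the minimum number of link hops from any authority to each page.

    `links` maps a page to the pages it links to (hypothetical input format).
    Pages that cannot be reached from an authority are absent from the result.
    """
    distance = {page: 0 for page in authorities}
    queue = deque(authorities)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:  # first visit is the shortest path in BFS
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance
```

Pages closer to an authoritative page get a smaller distance, which is the quantity the slide says sites are weighted by.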
  35. Query Rules: Tune Search Results. Created at the SSA, Tenant, Site Collection or Site. (Diagram: SSA, Site Collection, Site.)
  36. Query Rules. Condition: When do I apply the rule? Action: What to do when the rule is matched? Publishing: When should the rule be active?
  37. Query Rules. Conditions: exact match, beginning or end; ad-hoc or term store dictionary; match a regex (advanced); is this query more likely aimed at the following source…?; do people mostly click on results of the following type…? Actions: show a promoted result; show a block of results; replace the core results with a different query
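The condition / action / publishing split from slides 36 and 37 maps naturally onto a small data structure. The following is a conceptual model only: the class, the helper functions and the action strings are hypothetical and do not correspond to the SharePoint object model.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass
class QueryRule:
    name: str
    condition: Callable[[str], bool]  # "When do I apply the rule?"
    action: str                       # "What to do when the rule is matched?"
    start: Optional[date] = None      # publishing window; None means open-ended
    end: Optional[date] = None

    def is_published(self, today: date) -> bool:
        """True when `today` falls inside the rule's publishing window."""
        after_start = self.start is None or self.start <= today
        before_end = self.end is None or today <= self.end
        return after_start and before_end


def exact_match(phrase):
    """Condition: the whole query equals `phrase` (case-insensitive)."""
    return lambda query: query.strip().lower() == phrase.lower()


def starts_with(phrase):
    """Condition: the query begins with `phrase` (case-insensitive)."""
    return lambda query: query.strip().lower().startswith(phrase.lower())


def matches_regex(pattern):
    """Condition: the query contains a match for the regular expression."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return lambda query: compiled.search(query) is not None


def matching_actions(rules, query, today):
    """Return the actions of all published rules whose condition matches."""
    return [r.action for r in rules if r.is_published(today) and r.condition(query)]
```

A rule built with `exact_match("hr employment")` and an action such as a promoted result would fire for the query "HR Employment", while a rule whose publishing window has ended is skipped even when its condition matches.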
  38. Query Builder: Dynamically Ranking Change; Part of the query; Results Ranking
  39. Query Builder
  40. Session Objective and Takeaways: High Availability and Performance; Better Search Quality; Better management; Friendly results and tools
  41. Q&A. Thank You / Merci. www.maadarani.com, mike@maadarani.com, @mikemaadarani
  42. Remember to fill out your evaluation forms to win some great prizes! & Join us for SharePint today! Date & Time: Nov 23rd, 2013 @ 6:00 pm. Location: The Observatory Pub, Algonquin Student’s Association. Address: A-170 on Algonquin Campus. Parking: No need to move your car!* Site: http://www.algonquinsa.com/ob.aspx *Please drive responsibly! We are happy to call you a cab