SharePoint 2013 introduced the search schema, which allows for more granular permissions than the managed property model in SharePoint 2010. The search schema can be managed by site administrators, reducing the load on search administrators. It also enables more granular configuration for query, retrieval, refinement, sorting, and other functions. Remote result sources can now be crawled locally and queried from other farms, improving geo-distributed search capabilities. Individual items can also be re-crawled more easily. Automatic URL balancing in crawl databases further improves scalability for large archive repositories. Upcoming changes to scalability limits will affect farm design for large archive content repositories.
MongoDB and Hadoop: Driving Business Insights (MongoDB)
MongoDB and Hadoop can work together to solve big data problems facing today's enterprises. We will take an in-depth look at how the two technologies complement and enrich each other with complex analyses and greater intelligence. We will take a deep dive into the MongoDB Connector for Hadoop and how it can be applied to enable new business insights with MapReduce, Pig, and Hive, and demo a Spark application to drive product recommendations.
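The Spark demo itself isn't included in this abstract. As a rough illustration of the kind of product-recommendation logic such a job might distribute across a cluster, here is a minimal pure-Python co-occurrence sketch; the function name and sample data are invented for illustration, and a real job would express the same counting as Spark transformations over data read through the MongoDB Connector for Hadoop.

```python
from collections import Counter

def co_occurrence_recommendations(orders, product, top_n=3):
    """Count how often other products appear in the same order as
    `product`, and return the most frequent co-purchases."""
    counts = Counter()
    for order in orders:
        if product in order:
            for other in order:
                if other != product:
                    counts[other] += 1
    return [p for p, _ in counts.most_common(top_n)]

# Invented sample order data.
orders = [
    ["laptop", "mouse", "bag"],
    ["laptop", "mouse"],
    ["mouse", "pad"],
    ["laptop", "bag"],
]
print(co_occurrence_recommendations(orders, "laptop"))  # ['mouse', 'bag']
```

The same shape of computation (filter, flat-map to pairs, reduce by key, top-N) maps directly onto Spark's RDD or DataFrame API.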
Bringing it All Together: Apache Metron (Incubating) as a Case Study of a Mod... (DataWorks Summit)
There have been many voices discussing how to architect streaming applications on Hadoop, but until now there have been very few worked examples in open source. Apache Metron (Incubating) is a streaming advanced-analytics cybersecurity application that uses the components of the Hadoop stack as its platform.
We will go beyond theoretical discussions of Kappa vs. Lambda architectures and describe the nuts and bolts of a streaming architecture that enables advanced analytics in Hadoop. We will discuss the componentry we had to build and what we could reuse, why we made the architectural decisions we made, and how they knit together into a coherent application on top of many different Hadoop ecosystem projects.
We will also discuss the domain-specific language we created out of necessity to provide a pluggable layer for user-defined enrichments, and how this helped make Metron less rigid and easier to use. We will also candidly discuss mistakes that we made early on.
Boosting Documents in Solr by Recency, Popularity, and User Preferences (Lucidworks, Archived)
Presentation on how to boost and/or filter documents by recency, popularity, and personal preferences, with access to source code. My solution improves upon the common "recip"-based approach to boosting by document age.
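The common recency boost referenced here is Solr's function query `recip(ms(NOW, date_field), 3.16e-11, 1, 1)`, where `recip(x, m, a, b) = a / (m*x + b)` and 3.16e-11 is roughly the reciprocal of one year in milliseconds. A quick pure-Python sketch of what that function computes (the field name and sample ages are assumptions for illustration):

```python
def recip(x, m, a, b):
    """Solr's recip() function query: a / (m*x + b)."""
    return a / (m * x + b)

MS_PER_YEAR = 365 * 24 * 3600 * 1000  # ~3.15e10 ms

for years_old in (0, 1, 5):
    age_ms = years_old * MS_PER_YEAR
    boost = recip(age_ms, 3.16e-11, 1, 1)
    print(f"{years_old} year(s) old -> boost {boost:.3f}")
# 0 year(s) old -> boost 1.000
# 1 year(s) old -> boost 0.501
# 5 year(s) old -> boost 0.167
```

A brand-new document gets a boost near 1.0, which roughly halves after a year; the talk's point is that pure age decay like this ignores popularity and user preference signals.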
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation (BIOVIA)
Accelrys Catalog is a powerful new technology for creating an index of the protocols and components within your organization. You will learn about strategies for indexing and how search capabilities can be deployed to professional client and Web Port end users. You will also learn how to use this technology to find out about system usage to aid with system upgrades, server consolidations, and general system maintenance. The protocol validation capability in the admin portal allows administrators to create standard reports on server usage characteristics. You will learn how to report on violations of IT policies (e.g. around security), bad protocol authoring practices, or missing or incomplete protocol documentation. Developers will also learn how to extend and customize the rules used to create these reports.
Integrate Solr with real-time stream processing applications (thelabdude)
Storm is a real-time distributed computation system used to process massive streams of data. Many organizations are turning to technologies like Storm to complement batch-oriented big data technologies, such as Hadoop, to deliver time-sensitive analytics at scale. This talk introduces an emerging architectural pattern of integrating Solr and Storm to process big data in real time. There are a number of natural integration points between Solr and Storm, such as populating a Solr index or supplying data to Storm using Solr’s real-time get support. In this session, Timothy will cover the basic concepts of Storm, such as spouts and bolts. He’ll then provide examples of how to integrate Solr into Storm to perform large-scale indexing in near real time. In addition, we'll see how to embed Solr in a Storm bolt to match incoming tuples against pre-configured queries, a pattern commonly known as the percolator. Attendees will come away from this presentation with a good introduction to stream processing technologies and several real-world use cases of how to integrate Solr with Storm.
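The "percolator" pattern mentioned above (registering queries up front and matching each incoming document against them, rather than indexing documents and running queries) can be sketched independently of Storm or Solr. A toy keyword-based version follows; the class and query names are invented, and a real bolt would use an embedded Solr core instead of set containment:

```python
class Percolator:
    """Toy percolator: register queries up front, then match each
    incoming document (a tuple's text payload) against all of them."""

    def __init__(self):
        self.queries = {}  # query_id -> set of required terms

    def register(self, query_id, terms):
        self.queries[query_id] = set(terms)

    def percolate(self, text):
        # Return the ids of all registered queries the text satisfies.
        tokens = set(text.lower().split())
        return [qid for qid, terms in self.queries.items()
                if terms <= tokens]

p = Percolator()
p.register("outage-alert", ["service", "down"])
p.register("security", ["breach"])
print(p.percolate("Payment service is down in eu-west"))  # ['outage-alert']
```

In a Storm topology, `percolate` would run inside a bolt's `execute` method, emitting an alert tuple for each matching query id.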
Sumo Logic's first How-To webinar focused on optimizing our users' search experience. The webinar covers the following:
- Developing good search habits
- Setting the proper expectations around search performance
- The factors related to search speed
- Creating field extraction rules
- Defining a partitioning strategy
- Configuring scheduled views
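Field extraction rules like those in the list above parse named fields out of raw log lines at ingest time so later searches can filter on them cheaply. A minimal regex-based sketch of the idea (the log format, rule, and field names are invented for illustration, not Sumo Logic's actual rule syntax):

```python
import re

# Hypothetical access-log line; the "rule" names the pieces we want
# as regex capture groups.
RULE = re.compile(
    r'(?P<ip>\d+\.\d+\.\d+\.\d+) .* "(?P<method>\w+) (?P<path>\S+)'
    r'.*" (?P<status>\d{3})'
)

def extract_fields(line):
    """Apply the extraction rule; return a dict of named fields."""
    m = RULE.search(line)
    return m.groupdict() if m else {}

line = '10.0.0.7 - - [29/May/2024] "GET /checkout HTTP/1.1" 500 1042'
print(extract_fields(line))
# {'ip': '10.0.0.7', 'method': 'GET', 'path': '/checkout', 'status': '500'}
```

Extracting fields once at ingest, rather than re-parsing raw text in every query, is exactly the kind of habit the webinar's search-performance advice is about.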
With Search, developers and data engineers can run more relevant and responsive queries on the data in Hadoop and integrate with external tools to build custom real-time applications.
Python Awareness for Exploration and Production Students and ProfessionalsYohanes Nuwara
This was presented in a series of webinars organized virtually by SPE Asia Pacific University and SPE Northern Emirates in Malaysia. In this webinar, I gave a motivational presentation for students and young professionals on programming education for the petroleum engineering and geoscience domains, and showcased some of my open-source work written in Python.
10 Things I Like in SharePoint 2013 Search (SPC Adriatics)
Speaker: Agnes Molnar
Based on my SharePoint and FAST Search experience, I’ll demonstrate my “Research Path” through SharePoint 2013 Search: what’s new, what improvements we can find there, and how to apply our existing Search knowledge and experience to SharePoint 2013 Search.
You will learn:
Config options in SharePoint 2013 Search – Central Admin vs. PowerShell
Crawled and Managed Properties across Content Sources
Ranking and Relevancy
I2 - SharePoint Hybrid Search Start to Finish - Thomas Vochten (SPS Paris)
One of the most compelling additions to a SharePoint practitioner’s toolbox is hybrid search. Although hybrid search capabilities were already around for a few years, with the introduction of the “Cloud Search Service Application” things got a lot more interesting. This demo-heavy session will focus on the technical implementation details and their prerequisites, as well as the typical hurdles that you’ll face in your first hybrid search project.
This presentation explains the details of all search components, how to properly configure your search topology, and your options for extending your search farm in a hybrid “cloud/on-prem” scenario. You will learn what you need to consider when designing your search to meet your organization's needs. We will dive into scripting a high-availability search topology, keeping it healthy, and managing your day-to-day search operations.
Learn how to optimize your search for best performance and relevancy, to support reliable search applications. Together, we will review where Search lives in the farm and the crawl components needed to implement a scalable farm.
SharePoint Search Topology and Optimization (Mike Maadarani)
This presentation covers the architecture of the SharePoint Search topology, how to extend search, and how to optimize your search farm for better results. It describes how you can build your search topology with PowerShell commands and explains how to use Query Rules and the Query Builder for great search results.
SharePoint 2013 Search Topology and Optimization (Mike Maadarani)
In this presentation, I explain the details of all search components, how to properly configure the search topology, and the options for extending the search farm in a hybrid “cloud/on-premises” scenario. This presentation will explain what you need to consider when designing your search to meet your organization's needs. We will dive into scripting a high-availability search topology, keeping it healthy, and managing your day-to-day search operations.
Learn about how to optimize your search for best performance and search relevancy, to support reliable search applications.
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to market, combined with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today, organizations feel more susceptible to external and internal cyber threats due to the vast attack surface of their application supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerabilities and security breaches. This needs to be achieved with existing toolchains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
My and Rik Marselis's slides from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We also held a lovely workshop with the participants, exploring different ways to think about quality and testing in different parts of the DevOps infinity loop.
Key Trends Shaping the Future of Infrastructure (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Generating a custom Ruby SDK for your web service or Rails API using Smithy (g2nightmarescribd)
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to part 4 of the UiPath Test Automation using UiPath Test Suite series. In this session, we will cover a Test Manager overview along with the SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... (Ramesh Iyer)
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. Fostering a culture of innovation takes work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 (Albert Hoitingh)
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
JMeter webinar - integration with InfluxDB and Grafana (RTTS)
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
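When JMeter's Backend Listener ships metrics to InfluxDB, each data point travels as InfluxDB line protocol (`measurement,tags fields timestamp`). A minimal sketch of formatting such a point; the measurement, tag, and field names here are invented for illustration, and real line protocol also requires escaping special characters and quoting string field values:

```python
def influx_line(measurement, tags, fields, ts_ns):
    """Format one point in InfluxDB line protocol:
    measurement,tag1=v1,... field1=v1,... timestamp_ns
    (No escaping/quoting handled; numeric fields only.)"""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

point = influx_line(
    "jmeter",                                         # measurement
    {"application": "shop", "transaction": "login"},  # tags
    {"count": 42, "pct95.0": 317.0},                  # fields
    1717000000000000000,                              # ns timestamp
)
print(point)
# jmeter,application=shop,transaction=login count=42,pct95.0=317.0 1717000000000000000
```

Grafana then queries these series from InfluxDB to draw the real-time dashboards shown in the webinar.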
SharePoint 2013 Search Architecture with Russ Houberg
5. SharePoint 2010 vs. SharePoint 2013 terminology:
   SharePoint 2010                 SharePoint 2013
   Managed Property                (Multiple) Search Schemas
   Best Bets                       Promoted Results (Query Rule)
   Scope and Federated Location    Result Source
   Content By Query                Content By Search
   Incremental Crawl               Continuous Crawl
   MCM                             MCSM
8. Continuous Crawl Benefits:
   • No more waiting for index merge
   • Does not wait for other crawls to complete
   • Can have multiple continuous crawls running simultaneously
   • Continuous crawls ignore errors
   Continuous Crawl Facts:
   • Runs every 15 minutes by default
   • Default interval can be changed with PowerShell
   • Should be used instead of incremental crawls for SharePoint content sources
14. [Diagram: search architecture. Content sources (HTTP, file share, user profile, SharePoint, other) feed the crawl component, which uses the crawl database(s) and hands items to the content processing component. The content processing component writes to the link database and feeds the index component's index partition(s). End-user or content-process-initiated queries go to the query processing component, which queries the index component. The analytics processing component reads the link database and event store and writes to the analytics database.]
15. Crawl Component
    What it Does:
    • Crawls content sources to populate the index
    • Delivers crawl items (binary) and metadata to the content processor
    • Invokes connectors or protocol handlers to interact with content sources to retrieve data
    • Uses one or more crawl databases to store info about crawl items and crawl history
    Important Facts:
    • We can have multiple crawl components
    • MS Recommends: 2 Crawl Components per Search Service Application
    • MS Recommends: 8(4vm) CPU / 8GB RAM per Crawl Component
16. Content Processing Component
    What it Does:
    • Processes crawl items and feeds them to the index component
    • Transforms crawl items into artifacts that can be included in the search index (performs document parsing and property mapping)
    • Writes information about links and URLs to the link database (these are analyzed by analytics to calculate relevance and currency; results are written back to the search index by the content processing component)
    • Generates phonetic name variations to improve people search
    Important Facts:
    • We must only have one (1) content processing component per server; more will hurt, not help, crawl performance
    • Max of 2 per search service application
    • Feeding sessions are scaled based on CPU cores using a default coefficient of 3: 8 cores × 3 = 24 feeding sessions; 4 cores × 3 = 12 feeding sessions
    • MS Recommends: 8(4vm) CPU / 8GB RAM per Content Processing Component
    • Feeding sessions require RAM; more RAM is necessary when more cores are present (monitoring required)
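The feeding-session arithmetic on this slide (CPU cores times a default coefficient of 3) is simple enough to sketch; the function name is invented for illustration:

```python
def feeding_sessions(cpu_cores, coefficient=3):
    """Content processing feeding sessions scale with CPU cores,
    using a default coefficient of 3 (per the slide)."""
    return cpu_cores * coefficient

print(feeding_sessions(8))  # 24
print(feeding_sessions(4))  # 12
```

Since each feeding session consumes RAM, more cores imply more sessions and therefore more memory, which is why the slide calls for monitoring.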
17. Analytics Processing Component
    What it Does:
    • Runs analytics jobs that analyze crawl items and user interaction with search results to perform both search analytics and usage analytics
    • Analyzes link & anchor text, click distance, search clicks, deep links, social tags, social distance, search reports, recommendations, usage counts, and activity ranking
    • Improves search relevance and creates search results
    • Output is included in the search index by the content processor
    Important Facts:
    • Maximum of 6 per search service application
    • Add more Analytics Processing Components to improve analytics performance
    • MS Recommends: 8(4vm) CPU / 8GB RAM / 300GB disk space per Analytics Processing Component
    • Interacts with Analytics Reporting to store statistical information
    • Interacts with the Link database to store information about searches and crawled documents
18. Index Component
    What it Does:
    • Receives processed items from the content processing component and writes the items to the index file
    • Receives queries from the query processing component and returns result sets
    • Redistributes content among index partitions when the index architecture is changed by the Search Administration Component
    Important Facts:
    • Maximum of 60 index components (20 index partitions × 3 index replicas) per search service application
    • Must provision one Index Component for each index replica
    • MS Recommends: 8(4vm) CPU / 16GB RAM / 500GB disk space per Index Component
19. Index Architecture
    • An index partition is a logical portion of the entire search index (same as before)
    • An index partition is served by one or more index components
    • Index components can be the primary "replica" or a secondary "replica"
    • The primary replica is contacted by the content processing component to write new data into the index
    • A secondary replica is a read-only copy that gets updated with the data
    • Adding replicas improves query performance under load
    • Add partitions to handle an increased content corpus
    • A partition can't be removed after it has been added
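Partitioning an index means routing each document to exactly one partition, with replicas of that partition serving reads. A generic hash-routing sketch of the idea follows; this is an illustration of the technique, not SharePoint's actual (internal, undocumented) routing:

```python
import hashlib

def partition_for(doc_id, num_partitions):
    """Route a document to one index partition by hashing its id.
    A generic scheme for illustration; SharePoint's routing differs."""
    h = int(hashlib.md5(doc_id.encode()).hexdigest(), 16)
    return h % num_partitions

# Hash routing spreads a corpus roughly evenly across partitions.
docs = ["doc-%d" % i for i in range(1000)]
counts = [0] * 4
for d in docs:
    counts[partition_for(d, 4)] += 1
print(counts)  # roughly even split across the 4 partitions
```

Note how the slide's warning falls out of this scheme: changing `num_partitions` changes the routing of nearly every document, which is why adding (and especially removing) partitions forces a redistribution of the index.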
20. Query Processing Component
    What it Does:
    • Analyzes and processes queries and results
    • After receiving a query, analyzes and processes it to optimize precision, recall, and relevance
    • Submits processed queries to the index component
    • Processes the result set returned by the index component before returning it to the querying entity
    Important Facts:
    • Maximum of 1 per server
    • MS Recommends: 8(4vm) CPU / 8GB RAM per Query Processing Component
24. [Diagram: sample fault-tolerant search farm across six hosts. Hosts 1, 2, 5, and 6 run web servers, Office Web Apps, and application servers; hosts 3 and 4 run application servers with query processing, the two replicas of index partition 0, and the crawl, admin, analytics, and content processing components. All SharePoint databases (search admin db, crawl db, link db, analytics db, config db, and all other SharePoint databases) are kept redundant using SQL clustering, mirroring, or SQL Server 2012 AlwaysOn.]
25. [Diagram: medium search farm across hosts A-H. Application servers on hosts A-F run query processing, admin, crawl, analytics, and content processing components, with replicas of index partitions 0-3 spread across host pairs. Hosts G and H hold the SharePoint databases (search admin db, crawl db, link db, analytics db), kept redundant using SQL clustering, mirroring, or SQL Server 2012 AlwaysOn.]
26. [Diagram: large search farm across hosts A-R. Replicas of index partitions 0-9 are spread across application servers on hosts A-N, alongside query processing, analytics, content processing, crawl, and admin components. Hosts O-R hold the SharePoint databases (search admin db, link db, analytics db, and multiple crawl dbs), kept redundant using SQL clustering, mirroring, or SQL Server 2012 AlwaysOn.]
28. • The schema can be managed by site admins, reducing the load on the search administrator
    • The schema can be configured to allow more granularity (query, retrieve, refine, sort, etc.); affects content index size
    • Remote result sources can be crawled locally and then queried by remote farms. Huge impact on geo-distributed search… KL may be able to help!
    • Individual items can be re-crawled easily
    • Automatic URL balancing in crawl databases minimizes host name restrictions for large archive repositories
    • Scalability limit changes will have a big impact on farm design for large archive content repositories in the near future