SharePoint 2013 Search Architecture with Russ Houberg
 

    Presentation Transcript

    • SharePoint 2010 term → SharePoint 2013 term
        • Managed Property → (Multiple) Search Schemas
        • Best Bets → Promoted Results (Query Rule)
        • Scope and Federated Location → Result Source
        • Content By Query → Content By Search
        • Incremental Crawl → Continuous Crawl
        • MCM → MCSM
    • Continuous Crawl Benefits:
        • No more waiting for index merge
        • Does not wait for other crawls to complete
        • Can have multiple continuous crawls running simultaneously
        • Continuous crawls ignore errors
      Continuous Crawl Facts:
        • Runs every 15 minutes by default
        • Default interval can be changed with PowerShell
        • Should be used instead of incremental crawls for SharePoint content sources
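The point that continuous crawls start on a fixed interval without waiting for earlier crawls can be illustrated with a little arithmetic. The sketch below is only a model of that statement, not SharePoint code: the function name, the two-hour horizon, and the assumption that every crawl takes the same time are all hypothetical.

```python
from datetime import timedelta


def peak_concurrent_crawls(
    crawl_duration: timedelta,
    interval: timedelta = timedelta(minutes=15),  # default interval from the slide
    horizon: timedelta = timedelta(hours=2),      # hypothetical observation window
) -> int:
    """Return the largest number of crawls running at the same time.

    A new crawl starts every `interval` whether or not earlier crawls
    have finished, and each crawl is assumed to take `crawl_duration`.
    """
    starts = []
    t = timedelta(0)
    while t < horizon:
        starts.append(t)
        t += interval

    peak = 0
    for instant in starts:
        # The count can only rise when a crawl starts, so checking the
        # start instants is enough to find the peak.
        running = sum(
            1 for start in starts if start <= instant < start + crawl_duration
        )
        peak = max(peak, running)
    return peak


if __name__ == "__main__":
    print(peak_concurrent_crawls(timedelta(minutes=40)))
```

With a 40-minute crawl and the default 15-minute interval, the model shows several crawls overlapping, which is the behavior the benefits list describes.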
    • [Diagram: search architecture overview. Content sources: HTTP, File Share, User Profile, SharePoint, Other. Queries: End User Query or Content Process Initiated Query. Components: Crawl Component, Content Processing Component, Analytics Processing Component, Index Component, Query Processing Component. Stores: Crawl Database(s), Link Database, Analytics Database, Event Store, Index Partition(s).]
    • What it Does:
        • Crawls content sources to populate index
        • Delivers crawl items (binary) and metadata to content processor
        • Invokes connectors or protocol handlers to interact with content sources to retrieve data
        • Uses one or more crawl databases to store info about crawl items and crawl history
      Important Facts:
        • We can have multiple crawl components
        • MS Recommends: 2 Crawl Components per Search Service Application
        • MS Recommends: 8(4vm) CPU / 8GB RAM per Crawl Component
    • What it Does:
        • Processes crawl items and feeds to index component
        • Transforms crawl items into artifacts that can be included in search index (performs document parsing and property mapping)
        • Writes information about links and URLs in link database (which are analyzed by analytics to calculate relevance and currency; results are written back to search index by content processing component)
        • Generates phonetic name variations to improve people search
      Important Facts:
        • We must only have one (1) crawl processing component per server: more will hurt, not help crawl performance
        • Max of 2 per search service application
        • Feeding Sessions are scaled based on CPU cores using a default coefficient of 3
            • 8 (cores) * 3 = 24 feeding sessions
            • 4 (cores) * 3 = 12 feeding sessions
        • MS Recommends: 8(4vm) CPU / 8GB RAM per Content Processing Component
        • Feeding sessions require RAM: more RAM is necessary when more cores are present, so monitoring is required
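The feeding-session rule on this slide is plain multiplication, sketched below. The function name is hypothetical; the coefficient of 3 and the two worked cases come from the slide.

```python
def feeding_sessions(cpu_cores: int, coefficient: int = 3) -> int:
    """Feeding sessions for a content processing component.

    Follows the slide's rule: sessions = CPU cores * coefficient,
    with a default coefficient of 3.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return cpu_cores * coefficient


if __name__ == "__main__":
    for cores in (4, 8):
        print(f"{cores} cores -> {feeding_sessions(cores)} feeding sessions")
```

Because each session needs RAM, a server with more cores gets more sessions and therefore needs more memory, which is why the slide calls for monitoring.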
    • What it Does:
        • Runs analytics jobs that analyze crawl items and user interaction with search results to perform both search analytics and usage analytics
        • Analyzes: Link & Anchor text, Click distance, Search Clicks, Deep Links, Social Tags, Social Distance, Search Reports, Recommendations, Usage Counts, Activity Ranking
        • Improves search relevance and creates search results
        • Output included in search index by content processor
      Important Facts:
        • Maximum of 6 per search service application
        • Add more Analytics Processing Components to improve analytics performance
        • MS Recommends: 8(4vm) CPU / 8GB RAM / 300GB disk space per Analytics Processing Component
        • Interacts with Analytics Reporting database to store statistical information
        • Interacts with Link database to store information about searches and crawled documents
    • What it Does:
        • Receives processed items from content processing component and writes the items to the index file
        • Receives queries from the query processing component and returns result sets
        • Redistributes content among index partitions when index architecture is changed by Search Administration Component
      Important Facts:
        • Maximum of 60 index components (20 index partitions x 3 index replicas) per search service application
        • Must provision one Index Component for each index replica
        • MS Recommends: 8(4vm) CPU / 16GB RAM / 500GB disk space per Index Component
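Since one index component has to be provisioned per replica, the component count for a planned index is partitions times replicas. A minimal sketch of that arithmetic, using the 20-partition and 3-replica figures from the slide as limits (the function and constant names are hypothetical):

```python
MAX_PARTITIONS = 20              # per search service application, from the slide
MAX_REPLICAS_PER_PARTITION = 3   # from the slide's "20 x 3" figure


def index_components_needed(partitions: int, replicas_per_partition: int) -> int:
    """Number of index components to provision: one per replica."""
    if not 1 <= partitions <= MAX_PARTITIONS:
        raise ValueError(f"partitions must be between 1 and {MAX_PARTITIONS}")
    if not 1 <= replicas_per_partition <= MAX_REPLICAS_PER_PARTITION:
        raise ValueError(
            f"replicas per partition must be between 1 and {MAX_REPLICAS_PER_PARTITION}"
        )
    return partitions * replicas_per_partition


if __name__ == "__main__":
    print(index_components_needed(4, 2))
    print(index_components_needed(MAX_PARTITIONS, MAX_REPLICAS_PER_PARTITION))
```

The second call reproduces the slide's ceiling of 60.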
    • Index Architecture
        • Index partition is a logical portion of the entire search index (same as before)
        • Index partition is served by one or more index components
        • Index components can be primary "replica" or secondary "replica"
        • Primary Replica is contacted by content processing component to write new data in the index
        • Secondary Replica is a read-only copy that gets updated with the data
        • Adding replicas improves query performance under load
        • Add partitions to handle increased content corpus
        • Cannot remove a partition after it has been added
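The primary and secondary replica roles can be pictured with a toy in-memory model. This is a sketch for illustration only, not how SharePoint implements its index: the class names, the substring matching, and the round-robin choice of replica for queries are all assumptions.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    items: dict = field(default_factory=dict)  # doc id -> text


class IndexPartition:
    """Toy partition: the first replica is primary, the rest are secondary."""

    def __init__(self, replica_names):
        if not replica_names:
            raise ValueError("a partition needs at least one replica")
        self.replicas = [Replica(name) for name in replica_names]
        # Assumption: queries are spread round-robin over all replicas.
        self._next_replica = itertools.cycle(range(len(self.replicas)))

    @property
    def primary(self) -> Replica:
        return self.replicas[0]

    def write(self, doc_id: int, text: str) -> None:
        # New data goes to the primary replica first...
        self.primary.items[doc_id] = text
        # ...and the read-only secondaries are then updated with it.
        for secondary in self.replicas[1:]:
            secondary.items[doc_id] = text

    def query(self, term: str):
        replica = self.replicas[next(self._next_replica)]
        hits = sorted(
            doc_id
            for doc_id, text in replica.items.items()
            if term.lower() in text.lower()
        )
        return replica.name, hits


if __name__ == "__main__":
    partition = IndexPartition(["primary", "secondary"])
    partition.write(1, "SharePoint 2013 search architecture")
    print(partition.query("search"))
    print(partition.query("search"))
```

Two consecutive queries land on different replicas, which is the sense in which adding replicas helps query performance under load.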
    • What it Does:
        • Analyzes and processes queries and results
        • After receiving a query, it analyzes and processes the query to optimize precision, recall and relevance
        • Submits processed queries to the index component
        • Processes the result set returned by the index component before returning to the querying entity
      Important Facts:
        • Maximum of 1 per server
        • MS Recommends: 8(4vm) CPU / 8GB RAM per Query Processing Component
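Several of the per-server and per-application maximums quoted on the component slides can be checked mechanically against a planned topology. The sketch below is a hypothetical helper, not a SharePoint API: it assumes the topology is a plain dictionary of server name to component names, and it encodes only a subset of the limits (one query processing and one content processing component per server, six analytics components and sixty index components per search service application).

```python
from collections import Counter

# Limits taken from the component slides; the component labels are made up here.
PER_SERVER_MAX = {"query_processing": 1, "content_processing": 1}
PER_APPLICATION_MAX = {"analytics": 6, "index": 60}


def check_topology(topology: dict) -> list:
    """Return human-readable problems found in a {server: [components]} plan."""
    problems = []
    totals = Counter()
    for server, components in sorted(topology.items()):
        counts = Counter(components)
        totals.update(counts)
        for component, limit in PER_SERVER_MAX.items():
            if counts[component] > limit:
                problems.append(
                    f"{server}: {counts[component]} {component} components "
                    f"(max {limit} per server)"
                )
    for component, limit in PER_APPLICATION_MAX.items():
        if totals[component] > limit:
            problems.append(
                f"farm: {totals[component]} {component} components "
                f"(max {limit} per search service application)"
            )
    return problems


if __name__ == "__main__":
    plan = {
        "HostA": ["query_processing", "index", "content_processing"],
        "HostB": ["query_processing", "query_processing", "index"],
    }
    for problem in check_topology(plan):
        print(problem)
```

An empty list means the plan stays within the limits encoded here; it says nothing about the CPU, RAM, or disk recommendations.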
    • Search service and host:
        Get-SPEnterpriseSearchService, Set-SPEnterpriseSearchService
        Get-SPEnterpriseSearchServiceInstance, Set-SPEnterpriseSearchServiceInstance, Start-SPEnterpriseSearchServiceInstance, Stop-SPEnterpriseSearchServiceInstance
        Get-SPEnterpriseSearchStatus, Get-SPEnterpriseSearchHostController, Set-SPEnterpriseSearchPrimaryHostController
        Get-SPEnterpriseSearchOwner, Get-SPEnterpriseSearchVssDataPath
      Search service application:
        Get-SPEnterpriseSearchServiceApplication, New-SPEnterpriseSearchServiceApplication, Set-SPEnterpriseSearchServiceApplication, Remove-SPEnterpriseSearchServiceApplication
        Restore-SPEnterpriseSearchServiceApplication, Resume-SPEnterpriseSearchServiceApplication, Suspend-SPEnterpriseSearchServiceApplication, Upgrade-SPEnterpriseSearchServiceApplication
        Get-SPEnterpriseSearchServiceApplicationProxy, New-SPEnterpriseSearchServiceApplicationProxy, Set-SPEnterpriseSearchServiceApplicationProxy, Remove-SPEnterpriseSearchServiceApplicationProxy
        Backup-SPEnterpriseSearchServiceApplicationIndex, Restore-SPEnterpriseSearchServiceApplicationIndex, Get-SPEnterpriseSearchServiceApplicationBackupStore
        Upgrade-SPEnterpriseSearchServiceApplicationSiteSettings, Remove-SPEnterpriseSearchServiceApplicationSiteSettings
      Query and site settings service:
        Get-SPEnterpriseSearchQueryAndSiteSettingsService, Get-SPEnterpriseSearchQueryAndSiteSettingsServiceProxy
        Get-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance, Start-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance, Stop-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance
      Topology and components:
        New-SPEnterpriseSearchTopology, Set-SPEnterpriseSearchTopology, Remove-SPEnterpriseSearchTopology, Import-SPEnterpriseSearchTopology, Export-SPEnterpriseSearchTopology
        Get-SPEnterpriseSearchComponent, Remove-SPEnterpriseSearchComponent
        New-SPEnterpriseSearchAdminComponent, New-SPEnterpriseSearchCrawlComponent, New-SPEnterpriseSearchIndexComponent, New-SPEnterpriseSearchQueryProcessingComponent, New-SPEnterpriseSearchAnalyticsProcessingComponent
      Crawl:
        Get-SPEnterpriseSearchCrawlContentSource, New-SPEnterpriseSearchCrawlContentSource, Set-SPEnterpriseSearchCrawlContentSource, Remove-SPEnterpriseSearchCrawlContentSource
        Get-SPEnterpriseSearchCrawlCustomConnector, New-SPEnterpriseSearchCrawlCustomConnector, Remove-SPEnterpriseSearchCrawlCustomConnector
        Get-SPEnterpriseSearchCrawlDatabase, New-SPEnterpriseSearchCrawlDatabase, Set-SPEnterpriseSearchCrawlDatabase, Remove-SPEnterpriseSearchCrawlDatabase
        Get-SPEnterpriseSearchCrawlExtension, New-SPEnterpriseSearchCrawlExtension, Remove-SPEnterpriseSearchCrawlExtension
        Get-SPEnterpriseSearchCrawlMapping, New-SPEnterpriseSearchCrawlMapping, Remove-SPEnterpriseSearchCrawlMapping
        Get-SPEnterpriseSearchCrawlRule, New-SPEnterpriseSearchCrawlRule, Set-SPEnterpriseSearchCrawlRule, Remove-SPEnterpriseSearchCrawlRule
        Set-SPEnterpriseSearchCrawlLogReadPermission, Remove-SPEnterpriseSearchCrawlLogReadPermission
        Get-SPEnterpriseSearchSiteHitRule, New-SPEnterpriseSearchSiteHitRule, Remove-SPEnterpriseSearchSiteHitRule
      Links database:
        Get-SPEnterpriseSearchLinksDatabase, New-SPEnterpriseSearchLinksDatabase, Set-SPEnterpriseSearchLinksDatabase, Remove-SPEnterpriseSearchLinksDatabase
        Move-SPEnterpriseSearchLinksDatabases, Repartition-SPEnterpriseSearchLinksDatabases
      Metadata (search schema):
        Get-SPEnterpriseSearchMetadataCategory, New-SPEnterpriseSearchMetadataCategory, Set-SPEnterpriseSearchMetadataCategory, Remove-SPEnterpriseSearchMetadataCategory
        Get-SPEnterpriseSearchMetadataCrawledProperty, New-SPEnterpriseSearchMetadataCrawledProperty, Set-SPEnterpriseSearchMetadataCrawledProperty
        Get-SPEnterpriseSearchMetadataManagedProperty, New-SPEnterpriseSearchMetadataManagedProperty, Set-SPEnterpriseSearchMetadataManagedProperty, Remove-SPEnterpriseSearchMetadataManagedProperty
        Get-SPEnterpriseSearchMetadataMapping, New-SPEnterpriseSearchMetadataMapping, Set-SPEnterpriseSearchMetadataMapping, Remove-SPEnterpriseSearchMetadataMapping
      Query and ranking:
        Get-SPEnterpriseSearchQueryAuthority, New-SPEnterpriseSearchQueryAuthority, Set-SPEnterpriseSearchQueryAuthority, Remove-SPEnterpriseSearchQueryAuthority
        Get-SPEnterpriseSearchQueryDemoted, New-SPEnterpriseSearchQueryDemoted, Remove-SPEnterpriseSearchQueryDemoted
        Get-SPEnterpriseSearchQueryKeyword, New-SPEnterpriseSearchQueryKeyword, Set-SPEnterpriseSearchQueryKeyword, Remove-SPEnterpriseSearchQueryKeyword
        Get-SPEnterpriseSearchQueryScope, New-SPEnterpriseSearchQueryScope, Set-SPEnterpriseSearchQueryScope, Remove-SPEnterpriseSearchQueryScope
        Get-SPEnterpriseSearchQueryScopeRule, New-SPEnterpriseSearchQueryScopeRule, Set-SPEnterpriseSearchQueryScopeRule, Remove-SPEnterpriseSearchQueryScopeRule
        Get-SPEnterpriseSearchQuerySuggestionCandidates, Set-SPEnterpriseSearchQuerySpellingCorrection, Import-SPEnterpriseSearchPopularQueries
        Get-SPEnterpriseSearchRankingModel, New-SPEnterpriseSearchRankingModel, Set-SPEnterpriseSearchRankingModel, Remove-SPEnterpriseSearchRankingModel
        Get-SPEnterpriseSearchSecurityTrimmer, New-SPEnterpriseSearchSecurityTrimmer, Remove-SPEnterpriseSearchSecurityTrimmer
        Set-SPEnterpriseSearchResultItemType
      Linguistics and content processing:
        Get-SPEnterpriseSearchLanguageResourcePhrase, New-SPEnterpriseSearchLanguageResourcePhrase, Remove-SPEnterpriseSearchLanguageResourcePhrase
        Get-SPEnterpriseSearchLinguisticComponentsStatus, Set-SPEnterpriseSearchLinguisticComponentsStatus
        Get-SPEnterpriseSearchContentEnrichmentConfiguration, New-SPEnterpriseSearchContentEnrichmentConfiguration, Set-SPEnterpriseSearchContentEnrichmentConfiguration, Remove-SPEnterpriseSearchContentEnrichmentConfiguration
        Get-SPEnterpriseSearchFileFormat, New-SPEnterpriseSearchFileFormat, Remove-SPEnterpriseSearchFileFormat, Set-SPEnterpriseSearchFileFormatState
        Import-SPEnterpriseSearchCustomExtractionDictionary
      Tenant:
        Remove-SPEnterpriseSearchTenantSchema, Remove-SPEnterpriseSearchTenantConfiguration
    • [Diagram: sample search topology across Hosts 1-6. Hosts 1 and 2: web servers, application servers, and Office Web Apps servers. Hosts 3 and 4: application servers running the query processing, crawl, admin, analytics, and content processing components, plus index partition 0 with one replica on each host. Hosts 5 and 6: all SharePoint databases (search admin db, link db, crawl db, analytics db, SharePoint config db, and all other SharePoint databases), with redundant copies of all databases using SQL clustering, mirroring, or SQL Server 2012 AlwaysOn.]
    • [Diagram: sample search topology across Hosts A-H. Application servers on Hosts A-F run the query processing, analytics, content processing, crawl, and admin components, plus index partitions 0-3 with two replicas each. Hosts G and H: SharePoint databases (search admin db, crawl dbs, link db, analytics db), with redundant copies of all databases using SQL clustering, mirroring, or SQL Server 2012 AlwaysOn.]
    • [Diagram: sample search topology across Hosts A-R. Application servers on Hosts A-N run the query processing, analytics, content processing, crawl, and admin components, plus index partitions 0-9 with two replicas each. Hosts O-R: SharePoint databases (search admin db, link db, crawl dbs, analytics dbs), with redundant copies of all databases using SQL clustering, mirroring, or SQL Server 2012 AlwaysOn.]
    • Schema can be managed by site admins, reducing the load on the search administrator
    • Schema can be configured to allow more granularity (query, retrieve, refine, sort, etc.). This affects content index size.
    • Remote result sources can be crawled locally and then queried by remote farms. Huge impact on geo-distributed search… KL may be able to help!
    • Individual items can be re-crawled easily
    • Automatic URL balancing in crawl databases minimizes host name restrictions for large archive repositories
    • Scalability limit changes will have a big impact on farm design for large archive content repositories in the near future.