
Building a scalable search architecture in SharePoint 2013



Published in: Technology

  1. Building a scalable Search architecture in SharePoint 2013
     Thuan Nguyen, SharePoint MVP (@nnthuan)
     Vietnam SharePoint User Group
  2. About Me
     • SharePoint Practice Lead and Solution Architect, FPT Software
     • Microsoft SharePoint MVP (2011, 2012, 2013, 2014)
     • Start-up background, with two SharePoint-based products
     • Now focused on building a SharePoint core standard and framework for the Singapore Government
  3. Agenda
     • Common Misunderstandings
     • Architecture & Topology
     • Practical Guide
     • Question & Answer
     For those looking into running multiple Search servers that handle millions of documents.
  4. Common Misunderstandings
     • For high availability, create two Search service applications.
     • Only one machine plays the Search role in your farm.
     • Scale out the Search architecture by adding more servers.
     • Starting the Search service is enough to make search functionality work.
  5. Architecture & Topology
     Logical architecture:
     • Crawl
     • Content Processing
     • Analytics Processing
     • Index
     • Administration
     • Query Processing
     Understanding each component helps you design a scalable and maintainable Search for your organization.
  6. Crawl Component
     • Responsible for crawling content from different sources:
       ◦ SharePoint sites
       ◦ Exchange
       ◦ Lotus Notes
       ◦ Documentum
       ◦ HTTP websites
     • Delivers crawled items to the content processing component.
     • The Crawl database stores information about crawled items and crawl history.
     Table shown on the slide: dbo.MSSCrawlHistoryLocal
  7. Content Processing
     • Processes crawled items and passes them to the index component
     • Performs linguistic processing at index time (e.g. language detection and entity extraction)
     • Writes information about links and URLs to the Link database
     Table shown on the slide: dbo.MSSQLogResultDocs
  8. Analytics Processing
     • Analyzes crawled items and how users interact with search results.
     • When a user performs an action (e.g. views a page), the event is collected in usage files on the WFEs and regularly pushed to the event store, where it is kept until processed.
     • Results are then returned to the content processing component to be included in the search index.
     Table shown on the slide: dbo.SearchReportData
  9. Index Component
     • Receives the processed items from the content processing component and writes them to the search index.
     • Handles incoming queries, retrieves information from the search index, and sends the result set back to the query processing component.
  10. Index Architecture
      • An index partition is a logical portion of the entire search index.
      • Each partition is served by one or more index components (or “replicas”).
      • Each partition has exactly one primary (or “active”) replica, which is the only one that writes data to the partition.
      • The other, secondary (or “passive”) replicas provide fault tolerance and increased query throughput.
      • The index can scale both horizontally (partitions) and vertically (replicas).
      • Partitions can be added but NOT removed.
      [Diagram: Partitions #1 to #3, each with Secondary Replicas 1 to 3; servers grouped as Index Servers 1, 2 & 3; 4, 5 & 6; 7, 8 & 9; and 10, 11 & 12.]
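
The partition and replica arithmetic on the Index Architecture slide can be sketched in a few lines of PowerShell. This is an illustration only: the function name `Get-IndexLayout` is hypothetical, and the default of 10 million items per partition is an assumption based on commonly cited SharePoint 2013 sizing guidance, not a figure from the slides.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): estimates how many index
# partitions and index components a corpus of a given size would need.
function Get-IndexLayout {
    param(
        [Parameter(Mandatory = $true)][long]$ItemCount,
        # Replicas per partition: 1 primary + (n - 1) secondary replicas.
        [int]$ReplicasPerPartition = 2,
        # Assumption: roughly 10 million items per index partition.
        [long]$ItemsPerPartition = 10000000
    )

    # Horizontal scale: one partition per block of items, never fewer than one.
    $partitions = [int][Math]::Max(1.0, [Math]::Ceiling([double]$ItemCount / $ItemsPerPartition))

    # Vertical scale: each partition is served by the same number of replicas,
    # and every replica is one index component.
    [pscustomobject]@{
        Partitions      = $partitions
        Replicas        = $ReplicasPerPartition
        IndexComponents = $partitions * $ReplicasPerPartition
    }
}

Get-IndexLayout -ItemCount 25000000 -ReplicasPerPartition 4
```

With 25 million items and four replicas per partition, the sketch yields 3 partitions and 12 index components, one per index server if each server hosts a single component.
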
  11. Query Processing
      • Analyses and processes search queries and results.
      • The processed query is then submitted to the index component, which returns a set of search results for the query.
  12. Search Administration
      • Search Admin Component
        ◦ Runs a number of system processes required for search
        ◦ Is responsible for search provisioning and topology changes
        ◦ Coordinates the search components: Content Processing, Query Processing, Analytics, and Indexing
      • Search Admin DB
        ◦ Stores search configuration data: topology, crawl rules, query rules, managed property mappings, content sources, crawl schedules
        ◦ Stores Analytics settings
      Table shown on the slide: dbo.MSSConfiguration
  13. Practical Guide
      • Assessment
      • Design
      • Implementation
      • Verification
  14. Practical Guide - Assessment
      • Don’t hastily touch your SharePoint. Leave it alone!
      • Think about your content:
        ◦ What are your content sources (SharePoint document libraries, Exchange, file servers, ...)?
        ◦ How much content do you want to search (e.g. 100,000 documents)?
      • Assess the number of concurrent users.
      • Size the Search databases.
  15. Practical Guide - Assessment
      Sizing factors:
      • Total Database Size
      • Total Index Size
      • Query Component Index Size
      • Disk Storage
      • Link Database
      • Search Admin Database
      • Total Crawl Database Size
      • Total Crawl Database Log Size
      • Analytics Database Size
      => Total database size for Search
      Microsoft has already published formulas for the factors above.
  16. Practical Guide - Assessment
      What exactly is high availability for Search?
      • Business language: end users are never prevented from searching.
      • Technical language: all logical Search components and Search databases must stay functional at all times.
        ◦ Two or more Search service applications
        ◦ Two or more Search servers
  17. Practical Guide - Design
      • Don’t hastily touch your SharePoint. Leave it alone!
      • Start with one machine hosting all components.
  18. Practical Guide - Design
      • Don’t hastily touch your SharePoint. Leave it alone!
      • Consider two machines for Search, each hosting a different set of components.
      Redundant set of (Query + Crawl): if one machine goes down, the Query component on the other machine keeps functioning.
  19. Practical Guide - Design
      • Don’t hastily touch your SharePoint. Leave it alone!
      • Do you need three machines for Search?
        ◦ To speed up the Query component?
        ◦ To reduce crawling time?
        ◦ To balance CPU utilization across machines?
      With three or more machines, start by assessing each component’s use of hardware resources.
  20. Practical Guide - Design
      If the logical architecture requires scale-out, consider resource utilization per component (Microsoft Ignite, BK3176).

      Component                     CPU      Network  Disk     RAM
      Crawl Component               MEDIUM   HIGH     MEDIUM   MEDIUM
      Analytics processing (APC)    MEDIUM   HIGH     MEDIUM   MEDIUM
      Index Component               HIGH     MEDIUM   HIGH     HIGH

      Only three of the four values are given for the remaining components, in this order:
      Content processing (CPC)      HIGH, MEDIUM, HIGH
      Query processing (QPC)        MEDIUM, MEDIUM, MEDIUM
      Search Admin Component        LOW, LOW, LOW
  21. Practical Guide - Design

      Volume of content        Sample Search architecture
      < 1 million items        Single-server Search farm
      1 - 5 million items      Two-server Search farm
      5 - 10 million items     Small Search farm (3-4 servers)
      10 - 40 million items    Medium Search farm (5-6 servers)
      > 40 million items       Large Search farm
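
As a quick way to apply the volume table during an assessment, the tiers can be encoded in a small lookup function. This is a minimal sketch: `Get-SearchFarmTier` is a hypothetical name, and the handling of the shared range end points is an assumption, since the slide does not say which tier a value of exactly 5, 10 or 40 million belongs to.

```powershell
# Hypothetical helper: maps a corpus size (number of items) to the sample
# Search architecture tiers listed on the Design slide.
function Get-SearchFarmTier {
    param([Parameter(Mandatory = $true)][long]$ItemCount)

    # Assumption: the ranges on the slide share their end points (5, 10 and
    # 40 million); this sketch assigns a shared end point to the smaller tier.
    if     ($ItemCount -lt 1000000)  { 'Single-server Search farm' }
    elseif ($ItemCount -le 5000000)  { 'Two-server Search farm' }
    elseif ($ItemCount -le 10000000) { 'Small Search farm (3-4 servers)' }
    elseif ($ItemCount -le 40000000) { 'Medium Search farm (5-6 servers)' }
    else                             { 'Large Search farm' }
}

Get-SearchFarmTier -ItemCount 20000000   # Medium Search farm (5-6 servers)
```

The tier is only a starting point; concurrent users and crawl windows, covered in the assessment slides, can justify a larger farm than the item count alone suggests.
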
  22. Sample Search Architecture
      • Handles a number of different content sources (with 20 custom applications)
      • Nearly 1 million items currently
      • A full crawl takes 2 hours
      • Serves nearly 20,000 users, with 500 concurrent users
  23. Sample Search Architecture
      • Optimizes search queries to serve hundreds of concurrent users
      • Handles millions of documents (approx. 5 TB)
  24. Sample Search Architecture
      • Serves 20 million documents and items (approx. 10-15 TB)
  25. Practical Guide - Implementation
      Central Administration doesn’t help much. PowerShell is your friend.
      Build the Search farm with PowerShell:
      1. Create the Search service application
      2. Clone the existing topology
      3. Modify the Search components based on your designated architecture
      4. Assign the Index component and its location
      5. Activate the new Search topology
  26. Practical Guide - Implementation

      # Application servers that will host the Search components
      $app1 = "APP-Server-01"
      $app2 = "APP-Server-02"

      $SearchAppPoolName        = "SharePoint_SearchApp"
      $SearchAppPoolAccountName = "TestDomain\SPSearchPool"
      $SearchServiceName        = "SharePoint_Search_Service"
      $SearchServiceProxyName   = "SharePoint_Search_Proxy"
      $DatabaseName             = "SharePoint_Search_AdminDB"

      # Create a Search service application pool
      $spAppPool = New-SPServiceApplicationPool -Name $SearchAppPoolName -Account $SearchAppPoolAccountName -Verbose

      # Start the Search service instance on all application servers
      Start-SPEnterpriseSearchServiceInstance $app1 -ErrorAction SilentlyContinue
      Start-SPEnterpriseSearchServiceInstance $app2 -ErrorAction SilentlyContinue
      Start-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance $app1 -ErrorAction SilentlyContinue
      Start-SPEnterpriseSearchQueryAndSiteSettingsServiceInstance $app2 -ErrorAction SilentlyContinue

      # Create the Search service application
      $ServiceApplication = New-SPEnterpriseSearchServiceApplication -Partitioned -Name $SearchServiceName -ApplicationPool $spAppPool.Name -DatabaseName $DatabaseName

      # Create the Search service application proxy
      New-SPEnterpriseSearchServiceApplicationProxy -Partitioned -Name $SearchServiceProxyName -SearchApplication $ServiceApplication
  27. Practical Guide - Implementation

      # Clone the active topology and look up the Search service instances
      $clone   = $ServiceApplication.ActiveTopology.Clone()
      $App1SSI = Get-SPEnterpriseSearchServiceInstance -Identity $app1
      $App2SSI = Get-SPEnterpriseSearchServiceInstance -Identity $app2

      # We need only one admin component
      New-SPEnterpriseSearchAdminComponent -SearchTopology $clone -SearchServiceInstance $App1SSI

      # We need two content processing components for HA
      New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $App1SSI
      New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $App2SSI

      # We need two analytics processing components for HA
      New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $clone -SearchServiceInstance $App1SSI
      New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $clone -SearchServiceInstance $App2SSI

      # We need two crawl components for HA
      New-SPEnterpriseSearchCrawlComponent -SearchTopology $clone -SearchServiceInstance $App1SSI
      New-SPEnterpriseSearchCrawlComponent -SearchTopology $clone -SearchServiceInstance $App2SSI

      # We need two query processing components for HA
      New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone -SearchServiceInstance $App1SSI
      New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone -SearchServiceInstance $App2SSI
  28. Practical Guide - Implementation

      # Root directory for the index files on the index server
      $IndexLocation = "APP2Index_Search"
      New-SPEnterpriseSearchIndexComponent -SearchTopology $clone -SearchServiceInstance $App2SSI -RootDirectory $IndexLocation -IndexPartition 0

      # Activate the new Search topology
      $clone.Activate()
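
The index component above is created on a single server, so the index itself has no replica. A minimal sketch of adding a second replica for partition 0 on the other application server might look like the following. It assumes the farm has exactly one Search service application, reuses the server name from the earlier slides, and uses a placeholder index path (`D:\SearchIndex`) that does not come from the slides.

```powershell
# Assumes a SharePoint 2013 farm server; loads the snap-in if it is not loaded yet.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the farm has a single Search service application.
$ssa    = Get-SPEnterpriseSearchServiceApplication
$active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active

# An active topology cannot be edited in place, so work on a clone of it.
$clone = New-SPEnterpriseSearchTopology -SearchApplication $ssa -Clone -SearchTopology $active

# Server name taken from the earlier slides; the index path is a placeholder.
$app1SSI       = Get-SPEnterpriseSearchServiceInstance -Identity "APP-Server-01"
$indexLocation = "D:\SearchIndex"   # hypothetical; must exist and be empty on APP-Server-01

# Same partition number (0) as the existing index component, so this one
# becomes an additional replica of that partition rather than a new partition.
New-SPEnterpriseSearchIndexComponent -SearchTopology $clone -SearchServiceInstance $app1SSI -RootDirectory $indexLocation -IndexPartition 0

# Activate the modified topology.
Set-SPEnterpriseSearchTopology -Identity $clone
```

Passing a new partition number instead of 0 would add a partition, which, as the Index Architecture slide notes, cannot be removed afterwards.
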
  29. Practical Guide - Verification
      • Central Administration can help
      • PowerShell
        ◦ Get-SPEnterpriseSearchStatus
        ◦ Get-SPEnterpriseSearchTopology
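
A short verification pass with the two cmdlets named on this slide could look like the sketch below. It assumes a single Search service application in the farm; the sorting and table layout are just one convenient way to read the output.

```powershell
# Assumes a SharePoint 2013 farm server; loads the snap-in if it is not loaded yet.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the farm has a single Search service application.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Health of each search component, returned as text.
Get-SPEnterpriseSearchStatus -SearchApplication $ssa -Text

# Which components run on which server in the active topology.
$active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
Get-SPEnterpriseSearchComponent -SearchTopology $active |
    Sort-Object ServerName, Name |
    Format-Table Name, ServerName -AutoSize
```
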
  30. Helpful References
      • SharePoint 2013: SharePoint and Enterprise Search Survival Guide
      • Plan enterprise search architecture in SharePoint Server 2013
      • Search Architecture for SharePoint 2013
  31. Thank you