Large Scale SharePoint SQL Deployments


    Notes on slide 1

    Pre-construct and pre-size your content databases. Once the content database size has been specified, it is recommended that the database be created using a script that generates the empty database at that size. Note that the “Autogrow” feature should be left on to prevent any future issues.

    Place the content database file or files on RAID 5 or RAID 10 logical units. RAID 10 is the best choice when cost is not a concern. RAID 5 will be sufficient and will save on costs, since content databases tend to be more read intensive than write intensive.

    For a large-scale document management solution, with a multi-core computer running SQL Server, the primary file group for the content database could potentially consist of a data file for each CPU core present in SQL Server. If possible, move each data file to separate logical units consisting of unique physical disk spindles.

    Database storage for content items will be between 1.2 and 1.5 times the raw file size when stored in SharePoint.


    Large Scale SharePoint SQL Deployments - Presentation Transcript

    1. Considerations for large-scale SharePoint deployments on SQL Server
      Name: Joel Oleson
      Title: Sr. Tech Prod Mgr
      Company: Quest Software
    2. Audience Poll
      New to SharePoint?
      SQL Admins?
      Large-scale Implementation (>1TB) experience?
      Scalability or performance issues in SharePoint deployments?
    3. Session Overview
      Lightweight
      Understanding SharePoint databases
      SQL Performance
      SQL Server 2008 with SharePoint
      Heavyweight
      Architectural Design Considerations
      Real-world scenarios
      Business Requirements
      Logical and Physical Architecture
      Architectural Design Statistical Results
      Appendix: DB Sizes, Content Distribution…
    4. Lightweight
    5. Real World Examples
      Information based on real-world, large-scale SharePoint Implementations.
      Large software company (Microsoft)
      Intranet Portal for 120K users
      Global Enterprise Collaboration Solution (~20TB)
      Scalable Hosting Solution (SharePoint Online)
      Large automotive manufacturer
      Loan Origination Application / Document Repository
      ~50 Million content items (~6 TB)
    6. Understanding the SharePoint Databases
    7. Disk I/O Demand
      Most Demand
      Medium Demand
      Low Demand
      *Content..
      Search
      Config
      Temp
      Model
      +SSP
      Master
      Tlogs
      * Except during backup and indexing
      + Except during profile import
    8. Top Performance Killers
      Indexing/Crawling
      Backup (SQL & Tape)
      Profile Import
      Misc Timer Jobs – User Sync for large #s of Users
      STSADM Backup/Restore
      Large List Operations
      Heavy User Operation List Import/Write
    9. Content Db
    10. Config
    11. SQL Server 2008 with Windows Server 2008
      Transactional performance: SQL Server 2008 on Windows Server 2008 dramatically outperformed SQL Server 2005 on Windows Server 2003.
      Compressed backup in the box
      Support for SQL External BLOB Storage
      Increased resiliency
      Transparent Data Encryption (TDE)
      See performance gains at http://msdn.microsoft.com/en-us/library/dd263442.aspx
    12. Heavyweight
    13. Architectural Design Considerations
      Database Volumes
      Separate database volumes into unique LUNs consisting of unique physical disk spindles.
      In a heavily read-oriented internet (portal) site, prioritize data over logs.
      Separate out Search database transaction log from content database transaction logs.
    14. Architectural Design Considerations
      SQL TempDB Data Files
      Optimal TempDB data file sizes can be calculated using the following formula:
      [MAX DB SIZE (KB)] X [.25] / [# CORES] = DATA FILE SIZE (KB)
      The combined starting size of the TempDB data files (one per core) should be roughly equal to 25% of the largest content or search DB.
      Use RAID 10; separate LUN from other database objects (content, search, etc…).
      “Autogrow” feature set to a fixed amount; if auto grow occurs, permanently increase TempDB size.
      TempDB Log file separated to unique LUN.
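
The TempDB formula on this slide can be sketched as a small helper. This is only an illustration of the arithmetic: the function name and the 200 GB / 8-core figures are assumptions chosen for the example, not values from the presentation.

```python
def tempdb_data_file_size_kb(max_db_size_kb: float, cores: int) -> float:
    """Apply [MAX DB SIZE (KB)] x 0.25 / [# CORES] = DATA FILE SIZE (KB)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return max_db_size_kb * 0.25 / cores


if __name__ == "__main__":
    # Hypothetical input: largest content or search database is 200 GB,
    # and the SQL Server host has 8 cores.
    largest_db_kb = 200 * 1024 * 1024
    per_file_kb = tempdb_data_file_size_kb(largest_db_kb, cores=8)
    print(f"Per-file starting size: {per_file_kb / (1024 * 1024):.2f} GB")
    print(f"Combined TempDB data size: {per_file_kb * 8 / (1024 * 1024):.2f} GB")
```

With one data file per core, the files together add up to the 25% starting size the slide describes.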
    15. Architectural Design Considerations
      Content Databases
      100 content databases per Web application
      100GB per content database
      CAUTION: DB locking issues reported in collaborative DM scenarios above 100GB
      Need to ensure that you understand the issues based on number of users, usage profiles, etc…
      Service Level Agreement (SLA) requirements for backup and restore will also have an impact on this decision.
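
As a rough planning aid, the limits on this slide can be combined with the 1.2x to 1.5x storage overhead mentioned in the slide 1 notes. The sketch below is an assumption-laden estimate: the function names and the 6 TB raw corpus are hypothetical, and it ignores SLA and usage-profile factors that the slide says also drive the decision.

```python
import math


def content_databases_needed(raw_content_gb: float,
                             overhead: float = 1.5,
                             max_db_gb: float = 100) -> int:
    """Estimate how many content databases a corpus needs.

    overhead: stored size relative to raw file size (1.2 to 1.5 per the notes).
    max_db_gb: target ceiling per content database (100 GB per this slide).
    """
    stored_gb = raw_content_gb * overhead
    return math.ceil(stored_gb / max_db_gb)


def web_applications_needed(databases: int, max_dbs_per_web_app: int = 100) -> int:
    """Apply the 100-content-databases-per-Web-application guideline."""
    return math.ceil(databases / max_dbs_per_web_app)


if __name__ == "__main__":
    raw_gb = 6 * 1024  # hypothetical 6 TB of raw files
    for factor in (1.2, 1.5):
        dbs = content_databases_needed(raw_gb, overhead=factor)
        print(f"overhead {factor}: {dbs} content databases, "
              f"{web_applications_needed(dbs)} Web application(s)")
```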
    16. Architectural Design Considerations
      Content Databases - Continued
      Pre-construct and pre-size
      Use RAID 5 or RAID 10 logical units
      RAID 10 is the best choice when cost is not a concern.
      RAID 5 will be sufficient and will save on costs, since content databases tend to be more read intensive than write intensive.
      Multi-core computer running SQL Server
      Primary file group could consist of a data file for each CPU core present in SQL Server.
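
One way to pre-construct and pre-size a content database is to generate the CREATE DATABASE statement from a script. The sketch below builds T-SQL with one data file per core in the primary file group. Everything specific in it is an assumption for illustration: the database name, drive paths, sizes, and growth increment are hypothetical, and the collation shown is the one SharePoint expects for its databases rather than something stated in the slides. Review the generated statement before running it against a server.

```python
def build_create_database_sql(db_name, data_dirs, log_dir,
                              total_size_mb, log_size_mb, growth_mb):
    """Return a CREATE DATABASE statement with one data file per directory.

    data_dirs should contain one entry per CPU core, ideally each on its
    own LUN. The total size is split evenly across the data files.
    """
    if not data_dirs:
        raise ValueError("at least one data directory is required")
    per_file_mb = total_size_mb // len(data_dirs)
    file_specs = []
    for i, directory in enumerate(data_dirs, start=1):
        logical = f"{db_name}_data{i}"
        ext = "mdf" if i == 1 else "ndf"  # first file is the primary data file
        file_specs.append(
            f"    (NAME = N'{logical}', "
            f"FILENAME = N'{directory}\\{logical}.{ext}', "
            f"SIZE = {per_file_mb}MB, FILEGROWTH = {growth_mb}MB)"
        )
    log_spec = (
        f"    (NAME = N'{db_name}_log', "
        f"FILENAME = N'{log_dir}\\{db_name}_log.ldf', "
        f"SIZE = {log_size_mb}MB, FILEGROWTH = {growth_mb}MB)"
    )
    return (
        f"CREATE DATABASE [{db_name}]\n"
        "ON PRIMARY\n"
        + ",\n".join(file_specs)
        + "\nLOG ON\n"
        + log_spec
        + "\nCOLLATE Latin1_General_CI_AS_KS_WS;"
    )


if __name__ == "__main__":
    # Hypothetical 4-core server with four data LUNs and one log LUN.
    print(build_create_database_sql(
        db_name="WSS_Content_Example",
        data_dirs=["E:\\Data", "F:\\Data", "G:\\Data", "H:\\Data"],
        log_dir="L:\\Logs",
        total_size_mb=100 * 1024,
        log_size_mb=10 * 1024,
        growth_mb=1024,
    ))
```

A fixed FILEGROWTH value keeps Autogrow on, as the notes recommend, while making any growth event predictable in size.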
    17. Architectural Design Considerations
      Database Maintenance
      SQL Server SP2 is needed if using the DB maintenance wizard (KB930887).
      Plan regular defrag of databases
      Performance - Average Disk Queue Length
      Single Digit values are optimal.
      Occasional double-digit values aren’t a large concern.
      Sustained triple-digit values require attention.
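
The Average Disk Queue Length guidance above maps naturally onto a threshold check. The sketch below is a minimal interpretation: the function names, the label strings, and the idea of treating "sustained" as a run of consecutive samples are assumptions, and the sample values are made up.

```python
def classify_avg_disk_queue_length(value: float) -> str:
    """Map one Average Disk Queue Length sample to the slide's guidance."""
    if value < 0:
        raise ValueError("queue length cannot be negative")
    if value < 10:
        return "optimal"             # single-digit values
    if value < 100:
        return "watch"               # occasional double digits are not a large concern
    return "needs attention"         # triple digits, if sustained


def is_sustained_triple_digit(samples, window: int = 5) -> bool:
    """True if the last `window` samples are all in triple digits or higher.

    The window length is an assumed definition of "sustained".
    """
    recent = list(samples)[-window:]
    return len(recent) == window and all(s >= 100 for s in recent)


if __name__ == "__main__":
    samples = [4, 12, 150, 130, 160, 140, 155]  # hypothetical counter samples
    for s in samples:
        print(s, classify_avg_disk_queue_length(s))
    print("sustained:", is_sustained_triple_digit(samples))
```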
    18. Architectural Design Considerations
      Performance
      Recommended practice: place each of the following transaction log types on its own LUN.
      Content database log files.
      Search database log files.
      Consider filegroups for the search database.
    19. Architectural Design Considerations
      Topology
      A single list should not have more than 2,000 items per list container.
      A container represents the root of the list, as well as any folders within the list; a folder is a container because other list items are stored within it.
      Whitepaper: Working with large lists in Office SharePoint Server 2007 (Steve Peschka)
      http://go.microsoft.com/fwlink/?LinkId=95450
      Disk Drive Speed
      15K RPM recommended.
      IIS Application Pools
      Ensure “Max Used Memory” setting utilizes all the available RAM in your WFE’s.
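
To stay under the 2,000-items-per-container guideline on this slide, a bulk loader can spread documents across nested folders. The sketch below shows one hypothetical scheme in which every folder, and the list root, holds at most 2,000 children. It is not the folder architecture used by any specific product, and the function name and path format are assumptions.

```python
def folder_path_for_item(item_index: int, max_per_container: int = 2000) -> str:
    """Return a two-level folder path for the Nth item (0-based).

    Leaf folders hold up to max_per_container items, parent folders hold up
    to max_per_container leaf folders, and the list root holds up to
    max_per_container parent folders.
    """
    if item_index < 0:
        raise ValueError("item_index must be non-negative")
    leaf = item_index // max_per_container   # which leaf folder overall
    parent = leaf // max_per_container       # which parent folder
    if parent >= max_per_container:
        raise ValueError("item_index exceeds the capacity of this scheme")
    return f"{parent:04d}/{leaf % max_per_container:04d}"


if __name__ == "__main__":
    for index in (0, 1999, 2000, 4_000_000, 10_500_000):
        print(index, folder_path_for_item(index))
```

With the default limit, the scheme has room for 2,000 x 2,000 x 2,000 items, far more than the document counts discussed in this session.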
    20. Architectural Design Considerations
      STSADM Command-line Tool and CreateSiteInNewDBOperation
      Gary Lapointe STSADM Extensions for Site Collection DB maintenance
      Codeplex.com/governance tools for archive & delete capture
    21. Large Scale Manufacture
    22. Real-world Scenarios
      Automotive Mfgr. Business Requirements (Phase I)
      Loan Origination Application built on Office SharePoint Server 2007
      Ability to manage 10.5 million images.
      System performance with a “normal” input load, defined as receipt of 27,000 images per 10-hour business day.
      Simulate user load to represent 200 users for search, view & update with 2x peak
    23. Real-world Scenarios
      Data Load Process (Phase I)
      Used KnowledgeLake Document Release Engine
      Loaded 9.17 documents/second per server
      Employs a high-volume, storage-based folder architecture within SharePoint to ensure UI responsiveness.
      Executed on 4 servers. Using this application, we were able to achieve:
      An average document load throughput of 36.6 documents per second!
      An average daily input of 3.17 million documents!
      10.5 million documents with only 28% utilization!
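
The load figures on this slide can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the Phase I slides (9.17 documents per second per server, 4 servers, 27,000 images per 10-hour business day); the function names are illustrative, and it assumes the daily figure means a full 24 hours of continuous loading.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def aggregate_rate(per_server_docs_per_sec: float, servers: int) -> float:
    """Combined load rate across all loader servers, in documents per second."""
    return per_server_docs_per_sec * servers


def docs_per_day(docs_per_sec: float) -> float:
    """Documents loaded in 24 hours at a constant rate."""
    return docs_per_sec * SECONDS_PER_DAY


def required_rate(docs_per_business_day: int, business_hours: float) -> float:
    """Rate needed to absorb the normal daily input within business hours."""
    return docs_per_business_day / (business_hours * 3600)


if __name__ == "__main__":
    rate = aggregate_rate(9.17, 4)
    print(f"Aggregate rate: {rate:.2f} docs/sec")
    print(f"Daily capacity: {docs_per_day(rate) / 1e6:.2f} million docs")
    print(f"Normal load needs: {required_rate(27_000, 10):.2f} docs/sec")
```

The results land close to the slide's 36.6 documents per second and 3.17 million documents per day, and show that the "normal" input load requires well under one document per second.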
    24. Real-world Scenarios
      Data Load Process (Phase II)
      15 million documents consisting of Word (.docx), Excel (.xlsx), PowerPoint (.pptx) and Adobe PDF.
      Five Web Front-Ends were used for the load process.
      Peak Load Rate:
      24.3 docs per second/2.1 million documents per day.
      Average Load Rate:
      ~1.9 million documents per day.
      Load Time:
      8 days.
      NOTE: Load rates included automation process that created the PDF files.
    25. What does the logical architecture look like?!
    26. What does the physical architecture look like?!
      Scale OUT…
      Scale UP…
    27. What does the site topology look like?!
      Phase I
      17 Divisional Site Collections / DB’s
      Phase II
      10 Departmental Site Collections / DB’s
    28. What does the storage architecture look like?
    29. Database Sizes: Phase I
    30. Architectural Design Statistical Results: Phase I
      Designed Once / Built Once
      No architecture OR configuration changes were required after the initial build was completed.
      10.5+ million documents loaded into the system in approximately 60 hours!
      Full Crawl indexed 10 Million items in 32 hours!
      Average content database size for divisional breakouts was 60GB
    31. Architectural Design Statistical Results: Phase II
      Search database size was 539GB.
      Lesson Learned: Large search database caused disk I/O contention; break this out into multiple data file allocations matching the number of core processors on SQL Server, and spread them over unique LUN’s.
      Total Index size was 162GB!
      Average Content database size for Divisional breakouts was 200.65GB!
      Average Content database size for Departmental breakouts was 137.60GB!
    32. Large Scale Pharma
    33. Real-world Scenarios
      Pharmaceutical Business Requirements
      Collaboration Portal built on Office SharePoint Server 2007
      Validate ~40TB of content storage.
      Identify performance characteristics and provide guidance around content database sizing
      FAST search integration
    34. Real-world Scenarios
      Data Load Process
      71,524,357 documents loaded across two SharePoint farms in 10.92 days!
      Content was spread across the farms into 165 unique content databases.
      6,240 Site Collections, each containing 10 sub-sites for a total of 62,400 sites.
      Database sizes were pre-configured to vary in size from 100GB to 350GB to determine performance and/or SLA impacts.
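
The figures on this slide imply some useful averages. The sketch below derives them from the quoted totals only. It assumes an even spread of content, which is a simplification, since the databases were deliberately sized from 100GB to 350GB; the variable names are illustrative.

```python
# Totals quoted on the slide.
documents = 71_524_357
content_databases = 165
site_collections = 6_240
sites = 62_400
load_days = 10.92

# Averages, assuming content is spread evenly (a simplification).
docs_per_database = documents / content_databases
site_collections_per_database = site_collections / content_databases
docs_per_site = documents / sites
docs_per_second = documents / (load_days * 24 * 60 * 60)

if __name__ == "__main__":
    print(f"Documents per content database: {docs_per_database:,.0f}")
    print(f"Site collections per content database: {site_collections_per_database:.1f}")
    print(f"Documents per site: {docs_per_site:,.0f}")
    print(f"Average load rate: {docs_per_second:.1f} docs/sec")
```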
    35. What does the logical architecture look like?!
    36. What does the physical architecture look like?!
    37. What does the site topology look like?!
      165 Content DB’s
      6,240 Site Collections
      10 Sub-Sites in each collection:
      62,400 Sites!
    38. What does the storage architecture look like?
    39. Architectural Design Statistical Results: Conclusion
      User Loads
      Stress tests included 2,000 to 3,000 concurrent users.
      Based on the 10% rule, testing completed equated to an environment representing 300,000 users!
      Raw number of RPS during peak times is 1,469 at Pharma.
      773 RPS, which equates to 346.59 ACTUAL RPS!
      FAST Search Integration
      Successfully integrated FAST search capabilities, indexed content corpus and served search results as expected.
    40. Large-Scale Case Study Available
      SharePoint Scalability and Performance Whitepaper
      Contains majority of content you will see here, along with test results you won’t see here.
      TechNet topic: http://go.microsoft.com/fwlink/?LinkId=120901
      Word 2007 format: http://go.microsoft.com/fwlink/?LinkId=120881
      Word 2000-2003 format: http://go.microsoft.com/fwlink/?LinkId=120890
      PDF format: http://go.microsoft.com/fwlink/?LinkId=120891
    41. question & answer
    42. Appendix
    43. Database Sizes: MPSC/Nissan Phase I
    44. Database Sizes: MPSC/Nissan Phase II
    45. Performance of Components Over Time
      MPSC/Nissan Phase I
      14 individual performance tests were run to simulate various load scenarios.
    46. How do we pull all this together?!
      Pharma Content Database Distribution
      Substitute “F1” with SQL Server number to generate unique DB’s
      Farm 1: 2 SQL
      Farm 2: 1 SQL
      165 Content Databases!
    47. How do we pull all this together?!
      Pharma Data Load Statistics
    48. Architectural Design Statistical Results: Testing Results – 300GB Content Databases
    49. Architectural Design Statistical Results: Testing Results – 350GB Content Databases
    50. Architectural Design Statistical Results: Testing Results – 250GB Content Databases
    51. Architectural Design Statistical Results: Testing Results – 150GB Content Databases
    52. Required Slide
      © 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
      The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.