Automating SQL Server Database Creation for SharePoint


In this session, Talbott Crowell discusses using the SharePoint API to provision the SQL Server content databases that store documents. When building specialized applications on SharePoint, there are several scenarios in which you will want to control and manage database creation yourself. Topics include planning and estimating size requirements, plus strategies for partitioning data into content databases. The intended audience includes SQL Server DBAs supporting SharePoint installations and applications. Presented at New England Data Camp 1.0, Jan 24, 2009, at Microsoft, Waltham, MA.

Published in: Technology

Automating SQL Server Database Creation for SharePoint

  1. 1. Automating SQL Server Database Creation for SharePoint <ul><li>Talbott Crowell </li></ul>
  2. 2. Automating SQL Server Database Creation <ul><li>SharePoint SQL Server Overview </li></ul><ul><li>Planning for Document Content Storage </li></ul><ul><ul><li>Plan for MOSS Software Boundaries </li></ul></ul><ul><ul><li>Search Indexing </li></ul></ul><ul><ul><li>Backup/Restore and Availability </li></ul></ul><ul><li>Structuring Data in SharePoint </li></ul><ul><ul><li>Site Collections </li></ul></ul><ul><ul><li>Content Databases </li></ul></ul><ul><li>Partitioning Data in SQL Server </li></ul><ul><li>Sample Solution </li></ul><ul><li>Other Considerations for Document Storage, Future... </li></ul>
  3. 3. SharePoint SQL Server Overview <ul><li>What is SharePoint? </li></ul><ul><ul><li>WSS (Windows SharePoint Services 3.0) – Free with Windows Server </li></ul></ul><ul><ul><li>MOSS (Microsoft Office SharePoint Server 2007) </li></ul></ul><ul><li>SharePoint Farm </li></ul><ul><ul><li>Small, Medium, Large </li></ul></ul><ul><li>SQL Server </li></ul><ul><ul><li>Configuration Database </li></ul></ul><ul><ul><li>Content Database(s) </li></ul></ul><ul><ul><li>Search Index Database </li></ul></ul><ul><ul><li>Shared Service Provider Database </li></ul></ul>[Diagram labels: SQL Server, Windows Server, IIS (Internet Information Services), ASP.NET, WSS 3.0, MOSS 2007]
  4. 4. SharePoint Farm Sizes <ul><li>Single-server farm </li></ul><ul><li>Two-server farm </li></ul><ul><li>Four-server farm </li></ul><ul><li>Five-server farm </li></ul>
  5. 5. SharePoint Large Farms
  6. 6. Planning for Document Content Storage <ul><li>MOSS Software Boundaries </li></ul><ul><li>Logical Structure </li></ul><ul><ul><li>Site Collections, Sites, Lists, Folders </li></ul></ul><ul><li>Using Folders for Scalability </li></ul><ul><li>Search Indexing </li></ul><ul><li>Indexing Columns for Performance </li></ul><ul><li>Backup/Restore and Availability </li></ul>
  7. 7. Typical Large-Scale Content Management Scenarios <ul><li>Large-scale authoring environment </li></ul><ul><ul><li>Document Center site template, most users are authors </li></ul></ul><ul><ul><li>50K or more docs, 500 or more folders, versioning turned on </li></ul></ul><ul><ul><li>Single database up to 150 GB </li></ul></ul><ul><li>Large-scale content archive </li></ul><ul><ul><li>Knowledge base site, document archive, Records Center site template </li></ul></ul><ul><ul><li>1 million or more documents </li></ul></ul><ul><ul><li>Single database up to 400 GB </li></ul></ul><ul><li>Extremely large-scale content archive </li></ul><ul><ul><li>10 million docs across 5,000 or more folders </li></ul></ul><ul><ul><li>Users (50K or more) browse content by searching </li></ul></ul><ul><ul><li>Content submitted by using custom submission form </li></ul></ul>
  8. 8. Plan for MOSS Software Boundaries <ul><li>Limitations </li></ul><ul><ul><li>SSP: 20 per farm (3 per farm recommended) </li></ul></ul><ul><ul><li>Web app: 99 per SSP </li></ul></ul><ul><ul><li>Content database: 100 per web app </li></ul></ul><ul><ul><li>Site collections: 50K per content DB, 150K per web app </li></ul></ul><ul><ul><li>Web site: 250K per site collection </li></ul></ul><ul><li>Recommended content child items </li></ul><ul><ul><li>Site collections: 50K per web app </li></ul></ul><ul><ul><li>Site hierarchy: 2000 sub-sites for any parent site </li></ul></ul><ul><ul><li>Site: 2000 lists (or document libraries) per site </li></ul></ul><ul><ul><li>Document Library: 10 million documents, 2000 documents per view (folder) </li></ul></ul><ul><ul><li>Folder: 2000 items per folder </li></ul></ul>
  9. 9. Flat document library <ul><li>The quickest drop in throughput occurs when the total number of documents is less than 2,000 </li></ul>Source: TechNet article “Plan for software boundaries (Office SharePoint Server)”
  10. 10. Hierarchical document library <ul><li>500 documents per folder </li></ul><ul><li>No significant throughput degradation up to 1 million documents </li></ul>Source: Technet article “Plan for software boundaries (Office SharePoint Server)”
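The 500-documents-per-folder guideline above translates directly into a folder count for a given library size. The following is a minimal sketch of that arithmetic; the class and method names (`FolderPlanner`, `FoldersNeeded`) are hypothetical and not part of the talk or of the SharePoint API.

```csharp
using System;

// Hypothetical helper: plain arithmetic, no SharePoint dependency.
public static class FolderPlanner
{
    // Returns how many folders are needed so that no folder
    // holds more than docsPerFolder documents (ceiling division).
    public static long FoldersNeeded(long totalDocuments, int docsPerFolder)
    {
        if (totalDocuments < 0) throw new ArgumentOutOfRangeException("totalDocuments");
        if (docsPerFolder <= 0) throw new ArgumentOutOfRangeException("docsPerFolder");
        return (totalDocuments + docsPerFolder - 1) / docsPerFolder;
    }

    public static void Main()
    {
        // 1 million documents at 500 per folder.
        Console.WriteLine(FoldersNeeded(1000000, 500)); // prints 2000
    }
}
```

At 1 million documents this gives 2,000 folders, which also sits right at the 2,000-items-per-folder guideline if all folders share one parent, so larger libraries would likely need a second folder level.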
  11. 11. Search Indexing <ul><li>GB of disk space required = Total_Corpus_Size (in GB) x File_Size_Modifier x 2.85 </li></ul><ul><ul><li>File_Size_Modifier </li></ul></ul><ul><ul><ul><li>1.0 for very small files (average 1 KB) </li></ul></ul></ul><ul><ul><ul><li>0.12 for moderate size (average 10 KB) </li></ul></ul></ul><ul><ul><ul><li>0.05 for large files (average 100 KB or larger) </li></ul></ul></ul><ul><li>Example: 1 GB of files with an average size of 10 KB </li></ul><ul><ul><li>1 GB x 0.12 = 0.12 GB (estimated size of index file is 120 MB) </li></ul></ul><ul><ul><li>Next, multiply the estimated size of the index file by 2.85: </li></ul></ul><ul><ul><li>120 MB x 2.85 = 342 MB </li></ul></ul><ul><li>See: Estimate performance and capacity requirements for search environments </li></ul>
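The disk-space formula above can be checked with a few lines of code. This sketch simply multiplies the three factors from the slide; `SearchIndexEstimator` and `EstimateDiskGb` are hypothetical names, and the modifier is passed in by the caller rather than derived from file size, since the slide only gives three reference points.

```csharp
using System;
using System.Globalization;

// Hypothetical helper implementing the slide's formula:
// disk GB = Total_Corpus_Size (GB) x File_Size_Modifier x 2.85
public static class SearchIndexEstimator
{
    private const double DiskSpaceFactor = 2.85;

    public static double EstimateDiskGb(double corpusSizeGb, double fileSizeModifier)
    {
        if (corpusSizeGb < 0) throw new ArgumentOutOfRangeException("corpusSizeGb");
        if (fileSizeModifier <= 0) throw new ArgumentOutOfRangeException("fileSizeModifier");
        return corpusSizeGb * fileSizeModifier * DiskSpaceFactor;
    }

    public static void Main()
    {
        // The slide's example: 1 GB of files averaging 10 KB (modifier 0.12).
        double gb = EstimateDiskGb(1.0, 0.12);
        Console.WriteLine(gb.ToString("F3", CultureInfo.InvariantCulture)); // prints 0.342
    }
}
```

The result, 0.342 GB, matches the 342 MB worked out on the slide.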
  12. 12. Indexing Columns for Performance <ul><li>Improves performance for sorting or filtering list </li></ul><ul><li>Not done in SQL Server! </li></ul><ul><li>Changes made in SharePoint List Settings </li></ul><ul><ul><li>Manually via browser </li></ul></ul><ul><ul><li>Programmatically through the API </li></ul></ul><ul><ul><li>Declaratively using CAML (WSP) </li></ul></ul>
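As a rough illustration of the "programmatically through the API" option above, the sketch below marks a list column as indexed using the WSS 3.0 object model. The site URL, list title, and field name are placeholders supplied by the caller, and `ColumnIndexer` is a hypothetical class name; the code has to run on a server in the farm with Microsoft.SharePoint.dll referenced.

```csharp
using System;
using Microsoft.SharePoint;

// Hypothetical helper; all three arguments are caller-supplied placeholders.
public static class ColumnIndexer
{
    public static void IndexColumn(string siteUrl, string listTitle, string fieldName)
    {
        // SPSite and SPWeb hold unmanaged resources and must be disposed.
        using (SPSite site = new SPSite(siteUrl))
        using (SPWeb web = site.OpenWeb())
        {
            SPList list = web.Lists[listTitle];
            SPField field = list.Fields[fieldName]; // looked up by display name

            if (!field.Indexed)
            {
                // The index is maintained by SharePoint itself,
                // not created as a SQL Server index by a DBA.
                field.Indexed = true;
                field.Update();
            }
        }
    }
}
```

A call such as `ColumnIndexer.IndexColumn("http://documents", "Documents", "Modified")` would index the Modified column, assuming a library with that title exists at that URL.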
  13. 13. Backup/Restore and Availability <ul><li>Databases to back up </li></ul><ul><ul><li>Content Databases </li></ul></ul><ul><ul><li>Config Databases (Config, Admin, SSP Admin) </li></ul></ul><ul><ul><li>Search Databases (WSS, SSP) </li></ul></ul><ul><li>Other backups </li></ul><ul><ul><li>IIS config, WFE 12 folder, search index files </li></ul></ul><ul><li>Under 200 GB </li></ul><ul><ul><li>You can use STSADM -o backup </li></ul></ul><ul><li>Over 200 GB </li></ul><ul><ul><li>Microsoft System Center Data Protection Manager 2007 (DPM) </li></ul></ul><ul><ul><li>AvePoint or any backup vendor that supports MOSS 2007 </li></ul></ul>
  14. 14. Structuring Data in SharePoint <ul><li>Farm </li></ul><ul><ul><li>Can have 1 or more Web Applications </li></ul></ul><ul><ul><li>Can have 1 or more Shared Service Providers (SSP) </li></ul></ul><ul><li>Web Application </li></ul><ul><ul><li>Logical “portal” or destination (simple URL like: http://documents ) </li></ul></ul><ul><ul><li>Each web app must belong to a single SSP (MOSS only) </li></ul></ul><ul><ul><li>Can have 1 or more site collections </li></ul></ul><ul><li>Site collection </li></ul><ul><ul><li>Can have 1 or more sites </li></ul></ul><ul><li>Site </li></ul><ul><ul><li>Can have 0 or more lists (document libraries) </li></ul></ul><ul><li>List </li></ul><ul><ul><li>Can have 0 or more items (documents, folders, etc…) </li></ul></ul>
  15. 15. DEMO <ul><li>Walk through of SharePoint Logical Structure </li></ul><ul><li>SharePoint Farm </li></ul><ul><ul><li>Web Application </li></ul></ul><ul><ul><ul><li>Site Collection </li></ul></ul></ul><ul><ul><ul><ul><li>Site </li></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>List (Document Library) </li></ul></ul></ul></ul></ul>
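The logical structure walked through in the demo can also be enumerated in code. The following is a sketch only, assuming it runs as a console application on a farm server under an account with access to every site; `FarmWalker` is a hypothetical name, and the output format is arbitrary.

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Administration;

// Hypothetical console app: prints web app > site collection > site > list.
public static class FarmWalker
{
    public static void Main()
    {
        foreach (SPWebApplication webApp in SPWebService.ContentService.WebApplications)
        {
            Console.WriteLine("Web application: " + webApp.Name);

            foreach (SPSite site in webApp.Sites)
            {
                try
                {
                    // Each site collection lives in exactly one content database.
                    Console.WriteLine("  Site collection: " + site.Url
                        + " (content DB: " + site.ContentDatabase.Name + ")");

                    foreach (SPWeb web in site.AllWebs)
                    {
                        try
                        {
                            Console.WriteLine("    Site: " + web.Url);
                            foreach (SPList list in web.Lists)
                            {
                                Console.WriteLine("      List: " + list.Title
                                    + " (" + list.ItemCount + " items)");
                            }
                        }
                        finally
                        {
                            web.Dispose(); // objects from AllWebs must be disposed
                        }
                    }
                }
                finally
                {
                    site.Dispose(); // objects from webApp.Sites must be disposed
                }
            }
        }
    }
}
```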
  16. 16. Site Collections <ul><li>Site collections can be used to partition data in SQL Server </li></ul><ul><ul><li>From the database perspective, a site collection is the smallest logical unit that can have its own database </li></ul></ul><ul><ul><li>All sites and documents created inside a site collection are stored in the same database </li></ul></ul><ul><ul><ul><li>Content Database </li></ul></ul></ul><ul><li>Site collection </li></ul><ul><ul><li>Must have at least one “root” site </li></ul></ul><ul><ul><li>Can be created programmatically </li></ul></ul>
  17. 17. DEMO <ul><li>Manually creating SharePoint Structure </li></ul><ul><ul><li>Site Collection </li></ul></ul><ul><ul><li>Content Database </li></ul></ul><ul><ul><li>List (Document Library) </li></ul></ul>
  18. 18. Content Database <ul><li>SharePoint Central Administration (Central Admin) </li></ul><ul><li>Disabled or Offline database: what does it mean? </li></ul><ul><ul><li>Still used by existing site collections </li></ul></ul><ul><ul><li>Can still create new sites and libraries, upload documents, and make changes </li></ul></ul><ul><ul><li>Can’t create a new Site Collection! </li></ul></ul><ul><li>How do you force documents into a specific content database? </li></ul><ul><ul><li>Disable all content databases except the one you want to use </li></ul></ul><ul><ul><ul><li>Take “Offline” in Central Admin </li></ul></ul></ul><ul><ul><li>Create new site collection </li></ul></ul>
  19. 19. Programmatic Approach <ul><li>Imitate manual approach </li></ul><ul><li>Steps </li></ul><ul><ul><li>Disable all content databases (Offline) </li></ul></ul><ul><ul><li>Create a new content database and make it enabled/online </li></ul></ul><ul><ul><li>Create a new site collection </li></ul></ul><ul><ul><ul><li>Site collection is stored in new content database </li></ul></ul></ul><ul><ul><li>Disable new content database </li></ul></ul><ul><ul><li>Restore the state back to the default online content database(s) for web application </li></ul></ul>
  20. 20. Create Content Databases Programmatically <ul><li>Reference Microsoft.SharePoint.dll </li></ul><ul><li>using Microsoft.SharePoint.Administration; </li></ul><ul><li>SPWebApplication webApplication = SPWebApplication.Lookup(webAppUri); </li></ul><ul><li>SPContentDatabaseCollection contentDBs = webApplication.ContentDatabases; // collection of databases </li></ul><ul><li>... </li></ul><ul><li>// loop through the collection and disable online databases </li></ul><ul><li>if (contentDB.Status == SPObjectStatus.Online) </li></ul><ul><li>{ </li></ul><ul><li>contentDB.Status = SPObjectStatus.Disabled; </li></ul><ul><li>contentDB.Update(); </li></ul><ul><li>} </li></ul><ul><li>... </li></ul><ul><li>CreateContentDatabase(yearString); // call routine (next slide) to create the DB </li></ul><ul><li>SPSite siteCollection = webApplication.Sites.Add(... // finally, add the new site collection </li></ul>
  21. 21. Create Content Databases Programmatically <ul><li>private void CreateContentDatabase(string suffix) </li></ul><ul><li>{ </li></ul><ul><li>SPContentDatabase copyFromDb = webApplication.ContentDatabases[0]; </li></ul><ul><li>string prefix = &quot;MyApp_ContentDB_&quot;; </li></ul><ul><li>string dbName = prefix + suffix; </li></ul><ul><li>int warningCount = 0; </li></ul><ul><li>int maximumSiteCount = 1; </li></ul><ul><li>int status = 0; // 0=ready, 1=Offline </li></ul><ul><li>SPContentDatabase contentDb = webApplication.ContentDatabases.Add(copyFromDb.Server, dbName, copyFromDb.Username, copyFromDb.Password, warningCount, maximumSiteCount, status); </li></ul><ul><li>} </li></ul>
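Putting the two slides above together, the whole sequence (disable, create, add site collection, restore) might look roughly like the sketch below. It is an illustration under stated assumptions rather than the presenter's actual code: the class name, method name, and parameters are hypothetical, the elided arguments on the slide are replaced by the three-argument `Sites.Add(siteUrl, ownerLogin, ownerEmail)` overload, and the restore step is placed in a `finally` block, which the slides do not show.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Administration;

// Hypothetical helper; must run on a farm server with farm admin rights.
public static class ContentDatabaseProvisioner
{
    public static void CreateSiteCollectionInNewDatabase(
        Uri webAppUri, string dbName, string siteUrl,
        string ownerLogin, string ownerEmail)
    {
        SPWebApplication webApplication = SPWebApplication.Lookup(webAppUri);

        // Snapshot the existing databases before the collection changes.
        List<SPContentDatabase> existing = new List<SPContentDatabase>();
        foreach (SPContentDatabase db in webApplication.ContentDatabases)
        {
            existing.Add(db);
        }
        if (existing.Count == 0)
        {
            throw new InvalidOperationException("Web application has no content database to copy settings from.");
        }

        List<SPContentDatabase> disabledHere = new List<SPContentDatabase>();
        SPContentDatabase newDb = null;
        try
        {
            // Step 1: disable every online database so none can take the new site collection.
            foreach (SPContentDatabase db in existing)
            {
                if (db.Status == SPObjectStatus.Online)
                {
                    db.Status = SPObjectStatus.Disabled;
                    db.Update();
                    disabledHere.Add(db);
                }
            }

            // Step 2: create the new database, copying server and credentials
            // from the first existing one, as on the slide (status 0 = ready).
            SPContentDatabase copyFromDb = existing[0];
            newDb = webApplication.ContentDatabases.Add(
                copyFromDb.Server, dbName,
                copyFromDb.Username, copyFromDb.Password,
                0 /* warning site count */, 1 /* maximum site count */, 0 /* status */);

            // Step 3: the only online database is the new one, so the
            // site collection is stored there.
            using (SPSite siteCollection = webApplication.Sites.Add(siteUrl, ownerLogin, ownerEmail))
            {
                Console.WriteLine("Created " + siteCollection.Url
                    + " in " + siteCollection.ContentDatabase.Name);
            }
        }
        finally
        {
            // Step 4: disable the new database.
            if (newDb != null)
            {
                newDb.Status = SPObjectStatus.Disabled;
                newDb.Update();
            }
            // Step 5: restore the databases this method disabled.
            foreach (SPContentDatabase db in disabledHere)
            {
                db.Status = SPObjectStatus.Online;
                db.Update();
            }
        }
    }
}
```

Tracking only the databases this method disabled, instead of re-enabling everything, keeps any database an administrator had deliberately taken offline in that state.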
  22. 22. DEMO <ul><li>Creating SharePoint Structure using API </li></ul><ul><ul><li>Site Collection </li></ul></ul><ul><ul><li>Content Database </li></ul></ul>
  23. 23. Sample Solution <ul><li>Document repository </li></ul><ul><ul><li>Needed to scale to millions of documents </li></ul></ul><ul><ul><li>Needed to be searched </li></ul></ul><ul><ul><li>Needed to be maintainable by SQL Admins </li></ul></ul><ul><ul><li>Stored files programmatically </li></ul></ul><ul><li>Solution </li></ul><ul><ul><li>Single web app </li></ul></ul><ul><ul><li>Site collection per time period </li></ul></ul><ul><ul><li>Each site collection had one site with one document library </li></ul></ul><ul><ul><li>Use folders to handle scalability </li></ul></ul><ul><ul><ul><li>Combination of date and business division and document meta data </li></ul></ul></ul>
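To make "site collection per time period" and the date-plus-division folder scheme above more concrete, here is one possible naming helper. The specific formats (a `2009_Q1` style suffix and a `2009-01-24/Finance` style folder path) are assumptions for illustration; the talk does not state the exact scheme that was used, and `RepositoryLayout` is a hypothetical name.

```csharp
using System;
using System.Globalization;

// Hypothetical naming scheme; no SharePoint dependency.
public static class RepositoryLayout
{
    // Suffix for a per-quarter content database or site collection, e.g. "2009_Q1".
    public static string QuarterSuffix(DateTime date)
    {
        int quarter = (date.Month - 1) / 3 + 1;
        return string.Format(CultureInfo.InvariantCulture, "{0}_Q{1}", date.Year, quarter);
    }

    // Folder path inside the document library, e.g. "2009-01-24/Finance".
    public static string FolderPath(DateTime date, string division)
    {
        if (string.IsNullOrEmpty(division)) throw new ArgumentException("division is required", "division");
        return string.Format(CultureInfo.InvariantCulture, "{0:yyyy-MM-dd}/{1}", date, division);
    }

    public static void Main()
    {
        DateTime day = new DateTime(2009, 1, 24);
        Console.WriteLine(QuarterSuffix(day));         // prints 2009_Q1
        Console.WriteLine(FolderPath(day, "Finance")); // prints 2009-01-24/Finance
    }
}
```

A suffix like this could be passed as the `suffix` argument of a routine such as `CreateContentDatabase`, so that database names line up with the time period they hold.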
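  24. 24. MOSS Content Database Storage Estimates
      <table>
      <tr><th>Year</th><th>New Docs</th><th>Total Docs</th><th>Doc File Size (GB)</th><th>Doc Lib Size (GB)</th></tr>
      <tr><td>Year 1</td><td>2,000,000</td><td>2,000,000</td><td>200</td><td>300</td></tr>
      <tr><td>Year 2</td><td>2,300,000</td><td>4,300,000</td><td>430</td><td>650</td></tr>
      <tr><td>Year 3</td><td>2,700,000</td><td>7,000,000</td><td>700</td><td>1000</td></tr>
      </table>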
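  25. 25. Search Index Storage Estimates <ul><li>Search Index files stored on the WFE </li></ul><ul><li>Search DB stores additional information used by search service </li></ul>
      <table>
      <tr><th>Year</th><th>Total Docs</th><th>Document Lib Size (GB)</th><th>Index Size (GB)</th><th>Search DB Size (GB)</th></tr>
      <tr><td>Year 1</td><td>2,000,000</td><td>200</td><td>15</td><td>60</td></tr>
      <tr><td>Year 2</td><td>4,300,000</td><td>430</td><td>30</td><td>130</td></tr>
      <tr><td>Year 3</td><td>7,000,000</td><td>700</td><td>50</td><td>200</td></tr>
      </table>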
  26. 26. Content Database Partition Strategy Goals <ul><li>Optimize Availability or Manageability? </li></ul><ul><li>Optimize Availability </li></ul><ul><ul><li>Backup/Restore recovery time </li></ul></ul><ul><ul><ul><li>Smaller (and quicker) is better </li></ul></ul></ul><ul><li>Optimize Manageability </li></ul><ul><ul><li>Manageable number of databases </li></ul></ul><ul><ul><ul><li>Fewer is better </li></ul></ul></ul>
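  27. 27. Database Partition Strategy: Database Size and Count
      <table>
      <tr><th>Database Partition Strategy</th><th>1 Content Database per Year</th><th>1 Content Database per Quarter</th></tr>
      <tr><td>Average Size</td><td>340 GB</td><td>85 GB</td></tr>
      <tr><td># of DB Year 1</td><td>1</td><td>4</td></tr>
      <tr><td># of DB Year 2</td><td>2</td><td>8</td></tr>
      <tr><td># of DB Year 3</td><td>3</td><td>12</td></tr>
      <tr><td>DB Per Year</td><td>1</td><td>4</td></tr>
      </table>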
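The trade-off between the two strategies above comes down to simple division of total storage by database count. The sketch below uses the 1,000 GB Year 3 document library estimate from the storage slide as input; that choice of input, and the names `PartitionPlanner`, `DatabaseCount`, and `AverageSizeGb`, are assumptions for illustration, and the slide's own averages (340 GB and 85 GB) appear to be rounded somewhat differently.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: plain arithmetic, no SharePoint dependency.
public static class PartitionPlanner
{
    public static int DatabaseCount(int years, int databasesPerYear)
    {
        if (years <= 0) throw new ArgumentOutOfRangeException("years");
        if (databasesPerYear <= 0) throw new ArgumentOutOfRangeException("databasesPerYear");
        return years * databasesPerYear;
    }

    public static double AverageSizeGb(double totalSizeGb, int databaseCount)
    {
        if (databaseCount <= 0) throw new ArgumentOutOfRangeException("databaseCount");
        return totalSizeGb / databaseCount;
    }

    public static void Main()
    {
        const double totalGb = 1000; // Year 3 document library estimate
        const int years = 3;

        int perYear = DatabaseCount(years, 1);    // one database per year
        int perQuarter = DatabaseCount(years, 4); // one database per quarter

        Console.WriteLine(perYear + " DBs, avg "
            + AverageSizeGb(totalGb, perYear).ToString("F1", CultureInfo.InvariantCulture) + " GB");    // 3 DBs, avg 333.3 GB
        Console.WriteLine(perQuarter + " DBs, avg "
            + AverageSizeGb(totalGb, perQuarter).ToString("F1", CultureInfo.InvariantCulture) + " GB"); // 12 DBs, avg 83.3 GB
    }
}
```

Per-quarter partitioning quadruples the number of databases to manage but cuts the size of each backup and restore unit to roughly a quarter.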
  28. 28. Other Considerations for document storage <ul><li>RBS </li></ul><ul><ul><li>Remote BLOB Storage (SQL Server) </li></ul></ul><ul><li>EBS </li></ul><ul><ul><li>External BLOB Storage (SharePoint) </li></ul></ul>
  29. 29. Remote BLOB Store Provider Library Implementation Specification <ul><li>Store BLOBs on a Remote BLOB Store (RBS) </li></ul><ul><ul><li>The RBS is typically a separate box on the same network as the SQL Server </li></ul></ul><ul><ul><li>Managed by SQL Server </li></ul></ul><ul><ul><li>Integrity between the database records and the RBS external store is maintained </li></ul></ul><ul><li>Microsoft SQL Server 2008 Feature Pack, October 2008 </li></ul>
  30. 30. External BLOB Storage (EBS) in WSS <ul><li>EBS runs parallel to the site's SQL Server content database, which stores the site's structured data </li></ul><ul><ul><li>To coordinate the two data stores, you must implement a COM interface </li></ul></ul><ul><ul><li>ISPExternalBinaryProvider </li></ul></ul><ul><ul><ul><li>Uses simple semantics to recognize file Save and Open commands </li></ul></ul></ul><ul><ul><ul><li>Invokes redirection calls to BLOB store when it recognizes BLOB data streams </li></ul></ul></ul><ul><ul><li>You must install, register, and configure the EBS Provider on each WFE </li></ul></ul>
  31. 31. Resources <ul><li>EBS for WSS </li></ul><ul><ul><li>Kyle’s Blog (extensive coverage of his implementation) </li></ul></ul><ul><li>Plan for MOSS Software Boundaries (Folder Structure) </li></ul>
  32. 32. Future Tools <ul><li>Visual Studio 2010 Tools for SharePoint </li></ul><ul><ul><li>Server Explorer for SharePoint: view lists and other artifacts in SharePoint directly inside Visual Studio </li></ul></ul><ul><ul><li>Windows SharePoint Services Project (WSP file) import to create a new solution </li></ul></ul><ul><ul><li>Packaging explorer and packaging editor let you structure the SharePoint features and the WSP file that is created </li></ul></ul><ul><li>SharePoint speculation </li></ul><ul><ul><li>Will SharePoint v.Next utilize RBS or EBS for document storage? Maybe 3rd-party options? </li></ul></ul>
  33. 33. Summary <ul><li>Automating SQL Server Database Creation </li></ul><ul><li>Planning for Document Content Storage </li></ul><ul><ul><li>Plan for MOSS Software Boundaries </li></ul></ul><ul><ul><li>Search Indexing </li></ul></ul><ul><ul><li>Backup/Restore and Availability </li></ul></ul><ul><li>Structuring Data in SharePoint </li></ul><ul><ul><li>Site Collections </li></ul></ul><ul><ul><li>Content Databases </li></ul></ul><ul><li>Partitioning Data in SQL Server </li></ul><ul><li>Sample Solution </li></ul><ul><li>Other Considerations for Document Storage, Future... </li></ul>
  34. 34. Thank you <ul><li>Questions? </li></ul><ul><li>Talbott Crowell </li></ul>