Effective SharePoint Scalability & Management. To BLOB or not to BLOB, that’s the question


In this session, we discuss challenges that remain when attempting to scale only using SharePoint’s native functionality. Afterward, we’ll share vital strategies and available solutions for ensuring seamless, centralized enterprise-wide management and efficient externalization of Binary Large Objects in order to free up valuable SQL Server space and subsequently improve SharePoint performance.

Published in: Technology

  • 150k AUD with 100k still in pipe
  • Every company has a vision to use SharePoint as the sole ECM. There are many steps and challenges along the way. The greatest leaps are going from ‘collaboration’ to ‘development’ and customization, and then again from ‘development’ to full ‘enterprise content management’ (ECM). You must have tools to get you there, and DocAve will accelerate your growth.
  • Single farm to full, multi-platform SaaS
  • Considerations for scaling your SharePoint environment for growth
  • Best practice for the recommended architecture: build redundancy and high availability (HA) into each tier of your production environment. Maintain a test environment where you QA all changes and code before deploying to production; this improves production stability.
  • Scaling brings stability; stability brings availability; availability comes through deployment (Dev -> Prod); deployment comes through architecture (virtual is common). One of the best ways to ensure production stability is to abide by the proven practice of maintaining a multiple-farm environment, keeping all development and testing completely separate from production. This means that if something like a looping workflow is created, only the dev or test farm will be brought down, and production will remain unaffected. If we had been testing in a separate web application on the production farm instead, we would have been in trouble. Maintaining this staged environment means all workflows, features, customizations, and metadata modifications, for instance, are properly tested, and the effect they’ll have on production is known before anything is installed. But assuming we set up global SharePoint environments and comply with the Microsoft best practice of maintaining separate SharePoint farms for Dev/Testing/Staging/Prod, how do we manage changes (to design elements, solutions, features, workflows, site content) across these SharePoint deployments? Documentation is key here: a step can be missed when installing a feature, or we forget to reset IIS, or forget that one file that was stored on the other web front end. Repeatability is key. You’ll have to be diligent at documenting change and who’s responsible for which components of change. To make your lives easier in this respect, there are 3rd party tools available to help you through this process. All-inclusive deployment: design, solutions, WFE.
  • One of the great new features of service applications is that they can be published and shared between farms. This is achieved by the source server “publishing” one or more of its service applications. When a service application is published a URL is created that can be fed into the second farm, being the consumer of the service. The servers are required to be configured appropriately first, by doing things such as exchanging security certificates so the farms will trust each other, but once that is done the consumer will be able to interact with the service as if it was local, and it can associate it with its local web applications.
  • This theory of publishing a service application opens up some interesting possibilities around how you can architect environments that have multiple farms, and one such way to work with this is the model of having a services farm. The premise is that you can have one farm that will run service applications that would be common to your organisation, such as user profile import, or enterprise search, or managed metadata for an enterprise taxonomy. These could then be managed and run in only one location and then published out for other farms in the organisation to consume. This means that user profiles can be imported once and used globally, or content can be indexed in one location and the index published out to other farms, or a taxonomy can be built in one place and then shared to the entire organisation.
  • A fully distributed global architecture provides quick access to local SharePoint content with a good user experience. Replicate only the relevant or global content. Handle special remote locations (like the Alaska oil rig in this slide) via local infrastructure and replication; this requires a 3rd party tool.
  • Distribute admin tasks (site collection administration, permissions management, etc.). It may suit you to break up into separate site collections for different business units in order to achieve the desired governance. You can only distribute tasks so much; beyond that you will require additional personnel or an admin tool.
  • There are also significant cost savings to be realized by moving your data through different tiers of storage. I’ve provided an example for a 1 TB content database stored on Tier 1 storage. In this example, the customer saved $11k on a single database by moving data to a cheaper tier of storage. Discussion: Are these numbers accurate? What are people paying for Tier 1 storage these days? Tier 2? Tier 3? In my example, I’ve referenced cloud storage in Tier 3. Is anyone storing data in the cloud? Are people comfortable storing data in the cloud?
  • Going back one step, what is actually stored in SharePoint and how is it stored? 95% of what’s stored is BLOBs (Binary Large Objects). A BLOB is anything I’m adding to SharePoint (> 256 KB).
  • So, SharePoint content = BLOB + Metadata
  • So, content DB = database of … BLOBs + Metadata
  • This graph shows requests per second (RPS) for a varying number of users. RPS for a SQL DB is on the left; RPS with RBS is on the right. For SharePoint, the ability to scale is critical. In the first graph, as we increase user threads, the number of requests SharePoint is able to handle per second decreases. This correlates with the green line, which shows that as user count increases, the time it takes SharePoint to respond to a request increases drastically. Now let’s look at a SharePoint environment with its BLOBs externalized: right off the bat, SharePoint is able to handle many more requests per second than before, and the response time per request stays extremely low. This requires further investigation to see what happens when we get up into the thousands of user threads. For this test, we kept the user threads to 500 and under, as this is where we see drastic changes in how many requests per second SharePoint is able to handle. This test was conducted on SharePoint 2007, so it would also be interesting to see how the results change between SharePoint 2007 and 2010, natively and then with EBS or RBS enabled.
  • For SharePoint 2010, Microsoft recommends you split a content DB when it hits 200 GB. For 2007, split when it’s around 90 GB.
  • Here’s a look at a very basic SharePoint architecture. Here we have the front end with the object model, and the SharePoint content (BLOBs and metadata) is stored in SQL.
  • No native archiving tools. EBS extended to include RBS for BLOB removal: available only in SQL Server 2008 SP2; only accessible via API. BCS (BDC in 2007) extended to allow for easier connectivity with legacy data systems: not intended for controlling growth, only for exposing additional data from other systems. Integrated with SharePoint! Users can access contents by: clicking and downloading directly through SharePoint; opening the file using their Office client; referencing the URL; searching for contents natively in SharePoint. Users can interact with contents by: modifying metadata and content types; modifying permissions; applying alerts; using workflows or publishing templates; using site quotas and locks.
  • So now let’s take a look at RBS… As RBS is SQL specific, it can be used across applications that leverage SQL, not just SharePoint, so this gives you more of an enterprise-wide storage architecture versus EBS, and here’s how enabling an RBS provider would affect your SharePoint storage architecture. You can have an RBS provider per database. No context, no ability to manage the object
  • Free to help with your migration planning.
  • Anticipation starts when you plan your architecture
  • When you anticipate growth and architect your environment for scalability, you reap the benefits! HOWEVER, we do need to realize that externalization, while great for performance, user experience, and cost, does create complexities in how we manage our infrastructure. Take backup and recovery operations, for instance: we still need a way to back up BLOBs SYNCHRONOUSLY. If we want to leverage content management tools for restructuring, or replication technology for keeping multiple sites or farms in sync, those tools need to be able to account for externalized content: whether they copy or move the BLOBs as well, whether they copy or move just the stubs, or even whether replication technology can copy the stubs and redirect with DFSR. Backup implications: you need a method to back up BLOBs synchronously, or your DBs can get out of sync with your file system. SharePoint 2007: this isn’t very efficient. SharePoint 2010: this works very well. To BLOB or not to BLOB (MIT Research for Microsoft): <256 KB, SQL better; 256 KB to 1 MB, SQL and file system comparable; >1 MB, file system better. Microsoft will provide a PowerShell solution to migrate from EBS to RBS (check this fact!). And last but not least: IT’S COMPLETELY SEAMLESS FOR END USERS! End users can still access and INTERACT with content! Content still works with workflows, alerts, Office applications, etc.
  • It is extremely critical to factor BLOB externalization into your data protection strategies, just as you should consider it in other content management strategies. And here’s why…
  • These are just a few options for planning for BLOB externalization. The thing to remember here is that even if some of these methods keep BLOBs protected, if they don’t allow you to perform SYNCHRONOUS backups of your databases AND your BLOB store (whether it’s in the cloud or elsewhere), the BLOB backups will only be useful if, by chance, the timer jobs are in sync. Make sure to test whatever strategy you want to leverage.
  • To optimize user experience, and users’ reliance on SharePoint as a platform (a safe place to store their documents, etc.), you’ll need to find a way to provide granular restore capabilities. A perfect example: I’ve set up countless SharePoint libraries internally. In one, I made the mistake of not turning versioning on right away to keep track of document histories, which is critical to collaboration. While looking through a SharePoint doc, I could have sworn I’d made other edits, so I thought maybe the latest version was on my laptop. Instinctively, I uploaded the document and overwrote the file. Come to find out, I’d actually uploaded a previous version, overwritten my latest edits, and lost a week’s worth of work. This is a risk. Because I didn’t delete the file, it wasn’t in the recycle bin; the document had to be restored for me to continue.
  • Now let’s look at archiving. We mentioned this was another way to optimize storage to save costs, increase SharePoint’s scalability, and improve performance. First, a recap of the types of archiving, because depending on who you’re talking to, it could mean a couple of different things. First, there’s archiving for compliance. Natively, SharePoint offers the Records Center. If you’re leveraging the Records Center, be aware that it is essentially just another location, still in SQL, to store content. The best practice here would be to put the Records Center on its own database and leverage RBS to offload content. Next, we have archiving for storage savings: essentially, leveraging multiple tiers of storage, not just tiers 0 and 1 or 1 and 2 like we do with BLOB externalization, but maybe tiers 1-4, for example, to achieve the greatest cost savings. Natively, I mentioned there were no archiving tools, but in reality you could just create backup files of the content you’d want to “archive” and then delete that content from SharePoint. Or you can look at 3rd parties, like AvePoint’s DocAve Archiver, to build business rules into your archival plans.
  • Things to look for in 3rd party tools: EBS/RBS support is key. It keeps content “in” SharePoint: accessible, and open to interaction.
  • Effective SharePoint Scalability & Management. To BLOB or not to BLOB, that’s the question
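
The size thresholds quoted in the notes above (<256 KB, 256 KB to 1 MB, >1 MB) can be sketched as a simple routing rule. This is a minimal illustration in Python, not a SharePoint or RBS API: the function name `recommend_store` and the labels it returns are hypothetical, and treating exactly 256 KB and exactly 1 MB as "comparable" is an assumption the notes do not settle.

```python
KB = 1024
MB = 1024 * KB


def recommend_store(size_bytes: int) -> str:
    """Map a file size to the storage location the quoted thresholds favor.

    Hypothetical helper: the labels and the boundary handling are assumptions.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < 256 * KB:
        return "sql"          # under 256 KB: SQL better
    if size_bytes <= 1 * MB:
        return "comparable"   # 256 KB to 1 MB: SQL and file system comparable
    return "file_system"      # over 1 MB: file system better


if __name__ == "__main__":
    # Sample sizes are made up for illustration.
    for size in (40 * KB, 512 * KB, 25 * MB):
        print(size, recommend_store(size))
```

A real RBS provider would apply a rule like this when deciding whether to externalize an object; the sketch only shows the decision itself.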

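To make the synchronous-backup concern from the notes above concrete, here is a hedged Python sketch of a consistency check between a database backup and a BLOB-store backup. Everything in it is illustrative: `compare_backups`, `is_restorable`, the idea of identifying BLOBs by string IDs, and the sample IDs are assumptions, not part of SharePoint, RBS, or any vendor tool.

```python
from typing import Dict, Iterable, List


def compare_backups(stub_blob_ids: Iterable[str],
                    stored_blob_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Compare BLOB IDs referenced by stubs in a database backup with the
    BLOB IDs actually present in a BLOB-store backup (hypothetical model)."""
    stubs = set(stub_blob_ids)
    stored = set(stored_blob_ids)
    return {
        # Referenced by the database but absent from the BLOB backup:
        # these documents would be broken after a restore.
        "missing": sorted(stubs - stored),
        # Present in the BLOB backup but not referenced by the database:
        # wasted space, but not data loss.
        "orphaned": sorted(stored - stubs),
    }


def is_restorable(report: Dict[str, List[str]]) -> bool:
    """A backup pair is usable only if no referenced BLOB is missing."""
    return not report["missing"]


if __name__ == "__main__":
    # Sample IDs are made up for illustration.
    report = compare_backups(["a1", "b2", "c3"], ["b2", "c3", "d4"])
    print(report, is_restorable(report))
```

If the two backups are taken at different times, the "missing" list is where the drift shows up, which is why the notes insist on testing whichever strategy is chosen.
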
    1. Twitter: @GarthLuke | Garth Luke (MCSE, MCP) | Vice President, Sales | AvePoint, Inc. | Effective SharePoint Scalability & Management: To BLOB or not to BLOB, that’s the question…
    2. AvePoint Company Overview | AvePoint Confidential and Proprietary | World’s Largest Provider of Integrated SharePoint Infrastructure Management | Backup & Recovery, Administration, Replication, Migration, Compliance, Storage Optimization | Products & Customer Growth | Products | Customers
    3. Reduced total migration time to Microsoft's internal hosted SharePoint 2010 environment by two months | Consolidation is actively happening | Migrated 12,000 site collections from SharePoint 2007 to SharePoint 2010 | Transferred approximately 200 lists to SharePoint 2010 while maintaining customizations, metadata, and field values | Minimized business disruption by scheduling migration jobs to automatically occur off-hours | http://www.avepoint.com/about/mtc-migration-to-2010/
    4. Real World Scalability Examples | Access 15TB of file-share data within SharePoint without migration | Reduce project time by 9-12 months | Enabled full SharePoint presentation & management of legacy file-share content without extra storage cost | http://www.avepoint.com/resources/case-studies/
    5. Architecting Scalability for Growing SharePoint Environments
    6. Growing With SharePoint | Diagram labels: Enterprise Content Management, Line of Business Applications, Return on Investment, Collaboration Tool, More Valuable, Content Repository, More Complex
    7. SharePoint Needs
    8. SharePoint Lessons | Information Architecture is ongoing | Changing Topology | Changing Taxonomy | Consolidation is actively happening | Global Farms – Central Farms | Service applications are the future | SharePoint as a Business O.S.
    9. Implications of Growing Deployments | Platform availability and integrity | Scalability on settings, permissions, and policies | High cost of storage | Binary large objects’ (BLOBs) impact on performance and scalability
    10. Optimizing Scalability | Architect for Scale and Global Access | Physical Architecture | Administration Considerations | Network Considerations | Bandwidth Considerations | Accommodating Growth: Storage | RBS or EBS (Plus a 3rd Party Provider) | FileStream
    11. Architecting for Scalability: Physical Architecture | Build redundancy into production to decrease downtime | Recommend using a multi-stage approach | Development | Testing / Quality Assurance | Staging / Pre-production | Production | Ensure all multi-stage environments are identical
    12. Architecting for Scalability: Physical Architecture | Scaling brings Stability | Stability brings Availability | Availability through multi-stage (Dev, Test, Prod) | Deployment through Architecture (Virtual is common)
    16. Example: Multiple Farms Sharing Services | Farm B | Farm A | Remote farm consumes published services via HTTP/S | Servers providing service apps can publish specific apps | Other servers in the farm
    17. Example: Shared Services Farm | Services Farm | Farm A | Farm B | Farm C
    18. Provide Fast Access for Global Users | Plan for bandwidth limitations | Support externalized content | Consider geo-replication
    21. Content Publication | Consistency is Key – SharePoint Ecosystem | Two-way replication with conflict resolution | Local server for data survivability | Publication of solutions / applications | Business-rule driven replication and publication
    22. Plan for Growth: Scaling Administration | Distribute Admin tasks | Don’t forget about governance! | Who can create sites and subsites? Who can delete them? | What are my main content types and what metadata should be required for each? | Who manages term stores and content type hubs? Who can add terms? | Who can add content? Is there a review process? | Who can add users and edit permissions? What are the security groups? | Consider 3rd Party Administration Tools
    23. Accommodating Growth: the BLOB problem, performance issues and costs
    24. Planning for Growth: The Big Picture | Problem begins with initial migration | Need data for legal retention | SLAs still cover ALL SharePoint content | Data in SQL Server
    25. Storage Decisions for SharePoint | Comfort level vs. Cost of Storage | What makes the most sense for SharePoint Data?
    26. What is stored in SharePoint? | BLOB (Binary Large OBject) = Basically, a file
    27. What is stored in SharePoint? | Metadata | BLOB
    28. What is stored in SharePoint? | Content Database | [Diagram: repeated Metadata and BLOB entries]
    29. Performance Scalability
    30. Solution to BLOB problem: Externalization
    31. Preventative Measures | Set site quotas and alerts! | 10 GB quota, 8 GB alert is my favorite | Monitor growth trends | Sites: slow over time or large jump in size? | Overall content DB size | Split Content DBs if they get “too big”
    32. Modify your storage architecture | Extend BLOBs out of SQL | BLOBs: Binary Large Objects | SharePoint Content = BLOB + Metadata | Content DB = database of … BLOBs + Metadata | Archive content
    33. Default SharePoint Storage | Diagram labels: SharePoint WFE, SharePoint Object Model, BLOBs & Metadata, SQL Server, Content DB, Config DB
    34. BLOB Externalization: RBS & EBS | RBS: Remote BLOB Storage | For 2010 only | Introduced in SQL Server 2008 R2 Feature Pack | EBS: External BLOB Storage | Introduced in SharePoint 2007 SP1 | On deprecation list in SP2010 | EBS to RBS migration can be performed with PowerShell or 3rd party tool
    35. RBS | Not unique to SharePoint, available to any application | A Provider Library can be associated with each database | Diagram labels: SharePoint WFE, SharePoint Object Model, BLOB & Metadata, SQL Server, Relational Access, RBS Client Library, Metadata, BLOB, Provider Library X, Provider Library Y, Content DB X, Content DB Y, BLOB Store, BLOB Store
    36. Anticipate Growth from the Start | Leverage RBS in SharePoint – 3rd party tools | User and API driven | Transparent user access | Transparent to development | Diagram labels: Stub, Metadata, BLOB, Upload, Database, Extender, File, Disk Storage, Web Front-end, User
    37. Architecture Scalability: Anticipate Growth | Diagram labels: Web Front-End Servers, Application Servers, Extender, Connector, Storage, Access, Cloud Storage, File Server, Clustered SQL Server
    38. Benefits of Extending BLOBs | Performance: Improving User Experience | Performance increases as the BLOB sizes decrease. | <256 KB, SQL better | 256 KB to 1 MB, SQL and file system comparable | >1 MB, file system better | Saves storage costs | Beware of Misconceptions! | Backup & Recovery operations improved?? COMPLICATED! | Databases are 60-80% smaller, but metadata & BLOBs are covered under same SLA. Synchronous backups (all-inclusive) are necessary to maintain consistency. | Integration of EBS/RBS Providers with other infrastructure management solutions is critical!
        Backing up BLOBs | Because we’ve changed the storage location of the content (BLOBs)… | Database-based backup solutions will NOT capture the content, only the metadata. | Need a plan to back up BLOBs synchronously | Out of sync timer jobs could cause data corruption!
    39. Complete SharePoint Data Protection | SharePoint Ecosystem – Item Level Recovery | Customisations: Hive, GAC, Gallery, Site Definition, Solutions, Cust. Features | Content: Content DB, Search Index, Web Application, Site Collection, Site, List/Library, Folder, Item/Document, Version, Metadata | SharePoint Configurations: Central Admin DB, Config DB | System Configurations: IIS Metabase, IIS Settings, Web.Config, InetPub | Externalised Data (BLOB) | Binary File (OS / SharePoint)
    58. Platform vs. Granular Backup | Granular | Contents within database | Quickest for day-to-day recovery | Flexible for aggressive SLA | Segment data by business unit | Full farm consistency | Consistency / DR | Requires staging / indexing | Larger roll-back points | VSS / hardware point of integration | Platform
    59. Not All Data is Created Equal | SQL Database | Row 1 (frequency labels: Hourly, Hourly, Daily): Wikis, Support FAQs/References, Document Libraries, etc.; Ongoing projects, Active meeting sites, etc.; Sales leads, Customer records, etc. | Row 2 (frequency labels: Hourly, Daily, Weekly): Support User Guides, Training Materials, Blogs; Time Sheets, Price Sheets, Other meeting sites, etc.; Financial reports, Daily sales reports, etc. | Row 3 (frequency labels: Daily, Weekly, Weekly): HR employee guides, Personal sites, Vacation Policies, etc.; Marketing brochures, Sales materials, Pre-sales literature, etc.; Annual reports, Mo. sales reports, Board reports, etc.
    60. BLOB Backup and Recovery Options | Hardware snapshots (if externalizing to same hardware, e.g. NetApp) | Cloud storage (offers built-in redundancy for DR) | Most SLAs will be for entire databases/content stores, many may not have granular recovery SLAs, or allow for synchronous backups | DFSR: replication of file shares storing BLOBs | Restore from replicated location | Most SLAs will be for entire databases/content stores; consider data corruption, ability to perform synchronous backups, etc. | 3rd party platform tools | Are synchronous backups of file shares and SharePoint DBs achievable?
    61. Planning for Platform Recovery | Account for: | Data corruption | Accidental deletions | Etc… | Test! | How long does it take? | What are the compliance implications? | If metadata (author, time, etc.) changes on restore, have I “falsified records”? | If I can’t recover a single document, have we accidentally “destroyed data”? | Don’t forget about your item-level recovery strategy
    62. Managing the content lifecycle of BLOBs | Archiving for RM: Records Center | Another SharePoint site | Higher % inactive content | Consider separate Content DB, with an RBS provider implemented for this DB | Archiving for Storage Savings: | Backup and delete | Workflow | 3rd party tools
    63. 3rd Party Archiving Tools | What rules are available? | Last modified time | Author | Versions | What scope can I apply rules to? (farm to item) | Does it use RBS/EBS APIs? | Does it integrate with other infrastructure management tools? (backup, replication, etc.)
    65. DocAve Architecture | SharePoint 2010 | Hosted SharePoint | SharePoint 2007 | SQL Databases | Cloud Storage | File Server
    66. Reporting and Analytics | Monitoring and Reporting: Track SharePoint index status, network bandwidth usage, etc. | Infrastructure Reporting: Storage usage growth by business unit; Difference reports and policy enforcement | SharePoint Usage Analysis: Comprehensive auditing; Track user behavior and individual disk space usage
        30 Day FREE Trial | Download a free evaluation at www.avepoint.com/download | FREE modules: DocAve Extender, DocAve Monitor, DocAve Restore Controller
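
The preventative measures in the deck (a 10 GB site quota with an 8 GB alert, and splitting a content DB at around 200 GB on SharePoint 2010 or 90 GB on 2007) can be expressed as a small monitoring check. This is a sketch in Python under stated assumptions: the function names and status labels are hypothetical, the thresholds are simply the figures quoted in the slides and notes, and how the sizes are collected is left out.

```python
# Split thresholds in GB, as quoted in the speaker notes (not an official API).
SPLIT_THRESHOLD_GB = {"2007": 90, "2010": 200}


def site_quota_status(size_gb: float, quota_gb: float = 10, alert_gb: float = 8) -> str:
    """Classify a site collection against its quota and alert level."""
    if size_gb >= quota_gb:
        return "over_quota"
    if size_gb >= alert_gb:
        return "alert"
    return "ok"


def should_split_content_db(size_gb: float, version: str) -> bool:
    """Return True when a content DB has reached the quoted split threshold."""
    if version not in SPLIT_THRESHOLD_GB:
        raise ValueError(f"no threshold quoted for SharePoint {version}")
    return size_gb >= SPLIT_THRESHOLD_GB[version]


if __name__ == "__main__":
    # Sample sizes are made up for illustration.
    print(site_quota_status(8.5))
    print(should_split_content_db(150, "2010"))
    print(should_split_content_db(150, "2007"))
```

Running a check like this on a schedule is one way to "monitor growth trends" before a database gets "too big".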