2. Long time IT guy
Using SharePoint since 2001
Author of multiple books and articles.
Regular Speaker at SharePoint Saturday and conferences
3. About BLOBs and storage in SharePoint
Introduction to Content Externalization and why do it?
EBS
RBS
Pulling it all together
4. Everything goes into the content database (SQL)
Metadata
Files
Metadata is vital to SharePoint Success
Great for SQL: small amounts of structured data
Easy for SQL's page architecture (good I/O profile)
Files Stored as a BLOB (Binary Large Object)
5. Binary Large Object
A file in a database
90% of a typical content database is made up of BLOBs
6. How does SQL handle a BLOB?
Rows of data cannot exceed 8K
Pointer in a row to larger data made up of multiple pages.
It's all about the pages
8K at a time
50MB file = 6,400 pages
Don’t forget about database fragmentation
BLOBS are a big cause of this
BLOBs can be problematic
Especially for files over 1MB
Also does not help with DB fragmentation
7. [Diagram: Get and Save requests pass through the SharePoint Object Model, which enforces business logic, and on to the content and config databases in SQL Server]
9. It's the status quo
It's not different.
No Change (who moved my cheese)
Transactional consistency
1 stop shop
10. On average, 90-95% of a SharePoint content database's storage overhead is comprised of content BLOBs.
BLOBs can be stored on less-expensive storage and SQL is no longer burdened with inefficient BLOB I/O.
BLOBs can be remoted to WORM-compliant storage platforms like EMC Centera, Hitachi HCAP, or OSAR.
You can implement HSM where content can be moved to less expensive storage tiers as it becomes less relevant.
Database sizing guidelines become largely irrelevant.
[Diagram: DOC, TIFF, PDF, XLS and PPT files remoted to network-addressable, content-addressable, or cloud storage]
11. Generally faster content upload and retrieval.
Substantially faster for large (> 100MB) content and bulk operations.
Implement multi-tiered content storage chargeback models leveraging on-premise and/or Cloud-based platforms.
Substantially increase the speed of upgrade/migration processes from
SharePoint 2007 to 2010.
Content can be compressed or de-duplicated, adding to storage cost
savings.
Content can be encrypted for greater transmission and storage security.
12. All Content in SQL: Total Cost $120,000 (SAN or DAS)
BLOBs Remoted to NAS: Total Cost $75,000 (SAN or DAS plus NAS)
[Diagram: DOC, TIFF, PDF, XLS and PPT files shown in each configuration]
13. Tiered storage: Total Cost $35,200
Storage tiers: SAN or DAS, NAS (Tier 2), NAS (Tier 3), Cloud (Tier 4)
[Diagram: DOC, TIFF, PDF, XLS and PPT files spread across the tiers]
14. 5TB of Content.
Costs $120K to store in SQL on Tier 1 storage.
Costs $75K to offload 90% to Tier 2 (NAS) storage.
Costs $35K to archive BLOBs to less expensive tiers as they age, making room
for new content in the more expensive tiers.
The ultimate goal of archiving/tiered storage is to make incremental
investments in storage on the least expensive tiers only, move content to those
tiers as quickly as possible, and make room for new content in the more
expensive tiers.
16. EBS (External BLOB Storage): introduced by the SharePoint team in WSS v3 SP1
Farm scoped
COM interface, requires implementation by a provider
Implements Save and Load functions (save binary, retrieve binary)
Once saved, the provider returns a BLOB ID to the system, which is saved in place of the BLOB itself.
When retrieving, SharePoint recognizes that there is a BLOB ID rather than a BLOB and hands the ID to the EBS provider
Not deemed a long-term solution
No planned migration path to future technologies
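The save and load hand-off described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea (a BLOB ID stored in place of the BLOB), not the EBS COM interface itself; the class and method names (InMemoryEbsProvider, ContentDatabase, save_file, open_file) are hypothetical.

```python
import uuid


class InMemoryEbsProvider:
    """Hypothetical stand-in for an EBS provider (real providers are COM components)."""

    def __init__(self):
        self._store = {}  # external BLOB store: blob_id -> bytes

    def save(self, data: bytes) -> str:
        # Store the binary externally and hand back an opaque BLOB ID.
        blob_id = uuid.uuid4().hex
        self._store[blob_id] = data
        return blob_id

    def load(self, blob_id: str) -> bytes:
        # Retrieve the binary for a previously returned BLOB ID.
        return self._store[blob_id]


class ContentDatabase:
    """Toy content database: each row holds either inline bytes or a BLOB ID."""

    def __init__(self, provider=None):
        self.rows = {}
        self.provider = provider

    def save_file(self, name: str, data: bytes) -> None:
        if self.provider is None:
            self.rows[name] = ("inline", data)
        else:
            # The ID is saved in place of the BLOB itself.
            self.rows[name] = ("blob_id", self.provider.save(data))

    def open_file(self, name: str) -> bytes:
        kind, value = self.rows[name]
        if kind == "blob_id":
            # A BLOB ID was found instead of a BLOB: hand it to the provider.
            return self.provider.load(value)
        return value
```

With a provider configured, the row for a saved file holds only the ID, and opening the file goes back through the provider to fetch the bytes.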
17. Provider based
No default garbage collection
Provider must account for this
No effect on existing content
Provider must account for this
Export-Import options
No direct SQL integration
Purely implemented by SharePoint
18. Remote Blob Storage
Implemented entirely by SQL Server 2008 and later
Uses a Provider Model
It's all about the provider
Implemented by Managed Code (Not COM)
Microsoft provides a default provider based on FILESTREAM
Only available for local disks because of the FILESTREAM limitations
No encryption
No Mirroring Support
20. [Diagram: a SQL Server content database connected to BLOB Store 1, BLOB Store 2 and BLOB Store 3]
21. [Diagram: RBS save flow through the SharePoint Object Model, Relational Access, the RBS Client Library, the BLOB Store Provider Library, the BLOB Store, and the content and config databases in SQL Server]
1. Save Request
2. Enforce Biz Logic
3. Save BLOB
4. Write BLOB
5. Return BLOB Id
6. Save Metadata & BLOB ID
7. Back to User
22. [Diagram: RBS read flow through the SharePoint Object Model, Relational Access, the RBS Client Library, the BLOB Store Provider Library, the BLOB Store, and the content and config databases in SQL Server]
1. Open Document
2. Enforce Biz Logic
3. Get BLOB Id
4. Read BLOB
5. Read BLOB
6. Return BLOB
7. BLOB Data to User
23. Provider based on “FileStream” functionality
All blobs must be local to the SQL server
No management interface
No monitoring
Rudimentary garbage collection
24. RBS Maintainer: console application
Takes parameters to run per database
Can be used with Task Scheduler
Reference Scanning
Find orphans
Deletion Propagation
Delete them
Orphan Cleanup
Get rid of the laggers
Keep in mind the true garbage timeline (Recycle Bin)
26. Create FILESTREAM filegroup for content database
Install RBS
Activate Provider (PowerShell)
Test
27. use [WSS_Content_Blob]
-- Create a database master key if one does not already exist
if not exists (select * from sys.symmetric_keys where name = N'##MS_DatabaseMasterKey##')
    create master key encryption by password = N'Admin Key Password !2#4'
use [WSS_Content_Blob]
-- Add a FILESTREAM filegroup for the RBS provider if it is missing
if not exists (select groupname from sysfilegroups where groupname = N'RBSFilestreamProvider')
    alter database [WSS_Content_Blob] add filegroup RBSFilestreamProvider contains filestream
use [WSS_Content_Blob]
-- Point the filegroup at a local folder that will hold the BLOB data
alter database [WSS_Content_Blob] add file (name = RBSFilestreamFile, filename = 'c:\Blobstore') to filegroup RBSFilestreamProvider
29. $cdb = Get-SPContentDatabase -WebApplication http://falcon
$rbss = $cdb.RemoteBlobStorageSettings
$rbss.Installed()
# Return should be True
$rbss.Enable()
$rbss.GetProviderNames()
# Return will be RBSFilestreamFile2
$rbss.SetActiveProviderName($rbss.GetProviderNames()[0])
31. Backup and Restore Order of Operations
[Diagram: a backup timeline running from Backup Start to Both Backups Complete, and a restore timeline running from Restore Start to Both Restores Are Complete]
32. By default getting the content out of the database only solves a small
percentage of the real problem
You must strive to drive real efficiency in your environment
Tiered Storage!
Easier planning for RTO, RPO and Recovery Targets
EBS VS RBS
RBS is said to be the path forward
RBS is application agnostic
EBS does not require SQL 2008
EBS works in both 2007 and 2010
SQL Server stores data on disk in 8KB chunks called pages. When SQL Server needs to change a single byte of data, it loads, at a minimum, 8KB of data off disk, makes the change in memory, then writes that 8KB page back to disk. Normally, SQL Server has to store an entire table row within a single page, meaning a single row of data can't exceed that 8KB limit (the actual number is slightly smaller, since each page has a small amount of overhead for management data). However, SQL Server does allow a row of data to contain a pointer to larger pieces of data, which can then be spread across multiple pages. Figure 1.3 illustrates this storage mechanism, with a single row of data on the first page, containing a pointer to several sequential pages that contain a large string of data: perhaps a photo, a Word document, or some other large piece of information.
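The page arithmetic in this note can be checked with a short Python sketch. It assumes a full 8KB (8,192-byte) payload per page and ignores the per-page management overhead mentioned above, so the result is a lower bound; the function name pages_needed is illustrative only.

```python
PAGE_SIZE_BYTES = 8 * 1024  # SQL Server page size; per-page overhead is ignored here


def pages_needed(size_bytes: int, page_size: int = PAGE_SIZE_BYTES) -> int:
    """Return the minimum number of pages needed to hold size_bytes."""
    # Integer ceiling division: a partially filled last page still counts as a page.
    return -(-size_bytes // page_size)


fifty_mb = 50 * 1024 * 1024
print(pages_needed(fifty_mb))  # 6400
```

Even under this optimistic assumption, a single 50MB file is spread over thousands of pages, which is why large BLOBs give SQL Server a poor I/O profile.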
A new white paper says that RBS will improve performance if the average file size is greater than 80KB.
Make sure you have storage appropriateness
Many people put critical data in SharePoint
Many people put trivial data in SharePoint
They are all stored in the same place (typically)
Analyze what you are putting where because...
Typical Tier 1 storage (the good stuff) costs $12 per GB
Typical Tier 2 storage costs $7 per GB
Typical Tier 3 storage costs $3 per GB
Cloud storage costs $0.10 per GB (plus transfer)
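The per-GB prices in this note can be turned into a small cost calculator. This is a rough sketch: prices are held in whole cents to avoid floating-point rounding, cloud transfer charges are ignored, and the 10,000 GB capacity used in the example is an assumption, not a figure from the deck. It is simply the capacity at which these prices reproduce the $120,000 and $75,000 totals shown on the cost slides.

```python
# Price per GB in cents, taken from the tier prices quoted in the note.
PRICE_CENTS_PER_GB = {"tier1": 1200, "tier2": 700, "tier3": 300, "cloud": 10}


def storage_cost_dollars(gb_by_tier: dict) -> float:
    """Total storage cost in dollars for a {tier_name: gigabytes} allocation."""
    cents = sum(PRICE_CENTS_PER_GB[tier] * gb for tier, gb in gb_by_tier.items())
    return cents / 100


# Assumed capacity: 10,000 GB provisioned in total (hypothetical figure).
all_in_sql = {"tier1": 10_000}
offload_90_percent = {"tier1": 1_000, "tier2": 9_000}

print(storage_cost_dollars(all_in_sql))          # 120000.0
print(storage_cost_dollars(offload_90_percent))  # 75000.0
```

Adding "tier3" and "cloud" entries to the allocation shows how the total keeps falling as older content is pushed down to cheaper tiers.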
The COM interface recognizes file Save and Open commands and invokes redirection calls to the EBS. The EBS Provider also ensures that the SQL Server content database contains metadata references to their associated BLOB streams in the external BLOB store.
You install and configure the EBS Provider on each Web front end server in your farm. In its current version, external BLOB storage is supported only at the scope of the farm (SPFarm).
The SharePoint object model interacts with the SQL Server configuration and content databases.
RBS introduces a new SQL RBS Client Library that allows the SharePoint 2010 object model to store BLOB data externally.
An RBS Provider API allows storage vendors to create an external storage system for SharePoint 2010. The provider model makes it easy to switch to another provider, which ultimately comes down to choosing another repository for storing BLOBs.
An RBS installation requires one or more BLOB store providers implementing the Provider API. You can associate one active BLOB store provider with a specific content database.
Every BLOB store provider uses its own specialized type of BLOB store. BLOB stores are only accessed through their custom providers.
RBS Maintainer. The RBS Maintainer runs as a Windows scheduled task that is responsible for handling maintenance tasks and doing garbage collection. It can be installed either on a web front-end (WFE) or on the database server itself.
RBS allows you to manage document metadata and the document BLOBs themselves separately. This allows you to set up more cost-effective backup/restore scenarios, such as separate backup schedules.
It may give you, depending on the BLOB store you choose, advanced storage capabilities. Please refer to the previous side note about BLOB store provider vendors for more information.
Better support for handling hierarchical data.
Essentially, the FILESTREAM data type is something you assign to a column in a table. That is, instead of declaring the column as a varbinary() type, you add the FILESTREAM attribute to the varbinary() column. SQL Server then automatically writes the BLOB data to the file system rather than into the database. You still use SQL Server to add and retrieve BLOB data; all that's changed is where SQL Server physically stores it. In fact, using a simple FILESTREAM column doesn't change the way you do backup and recovery; SQL Server “knows” about FILESTREAM columns and integrates them into the backup processes. They even work within SQL Server's security model and are fully supported by transactions. Figure 2.8 shows how it works: Essentially, the FILESTREAM type tells SQL Server to split out the BLOB data into normal files on the file system.
The BLOB store is the final storage location
RBS Maintainer runs as a separate executable that must be scheduled outside of SharePoint's control. It runs at the per-database level (for SharePoint this would be a content database), so any number of providers can be installed in a database and only one maintainer process needs to be run.
The maintainer runs against RBS and the application's metadata tables to determine which BLOBs require removal. Once a BLOB requires deletion, a specific deletion request is sent to the provider for that individual BLOB. The provider only needs to implement the provider interface (of which one of the methods is DeleteBlob) and does not have any knowledge about the application or even the RBS metadata tables.
Reference scan. The first step compares the contents of the application's RBS tables with RBS's own internal tables and determines which BLOBs are no longer referenced. Any unreferenced BLOBs are marked for deletion.
Delete propagation. The next step determines which BLOBs have been marked for deletion for a period of time longer than the garbage_collection_time_window value and deletes them from the BLOB store.
Orphan cleanup. The final step determines whether any BLOBs are present in the BLOB store but absent in the RBS tables. These orphaned BLOBs are then deleted.
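The three maintainer passes can be sketched as plain Python functions over in-memory tables. This is a toy model under stated assumptions, not the RBS Maintainer's actual implementation: the RbsState structure, its field names and the integer timestamps are all hypothetical, and only the window parameter corresponds to the garbage_collection_time_window value named above.

```python
from dataclasses import dataclass


@dataclass
class RbsState:
    app_refs: set      # BLOB IDs still referenced by the application's tables
    rbs_table: dict    # BLOB ID -> time it was marked for deletion (None = live)
    blob_store: dict   # BLOB ID -> bytes held by the provider's BLOB store


def reference_scan(state: RbsState, now: int) -> None:
    """Mark BLOBs that the application no longer references."""
    for blob_id, marked_at in list(state.rbs_table.items()):
        if blob_id not in state.app_refs and marked_at is None:
            state.rbs_table[blob_id] = now


def delete_propagation(state: RbsState, now: int, window: int) -> list:
    """Delete BLOBs that have been marked for longer than the time window."""
    expired = [b for b, t in state.rbs_table.items()
               if t is not None and now - t > window]
    for blob_id in expired:
        state.blob_store.pop(blob_id, None)  # stands in for the provider's DeleteBlob
        del state.rbs_table[blob_id]
    return expired


def orphan_cleanup(state: RbsState) -> list:
    """Delete BLOBs present in the BLOB store but absent from the RBS tables."""
    orphans = [b for b in state.blob_store if b not in state.rbs_table]
    for blob_id in orphans:
        del state.blob_store[blob_id]
    return orphans
```

Running the passes in order on a small state shows why a deleted document's BLOB lingers: it is only marked on the first pass and is physically removed once the time window has elapsed.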