Whitepaper to Study the FILESTREAM Option in SQL Server



Topic: Document Storage Management for PCMS
To: Development Team
Dated: 3rd March 2010
To analyze options for storing large files in a Microsoft SQL Server database.

Published in: Technology


1. Objective:
To analyze options for storing large files in a Microsoft SQL Server database.

2. Problem Definition:
PCMS has many documents that need to be uploaded against Job Cards in all modules. As the volume of documents increases over time, it causes major development and operational overheads: the stored documents grow by gigabytes within a few months.

A study was conducted a few months ago on adopting a third-party file system to maintain documents outside the actual database. Many solutions were analyzed, but none met all the selection criteria: security, access speed, storage efficiency, and operational management.

Microsoft provides a native solution to this problem. It combines the benefits of file storage and database storage in one technology named FILESTREAM. Large file storage is managed through FILESTREAM.

3. FILESTREAM Definition:
• FILESTREAM integrates the SQL Server Database Engine with an NTFS file system by storing varbinary(max) binary large object (BLOB) data as files on the file system.
• To specify that a column should store data on the file system, specify the FILESTREAM attribute on a varbinary(max) column. This causes the Database Engine to store all data for that column on the file system, but not in the database file.

4. FILESTREAM Operations Summary:
4.1. How to: Enable FILESTREAM
Go to Start >> All Programs >> Microsoft SQL Server 2008 >> Configuration Tools >> SQL Server Configuration Manager >> SQL Server Services, open the properties of the SQL Server instance, and on the FILESTREAM tab select "Enable FILESTREAM for Transact-SQL access". Then run:

    -- 2 enables FILESTREAM for Transact-SQL and Win32 streaming access.
    EXEC sp_configure filestream_access_level, 2
    RECONFIGURE

4.2. How to: Create a FILESTREAM-Enabled Database

    CREATE DATABASE Archive
    ON
    PRIMARY ( NAME = Arch1,
        FILENAME = 'c:\data\archdat1.mdf'),
    FILEGROUP FileStreamGroup1 CONTAINS FILESTREAM( NAME = Arch3,
        FILENAME = 'c:\data\filestream1')
    LOG ON ( NAME = Archlog1,
        FILENAME = 'c:\data\archlog1.ldf')
    GO

The database contains two filegroups: PRIMARY and FileStreamGroup1. PRIMARY is a regular filegroup (it holds the data file Arch1) and cannot contain FILESTREAM data. FileStreamGroup1 is the FILESTREAM filegroup.

4.3. How to: Move a FILESTREAM-Enabled Database
• Display the location of the physical database files that the FILESTREAM database uses (the sys.database_files catalog view lists them).
• Take the Archive database offline by detaching it:

    USE master
    EXEC sp_detach_db Archive
    GO

• Create the folder C:\moved_location, and then move the files and folders that were listed in the first step into it.
• Bring the Archive database back online by attaching it from the new location:

    CREATE DATABASE Archive ON
    PRIMARY ( NAME = Arch1,
        FILENAME = 'c:\moved_location\archdat1.mdf'),
    FILEGROUP FileStreamGroup1 CONTAINS FILESTREAM( NAME = Arch3,
        FILENAME = 'c:\moved_location\filestream1')
    LOG ON ( NAME = Archlog1,
        FILENAME = 'c:\moved_location\archlog1.ldf')
    FOR ATTACH
    GO

4.4. How to: Create a Table for Storing FILESTREAM Data
To specify that a column contains FILESTREAM data, create a varbinary(max) column and add the FILESTREAM attribute.

    CREATE TABLE Archive.dbo.Records
    (
        [Id] [uniqueidentifier] ROWGUIDCOL NOT NULL UNIQUE,
        [SerialNumber] INTEGER UNIQUE,
        [Chart] VARBINARY(MAX) FILESTREAM NULL
    )
    GO

4.5. Managing FILESTREAM Data by Using Transact-SQL
Inserting NULL

    INSERT INTO Archive.dbo.Records
        VALUES (newid(), 1, NULL);
    GO

Creating a Data File

    INSERT INTO Archive.dbo.Records
        VALUES (newid(), 3, CAST('Seismic Data' as varbinary(max)));
    GO

Updating FILESTREAM Data

    -- Row 3 is the row with a data file that was inserted above.
    UPDATE Archive.dbo.Records
    SET [Chart] = CAST('Xray 1' as varbinary(max))
    WHERE [SerialNumber] = 3;
    GO

Deleting FILESTREAM Data

    DELETE Archive.dbo.Records
    WHERE SerialNumber = 1;
    GO
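To check the effect of the statements above, the rows can be read back with an ordinary SELECT, because a FILESTREAM column behaves like any other varbinary(max) column in Transact-SQL. This is a minimal sketch that assumes the Archive.dbo.Records table from 4.4; the aliases ChartText and HasFile are illustrative names and are not part of the original samples.

```sql
-- Read back the FILESTREAM column through Transact-SQL.
-- ChartText and HasFile are illustrative aliases (assumptions).
SELECT [SerialNumber],
       -- The sample rows hold text that was cast to varbinary(max),
       -- so casting back to varchar makes them readable.
       CAST([Chart] AS varchar(max)) AS ChartText,
       -- 0 when the column is NULL, 1 when it holds data.
       CASE WHEN [Chart] IS NULL THEN 0 ELSE 1 END AS HasFile
FROM Archive.dbo.Records
ORDER BY [SerialNumber];
GO
```

For real documents the column holds binary content, so the cast to varchar is only useful for text test data like the samples here.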
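Selecting File Path

    -- Row 1 was deleted above, so read the path of row 3.
    DECLARE @filePath varchar(max)
    SELECT @filePath = Chart.PathName()
    FROM Archive.dbo.Records
    WHERE SerialNumber = 3
    PRINT @filePath
    GO

5. Dual Programming Model to Access BLOB Data:
5.1. Transact-SQL Access:
By using Transact-SQL, you can insert, update, and delete FILESTREAM data. In the sample below, the file path and the transaction context are obtained through Transact-SQL, and the BLOB is then read and written through SqlFileStream.

    // Namespaces used by this sample (place at the top of the source file).
    using System;
    using System.Data.SqlClient;
    using System.Data.SqlTypes;   // SqlFileStream
    using System.Text;            // Encoding

    // Body of the method that performs the access.
    SqlConnection sqlConnection = new SqlConnection(
        "Data Source=COMSOFT-23;Initial Catalog=Archive;Integrated Security=True");
    SqlCommand sqlCommand = new SqlCommand();
    sqlCommand.Connection = sqlConnection;
    try
    {
        sqlConnection.Open();

        //The first task is to retrieve the file path
        //of the SQL FILESTREAM BLOB that we want to
        //access in the application.
        sqlCommand.CommandText =
            "SELECT Chart.PathName()"
            + " FROM Archive.dbo.Records"
            + " WHERE SerialNumber = 3";

        String filePath = null;
        Object pathObj = sqlCommand.ExecuteScalar();
        //ExecuteScalar returns null when no row matches and
        //DBNull.Value when the Chart column is NULL.
        if (pathObj != null && DBNull.Value != pathObj)
            filePath = (string)pathObj;
        else
        {
            throw new System.Exception(
                "Chart.PathName() failed"
                + " to read the path name"
                + " for the Chart column.");
        }

        //The next task is to obtain a transaction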
        //context. All FILESTREAM BLOB operations
        //occur within a transaction context to
        //maintain data consistency.

        //All SQL FILESTREAM BLOB access must occur in
        //a transaction. MARS-enabled connections
        //have specific rules for batch scoped transactions,
        //which the Transact-SQL BEGIN TRANSACTION statement
        //violates. To avoid this issue, client applications
        //should use appropriate API facilities for transaction
        //management, such as the SqlTransaction class.
        SqlTransaction transaction = sqlConnection.BeginTransaction("mainTransaction");
        sqlCommand.Transaction = transaction;
        sqlCommand.CommandText = "SELECT GET_FILESTREAM_TRANSACTION_CONTEXT()";
        Object obj = sqlCommand.ExecuteScalar();
        byte[] txContext = (byte[])obj;

        //The next step is to obtain a handle that
        //can be passed to the Win32 FILE APIs.
        SqlFileStream sqlFileStream = new SqlFileStream(filePath, txContext, System.IO.FileAccess.ReadWrite);
        byte[] buffer = new byte[512];
        int numBytes = 0;

        //Write the string, "EKG data." to the FILESTREAM BLOB.
        //In your application this string would be replaced with
        //the binary data that you want to write.
        string someData = "EKG data.";
        //Code page 0 selects the system default encoding.
        Encoding unicode = Encoding.GetEncoding(0);
        byte[] dataBytes = unicode.GetBytes(someData);
        sqlFileStream.Write(dataBytes, 0, dataBytes.Length);

        //Read the data from the FILESTREAM
        //BLOB.
        sqlFileStream.Seek(0L, System.IO.SeekOrigin.Begin);
        numBytes = sqlFileStream.Read(buffer, 0, buffer.Length);
        //Decode only the bytes that were actually read,
        //not the unused remainder of the buffer.
        string readData = unicode.GetString(buffer, 0, numBytes);
        if (numBytes != 0)
        {
            // Console.WriteLine(readData);
            System.Windows.MessageBox.Show(readData);
        }
        //Because reading and writing are finished, FILESTREAM
        //must be closed. This closes the C# FileStream class,
        //but does not necessarily close the underlying
        //FILESTREAM handle.
        sqlFileStream.Close();

        //The final step is to commit or roll back the read and write
        //operations that were performed on the FILESTREAM BLOB.
        sqlCommand.Transaction.Commit();
    }
    catch (System.Exception ex)
    {
        Console.WriteLine(ex.ToString());
    }
    finally
    {
        sqlConnection.Close();
    }
    return;

5.2. File System Streaming Access:
The Win32 streaming support works in the context of a SQL Server transaction. Steps:
• Read the FILESTREAM file path.
• Read the current transaction context.
• Obtain a Win32 handle and use the handle to read and write data to the FILESTREAM BLOB.
Each cell in a FILESTREAM table has a file path that is associated with it. To read the path, use PathName() on a varbinary(max) column in a Transact-SQL statement.

    //Connection string for the Archive database; adjust the Data Source for your server.
    using (SqlConnection connection = new SqlConnection(
        "Data Source=COMSOFT-23;Initial Catalog=Archive;Integrated Security=True"))
    {
        connection.Open();
        SqlCommand command = connection.CreateCommand();
        try
        {
            // Set up the command to execute the stored procedure.
            // GetData is a stored procedure that is not shown in this paper.
            command.CommandText = "GetData";
            command.CommandType = System.Data.CommandType.StoredProcedure;

            // Set up the input parameter for the DocumentID.
            SqlParameter paramID = new SqlParameter("@Id", System.Data.SqlDbType.Int);
            paramID.Value = 3;
            command.Parameters.Add(paramID);

            // Set up the output parameter to retrieve the summary.
            SqlParameter paramSummary = new SqlParameter("@Chart", System.Data.SqlDbType.VarChar, -1);
            paramSummary.Direction = System.Data.ParameterDirection.Output;
            command.Parameters.Add(paramSummary);

            // Execute the stored procedure.
            command.ExecuteNonQuery();
            Console.WriteLine((String)(paramSummary.Value));
            System.Windows.MessageBox.Show((String)(paramSummary.Value));
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }

6. FILESTREAM Best Practices
6.1. Physical Configuration and Maintenance
When you set up FILESTREAM storage volumes, consider the following guidelines:
• Turn off short file names on FILESTREAM computer systems. Short file names take significantly longer to create. To disable short file names, use the Windows fsutil utility.
• Regularly defragment FILESTREAM computer systems.
• Use 64-KB NTFS clusters. Compressed volumes must be set to 4-KB NTFS clusters.
• Disable indexing on FILESTREAM volumes and set disablelastaccess. To set disablelastaccess, use the Windows fsutil utility.
• Disable antivirus scanning of FILESTREAM volumes when it is not necessary. If antivirus scanning is necessary, avoid setting policies that will automatically delete offending files.
• Set up and tune the RAID level for the fault tolerance and performance that the application requires.

RAID level          Write performance   Read performance   Fault tolerance   Remarks
RAID 5              Normal              Normal             Excellent         Performance is better than one disk or JBOD, and less than RAID 0 or RAID 5 with striping.
RAID 0              Excellent           Excellent          None
RAID 5 + striping   Excellent           Excellent          Excellent         Most expensive option.

6.2. Physical Database Design
When you design a FILESTREAM database, consider the following guidelines:
• FILESTREAM columns must be accompanied by a corresponding uniqueidentifier ROWGUID column. These kinds of tables must also be accompanied by a unique index. Typically this index is not a clustered index. If the database's business logic requires a clustered index, you have to make
  sure that the values stored in the index are not random. Random values cause the index to be reordered every time a row is added to or removed from the table.
• For performance reasons, FILESTREAM filegroups and containers should reside on volumes other than those used by the operating system, SQL Server database, SQL Server log, tempdb, or paging file.
• Space management and policies are not directly supported by FILESTREAM. However, you can manage space and apply policies indirectly by assigning each FILESTREAM filegroup to a separate volume and using the volume's management features.

7. Application Design and Implementation
When you are designing and implementing applications that use FILESTREAM, consider the following guidelines:
• Use NULL instead of 0x to represent a non-initialized FILESTREAM column. The 0x value causes a file to be created, and NULL does not.
• Avoid insert and delete operations in tables that contain non-null FILESTREAM columns. Insert and delete operations can modify the FILESTREAM tables that are used for garbage collection. This can cause an application's performance to decrease over time.
• In applications that use replication, use NEWSEQUENTIALID() instead of NEWID(). NEWSEQUENTIALID() performs better than NEWID() for GUID generation in these applications.
• The FILESTREAM API is designed for Win32 streaming access to data. Avoid using Transact-SQL to read or write FILESTREAM binary large objects (BLOBs) that are larger than 2 MB. If you must read or write BLOB data from Transact-SQL, make sure that all BLOB data is consumed before you try to open the FILESTREAM BLOB from Win32. Failure to consume all the Transact-SQL data might cause any successive FILESTREAM open or close operations to fail.
• Avoid Transact-SQL statements that update, append, or prepend data to the FILESTREAM BLOB. This causes the BLOB data to be spooled into the tempdb database and then back into a new physical file.
• Avoid appending small BLOB updates to a FILESTREAM BLOB. Each append causes the underlying FILESTREAM files to be copied. If an application has to append small BLOBs, write the BLOBs into a varbinary(max) column, and then perform a single write operation to the FILESTREAM BLOB when the number of BLOBs reaches a predetermined limit.
• Avoid retrieving the data length of many BLOB files in an application. This is a time-consuming operation because the size is not stored in the SQL Server Database Engine. If you must determine the length of a BLOB file, use the Transact-SQL DATALENGTH() function to determine the size of the BLOB if it is closed. DATALENGTH() does not open the BLOB file to determine its size.
• If an application uses the Server Message Block 1 (SMB1) protocol, FILESTREAM BLOB data should be read in 60-KB multiples to optimize performance.
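Several of the guidelines above can be combined in one table definition. The following is a minimal sketch, assuming the Archive database from 4.2; the table Archive.dbo.Documents and its columns are hypothetical names chosen for illustration and do not appear in the original samples.

```sql
-- Hypothetical table for illustration only; all names are assumptions.
CREATE TABLE Archive.dbo.Documents
(
    -- NEWSEQUENTIALID() can only be used as a column default.
    -- It produces sequential rather than random GUIDs.
    [Id] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE
        DEFAULT NEWSEQUENTIALID(),
    [JobCardNumber] INTEGER NOT NULL,
    [FileData] VARBINARY(MAX) FILESTREAM NULL
);
GO

-- NULL (not 0x) marks a document whose content has not been uploaded yet,
-- so no file is created on the FILESTREAM volume.
INSERT INTO Archive.dbo.Documents ([JobCardNumber], [FileData])
VALUES (1001, NULL);
GO

-- DATALENGTH() returns the size in bytes without the application
-- having to open the BLOB through the streaming API.
SELECT [JobCardNumber], DATALENGTH([FileData]) AS FileDataBytes
FROM Archive.dbo.Documents
WHERE [JobCardNumber] = 1001;
GO
```

The DATALENGTH() query is restricted to a single row here because the guideline above advises against retrieving the length of many BLOBs at once.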
8. When to Use:
The size and use of the data determine whether you should use database storage or file system storage. Consider FILESTREAM when the following conditions are true:
• Objects that are being stored are, on average, larger than 1 MB.
• Fast read access is important.
• You are developing applications that use a middle tier for application logic.
Also note:
• For smaller objects, storing varbinary(max) BLOBs in the database often provides better streaming performance.
• The sizes of file-system-based BLOBs are limited only by the volume size of the file system. The standard varbinary(max) limitation of 2-GB file sizes does not apply to BLOBs that are stored in the file system.

9. Integrated Security
• In SQL Server, FILESTREAM data is secured just like other data: by granting permissions at the table or column level. If a user has permission to the FILESTREAM column in a table, the user can open the associated files.
• Encryption is not supported on FILESTREAM data.

10. Integrated Management
• SQL Server management tools and functions work without modification for FILESTREAM data.
• All backup and recovery models work with FILESTREAM data.
• You can use a partial backup to exclude FILESTREAM filegroups.

11. Using FILESTREAM with Other SQL Server Features
FILESTREAM has some limitations when it is used with the following features:
• Database Snapshots
• Replication
• Log Shipping
• Database Mirroring
• Full-Text
• Failover Clustering
• SQL Server Express
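As a rough check of the 1 MB guideline in section 8, the average size of the documents that are already stored in the database can be measured before deciding to move them to FILESTREAM. This is only a sketch: dbo.JobCardDocuments and DocumentData are hypothetical names standing in for an existing PCMS table and its varbinary(max) column, which the original paper does not name.

```sql
-- dbo.JobCardDocuments and DocumentData are hypothetical names (assumptions).
-- DATALENGTH() returns the size in bytes; 1048576 bytes = 1 MB.
SELECT COUNT(*) AS DocumentCount,
       AVG(DATALENGTH(DocumentData) / 1048576.0) AS AvgSizeMB,
       MAX(DATALENGTH(DocumentData) / 1048576.0) AS MaxSizeMB
FROM dbo.JobCardDocuments
WHERE DocumentData IS NOT NULL;
```

An average well above 1 MB would support the FILESTREAM option under the criteria in section 8, while a much smaller average would favor keeping the data in ordinary varbinary(max) columns.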