A Technical Overview of Microsoft® SQL Server™ 2005 High ...



Slide 1. A Technical Overview of Microsoft® SQL Server™ 2005 High Availability Features

Hello and welcome to this Microsoft TechNet session, "A Technical Overview of SQL Server 2005 High Availability." My name is Bob Carver. Data is a key component of the operation of an organization, and databases are among the most important services that Information Technology professionals provide to their organizations. This session will cover several technologies in SQL Server 2005 designed to increase the availability of the server, the databases on that server, and the data in those databases.

Slide 2. What We Will Cover

This session will start by covering several barriers to database availability. Although we can't cover all availability barriers in a single session, we'll cover those that may cause either server or database failure. We'll also address how to protect your databases from user or application errors that may result in deleted or incorrect data. We will look at barriers that prevent concurrent access to data even when the database and data are intact but the data is being modified or the underlying structures are being altered. We will then talk about how failover clustering, database mirroring, and peer-to-peer replication in SQL Server 2005 can be used to help mitigate the effects of server failure or disaster. We will also compare these technologies to other high-availability alternatives, such as replication, log shipping, and backup and restore procedures. Next, we'll discuss how to configure SQL Server 2005 database snapshots to provide point-in-time reporting and quick data recovery in the event of accidental or malicious damage to the database. Finally, we'll show you how to mitigate the effects of locking and blocking in OLTP environments by implementing the new optimistic concurrency controls in SQL Server 2005.

You will also be able to optimize index operations by creating, altering, and dropping indexes online. Online index operations limit locking of the base table during the operation; therefore, transactions and applications do not have to wait until the operation is complete to read, modify, or insert data. We'll be discussing many advanced concepts today, and we won't have time to explain many common terms.

Slide 3. Helpful Experience

With this in mind, this session assumes you have some knowledge of SQL Server 7 and 2000. You need to know basic T-SQL syntax. For this session, you will really only need to understand the SELECT and UPDATE operators. You also need to understand how to run stored procedures and DBCC statements. In addition, you need to understand how SQL Server uses transaction logs to store transactions. It is important that you know the difference between the three recovery models and understand how SQL Server uses transaction logs in the recovery process. Finally, you need a basic understanding of how SQL Server handles locking and blocking. Over the next 60 minutes, we'll cover the five items listed in our agenda.
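As a hedged sketch of the online index operations just mentioned, the statements might look like the following. The specific index and table names are assumptions for illustration, drawn from the AdventureWorks examples used later in this session; online index operations require SQL Server 2005 Enterprise Edition.

```sql
-- Rebuild an existing index online so readers and writers are not
-- blocked on the base table for the duration of the operation.
-- The index name PK_Product_ProductID is an assumption for illustration.
USE AdventureWorks;
GO
ALTER INDEX PK_Product_ProductID
ON Production.Product
REBUILD WITH (ONLINE = ON);
GO

-- New indexes can also be created online:
CREATE NONCLUSTERED INDEX IX_Product_ListPrice
ON Production.Product (ListPrice)
WITH (ONLINE = ON);
```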
Slide 4. Agenda: Understanding barriers to availability

We'll start with a discussion of those items that may affect the availability of your database. You'll learn about availability solutions and issues that affect availability, such as processing environment, hardware, software, malicious attacks, and natural disasters. Once we've covered the barriers to availability, we'll then discuss the options and features available in SQL Server 2005 that minimize their impact on your databases. We'll start by covering clustering and then move on to examine database mirroring. Then we'll compare these solutions to other high-availability solutions in order to see the advantages and disadvantages of each. Next we'll look at how snapshot technologies can be used to mitigate user or application errors. Snapshots provide point-in-time reporting and quick data recovery in the event of accidental or malicious damage to the data in a database. We'll then look at data-access concurrency limitations. There are new features in SQL Server 2005 that can make the server more available in environments with heavy transaction processing. We'll examine the pessimistic concurrency controls that have historically been available in previous versions of SQL Server and then examine new concurrency controls, including optimistic concurrency controls and snapshot isolation. Finally, we'll close out the session by looking at peer-to-peer replication and seeing how it can be used to provide a hot standby of a replication publication. To start, we should examine the business requirements that demand database availability.

Slide 5. Database Availability Factors

Database availability is a complex concept. Many elements are evaluated in determining whether systems are sufficiently available to meet an organization's needs. Business requirements include consideration of business hours and locations. A longer workday and locations that span a large number of time zones require a different level of availability than a business with a shorter workday in a single time zone. In addition, business processes also impact the definition of availability. How business transactions are handled could affect whether the database must always be available or whether business can continue when failure occurs. User and management expectations also set different availability requirements on your organization. Technology provides both solutions and challenges to availability.

Slide 6. Availability Solutions

Computer hardware, especially for servers, is becoming more and more reliable. This increases availability, but hardware can still fail at times. You should consider that it's not only the server hardware that affects availability, but also network and client hardware. The Database Management System server software can also greatly affect availability. How the system accesses data, stores data, and maintains consistency and concurrency affects database availability, usually limiting it. DBMSs can also implement several features that help increase availability. The application software that accesses databases in a DBMS also affects availability. Properly programmed applications can increase availability by implementing good error checking, by monitoring the server connection and possibly failing over automatically to another available server, and by using small transactions to limit concurrency issues. The rest of the session will focus on SQL Server 2005 features that will increase the availability of your data-management solutions. There are three general availability barriers that can be mitigated by SQL Server 2005.

Slide 7. Issue: Database Failure or Disaster

The first barrier is hardware failure. Despite the increasing reliability of hardware, hardware failure is still probably the most common reason an entire database will fail or become unavailable. The failure could be caused by any of the server's hardware components, including the hard drive controller or the hard disks on which the database resides. The general procedure for handling this kind of failure is to maintain a second server that hosts a copy of the database. Other issues are caused by human error. These disasters can range from terrorist attacks to accidents or errors that result in loss of function in your data center. Network or Internet connectivity failure or power failure is perhaps the most common, whether it affects only the room or floor your data center resides on or the entire building. Depending on
your availability requirements, you could wait until the situation is fixed, or you may need another server with a copy of the database in a separate physical location. Another type of disaster that can occur is natural disaster. This could be fire, flood, earthquake, volcanic eruption, and so on. These disasters frequently damage the hardware or infrastructure past the point of quick repair, so you will again require a copy of the database in a remote server location. Another barrier to availability is application errors.

Slide 8. Issue: Application Errors

Application errors can manifest themselves in any number of ways. The most common error is accidental data modification. These errors are usually the result of an UPDATE or DELETE statement without a WHERE clause. A stored procedure could also have a bug that causes the application to damage the data. Other possibilities include an administrator, developer, or user writing and running the statement directly, or the application programmer coding the statement incorrectly. It's also possible that a developer may accidentally connect to the production database instead of the testing database when modifying data or altering the schema. Malicious data modifications can also damage data in a database. All of these scenarios, combined with malicious intent, apply here as well. Concurrency is another barrier to availability.

Slide 9. Issue: Concurrency

A database is designed to maintain data integrity by preventing simultaneous data modification. From a business-requirement and user perspective, having to wait for certain operations can reduce the availability of the database. Making changes to persistent data structures, namely tables and indexes, can also prevent access to data. These operations frequently move and sort data, and allowing access, particularly update access, while they run would be dangerous for data integrity.
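As a hedged preview of the optimistic concurrency controls covered later in the session, enabling snapshot isolation looks something like the following sketch (the AdventureWorks database name follows the examples used later; treat the exact statements as illustrative):

```sql
-- Enable row versioning-based snapshot isolation for the database.
ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

-- A reader running under snapshot isolation sees a transactionally
-- consistent view of the data as of the start of the transaction,
-- without blocking concurrent writers.
USE AdventureWorks;
GO
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT Name, ListPrice FROM Production.Product;
COMMIT TRANSACTION;
```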
You will see in this session how new features in SQL Server 2005 can help recover and repair data and limit these concurrency restrictions. Now that we've looked at the barriers to availability, let's see how we can mitigate them with features in SQL Server 2005.

Slide 10. Agenda: Understanding database server failure or disaster

In this section, we'll discuss the options and features available in SQL Server 2005 that mitigate our barriers to availability. We'll start by covering clustering and then move on to database mirroring. Then we'll compare these solutions to other high-availability solutions in order to see the advantages and disadvantages of each. Let's start by looking at failover clustering.

Slide 11. Failover Clustering

Failover clustering provides a hot standby server. By hot standby server, we are referring to the server that takes over database operations when the main server fails. SQL Server failover clustering is built on Microsoft Cluster Services, or MSCS, and can therefore take advantage of all the benefits that MSCS provides. If you are running SQL Server 2005 on Windows Server 2003, this means that you can now have eight nodes in your SQL Server clusters. MSCS provides for automatic failover in approximately 20 seconds, plus the time to complete the recovery process. SQL Server 2005 allows access to a database after all committed transactions have been written to disk but before the uncommitted transactions have been rolled back, which provides faster access than SQL Server 2000 offered when failing over to a secondary cluster node. Microsoft Cluster Services does require special certified hardware: a shared disk array and controllers that are certified by Microsoft for use in a cluster. Some of these products can be quite expensive. No committed data is lost, since the clustered servers all share the same database file and transaction log. You will lose any uncommitted transactions because the secondary server initiates the recovery process on all SQL
Server instance databases. SQL Server 2005 failover clustering now supports more SQL services. In addition to the SQL Server Service for OLTP databases, you can now use clustering for Analysis Services, Reporting Services, and Notification Services. You should also be aware that clustering uses shared disk arrays, so only one copy of the database exists. Clustering does not protect against disk failure, nor does it protect the database from a disk controller card that writes bad data to the disk array. One more thing to be aware of is that you cannot actively use the database instance on the standby server, although other instances can run on the standby servers. As an example, let's look at a two-node cluster.

Slide 12. Implementing Failover Clustering

The best implementations of failover clustering use a second private network for the clusters so the server will not fail over in the event of a network hardware failure. Clients access a cluster virtual server that has its own name and IP address. The use of a virtual server prevents the clients from needing to manually reconnect if a node fails. You should note that the shared disk array is a single point of failure. The shared disk array also creates a distance constraint. Clustered servers almost always reside in the same data center, making them susceptible to disasters, both human-made and natural. Database mirroring is a high-availability solution that does not have a single point of failure.

Slide 13. Database Mirroring

Database mirroring is a new technology designed specifically for high-availability solutions without the drawbacks of clustering. Like clustering, mirroring provides a hot standby server, and failover occurs faster than in clustering. Database mirroring can be configured to provide automatic failover. Automatic failover requires a third database instance to act as a witness server. It also requires that the Synchronous with Automatic Failover option be selected. This option requires that a transaction be able to commit at both servers before it commits in the principal database. This ensures that in the case of server failure the mirror has all committed transactions. You can use database mirroring with asynchronous transactions and no automatic failover for a warm standby solution. Database failover occurs very quickly with database mirrors, usually in less than three seconds. If the client application was programmed using ADO.NET 2.0, the client can implicitly fail over to the mirror. The ADO.NET libraries automatically perform this operation without the programmer needing to code it. If the client application was not programmed using the ADO.NET 2.0 libraries, the application will need to handle the server failover programmatically. You will not lose any committed data if database mirroring is configured with synchronous transactions. If it is configured with asynchronous transactions, some committed data loss can occur. Using database mirroring in synchronous transaction mode for high availability will have some performance impact. Transaction throughput, or the number of transactions per second, will likely be minimally impacted. Transactional latency will be affected by the round-trip communication time between the two mirroring servers. If you are using asynchronous transaction mode, transactional latency will not be affected. You can configure a maximum of one mirror for each mirrored database. You can, however, mirror separate databases to different SQL Server instances on separate servers. The database mirror is not available for any other use. It cannot be queried or used for reporting. You can, however, create database snapshots of a database mirror and use those snapshots for reporting purposes. Let's look at the hardware requirements for database mirroring.

Slide 14. Database Mirroring Hardware

Unlike clustering, no special hardware is required for database mirroring. Database mirroring just requires a second server. The two servers do not even have to have the same hardware.

Slide 15. Database Mirroring Hardware Cont...
Database mirroring has virtually no distance limitations because it does not have any shared hardware like the shared disk arrays used with failover clustering. The actual distance limitation depends on how long you want each transaction to wait for the round-trip network traffic between the mirroring partners. If higher latency is acceptable, then a longer distance is allowed. Of course, network latency is not solely determined by distance, but by other factors as well, such as bandwidth.

Slide 16. Mirroring with Failover (Process)

This is a sample implementation of database mirroring. A principal server and mirror server are configured with a witness server for a hot standby configuration. The gold arrows show failover communication. The blue arrows show mirrored data flow. When using database mirroring, clients connect to the principal database server.

Slide 17. Mirroring with Failover (Process) Cont...

In the event that the principal server becomes unavailable, whether through server or network failure, the mirror server's connection and the witness server's connection with the principal server will fail. The mirror server and the witness server will agree that the principal server is unavailable, and the mirror will become the principal server. Clients will then be redirected to the new principal server, which was formerly the mirror server. Database mirroring with automatic failover uses quorums. A quorum of three exists when all servers are functioning and configured correctly. If the principal server fails, the witness and mirror form a quorum of two; they both must agree that the principal server is unavailable. If the network connection between the principal and mirror breaks but both can still contact the witness, the mirror server does not become active. If the witness and mirror both agree that the principal is unavailable, then the mirror server becomes active and clients are redirected to it.

Slide 18. Mirroring with Failover (Process) Cont...

When the original principal server is repaired, it will contact the witness server and discover that the original mirror server is now acting as the principal. The original principal server then assumes the mirror server role and synchronizes from the new principal server. The mirror and the witness must be able to communicate with each other at the time of failure. If the mirror is unavailable or in a disconnected state at the time the principal server fails, and only later becomes able to communicate with the witness server, automatic failover will not occur.

Slide 19. Demonstration: Configuring a Database Mirror

The administrator at Contoso wants to configure the company's SQL servers so that the databases are available in case problems occur. First, we will look at a database mirror and how it helps achieve this goal. This session uses two Microsoft® Windows® Server™ 2003 computers with three instances of SQL Server 2005 installed between them. One server exists in Contoso's London office, and the other is in the Sydney office. The primary administration tool for SQL Server 2005 is SQL Server Management Studio. With SQL Server Management Studio, you can view and configure server and database properties. You can also create and run T-SQL scripts and projects; the tool combines the functionality of the SQL Server 2000 tools, Enterprise Manager and Query Analyzer. The Sydney server will have the primary copy of the database, and the London server will have the mirror server role. You can create a database mirror to provide a hot standby of your database. The mirroring partner servers apply all transactions to both databases. The first step in setting up a database mirror is to create a backup of the principal database, which is what this script is doing now. The principal database is the online and active database that applications are reading and writing data to. The principal database needs to be in the full recovery model. We are using the Sydney SQL Server as the principal server. You will also need to ensure that no transaction-log backups have occurred after the backup of the principal database that the mirror database will be restored from. The last Log Sequence Number, or LSN, that the mirror database has must be later than the earliest LSN on the principal database. This script will restore log backups to update the LSN and bypass this problem.
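The preparation steps just described might look like the following hedged sketch. The file paths and backup file names are assumptions for illustration; the recovery model, backup, and NORECOVERY restore requirements come from the demonstration itself.

```sql
-- On the principal (Sydney): ensure full recovery and take the backups
-- that the mirror will be initialized from.
ALTER DATABASE AdventureWorks SET RECOVERY FULL;
GO
BACKUP DATABASE AdventureWorks
TO DISK = 'C:\Backups\AdventureWorks.bak' WITH INIT;
GO
BACKUP LOG AdventureWorks
TO DISK = 'C:\Backups\AdventureWorks.trn' WITH INIT;
GO

-- On the mirror (London): restore WITH NORECOVERY so the database
-- stays in a restoring state, ready to apply the principal's log.
RESTORE DATABASE AdventureWorks
FROM DISK = 'C:\Backups\AdventureWorks.bak' WITH NORECOVERY;
GO
RESTORE LOG AdventureWorks
FROM DISK = 'C:\Backups\AdventureWorks.trn' WITH NORECOVERY;
```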
The next step in creating a database mirror is to restore the principal database backup using the NORECOVERY option on the London server. Without the NORECOVERY option, the mirror will not be able to apply any new transactions from the principal server. The reason for this is that some transactions that have since been committed at the principal database would have been rolled back at the mirror database. We are using the London SQL Server as the mirror server. A second instance on the Sydney server will act as the witness server. In real deployments of database mirroring, it is recommended that you use a third Windows server as the witness. Having the witness on the same physical computer as the principal or mirror reduces the robustness of the mirror for high availability. If you place the witness server role on the Windows server that hosts the mirror database, a loss of network connectivity between that server and the principal database server will cause the mirror database to become active. This could occur even though clients may still be able to access the principal server without a problem. The mirror will no longer be applying transactions from the principal database.

We now need to configure database mirroring. Because the Sydney SQL Server contains the principal database, we will need to connect to it in Object Explorer. The database-mirroring configuration is located in the Database Properties window. You will need to configure security and the endpoints used for mirroring by starting the Configure Database Mirroring Security Wizard. First, we are asked whether we will be configuring a witness server. A witness server is required for automatic failover. Here we can select which server instances we want to configure. We will configure both the mirror and witness server instances. Because we are already connected to the principal database server and will use the default TCP port number of 5022, we can use the defaults here. We need to connect to the London server's instance that contains the mirror database. We will use the default port number here as well. To configure the witness server, first we need to connect to the witness server instance on the Sydney server. Because the witness SQL instance and the principal SQL instance are on the same Windows server, different port numbers must be assigned for their mirroring endpoints. We will use 5023 as the listener port for the witness server. We will now complete the security configuration. The server instances are now configured.

Two more configuration steps remain. First, we want to make sure that the Synchronous with automatic failover option is selected. This commits changes to both the principal and mirror instances, and the witness instance will control automatic failover. We can now start the mirroring process. You will want to wait until the status changes to Synchronized to verify that mirroring was successfully started. The time to synchronize will depend on your server, network speed, and the number of transactions since the principal database was backed up. No transactions have occurred, so the synchronization process is quick. When the mirroring process began, the connection to this database was temporarily closed. This error is expected, and you can ignore it.

Now we will open and run a script in the Sydney server instance to verify that we can query the principal server. This script queries the Production.Product table in the AdventureWorks database for products that have a name containing "HL Mountain Frame." Now, we'll update the ListPrice field on the principal database. The script updated the ListPrice field. The new list price is now 1,000 dollars higher. We can attempt to access the mirror database in the London server instance, but we should get an error message. This script will run the same query just executed against the principal database, but this time against the London mirror database.
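As a hedged sketch of that attempt against the mirror (the query text follows the demonstration; the exact error wording may vary by build):

```sql
-- Run against the London (mirror) instance. While a database is acting
-- as a mirror it cannot be opened for queries, so SQL Server raises
-- error 954 rather than returning rows.
SELECT ProductID, Name, ListPrice
FROM AdventureWorks.Production.Product
WHERE Name LIKE '%HL Mountain Frame%';
```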
Notice that SQL Server returns an error, error message 954, which is specific to attempting to access a mirror database. To simulate a failure of the principal database, we can stop the SQL Server service on the principal database server. This will stop the MSSQLSERVER service for the Sydney SQL Server instance. After you fail the principal database server, the mirror server and the witness server will communicate. Both servers will attempt to contact the principal server and, if unsuccessful, will reach a consensus to fail over. You will get a successful execution of a query if you now attempt to run a query against the London server's database. You can see that the ListPrice field was replicated to the London server after we modified it on the Sydney server. Now, we'll once again update the ListPrice field on the London server's database, which is now the principal database. You can see that we've successfully updated rows in the database. We'll run the SELECT query again to make sure the ListPrice field was updated again. It's now 1,000 dollars higher than before.
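For reference, the wizard generates T-SQL along roughly these lines. This is a hedged sketch: the endpoint name, fully qualified server addresses, and domain are assumptions, while the port numbers (5022 and 5023) and the synchronous-with-witness configuration come from the demonstration.

```sql
-- On each partner instance: create a database mirroring endpoint.
CREATE ENDPOINT Mirroring
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5022)
    FOR DATABASE_MIRRORING (ROLE = PARTNER);
GO

-- On the witness instance (port 5023, since it shares a Windows
-- server with the principal instance in this demo):
CREATE ENDPOINT Mirroring
    STATE = STARTED
    AS TCP (LISTENER_PORT = 5023)
    FOR DATABASE_MIRRORING (ROLE = WITNESS);
GO

-- Point the partners at each other (mirror first, then principal),
-- add the witness, and require synchronous (SAFETY FULL) commits.
ALTER DATABASE AdventureWorks
    SET PARTNER = 'TCP://sydney.contoso.com:5022';   -- run on the mirror
ALTER DATABASE AdventureWorks
    SET PARTNER = 'TCP://london.contoso.com:5022';   -- run on the principal
ALTER DATABASE AdventureWorks
    SET WITNESS = 'TCP://sydney.contoso.com:5023';   -- run on the principal
ALTER DATABASE AdventureWorks
    SET PARTNER SAFETY FULL;
```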
Start the Sydney SQL Server service to bring it back online. It will contact the mirror and witness servers and discover that the mirror has automatically failed over. It will then become the mirror server, with the old mirror server as the principal server. We also need to start the SQL Server Agent service for the Sydney SQL Server instance. We will need to reconnect to the Sydney server, since the connection to that server was lost when the SQL Server service was stopped. Because the service was stopped and did not crash, the connection did not automatically fail over to the London server. Client applications written using ADO.NET 2.0 will automatically fail over when the database fails over. Now if we try to run a query against the Sydney SQL Server instance, we receive an error because it is now the mirror. Now let's compare hot standby options to warm and cold standby options.

Slide 20. Comparing High-Availability Options

We've looked at two hot standby solutions: clustering and database mirroring. A third hot standby solution, peer-to-peer replication, will be covered later in this session. These hot standby solutions provide for little or no data loss and easy client redirection. Warm standby solutions include transactional replication and log shipping, while cold standby solutions include backup and restore and a detach, copy, and attach process. With each of these solutions, data loss is a concern. They do not have transparent client redirection, and most do not have a read-only standby. There are some additional things you should consider. You should evaluate each available solution and see what's best for your organization's needs. For example, only replication provides the granularity to configure high availability for less than an entire database. Also consider that failover clustering does not protect against disk failure and is the only solution that requires specialized hardware. Finally, of the high-availability solutions, database mirroring is the least complex and requires the least amount of administration. Now that we've discussed and compared high-availability solutions, let's move on to the next section of our session.

Slide 21. Agenda: Understanding user or application error

In this section, we'll look at how snapshot technologies can be used to mitigate user or application errors. Snapshots provide point-in-time reporting and quick data recovery in the case of accidental or malicious damage to the database. We'll start by exploring the snapshot technology.

Slide 22. Database Snapshots

A database snapshot provides a read-only, static view of a source database as it existed at snapshot creation, minus any uncommitted transactions. Multiple snapshots can exist on a source database, and they always reside on the same server instance as the database. Because a database snapshot provides a static view of a database, a snapshot can extend access to data from a particular point in time. For example, you can create a database snapshot at the end of a given time period for later reporting. You can then run end-of-period reports on the snapshot. Before doing major updates, such as a bulk update, you can create a database snapshot on the database to protect the data. If you make a mistake, you can use the snapshot to recover by reverting the database to the snapshot. Reverting is much faster than restoring from a backup; however, you cannot roll forward afterward. By creating database snapshots on a regular basis, you can mitigate the impact of a major user error, such as a dropped table. For a high level of protection, you can create a series of database snapshots spanning enough time to recognize and respond to most user errors. For instance, you might maintain 6 to 12 rolling snapshots spanning a 24-hour interval, depending on your available storage. Then, each time a new snapshot is created, the oldest snapshot can be deleted. We have two examples to show how database snapshots work.
Slide 23. How Database Snapshots Work

Look at the key first: the grey areas denote unallocated pages, blue denotes original pages, and red denotes updated pages. The red boxes indicate data pages that have been updated; notice that the snapshot stores original values only for those data pages that have been updated. The blue arrows indicate which data pages are read during a read operation on the database snapshot. SQL Server will read the data from the source database when the data pages have not been updated since the creation of the snapshot. The space required by a snapshot depends on how many data modifications occur.

Slide 24. How Database Snapshots Work Cont...

This is a similar diagram depicting a database, potentially the same database, that has been updated more often. The two examples shown could be two different tables in the same database, one of which has been updated more than the other since the creation of the snapshot. This will likely be the case in most databases. Some tables are frequently updated (like an orders table), while some are rarely updated (like the products sold or the countries in which a company does business). Now let's look at a slightly different example.

Slide 25. Database Snapshot Scenarios

Database snapshots have several possible uses. You can create database snapshots of a mirror database. This is a method of extending the functionality of database mirroring to make the standby server also function as a reporting server. The data, however, would support a point-in-time reporting scenario and not a true, up-to-date reporting scenario.

Slide 26. Database Snapshot Scenarios Cont...

Another use of database snapshots is to enable point-in-time historical reporting. This would enable users to generate reports of data, for example quarterly sales, that would be valid for that period of time. While some users are querying the snapshot for the point-in-time data, other users can still be updating the database with new information.

Slide 27. Database Snapshot Scenarios Cont...

Database snapshots can also be used to recover from administrative error. An administrator could create a database snapshot prior to running a risky or potentially damaging script. In the event that something goes wrong, the administrator can revert the database to the snapshot.

Slide 28. Database Snapshot Scenarios Cont...

Lastly, you can use database snapshots to help protect against application or user error. This use will generally require the creation of several snapshots to record different values. You could create a snapshot every two hours and delete snapshots older than 24 hours. You could then extract the data from a snapshot when some negative event occurs. Multiple snapshots allow you to recover in case you are not notified of the problem until after the next snapshot is created.

Slide 29. Demonstration: Implementing Database Snapshots
  9. 9. Database snapshots can be useful as a quick restore point in case of accidental or malicious damage to the data in the database. They are not a valid substitute for proper backups since a database snapshot does not contain all records in the database. You use the CREATE DATABASE statement with the AS SNAPSHOT clause to create a database snapshot. We will need to specify a database file for every file that exists in the actual database. Database snapshots only contain copies of rows that have changed since the snapshot was created. When data is updated or deleted in the database, SQL Server places the prechanged data into the database snapshot. If we look at the database properties of a snapshot, we can see that the reported size is the same as that of the actual database. The amount of data in the snapshot data files is actually very small after you create the snapshot because no data modifications would have taken place. SQL Server actually makes copies of entire data pages, even if only one row has been updated. Using pages instead of rows is more efficient. The read and write performance for an entire page versus an entire row is similar. A page can contain many rows, so even if 100 rows in a data page are updated, SQL Server only needs to do one copy operation. You access a database snapshot in the same way that you access the actual database. The only exception is that you will specify the name of the snapshot instead of the actual database. If the rows requested from the snapshot have not changed since the time the snapshot was created, SQL Server returns the rows from the actual database. Contoso's subsidiary, AdventureWorks, sells bicycle-related products. Requesting the top 10 products with ties for percent margin from the actual database returns 12 products. Most of the products are cycling jerseys, and the highest gross margin percentage is nearly 135.00 percent. 
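The CREATE DATABASE ... AS SNAPSHOT syntax described above can be sketched as follows. The snapshot name, logical file name, and file path are illustrative assumptions, not values from the demonstration:

```sql
-- A sparse file must be named for every data file in the source database.
-- Names and paths here are hypothetical.
CREATE DATABASE AdventureWorks_Snapshot_0800
ON
    ( NAME = AdventureWorks_Data,   -- logical name of the source data file
      FILENAME = 'C:\Snapshots\AdventureWorks_0800.ss' )
AS SNAPSHOT OF AdventureWorks;
```

Queries then simply reference the snapshot name in place of the database name, for example SELECT ... FROM AdventureWorks_Snapshot_0800.Production.Product.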
You can make modifications to the data in the database and query the old data in the snapshot. In this case, we will increase the ListPrice by 20 percent for the 12 products with the lowest gross margin percentage. We will run the query again after the modification is made in the database and will see a different result set. The products returned by the query are no longer mostly jerseys, and the highest gross margin percentage is no longer 135.00 percent. If you query the database snapshot with the same query, the result set returned by SQL Server is the original 12 products. This script will let us simulate accidental damage to a database. The fabric required to make AdventureWorks' sock products has risen in price recently. We need to increase the StandardCost value to reflect this cost increase. We also need to increase the ListPrice to help maintain our gross margin percentage. This script will let us query for products that contain the word "sock" in the product name and then perform the update. First, we will query to find all products that are socks. We will see a result set of four products with StandardCost values near $3.36 and a ListPrice between $8.99 and $9.50. We can simulate damage by running an UPDATE to change the StandardCost to $5.00 and the ListPrice to $11.50 for products that are socks. The damage occurs here because the UPDATE did not have a WHERE clause and therefore it updated all products. Query the database for all products to verify that data in the database is now invalid because all products now have a StandardCost of $5.00 and a ListPrice of $11.50. We can quickly recover the data because we have a recent database snapshot created. Developers and administrators who want to use snapshots to protect against application or administrator error will want to create several snapshots. You should delete older snapshots as newer snapshots are created. We can see the original result set by querying the database snapshot. 
All the products have their original StandardCost and ListPrice. SQL Server copied all the original rows into the database snapshot. We can use the database snapshot to repair the data in the database. The script uses a JOIN between the Product table in the snapshot and the Product table in the database. It sets the StandardCost and ListPrice equal to the snapshot's value of each. Another JOIN on the ProductID will update the products with their corresponding original values. We do not have to set all products to their old values. In fact, we do not want to do that because the sock products should have their new values. The WHERE clause filtered out sock products from being updated. We can see that the actual database products now have their original values if we query the Product table in the database. If we were to scroll the result pane and search for sock products, we would see that socks have the new values and not the original ones. Instead, we can make it easier and query for just the sock products. Notice that they have the new values. You can also restore or revert an entire database to a snapshot state. You might want to do this if the data is extensively modified or if the appropriate JOIN query is highly complex.
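A sketch of the two recovery options just described, assuming a snapshot named AdventureWorks_Snapshot_0800 exists (the snapshot name and the sock filter are illustrative assumptions based on the demonstration):

```sql
-- Option 1: repair only the damaged rows from the snapshot's original values,
-- leaving the intentionally repriced sock products alone.
UPDATE p
SET    p.StandardCost = s.StandardCost,
       p.ListPrice    = s.ListPrice
FROM   AdventureWorks.Production.Product AS p
JOIN   AdventureWorks_Snapshot_0800.Production.Product AS s
       ON p.ProductID = s.ProductID
WHERE  p.Name NOT LIKE '%Sock%';

-- Option 2: revert the entire database to the snapshot state.
RESTORE DATABASE AdventureWorks
FROM DATABASE_SNAPSHOT = 'AdventureWorks_Snapshot_0800';
```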
  10. 10. You should be aware of two important requirements before you perform a revert operation on the database. You should drop any full-text catalogs in the database before reverting to a database snapshot. You will also need to drop all the database snapshots except for the snapshot you are reverting to. For this reason, be careful to make sure you are reverting to the correct snapshot. SQL Server will return an error message if more than one database snapshot exists when you attempt to revert the database to a snapshot. We can see that the data in the database reverts to the data at the time of the snapshot by querying the Product table for sock products. We can see that sock products now have their original values as well. Database snapshots are useful for generating reports. Running reports against a database snapshot avoids locking issues on the live data, and because the snapshot is read-only, the live data cannot be inadvertently modified. We'll look at how you can use SQL Server 2005 Reporting Services with a database snapshot. We'll start by creating a database snapshot on SYD-SQL-01, where we'll also use the reporting server. Once the snapshot is created, we'll go into Business Intelligence Development Studio to deploy a report. We've configured Business Intelligence Development Studio to automatically open a report project we've created for deployment on the SYD-SQL-01 reporting server. We've already created a dataset for the report. The dataset specifies a SQL server, authentication credentials, and a database to connect to. The report consists of a query that runs on the dataset. Here, we'll take a look at the connection properties for our dataset. We want this dataset to return information about AdventureWorks sales orders that are periodically replicated to the database snapshot. We've named the dataset SalesOrders, and we can see that it's connecting to a data source that uses the snapshot we created earlier. 
We'll go look at the connection information for the data source. We can see here that we're connecting to SYD-SQL-01 and have set the connection to use the database snapshot we created earlier. You can specify the authentication type you need for a report. Here, we're going to use Windows authentication. The report query selects some fields from the Sales and Customer tables, and performs two inner JOINs to produce the report. Before deploying the report, we should check to make sure it returns the data we expect, so we'll run the query first. Once the query runs and we've verified that we get the data we want, we're ready to design a report layout. Here, we've already designed a simple report with a text box for the title and a table to display headers and report data. Note that our dataset returns five columns of data, but we're only reporting on the first three. You could generate other reports with different data from the dataset. If you want, you can also generate a live preview of the report. Right now, however, we'll just save the report and deploy it to the reporting server. We've specified in the properties for the project that when deployed, it should be deployed to the reporting server on SYD-SQL-01. You can deploy reports from here to any reporting server on which you have permissions. Now let's deploy the report. To get detailed information about the progress of the deployment, we can open the output pane. If any errors occur during the build or deployment processes, the error list pane will appear with details of the error message. Once the deployment is complete, we are ready to view the report. To view the report, we'll use Internet Explorer to browse to the reporting server on SYD-SQL-01. For security purposes, we need to supply authentication credentials to see the reports available on the reporting server. We can see that the report we just deployed from LON-DCSQL-01 now appears here. Both the report and the data source are available for viewing. 
Before we view the report, we'll take a look at the data source information. The data source information is the same as we specified when designing the report. Again, we can see that we're going to run our report from the database snapshot of the AdventureWorks database. The Reports tab shows the list of reports on the server that use this particular data source. Our newly deployed report is listed here. Now we'll run the report against the database snapshot. When the report is finished, we've got a list of sales orders from the database snapshot. Because we've run the report against the snapshot, we've avoided any potential locks on the live data and also avoided affecting the performance of the live data server, LON-DCSQL-01. If orders are entered on the live database, they will not appear in our report; a database snapshot is static and read-only, so we would need to drop it and create a new snapshot for the report to pick up the new rows.
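Refreshing the data a report sees therefore means replacing the snapshot. A sketch, with hypothetical snapshot names and paths:

```sql
-- Drop the stale snapshot; DROP DATABASE on a snapshot removes only the
-- snapshot, never the source database.
DROP DATABASE AdventureWorks_Snapshot_0800;

-- Create a fresh snapshot reflecting the current state of the source.
CREATE DATABASE AdventureWorks_Snapshot_1000
ON  ( NAME = AdventureWorks_Data,
      FILENAME = 'C:\Snapshots\AdventureWorks_1000.ss' )
AS SNAPSHOT OF AdventureWorks;
```

Reports would then need their data source pointed at the new snapshot name, which is one reason a fixed naming convention for rolling snapshots is worth considering.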
11. 11. We also have a wide variety of choices for exporting our report data for use in other applications. Again, because we're running the report against a database snapshot, and the reporting server is on SYD-SQL-01, any of our reporting activities here won't impose a performance penalty on LON-DCSQL-01, where the live data resides. Now that we've examined snapshot technologies, let's move on to the next section of our session. Slide 30. Agenda: Working with data-access concurrency limitations In this section, we'll look at data-access concurrency limitations. There are new features in SQL Server 2005 that can make the server more available in environments with heavy transaction processing. We'll examine pessimistic concurrency controls that have been available in previous versions of SQL Server and then examine new concurrency controls, including optimistic concurrency controls and snapshot isolation. First, let's review pessimistic concurrency controls. Slide 31. Pessimistic Concurrency Controls Concurrency controls were designed to help maintain data integrity. Without such controls, several transactions could update the same data at once, and decision logic in transactions could select inaccurate rows to update. Pessimistic concurrency controls are the default setting in SQL Server 2005. They were the only controls in SQL Server 2000 and earlier versions. Pessimistic concurrency controls require shared resource locks for SELECT operations. This means that updates cannot occur while reads are occurring, and reads cannot occur while updates are occurring. The default behavior is to hold shared locks for just the duration of the SELECT operation. This behavior can be controlled by using different isolation levels. The transaction isolation levels control the behavior of shared locks. The isolation levels solve one or more problems that can occur while accessing data.
They range from reading uncommitted data from memory, bypassing the exclusive locks on the rows on disk, to holding shared locks for the duration of the entire transaction. The isolation levels have different impacts on concurrency. Locking the rows for an entire transaction will prevent any updates from occurring until the transaction completes, thereby reducing concurrency and availability of the database. Sometimes it may be necessary for data integrity reasons to lock rows for an entire transaction. In addition to pessimistic concurrency controls, SQL Server 2005 now has optimistic concurrency controls that handle shared locks in a different manner. Slide 32. Optimistic Concurrency Controls The transaction isolation levels that use optimistic concurrency controls are an optional setting. Optimistic concurrency controls are implemented by using row versioning. Slide 33. Optimistic Concurrency Controls Cont... First, let's look at how row versioning works. Within each instance of Microsoft SQL Server 2005 and for databases enabled for row versioning-based isolation levels, the SQL Server 2005 Database Engine assigns a transaction sequence number to each transaction. Transactions start at the time a BEGIN TRANSACTION statement is run. However, the transaction sequence number starts with the first Transact-SQL statement after the BEGIN TRANSACTION statement. The transaction sequence numbers are incremented by one for each new transaction. Every time a row is modified, the instance of the Database Engine stores a version of the previously committed image of the row in tempdb. Each version is marked with the transaction sequence number of the transaction that made the change. The versions of modified rows are chained using a linked list. The newest row value is always stored in the current database and chained to the versions stored in tempdb.
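On a test system, the version store described above can be observed directly through the sys.dm_tran_version_store dynamic management view, new in SQL Server 2005. This is a sketch; on a busy server the view can return a very large result set:

```sql
-- Each row is one row version held in tempdb, tagged with the sequence
-- number of the transaction that created it.
SELECT transaction_sequence_num,
       version_sequence_num,
       database_id
FROM   sys.dm_tran_version_store;
```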
  12. 12. Slide 34. Optimistic Concurrency Controls Cont... As stated before, optimistic concurrency controls use row versioning. These controls are still designed for data integrity, but will allow reads of committed data while other transactions are holding locks that would prevent SELECT statements from completing. SELECT operations that use these concurrency controls do not use shared locks on the rows in the database, but instead use the row versions maintained by SQL Server when the controls are enabled. The different isolation levels for the optimistic concurrency controls control which row versions are accessed by SELECT operations instead of how shared locks are acquired and released. You will want to make sure that tempdb is optimized for performance and has enough disk space before you implement the optimistic concurrency controls. A heavily accessed database can place a significant load on tempdb for both performance and disk usage. SQL Server deletes row versions when they are no longer necessary to support any transaction or operation that requires them. This helps to limit how much space these use, but large transactions or very large numbers of transactions can cause SQL Server to store many versions in tempdb. It's also worthwhile to note that row versioning has several other uses. Row versioning is used for the inserted and deleted tables that are used by triggers. Multiple Active Result Sets, or MARS, sessions make use of row versions as well. ONLINE index operations, which we'll discuss a little later, use row versions during their operations. Finally, row versioning is also used for row snapshots to implement the optimistic concurrency controls. Slide 35. Snapshot Isolation Level Snapshot isolation level is a new transaction isolation level that uses row versioning and does not require shared locks to read data. Snapshot isolation level enables transactions to read committed data that is being modified by other transactions. 
Snapshot isolation level will allow reads of committed data as the data existed at the time the transaction started. The data may be modified while the snapshot isolation level transaction is running, but any read operations will read the older, committed data. You can be sure that all read operations in the transaction will be accessing the exact same data. Slide 36. Snapshot Isolation Level Cont... To enable snapshot isolation level, you must use the ALLOW_SNAPSHOT_ISOLATION database option. This does not change the default concurrency control or transaction isolation level. The option simply activates row versioning in the database. Once you select this option, all INSERTs, UPDATEs, and DELETEs cause SQL Server to create row versions, regardless of whether there are any transactions using snapshot isolation level. SQL Server does not know whether a snapshot transaction will start before an update completes, so it must maintain row versions just in case a snapshot isolation transaction starts. You can enable transactions to use snapshot isolation level by using the SET TRANSACTION ISOLATION LEVEL SNAPSHOT session command. This command only works if you have previously enabled the ALLOW_SNAPSHOT_ISOLATION database option. Slide 37. Read Committed (Snapshot) Read committed transactions can also be used to reduce concurrency limitations. These transactions read committed data at the time of the statement in the transaction. Different statements in a transaction may get different result sets for the same read operation. This is the default for the pessimistic concurrency controls as well, but read committed snapshots do not acquire shared locks. These transactions can read the latest committed data even if it is being modified by other transactions. Slide 38. Read Committed (Snapshot) Cont...
13. 13. To enable read committed snapshot isolation, enable the READ_COMMITTED_SNAPSHOT database option. You can use the SET TRANSACTION ISOLATION LEVEL READ COMMITTED session option to use read committed snapshot transactions. You should notice that the command does not specify to use the snapshot version of the read committed isolation level. If you have enabled read committed snapshot for the database, all read committed transactions use read committed snapshot. Technically, you do not even have to use this session option, since read committed is the default transaction isolation level. Slide 39. Demonstration: Limiting Locking and Blocking The default locking behavior in SQL Server 2005, and the only behavior in earlier SQL Server versions, uses pessimistic concurrency controls. These concurrency controls assume any data modification is important enough for the read transactions to wait for the update, as well as for any updates to wait until read operations are complete. A new type of control using row versioning, or row snapshots, has been implemented in SQL Server 2005 that can help you reduce locking and contention in your databases. This type of control does not require shared locks for performing read operations and uses a concurrency control that has been named optimistic. You should understand the normal pessimistic concurrency-control locking behavior before examining the behavior of the new optimistic snapshot concurrency controls. The use of the word "snapshot" here should not be confused with database snapshots. Row versioning for optimistic concurrency controls uses tempdb to store row history information. This script will begin a transaction that will update the StandardCost and ListPrice for products that are socks. Updates hold exclusive locks for the duration of the transaction in which they occur. Note that the update transaction has not been committed or rolled back and is still active.
We can establish a second connection to SQL Server to access the database and see the consequences for other operations that the update transaction has on rows it modifies. This script attempts a simple query for all products in the Product table. We will not see a result set appear. We will also not be able to issue any other queries from this connection because the first query is still running. We can verify this by looking at the query execution-time counter at the bottom-right corner of the query window. The time should be increasing. This query is probably being blocked by the update from the first connection. To verify this is the case, we could start a third connection and run the sp_who and sp_lock stored procedures. We could also use Activity Monitor in SQL Server Management Studio. Activity Monitor is located under Management in the Object Explorer pane. We are interested in looking at locking for a specific object-the AdventureWorks Products table. We need to scroll the detail pane to the right to see the Request Mode, Request Type, and Request Status information. We will see four granted Exclusive locks, indicated by the X in the Request Mode column, with a value of GRANT in the Request Status column. We will also look for a shared lock request, which should be an S in the Request Mode column, with a value of WAIT in the Request Status column. Our read query is indeed waiting for the exclusive locks to be released on the Product table before it can complete. Let us roll back the update transaction to release the exclusive locks so that our query can complete. We will switch back to the SELECT query connection, and we will see that the query has now been successfully completed. Notice the execution time by looking again at the query execution time value. The two new optimistic concurrency-control transaction isolation types are Snapshot Isolation Level and Read Committed (Snapshot). 
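The lock inspection described above with sp_who, sp_lock, and Activity Monitor can also be done with the sys.dm_tran_locks view, new in SQL Server 2005. A sketch:

```sql
-- Granted exclusive (X) row locks and the waiting shared (S) request
-- appear side by side in this view.
SELECT request_session_id,
       resource_type,
       request_mode,      -- X for the update, S for the blocked read
       request_status     -- GRANT vs. WAIT
FROM   sys.dm_tran_locks
WHERE  resource_database_id = DB_ID('AdventureWorks');
```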
Disk space and performance can become an issue with tempdb since the row versioning takes place in tempdb. This is the case for all databases enabled and using these concurrency-control mechanisms. We can run an ALTER DATABASE statement to enable the optimistic concurrency controls. Snapshot Isolation and Read Committed (Snapshot) each have their corresponding database option that needs to be set. You do not need to select both options if you only want to use one of the concurrency-control mechanisms. However, we will select both options since we plan to use both types of locking behavior. We will start our examination of the optimistic concurrency controls with Snapshot Isolation Level and will repeat the earlier example while using it. Remember that optimistic concurrency controls are implemented with row versions and not with shared locks. We will run the update that was rolled back earlier. We do not need to specify anything here because the new controls do not affect data-modification locks; they only affect read locks. We can verify that the exclusive locks still exist by opening Activity Monitor and viewing locks by object.
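The ALTER DATABASE statements just mentioned might look like the following sketch. Note the hedged assumption in the comment: enabling READ_COMMITTED_SNAPSHOT typically requires that no other connections are active in the database at the time:

```sql
-- Enable both row-versioning options used in this demonstration.
-- READ_COMMITTED_SNAPSHOT generally cannot be set while other
-- connections are using the database.
ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON;
ALTER DATABASE AdventureWorks SET READ_COMMITTED_SNAPSHOT ON;
```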
14. 14. If we focus on the Product table, we will see that there are, in fact, four exclusive locks in the database-one lock for each updated row. We will not see any locks in a WAIT state since we have not started any other transactions. We will create a new connection now that we have verified that the exclusive locks exist. This second connection will again be used to query the Product table, while exclusive locks are held in the table. The first step is to activate the Snapshot Isolation Level transaction for the connection. We will use the SET keyword to activate the concurrency control and use the row versions in tempdb. Now we will query the Product table for products that have been updated. We updated the sock products, so those products have exclusive locks on those rows. We can see that the query did not wait for the exclusive locks to be released, and it is not blocked by the update transaction. SQL Server read the data out of the row-versioning information in tempdb. Once versioning is enabled, SQL Server keeps row-versioning information for every row in the database that is modified. The connection that started the update transaction began before we started any connection with Snapshot Isolation Level enabled. However, we were still able to read the data since it existed before the update and while the update transaction was taking place. SQL Server stores these row versions in tempdb once we alter the database to enable Snapshot Isolation Level. SQL Server does delete old row versions if the modification is complete and no read transactions still require them. We will switch back to the update connection and commit the transaction. We committed the transaction to demonstrate the behavior of Snapshot Isolation Level. We can verify that the update was committed by querying the database from the update connection. We should see the updated data. We will switch back to connection 2 and run the query once again after the update transaction has been committed.
If we look at the result set, we will notice that it still contains the premodified data. We get this result even though changes have been committed in the database. This SELECT statement is still part of a transaction that started before the update was committed to the database. SQL Server reads the row version of the data at the time the Snapshot Isolation Level transaction was started. SQL Server will also perform conflict detection when Snapshot Isolation Level is used. We will receive the error message 3960 if we try to update the rows that were modified by the update transaction. This error occurs because SQL Server detected an update conflict. Rows that this transaction tried to update have been modified and committed after this transaction started. If SQL Server were to allow the second update, an unpredictable update would occur from the perspective of the second update transaction. The second transaction could be using outdated data in WHERE clauses or JOIN criteria to perform its update. The update conflict is designed to prevent the lost update issue. We will begin our examination of Read Committed (Snapshot) transactions in the same manner as before. We will start a connection to perform an update. Different rows in the Product table will be updated in this transaction than in the last example. AdventureWorks' supplier of fabric for jerseys has had to increase its price for the same reason as was necessary for socks, so we will update the StandardCost and ListPrice for all jerseys. As in the previous example, this transaction will not be committed immediately. We should verify once again that exclusive locks have been granted on rows in the Product table. We will open Activity Monitor and view locks by object. We will select the Product table and scroll to the right to see the Request Mode and Request Status columns. We should see eight exclusive row locks this time.
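The reader's side of the snapshot isolation demonstration, including the update conflict, can be sketched as follows. The sock filter is borrowed from the earlier sock example and is an illustrative assumption:

```sql
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;

-- Reads the versioned, pre-update data even while another connection
-- holds exclusive locks on these rows.
SELECT ProductID, StandardCost, ListPrice
FROM   Production.Product
WHERE  Name LIKE '%Sock%';

-- If the other connection has since committed an update to these rows,
-- this statement fails with error 3960 (update conflict) and the
-- transaction is rolled back.
UPDATE Production.Product
SET    ListPrice = ListPrice * 1.10
WHERE  Name LIKE '%Sock%';

COMMIT TRANSACTION;
```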
Now we will start a new connection to read the StandardCost and ListPrice data for the jerseys. We will enable the Read Committed (Snapshot) concurrency control with this example. The process is the same as with Snapshot Isolation Level. We use the SET keyword to turn on the Read Committed (Snapshot) session option. We will begin a transaction and run a read query on the eight jersey products in the Product table. The result set should return immediately with eight rows. The transaction was not blocked by the update, and, once again, the data returned is the original premodified data. We will switch back to Connection 1 and commit the update transaction to increase the StandardCost and ListPrice for all jerseys. We will query the database from the update transaction connection to verify that the modification has been committed. The eight jersey products are in the result set. All of them have their new values for StandardCost and ListPrice. We will switch back to the Read Committed (Snapshot) query connection and run the SELECT statement again. The result set returns without delay and contains the eight jersey products. Notice that the StandardCost and ListPrice values are the new, updated data values. The Read Committed (Snapshot) concurrency control allows us to read data without acquiring shared locks on the data, just like the Snapshot Isolation Level locking mechanism did. The difference here, however, is that we can read the last committed change even after the transaction started. We would expect that SQL Server would allow us to modify this data since we can read the committed data in our transaction. We will run an update to increase the StandardCost and ListPrice even more. SQL Server will not return an error message and will not prevent us from performing the update.
15. 15. We can commit the transaction to verify that no error message occurs on commit. We will run the SELECT statement to verify that the second update did commit. The optimistic concurrency controls can be used to reduce locking, blocking, and contention in your database. Remember that SQL Server still stores the row-versioning data in tempdb. You should still only use these locking mechanisms when locking and blocking are issues or are expected to be issues. Even though locking does not occur on the rows in tempdb, contention can still become an issue. Another new feature of SQL Server 2005 is Online Indexing. Slide 40. Online Index Operations OFFLINE index operations were the only options available in previous versions of SQL Server. The key difference between an ONLINE index operation and an OFFLINE index operation is that during an ONLINE index operation, the underlying data in the table remains accessible. The table is accessible for both read and update operations. In fact, if you are creating a clustered index, any non-clustered indexes are still available for the optimizer to use. The new clustered index is not used until its creation operation is complete. SQL Server continues to access the old table heap until that time. Of course, a non-clustered index is not available during any operation being applied directly to that non-clustered index. A variety of index commands support the ONLINE option, such as CREATE INDEX, ALTER INDEX, and DROP INDEX. In addition, the ALTER TABLE command can be used if you either ADD or DROP a Unique index or ADD or DROP a primary key with the clustered index option. Let's take a look at an ONLINE index operation. Slide 41. Online Clustered Index Operation When an ONLINE CREATE CLUSTERED INDEX operation is started, there could be many transactions reading and writing data to that table.
Let's look at what happens and what access is available during an ONLINE clustered index operation. Slide 42. Online Clustered Index Operation Cont... The first phase is the preparation phase. During this phase, SQL Server will prevent new transactions for a short period of time while it creates the new target index structure and some temporary objects to be used during the creation. Slide 43. Online Clustered Index Operation Cont... During the build phase, transactions can use the table again while data is sorted and copied into the new clustered index. SQL Server will keep track of the rows that are modified during the initial copy of data pages and refresh those data pages in the index. Slide 44. Online Clustered Index Operation Cont... During the final phase, all data is copied and SQL Server once again prevents access to the table for a very short period of time. This time is used to perform cleanup on temporary objects and, more importantly, to invalidate all execution plans that used the previous table structure. This is necessary so that new transactions can use the clustered index and optimize appropriately. SQL Server uses shared locks to prevent access and intent shared locks during the copy operations. Intent shared locks do not prevent read and write operations.
  16. 16. Slide 45. Online Clustered Index Operation Cont... After this phase, the clustered index is built and ready for use. Slide 46. Demonstration: Performing Online Index Operations The default behavior in SQL Server 2005, and the only behavior in earlier versions of SQL Server, is for an index and its underlying table data to be unavailable during the execution of CREATE INDEX, ALTER INDEX, or DROP INDEX statements. This demonstration will show a new feature: online index operations. Indexes are still unavailable while being created or dropped, but the table is still available, even when creating or dropping clustered indexes. The Sales.SalesOrderDetail table has been modified for this demonstration. The clustered index on the primary key has been dropped so the demonstration can create a new clustered index. We will start by performing an offline index operation to better demonstrate the online index operation feature. Notice the WITH (ONLINE = OFF) clause in the CREATE INDEX statement. This clause is optional because the default behavior is offline. If the clause is omitted in the statement, then the index operation is offline. We should also notice that the index being created is a clustered index. Online index operations work with clustered and non-clustered indexes. Online index operations are not available for XML indexes or for indexes if the table contains large object data types, such as text, ntext, image, and even the new varchar (max) and varbinary (max). A clustered index is used here to simplify the demonstration. Normally, you would not create a clustered index with so many columns, but it was done here to ensure that the index operation takes long enough for the demonstration. Open a second connection to run other statements against the SalesOrderDetail table. You will be creating a clustered index so that the table will be unavailable until the index operation is complete. 
This script will query the SalesOrderDetail table after the first connection starts an offline index-creation operation. We will switch back to the first connection and run the script to create the index. Now we will switch back to the connection with the query and quickly run the SELECT statement. Notice that the query does not return a result within a few seconds. The index operation has made the table unavailable. Any existing non-clustered indexes that might cover this query would also be unavailable in this case because they are being updated with the new key values of the clustered index. Here we see the time it took for the SELECT statement to run. Now we will run the two DBCC statements. The first will drop all data pages from the buffer cache, requiring the data to be read back from disk. This will remove any speed difference in query execution time caused by reading the data from the disk, as was the case with the first query. The second DBCC statement will empty the procedure cache so that we can be certain that the execution plan will not be reused and affect the results. If we now run the same query a second time, we will see that the result set is returned almost immediately. This shows that the index operation was the reason for the delay in query processing. We can also switch back to Connection 1 and note the time SQL Server took to create the index. You can see that the index-creation operation took about the same time to complete as the query, or just a few seconds longer. The difference between the two durations is simply the time it took to switch to Connection 2 and start the query. Now we will perform an online index operation. First, we will run the same procedure as before, only with the WITH (ONLINE = ON) clause specified in the CREATE INDEX operation. We will need to drop the clustered index first since we have already created it. Notice the WITH (ONLINE = ON) clause in the new clustered index-creation statement. 
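The two cache-clearing statements the narration refers to are, in all likelihood, these standard DBCC commands:

```sql
-- Flush clean data pages from the buffer cache so the next query
-- must read from disk, keeping the two timings comparable.
DBCC DROPCLEANBUFFERS;

-- Empty the plan cache so a cached execution plan is not reused.
DBCC FREEPROCCACHE;
```

Both commands affect the whole instance, so they should only be run this way on a test server.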
This next script will run a query against the table that we started the online clustered index operation on. The SELECT statement is the same as the one run earlier. We will switch to Connection 1 and run the script to create the clustered index online. Now we will switch back to Connection 2 and run the SELECT statement. Notice that the result set is returned almost immediately. The online index-creation operation will take longer than the offline index operation. That is because SQL Server is performing more steps during an online operation than during an offline operation. If we note the time to carry out the online operation, we will see that it did take longer. How much longer the operation takes depends on the size of the table and the DML (update) activity on the table during the index creation. The process could easily take twice as long or longer to complete online.
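A minimal sketch of the online variant (the index name and column list are illustrative, not the demonstration's actual script):

```sql
-- Drop the existing clustered index, then re-create it online; the
-- table stays available for reads and writes during the build phase.
DROP INDEX CIX_SalesOrderDetail ON Sales.SalesOrderDetail;

CREATE CLUSTERED INDEX CIX_SalesOrderDetail
ON Sales.SalesOrderDetail (SalesOrderID, SalesOrderDetailID, ProductID)
WITH (ONLINE = ON);
```

The same clause applies to rebuilds, for example ALTER INDEX ... REBUILD WITH (ONLINE = ON).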
  17. That concludes the demonstration of online index operations. Now let's move on to the last section of our session. Slide 47. Agenda: Understanding availability with peer-to-peer replication In this section, we'll look at how you can use peer-to-peer replication to provide a hot standby of a replication publication. We will also see how it can provide fault tolerance through time partitioning. We will then examine an advantage that peer-to-peer replication has over other high-availability options, namely, load balancing. First, let's review the features of peer-to-peer replication. Slide 48. Peer-to-Peer Replication You can use peer-to-peer replication to provide a hot standby of a replication publication. Your publications can be as small as a single table or as large as an entire database. Peer-to-peer replication publications have a few restrictions that standard replication publications do not. A table can exist in only one article that is a member of a publication configured for peer-to-peer replication. Articles assigned to a peer-to-peer replication publication cannot use horizontal or vertical filtering, nor can they contain a timestamp column. Identity columns are not recommended for peer-to-peer replication articles. If you use identity columns, you will need to manage identity ranges on all replication nodes. Unlike clustering or mirroring, peer-to-peer replication has multiple read copies of the data. Read performance can therefore be improved. Update performance is the same as a single database, although all updates must still be recorded in the database and transaction log of every server. There are no realistic distance limitations. No shared hardware is required, and no communication between the replication nodes is required to commit transactions. Transactions are not guaranteed to be committed at all nodes in the case of server failure, so some committed data loss is possible. 
Peer-to-peer replication uses transactional replication, so the amount of data loss is likely to be small. Peer-to-peer replication has no automatic conflict detection like merge replication does. You must use your application to partition updates to the database. This can be done either by explicitly coding in the application to only update certain rows or by controlling which clients can connect to which servers. You can also use time limitations on your clients to avoid conflicts. The next example explains the use of time restrictions to avoid conflicts and provide high availability through peer-to-peer replication. Slide 49. Fault Tolerance with Peer-to-Peer This example uses a company that has a support infrastructure consisting of three support offices. The offices are located in Los Angeles, London, and Taipei, Taiwan. Their customer support database is replicated using peer-to-peer replication between the three offices. If one of the servers fails, the affected office can use and update another office's database, while during normal operation each office works with its own local copy of the database. Each of the support offices is open eight hours with no overlapping time, so updates will be made at only one of the databases at a time. With this scenario, no conflicts should occur and even identity columns would be reasonably safe to use. Peer-to-peer replication can also be used for high availability and database load balancing. Slide 50. Load Balancing with Peer-to-Peer Load balancing is an ability of peer-to-peer replication that is more closely associated with scalability than availability. It is an advantage of peer-to-peer replication over the other high-availability options that you've seen. These examples use front-end IIS application servers. This first example uses update and read load balancing. While all transactions
  18. must eventually be replicated and written at both servers, a burst of transactions can be handled more quickly in this scenario. The application server must partition the updates across the servers. Ranges of products, customers, or order ID numbers are just a few examples. Slide 51. Load Balancing with Peer-to-Peer Cont... The second example on the right is similar to previous replication load balancing scenarios. All updates are performed on one server, but reads are performed from both. The benefit that peer-to-peer replication provides with this scenario is the ability to quickly switch updates to the other server in case of failure or even during maintenance. Slide 52. Peer-to-Peer Replication Topologies Peer-to-peer replication topologies will usually have all nodes replicating to all other nodes. This topology provides for the least amount of data loss and greatest level of availability. You are not required to do this, and you may choose not to because of network bandwidth or server limitations. Peer-to-peer replication checks whether a given transaction has been replicated to a server and will not replicate it a second time to that server. For example, looking at the topology on the left, if an update occurs on Server C and Server C then replicates those changes to Servers A and B, the transaction will not be replicated again between servers A and B. Now let's return to our demo network to see this in action. Slide 53. Demonstration: Using Peer-to-Peer Replication Configuring a distribution database is the first step in configuring replication of any type. We want each server that will participate in peer-to-peer replication to have its own distribution server. We will have a single point of failure in our high availability solution if we use a shared distribution server. Servers in remote locations, like London and Sydney, would usually have distribution servers at each site. 
We will use the Configure Distribution Wizard to configure the London server for the distribution role. We can make each replication server its own distribution server, or we can have a separate server at each site act as a distribution server. Replication works best with a file share to hold replication snapshots rather than the default file location. You should be careful when you enter the replication snapshot path since the Configure Distribution Wizard does not validate the path. The distribution database stores changes to transactional publications until subscribers can be updated. This server will also be a publisher. Distribution will be configured when we complete this wizard. Now we will finish the wizard and configure distribution on this server. We are making each server its own distribution server in this example. You may want to separate the publisher and subscriber roles from the distribution role to offload the overhead of the distributor from your database servers. Both servers will have the same configuration for distribution. Peer-to-peer replication cannot be initialized with a snapshot; it must be configured without one. You can use a backup of a database to initialize nodes for replication in SQL Server 2005. This is true whether you are going to use peer-to-peer replication or another form of replication. This script will create a backup of the database on the London server. You have several other options available to initialize for replication without a snapshot. You can create duplicate database schemas at each node in the replication and transfer the data using linked servers. You can perform a bulk export and import using the Bulk Copy Program (bcp) to transfer the data. Even easier, you could create a Data Transformation Services, or DTS, package to copy the database objects and data to the peer nodes. Support for replica initialization without a snapshot is a new feature of SQL Server 2005. Using a backup for initialization is an excellent solution. 
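Scripted with system stored procedures, the distribution setup and backup narrated above might look roughly like this (server names, share paths, and the password are illustrative, and some optional parameters are omitted for brevity):

```sql
-- Configure the London server as its own distributor.
EXEC sp_adddistributor @distributor = N'LONDON', @password = N'StrongP@ssw0rd';
EXEC sp_adddistributiondb @database = N'distribution';

-- Register the publisher, pointing snapshots at a file share; the
-- wizard does not validate this path, so enter it carefully.
EXEC sp_adddistpublisher @publisher = N'LONDON',
     @distribution_db = N'distribution',
     @working_directory = N'\\LONDON\ReplData';

-- Back up the publication database so the peer can be initialized
-- from the backup instead of a snapshot.
BACKUP DATABASE AdventureWorks
TO DISK = N'\\LONDON\Backup\AdventureWorks.bak';
```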
This is especially the case when you are going to be using peer-to-peer replication for an entire database. The other methods would take a significant amount of time on large databases.
  19. This script will restore the backed-up database to the Sydney server. Peer-to-peer subscriptions cannot be reinitialized because of the restriction of not using snapshots for initialization. You would need to initialize the data using one of the just-mentioned methods if the data is out-of-sync. It is also the responsibility of the administrator configuring peer-to-peer replication to ensure the data is exactly the same in all the databases at the time the replication is configured. This will likely require restricting database access while replication is configured. You will see how to use the stored procedures to configure peer-to-peer replication in this demonstration. The peer-to-peer replication configuration is also easier to understand using the stored procedures. We need to enable a database for replication. You can do this only after you have configured distribution and publishing on your server. The enabling process creates some tables to track data changes. Peer-to-peer replication replicates each separate transaction using transactional replication. Transactional replication requires you to create a Log Reader Agent. The Log Reader Agent is a small program that reads transactions from the transaction log and copies those to the distributor. This is the publisher-to-distributor step. We receive a message about the Contoso administrator not being able to be granted access to or revoked from the proxy. This is because we are logged in as the administrator. You can create a publication for replication only after you have created a Log Reader Agent. Your publication will contain articles that specify the objects and data you want replicated. We receive a message about a property for our publication being changed to True. The sp_addarticle stored procedure adds articles to your publications. We will need to run it once for each object we want added to the publication. 
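The sequence of stored procedures narrated above might be sketched like this on the London server (publication, article, and parameter values are illustrative; the actual demonstration script would likely include additional parameters):

```sql
USE AdventureWorks;

-- Enable the database for transactional publishing.
EXEC sp_replicationdboption @dbname = N'AdventureWorks',
     @optname = N'publish', @value = N'true';

-- Create the Log Reader Agent (the publisher-to-distributor step).
-- NULL login/password means the SQL Server Agent service account.
EXEC sp_addlogreader_agent @job_login = NULL, @job_password = NULL;

-- Create a publication enabled for peer-to-peer replication.
EXEC sp_addpublication @publication = N'Peer2Peer',
     @enabled_for_p2p = N'true',
     @allow_initialize_from_backup = N'true',
     @status = N'active';

-- Add one article; status 16 makes the publisher use parameterized
-- statements, the only status peer-to-peer replication supports.
EXEC sp_addarticle @publication = N'Peer2Peer',
     @article = N'Product',
     @source_owner = N'Production',
     @source_object = N'Product',
     @status = 16;
```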
When configuring an entire database to be replicated, this process can get tedious. Using the replication wizards to add articles to your publication will be easier in that case. This example has only one article. We can now create the push subscription to the other node or nodes included in the replication topology. There is only the Sydney office server here to create a push subscription to. We receive a message stating that the distribution agent job will run under the SQL Server Agent Service account. The last step in configuring peer-to-peer replication of the first replica node is to create a Push Subscription Agent. Similar to the Log Reader Agent, the Push Subscription Agent is a small program that reads the distributor for new transactions and copies them to the subscriber database. This is the distributor-to-subscriber step. We have configured one replica node for peer-to-peer replication in a two-node topology. We now have to repeat the sp_addsubscription step for each additional node. You will need to enable the database at each node for replication before you can create any of the database objects. You will also need to create a Log Reader Agent for each database because each of the replica databases has its own transaction log in which transactions for replication can occur. You need publications with the same name at each node. Peer-to-peer replication will not work correctly if you use different publication names. Peer-to-peer publications need to have the exact same articles with the same name at each node. The article status must be 16 since this is the only status that peer-to-peer replication supports. Status 16 specifies that the publisher use parameterized statements in published transactions. Peer-to-peer articles also cannot be horizontally or vertically filtered. You can add the push subscription once you have duplicated all the articles and publications. 
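The push subscription and its agent can be sketched as follows (server and database names are illustrative; the 'replication support only' sync type assumes the data at both nodes is already identical):

```sql
-- Create the push subscription to the Sydney peer.
EXEC sp_addsubscription @publication = N'Peer2Peer',
     @subscriber = N'SYDNEY',
     @destination_db = N'AdventureWorks',
     @subscription_type = N'push',
     @sync_type = N'replication support only';

-- Create the distribution agent that moves transactions from the
-- distributor to the subscriber (the distributor-to-subscriber step).
EXEC sp_addpushsubscription_agent @publication = N'Peer2Peer',
     @subscriber = N'SYDNEY',
     @subscriber_db = N'AdventureWorks',
     @job_login = NULL, @job_password = NULL;
```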
The script creates the push subscription from the Sydney server (the distributor) to the London server (the subscriber). Only push subscriptions, not pull subscriptions, work with peer-to-peer replication. You can create the agent that copies the data from the distributor to the subscriber after you create the push subscription. Completion of this step on all nodes completely configures peer-to-peer replication, and it is now safe to begin allowing new transactions at any node. You should be careful with peer-to-peer replication. SQL Server is using transactional replication and not merge replication and therefore does not perform conflict detection. It is up to the application to partition the data modifications so that conflicts do not occur. You have now configured the Production.Product table in the AdventureWorks database to be replicated between the London SQL server and the Sydney SQL server. You will want to verify that the replication is actually working. You can perform updates to the table at each server and run queries to verify that the data is replicated. You should query the data on one of the servers to establish the original values. The script contains a query to select the Name, StandardCost, and ListPrice for all HL mountain bike frames. The London server returns several products with a ListPrice amount close to 1,400 dollars. We will make a connection to the Sydney server to verify that the data currently is the same on the other server. Running the same query on the Sydney server returns the same result set as on the London server. We will switch back to the London server and run an update of the ListPrice for the HL mountain bike frames. The product is in higher demand, so increase the ListPrice to 2,000 dollars. We will run the previous query to verify that the update was successful.
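The verification steps just described can be sketched as follows (the name filter and exact query shape are assumptions based on the narration):

```sql
-- Run on either server to establish the current values.
SELECT Name, StandardCost, ListPrice
FROM Production.Product
WHERE Name LIKE N'HL Mountain Frame%';

-- Run on the London server; after replication delivers the change,
-- the same SELECT on the Sydney server should show the new price.
UPDATE Production.Product
SET ListPrice = 2000.00
WHERE Name LIKE N'HL Mountain Frame%';
```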
  20. All the HL mountain bike frames now have a ListPrice of 2,000 dollars, so the update successfully occurred. The real test that replication works correctly from the London server to the Sydney server is to switch back to the Sydney server connection and run the same query. If the replication was successful, the ListPrice will be 2,000 dollars for all the HL mountain bike frames. You can see that the replication was successful. You now want to test that replication will occur successfully from the Sydney server to the London server. Update the ListPrice, but because demand has skyrocketed, set it to 4,000 dollars. Run the earlier query again on the Sydney server to verify the successful completion of the second update. You can verify the successful replication of the update by running the query one last time on the London server. You can see that the update did occur successfully since the prices of all HL mountain bike frames are now 4,000 dollars. We can use the Replication section in Object Explorer as a second method of verifying the successful configuration of peer-to-peer replication. We should see a publication (in this case, Peer2Peer) configured on the AdventureWorks database after the publication was created. There should also be a push subscription object beneath the publication. This object is created after the successful execution of the sp_addsubscription stored procedure. A subscription object should also exist under the Local Subscriptions level. You will not see this object if the sp_addpushsubscription_agent stored procedure was not successfully carried out. We need to verify that those objects exist at all servers in our peer-to-peer replication topology. Because of the successful replication witnessed in the preceding section, we know that we will see that all the required replication objects do exist. 
We can use the Replication Monitor tool to view the history for both the publisher-to-distributor and distributor-to-subscriber synchronizations. We will need to add and connect to the replication publishers we are interested in monitoring. We can add a SQL Server publisher, an Oracle publisher, or a distributor and its publishers. The third option is useful if you have a central distributor for multiple publishers. You can add all the publishers using that distributor in one operation. The Sydney publisher is already listed here, so we will need to add the London publisher. These two publishers are added separately because they are not using a shared distributor, which provides a more robust, high-availability solution. First, we will look at the London publisher. In the details pane on the right, we can see the current replication status. We can click the Subscription Watch List tab to view the status of all subscriptions on that publisher. If we right-click a subscriber entry and then select View Details, we can see specific information about the subscription. The Distributor to Subscriber History tab shows the current status and history of replicated operations and transactions. There should be an entry for the one update transaction created on the London server. You will set up all nodes the exact same way, so you should see similar information. If you look at the Sydney server replication history, the status should be the same as for the London server. The exact transaction and operation history details normally will not match. They do in this example because essentially the same transaction was performed on both servers. This completes our session. What have we covered today? Slide 54. Summary SQL Server 2005 is designed to be highly available. Traditional availability options have been enhanced to provide better, faster, and more reliable failover that can take place over longer distances. 
Failover options address everything from hardware failure to human-caused failure to natural disaster. New features in SQL Server 2005, such as database mirroring, provide hot standby solutions that are extremely reliable and cost-effective and that aren't limited by distance. Features like database snapshots are flexible and can be implemented independently to provide higher availability or to augment and enhance other high-availability options. They can also be used to provide a quick fallback option for administrators who maintain databases regularly. Other features, such as the new concurrency controls and ONLINE index operations, allow for better data access during normal operation, meaning that users can continue to get to their data even when SELECTs are being performed and indexes are being created.
  21. To get more information on the products and technologies we have covered today, we have some online resources available that can help. Slide 55. More Information For the most comprehensive technical information on Microsoft products, visit the main TechNet Web site at www.microsoft.com/technet. Additionally, visit www.microsoft.com/technet/sql-03 for more information on books, courses, certifications, and other community resources that relate directly to this particular session. There are a number of other resources that are available from Microsoft. Slide 56. Microsoft Press There are MS Press publications planned for SQL Server 2005; currently, the SQL Server Administrator's Companion and Pocket Consultant are due for publication around the launch time frame. Microsoft also has instructor-led courses if you prefer the classroom-style environment. Slide 57. Microsoft Learning Microsoft Learning (formerly Microsoft Training and Certification and Microsoft Press) develops courseware called Microsoft Official Curriculum (MOC), which includes E-Learning, Microsoft Press books, workshops, clinics, and Microsoft Skills Assessment. MOC is offered in instructor-led environments; it offers comprehensive training courses for IT professionals, and will teach you how to support and implement solutions using Microsoft products and technologies. The courses that best support this session are MS-2733: "Updating Your Database Administration Skills to Microsoft SQL Server 2005" and MS-2734: "Updating Your Database Development Skills to Microsoft SQL Server 2005," both of which are already available. For more information, please visit www.microsoft.com/learning. There is also an assessment program available that can help you test your knowledge. Slide 58. Skills Assessment Microsoft Skills Assessment is a free online learning tool. It's an easy way for IT professionals to check their skills. 
You can quickly check your skills in implementing or managing Microsoft products or business solutions. Just take a short, 30-question assessment, and see how well you know your stuff. The Skills Assessment includes a Personalized Learning Plan, which includes links to Microsoft Official Curriculum, specific TechNet articles, Microsoft Press books, and other Microsoft Learning content. There's also a way to measure how well you did compared with others who took the same assessment. Microsoft Skills Assessment is an expanding learning platform. Available now are assessments for Windows Server™ 2003, including security and patch management; Exchange Server 2003; Windows Storage Server; Office 2003; and Visual Studio® .NET. If you want to take your skills assessment to the next level, a number of certification programs are available. Slide 59. Product Usage Scenario Certifications For SQL Server 2005, there is a New Generation of Certifications. The Technology Specialist in SQL Server 2005 credential is the base prerequisite for the 3 professional certifications: Database Administrator, Database Developer and Business Intelligence Developer. These new certifications are more targeted to specific job roles, map to the most
  22. common product usage scenarios and are a more cost-effective way to build and validate your skills. For more information on the new generation of certifications, go to www.microsoft.com/mcp. This session was brought to you by TechNet. TechNet has gone through some changes recently that I'm sure you'll find interesting to hear about. Slide 60. TechNet Subscription Many of you may be familiar with TechNet events and the TechNet Web site, but have you realized the benefits of being a TechNet Plus subscriber? A TechNet Plus subscription is the most convenient and reliable resource for IT professionals evaluating, managing, and supporting Microsoft technologies. With a TechNet Plus subscription, you can: Evaluate Microsoft software without time limits. This benefit allows you to try products at your own pace and make informed decisions without worrying about the software expiring. TechNet Plus evaluation software includes the latest Microsoft operating systems, server applications, and Office products. With TechNet Plus, you can also save time resolving technical issues. TechNet Plus subscriptions include a range of support options, including the complete Microsoft Knowledge Base delivered each month on portable media, and two complimentary professional support incidents to address your technical roadblocks. TechNet Plus offers centralized access to current, authoritative technical guidance, software and support resources from Microsoft. IT professionals around the world rely on TechNet Plus to help them build their technical expertise and be successful implementing Microsoft solutions. For details, visit www.microsoft.com/technet/subscriptions. On the subject of TechNet and support, the new TechNet support pages outline all the support options open to you. Slide 61. TechNet Troubleshooting and Support The enhanced TechNet Troubleshooting and Support page outlines all the ways to get support assistance from Microsoft. 
From free online support options to subscription-based support, you'll find all your Microsoft support resources in one location at www.microsoft.com/technet/support. TechNet also provides a number of community resources. Slide 62. Community Help There are a number of free community resources available on TechNet. You can attend a regular chat with members of the product groups or technology specialists from Microsoft, or you can attend a webcast where you can see sessions like the one you've just watched but presented live and with the ability to ask questions as you go. You can also read or post questions in the public newsgroups. The Newsgroup page lists the available groups and provides an interface from which you can read and post messages. TechNet Plus subscribers can use these groups to post questions that, through their subscription ID, will be answered by Microsoft within 24 hours. The main community site provides a comprehensive list of resources available (more than we can cover on this slide), plus the page has some dynamic features with continually updated content. The Events page provides dates and details where you can attend a TechNet event live. These events take place worldwide and provide the opportunity for you to talk to Microsoft specialists face-to-face. And finally, the TechNet Columns provide a variety of topics written by industry authors.