Divide & Conquer Reporting By Scaling Out with Replication
Conquer Reporting By Scaling Out SQL Server January 16, 2013
My Contact Information
Anthony Sammartino MCITP(BI), PMP®
Vice President
TekPartners Business Intelligence Solutions
(954) firstname.lastname@example.org
Twitter: @SQLAnt
LinkedIn: http://www.linkedin.com/in/apsammartino
Blog: www.sqlant.me
Web: www.tekpartnersbi.com
About Me
• Started working with SQL Server 7.0 in 1999
• 14 Data Warehouse / BI Projects in 13 Years
• 11 Years SQL Server 2000, 2005, 2008(R2), 2012
• Live in South Florida with my wife and two daughters
• Fan of Miami and Philadelphia Sports teams, all things technology
• Solutions Architect, Project Manager, Technical Lead
MCP(SQL Server), MCTS(BI), MCITP(BI), PMP®
Session Overview
1. Case for Scaling Out SQL Server for Reporting
2. Choosing a Method to Scale Out SQL Server
3. Architecture with Reporting Scaled Out
4. Replication Server Roles
5. Replication Agents
6. Distribution Database
7. Publications, Articles, Subscriptions
8. Partitioning
9. References for Replication
10. Questions
The "Situation"
Problem Statement:
A company named House Widgets Inc. has a main application that sells widgets online. Every time the Sales staff runs reports to see how many widgets were sold, the application slows and even times out. Sales staff complain their reports are slow. Users complain the application is slow.
Background:
• DBA team has tuned the reports (T-SQL) through execution plans.
• They also verified the queries are using the correct indexes (Seeks).
• They even resorted to dirty reads using isolation levels and NOLOCK hints in the code.
• Deemed the data is just too large and too busy. Thousands of users are trying to access it and change it at the same time.
• Solid State Disks were even thrown at it. Helped for a short time.
Resolution:
• Time to think about separating the databases for the application and reporting.
• Many ways to scale out SQL Server for reporting.
• Many variables to think about.
Choosing a Method to Scale Out
Requirements
• Requirement is to have real time reporting.
• Requirement is to only report on half of the db tables.
Transactional Replication…Why?
• Transactional Replication happens in near real time.
• Once set up, transactions flow.
• Only the tables you choose are published, which fits the half-of-the-tables requirement.
Other types of Replication
• Snapshot: Point in time. For example, once a day at 9am. Not near real time.
• Merge: Near real time, but ideal for situations where two dbs need to be kept in sync with writes in two locations. Example: Bank with two branches in NY, FL.
Database Mirroring
• Mirroring is very easy to set up and maintain.
• It’s a whole database or nothing solution.
• Point in time using database snapshots.
Always On Availability Groups *NEW SQL Server 2012
• Near real time transactions.
• It’s a whole database or nothing solution.
• Enterprise Edition needed $$$ on both Source and Replica.
Service Broker
• Near real time transactions.
• Harder to set up and maintain, more suited for transforming data.
Architecture with Reporting Scaled Out to a Replica
[Diagram: Application (HouseWidgets.com) connects to the Publisher, SQL Server (SQLVM03), which hosts the Application DB. Transactional Replication flows from the Publisher to the Distributor & Subscriber, SQL Server (SQLVM03SUBSCRIBER), which hosts the Distribution DB and the Application DB Replica (Read Only).]
Replication Server Roles
Publisher:
The publisher is the source SQL Server that contains the database objects that you wish to replicate. In this case SQLVM03.
Distributor:
The distributor is the server that you set up to “distribute” transactions from the publisher to the subscriber server.
Subscriber:
The subscriber is the server that hosts the replica database and where the reporting will be directed from the application. In this case SQLVM03Subscriber.
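The three roles above can be wired together in T-SQL. The following is a minimal sketch, not a definitive setup script. It assumes the Distributor lives on SQLVM03SUBSCRIBER (as in the architecture slide); the password and the snapshot share path are placeholders that do not come from the deck.

```sql
-- Run on the Distributor (SQLVM03SUBSCRIBER in this deck).
-- Assumptions: the password and the snapshot share are placeholders.
USE master;
GO
-- Mark this instance as a Distributor.
EXEC sp_adddistributor
    @distributor = N'SQLVM03SUBSCRIBER',
    @password    = N'<StrongPasswordHere>';
GO
-- Create the distribution database (1 = Windows Authentication).
EXEC sp_adddistributiondb
    @database      = N'distribution',
    @security_mode = 1;
GO
-- Allow SQLVM03 to publish through this Distributor.
EXEC sp_adddistpublisher
    @publisher         = N'SQLVM03',
    @distribution_db   = N'distribution',
    @working_directory = N'\\SQLVM03SUBSCRIBER\ReplData',  -- hypothetical share
    @security_mode     = 1;
GO
-- Then, on the Publisher (SQLVM03), point it at the remote Distributor:
-- EXEC sp_adddistributor
--     @distributor = N'SQLVM03SUBSCRIBER',
--     @password    = N'<StrongPasswordHere>';
```

The same steps are available through the Configure Distribution wizard in SQL Server Management Studio; scripting them simply makes the setup repeatable.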
Secret Agents…Man
• Replication creates a number of Agents.
• These Agents run as jobs scheduled under SQL Server Agent.
• SQL Server Agent must be turned on.
• Recommend that the restart options be checked, and that SQL Server Agent be configured with a service account that has access to both machines.
• Replication agents can be administered from SQL Server Replication Monitor and SQL Server Management Studio.
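A quick way to check the points above is to query the instance directly. This is a small sketch that assumes SQL Server 2008 R2 SP1 or later (for sys.dm_server_services) and relies on the fact that replication jobs are filed under msdb job categories whose names start with REPL-.

```sql
-- Is SQL Server Agent running, how does it start, and under which account?
SELECT servicename, status_desc, startup_type_desc, service_account
FROM sys.dm_server_services;

-- List the replication agent and maintenance jobs on this instance.
SELECT j.name AS job_name,
       c.name AS category,   -- e.g. REPL-Snapshot, REPL-LogReader, REPL-Distribution
       j.enabled
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.syscategories AS c
    ON c.category_id = j.category_id
WHERE c.name LIKE N'REPL-%'
ORDER BY c.name, j.name;
```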
More Secret Agents
Snapshot Agent
• Used with all types of replication.
• It prepares schema and initial data files of published tables and other objects.
• Records information about synchronization in the distribution database.
• The Snapshot Agent runs at the Distributor.
Log Reader Agent
• Used with transactional replication.
• Moves transactions marked for replication from the Publisher to the distribution database.
• Each database published using transactional replication has its own Log Reader Agent.
Distribution Agent
• Used with snapshot replication and transactional replication.
• It applies the initial snapshot to the Subscriber and moves transactions held in the distribution database to Subscribers.
• The Distribution Agent runs at either the Distributor for push subscriptions or at the Subscriber for pull subscriptions.
More Secret Agents
Merge Agent
• Used with merge replication.
• It applies the initial snapshot to the Subscriber and moves and reconciles incremental data changes that occur.
• Each merge subscription has its own Merge Agent that connects to both the Publisher and the Subscriber and updates both.
• The Merge Agent runs at either the Distributor for push subscriptions or the Subscriber for pull subscriptions.
Replication Maintenance Jobs
Replication has a number of maintenance jobs that perform scheduled and on-demand maintenance.
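For transactional replication, one way to see the Log Reader Agent and Distribution Agent doing their work is a tracer token, which measures how long a marker takes to travel from the Publisher to the distribution database and on to the Subscriber. The sketch below is illustrative only: the database name HouseWidgetsApp and the publication name Pub_SalesOrders are hypothetical, and the ten-second wait is an arbitrary choice.

```sql
-- Run on the Publisher, in the published database.
-- Assumptions: HouseWidgetsApp and Pub_SalesOrders are hypothetical names.
USE HouseWidgetsApp;
GO
DECLARE @token_id int;

-- Write a tracer token into the transaction log of the published database.
EXEC sys.sp_posttracertoken
    @publication     = N'Pub_SalesOrders',
    @tracer_token_id = @token_id OUTPUT;

-- Give the agents a moment to move the token along.
WAITFOR DELAY '00:00:10';

-- Latency columns are in seconds; NULL means the token has not arrived yet.
EXEC sys.sp_helptracertokenhistory
    @publication = N'Pub_SalesOrders',
    @tracer_id   = @token_id;
```

Replication Monitor exposes the same measurement on its Tracer Tokens tab.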
Distribution Database
Distribution DB
The distribution database holds the replicated transactions from the publisher database until the Distribution Agent applies them to the subscriber.
Performance Tip
As a general rule of thumb, for smaller publisher database implementations (small in size, small number of concurrent users, small number of transactions), place the distributor and subscriber on one server separate from the publisher.
For larger database implementations (large in size, large number of concurrent users, large number of transactions), place the distributor on its own server and the subscriber on its own server, both separate from the publisher.
Always separate the distributor and subscriber from the publisher. Your goal is to take load off the publisher with this architectural solution.
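To see how much work is sitting in the distribution database, the Distributor can be asked for its settings and for the number of commands still waiting for a subscription. This is a sketch under assumptions: the server names come from the deck, while the database and publication names (HouseWidgetsApp, HouseWidgetsReporting, Pub_SalesOrders) are hypothetical.

```sql
-- Run on the Distributor, in the distribution database.
-- Assumptions: database and publication names below are hypothetical.
USE distribution;
GO
-- Show distribution database properties, including retention settings.
EXEC sp_helpdistributiondb @database = N'distribution';
GO
-- Commands not yet delivered to the subscriber, with an estimated time to apply them.
EXEC sp_replmonitorsubscriptionpendingcmds
    @publisher         = N'SQLVM03',
    @publisher_db      = N'HouseWidgetsApp',
    @publication       = N'Pub_SalesOrders',
    @subscriber        = N'SQLVM03SUBSCRIBER',
    @subscriber_db     = N'HouseWidgetsReporting',
    @subscription_type = 0;  -- 0 = push subscription
```

A pending command count that keeps growing suggests the Distribution Agent or the subscriber cannot keep up, which is one signal that the distributor may need its own server.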
Publications, Articles, Subscriptions
Publications
A Publication is what is created on the Publisher server and stores the articles that will be replicated.
Tip: Use one table per publication, especially for large tables, so a re-snapshot only has to cover that one table.
Articles
Tables, Stored Procedures, Views, Indexed Views, User Defined Functions.
Subscriptions
A Subscription is what is created on the Subscriber server to receive all of the transactions of the Publication.
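The publication, article, and subscription pieces can be sketched in T-SQL as follows. This is one possible shape under stated assumptions, not the presenter's script: the database names (HouseWidgetsApp, HouseWidgetsReporting), the table dbo.SalesOrders, and the publication name Pub_SalesOrders are all hypothetical, and the agents are left to run under the SQL Server Agent service account.

```sql
-- Run on the Publisher (SQLVM03), in the database to publish.
-- Assumptions: all database, table, and publication names are hypothetical.
USE HouseWidgetsApp;
GO
-- Enable the database for publishing and create its Log Reader Agent job.
EXEC sp_replicationdboption
    @dbname = N'HouseWidgetsApp', @optname = N'publish', @value = N'true';
EXEC sp_addlogreader_agent @publisher_security_mode = 1;
GO
-- Publication: continuous (transactional) replication with push subscriptions allowed.
EXEC sp_addpublication
    @publication = N'Pub_SalesOrders',
    @repl_freq   = N'continuous',
    @status      = N'active',
    @allow_push  = N'true',
    @independent_agent = N'true';
EXEC sp_addpublication_snapshot @publication = N'Pub_SalesOrders';
GO
-- Article: one table in this publication (the table must have a primary key).
EXEC sp_addarticle
    @publication   = N'Pub_SalesOrders',
    @article       = N'SalesOrders',
    @source_owner  = N'dbo',
    @source_object = N'SalesOrders',
    @type          = N'logbased';
GO
-- Subscription: push to the reporting replica, initialized from a snapshot.
EXEC sp_addsubscription
    @publication       = N'Pub_SalesOrders',
    @subscriber        = N'SQLVM03SUBSCRIBER',
    @destination_db    = N'HouseWidgetsReporting',
    @subscription_type = N'Push',
    @sync_type         = N'automatic',
    @article           = N'all';
EXEC sp_addpushsubscription_agent
    @publication   = N'Pub_SalesOrders',
    @subscriber    = N'SQLVM03SUBSCRIBER',
    @subscriber_db = N'HouseWidgetsReporting',
    @subscriber_security_mode = 1;
GO
-- Generate the initial snapshot so the subscription can be initialized.
EXEC sp_startpublication_snapshot @publication = N'Pub_SalesOrders';
```

Keeping a single large table in its own publication, as in this sketch, means a re-snapshot touches only that table.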
Partitioning Large Tables – How to deal?
• Partitioning schemes can be propagated from the Publisher, but don’t have to be.
• They can also be created at the subscriber with an entirely different design that is better suited to reporting.
• If you want to archive the Publishing database but keep the data at the subscriber, you can get creative with partition swapping.
• Replication is scalable. I have replicated a 6 billion record table, with a lot of partitions, with good performance.
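One way to archive at the Publisher while keeping the rows at the subscriber is to allow partition switching on the publication but not replicate the switch itself. The sketch below shows that idea under assumptions: the names HouseWidgetsApp, Pub_SalesOrders, dbo.SalesOrders, and dbo.SalesOrders_Archive are hypothetical, and the archive table is assumed to be empty, with the same structure as the source and on the same filegroup as the partition being switched.

```sql
-- Run on the Publisher, in the published database.
-- Assumptions: all names are hypothetical; dbo.SalesOrders_Archive is an empty
-- table with the same structure as dbo.SalesOrders, on the partition's filegroup.
USE HouseWidgetsApp;
GO
-- Permit ALTER TABLE ... SWITCH against published tables.
EXEC sp_changepublication
    @publication = N'Pub_SalesOrders',
    @property    = N'allow_partition_switch',
    @value       = N'true';

-- Do not send the switch to subscribers, so the replica keeps the old rows.
EXEC sp_changepublication
    @publication = N'Pub_SalesOrders',
    @property    = N'replicate_partition_switch',
    @value       = N'false';
GO
-- Move the oldest partition out of the published table at the Publisher only.
ALTER TABLE dbo.SalesOrders
    SWITCH PARTITION 1 TO dbo.SalesOrders_Archive;
```

With replicate_partition_switch left at false, the two databases intentionally diverge, so it is worth testing on a copy before trying it on a production publication.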
Replication References
http://www.replicationanswers.com
Paul Ibison’s website, full of good replication…well, answers.
http://www.sqlsoldier.com
Robert Davis’s website. Use the search box to find some good stuff on replication.
Twitter: #sqlhelp
Questions
Thank you for your time today. I hope you found this educational.