DBAs Behaving Badly... Worst Practices for Database Administrators (Rod Colledge)

DBAs Behaving Badly Worst Practices for Database Administrators

About the writer…
- Rod Colledge
- Independent SQL Consultant
- Based in Brisbane, Australia
- Web: www.sqlCrunch.com
- Blog: www.rodcolledge.com
- MVP Deep Dives Book
- Twitter: @rodcolledge
- linkedin.com/in/rodcolledge

About us…
- Dubi Lebel: DB”A (Dubi Behind All), D.B.A (Don’t Bother Asking…)
- Shahar Bar: SQL Consultant and CEO at Valinor

Session Overview
- Disaster Recovery (DR) Planning
- Backup & Restore
- Change Control
- Storage Configuration
- File Configuration
- Indexing
- Administration Techniques

Disaster Recovery (DR) Planning

1 / 20; Not having SLAs
- SLAs provide context for “everything”, e.g.:
  - Database available 24/7 @ 99.999% uptime
  - Zero data loss
  - Sub-second response time
- Use option papers during SLA negotiations
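To make an uptime figure such as 99.999% concrete during SLA negotiations, it can help to convert it into a yearly downtime budget. The following is a minimal sketch, not part of the original deck; the function name and the 365-day year are assumptions made for illustration.

```python
# Hypothetical helper: converts an uptime target into allowed downtime.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def downtime_budget_minutes(uptime_percent: float) -> float:
    """Return the minutes of downtime per year that an uptime target allows."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


# Compare a few common targets side by side.
for target in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_budget_minutes(target)
    print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, "five nines" leaves a little over five minutes per year, which is the kind of context an option paper can put next to the cost of achieving it.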

2 / 20; Not having/testing DR plans
- Do you have DR plans?
- How do you know your plans will work?
- “DR fire drills”
- All/new DBAs trained in recovery procedures?
- Location of recovery documents & scripts?
- Documents/scripts up to date?

3 / 20; Narrow definition of disaster
- Types of disasters:
  - Complete environmental destruction
  - Air conditioning failure
  - Disk crash
  - Accidentally dropping a table/database
  - Security breach: what data was accessed?
- The next disaster will be unanticipated. Are your DR plans pessimistic enough?

argh! ... who would have thought we needed backups?

4 / 20; Not Taking Backups
- Huh?
- Less obvious variations:
  - File system backups only
  - No transaction log backups
  - SAN snapshots: recoverability?

5 / 20; Not Verifying Backups
- How do you know they worked?
- Verification options:
  - RESTORE VERIFYONLY FROM <backup_device>
  - Restore to a reporting server
  - Log shipping (log backup verification)

6 / 20; Designing for Backups only
- Design for restoration!
- What is the data loss exposure?
- How long will the recovery take?
- Script, test & document various restore scenarios

Backup Compression
BACKUP DATABASE AdventureWorks2008
TO DISK = 'G:QL BackupWorks.bak'
WITH COMPRESSION

7 / 20; Insufficient Test Environments

8 / 20; No Performance Baseline

9 / 20; No Standard Build/Change Log
- Without a change log, how can you answer:
  - Why is something different?
  - Who made the change?
  - When was the change made?
  - Was the change successful?
  - What will happen if the change is rolled back?

Demo; Configuration Changes Report

10 / 20; Capacity-Centric Design
- 200GB database: how many 73GB disks?
- Capacity-centric: 200 / 73 = 2.7, rounded up to 3 disks
- Performance-centric: (reads per sec + (writes per sec * RAID write multiplier)) / IOPS per disk
  - (1200 + (400 * 2)) / 125 = 16 disks!
  - ~1.1TB raw, or ~500GB after RAID
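The two sizing approaches on this slide can be sketched as a pair of small functions. This is an illustrative sketch rather than the presenter's own tooling; the function and parameter names are assumptions, and the write multiplier of 2 is simply the value used on the slide (the usual figure for mirrored RAID levels such as RAID 10).

```python
import math


def disks_for_capacity(database_gb: float, disk_gb: float) -> int:
    """Capacity-centric sizing: just enough disks to hold the data."""
    return math.ceil(database_gb / disk_gb)


def disks_for_performance(reads_per_sec: float,
                          writes_per_sec: float,
                          raid_write_multiplier: float,
                          iops_per_disk: float) -> int:
    """Performance-centric sizing: enough disks to serve the I/O load.

    Each logical write costs `raid_write_multiplier` physical writes,
    so the back-end load is reads + writes * multiplier.
    """
    backend_iops = reads_per_sec + writes_per_sec * raid_write_multiplier
    return math.ceil(backend_iops / iops_per_disk)


# The figures from the slide: 200GB database, 73GB disks,
# 1200 reads/sec, 400 writes/sec, multiplier 2, 125 IOPS per disk.
print(disks_for_capacity(200, 73))                # 3
print(disks_for_performance(1200, 400, 2, 125))   # 16
```

The gap between 3 and 16 disks is the point of the slide: sizing by capacity alone can leave the workload with a fraction of the spindles it needs.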

Preface: Many Factors Affect Disk I/O Perf
There are myriad best practices & considerations for optimal disk I/O subsystem performance. Be mindful of factors such as:
- RAID level
- File allocation unit size
- Number, size, & speed of disks
- Configuration & capacity of HBAs & fabric switches
  - Consider increasing HBA Queue Depth
- Network bandwidth
- Cache on disk, controllers, & SAN
- Whether disks are dedicated, shared, or virtualized
- Bus speed
- Number of paths from disk I/O subsystem to server
- Driver versions for all components
- Stripe size
- Stripe unit size
- Workload

HDD Architecture: 3-D
This image is from a contemporary & otherwise excellent document, but it represents disks as they were over two decades ago! The disk deities at Microsoft won’t allow me to perpetuate such myths.
Graphics source: Veritas Storage Foundation™ 5.0 for Windows Best Practices for Storage Management
http://eval.symantec.com/mktginfo/enterprise/white_papers/ent-whitepaper_vsfw_5.0_best_practices_for_storage_mgmt_02-2007.en-us.pdf

Partition Alignment Graphic: NTFS 4KB Cluster: Default vs. Aligned RAID Array
***This has CONTEMPORARY RELEVANCE***
- This is a very simplified graphic
- Corresponds to default NTFS file allocation unit of 4KB
- Given common 64KB stripe unit size
- See the Notes for details
Graphics source: Jimmy May

Partition Alignment Graphic: RAID Array: Default vs. Optimized for SQL Server
***This has CONTEMPORARY RELEVANCE***
- This is a very simplified graphic
- Mark Licata, Senior Technology Architect: The worst scenario? Random operations using 64K IO and 64K chunk size. One sector off and you are hitting two disks for every IO, thus halving the random performance potential.
- Note: On a RAID array this means accessing two different stripe units on two separate disks.
Graphics source: Jimmy May
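The "one sector off" effect described above can be checked with simple integer arithmetic. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the two offsets used in the example (32,256 bytes, i.e. 63 sectors of 512 bytes, and 1,048,576 bytes) are illustrative values that do not appear in the slides.

```python
STRIPE_UNIT_BYTES = 64 * 1024  # the common 64KB stripe unit size from the slide


def is_aligned(partition_offset_bytes: int,
               stripe_unit_bytes: int = STRIPE_UNIT_BYTES) -> bool:
    """A partition is aligned when its starting offset is a whole number of stripe units."""
    return partition_offset_bytes % stripe_unit_bytes == 0


def stripe_units_touched(partition_offset_bytes: int,
                         io_offset_bytes: int,
                         io_size_bytes: int,
                         stripe_unit_bytes: int = STRIPE_UNIT_BYTES) -> int:
    """Count the stripe units a single I/O spans, measured from the start of the array."""
    start = partition_offset_bytes + io_offset_bytes
    end = start + io_size_bytes - 1  # last byte of the I/O
    return end // stripe_unit_bytes - start // stripe_unit_bytes + 1


# A 64KB I/O at the start of the partition, for two assumed partition offsets.
for offset in (63 * 512, 1024 * 1024):
    touched = stripe_units_touched(offset, 0, 64 * 1024)
    print(f"offset={offset}: aligned={is_aligned(offset)}, stripe units touched={touched}")
```

With the misaligned offset every 64KB I/O straddles a stripe unit boundary, which matches the "two disks for every IO" scenario in the quote.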

11 / 20; Using Unaligned Partitions

Which of the following RAID levels is not a good choice for write-intensive DBs?
- RAID-0
- RAID-1
- RAID-5
- RAID-10

12 / 20; Relying on Autogrowth

14 / 20; Full recovery + no log backups
- When are records removed from the t-log file?
  - Full recovery model: ONLY after t-log backup
  - Simple recovery model: on checkpoint
- When to use full recovery model?
  - When point-in-time recovery is required
- Back up the log file!
- Take care when moving DBs from/to production

15 / 20; Too many/not enough indexes
- Small dev db → production (not enough)
- Loaded with unused indexes (too many)
- Watch for duplicate or overlapping indexes
- DMVs to the rescue:
  - sys.dm_db_missing_index_%
  - sys.dm_db_index_usage_stats
  - sys.dm_db_index_physical_stats

16 / 20; Inappropriate index maintenance
- Code in Books Online: sys.dm_db_index_physical_stats
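One way to avoid blanket rebuilds is to choose the action per index from its measured fragmentation. The sketch below is a hedged illustration, not the Books Online script itself: the function name is hypothetical, the input is assumed to be the avg_fragmentation_in_percent value returned by sys.dm_db_index_physical_stats, and the 5% and 30% thresholds are the commonly cited Books Online starting points, which should be tuned per environment.

```python
def maintenance_action(avg_fragmentation_percent: float,
                       reorganize_above: float = 5.0,
                       rebuild_above: float = 30.0) -> str:
    """Pick an index maintenance action from a fragmentation percentage.

    Thresholds are assumed starting points, not fixed rules:
    above `rebuild_above` -> rebuild, above `reorganize_above` -> reorganize,
    otherwise leave the index alone.
    """
    if avg_fragmentation_percent > rebuild_above:
        return "REBUILD"
    if avg_fragmentation_percent > reorganize_above:
        return "REORGANIZE"
    return "NONE"


# Hypothetical fragmentation readings for three indexes.
for name, fragmentation in (("IX_A", 3.2), ("IX_B", 18.0), ("IX_C", 62.5)):
    print(name, maintenance_action(fragmentation))
```

The returned label would then drive the corresponding ALTER INDEX statement, so lightly fragmented indexes are not rebuilt needlessly.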

17 / 20; Update stats after index rebuild

18 / 20; Manual Administration
“Automation enables more things to be achieved with fewer mistakes in a given amount of time”

19 / 20; Not defining alerts
- Manage by exception
- SQL Agent Alerts:
  - Job failures
  - Performance conditions
  - High severity errors (level 19+)
- What about error 825 (level 10)?
- http://www.karaszi.com/SQLServer/util_agent_alerts.asp
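The slide's point about error 825 is that a severity-only rule misses it, because it is raised at level 10. The rule can be sketched as follows; this is an assumed illustration of the decision logic only, with hypothetical names, and it does not create any SQL Agent alerts.

```python
MIN_ALERT_SEVERITY = 19        # "high severity errors (level 19+)" from the slide
ALWAYS_ALERT_ERRORS = {825}    # low-severity errors that still deserve an alert


def should_alert(error_number: int, severity: int) -> bool:
    """Alert on any high-severity error, plus specific errors listed explicitly."""
    return severity >= MIN_ALERT_SEVERITY or error_number in ALWAYS_ALERT_ERRORS


# Error 825 at severity 10 slips past a severity-only filter
# but is caught by the explicit error-number list.
print(should_alert(825, 10))
```

Keeping the explicit list separate from the severity threshold makes it easy to add further low-severity errors that a site decides are worth an alert.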

20 / 20; No task lists/check lists

Demo; Administration techniques

Summary
- Be “cautiously pessimistic”
- Design backups from a restore perspective
- Establish & maintain performance baselines
- Validate the I/O chain
- Use a performance-centric design
- Don’t rely on “all” out of the box settings
- Understand the indexing DMVs
- Automate & manage by exception

rodcolledge@gmail.com
www.rodcolledge.com
www.sqlcrunch.com

Thank you for attending this session and the 2009 PASS Summit in Seattle
