2. About the writer: Rod Colledge, Independent SQL Consultant, based in Brisbane, Australia. Web: www.sqlCrunch.com; Blog: www.rodcolledge.com; MVP Deep Dives Book; Twitter: @rodcolledge; LinkedIn: linkedin.com/in/rodcolledge
9. 1 / 20: Not having SLAs. SLAs provide context for “everything”, e.g.: database available 24/7 @ 99.999% uptime; zero data loss; sub-second response time. Use option papers during SLA negotiations.
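To give the 99.999% figure some context during SLA negotiations, here is a minimal Python sketch that converts an uptime percentage into a yearly downtime budget. The function name and the 365-day year are illustrative assumptions, not something taken from the slides.

```python
def allowed_downtime_minutes(uptime_percent: float, days: float = 365.0) -> float:
    """Return the minutes of downtime permitted over `days` at the given uptime %."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (1.0 - uptime_percent / 100.0)


if __name__ == "__main__":
    # Compare a few common availability targets side by side.
    for pct in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% uptime allows about {minutes:.2f} minutes of downtime per year")
```

At "five nines" the budget works out to a little over five minutes per year, which is why such a target should be weighed against its cost in an option paper.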
11. 2 / 20: Not having/testing DR plans. Do you have DR plans? How do you know your plans will work? Run “DR fire drills”. Are all/new DBAs trained in recovery procedures? Where are the recovery documents & scripts located? Are the documents/scripts up to date?
13. 3 / 20: Narrow definition of disaster. Types of disasters: complete environmental destruction; air conditioning failure; disk crash; accidentally dropping a table/database; security breach (what data was accessed?). The next disaster will be unanticipated. Are your DR plans pessimistic enough?
16. 4 / 20: Not taking backups. Huh? Less obvious variations: file system backups only; no transaction log backups; SAN snapshots (recoverability?).
17. 5 / 20: Not verifying backups. How do you know they worked? Verification options: RESTORE VERIFYONLY FROM <backup_device>; restore to a reporting server; log shipping (log backup verification).
18. 6 / 20: Designing for backups only. Design for restoration! What is the data loss exposure? How long will the recovery take? Script, test & document various restore scenarios.
19. Backup Compression: BACKUP DATABASE AdventureWorks2008 TO DISK = 'G:QL BackupWorks.bak' WITH COMPRESSION
23. 9 / 20: No standard build/change log. Without a change log, how can you answer: Why is something different? Who made the change? When was the change made? Was the change successful? What will happen if the change is rolled back?
28. 10 / 20: Capacity-centric design. 200GB database: how many 73GB disks? Capacity-centric: 200 / 73 = 3 disks (rounded up). Performance-centric: (reads per sec + (writes per sec * RAID write multiplier)) / IOPS per disk = (1200 + (400 * 2)) / 125 = 16 disks! That is ~1.1TB raw, or roughly 500GB usable after RAID.
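The slide's arithmetic can be sketched in Python as follows. This is only an illustration: the function and parameter names are hypothetical, and it assumes that "RAID" in the formula means the write multiplier (2 for a mirrored layout) and that 125 is the IOPS a single disk can sustain.

```python
import math


def capacity_centric_disks(db_size_gb: float, disk_size_gb: float) -> int:
    """Disks needed if only storage capacity is considered."""
    return math.ceil(db_size_gb / disk_size_gb)


def performance_centric_disks(
    reads_per_sec: float,
    writes_per_sec: float,
    raid_write_multiplier: float,  # assumption: 2 for mirrored RAID
    iops_per_disk: float,          # assumption: sustained IOPS of one disk
) -> int:
    """Disks needed to serve the workload's I/O rate."""
    required_iops = reads_per_sec + writes_per_sec * raid_write_multiplier
    return math.ceil(required_iops / iops_per_disk)


if __name__ == "__main__":
    # Figures from the slide: 200GB database, 73GB disks,
    # 1200 reads/sec, 400 writes/sec, multiplier 2, 125 IOPS per disk.
    print("capacity-centric:", capacity_centric_disks(200, 73))
    print("performance-centric:", performance_centric_disks(1200, 400, 2, 125))
```

Sizing by capacity alone would leave this workload with less than a fifth of the spindles it needs to sustain its I/O rate.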
29. Preface: Many Factors Affect Disk I/O Perf. There are myriad best practices & considerations for optimal disk I/O subsystem performance. Be mindful of factors such as: RAID level; file allocation unit size; number, size, & speed of disks; configuration & capacity of HBAs & fabric switches (consider increasing HBA queue depth); network bandwidth; cache on disk, controllers, & SAN; whether disks are dedicated, shared, or virtualized; bus speed; number of paths from disk I/O subsystem to server; driver versions for all components; stripe size; stripe unit size; workload.
30. HDD Architecture: 3-D This image is from a contemporary & otherwise excellent document, but it represents disks as they were over two decades ago! The disk deities at Microsoft won’t allow me to perpetrate such myths. Graphics source: Veritas Storage Foundation™ 5.0 for Windows Best Practices for Storage Management http://eval.symantec.com/mktginfo/enterprise/white_papers/ent-whitepaper_vsfw_5.0_best_practices_for_storage_mgmt_02-2007.en-us.pdf
31. Partition Alignment Graphic: NTFS 4KB Cluster: Default vs. Aligned RAID Array. ***This has CONTEMPORARY RELEVANCE*** This is a very simplified graphic. It corresponds to the default NTFS file allocation unit of 4KB, given a common 64KB stripe unit size. See the Notes for details. Graphics source: Jimmy May
32. Partition Alignment Graphic: RAID Array: Default vs. Optimized for SQL Server. ***This has CONTEMPORARY RELEVANCE*** This is a very simplified graphic. Mark Licata, Senior Technology Architect: “The worst scenario? Random operations using 64K IO and 64K chunk size. One sector off and you are hitting two disks for every IO, thus halving the random performance potential.” Note: On a RAID array this means accessing two different stripe units on two separate disks. Graphics source: Jimmy May
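The effect described on the two partition alignment slides can be illustrated with a small Python sketch. It counts how many back-to-back I/Os of a given size cross a stripe-unit boundary for a given partition starting offset. The function names are hypothetical, and the sample offsets (63 sectors of 512 bytes, the default on older Windows versions, and 1MB) are assumptions added for illustration; they do not appear on the slides.

```python
import math


def is_aligned(partition_offset_bytes: int, stripe_unit_bytes: int) -> bool:
    """True when the partition starts exactly on a stripe-unit boundary."""
    return partition_offset_bytes % stripe_unit_bytes == 0


def straddling_fraction(
    partition_offset_bytes: int, stripe_unit_bytes: int, io_size_bytes: int
) -> float:
    """Fraction of consecutive I/Os that touch two stripe units."""
    # The crossing pattern repeats after this many I/Os.
    period = stripe_unit_bytes // math.gcd(io_size_bytes, stripe_unit_bytes)
    crossing = 0
    for n in range(period):
        start = partition_offset_bytes + n * io_size_bytes
        end = start + io_size_bytes - 1  # last byte touched by this I/O
        if start // stripe_unit_bytes != end // stripe_unit_bytes:
            crossing += 1
    return crossing / period


if __name__ == "__main__":
    KB = 1024
    legacy_offset = 63 * 512  # assumption: 31.5KB default offset
    aligned_offset = 1024 * KB  # assumption: 1MB offset
    for offset in (legacy_offset, aligned_offset):
        for io in (4 * KB, 64 * KB):
            frac = straddling_fraction(offset, 64 * KB, io)
            print(f"offset={offset} io={io} aligned={is_aligned(offset, 64 * KB)} "
                  f"straddling={frac:.4f}")
```

With a 64KB stripe unit, the misaligned offset makes every 64KB I/O touch two stripe units, which matches the worst case quoted on the slide, while an aligned offset brings the crossing fraction to zero.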
38. 14 / 20: Full recovery + no log backups. When are records removed from the t-log file? Full recovery model: ONLY after a t-log backup. Simple recovery model: on checkpoint. When to use the full recovery model? When point-in-time recovery is required. Back up the log file! Take care when moving DBs from/to production.
40. 15 / 20: Too many/not enough indexes. Small dev DB moved to production (not enough). Loaded with unused indexes (too many). Watch for duplicate or overlapping indexes. DMVs to the rescue: sys.dm_db_missing_index_%, sys.dm_db_index_usage_stats, sys.dm_db_index_physical_stats
49. Summary: Be “cautiously pessimistic”. Design backups from a restore perspective. Establish & maintain performance baselines. Validate the I/O chain. Use a performance-centric design. Don’t rely on “all” out-of-the-box settings. Understand the indexing DMVs. Automate & manage by exception.
52. Thank you for attending this session and the 2009 PASS Summit in Seattle
Editor's Notes
The bad behavior we will discuss is limited to poor DBA habits, within the DBA field and not beyond it. The story about the son