Databases for Storage Engineers
A short introduction to SQL Serv


Published in: Technology
  • SQL Server auto-generates statistics for columns unless we ask it not to. This helps, but the statistics are never fully accurate
  • Highlight spill warning
  • http://msdn.microsoft.com/en-us/library/ms345408(v=sql.90).aspx
  • http://msdn.microsoft.com/en-us/library/dd207003.aspx
  • NET START MSSQL$<instance> /f /T3608 will bring instance up without tempdb
  • http://msdn.microsoft.com/en-us/library/ms175935(v=sql.105).aspx
  • Transcript

    • 1. Databases For Storage People
      • Thomas Kejser
      • thomas@kejser.org
      • http://blog.kejser.org
      • @thomaskejser
    • 2. Agenda
      • The Microsoft Database Stack
      • Hard problems the database solves
      • File layout and I/O pattern
        • Data and Log Files
        • Analysis Services Files
        • TempDb and other system databases
      • Installation of SQL
      • Q&A
    • 3. The SQL Server Stack
    • 4. Product Portfolio
      • SQL Server (aka: Core Engine)
      • SQL Server Analysis Services (SSAS)
        • Tabular
        • Multi Dimensional
      • SQL Server Service Broker (SSB)
      • SQL Server Integration Services (SSIS)
      • SQL Server Reporting Services (SSRS)
      • SQL Server Data Quality Tools
      • SQL Server Master Data Services
      • SQL Server Parallel Data Warehouse
      • .NET stuff…
      • Various Excel plug-ins
      • A “full” stack!
    • 5. What Type of Workload? (chart: Data Returned vs. Data Touched)
      • Big data returned, small data touched: Simulation
      • Big data returned, big data touched: ETL
      • Small data returned, small data touched: OLTP
      • Small data returned, big data touched: BI/DW
    • 6. A Template OLTP System (diagram)
      • “App” tier: .NET (×4), Web Server, Windows License
      • Database Tier: Web/Core Licensing, 2 or 4 sockets, Core
    • 7. A Template Data Warehouse (diagram)
      • Integration Tier (SSIS): Blades, CPU Intensive, low IOPS
      • “Enterprise” Warehouse Tier (Core): Large machines, VERY CPU greedy, VERY I/O greedy (GB/sec)
      • BI / Presentation / Cubes (SSAS, SSRS): Medium Servers, Can be IOPS greedy
    • 8. Fast Track Data Warehouses
    • 9. A Template MPP Warehouse (diagram)
      • Components shown: SSIS, SSAS, Core
      • Data Marts (The “spokes”)
      • Enterprise Warehouse Tier: Appliance (The “hub”)
    • 10. Management Tools you Need to Know (Pre 2012 → 2012)
      • Management Studio (AKA: Enterprise Manager) → (Management Studio)
      • Project Data Dude → Data Tools
      • Configuration Manager → Configuration Manager
      • SQL Server Profiler → Xevent Tracing
      • Reporting Services Config Manager → Reporting Services Config Manager
      • Sp_configure → Sp_configure / ALTER SERVER
    • 11. Hard problems databases help you solve
    • 12. Query Plan Generation Find all parts bought by Thomas Kejser
    • 13. Express Problem, Auto get solutions
    • 14. To do this well, we need Statistics
      • (screenshot labels: “SQL Did it”, “I did it”)
      • THIS is not accurate and it will never be!
    • 15. … and we Need Indexes B+ Tree
    • 16. 95% of all database problems* are caused by:
      • A) Poor indexing
      • B) Wrong Statistics
      • C) Badly written queries
      • D) All of the above
      • * Low estimate, trying to be nice to humanity
    • 17. And most of the time, there is nothing you can do about that*
      • … which is where storage comes into the picture
      • * AKA: “Craplications”, technical term
    • 18. Two types of bad Queries
      • The CPU Bound
        • Have to help rewrite
        • Better storage does not help
        • But DBAs may still believe it is I/O
      • The I/O bound
        • Can throw NAND at it
        • I will show you how to diagnose
      • DBA people like to talk about this like…
    • 19. Response time = Service Time + Wait Time
      • Service Time: Algorithms and Data Structures
      • Wait Time: “Bottlenecks”
    • 20. When Speaking about Service Time
      • We normally end up talking about bad join plans
      • Joins come in three flavours
        • Merge
        • Hash
        • Loop
    • 21. Merge Join
      • Inputs: an m row result and an n row result, both Sorted (example keys 1, 2, 3, 4, 43)
      • Complexity: O(m + n)
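The merge join on this slide can be sketched in a few lines. This is a minimal illustration, not SQL Server's implementation; the function name `merge_join` and the `(key, payload)` row shape are assumptions made for the example. Both inputs must already be sorted by key.

```python
def merge_join(left, right):
    """Join two key-sorted lists of (key, payload) rows.

    Hypothetical helper for illustration only. Each cursor moves
    forward only, which is where the O(m + n) behaviour comes from
    (plus the size of the output when keys repeat).
    """
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1  # left row has no partner yet, advance left
        elif left_key > right_key:
            j += 1  # right row has no partner, advance right
        else:
            # Emit every right row that shares this key, then move left on.
            k = j
            while k < len(right) and right[k][0] == left_key:
                result.append((left_key, left[i][1], right[k][1]))
                k += 1
            i += 1
    return result


left_rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (43, "e")]
right_rows = [(1, "x"), (3, "y"), (43, "z"), (44, "w")]
joined = merge_join(left_rows, right_rows)
```

If either input is not sorted, the sort has to be paid for first, which the O(m + n) figure does not include.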
    • 22. Hash Join
      • The n row join table is hashed into an n row hash table
      • Each row of the m row result (example keys 1, 43, 13, 3, 7) is probed with Hash(key)
      • Complexity: O(m + 2n)
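A hash join has a build phase and a probe phase, which can be sketched as below. The name `hash_join` and the row shape are assumptions for illustration; a Python dict stands in for the in-memory hash table, so this sketch never spills to disk the way a real engine can.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows):
    """Join (key, payload) rows with a build phase and a probe phase.

    Hypothetical helper for illustration only.
    """
    # Build phase: read the n row input once and hash every row.
    table = defaultdict(list)
    for key, payload in build_rows:
        table[key].append(payload)

    # Probe phase: one hash lookup per row of the m row input.
    result = []
    for key, payload in probe_rows:
        for build_payload in table.get(key, ()):
            result.append((key, build_payload, payload))
    return result


build_side = [(1, "p1"), (3, "p3"), (7, "p7"), (13, "p13"), (43, "p43")]
probe_side = [(1, "a"), (43, "b"), (13, "c"), (3, "d"), (8, "e")]
joined_hash = hash_join(build_side, probe_side)
```

Neither input needs to be sorted, but the whole build side has to fit in the memory granted to the hash table.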
    • 23. Loop Join
      • Each row of the m row result (example keys 1, 43, 13, 3, 7) looks up an n row B-tree
      • Log(n) reads per lookup
      • Complexity: O(m * log(n))
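The loop join can be sketched with a sorted list and binary search standing in for the B-tree. That substitution is an assumption made to keep the example short: `bisect` gives the same log(n) lookup cost per outer row, but it is not a B-tree. The names `loop_join`, `index_keys` and `index_payloads` are hypothetical.

```python
import bisect


def loop_join(outer_rows, index_keys, index_payloads):
    """For each outer row, seek into a sorted index.

    index_keys must be sorted; index_payloads[i] belongs to index_keys[i].
    """
    result = []
    for key, payload in outer_rows:
        # One log(n) seek per outer row.
        pos = bisect.bisect_left(index_keys, key)
        while pos < len(index_keys) and index_keys[pos] == key:
            result.append((key, payload, index_payloads[pos]))
            pos += 1
    return result


outer = [(1, "a"), (43, "b"), (13, "c"), (3, "d"), (7, "e")]
keys = [1, 3, 5, 7, 43]
payloads = ["p1", "p3", "p5", "p7", "p43"]
joined_loop = loop_join(outer, keys, payloads)
```

The outer rows do not have to be sorted, which is why the result comes back in outer-row order.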
    • 24. When Hash Joins hurt you
      • (chart: Runtime (seconds), 0 to 30, against Hash Memory (MB), 400 down to 0, with a “Spill Zone!” marked)
    • 25. Join Hints
      • B probed, lower table in join (second table in join statement)
      • A probed, upper table in join (first table in join statement)
      • Just the way it is …
    • 26. Why is it so hard to get joins right?
      • (chart: Time as a function of n and m for Loop Join, Merge Join and Hash Join)
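Plugging the three complexity formulas from the join slides into a toy cost model shows why the best choice depends on m and n. This is a rough sketch, assuming base-2 logarithms and ignoring constants, sort cost and memory; the function names are hypothetical.

```python
import math


def join_costs(m, n):
    """Abstract cost per join type, using the formulas from the slides."""
    return {
        "merge": m + n,            # O(m + n), inputs assumed already sorted
        "hash": m + 2 * n,         # O(m + 2n)
        "loop": m * math.log2(n),  # O(m * log(n)), base 2 is an assumption
    }


def cheapest_join(m, n):
    costs = join_costs(m, n)
    return min(costs, key=costs.get)


few_outer_rows = cheapest_join(10, 1_000_000)
many_outer_rows = cheapest_join(1_000_000, 1_000_000)
```

With few outer rows the loop join wins; with many it loses badly. In this model merge always beats hash, but only because the sort that merge needs is treated as free. A row estimate that is off by a few orders of magnitude is enough to pick the wrong join.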
    • 27. No-one has been able to get joins consistently right! P = NP ?
    • 28. Getting I/O right…
      • Language Processing (Parse/Bind)
      • Query Optimization (Plan Generation, View Matching, Statistics, Costing)
      • Statement/Batch Execution
      • Query Execution (Query Operators, Memory Grants, Parallelism)
      • Plan Cache Management
      • Storage Engine (Access Methods, Database Page Cache, Locking, Transactions, …)
      • SQL-OS (Schedulers, Buffer Pool, Memory Management, Synchronization Primitives, …)
    • 29. The Storage Engine makes I/O Transparent!
      • Rest of engine only sees the API
      • (diagram: Storage Engine, RAM, Storage)
    • 30. Two Different Philosophies (Primitive: SQL Server | Analysis Services)
      • Scheduling: Voluntary Yield, User mode | Kernel mode, Preemptive
      • I/O Engine: Dedicated I/O stack | Windows Buffered I/O
      • Waiting / Spinning: SQLOS Primitives | Windows
      • Memory Management: SQLOS / Storage Engine | Windows Paging
      • Serialisation: TDS special purpose | XML
      • Network: Fully optimizable, async, affinitized engine | Windows primitives, blocking
    • 31. SQL Server is different
      • Primitives are a different beast than Windows
      • Scale issues are generally specific to the core, not Windows
      • Exposes own “belly of the beast” profiling
      • SQL Team build their own primitives, often better than Windows core
      • Highest throughput app on Windows, drives all the scale stuff there
    • 32. Analysis Services is “just another App”
      • Analysis Services relies fully on Windows primitives
      • You can profile it by looking at how Windows behaves
      • Upgrades to Windows are more likely to help it
      • No TPC style benchmarks…
    • 33. A is for Atomic
      • (diagram, shown three times: LINEITEM (ORDER_KEY, PART_KEY, COMMITDATE, QUANTITY) and ORDER (ORDER_KEY, CUSTOMER_KEY))
    • 34. C is for Consistency
      • (diagram: LINEITEM and ORDER tables with inconsistent values)
      • LINEITEM.ORDER_KEY = 42 while ORDER.ORDER_KEY != 42
      • LINEITEM.COMMITDATE = 2012-02-30
    • 35. I is for Isolation (two sessions run the same statements at the same time)
      • SELECT @LastTransaction_ID = LastTransaction_ID FROM ATM WHERE ATM_ID = 13
        • Both sessions read (@LastTransaction_ID = 42)
      • SET @ID = @LastTransaction_ID + 1
      • UPDATE ATM SET LastTransaction_ID = @ID WHERE ATM_ID = 13
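The problem on this slide is a lost update: both sessions read 42 before either one writes, so both hand out the same new ID. A small simulation makes the interleaving explicit. It is only a sketch: the dict `atm` stands in for the ATM row, and the function names are hypothetical.

```python
def run_interleaved(start_id):
    """Both sessions read before either writes (the slide's interleaving)."""
    atm = {"LastTransaction_ID": start_id}
    read_a = atm["LastTransaction_ID"]  # session A: SELECT
    read_b = atm["LastTransaction_ID"]  # session B: SELECT, sees the same value
    id_a = read_a + 1                   # session A: SET @ID
    id_b = read_b + 1                   # session B: SET @ID
    atm["LastTransaction_ID"] = id_a    # session A: UPDATE
    atm["LastTransaction_ID"] = id_b    # session B: UPDATE, overwrites A
    return id_a, id_b, atm["LastTransaction_ID"]


def run_isolated(start_id):
    """Each session's read-modify-write completes before the next starts."""
    atm = {"LastTransaction_ID": start_id}
    issued = []
    for _session in ("A", "B"):
        new_id = atm["LastTransaction_ID"] + 1
        atm["LastTransaction_ID"] = new_id
        issued.append(new_id)
    return issued[0], issued[1], atm["LastTransaction_ID"]


broken = run_interleaved(42)
correct = run_isolated(42)
```

In the interleaved run both sessions get ID 43 and one transaction number is lost; with isolation the sessions get 43 and 44.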
    • 36. D is for Durability
      • (diagram: repeated “Do Transactions” requests, each answered with an “Ack”)
    • 37. Summary – Databases Help You
      • Do complex operations in optimal time
      • …at high parallelism
      • Optimise I/O pattern
      • Be ACID compliant
      • Store stuff safely…
      • noSQL/Big Data systems trade off >0 of these to get more of the others
    • 38. System Databases
      • Server won’t start without:
        • master
        • mssqlsystemresource
      • System CAN start, but won’t work well
        • model
        • msdb
      • System will start under special conditions
        • tempdb
    • 39. Master and mssqlsystemresource
      • Together, contain all system information
      • Mssqlsystemresource
        • Lives under: MSSQL\Binn
        • Contains all system code
        • Hidden by default
      • Master
        • Lives under: MSSQL\DATA
      • You should move these to a safe location
    • 40. Disaster: Master or mssqlsystemresource
      • You lost:
        • All passwords and server logins
        • All system wide certificates (You may be unable to decrypt!)
        • All System procedures you created
      • You are not 100% screwed, but you are in for a long night
        • Both can be rebuilt (empty) during server start
        • …Or restored from backup
          • if you remembered to take one
        • Need /f and /T3608 to get back up
    • 41. Database: model
      • Every newly created database is cloned from this
      • Loss is not catastrophic
        • Copy from healthy machine
      • Tempdb can’t boot without it
      • Lives with master
    • 42. Database tempdb
      • Database “swap file”
      • Does not survive restarts
      • No Durability guarantees here
      • Fast I/O helps
    • 43. Loss of Tempdb…is…Temporary
      • Will rebuild itself after instance restart
      • Configuration is stored in master
      • Clones from model
      • Nearly every installation must change defaults
      • If tempdb cannot be created, server will only start from command line
    • 44. User Databases and Failure
      • A database consists of
        • At least one Transaction Log File
        • The PRIMARY filegroup
        • At least one data file in PRIMARY
      • If any of these are lost, the database is dead
        • You can in some cases bring a database without a transaction log back alive
        • But typically with data loss…
      • Lesson: carefully protect all of the above
    • 45. What is in the Files?
      • PRIMARY: Primary File Headers, GAM / SGAM, PFS Map, Metadata (system objects), User Data
      • Transaction Log: VLF, VLF, VLF
    • 46. Data Files
      • Regular files in NTFS
      • Secured
      • Files can Auto Grow as needed
        • Risky
        • File Imbalance
    • 47. How are Database Files Created?
      • ALTER or CREATE DATABASE
      • Transaction log file always zeroed out
        • This looks super cool on FusionIo by the way
      • Data files MAY be zeroed out
        • Depends on privileges
        • May use instant file init
    • 48. Filegroups
      • Filegroups (one word) are containers of files
      • Used to group similar data together
      • Oracle people know this concept as tablespaces
      • Files inside FG are accessed/allocated round-robin
      • (diagram: PRIMARY and DATA filegroups, each holding User Data files)
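Round-robin allocation across the files of a filegroup can be pictured with a short sketch. It is a simplification: the file names and the function `allocate_round_robin` are hypothetical, and SQL Server weights allocation by the free space in each file (proportional fill), so plain round-robin only matches when all files are equally sized and equally full.

```python
from itertools import cycle


def allocate_round_robin(file_names, extent_count):
    """Hand out extents 0..extent_count-1 to the files in turn."""
    placement = {name: [] for name in file_names}
    next_file = cycle(file_names)
    for extent in range(extent_count):
        placement[next(next_file)].append(extent)
    return placement


layout = allocate_round_robin(["file1", "file2", "file3", "file4"], 8)
```

With four equal files, consecutive extents land on different files, so a sequential scan touches all of them and spreads its I/O. When one file is larger or emptier than the others, it takes more than its share of new allocations, which is one reason equally sized files are the usual advice.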
    • 49. Reclaiming/Moving Space in Files
      • DBCC SHRINKFILE
      • REBUILD data
    • 50. DBCC SHRINKFILE
      • (diagram: extents 1 to 8 spread over LUN 1 to LUN 4)
    • 51. How to reclaim space the right way…
      • (diagram: extents 1 to 8 rebuilt into a New Filegroup across LUN 1 to LUN 4)
      • ALTER INDEX Foo ON <table> REBUILD WITH (SORT_IN_TEMPDB = ON)
    • 52. PFS Contention
      • Too few PFS maps can lead to latch contention
      • Diagnosed in: sys.dm_os_waiting_tasks
      • Look for PAGELATCH_UP
      • (diagram: a File laid out as PFS Map, User Data (8000 pages), PFS Map, User Data (8000 pages))
    • 53. I/O DBA people worry about
      • DBAs typically diagnose issues with wait stats
      • Issues they look for:
        • WRITELOG/LOGBUFFER waits
        • PAGEIOLATCH_<X> waits
        • BACKUPIO waits
        • IO_COMPLETION/ASYNC_IO_COMPLETION
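Wait statistics are cumulative since the instance started, so a single reading says little about the current workload. A common approach is to take two snapshots and compare them. The sketch below assumes each snapshot has already been fetched into a dict of wait type to cumulative `wait_time_ms`; the function name and the sample numbers are made up for illustration.

```python
def wait_delta(before, after):
    """Return (wait_type, added_ms) pairs, largest growth first.

    before/after map wait_type -> cumulative wait_time_ms.
    """
    delta = {}
    for wait_type, total_ms in after.items():
        grown = total_ms - before.get(wait_type, 0)
        if grown > 0:
            delta[wait_type] = grown
    return sorted(delta.items(), key=lambda item: item[1], reverse=True)


# Made-up sample snapshots, not real measurements.
snapshot_1 = {"WRITELOG": 1000, "PAGEIOLATCH_SH": 500, "BACKUPIO": 200}
snapshot_2 = {"WRITELOG": 4000, "PAGEIOLATCH_SH": 2500, "BACKUPIO": 200,
              "IO_COMPLETION": 50}
top_waits = wait_delta(snapshot_1, snapshot_2)
```

In this sample WRITELOG grew the most between the snapshots, which would point at transaction log write latency rather than data file reads.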
    • 54. Places you need to know about
      • Diagnosing resource waits:
        • sys.dm_os_wait_stats
        • Post 2008R2 – can use Xevents (harder)
      • More detail in:
        • sys.dm_io_virtual_file_stats(NULL, NULL)
        • Confirm waits here!
      • SQL Server errors in log file:
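To confirm I/O waits per file, the stall counters in sys.dm_io_virtual_file_stats can be turned into average latencies. The sketch below assumes the rows have already been fetched into dicts keyed by the view's column names (`num_of_reads`, `io_stall_read_ms`, `num_of_writes`, `io_stall_write_ms`); the function name and the sample values are hypothetical.

```python
def average_latency_ms(file_rows):
    """Return (database_id, file_id, avg_read_ms, avg_write_ms) per file."""
    report = []
    for row in file_rows:
        reads = row["num_of_reads"]
        writes = row["num_of_writes"]
        # Guard against files that have seen no reads or no writes yet.
        read_ms = row["io_stall_read_ms"] / reads if reads else 0.0
        write_ms = row["io_stall_write_ms"] / writes if writes else 0.0
        report.append((row["database_id"], row["file_id"], read_ms, write_ms))
    return report


# Made-up sample rows, not real measurements.
sample_rows = [
    {"database_id": 2, "file_id": 1, "num_of_reads": 100,
     "io_stall_read_ms": 500, "num_of_writes": 0, "io_stall_write_ms": 0},
    {"database_id": 5, "file_id": 2, "num_of_reads": 10,
     "io_stall_read_ms": 20, "num_of_writes": 400, "io_stall_write_ms": 6000},
]
latencies = average_latency_ms(sample_rows)
```

These counters are cumulative too, so the averages cover the whole time since startup. For a current picture, take the difference between two samples before dividing.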
