SQL Azure in deep


Published on

Published in: Technology
1 Like
  • Be the first to comment

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

SQL Azure in deep

  1. 1. SQL Azure Database. Under the hood What, how and why Cloud is different Kirill Bezpalyi, MSP, VNTU 12/25/2011
  2. 2. Что впереди? Осмотр сервиса Архитектура SQL Azure Отказоустойчивость Аппаратная организация Мониторинг Безопасность Итоги 2 12/25/2011
  3. 3. SQL Azure model  Each account has zero or more servers Account  Azure wide, provisioned in a common portal  Billing instrument  Each server has one or more databases  Zone for authentication: userId+password  Zone for administration and billing Server  Metadata about the databases and usage  Network access control based on client IP  Has unique DNS name and unit of geo-location  Each database has standard SQL objects  Unit of consistency and high availability (autonomous replication) Database  Contains Users, Tables, Views, Indices, etc…  Most granular unit of usage reports  Three SKUs available (1GB, 10GB and 50GB)3 12/25/2011
  4. 4. SQL Azure Network Topology use standard SQL Applications Application client libraries: ODBC, ADO.Net, PHP, … Internet Azure Cloud TDS (tcp) Security Boundary Load balancer forwards „sticky‟ LB sessions to TDS protocol tier TDS (tcp) Gateway Gateway Gateway Gateway Gateway Gateway Gateway: TDS protocol gateway, enforces AUTHN/AUTHZ policy; proxy to CloudDB TDS (tcp)L SQL SQL SQL SQL 4 12/25/2011 Scalability and Availability: Fabric, Failover, Replication, and Load balancing
  5. 5. Replication All reads are completed at the primary Ack Read Write Writes Value Ack replicated to Ack write quorum of replicas Ack P Ack Commit on secondaries S Write Write S first then primary S Write Write S Each transaction has a commit 5 sequence 12/25/2011 number (epoch,
  6. 6. Reconfiguration Types of reconfiguration  Primary failover  Removing a failed secondary Failed  Adding recovered replica  Building a new secondary S B P Assumes S Failed  Failure detector  Leader election  Safe in the presence  Both services provided by of cascading failures Fabric layer 6 12/25/2011
  7. 7. Partition Management Partition Manager (PM) is a highly available service running in the Master cluster  Ensures all partitions are operational  Places replicas across failure domains (rack/switch/server)  Ensures all partitions have target replica count  Balances the load across all the nodes Each node manages multiple partitions Global state maintained by the PM can be recreated from the local node state in event of disaster (GPM rebuild) 7 12/25/2011
  8. 8. System in Operation Primary master node SQL Server Partition Manager Partition Global Load Balancer Managemen Partition t Partition Placement map Advisor Leader Fabric Elector Data Node Data Node Data Node Data Node Data Node Data Node 100 101 102 103 104 105 P S P P S P S S S S S S S P S S S S S S S S S Fabric Secondary Primary Secondary8 12/25/2011
  9. 9. SQL node Architecture Single physical DB for entire Machine SQL node Instance DB files and log shared master across every logical database/partition tempdb  Allows better logging msdb throughput with sequential IO/group commits CloudNode  No auto-growth on demand stalls DB1 DB2 master  Uniform manageability and DB5 DB1 DB7 backup Each partition is a “silo” with DB3 DB4 DB7 its own independent schema Local SQL backup guards against software bugs 9 12/25/2011
  10. 10. Recap Two kinds of nodes:  Data nodes store application data  Master nodes store cluster metadata Node failures are reliably detected  On every node, SQL and Fabric processes monitor each other  Fabric processes monitor each other across nodes Local failures cause nodes to fail-fast Failures cause reconfiguration and placement changes 10 12/25/2011
  11. 11. Hardware Architecture  Each rack hosts 2 pods of L2 Switch 20 machines each  Each pod has a TOR mini- switch  10GB uplink to L2 switch  Each SQL Azure machine runs on commodity box  Example:  8 cores  32 GB RAM  1TB+ SATA drives  Programmable power  1Gb NIC  Machine spec changes as hardware (pricing) evolves11 12/25/2011
  12. 12. Hardware ChallengesSATA drives On-disk cache and lack of true "write through" results in Write Ahead Logging violations  DB requires in-order writes to be honored  Can force flush cache, but causes performance degradation Disk failures happen daily (at scale), fail-fast on those  Bit-flips (enabled page checksums)  Drives just disappear  IOs are misdirected Faulty NIC  Encountered message corruption 12  Enabled message signing and checksums 12/25/2011
  13. 13. Software Deployment OS is automatically imaged via deployment All the services are setup using file copy  Guarantees on which version is running  Provides fast switch to new version  Minimal global state allows running side by side  Yes, that includes the SQL Server DB engine Rollout is monitored to ensure high availability  Knowledge of replica state health ensure SLA is met  Two phase rollouts for data or protocol changes Leverages internal Autopilot technologies with SQL Azure extensions 13 12/25/2011
  14. 14. Software Challenges Lack of real-time OS features  CPU priority High priority for Fabric lease traffic  Page Faults/GC Locked pages for SQL and Fabric (in managed code) Fail fast or not?  Yes, for corruption/AV  No, for other issues unless centrally controlled What is really considered failed?  Some failures are non-deterministic or hangs  Multiple protocols / channels means partial failures too 14 12/25/2011
  15. 15. Monitoring Health model w/repair actions  Reboot -> Re-deploy -> Re-image (OS) -> RMA cycle Additional monitoring for SQL tier  Connect / network probes  Memory leaks / hung worker processes  Database corruption detection  Trace and performance stats capture  Sourced from regular SQL trace and support mechanisms  Stored locally and pushed to a global cluster wide store  Global cluster used for service insight and problem tracking 15 12/25/2011
  16. 16. SQL Azure Login Process 7 1 TDS Gateway Front-end Node TDS Session Protocol Parser 6 2 Gateway Logic Global Partition Map Master Node 8 3 Master Node Components 4 5Backend Node 1 Backend Node 2 Backend Node 3 SQL Instance SQL Instance SQL Instance SQL DB SQL DB SQL DB16 Scalability and and Availability: Fabric,Failover,Replication, and Load balancing Scalability Availability: Fabric, Failover, Replication, and Load balancing 12/25/2011
  17. 17. Security/Attack Considerations Service  Secure channel required (SSL)  Denial Of Service trend tracking  Packet Inspection Server  IP allow list (Firewall)  Idle connection culling  Generated server names Database  Disallow the most commonly attacked user id‟s (SA, Admin, root, guest, etc)  Standard SQL Authn/Authz mode 17 12/25/2011
  18. 18. Set up a server...18 12/25/2011
  19. 19. SQL Azure CompatibilityCurrently Supported Not Currently Supported Tables, indexes and  Data Types views  Sparse Columns, Filestream Stored Procedures  Partitions Triggers  Full-text indexes Constraints  SQL-CLR Table variables, session temp tables (#t) Spatial types, HierarchyId 19 12/25/2011
  20. 20. Size Matters20 12/25/2011
  21. 21. Pricing21 12/25/2011
  22. 22. Summary Cloud is different  Not a different place to host code SQL Azure IS SQL Server…a TDS endpoint Create DB‟s and manage using what we already know Considerations and futures paint exciting picture of what to expect looking forward Opportunities are great  Customers want a utility approach to storage  New businesses and abilities in scale, availability, etc But the price must be paid  Which is a good thing, otherwise everyone would be 22 12/25/2011 doing it!
  23. 23. Materials SQL Azure  http://www.microsoft.com/windowsazure/sqlazure/ SQL Azure “under the hood”  http://www.microsoftpdc.com/sessions/tags/sqlazure SQL Azure Fabric  http://channel9.msdn.com/pdc2008/BB03/ General Guidelines & Limitations  http://msdn.microsoft.com/en-us/library/ee336245.aspx 23 12/25/2011
  24. 24. Q&A Ask your questions…24 12/25/2011
  25. 25. Your potential. Our passion.TM