User Group BI


Published in: Technology

  1. SQL Server Parallel DWH Architecture (a.k.a. “Madison”). Franck Sidi, Lead SQL Server & BI, Microsoft Israel
  2. Trusted, Scalable Platform: our scalability strategy. “Madison” in 2010 Q1
  3. Agenda
      • Concepts and Principles
      • Reference Architectures “FastTrack”
      • Madison functional overview
      • Early adoption
  4. Symmetric Multiprocessing (SMP)
      • Single DB instance
      • “Shared Everything” architecture
      • Servers/CPUs share memory and disks
      • Can lead to resource contention as you scale
  5. Massively Parallel Processing (MPP)
      • Servers/CPUs have their own dedicated resources
      • “Shared Nothing” architecture
      • The “secret sauce” is parallelizing operations
      • Lightning-fast queries, data loads and updates
      • Linear scalability
      • The problem needs to be partitionable
  6. SMP vs MPP
      SMP:
      • HW advancements increasing ability to scale up
      • Scaling is limited
      • High-end SMP is very expensive
      • Extremely high concurrency for some workloads
      • For less than 1-2 TB of data, SMP will almost always be better
      • Full SQL Server functionality
      • HA must be architected in
      MPP:
      • HW advancements increasing ability to scale up & scale out
      • Scaling to 1 PB+
      • Scale out is relatively low cost
      • Relatively high concurrency for complex workloads
      • > 2 TB up to 1 PB
      • Limited SQL Server functionality
      • HA is built in
  7. Agenda
      • Concepts and Principles
      • Reference Architectures “FastTrack”
      • Madison functional overview
      • Early adoption
  8. How some solve the problem today: a big SAN and the biggest 64-core server, connected together. What’s wrong with this picture?
  9. System out of balance
      • This server can consume 16 GB/sec of IO, but the SAN can only deliver 2 GB/sec
      • That holds even when the SAN is dedicated to the SQL data warehouse, which it often isn’t
      • Lots of disks for random IOPS, BUT limited controllers → limited IO bandwidth
      • The system is typically IO bound and queries are slow, despite significant investment in both server and storage
  10. Where Does an I/O Go?
      • Understand the potential throughput of the hardware
      • Each component in the path has an associated speed/bandwidth
      • Know where the potential bottlenecks exist
      • Diagram components: host, switch, front-end ports, cache, controllers/processors
      • I/O path: PCI bus → HBA → Fiber Channel ports → array processors → disks
  11. Potential Performance Bottlenecks (diagram)
      • Components shown: server CPU cores, SQL Server on Windows, HBAs (ports A and B), FC switch, storage controller (FC ports A and B, cache), LUNs, disks
      • Rates to know along the path: CPU feed rate, SQL Server read-ahead rate, HBA port rate, switch port rate, SP port rate, LUN read rate, disk feed rate
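The idea behind slides 9 to 11 is that end-to-end scan throughput is capped by the slowest component in the I/O path. A minimal Python sketch of that reasoning follows. Only the 16 GB/sec server figure and the 2 GB/sec SAN figure come from the slides; the other rates, the component names and the `find_bottleneck` helper are hypothetical values chosen for illustration.

```python
# Rates in MB/s along one I/O path. Only cpu_feed_rate (16 GB/s) and
# sp_ports (2 GB/s, the SAN limit) come from slide 9; the rest are assumed.
path_rates_mb_s = {
    "cpu_feed_rate": 16000,  # what the server could consume
    "hba_ports": 3200,       # hypothetical
    "switch_ports": 3200,    # hypothetical
    "sp_ports": 2000,        # what the SAN can deliver
    "lun_read_rate": 4000,   # hypothetical
}


def find_bottleneck(rates):
    """Return (component, rate) for the slowest stage in the path."""
    name = min(rates, key=rates.get)
    return name, rates[name]


bottleneck_name, bottleneck_rate = find_bottleneck(path_rates_mb_s)
# Fraction of the CPUs' appetite that the path can actually satisfy.
utilization = bottleneck_rate / path_rates_mb_s["cpu_feed_rate"]

print(f"Bottleneck: {bottleneck_name} at {bottleneck_rate} MB/s")
print(f"CPUs are fed at {utilization:.1%} of what they could consume")
```

With these numbers the storage processor ports are the limit, and the server is fed at one eighth of what it could consume, which is the imbalance slide 9 describes.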
  12. The alternative: a balanced system
      • Design a server + storage configuration that can deliver all the IO bandwidth that the CPUs can consume when executing a SQL relational DW workload
      • Avoid sharing storage devices among servers
      • Avoid overinvesting in disk drives
      • Focus on scan performance, not IOPS
      • Lay out and manage data to maximize range scan performance and minimize fragmentation
  13. Sequential I/O vs Random I/O
      Sequential I/O:
      • Ideal for data warehousing
      • Large reads & writes
      • Scans on large data stores are usually read with sequential read patterns, not random read patterns
      • Scalable, predictable performance
      • Requires 1/3 or fewer drives for the same performance
      Random I/O:
      • Ideal for OLTP
      • Small reads and writes
      • OLTP is usually random-read centric
      • Seek queries are a goal in OLTP query optimization
      • Seeks usually cause random reads
      • Not as predictable & scalable for data warehousing
      • Requires a large number of drives
      All databases contain both scans and seeks along with other types of reads and writes; DW workloads indicate that the vast majority of reads, though not all, are sequential.
  14. What is Fast Track Data Warehouse?
      • A method for designing a cost-effective, balanced system for data warehouse workloads
      • Reference hardware configurations developed in conjunction with hardware partners using this method
      • Best practices for data layout, loading and management
      • Relational database only: not SSAS, IS, RS
  15. Fast Track Scope (diagram)
      • Reference architecture scope (dashed): data staging and bulk loading, data warehouse, dedicated SAN / storage array
      • Presentation layer systems and presentation data: Analysis Services cubes, Reporting Services, web analytic tools, SharePoint Services, Microsoft Office SharePoint, PerformancePoint, Excel Services
  16. Benefits of the Fast Track appliance model
      • Lower TCO: minimizes the risk of overspending on unbalanced hardware configurations; commodity hardware
      • Choice: HW platform; implementation vendor
      • Reduced risk: validated by Microsoft; encapsulates best practices; known performance & scalability
  17. Fast Track DW Reference Configurations

      | Server | CPU | CPU Cores | SAN | Data Drive Count | Initial Capacity* | Max Capacity** |
      |---|---|---|---|---|---|---|
      | HP ProLiant DL385 G6 | (2) AMD Opteron Istanbul six core 2.6 GHz | 12 | (3) HP MSA2312fc | (24) 300GB 15k RPM SAS | 6TB | 12TB |
      | HP ProLiant DL585 G6 | (4) AMD Opteron Istanbul six core 2.6 GHz | 24 | (6) HP MSA2312fc | (48) 300GB 15k SAS | 12TB | 24TB |
      | HP ProLiant DL785 G6 | (8) AMD Opteron Istanbul six core 2.8 GHz | 48 | (12) HP MSA2312 | (96) 300GB 15k SAS | 24TB | 48TB |
      | Dell PowerEdge R710 | (2) Intel Xeon Nehalem quad core 2.66 GHz | 8 | (2) EMC AX4 | (16) 300GB 15k FC | 4TB | 8TB |
      | Dell PowerEdge R900 | (4) Intel Xeon Dunnington six core 2.67 GHz | 24 | (6) EMC AX4 | (48) 300GB 15k FC | 12TB | 24TB |
      | IBM X3650 M2 | (2) Intel Xeon Nehalem quad core 2.67 GHz | 8 | (2) IBM DS3400 | (16) 200GB 15k FC | 4TB | 8TB |
      | IBM X3850 M2 | (4) Intel Xeon Dunnington six core 2.67 GHz | 24 | (6) IBM DS3400 | (24) 300GB 15k FC | 12TB | 24TB |
      | IBM X3950 M2 | (8) Intel Xeon Nehalem four core 2.13 GHz | 32 | (8) IBM DS3400 | (32) 300GB 15k SAS | 16TB | 32TB |
      | Bull Novascale R460 E2 | (2) Intel Xeon Nehalem quad core 2.66 GHz | 8 | (2) EMC AX4 | (16) 300GB 15k FC | 4TB | 8TB |
      | Bull Novascale R480 E1 | (4) Intel Xeon Dunnington six core 2.67 GHz | 24 | (6) EMC AX4 | (48) 300GB 15k FC | 12TB | 24TB |

      * Core-balanced compressed capacity based on 300GB 15k SAS, not including hot spares and log drives. Assumes 25% (of raw disk space) allocated for TempDB.
      ** Represents the storage array fully populated with 300GB 15k SAS and use of a 2.5:1 compression ratio. This includes the addition of one storage expansion tray per enclosure. 30% of this storage should be reserved for DBA operations.
  18. Fast Track DW Core-Balanced Architecture (diagram, per MSA2312, using 300GB 15k SAS drives)
      • SMP server (4 cores) connected through a switch to one MSA with two storage processors, SP A and SP B
      • Each HBA port rated at 4Gb/s or 400MB/s, and 1600MB/s for all 4 ports
      • Each SP port rated at 4Gb/s or 400MB/s, and 1600MB/s for all 4 SP ports
      • Each LUN rated at 125MB/s; each SP controls 4 LUNs at 500MB/s, or 1000MB/s per MSA
      • Each SP rated at 500MB/s, or 1000MB/s for both SPs
      • RAID groups GP01 to GP04 (disks 01 to 08) hold the data LUNs: ONLY 8 data disks! RAID GP05 (disks 09, 10) holds the logs, plus one hot spare (HS)
      Drive details:
      • Each MSA can hold 12 drives; this configuration requires 11
      • The MSA is 2U in total (a capacitor eliminates the need for a battery)
      • Each MSA SP port controls 4 LUNs
      • Each pair of LUNs consists of (2) 300GB 15k SAS drives in RAID1
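The ratings on slide 18 multiply out into a per-enclosure and per-server scan rate. The Python sketch below reproduces that arithmetic under stated assumptions: the LUN, SP and port ratings are the slide's figures, the MSA counts for the three HP servers are taken from the slide 17 table, and the per-core figure is derived here (it is not printed on the slides). All variable and function names are illustrative.

```python
LUN_RATE_MB_S = 125    # each data LUN, as rated on slide 18
LUNS_PER_SP = 4        # data LUNs controlled by each storage processor
SPS_PER_MSA = 2        # SP A and SP B
PORT_RATE_MB_S = 400   # one 4 Gb/s port, as rated on slide 18
PORT_COUNT = 4

sp_rate = LUN_RATE_MB_S * LUNS_PER_SP     # 500 MB/s per SP
msa_rate = sp_rate * SPS_PER_MSA          # 1000 MB/s per MSA
port_total = PORT_RATE_MB_S * PORT_COUNT  # 1600 MB/s across all ports


def server_scan_rate(msa_count):
    """Aggregate sequential read rate for a server with msa_count enclosures."""
    return msa_count * msa_rate


# (CPU cores, MSA count) for the HP configurations on slide 17.
hp_configs = {"DL385 G6": (12, 3), "DL585 G6": (24, 6), "DL785 G6": (48, 12)}

per_core = {}
for name, (cores, msas) in hp_configs.items():
    total = server_scan_rate(msas)
    per_core[name] = total / cores  # derived figure, not stated on the slides
    print(f"{name}: {total} MB/s total, {per_core[name]:.0f} MB/s per core")
```

Because the four ports together (1600MB/s) outrun the two SPs (1000MB/s), the enclosure rather than the fabric sets the ceiling, and adding enclosures in step with cores keeps the derived per-core rate constant.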
  19. Fast Track Data Warehouse Components
      • Software: SQL Server 2008 Enterprise; Windows Server 2008
      • Configuration guidelines: physical table structures; indexes; compression; SQL Server settings; Windows Server settings; loading
      • Hardware: tight specifications for servers, storage and networking; ‘per core’ building block
  20. RA: Tightly Spec'd. RAs include not only hardware but best practices in:
      • Windows OS configuration
      • SQL Server startup options
      • Database physical layout
      • Table types
      • Indexing
      • Statistics
      • Managing fragmentation
      • Loading procedures
  21. Fast Track Case Study: Results

      | Comparison | Teradata | SQL Server Fast Track DW | Result |
      |---|---|---|---|
      | Loading, Subject Area 1 | 5:10:21 total time | 51:31 total time | SQL Server 6x faster |
      | Loading, Subject Area 2 | 4:36:08 total time | 1:50:01 total time | SQL Server 2.5x faster |
      | Query times, Subject Area 1 | 3:03 avg query time (using 9 benchmark queries) | 0:15 avg query time (using 9 benchmark queries) | SQL Server 12x faster |
      | Query times, Subject Area 2 | 56:44 avg query time (using 4 benchmark queries) | 8:09 avg query time (using 4 benchmark queries) | SQL Server 7x faster |
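The speedup factors on slide 21 can be checked from the reported times. The following Python sketch does that arithmetic; the second load time is read as 1 h 50 min 01 s, which is an assumption, although it is the reading consistent with the stated 2.5x. The `to_seconds` helper and the row labels are illustrative names.

```python
def to_seconds(stamp):
    """Convert 'h:mm:ss' or 'm:ss' to a number of seconds."""
    total = 0
    for part in stamp.split(":"):
        total = total * 60 + int(part)
    return total


# (label, Teradata time, SQL Server Fast Track time) from slide 21.
results = [
    ("Loading, subject area 1", "5:10:21", "51:31"),
    ("Loading, subject area 2", "4:36:08", "1:50:01"),  # assumed h:mm:ss
    ("Queries, subject area 1", "3:03", "0:15"),
    ("Queries, subject area 2", "56:44", "8:09"),
]

speedups = {
    label: to_seconds(teradata) / to_seconds(fast_track)
    for label, teradata, fast_track in results
}

for label, factor in speedups.items():
    print(f"{label}: {factor:.2f}x")
```

The computed ratios are roughly 6.0, 2.5, 12.2 and 7.0, which agree with the rounded factors on the slide.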
  22. Agenda
      • Concepts and Principles
      • Reference Architectures “FastTrack”
      • Madison functional overview
      • Early adoption
  23. About DATAllegro… (technology partners; stack diagram)
      • Proprietary appliance management and MPP database
      • Open source database and OS
      • Industry standard servers
      • Industry standard networking
      • Industry standard storage
  24. Integration Plans
      • Provide scale out through MPP on SQL Server and Windows
      • Offer an ‘appliance-like’ user experience to data warehouse customers
      • Lower the TCO of high-end data warehousing
      • Offer an integrated BI platform to small and very large enterprises
      • Stack diagram: open source database & OS; industry standard servers; industry standard networking; industry standard storage
  25. MPP Additional Considerations
      • Principles & approach of SMP carry forward
      • Deeper level of complexity: high availability, parallelization, inter-node data movement
  26. Modular building blocks: balanced CPU and storage
      • Both SMP and MPP are based on building blocks that scale by the CPU core
      • Each block adds network, storage processing and disk bandwidth for each core
      • Based on maximizing & sustaining true sequential I/O while minimizing disks
      • Generally changes the balance of systems so that more can be spent on CPU and SW than on storage, giving better overall performance for a given budget
      • Building blocks can be adjusted for multiple MPP configurations: high performance, archive and extreme performance
  27. The future of SQL Server Data Warehousing: Project "Madison"
      • Predictable scale out through MPP
      • Customers with data warehouses of over 400 TB
  28. Commodity Hardware
      • Lower cost
      • Frequent performance improvements
      • Easier upgrade and maintenance
      • Higher customer comfort
      • Better compatibility
  29. Ultra Shared Nothing
      • An extension of traditional shared nothing design
      • Push shared nothing architecture into the SMP node
      • IO and CPU affinity within SMP nodes
      • Eliminate contention per user query
      • Use full resources for each user query
      • Multiple physical instances of tables
      • Distribute large tables
      • Replicate small tables
      • Distribute AND replicate medium tables
      • Re-distribute rows “on-the-fly” when necessary
  30. Madison Server Components (diagram)
      • Control nodes (active / passive)
      • Compute nodes (database servers running SQL Server) with storage nodes
      • Landing zone, backup node, management node
      • Spare database server for failover
      • Interconnects: dual Fiber Channel, dual Infiniband
  31. System Architecture (diagram)
      • 20 Gb/s Infiniband DMS backbone; 8 Gb/s Fiber Channel; dual Fiber Channel and dual Infiniband per database server
      • Control nodes (active / passive), database servers, spare database server
      • Connections from the corporate network: client drivers, data center monitoring, ETL load interface, corporate backup solution
      • Private network: IPoIB, dedicated LAN, local SAN
  32. Software Architecture (diagram)
      • Clients: Nexus query tool, MS BI (AS, RS), SSIS, admin console (IIS)
      • Client access: JDBC, OLE-DB, ODBC, Ado.Net
      • Control node: Madison service (core engine, DSQL, DMS manager, authentication, configuration, queue, schema), SQL Server, DMS
      • Compute nodes: SQL Server with user data, DMS
      • Landing zone: DMS loader, DMS client; backup node; management node (HPC, AD)
      • Legend: existing MS software; built by DWPU; 3rd party
  33. Control Node & Client Drivers
      Control node:
      • Client connections always go through the control node
      • Clustered to a passive node
      • Processes SQL requests
      • Prepares the execution plan
      • Orchestrates distributed execution
      • Uses a local SQL Server for final query plan processing / result aggregation
      Client drivers:
      • Will use the same set of drivers used by DATAllegro, provided by DataDirect
      • ODBC, OLE-DB, JDBC and Ado.Net client drivers
      • Wire protocol (SequeLink)
      • Drivers available for 32 and 64 bits
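The control node's role on slide 33 (send work to the compute nodes, then aggregate the results locally) is a scatter-gather pattern. The Python sketch below illustrates that pattern only; it is not Madison's implementation, and the sample rows, node layout and function names are all hypothetical.

```python
from collections import defaultdict

# Each inner list stands for the store_sales rows held by one compute node.
# The (store, quantity) rows are made-up sample data.
compute_nodes = [
    [("store_1", 10), ("store_2", 5)],
    [("store_1", 7), ("store_3", 2)],
    [("store_2", 1), ("store_3", 4)],
]


def node_partial_sum(rows):
    """Work done on one compute node: aggregate its local rows only."""
    partial = defaultdict(int)
    for store, quantity in rows:
        partial[store] += quantity
    return dict(partial)


def control_node_query(nodes):
    """Work done on the control node: collect partials, then merge them."""
    final = defaultdict(int)
    for partial in map(node_partial_sum, nodes):
        for store, quantity in partial.items():
            final[store] += quantity
    return dict(final)


totals = control_node_query(compute_nodes)
print(totals)
```

Each node reduces its own rows before anything crosses the network, so the control node only merges one small partial result per node.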
  34. Compute Nodes
      • Each is a SQL Server 2008 instance
      • DB engine nodes are autonomous on local data
      • SQL as the primary interface
      • Each MPP node is a highly tuned SMP node with standard interfaces
  35. Landing Zone
      • Provides high-capacity storage for data files from ETL processes
      • Integration Services available on the landing zone
      • Connected to the internal network
      • Available as a sandbox for other applications and scripts that run on the internal network
      • Data flow (diagram): source, data files on the landing zone, loader, compute nodes
  36. Backup Node
      • Builds on the SQL Server native backup/restore facility
      • Uses the VDI interface to plug into the backup pipeline
      • Database-level backup
      • Coordinated backup across the nodes
      • Quiesces write activity to synchronize
      • Can only restore to another appliance with exactly the same number of distributions
  37. Data Distribution & Replication (diagram)
      • Tables are hash distributed or replicated
      • Nodes shown: control node, compute nodes, storage nodes, landing zone node, spare node
      • Text files are loaded through the landing zone
  38. Database Distributed & Replicated Tables (diagram)
      • Date Dim (D): D_DATE_SK, D_DATE_ID, D_DATE, D_MONTH, …
      • Customer (C): C_CUSTOMER_SK, C_CUSTOMER_ID, C_CURRENT_ADDR, …
      • Item (I): I_ITEM_SK, I_ITEM_ID, I_REC_START_DATE, I_ITEM_DESC, …
      • Store Sales (SS): Ss_sold_date_sk, Ss_item_sk, Ss_customer_sk, Ss_cdemo_sk, Ss_store_sk, Ss_promo_sk, Ss_quantity, …
      • Promotion (P): P_PROMO_SK, P_PROMO_ID, P_START_DATE_SK, P_END_DATE_SK, …
      • Customer Demographics (CD): CD_DEMO_SK, CD_GENDER, CD_MARITAL_STATUS, CD_EDUCATION, …
      • Store (S): S_STORE_SK, S_STORE_ID, S_REC_START_DATE, S_REC_END_DATE, S_STORE_NAME, …
      • Every node box in the diagram carries D, C, I, CD, P and S together with an SS block
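One consequence of replicating small dimension tables while distributing the large fact table is that each node can join its share of the fact rows locally, with no inter-node data movement. The Python sketch below illustrates this under simplifying assumptions: four nodes, a modulo on the key as a stand-in for the real (unspecified) hash function, and made-up rows. Only the table and column names echo slide 38.

```python
NODE_COUNT = 4  # hypothetical appliance size

# Small dimension table: a full copy is placed on every node (replicated).
date_dim = {1: "2009-01-01", 2: "2009-01-02"}

# Large fact table rows: (ss_item_sk, ss_sold_date_sk, ss_quantity).
store_sales = [(item_sk, 1 + item_sk % 2, item_sk * 10) for item_sk in range(1, 9)]

nodes = [{"date_dim": dict(date_dim), "store_sales": []} for _ in range(NODE_COUNT)]
for row in store_sales:
    # Distribute on ss_item_sk; modulo stands in for the real hash function.
    nodes[row[0] % NODE_COUNT]["store_sales"].append(row)


def local_join(node):
    """Join the node's fact rows to its own copy of the dimension table."""
    return [
        (node["date_dim"][date_sk], item_sk, quantity)
        for item_sk, date_sk, quantity in node["store_sales"]
    ]


joined = [row for node in nodes for row in local_join(node)]
print(len(joined), "rows joined without moving data between nodes")
```

A join between two large distributed tables on different keys would not have this property, which is where re-distributing rows on the fly comes in.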
  39. Physical Storage Configuration: Single Node (diagram)
      • Data LUNs (LUN 1, LUN 2, LUN 3 … LUN 8), two files of each kind per LUN
      • User database(s), distributed filegroups FG Dist A … FG Dist H: DistData1.mdf, DistData2.ndf … DistData8.ndf
      • User database(s), replicated filegroup: ReplData1.mdf, ReplData2.ndf … ReplData8.ndf
      • Staging database, filegroups FG Stage A … FG Stage H: StageData1.mdf, StageData2.ndf, …
      • Staging database, replicated filegroup: ReplData1.mdf, ReplData2.ndf, …
      • Local drives 1 to 6 (TempDB): TempDB1.mdf, TempDB2.ndf … TempDB6.ndf
      • Log LUN: UserDB log, StageDB log, TempDB log
  40. Create Table: Behind the Scenes
      • Statement issued: Create Table store_sales with distribute_on (ss_item_sk) partition_on (ss_sold_date_sk) cluster_on (ss_sold_date_sk)
      • 8 filegroups, Distribution_a through Distribution_h, with 1 table per filegroup: Create Table mad_store_sales_a, Create Table mad_store_sales_…, Create Table mad_store_sales_h
      • 12 partitions (ss_sold_date_sk) in each table
      • Each partition holds N 8K pages; each page holds tuples
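Slide 40 implies that one logical table becomes eight physical tables per node, with each row routed by its distribution key. The Python sketch below shows one way such routing could look. The `mad_<table>_<suffix>` naming follows the slide; the use of CRC32 as the hash is an assumption made for illustration, since the slides do not say which hash function Madison uses.

```python
import zlib
from collections import Counter

DISTRIBUTION_SUFFIXES = "abcdefgh"  # Distribution_a through Distribution_h


def physical_table(logical_name, distribution_key):
    """Map a row's distribution key to one of the per-filegroup tables.

    CRC32 is a stand-in hash chosen because it is stable across runs.
    """
    digest = zlib.crc32(str(distribution_key).encode("utf-8"))
    suffix = DISTRIBUTION_SUFFIXES[digest % len(DISTRIBUTION_SUFFIXES)]
    return f"mad_{logical_name}_{suffix}"


# Hypothetical rows keyed on ss_item_sk, the distribute_on column.
rows = [{"ss_item_sk": item_sk, "ss_quantity": 1} for item_sk in range(1, 10001)]
placement = Counter(physical_table("store_sales", row["ss_item_sk"]) for row in rows)

for table_name in sorted(placement):
    print(table_name, placement[table_name])
```

Rows with the same ss_item_sk always land in the same physical table, so equality joins and aggregations on that key can stay within one distribution.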
  41. High Availability: multiple levels of redundancy
      • Leveraging MSCS for node availability
      • Cluster-aware services: SQL Server, Madison, DMS
      • Leveraging MSCS for SQL services and DMS
      • 8x1: 1 spare node for every 8* compute nodes
  42. Security and Encryption
      • Retains the DA v3 design
      • Authentication and authorization done by the Madison server
      • Users and roles as first-class principals
      • Nested role capabilities
      • Connection to SQL back-ends through a high-privilege account
      • SQL nodes reside on a private network
      • No support for integrated auth
      • Leverages TDE to expose DB-level encryption
      • Supports key rotation
  43. The Logical Data Model
      • Multiple databases per appliance
      • Each user database maps to one SQL Server db per node
      • Tables: replicated, distributed, replicated + distributed
      • Leverages SQL Server compression
      • Supports partitioning
      • Supports secondary indexes
      • Views
  44. SQL Server Data Types
      • Most scalar data types supported by SQL Server 2008 are supported by Madison
      • Main exceptions: character and binary strings limited to 8K (i.e. no BLOB support); XML; geometry / geography; hierarchyid; sql_variant; system and CLR UDTs
      • Latin1_General with binary comparison only
      Data type table (columns: DAv3, Madison):
      • Checked in both columns: char / nchar, datetime (was date in DA), decimal, float, int (was integer in DA), smallint, tinyint, varchar / nvarchar / varbinary
      • Checked in one column: bit, date and time, datetime2, datetimeoffset, money, real, smalldatetime, smallmoney
      • bigint, binary: two check marks follow this pair of rows
      • No check mark: geometry / geography, hierarchyid, sql_variant, text / ntext / image, timestamp, v*(max), uniqueidentifier, xml
  45. Supported SQL Syntax
      • Aligned with ANSI SQL 92
      • Basic INSERT, UPDATE, DELETE, SELECT
      • CREATE TABLE AS SELECT
      • Limited analytical function support
      • Teradata extensions: Quantile, Sample, …
  46. Configuration and Monitoring
      • Challenge: is it an appliance or a collection of nodes?
      • Madison services are instrumented: logs and performance counters
      • Capture and forward SNMP alerts from devices within the appliance
      • Small subset of DMVs to union the underlying node DMVs
      • Leverage HPC for monitoring
  47. Manageability
      • Web-based main administrative user interface, based on the DATAllegro manageability UI: monitoring system health and activity
      • Leveraging HPC Pack 2008: systems management, monitoring, cluster health
  48. Query Tools
      • GUI tool: Nexus (CoffingDW), with table & view object explorer and interactive query execution
      • Command line tool: replacement for DA-SQL, a flavor of SqlCmd
  49. MS BI Integration
      • Integration Services: Madison enabled as a source (data movement, lookup operations, etc.); will add a new SSIS destination to ensure integrated high-performance loads
      • Reporting Services: fully supported, including parameterized queries; will customize the experience for Report Builder and Report Designer
      • Analysis Services: will get connectivity through the OLE-DB provider; will enable both MOLAP and ROLAP storage
  50. High Level Release Definitions
      • “Madison” (aka v1): focus on time to market; compatibility with DATAllegro v3; MS BI integration; H1 2010
      • V2+: closer functional alignment with SQL Server; better integration with SQL and the MS ecosystem, tools and technologies
      • Will start running MTPs in the summer
  51. Recap
      • Data warehousing reference architectures available today: SQL Server Fast Track
      • SQL Server “Madison”: built for advanced, large-scale data warehouses; shared-nothing MPP architecture; early evaluation programs starting soon
      • All feedback welcome. Thank you!