
SQL - Parallel Data Warehouse (PDW)


In this presentation we will look at what SQL PDW is, how SMP compares with MPP, and the world of appliances...

Published in: Technology

  1. 1. About Presenter: Karan Gulati (SSAS Maestro)
  2. 2. SQL - Parallel Data Warehouse (PDW): Let’s figure it out...
  3. 3. What we are covering: • The world of appliances • Introducing SQL Parallel Data Warehouse (PDW) • Different kinds of nodes in PDW • Hub and Spoke Architecture
  4. 4. What’s an Appliance? Are we talking about a refrigerator or an oven?
  5. 5. Appliance World: An appliance is a preconfigured machine dedicated to a specific use rather than general use. In the computer world, an appliance comes with hardware, a pre-installed OS, and software, all assembled with best practices and guidelines in mind. What does this mean for users? Just plug and play: it is ready to use, just like a refrigerator or an oven.
  6. 6. Have you heard about SQL PDW? Microsoft SQL Server Parallel Data Warehouse (SQL Server PDW) is: • A Massively Parallel Processing (MPP) appliance • Simple to deploy • A pre-built appliance with software, hardware and networking components • Highly scalable in data storage, with high-speed data transfer • One answer to the largest data warehouse workloads
  7. 7. Symmetric Multi Processing: First, let's understand symmetric multiprocessing (SMP). In SMP, each CPU core can work with any section of memory or disk, and all memory and all disk are available to each core. The problem starts when too many CPUs request data over the system bus at the same time: this creates a traffic jam on the bus, requests queue up, and processing slows down. The shared system bus therefore limits how much processing an SMP system can do as usage grows.
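The bus bottleneck described on this slide can be illustrated with a deliberately simple model. This is only a sketch: the function names and the idea of a single fixed bus capacity are assumptions made for illustration, not measurements of any real SMP machine.

```python
def smp_throughput(cores: int, per_core_demand: float, bus_capacity: float) -> float:
    """Total data rate the cores achieve when they all share one system bus.

    Assumption: every core wants `per_core_demand` units per second and the
    bus can carry at most `bus_capacity` units per second in total.
    """
    return min(cores * per_core_demand, bus_capacity)


def per_core_share(cores: int, per_core_demand: float, bus_capacity: float) -> float:
    """Data rate each core actually gets once the bus is shared evenly."""
    return smp_throughput(cores, per_core_demand, bus_capacity) / cores


if __name__ == "__main__":
    # Hypothetical numbers: each core wants 2 units/s, the bus carries 16 units/s.
    for cores in (2, 4, 8, 16, 32):
        print(cores, smp_throughput(cores, 2.0, 16.0), per_core_share(cores, 2.0, 16.0))
```

In this toy model, adding cores beyond the point where the bus is saturated adds no total throughput, and each core's share shrinks.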
  8. 8. Solution to the SMP Problem Lies in MPP: Massively Parallel Processing (MPP) architecture refers to the use of a large number of separate computers to perform a job together. In simple words, MPP is multiple boxes, each with its own CPUs, memory and other resources, working on a given task; this way we use the power of all machines / nodes at once.
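One way to picture "multiple boxes, each with its own resources" is to spread rows across nodes so that each node holds and processes only its own slice. The sketch below assumes a hash on a chosen key column decides where a row lives; the helper names `node_for` and `distribute` are hypothetical and are not PDW APIs.

```python
import zlib
from collections import defaultdict


def node_for(key: str, node_count: int) -> int:
    """Pick a node for a key using a stable hash (CRC32), so that the same
    key always lands on the same node."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(rows, key_field, node_count):
    """Assign each row to exactly one node based on its key column."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(str(row[key_field]), node_count)].append(row)
    return nodes


if __name__ == "__main__":
    # Hypothetical sample data: 100 sales rows spread over 4 nodes.
    sales = [{"customer": f"c{i}", "amount": i} for i in range(100)]
    for node_id, local_rows in sorted(distribute(sales, "customer", 4).items()):
        print(node_id, len(local_rows))
```

Because no node shares memory or disk with another, each slice can be scanned independently, which is the property the slide contrasts with SMP.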
  9. 9. SQL PDW: Flow of Query Execution: (1) The query hits the Control node. (2) The Control node breaks the query into multiple parallel operations and distributes them out to the Compute nodes, where the actual data resides. (3) DMS, or Data Movement Service, coordinates any needed data movement among nodes. (4) When the Compute nodes are finished, the Control node handles post-processing and re-integration of result sets for delivery back to the users.
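The flow on this slide is a scatter-and-gather pattern, which can be sketched in a few lines. Everything here is a stand-in: threads play the role of Compute nodes, the `COMPUTE_NODES` data is invented, and the function names are hypothetical rather than anything exposed by PDW.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list is the slice held by one compute node.
COMPUTE_NODES = [
    [("east", 120), ("west", 80)],
    [("east", 50), ("north", 30)],
    [("west", 70), ("north", 10)],
]


def compute_node_step(local_rows, region):
    """Work done on one compute node: filter and sum only its local rows."""
    return sum(amount for row_region, amount in local_rows if row_region == region)


def control_node_query(region):
    """Control-node role: send the same operation to every node in parallel,
    then re-integrate the partial results into one answer."""
    with ThreadPoolExecutor(max_workers=len(COMPUTE_NODES)) as pool:
        partials = list(
            pool.map(lambda rows: compute_node_step(rows, region), COMPUTE_NODES)
        )
    return sum(partials)


if __name__ == "__main__":
    print(control_node_query("east"))
```

This sketch skips step (3): no rows need to move between nodes for a simple filtered sum.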
  10. 10. SQL PDW: Nodes and Services: • Control Node • Compute Node • Administrative Service Nodes • Data Movement Services
  11. 11. Control Node: The Control node is the central point of control for processing queries on the SQL Server PDW appliance. The Control node receives the user query, creates a distributed query plan, communicates relevant plan operations and data to Compute nodes, receives Compute node results, performs any necessary aggregation of results, and then returns the query results to the user.
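"Any necessary aggregation of results" matters because some aggregates cannot be merged naively. The sketch below uses an average as the example and assumes each Compute node returns a (sum, count) pair; the function names are hypothetical and this is not a description of how the PDW engine is implemented.

```python
def local_partial(values):
    """What one compute node would send back: a (sum, count) pair."""
    return (sum(values), len(values))


def merge_average(partials):
    """Control-node step: combine the pairs into one overall average.

    Averaging the per-node averages would be wrong whenever nodes hold
    different numbers of rows, so sums and counts are merged separately.
    """
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


if __name__ == "__main__":
    # Hypothetical slices held by two compute nodes.
    node_a, node_b = [10, 20, 30], [100]
    print(merge_average([local_partial(node_a), local_partial(node_b)]))
```

With these two slices, the per-node averages are 20 and 100, so an average of averages would give 60, while the correct overall average is 40.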
  12. 12. Compute Node: A Compute node is the basic unit of scalability and storage. Each Compute node in the SQL Server PDW appliance uses its own user data and computing resources to perform a portion of each parallel query.
  13. 13. Administrative Service Nodes: • Landing Zone node: An appliance node that provides temporary storage and processing for loading data onto the appliance. • Management node: An appliance node that performs multiple functions related to managing the hardware and software in the appliance. This node is the hub for software deployment and servicing, authentication within the appliance (not login authentication), and monitoring system health and performance. • Backup node: The Backup node provides high-speed integrated backup at the database level. This is tied to the organization’s overall backup strategy and systems.
  14. 14. Data Movement Services: • When a query is submitted to the Control node, the PDW Engine determines what the query plan will be on each individual Compute node, then submits the query to all the Compute nodes through DMS • DMS also coordinates any needed data movement between nodes and handles any functions that need to be resolved centrally • In simple words, DMS is the brain that ties all the nodes together
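A typical reason rows must move between nodes is a join whose matching rows sit on different nodes. The sketch below assumes a hash-based shuffle on the join key followed by a purely local join on each node; the helper names are hypothetical, and the slides do not state which movement strategies DMS actually uses.

```python
import zlib


def target_node(key, node_count):
    """Stable hash so that equal join keys always map to the same node."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def shuffle(rows_by_node, key_index, node_count):
    """Move every row to the node selected by its join key."""
    moved = [[] for _ in range(node_count)]
    for rows in rows_by_node:
        for row in rows:
            moved[target_node(row[key_index], node_count)].append(row)
    return moved


def local_join(orders, customers):
    """Join performed entirely inside one node after the shuffle."""
    names = dict(customers)  # customer_id -> name
    return [(order_id, names[cid]) for order_id, cid in orders if cid in names]


def distributed_join(orders_by_node, customers_by_node):
    """Shuffle both inputs on customer_id, join locally, collect the results."""
    n = len(orders_by_node)
    orders = shuffle(orders_by_node, 1, n)        # orders are (order_id, customer_id)
    customers = shuffle(customers_by_node, 0, n)  # customers are (customer_id, name)
    result = []
    for node_id in range(n):
        result.extend(local_join(orders[node_id], customers[node_id]))
    return sorted(result)
```

After the shuffle, every order is on the same node as its customer row, so no node needs to look at another node's data during the join itself.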
  15. 15. Hub and Spoke Architecture: A data warehousing architecture with a central hub data warehouse that provides a flexible, high-speed ability to move or copy enterprise data warehouse (EDW) data to spokes. A spoke is typically a data mart held in physical storage optimized for a particular user group or organization. A data mart is usually a much smaller subset of the data in the EDW, specific to the reporting and analytic needs of one user community.
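The relationship between hub and spoke can be pictured as a filtered, narrowed copy of the hub data. This is a minimal sketch with invented rows and a hypothetical `build_spoke` helper; a real spoke would be loaded by a data transfer process rather than an in-memory filter.

```python
def build_spoke(hub_rows, department, columns):
    """Copy only one department's rows, and only the columns it needs,
    from the hub (EDW) into a smaller data mart."""
    return [
        {column: row[column] for column in columns}
        for row in hub_rows
        if row["department"] == department
    ]


if __name__ == "__main__":
    # Hypothetical hub contents.
    hub = [
        {"department": "sales", "region": "east", "amount": 120, "cost": 90},
        {"department": "finance", "region": "east", "amount": 40, "cost": 35},
        {"department": "sales", "region": "west", "amount": 80, "cost": 60},
    ]
    print(build_spoke(hub, "sales", ["region", "amount"]))
```

The hub stays the single complete copy, and each spoke carries only what its user community reports on.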
  16. 16. SQL PDW Acts as the Hub: Using a true hub-and-spoke architecture, all enterprise data can be maintained on a SQL Server 2008 R2 Parallel Data Warehouse hub while departments or business units keep their existing data marts to suit their needs. High-speed data transfer removes the traditional barriers to hub and spoke. Power users can even deploy a dedicated MPP appliance as a spoke so they can autonomously manage resources, while IT can enforce enterprise standards across all data.
  17. 17. Recommended Reading: • SQL Server 2008 R2 Parallel Data Warehouse • ITIC: Comparison of Oracle Database Appliance to Microsoft SQL Server • Implementing a SQL Server PDW Using the Kimball Approach • Implementing Data Warehouse 2.0 by Inmon
  18. 18. Thanks. Contact speaker: Karan Gulati (SSAS Maestro)