All up datawarewhouse – from smp to parallel

  • The HP Business Data Warehouse Appliance is a great solution for data warehouse environments with light concurrency requirements and relatively low data volumes. This workload profile is becoming increasingly common as organizations recognize the business value of using data marts and departmental data warehouses as a platform for the growing use of business analysis tools by information workers at all levels of the business. Data warehouses and BI solutions are no longer the exclusive domain of huge enterprises; they are now an increasingly important capability for small to medium businesses and decentralized departments. A growing number of businesses do not have the same concurrency requirements, data volumes, or budgets as large enterprises, but want to create a data warehouse for better reporting, analysis, and decision making.
  • The HP Business Data Warehouse (BDW) offers a solution for the customers discussed on the previous slide. It is a solution that is:
      • Complete: The appliance comes with all the hardware and software you need, pre-configured for a data warehouse workload based on expertise from HP and Microsoft, and includes support services from a single source.
      • Optimized: Experts from Microsoft and HP have designed and tuned the appliance specifically for data warehouse workloads, so you can be sure it will meet your data warehouse requirements with efficient power utilization and built-in security and reliability features.
      • Agile: Because the BDW is a single hardware appliance, you can just plug it in, switch it on, and within a very short period you'll have a working data warehouse. The wizards included in the appliance make it easy to configure and load, enabling your business to start taking advantage of your data warehouse sooner than with a "self-build" solution. And while the BDW is optimized for relatively low data volumes and concurrency, if your business grows significantly you can transfer your BDW software licenses to a Fast Track solution.
  • There are two key scenarios for using the HP Business Data Warehouse appliance:
      • A small business or departmental data warehouse for a small group of concurrent users who need to store and analyse up to 5 TB of data.
      • A spoke in an Enterprise Data Warehouse "hub and spoke" architecture, where the BDW is used to deliver a subset of the corporate data warehouse to a specific set of users.
  • The appliance is a complete solution with the hardware, software, and services needed in a mission-critical data warehouse. The database is highly scalable and can handle workloads of hundreds of terabytes while maintaining performance. The EDW appliance also works with your existing data warehouses and data marts, so you do not have to rip and replace your current investments. You can also use familiar tools such as Microsoft Excel to analyze the data in your data warehouse.
  • Customers will purchase at least two racks for a complete EDW Appliance system. [Click] The control rack will have control nodes, management nodes, the landing zone, and backup nodes. The data rack will have servers that are compute nodes and storage nodes. Each of these racks and node types will be discussed in more detail.
  • Data layout options:
      • Dimension tables are typically replicated.
      • Parallel Data Warehouse maintains data integrity across all nodes.
      • Fact tables are typically distributed.
      • The data model, table sizes, and workloads must all be considered when choosing between replicated and distributed tables.
    The following join types are used to achieve distribution compatibility:
      • Shared-nothing join: Achieves distribution compatibility by using compatible distribution keys in the SQL join criteria.
      • Ultra shared-nothing join: Achieves distribution compatibility through a replicated table; no data movement between nodes is required.
      • Redistribution join: Requires data to be dynamically distributed between compute nodes to achieve distribution compatibility.
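The three join types above can be read as a decision rule over table metadata. The following Python sketch is an illustration only, not Parallel Data Warehouse code: the `Table` class, the `join_strategy` function, and the table and column names are all hypothetical, and it assumes that both distributed tables are hashed with the same function over compatible key types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Table:
    """Hypothetical table metadata used only for this illustration."""
    name: str
    replicated: bool                         # True: full copy on every compute node
    distribution_key: Optional[str] = None   # hash column for distributed tables


def join_strategy(left: Table, right: Table, left_col: str, right_col: str) -> str:
    """Classify a two-table equi-join by the data movement it needs."""
    if left.replicated or right.replicated:
        # Every node already holds the replicated table in full,
        # so each node can join its local rows without moving data.
        return "ultra shared-nothing join"
    if left.distribution_key == left_col and right.distribution_key == right_col:
        # Both tables are hashed on the join columns, so matching rows
        # already sit on the same node.
        return "shared-nothing join"
    # At least one table must be re-hashed on the join column first.
    return "redistribution join"


sales = Table("FactSales", replicated=False, distribution_key="customer_id")
returns = Table("FactReturns", replicated=False, distribution_key="customer_id")
dates = Table("DimDate", replicated=True)

print(join_strategy(sales, returns, "customer_id", "customer_id"))
print(join_strategy(sales, dates, "date_id", "date_id"))
print(join_strategy(sales, returns, "order_id", "order_id"))
```

The third call shows why the choice of distribution key matters: two fact tables hashed on `customer_id` but joined on another column need a redistribution step before the join can run locally.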
  • The Administrative Console is an Internet Information Services (IIS) web application for SQL Server Parallel Data Warehouse that displays the appliance’s state information. Users connect to the Administrative Console through Microsoft Internet Explorer.
  • The Configuration Manager is an appliance administration tool that SQL Server Parallel Data Warehouse system administrators use to perform appliance-level operations and to change appliance-level settings. For example, use the Configuration Manager to reset passwords, set the time zone, change IP addresses, configure SSL certificates, enable remote access through the firewall, start or stop the appliance, and set Instant File Initialization.
  • A distributed data warehouse solution, such as that supported by SQL Server Parallel Data Warehouse, comprises a centralized EDW and a set of loosely coupled data marts. For many years, this has been the preferred approach for enterprise-wide data warehousing, and numerous studies since 2003 confirm that hub and spoke is the most popular data warehouse architecture among DW professionals. Traditionally, implementing a hub and spoke architecture has been challenging due to practical limitations of the database engine and network resources. [Click to display types of spoke] With SQL Server Parallel Data Warehouse, you can create a diverse range of spoke types: SQL Server Parallel Data Warehouse MPP appliances for user groups that have extreme scalability requirements, Fast Track data warehouse implementations, SQL Server 2008 Enterprise data warehouses, and even SQL Server 2008 Analysis Services OLAP databases. [Click to display parallel database copy point] However, the SQL Server Parallel Data Warehouse parallel database copy technology enables rapid data integration between spokes and the SQL Server Parallel Data Warehouse hub, making it easier to build hub and spoke solutions that integrate your diverse data marts and the enterprise data warehouse. [Click to display multiple-user SLA point] The SQL Server Parallel Data Warehouse hub and spoke architecture enables you to support user groups with very different SLAs; supports hot, warm, and cold data; supports different requirements for loading data; and more.
  • The EDW appliance can be the central hub in this architecture. The spokes can be anything from a SQL Server departmental data mart to a Fast Track reference implementation, a business decision appliance, or a SQL Server Analysis Services system. EDW is not restricted to any particular model, and the high-speed data copy features enable multiple clients.
  • With so many choices, there are always questions about which solution is right for the organization. These questions help you to determine the correct solution. While there is rarely any one deciding factor, you can find a solution that is optimized for the things that are most important to you.
  • The EDW appliance fits in with your existing data warehouse solutions and will enable you to query and report on the large amount of data stored in the appliance.

All up datawarewhouse – from smp to parallel: Presentation Transcript

  • All up datawarewhouse – From SMP to Parallel Data warehousing
  • Take 1 big SAN. Add a little server. Add a bigger server. Add more networking.
  • POTENTIAL PERFORMANCE BOTTLENECKS (diagram of the I/O path: disks, LUNs, storage controller cache, FC switch, HBAs, Windows Server, SQL Server CPU cores)
      • Rates along the path: CPU Feed Rate, SQL Server Read Ahead Rate, HBA Port Rate, Switch Port Rate, SP Port Rate, LUN Read Rate, Disk Feed Rate
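One way to read the bottleneck slide is that the slowest stage in the I/O path caps what the whole system can deliver. The sketch below makes that arithmetic concrete in Python. Every number in it is a made-up placeholder rather than a measurement or a vendor figure, the component counts are hypothetical, and the helper names `aggregate_rates` and `find_bottleneck` are invented for this example.

```python
# Hypothetical (per-unit MB/s, unit count) for stages named on the slide.
# All figures are placeholders for illustration, not measured values.
COMPONENTS = {
    "CPU feed rate": (200, 8),      # per core, 8 cores
    "HBA port rate": (400, 2),      # per port, 2 ports
    "Switch port rate": (400, 4),
    "SP port rate": (400, 4),
    "LUN read rate": (100, 6),
    "Disk feed rate": (50, 24),
}


def aggregate_rates(components):
    """Total throughput of each stage: per-unit rate times unit count."""
    return {name: rate * count for name, (rate, count) in components.items()}


def find_bottleneck(components):
    """Return the stage with the lowest total throughput and that rate."""
    totals = aggregate_rates(components)
    name = min(totals, key=totals.get)
    return name, totals[name]


name, rate = find_bottleneck(COMPONENTS)
print(f"Bottleneck: {name} at {rate} MB/s")
```

With these placeholder values the CPU cores could consume far more data than the LUNs can supply, which is the kind of imbalance that sizing is meant to remove.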
  • It’s all about …. SIZING
  • One SHOE does not FIT ALL
  • Transaction processing simplifies and accelerates data capture for accurate business decisions. Analysis leads to optimized business processes and improved performance. Data warehousing enables a common data model for a single version of the truth.
  • Data Warehouse Scope (diagram). Layers: Supporting Systems, BI Data Storage Systems, Presentation Layer Systems. Labeled components: Integration Services ETL, Analysis Services Cubes, Reporting Services, SharePoint Services, Microsoft Office SharePoint, PerformancePoint, Web Analytic Tools, Dedicated SAN / Storage Array, Data Warehouse, Data Staging and Bulk Loading, Presentation Data, Data Path. The data warehouse scope is shown with a dashed outline.
  • Data Warehouse Scenarios
      • No longer exclusive to large enterprises and specialist analysts
      • Growth of affordable self-service BI tools such as PowerPivot and Reporting Services has created a DW requirement for smaller businesses and individual departments
  • Microsoft Data Warehousing Offerings
      • Enterprise
          o Scalable and reliable SMP platform for data warehousing on any hardware
          o Ideal for data marts or small to mid-sized enterprise data warehouses (EDWs)
          o Software only
          o Scale-Up DW, 10s of terabytes
      • BDW Appliance
          o Scalable and reliable platform for data warehousing on any hardware
          o Ideal for large data marts or mid-sized EDWs
          o Integrated Appliance (Software and Hardware)
          o Scale-Up DW, <5 terabytes
      • Fast Track Data Warehouse RA
          o Reference architectures offering best price performance for data warehousing
          o Ideal for data marts or small to mid-sized data warehouses with scan-centric workloads
          o Reference Architectures (Software and Hardware)
          o Scale-Up DW, 5–80 terabytes
      • Parallel Data Warehouse
          o Appliance for high end MPP Data Warehousing delivering highest scalability and performance
          o Ideal for high scale or high performance data marts and EDWs
          o DW Appliance (Fully integrated Software and Hardware)
          o Scale-Out DW with MPP, 10s - 100s of TB
      • Support options listed: Software Assurance; Premier Mission Critical Support; 3-Year Support Plus 24; Mission Critical Advantage Program
  • Microsoft Data Warehouse Offerings (seven configurations compared; values listed left to right)
      • Effort to Build: Very High | Very Low | Moderate | Moderate | Moderate | Moderate | Very Low
      • Capacity: Variable | 5 TB | 14 TB | 20 TB | 40 TB | 40 TB | 500 TB
      • Concurrency: Variable | Light | Light | Medium | Medium | High | Very High
      • Query Complexity: Variable | Medium | Medium | Medium | Medium | High | Very High
  • Business Data Warehouse Appliance
  • Business Data Warehouse Appliance
      • Complete
          o Hardware + Software + Services
          o Pre-tuned, pre-configured, pre-installed. Turn on and go!
          o Single point of contact for support
      • Optimized
          o Specifically for small to medium data warehouse workload
          o Designed for performance, energy efficiency, and value by HP and Microsoft's best engineers
          o Security and reliability built in
      • Agile
          o Deploy in hours/days, not in months
          o Easy to use through built-in dedicated tools to load and manage your data warehouse
          o Designed for up to 5 TB data warehouses
          o Fast Track 3.0 compliant, license path to Fast Track
  • Scenarios
      • Small/Departmental Data Warehouse
      • Spoke in EDW Hub and Spoke Architecture
  • Reference Architectures
  • Fast Track Data Warehouse Components
      • Software:
          o SQL Server 2008 R2 Enterprise
          o Windows Server 2008 R2
      • Configuration guidelines:
          o Physical table structures
          o Indexes
          o Compression
          o SQL Server settings
          o Windows Server settings
          o Loading
      • Hardware:
          o Tight specifications for servers, storage and networking
          o 'Per core' building block
  • SQL Server Parallel Data Warehouse
  • SQL Server Parallel Data Warehouse
      • Tier-1 Enterprise Data Warehouse Appliance Offering
          – High scalability from tens to hundreds of terabytes
          – High performance through the MPP system
      • Flexibility and Choice
          – Choice of deployment options through distributed architecture
      • Most Comprehensive Solution
          – Complete data warehouse solution spanning desktop, enterprise data warehouse, and data marts
  • CONTROL NODE
      • Client connections always go through the control node
      • Contains no persistent user data
      • Parallel Data Warehouse advantages:
          o Processes SQL requests
          o Prepares execution plan
          o Orchestrates distributed execution
      • Local SQL Server processes final query plan and aggregates results
      • Provided by DataDirect:
          o Open Database Connectivity (ODBC), Object Linking and Embedding Database (OLE DB), Java Database Connectivity (JDBC), and ActiveX® Data Objects (ADO.NET) client drivers
          o Wire protocol (SequeLink)
          o Drivers are available for 32 bits and 64 bits
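The control node's role, orchestrating distributed execution and then aggregating results, follows a scatter-gather pattern. The Python sketch below imitates that pattern with threads standing in for compute nodes. It is a toy model under stated assumptions, not how Parallel Data Warehouse is implemented: `NODE_DATA`, `compute_node_step`, and `control_node_query` are hypothetical names, and the query it mimics is a simple `SUM(amount) GROUP BY region`.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-node slices of a distributed fact table: (region, amount).
NODE_DATA = [
    [("east", 10), ("west", 5)],
    [("east", 7), ("north", 3)],
    [("west", 8), ("north", 2)],
]


def compute_node_step(rows):
    """Partial aggregate over one node's local rows only."""
    partial = Counter()
    for region, amount in rows:
        partial[region] += amount
    return partial


def control_node_query(node_data):
    """Fan the step out to every node, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(node_data)) as pool:
        partials = list(pool.map(compute_node_step, node_data))
    final = Counter()
    for partial in partials:
        final.update(partial)  # Counter.update adds counts together
    return dict(final)


result = control_node_query(NODE_DATA)
print(result)
```

Each node only ever touches its own slice; the final merge is the only step that sees data from more than one node, which mirrors the slide's point that the control node holds no persistent user data.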
  • MANAGEMENT NODE
      • Provides support and patching for the appliance
      • Holds image for re-deployment of compute node
      • Holds Active Directory
  • LANDING ZONE
      • Provides high-capacity storage for data files from ETL processes
      • Is available as a sandbox for other applications and scripts that run on the internal network
      • Provides SQL Server Integration Services
      • Load path (diagram): Source Files → Landing Zone → Data Loader (DWLoader or SQL Server Integration Services) → Compute Nodes
  • BACKUP NODE
      • Provides integrated backup solution
      • Integrates with 3rd party backup option
      • Orderable in different sizes
  • Data Rack
      • Servers: 10 active + 1 passive HP ProLiant DL360 G7 compute nodes; InfiniBand, FC and Ethernet switching; 42U rack
      • Expansion: Grow from 1–4 data racks, storage options, test/dev system
      • Storage: 10x HP StorageWorks MSA P2000 G3
      • Consists of COMPUTE NODES and STORAGE NODES
  • COMPUTE NODE
      • Data Rack
          o Servers: 10 active + 1 passive HP ProLiant DL360 G7 compute nodes; InfiniBand, FC and Ethernet switching; 42U rack
          o Expansion: Grow from 1–4 data racks, storage options, test/dev system
          o Storage: 10x HP StorageWorks MSA P2000 G3
      • Each MPP node is a highly tuned symmetric multi-processing (SMP) node with standard interfaces
      • Provides dedicated hardware, database, and storage
      • Runs SQL Server
      • Spare node provides failover in case of node failure
      • Drives are configured as RAID 1
  • PDW – Client Connectivity (diagram). Labeled connections into the appliance: Client Drivers, Support/Patching, ETL Load Interface, Corporate Backup Solution
  • PDW – Query Processing (diagram)
  • Data Layout Approaches
      • Replicated: A table structure exists as a full copy within each discrete Parallel Data Warehouse node.
      • Distributed: A table structure is hashed on a single column and uniformly distributed across all nodes on the appliance. Each distribution is a separate physical table in the database management system (DBMS).
      • Ultra Shared-Nothing: Provides the ability to design a schema of both distributed and replicated tables to minimize data movement between nodes.
          o Small sets of data can be more efficiently stored in full (replicated).
          o Certain set operations (such as single-node operations) are more efficient against full sets of data.
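The difference between the replicated and distributed layouts can be shown with a few lines of Python. This is a minimal sketch under assumptions: the slides do not say which hash function the appliance uses, so `zlib.crc32` is only a deterministic stand-in, and `NODE_COUNT`, `node_for`, `distribute`, and `replicate` are hypothetical names chosen for the example.

```python
import zlib

NODE_COUNT = 4  # hypothetical number of compute nodes


def node_for(key, node_count=NODE_COUNT):
    """Map a distribution-key value to a node with a deterministic hash.

    crc32 is a stand-in; the appliance's real hash is not described here.
    """
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def distribute(rows, key, node_count=NODE_COUNT):
    """Distributed layout: each row lives on exactly one node."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row[key], node_count)].append(row)
    return nodes


def replicate(rows, node_count=NODE_COUNT):
    """Replicated layout: every node holds a full copy."""
    return [list(rows) for _ in range(node_count)]


fact_rows = [{"customer_id": i % 5, "amount": i} for i in range(20)]
dim_rows = [{"customer_id": i, "name": f"customer-{i}"} for i in range(5)]

fact_nodes = distribute(fact_rows, "customer_id")
dim_nodes = replicate(dim_rows)

for n, (fact, dim) in enumerate(zip(fact_nodes, dim_nodes)):
    print(f"node {n}: {len(fact)} fact rows, {len(dim)} dimension rows")
```

Because all rows with the same `customer_id` hash to the same node and the small dimension table is present everywhere, a join between the two can run on each node without moving rows, which is the point of the ultra shared-nothing design.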
  • Ultra Shared-Nothing Architecture
      • Extends Traditional Shared-Nothing Design
          o Pushes shared-nothing architecture into the SMP node; there is IO and CPU affinity within SMP nodes. This eliminates contention for user queries and uses full resources for each user query.
          o Provides multiple physical instances of tables: distributes large tables, replicates small tables
          o Redistributes rows as needed
      • Provides Fault Tolerance
          o All hardware components have redundancy (including CPUs, disks, networks, power, and storage processors)
          o Control and compute nodes use failover clustering
          o Management nodes have active and standby states
  • Administrative Console
      • Dashboard
      • Query activity
      • Load activity
      • Backup and restore
      • Active locks
      • Active sessions
      • Alerts
      • Appliance state
      • URL: https://controlnodeipaddress
  • Parallel Data Warehouse Configuration Manager
      • Appliance topology
      • Services status
      • Network configuration
      • Privileges
  • Flexible Business Alignment
      • Parallel database copy technology enables rapid data movement and consistency between EDW and data marts
      • Supports user groups with very different service-level agreements (SLAs):
          o Performance
          o Capacity
          o Loading
          o Concurrency
      • Create SQL Server 2008 R2, Fast Track Data Warehouse, and SQL Server Analysis Services data marts
      • A distributed architecture gives you the flexibility to add or change diverse workloads or user groups while maintaining data consistency across the enterprise
  • Distributed Data Warehouse Architectures (diagram). Central EDW Hub with Landing Zone, connected to: Departmental Reporting, Regional Reporting, High-Performance Reporting, Mobile Applications, Regional Reporting with Business Decision Appliance, Third-Party RDBMS, Third-Party ETL Tools, Data Integration
  • Determining the Right Solution
      • What is the workload?
          o Number of concurrent users
          o Query complexity
          o Query mix
          o Load processing
          o Performance requirements
      • What is the customer looking for in a solution?
          o Simplicity in the appliance
          o 100 percent compatibility with SQL Server 2008 R2
          o Enterprise scalability
          o Economical hardware
          o Incremental expansion and high availability by default
  • Parallel Data Warehouse
      • Value to Customer
          o Enterprise-class scalability to hundreds of terabytes
          o High performance
          o Interoperability with leading BI products
          o Mission critical support and maintenance
          o Mature SQL Server platform with high security and robust engineering process
          o Strong data warehouse vision and roadmap that includes industry-leading technologies
      • Supporting Features
          o MPP with ultra shared-nothing architecture
          o Distributed query optimization
          o Balanced hardware with pre-tested and pre-tuned appliances optimized for data warehousing
          o Third-party product integration (for example, MicroStrategy, Business Objects, and Informatica)
          o Mission critical support and maintenance
          o Road map includes column store, petabyte scalability, real-time data warehousing, MDM,