Leader in Data Warehouse Appliances

  • That’s exactly what the Netezza TwinFin is in the data warehousing and analytics world: a true appliance, which sets it apart from the competition. It is engineered from the ground up for data warehousing and analytics, and offers a complete solution that integrates database, server and storage. It supports standard interfaces such as ODBC, JDBC and ANSI SQL, making it very easy to deploy: it takes 2 days to get up and running, versus weeks for other solutions. The appliance characteristics translate into real value for customers (what we refer to as the 4 S’s). TwinFin is 10-100X faster than competitors like Oracle, Teradata and others. When analytic queries take seconds instead of hours to perform, customers get the opportunity to completely rethink their business processes and, in some cases, even launch entirely new businesses. The appliance is unlike anything that DBAs and IT teams have experienced in the past. Whereas Oracle and Teradata data warehouses require armies of specialists to manage, Netezza offers performance out of the box, without requiring any tuning, indexing, aggregations, etc. A single appliance scales to more than a petabyte of user data capacity, not just acting as a repository for information but allowing complex analytics to be conducted at scale, on all the enterprise data. By embedding analytics deep into the data warehouse, TwinFin powers high-performance advanced analytics 100s or even 1000s of times faster than was possible before. Let’s look at some examples of Netezza (true) appliances in real customer environments.
  • We do not have indexes. They are not an option; they simply do not exist. There is no disk administration or SA administration. On day 2, the customer has a pool of performant disk ready. Upgrades are performed by Netezza as a standard maintenance tech support call. Does Oracle help you go from 9i to 10g? Instead of spending time and effort on tedious DBA tasks, use the time for higher business value tasks: bring on new applications and groups, quickly build out new data marts, and provide more functionality to your end users.
  • A company is judged by the company it keeps. Those were just a few examples from over 500 Netezza customers. Our customers span a variety of vertical industries and sizes.
  • Predictability
  • XO Communications offers a variety of communications services including voice over internet protocol (VoIP), data and internet services, network transport, broadband wireless access, and hosted and managed services. Its high capacity IP network and advanced transport network support more than 50 percent of the Fortune 500 and many of the world’s largest telecommunications companies.
  • The new Cruiser will provide even greater scalability for applications such as backup for multiple TwinFins or history for regulatory reporting. At the other end of the scale, the Skimmer is deployed as a development or testing appliance.
  • A key component of Netezza’s performance is the way in which its streaming architecture processes data. The Netezza architecture uniquely uses the FPGA as a turbocharger, a huge performance accelerator that not only allows the system to keep up with the data stream but actually accelerates the data stream through compression before processing it at line rates, ensuring no bottlenecks in the IO path. You can think of the way that data streaming works in the Netezza as similar to an assembly line. The Netezza assembly line has various stages in the FPGA and CPU cores. Each of these stages, along with the disk and network, operates concurrently, processing different chunks of the data stream at any given point in time. The concurrency within each data stream further increases performance relative to other architectures. Compressed data gets streamed from disk onto the assembly line at the fastest rate that the physics of the disk allows. The data could also be cached, in which case it gets served right from memory instead of disk. The first stage in the assembly line, the Compress Engine within the FPGA core, picks up the data block and uncompresses it at wire speed, instantly transforming each block on disk into 4-8 blocks in memory. The result is a significant speedup of the slowest component in any data warehouse: the disk. The disk block is then passed on to the Project engine or stage, which filters out columns based on parameters specified in the SELECT clause of the SQL query being processed. The assembly line then moves the data block to the Restrict engine, which strips off rows that are not necessary to process the query, based on restrictions specified in the WHERE clause. The Visibility engine also feeds in additional parameters to the Restrict engine, to filter out rows that should not be “seen” by a query, e.g. rows belonging to a transaction that is not committed yet.
The Visibility engine is critical in maintaining ACID (Atomicity, Consistency, Isolation and Durability) compliance at streaming speeds in the Netezza. The processor core picks up the uncompressed, filtered data block and performs fundamental database operations such as sorts, joins and aggregations on it. It also applies complex algorithms that are embedded in the snippet code for advanced analytics processing. It finally assembles all the intermediate results together from the entire data stream and produces a result for the snippet. The result is then sent over the network fabric to other S-Blades or the host, as directed by the snippet code.
  • Leader in Data Warehouse Appliances
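The assembly line described in the notes above (uncompress, project, restrict with visibility, then aggregation on the CPU core) can be imitated in software as a chain of generators. This is only a minimal sketch of the idea, not Netezza's implementation: the block format (zlib-compressed JSON), the `committed` flag standing in for transaction visibility, the sample rows and every function name are assumptions made for illustration. The query is the one shown on the "Our Secret Sauce" slide. One simplification is that the project stage here also keeps the columns the WHERE clause needs, so the restrict stage can still evaluate it.

```python
import json
import zlib
from collections import defaultdict

# Hypothetical rows for one slice of MTHLY_RX_TERR_DATA (illustrative data only).
SAMPLE_ROWS = [
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 10, "MONTH": 20091201,
     "MARKET": 509123, "SPECIALTY": "GASTRO", "NOTES": "x", "committed": True},
    {"DISTRICT": "D1", "PRODUCTGRP": "P1", "NRX": 5, "MONTH": 20091201,
     "MARKET": 509123, "SPECIALTY": "GASTRO", "NOTES": "x", "committed": True},
    {"DISTRICT": "D1", "PRODUCTGRP": "P2", "NRX": 7, "MONTH": 20091201,
     "MARKET": 509123, "SPECIALTY": "GASTRO", "NOTES": "x", "committed": False},
    {"DISTRICT": "D2", "PRODUCTGRP": "P1", "NRX": 3, "MONTH": 20091101,
     "MARKET": 509123, "SPECIALTY": "GASTRO", "NOTES": "x", "committed": True},
    {"DISTRICT": "D2", "PRODUCTGRP": "P1", "NRX": 4, "MONTH": 20091201,
     "MARKET": 509123, "SPECIALTY": "GASTRO", "NOTES": "x", "committed": True},
]


def make_blocks(rows, rows_per_block=2):
    """Pack rows into compressed 'disk blocks' (assumed storage format)."""
    return [
        zlib.compress(json.dumps(rows[i:i + rows_per_block]).encode("utf-8"))
        for i in range(0, len(rows), rows_per_block)
    ]


def uncompress(blocks):
    """Stage 1: expand each compressed block into its rows."""
    for block in blocks:
        yield from json.loads(zlib.decompress(block))


def project(rows, columns):
    """Stage 2: keep only the columns the query needs."""
    for row in rows:
        yield {name: row[name] for name in columns}


def restrict(rows, predicate):
    """Stage 3: drop invisible (uncommitted) rows and rows failing the WHERE clause."""
    for row in rows:
        if row["committed"] and predicate(row):
            yield row


def aggregate(rows, group_by, measure):
    """CPU stage: sum the measure for each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[name] for name in group_by)] += row[measure]
    return dict(totals)


def run_snippet(blocks):
    """select DISTRICT, PRODUCTGRP, sum(NRX) ... where MONTH, MARKET, SPECIALTY match."""
    needed = ["DISTRICT", "PRODUCTGRP", "NRX", "MONTH", "MARKET", "SPECIALTY", "committed"]

    def where(row):
        return (row["MONTH"] == 20091201 and row["MARKET"] == 509123
                and row["SPECIALTY"] == "GASTRO")

    stream = restrict(project(uncompress(blocks), needed), where)
    return aggregate(stream, ["DISTRICT", "PRODUCTGRP"], "NRX")
```

Because each stage is a generator, a row moves through the whole chain before the next one is read, which loosely mirrors the streaming behaviour in the notes; the real stages run concurrently in hardware, which this single-threaded sketch does not attempt to show.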

    1. 1. 14 June 2011, Warsaw, Sheraton Warsaw Hotel. Dai Clegg: Leader in Data Warehouse Appliances
    2. 2. The IBM Netezza Data Warehouse Appliance: faster, simpler, more accessible analytics
    3. 3. The IBM Netezza Appliance: Revolutionizing Analytics What is Netezza?
    4. 4. The IBM Netezza Appliance: Revolutionizing Analytics. Purpose-built analytics engine; Integrated database, server & storage; Standard interfaces; Low total cost of ownership; Speed: 10-100x faster than traditional systems; Simplicity: Minimal administration; Scalability: Peta-scale user data capacity; Smart: High-performance advanced analytics
    5. 5. IBM Netezza Appliance Overview: Customers; Appliance Simplicity; Appliance Architecture; Advanced In-database Analytics; Summary
    6. 6. Appliance Simplicity
    7. 7. Managing the Netezza Appliance: No software installation; No storage administration; No database tuning. Less DBA drudgery, more applications
    8. 8. The Netezza Appliance – Loading. Data In: OLE-DB, JDBC, ODBC, SQL. Data Integration: Ab Initio, Business Objects/SAP, Composite Software, Expressor Software, GoldenGate Software (Oracle), Informatica, IBM Information Server, Sunopsis (Oracle), WisdomForce
    9. 9. The Netezza Appliance – Querying. Data Out: OLE-DB, JDBC, ODBC, SQL. Reporting & Analysis: Actuate, Business Objects/SAP, Cognos (IBM), Information Builders, Kalido, KXEN, MicroStrategy, Oracle OBIEE, QlikTech, Quest Software, SAS, SPSS (IBM), Unica (IBM)
    10. 10. Simple to Deploy and Operate. Operations: Simply load and go; Installation to Business Value in ~2 days. BI Developers: No configuration, indexes or tuning; out of the box performance. ETL Developers: Faster load and transformation times; simpler ETL logic & in-database transformation. Business Analysts: Lower latency; load & query simultaneously; True ad hoc queries
    11. 11. Customer Success
    12. 12. Digital Media; Financial Services; Government; Health & Life Sciences; Retail / Consumer Products; Telecom; Other
    13. 13. Speed: 15,000 users running 800,000+ queries per day; 50X faster than before. “…when something took 24 hours I could only do so much with it, but when something takes 10 seconds, I may be able to completely rethink the business process…” - SVP Application Development, Nielsen. Source: http://www.youtube.com/watch?v=yOwnX14nLrE&feature=player_embedded
    14. 14. Simplicity: Up and running 6 months before having any training; 200X faster than Oracle system; ROI in less than 3 months. “Allowing the business users access to the Netezza box was what sold it.” Steve Taff, Executive Dir. of IT Services [Graphic labels: MONTHS, WEEKS, DAYS]
    15. 15. Scalability: 1 PB on Netezza; 7 years of historical data; 100-200% annual data growth. “NYSE … has replaced an Oracle 10 relational database with a data warehousing appliance from Netezza, allowing it to conduct rapid searches of 650 terabytes of data.” ComputerWeekly.com. Source: http://www.computerweekly.com/Articles/2008/04/14/230265/NYSE-improves-data-management-with-datawarehousing.htm
    16. 16. Smart: Predicts what shoppers are likely to buy in future visits; Coupon redemption rates as high as 25%. “Because of (Netezza’s) in-database technology, we believe we’ll be able to do 600 predictive models per year (10X as many as before) with the same staff.” Eric Williams, CIO and executive VP
    17. 17. Appliance Architecture
    18. 18. IBM Netezza True Appliance Architecture [Diagram labels: client platforms Solaris, AIX, Tru64, HP-UX, Windows, Linux; Source Systems; ETL Server; DBA CLI; 3rd Party Apps; SQL; DATA; High Performance Data Loader; Database, Server and Storage, each with CACHE and I/O]
    19. 19. IBM Netezza True Appliance Architecture: Database, Server, Storage - in one [Diagram labels: client platforms Solaris, AIX, Tru64, HP-UX, Windows, Linux; Source Systems; ETL Server; DBA CLI; 3rd Party Apps; ODBC 3.X; JDBC Type 4; SQL-92; SQL-99; Analytics; High Performance Loader; CACHE; I/O]
    20. 20. IBM Netezza True Appliance Architecture. Optimized Hardware+Software: Purpose-built for high performance analytics; requires no tuning. Streaming Data: Hardware-based query acceleration for blistering fast results. True MPP: All processors fully utilized for maximum speed and efficiency. Deep Analytics: Complex analytics executed in-database for deeper insights
    21. 21. Appliance family for data life-cycle management: From a few terabytes to 10s of petabytes
    22. 22. Massively Parallel Processing
    23. 23. IBM Netezza True Appliance Massively Parallel Processing™ [Diagram labels: client platforms Solaris, AIX, Tru64, HP-UX, Windows, Linux; Source Systems; ETL Server; DBA CLI; 3rd Party Apps; High Performance Loader; ODBC 3.X, JDBC Type 4, OLE-DB, SQL/92; SMP Host: Front End, SQL Compiler, Query Plan, Optimize, Admin, Execution Engine, High-Speed Loader/Unloader, DBOS; Network Fabric; S-Blades 1, 2, 3 … 960, each with Processor & streaming DB logic; High-Performance Database Engine: Streaming joins, aggregations, sorts; Massively Parallel Intelligent Storage]
    24. 24. IBM Netezza True Appliance Massively Parallel Processing™ [Diagram: same architecture as slide 23, with the added labels “SQL” and “Snippets 1 2 3”]
    25. 25. Our Secret Sauce. Query: select DISTRICT, PRODUCTGRP, sum(NRX) from MTHLY_RX_TERR_DATA where MONTH = 20091201 and MARKET = 509123 and SPECIALTY = GASTRO [Diagram: Slice of table MTHLY_RX_TERR_DATA (compressed); FPGA Core: Uncompress, Project (select DISTRICT, PRODUCTGRP, sum(NRX)), Restrict, Visibility (where MONTH = 20091201 and MARKET = 509123 and SPECIALTY = GASTRO); CPU Core: Complex ∑, Joins, Aggs, etc. (sum(NRX))]
    26. 26. IBM Netezza True Appliance Massively Parallel Processing™ [Diagram: same architecture as slide 23, with the added label “Consolidate”]
    27. 27. Advanced Analytics
    28. 28. Advanced Analytics the Netezza Way / Advanced Analytics the Traditional Way: 1) Extract data into analytic workbench or grid 2) Develop model 3) Test model 4) Score model against whole database 5) Debug by discarding and iterating from step 1. Phase I: speed up the investigation & extract with the Netezza warehouse [Diagram labels: Data Warehouse; Analytics Grid; SAS, SPSS; R, S+; C/C++, Java, Python, Fortran, …; Demand Forecasting; Fraud Detection; ETL; SQL]
    29. 29. Advanced Analytics the Netezza Way [Diagram labels: Analytics Grid; Data; SAS, SPSS; R, S+; C/C++, Java, Python, Fortran, …; Demand Forecasting; Fraud Detection; ETL; SQL]
    30. 30. Advanced Analytics the Netezza Way: complex analytics (SAS, SPSS, R, Java, etc); implicit parallelism; petabyte scalability; appliance simplicity. Phase II: Move model functionality into the Netezza warehouse; Using its built-in analytic libraries; Save costs and improve analyst efficiency [Diagram labels: SAS, SPSS; R, S+; SQL; Demand Forecasting; Fraud Detection]
    31. 31. Summary
    32. 32. The IBM Netezza Appliance: Purpose-built analytics engine; Integrated database, server & storage; Standard interfaces; Low total cost of ownership. Digital Media; Financial Services; Government; Health & Life Sciences; Retail / Consumer Products; Telecom; Other
    33. 33. Thank You
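
The Massively Parallel Processing slides (23 to 26) show SQL arriving at the host, snippets going out to the S-Blades, and a consolidation step back on the host. A rough sketch of that scatter-and-gather pattern follows. It is an illustration under stated assumptions rather than Netezza's actual mechanism: the hash-based distribution on `MARKET`, the thread pool standing in for S-Blades, the simplified query (sum of `NRX` per `DISTRICT`) and all function names are hypothetical.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, key, n_slices):
    """Assign each row to a data slice by hashing a column (assumed scheme)."""
    slices = [[] for _ in range(n_slices)]
    for row in rows:
        index = zlib.crc32(str(row[key]).encode("utf-8")) % n_slices
        slices[index].append(row)
    return slices


def snippet(slice_rows):
    """Work done next to one slice: a partial sum of NRX per DISTRICT."""
    partial = Counter()
    for row in slice_rows:
        partial[row["DISTRICT"]] += row["NRX"]
    return partial


def consolidate(partials):
    """Host-side step: merge the partial results from every slice."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return dict(total)


def run_query(rows, n_slices=4):
    """Scatter the snippet over all slices in parallel, then gather."""
    slices = distribute(rows, "MARKET", n_slices)
    with ThreadPoolExecutor(max_workers=n_slices) as pool:
        partials = list(pool.map(snippet, slices))
    return consolidate(partials)
```

The answer does not depend on how many slices the rows are spread over, because a sum can be computed per slice and then added up; that property is what lets the work be pushed out to many processors at once.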
