
IBM PureData System for Analytics


Published in: Technology, Business

  1. 1. IBM PureData System for Analytics JuanMa Rebés Business Development IBM Systems & Solutions Arrow Spain
  2. 2. Planet Earth 2014: Innovation wanted, needed and possible
  3. 3. S&P 500 in 1957 (year 1 for the index) S&P 500 in 2010 (year 54 for the index) Of the 500 S&P companies in 1957, only 14% remained on the 2010 S&P 500 list. Innovate or Die – Nothing New Under The Sun. Source: S&P Guide 2010 and S&P Library for 1957 data
  4. 4. Technology Is the Driving Force Shaping the Future
  5. 5. Major Waves of Technology
  6. 6. What Is Driving IT Demand?
  7. 7. Big Data is All Data from Everywhere
  8. 8. Intelligence Everywhere: Efficiency & Innovation
  10. 10. The mobile sensor market exploded (red line) between 2007 (introduction of the iPhone) and 2012, from 10M sensors to 3.5B. This explosion wasn’t anticipated by market research companies. Currently, they forecast growth to about 20B units in 5 years, while many visionary organizations see continued market explosion to trillions of sensors. The TSensors Summit vision of reaching a trillion sensors in a decade needs 56%/y growth.
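The "56%/y" figure on slide 10 is a compound annual growth rate. A minimal sketch of that arithmetic follows. The helper name `required_growth` is hypothetical, and the starting unit count in the usage line is an assumption for illustration only, since the slide does not state which base the 56% was computed from.

```python
def required_growth(start_units: float, target_units: float, years: int) -> float:
    """Compound annual growth rate needed to go from start_units to target_units."""
    if start_units <= 0 or target_units <= 0 or years <= 0:
        raise ValueError("all inputs must be positive")
    return (target_units / start_units) ** (1.0 / years) - 1.0


# Illustrative only: the 10 billion starting count is an assumption,
# not a number taken from the slide.
rate = required_growth(start_units=10e9, target_units=1e12, years=10)
print(f"required growth: {rate:.1%} per year")
```

The result is sensitive to the assumed starting count, so a different base year gives a noticeably different rate.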
  11. 11. 90% of mobile users keep their device within arm’s reach 100% of the time. 57% of CEOs using Social to Connect with Customers. 8 zettabytes of digital content created by 2015.
  16. 16. Businesses are “dying of thirst in an ocean of data”: 80% of the world’s data today is unstructured; 90% of the world’s data was created in the last two years; 1 trillion connected devices generate 2.5 quintillion bytes of data per day
  17. 17. Businesses are “dying of thirst in an ocean of data”: 1 in 2 business leaders don’t have access to data they need; 83% of CIOs cited BI and analytics as part of their visionary plan; top performers are 2.2X more likely to use business analytics; 80% of the world’s data today is unstructured; 90% of the world’s data was created in the last two years; 1 trillion connected devices generate 2.5 quintillion bytes of data per day
  18. 18. Uncertainty of New Information is Growing Alongside its Complexity
  19. 19. Today’s big data challenges for analytics are increasing demands on data systems. Increasing Variety of data requires new techniques. Increasing Velocity of data requires higher performance. Increasing Volume of data requires growing capacity. 35 ZB by 2020 (50x, 2010 to 2020). Millions of transactions per second. Telco subscriber activity logging. Billions of devices & sensors. Smart Meters, RFIDs, GPS… Mobile, Cloud, Social, Big Data, Commerce, Analytics
  20. 20. Smarter Analytics should be your goal. CIOs rank analytics as the #1 factor contributing to an organization’s competitiveness [1]. Organizations that embrace analytics are more than 2X as likely to outperform their peers [2]. Financial outperformers are 64% more likely to use analytics to evaluate talent supply and demand on an ongoing basis [3]. Enterprises that apply advanced analytics have 33% more revenue growth and 12X more profit growth [4]. Sources: [1] IBM CIO Study 2009; [2] IBM IBV/MIT Sloan Management Review Study 2011; [3] IBM CHRO Study 2010; [4] IBM CFO Study 2010
  21. 21. Achieve Smarter Analytics by Using all Types of Analytics Against All Types of Data Smarter Analytics Operational reports Analytic reports Statistical Analysis Forecasting Predictive Modeling Optimization Social Analytics Web Analytics Alerts Transactional & Application Data Machine & Sensor Data Social Data Enterprise Content Web Data Documents Spatial Analysis
  22. 22. How do you achieve that? There are many challenges to overcome. Do you face these challenges → Then you need a platform that provides: Difficulty adding new data or analytic capability → Increased Agility; Lack of analytical insight → Accelerated Time to Value; Growing data volume, variety and velocity → Tools for gaining insight from Big Data; Complicated system lifecycles → Reduced Complexity; Administration complexity → Simplicity; Growing costs of IT → Increased Efficiency; Broad spectrum of workload and SLA requirements → Fit for Purpose Solutions
  23. 23. Built-In Expertise Makes This as Simple as an Appliance  Dedicated device  Optimized for purpose  Complete solution  Fast installation  Very easy operation  Standard interfaces  Low cost
  24. 24. Analytics without constraint PureData for Analytics – Where Big Data Meets Deep Analytics
  25. 25. BI / Reporting Exploration / Visualization Functional App Industry App Predictive Analytics Content Analytics Analytic Applications IBM Big Data Platform Systems Management Application Development Visualization & Discovery Accelerators Information Integration & Governance Hadoop System Stream Computing PureData System for Analytics Data Warehouse
  26. 26. Built-in Expertise  No indexes or tuning  Data model agnostic  Fully parallel, optimized In Database Analytics Integration by Design  Server, Storage, Database in one easy to use package  Automatic parallelization and resource optimization to scale economically  Enterprise-class security and platform management Simplified Experience  Up and running in hours  Minimal ongoing administration  Standard interfaces to best of breed Analytics, BI, and data integration tools  Built-in analytics capabilities allow users to derive insight from data quickly  Easy connectivity to other Big Data Platform components IBM PureData System for Analytics The Simple Appliance for Serious Analytics
  27. 27. System for Analytics Delivering data services for analytics IBM PureData System for Analytics Optimized exclusively for analytic data workloads Speed  10-100x faster than traditional custom systems*  Patented MPP hardware acceleration (Massively Parallel Processing) Simplicity  Data load ready in hours  No database indexes  No tuning  No storage administration Scalability  Peta-scale data capacity Smart  Designed to run complex analytics in minutes, not hours  Richest set of in-database analytics * Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
  28. 28. Spend Less Time Managing and More Time Innovating Simplicity and Ease of Administration  No dbspace/tablespace sizing and configuration  No redo/physical/Logical log sizing and configuration  No page/block sizing and configuration for tables  No extent sizing and configuration for tables  No Temp space allocation and monitoring  No RAID level decisions for dbspaces  No logical volume creations of files  No integration of OS kernel recommendations  No maintenance of OS recommended patch levels  No JAD sessions to configure host/network/storage Data Experts, not Database Experts  Easy Administration Portal  No software installation  No indexes and tuning  No storage administration
  29. 29. BI Reporting and Ad-Hoc Analysis  What happened?  When and where?  How much? Predictive Analytics  What will happen?  What will the impact be? Optimization  What is the best choice? PureData System for Analytics Takes Analytics Beyond Reporting © 2012 IBM Corporation
  30. 30. Simplify Move analytics into the Data Warehouse – Integrate the server, storage and database into one optimized package – Move complex analytics into the database – Integrated, high performance analytics within the data warehouse Server Storage Database Analytics
  31. 31. PureData System for Analytics N2002-002 Hardware Overview  User Data Capacity: 32 TB*  Data Scan Speed: 72 TB/hr*  Load Speed (per system): 1+ TB/hr  Power Requirements: 3.2 kW  Cooling Requirements: 10,850 BTU/hr * Assuming 4X compression 2 Hosts (Active-Passive)  2 6-Core Intel Sandy Bridge CPUs  7x300 GB SAS Drives  Red Hat Linux 6 64-bit 2 Disk Enclosures  48 600 GB SAS2 Drives  Using RAID 1 Mirror  40 for User Data, 4 Spares, 4 for S-Blades 2 PureData for Analytics S-Blades™  2 Intel 8 Core 2+ GHz CPUs  2 8-Engine Xilinx Virtex-6 FPGAs  128 GB RAM + 8 GB slice buffer  Linux 64-bit Kernel  Faster, next generation CPUs  Smaller, faster drives for increased performance & faster failover  2X as many spares for better resiliency  S-Blades contain faster, next generation CPUs and FPGAs
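The capacity and scan-speed figures on slide 31 can be combined into a rough full-scan time. The sketch below only divides the two quoted numbers (both stated under the slide's 4X compression assumption). The helper name `full_scan_minutes` is hypothetical, and real query times would depend on workload factors the slide does not describe.

```python
def full_scan_minutes(user_data_tb: float, scan_tb_per_hr: float) -> float:
    """Minutes needed to scan user_data_tb at a sustained rate of scan_tb_per_hr."""
    if scan_tb_per_hr <= 0:
        raise ValueError("scan rate must be positive")
    return user_data_tb / scan_tb_per_hr * 60.0


# Figures quoted on the slide: 32 TB user data capacity, 72 TB/hr scan speed.
minutes = full_scan_minutes(user_data_tb=32, scan_tb_per_hr=72)
print(f"Full scan of a completely filled system: about {minutes:.0f} minutes")
```

Under those figures, scanning a completely filled N2002-002 would take a little under half an hour.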
  32. 32. What Makes PureData System for Analytics Different? Speed Simplicity Scalability Smart Up to 2000X faster than before Growing by 30% every month “Netezza has allowed us to reduce the complexity of regulatory reporting and processing of exchange data from days down to minutes.” 200X faster than Oracle system ROI in less than 3 months Up and running 6 months before having any training “Allowing the business users access to the Netezza box was what sold it.” - Steve Taff, Executive Dir. of IT Services 1 PB on Netezza 7 years of historical data 100-200% annual data growth “NYSE … has replaced an Oracle IO relational database with a data warehousing appliance from Netezza, allowing it to conduct rapid searches of 650 terabytes of data.” - SUNY Buffalo researchers reduced the time to perform quintillions of computations from 27 hours to 12 minutes “Once we had the data on Netezza we were able to do the same analysis and much more complex analysis in minutes. The research draws on medical records, lab results, MRI scans, and patient surveys.” - Dr. Murali Ramanathan, SUNY Buffalo
  33. 33. Concurrency, Performance, I/O efficiency and Manageability Performance Management & Efficiency  Performance improvements in:  Optimizer efficiency  Memory management  Communications protocols  Workload management  Faster, Better, and completely transparent to the end-user  Always improving throughput and concurrency for tactical queries  Up to 200 queries/second micro analytic workloads  Directed Data Processing increases throughput for tactical queries Resiliency and Fault Tolerance  Blade level resilience for continuous high performance  Enhanced automatic system software resilience for enterprise level requirements
  34. 34. IBM Netezza Analytics: Built-In Features and Capabilities BI Tools Visualization Tools PureData for Analytics AMPP Platform Software Development Kit IBM In-Database Analytics 3rd Party In-Database Analytics IBM InfoSphere Streams Tanay GPU Appliance by Fuzzy Logix IBM InfoSphere BigInsights Cloudera Apache Hadoop IBM SPSS SAS Revolution Analytics Eclipse
  35. 35. Data In Loading the PureData System for Analytics Data Integration – Ab Initio – Cloudera – Composite Software – IBM BigInsights – IBM Information Server – IBM InfoSphere Streams – Informatica – Oracle Data Integrator – Oracle GoldenGate – SAP Business Objects SQL, ODBC, JDBC, OLE-DB
  36. 36. Querying the PureData System for Analytics Reporting and Analysis – IBM Cognos – IBM SPSS – IBM Unica – Information Builders – Kalido – KXEN – Microsoft Excel – MicroStrategy – Oracle OBIEE – SAP Business Objects – SAS – Actuate Data Out SQL, ODBC, JDBC, OLE-DB
  37. 37. Netezza Platform Software V7.0 Synergy and Integration with IBM and Partner products  Spatial Support with ESRI inside the box  zLinux support for JDBC and ODBC Connectors for Cognos and InfoServer (DataStage)  Backup/Restore Updated Certifications:  Tivoli Storage Manager  EMC Networker (Legato)  NetBackup (Veritas)  IDAA – DB2 Analytics Accelerator for z/OS  EBCDIC mapping for single byte encoding
  38. 38. IBM Netezza Platform Software (NPS) 7.x Improved Concurrency, Performance, I/O efficiency and Manageability Better Performance Improved Management & Efficiency  Netezza Performance Portal 2.0  More than half a dozen performance improvements in:  Optimizer efficiency  Memory management  Communications protocols  Workload management  Faster, better, and completely transparent to the end-user  NPS 7.x provides much greater throughput for tactical queries vs. NPS 6.x [1]  Directed Data Processing for improved concurrency and better performance  Page Level Zone Maps eliminating unnecessary disk scanning [1] Based on internal benchmarking. NPS 7 refers to IBM Netezza Platform Software v.7 and NPS 6 refers to IBM Netezza Platform Software v.6. NPS is the platform software included with the IBM PureData System for Analytics appliance.
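Slide 38 mentions page-level zone maps that eliminate unnecessary disk scanning. The general idea behind a zone map is to keep the minimum and maximum value of a column for each storage unit and skip any unit whose range cannot satisfy the query predicate. The sketch below illustrates that general technique only. The `Page` class and `scan_range` function are hypothetical names, and nothing here is taken from the NPS implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Page:
    rows: List[int]  # values of one column stored on this page
    lo: int = 0      # smallest value on the page (the zone map entry)
    hi: int = 0      # largest value on the page (the zone map entry)

    def __post_init__(self) -> None:
        self.lo, self.hi = min(self.rows), max(self.rows)


def scan_range(pages: List[Page], low: int, high: int) -> Tuple[List[int], int]:
    """Return (matching values, number of pages read) for low <= value <= high."""
    hits: List[int] = []
    pages_read = 0
    for page in pages:
        if page.hi < low or page.lo > high:
            continue  # min/max range proves no row on this page can match
        pages_read += 1
        hits.extend(v for v in page.rows if low <= v <= high)
    return hits, pages_read


pages = [Page([1, 4, 7]), Page([10, 12, 15]), Page([20, 22, 29])]
hits, pages_read = scan_range(pages, low=11, high=21)
print(hits, pages_read)  # prints [12, 15, 20] 2
```

In this toy data set the first page is skipped without being read. The benefit grows when data is stored roughly in order of the filtered column, because each page then covers a narrow range.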
  39. 39. Simplicity and Ease of Administration • Monitor System Resources • Perform System Administration • Understand & Predict Capacity IBM Netezza Performance Portal 2.0 Consolidating WebAdmin and Portal for Simple Admin – Simple web user interface – Part of the PureData System for Analytics – New functional and usability enhancements – Administrative Functions – Hardware view & alerts – Database objects administration – User & Group management – View active sessions – Workload Management – View Events – Table skew/storage search – Capacity Planning – Monitor enhancements – Usability improvements – resize monitors and mark not-monitored periods – Customer requested improvements – Show locks
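Slide 39 lists a table skew search among the portal's functions. In a parallel system, skew usually means that a table's rows are spread unevenly across the parallel storage units, so the busiest unit sets the pace for a scan. The sketch below assumes that meaning and uses one common metric, the largest unit divided by the mean. The function name `skew_ratio` is hypothetical, and the portal may well report skew differently.

```python
from typing import Sequence


def skew_ratio(rows_per_unit: Sequence[int]) -> float:
    """Largest unit's row count divided by the mean; 1.0 means perfectly even."""
    if not rows_per_unit:
        raise ValueError("need at least one storage unit")
    mean = sum(rows_per_unit) / len(rows_per_unit)
    if mean == 0:
        return 1.0  # empty table: treat as evenly distributed
    return max(rows_per_unit) / mean


even = skew_ratio([1000, 1000, 1000, 1000])
skewed = skew_ratio([3400, 200, 200, 200])
print(f"even: {even:.1f}, skewed: {skewed:.1f}")
```

A ratio well above 1.0 suggests that one unit holds far more than its share, which is the situation a skew search is meant to surface.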
  40. 40. Integrated by Design IBM Netezza Analytics Version 2.0 Netezza In-Database Analytics 2.0  Transformations  Mathematical  Geospatial  Predictive  Statistics  Time Series  Data Mining  No data movement  Analyze deep and wide data  High performance, parallel computation
  41. 41.  Basic Math*  Permutation and Combination*  Greatest Common Divisor and Least Common Multiple*  Conversion of Values*  Exponential and Logarithm*  Gamma and Beta Functions  Matrix Algebra+  Area Under Curve*  Interpolation Methods* Transformations Mathematical Time Series  Linear Regression+  Logistic Regression+  Classification  Bayesian  Sampling  Model Testing  Geospatial Data Type  Geometric Functions  Geometric Analysis Predictive Geospatial * Fuzzy Logix DB Lytix capabilities + Netezza Analytics and Fuzzy Logix DB Lytix capabilities  Data Profiling / Descriptive Statistics+  General Diagnostics  Statistics+  Sampling  Data prep Pre-Built In-Database Analytics  Descriptive Statistics+  Distance Measures*  Hypothesis Testing*  Chi-Square & Contingency Tables*  Univariate & Multivariate Distributions+  Monte Carlo Simulation*  Autoregressive+  Forecasting*  Association Rules+  Clustering+  Feature Extraction+  Discriminant Analysis* Data Mining Statistics
  42. 42. IBM DB2 Analytics Accelerator Now even faster with PureData System for Analytics N200X  NEW! Smaller entry point for z customers to accelerate analytic workloads  Query data at high speeds by combining System z and Netezza hardware-accelerated analytics to speed complex query processing.  Extend the capabilities of DB2 for z/OS to support a cost-effective data warehousing and business intelligence solution.  Lower operating costs by reducing System z disk requirements and offloading query workloads to a high-performance platform.  Cost effective solution
  43. 43. Consolidate 100+ deployments to ONE analytics environment Business Innovation with Analytics on zEnterprise IBM focuses on the business not technical constraints delivering Analytics to 450,000+ users, drawing from over 660 data sources and generating more than 55,000 reports daily. “IBM has reduced end to end information delivery up to 400% and database query response times up to 4700%. Using IBM DB2 Analytics Accelerator, end users now have the information they need to make highly informed decisions orders of magnitude faster across key enterprise business processes including orders, billing; invoicing, purchasing and financial management.” James Correa, Senior Manager, IBM Business Analytics Centre of Competency
  44. 44. IBM’s PureData System for Analytics Provides – FASTEST time to value on the market today – Optimized analytics performance for Big Data – Simple administration for fast and agile deployment – Large library of analytic functions to accelerate analytic performance System for Analytics