IBM PureData System for Analytics


Published in: Technology, Business

  • This session will cover our new PureData System for Analytics, and how it can be used to drive more business value for your organization.
  • Main point: Data is growing at an astounding rate, so fast that we often lack the ability to use it to its full potential. The highly unstructured nature of this data makes the challenge that much more difficult. This is a real problem for business because it makes informed decisions harder to make. Business leaders need a way to find hidden patterns and isolate the valuable nuggets that they need to make business decisions. Further speaking points: Yet the rewards for harnessing the data into useful information are great; 54% of companies in this year’s study with MIT/Sloan are using analytics for competitive advantage, and that number has surged 57% in just the past 12 months. “Dying of thirst in an ocean of data” is an apt analogy. Data is everywhere, and 90% of it didn't exist just two years ago. The vast majority of it is useless for any given goal and therefore amounts to noise that hinders finding the key information needed at a specific time and place. Additional information: See information and stats
  • Today many business people don’t really know what predictive modeling, forecasting, design of experiments, or mathematical optimization mean or do, but over the next 10 years use of these powerful techniques will have to become mainstream, just as financial analysis and computers have, if businesses want to thrive in a highly competitive and regulated marketplace. Executives, managers, and employee teams who do not understand, interpret, and leverage these assets will be challenged to survive. To help organizations understand the opportunity of information and advanced analytics, the MIT Sloan Management Review partnered with the IBM Institute for Business Value to conduct a survey of nearly 3,000 executives, managers, and analysts working across more than 30 industries and 100 countries. Among our key findings: top-performing organizations use analytics five times more than lower performers. Overall, our study found widespread belief that analytics offers value. Half of our respondents said that improvement of information and analytics was a top priority in their organizations, and more than one in five said they were under intense or significant pressure to adopt advanced information and analytics approaches. As a result, there was a 57% increase in the respondents who stated that they believe analytics creates a competitive advantage for them. Executives have long been accustomed to a degree of imprecision and uncertainty when making decisions critical to their growth – and survival. For some companies, their “best guess” was no longer good enough; hard facts were needed. These companies (the ones that use analytics) significantly outperformed their competitors, by 220%. That is the true power of analytics.
  • Smarter analytics can no longer be constrained to operational reports on your point-of-sale data. Analytics in today’s world requires more than just reports, even analytic reports. You need to be able to combine corporate data such as point-of-sale records with spatial data and complex statistical models to run forecasting and other predictive models that will optimize your outcome. In addition, most of today’s data is unstructured or semi-structured. It can come from documents and forms, such as those from your doctor’s office or from court records, or from the machines and sensors throughout your factory, inventory, and supply chain. But today the web is producing more data than any other source. To outperform your competitors, you need to be able to take this data and use social analytics, in combination with your corporate data, to make better decisions. Today’s analytics systems need to access and analyze the data where it exists, in its native form; you should not have to scrape it, summarize it, and then load it into a reporting system before running your predictive models or statistical analysis. This is the power that IBM brings with our Big Data Platform. There is no doubt analytics can transform an industry. In telco, 5 billion subscribers demand personalized offerings that match their lifestyle. The healthcare industry loses $250-$300 billion to healthcare fraud per year, a problem that costs $650 million every day. Retailers miss out on $93 billion in sales every year because they do not have the stock to meet demand. Yet most enterprises struggle to keep up with the demands on their existing analytics performance while managing the demand for new analytic capabilities.
  • We just talked about what you need to do to be at the top of the heap, but how do you do this? If you have already built a data warehouse, you are likely seeing some of the issues on the left. If you chose the wrong database system, or built the system on your own, you are likely finding it difficult to add new data or more analytics. You are probably having trouble meeting your SLAs unless you have an army of DBAs constantly monitoring and tuning your system. And because of the massive increase in data volume, variety, and velocity, administration has become even more complex and data lifecycle management even harder. To overcome these challenges, you need a platform that is more agile and that can be up and running in hours, not days or weeks. This accelerated time to value means the system can start paying for itself right away rather than taking months before you see value from it. PureData systems reduce the risk by delivering proven, integrated, simple systems that are fit for purpose and optimized for that purpose. These PureData systems are the cornerstone for gaining insight from your corporate, spatial, and Big Data.
  • One of the biggest obstacles to building a truly Analytic Enterprise is the technology infrastructure. As user and data volumes grow and the questions organizations ask of the data become much more sophisticated, the technology infrastructure to support this need must evolve as well. Today, organizations have to make a choice between data volume and analytic complexity. Most analytics performed on large data is relatively simple. The infrastructure, based on traditional database technology, easily gets overextended just keeping up with the growth in user and data volumes, leaving no room to accommodate the increasing analytic complexity. On the other hand, most complex analytics is done on small data sets by groups of specialized users on underpowered systems. The analytic complexity overpowers the weak infrastructure, forcing users to limit the amount of data they operate on. Even with relatively small data volumes, users have to resort to all kinds of machinations to get the insights they want from the data. In fact, quants and modelers spend 80% of their time preparing and cleaning data rather than doing actual modeling tasks. With Netezza, organizations no longer have to make a choice between Big Data and Big Math. Netezza’s business since day one has been all about removing bottlenecks from the infrastructure to allow analytics without constraints, delivering powerful business insight. We have now created the scalable analytics infrastructure for GE Corporate Research to drive towards truly massive data volumes, with large numbers of users, asking questions of that data that could not even be contemplated on other architectures.
  • The PureData System for Analytics, powered by Netezza technology, follows the core tenets of IBM’s PureSystems design philosophy. There is no need for tuning or ongoing maintenance, and the analytics are optimized in the same way as regular SQL, so that they run in parallel all the time. Built-in Expertise means no complex tuning or modeling: just load and go. Integration by Design means there is no complex integration work to do to get the system functional. Everything you need to run complex analytics is provided in a system onto which you need only load your data to get value. It is fast out of the box and uses standard interfaces to talk to the outside world. Simplified Experience means a low cost of ownership and minimal maintenance effort. The system includes analytic functions out of the box at no additional cost and integrates easily with other components of the Big Data Platform (Hadoop/BigInsights, Streams, etc.).
  • This is what it really comes down to, and I’ve talked about this earlier: we’re still delivering the same ease of use. You still don’t have to do anything with indexes. There is no storage administration, no software installation, and it comes with an easy administration portal. So we become data experts and analytic experts, not database experts. You don’t have to have DBAs sitting around monitoring the system constantly. One of our big customers is Coach, and Coach has “two of our business users who do administration. We have two because, you know, they go on vacation. So someone’s always around.” They’re business users, not database experts, and the only time they touch the system is when they do a backup and check whether the backup was successful. And you’re going to do that on any system. The only other time they touch the system is when there’s a new business user, somebody leaves the business, or somebody changes applications, and they need to grant or take away the right privileges and authorizations. That’s it. They don’t have to worry about constantly monitoring and tuning it the way that our competitors require their customers to do.
  • Most organizations find themselves in the bottom left box: they are doing basic BI reporting and analysis. Analytic appliances like IBM Netezza enable organizations to move up this value chain to predictive analytics, and ultimately to what I refer to as “the holy grail of analytics”: optimization, which answers the question “what is the best choice I can possibly make for my business given the data that I have?” A recent Nucleus Research study showed that for every $1 that was invested in analytics, the payback was over $10; it is almost like a license to print money. And we find that this is true. An IBM/MIT Sloan Management Review study in 2011 found that organizations that embraced analytics outperformed their industry peers by 220%. Those organizations that are successful in deploying analytics into their business processes will have a strong competitive advantage over those that aren’t.
  • PureData System for Analytics simplifies the experience and delivery of analytics to your organization. PureData System for Analytics integrates the server, storage, database and analytic functions into a dedicated analytics platform that has been designed and built so that it is optimal for these workloads. Rather than moving the data to the analytics like other systems, the PureData System for Analytics has been designed to move all processing, even the complex analytic processing, to the data. PureData System for Analytics is a massively parallel system with a unique hardware acceleration that provides the highest performance analytics, without the need to move the data.
  • This is an overview of the PureData System for Analytics N2002-002. This is the replacement for the TwinFin-3, quarter rack configuration. There are 2 SAS disk enclosures with a total of 48 600 GB drives, and two front-end “hosts”, each with Sandy Bridge processors and seven 300 GB drives. But the real work happens on the S-Blades. In the N2002-002 there are 2 S-Blades, but neither is fully utilized. Each S-Blade is capable of handling 40 drives delivering data at full speed, so even if there is a failure of an S-Blade, the system continues to run at near full speed. These S-Blades have two eight-core CPUs and two eight-engine field-programmable gate arrays (FPGAs), along with 128 GB of memory plus an 8 GB slice buffer. The FPGAs act as an intelligent query filter, automatically discarding the 90-95% of the data that is irrelevant to the query being run. The FPGA does more than that, though: certain types of processing are pushed down to run inside the FPGA, so on-the-fly decompression, projections, restrictions, and visibility lists are all handled within the FPGA. This patented hardware acceleration layer is the secret sauce that provides the breathtaking performance expected from IBM PureSystems. All of this adds up to a system that can handle up to 32 TB of user data, scan data at approximately 72 TB per hour, and load data at about 1 TB per hour, all without any need for indexes, storage management, or tuning.
  • PureData for Analytics enables a wide range of analytic capabilities with a robust analytic ecosystem. There are three pillars to the analytics ecosystem: an SDK, which gives the ability to extend the system in any of a number of supported languages; 3rd party solutions from organizations like SAS, Fuzzy Logix, and of course IBM’s SPSS; and seven key areas of built-in analytic libraries. All of this is surrounded by integration points to traditional big data sources and best-of-breed BI, development, and analytic tools. But no matter what function you call, or write, it will run in parallel, within the database, on all of your data.
  • Some more synergy and integration with IBM and partner products: Spatial support with ESRI inside the box, the same as the rest of the analytic functions, so you can run analytics on your corporate and location-based data at the same time. zLinux support for JDBC and ODBC connectors for Cognos and InfoServer (DataStage), to allow these tools to run on zLinux IFLs. Updated backup/restore certifications: Tivoli Storage Manager, EMC Networker (Legato), and NetBackup (Veritas). IDAA – DB2 Analytics Accelerator for z/OS. EBCDIC mapping for single-byte encoding is now supported, since many z/OS customers use EBCDIC code pages.
  • The Netezza Performance Portal v2.0 is a Web UI tool that is part of the PureData System for Analytics. It allows administrators to: monitor resource utilization, system activity, and system performance over a period of time to understand and predict capacity usage; view, create, and modify database objects and perform basic administration tasks; and review hardware state and issues on the system. In the 2.0 release, we have consolidated Netezza WebAdmin and Portal into one offering, streamlining product and development efficiencies. Existing Netezza or PureData System for Analytics customers can upgrade to NzPortal 2.0 free of charge through a download from FixCentral. Other functionality and enhancements can be seen here, including administrative functions, monitoring enhancements, and some customer-requested improvements.
  • PureData for Analytics includes the most comprehensive library of analytic functions available on any platform today, organized around 7 primary functionality areas. In keeping with IBM Netezza’s philosophy of bringing the analytics to the data, these functions run in-database, requiring no data movement to a separate analytics grid. This allows analysis of all the data for more accurate results. While sampling might be OK for mathematical calculations like an average, you cannot examine someone’s purchasing history and make the “next best offer” based on a sample of data. Consider an example: you are trying to predict what a shopper can be influenced to buy, given a coupon. Let’s say that the shopper has bought the following items in the past month: (1) a topographical map of Alaska, (2) the book “Hiking Alaska”, (3) a tent, (4) a backpack, (5) a sleeping bag, (6) a compass, and (7) a portable GPS. In their current shopping expedition they are buying a pair of hiking boots. Looking at what they are buying now, we might hazard a guess that they are looking to start hiking, but we do not know where, or what else they might need. So, let’s sample their historical purchases and see what we can come up with. Even with a 20% sample (which is much larger than normal) we might retrieve only the tent and compass. We still do not know where they are going, so we might offer them a coupon for a sleeping bag, but they already have one. If the sample had included the book and the backpack instead, we would have an idea they might be going to Alaska, so maybe we should offer them a portable GPS for 20% off. This could be bad in a couple of ways. If the offer is for the same GPS they bought, they are likely to return the one they have and re-buy it, which just cuts into the profit.
If the offer is for a newer, better GPS at a price close to what they just paid, then they may return the old one, or, if they bought it just outside of the 30 day return window, you are likely to have an unhappy customer on your hands. This example shows why it is important to have fast analytics on all of your data, not just a “representative sample”, and this is what you can get by augmenting your EDW with an IBM Netezza data warehouse appliance. Parallelism is automatic because the analytics run on the S-Blades, directly on the data as it comes off the disks. This means no complex parallel programming algorithms. And PureData for Analytics has far more built-in analytics than any of our competitors.
  • The PureData System for Analytics’ library of bundled functions provides capabilities across a wide range of needs, from math and data mining to predictive modeling and geospatial analysis.
  • We also have customers who have been looking at the PureData System for Analytics together with DB2 for System z. With the IBM DB2 Analytics Accelerator (IDAA), you piggyback a PureData System for Analytics on the side of DB2 for z/OS running on z196 or zEnterprise hardware, and you offload the big, complex queries, reports, and analytic functions from the DB2 z system to the PureData System for Analytics. So the N200X will not only be sold as a typical N200X system; it will also be the next-generation DB2 Analytics Accelerator, delivering that same greater capacity, 3X greater performance, greater scan speed, and fewer service calls to these IDAA customers.
  • But what we want to start digging deeper into now is the PureData System for Analytics, and that’s what I’m going to focus on for the rest of this presentation. We’ll talk about what it delivers and how it brings the fastest time to value on the market today. Time to value, again, is how quickly I can get my system up and running, with the analytics, complex analysis, reporting, predictive models, and so on, to gain insight and value from my data. It’s optimized for Big Data analytics performance: speed and ease of use. It’s fast and it’s agile. That means that you don’t have to spend months tweaking your data model, tweaking your schema, creating the right index, or creating the right aggregate. You set it and forget it. You create the database, you load the data, and you immediately start getting value by running those predictive models, your reporting, and your analytic queries. And the major differentiator between this and the other PureData systems, and even more so between PureData System for Analytics and the competitors, is the vast library of analytic functions that is delivered as part of the PureData System for Analytics. The IBM Netezza Analytics capability shines above the competition. IBM Netezza Analytics (INZA) delivers over 150 analytic functions, including statistical calculations, time series analysis, and more. With geospatial calculations, we can correlate transactions to location, and that is becoming huge in today’s world. So that huge library of 150-plus analytic functions dwarfs the competition, and that is the value that this system brings to the market.
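
The sampling argument in the notes above (the hiker's purchase history) can be checked with a short enumeration. This is only an illustrative sketch: it assumes a two-item sample drawn from the seven purchases (the "tent and compass" case), and the helper name `sample_coverage` is hypothetical. It counts how many possible samples would hide the fact that the shopper already owns a GPS.

```python
from itertools import combinations

# Purchase history from the example in the notes.
HISTORY = [
    "Topographical Map of Alaska",
    "Hiking Alaska (book)",
    "Tent",
    "Backpack",
    "Sleeping bag",
    "Compass",
    "Portable GPS",
]


def sample_coverage(history, sample_size, item):
    """Return (samples_containing_item, total_samples) over every
    possible sample of the given size drawn without replacement."""
    samples = list(combinations(history, sample_size))
    hits = sum(1 for s in samples if item in s)
    return hits, len(samples)


hits, total = sample_coverage(HISTORY, 2, "Portable GPS")
missed = total - hits
print(f"{missed} of {total} two-item samples miss the GPS purchase")
```

With the full history (a sample size of 7) every purchase is visible, which is the point the notes make about running analytics on all of the data rather than on a sample.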

    1. 1. IBM PureData System for Analytics. JuanMa Rebés, Business Development, IBM Systems & Solutions, Arrow Spain
    2. 2. Planet Earth 2014: Innovation wanted, needed and possible
    3. 3. Innovate or Die – Nothing New Under The Sun. S&P 500 in 1957 (year 1 for the index) versus S&P 500 in 2010 (year 54 for the index): of the 500 S&P companies in 1957, only 14% remained on the 2010 S&P 500 list. Source: S&P Guide 2010 and S&P Library for 1957 data
    4. 4. Technology Is the Driving Force Shaping the Future
    5. 5. Major Waves of Technology
    6. 6. What Is Driving IT Demand?
    7. 7. Big Data is All Data from Everywhere
    8. 8. Intelligence Everywhere: Efficiency & Innovation
    10. 10. The mobile sensor market exploded (red line) between 2007 (introduction of the iPhone) and 2012, from 10M sensors to 3.5B. This explosion wasn’t anticipated by market research companies. Currently, they forecast growth to about 20B units in 5 years, while many visionary organizations see continued market explosion to trillions of sensors. The TSensors Summit vision of reaching a trillion sensors in a decade needs 56% growth per year. sensors/
    11. 11. 90% of mobile users keep their device within arm’s reach 100% of the time. 57% of CEOs using Social to Connect with Customers. 8 zettabytes of digital content created by 2015.
    16. 16. Businesses are “dying of thirst in an ocean of data”: 80% of the world’s data today is unstructured; 90% of the world’s data was created in the last two years; 1 trillion connected devices generate 2.5 quintillion bytes of data per day.
    17. 17. Businesses are “dying of thirst in an ocean of data”: 1 in 2 business leaders don’t have access to data they need; 83% of CIOs cited BI and analytics as part of their visionary plan; top performers are 2.2X more likely to use business analytics; 80% of the world’s data today is unstructured; 90% of the world’s data was created in the last two years; 1 trillion connected devices generate 2.5 quintillion bytes of data per day.
    18. 18. Uncertainty of New Information is Growing Alongside its Complexity
    19. 19. Today’s big data challenges for analytics are increasing demands on data systems. Increasing Variety of data requires new techniques; increasing Velocity of data requires higher performance; increasing Volume of data requires growing capacity (35 ZB by 2020; 50x from 2010 to 2020). Millions of transactions per second: Telco subscriber activity logging. Billions of devices & sensors: Smart Meters, RFIDs, GPS… Diagram labels: Mobile, Cloud, Social, Big Data, Commerce, Analytics.
    20. 20. Smarter Analytics should be your goal. CIOs rank Analytics as the #1 factor contributing to an organization’s competitiveness [1]. Organizations that embrace analytics are more than 2X as likely to outperform their peers [2]. Financial outperformers are 64% more likely to use analytics to evaluate talent supply and demand on an ongoing basis [3]. Enterprises that apply advanced analytics have 33% more revenue growth and 12X more profit growth [4]. Sources: [1] IBM CIO Study 2009; [2] IBM IBV/MIT Sloan Management Review Study 2011; [3] IBM CHRO Study 2010; [4] IBM CFO Study 2010.
    21. 21. Achieve Smarter Analytics by Using all Types of Analytics Against All Types of Data. Types of analytics: Operational reports, Analytic reports, Statistical Analysis, Forecasting, Predictive Modeling, Optimization, Social Analytics, Web Analytics, Spatial Analysis, Alerts. Types of data: Transactional & Application Data, Machine & Sensor Data, Social Data, Enterprise Content, Web Data, Documents.
    22. 22. How do you achieve that? There are many challenges to overcome. Do you face these challenges? Then you need a platform that provides the matching capability: Difficulty adding new data or analytic capability → Increased Agility; Lack of analytical insight → Accelerated Time to Value; Growing data volume, variety and velocity → Tools for gaining insight from Big Data; Complicated system lifecycles → Reduced Complexity; Administration complexity → Simplicity; Growing costs of IT → Increased Efficiency; Broad spectrum of workload and SLA requirements → Fit for Purpose Solutions.
    23. 23. Built-In Expertise Makes This as Simple as an Appliance: dedicated device; optimized for purpose; complete solution; fast installation; very easy operation; standard interfaces; low cost.
    24. 24. Analytics without constraint. PureData for Analytics – Where Big Data Meets Deep Analytics
    25. 25. IBM Big Data Platform (diagram labels): Analytic Applications (BI / Reporting, Exploration / Visualization, Functional App, Industry App, Predictive Analytics, Content Analytics); Visualization & Discovery; Application Development; Systems Management; Accelerators; Hadoop System; Stream Computing; Data Warehouse; PureData System for Analytics; Information Integration & Governance.
    26. 26. IBM PureData System for Analytics: The Simple Appliance for Serious Analytics. Built-in Expertise: no indexes or tuning; data model agnostic; fully parallel, optimized in-database analytics. Integration by Design: server, storage, and database in one easy-to-use package; automatic parallelization and resource optimization to scale economically; enterprise-class security and platform management. Simplified Experience: up and running in hours; minimal ongoing administration; standard interfaces to best-of-breed analytics, BI, and data integration tools; built-in analytics capabilities allow users to derive insight from data quickly; easy connectivity to other Big Data Platform components.
    27. 27. IBM PureData System for Analytics: delivering data services for analytics, optimized exclusively for analytic data workloads. Speed: 10-100x faster than traditional custom systems*; patented MPP (Massively Parallel Processing) hardware acceleration. Simplicity: data load ready in hours; no database indexes; no tuning; no storage administration. Scalability: peta-scale data capacity. Smart: designed to run complex analytics in minutes, not hours; richest set of in-database analytics. * Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
    28. 28. Spend Less Time Managing and More Time Innovating. Simplicity and Ease of Administration: no dbspace/tablespace sizing and configuration; no redo/physical/logical log sizing and configuration; no page/block sizing and configuration for tables; no extent sizing and configuration for tables; no temp space allocation and monitoring; no RAID level decisions for dbspaces; no logical volume creations of files; no integration of OS kernel recommendations; no maintenance of OS recommended patch levels; no JAD sessions to configure host/network/storage. Data Experts, not Database Experts: easy administration portal; no software installation; no indexes and tuning; no storage administration.
    29. 29. PureData System for Analytics Takes Analytics Beyond Reporting. BI Reporting and Ad-Hoc Analysis: What happened? When and where? How much? Predictive Analytics: What will happen? What will the impact be? Optimization: What is the best choice? © 2012 IBM Corporation
    30. 30. Simplify: Move analytics into the Data Warehouse. Integrate the server, storage and database into one optimized package; move complex analytics into the database; integrated, high performance analytics within the data warehouse. Diagram labels: Server, Storage, Database, Analytics.
    31. 31. PureData System for Analytics N2002-002 Hardware Overview. User Data Capacity: 32 TB*; Data Scan Speed: 72 TB/hr*; Load Speed (per system): 1+ TB/hr; Power Requirements: 3.2 kW; Cooling Requirements: 10,850 BTU/hr (* assuming 4X compression). 2 Hosts (Active-Passive): 2 6-core Intel Sandy Bridge CPUs; 7 x 300 GB SAS drives; Red Hat Linux 6 64-bit. 2 Disk Enclosures: 48 600 GB SAS2 drives; RAID 1 mirroring; 40 for user data, 4 spares, 4 for S-Blades. 2 PureData for Analytics S-Blades™: 2 Intel 8-core 2+ GHz CPUs; 2 8-engine Xilinx Virtex-6 FPGAs; 128 GB RAM + 8 GB slice buffer; Linux 64-bit kernel. Highlights: faster, next generation CPUs; smaller, faster drives for increased performance and faster failover; 2X as many spares for better resiliency; S-Blades contain faster, next generation CPUs and FPGAs.
    32. 32. What Makes PureData System for Analytics Different? Speed, Simplicity, Scalability, Smart. Up to 2000X faster than before; growing by 30% every month: “Netezza has allowed us to reduce the complexity of regulatory reporting and processing of exchange data from days down to minutes.” 200X faster than Oracle system; ROI in less than 3 months; up and running 6 months before having any training: “Allowing the business users access to the Netezza box was what sold it.” - Steve Taff, Executive Dir. of IT Services. 1 PB on Netezza; 7 years of historical data; 100-200% annual data growth: “NYSE … has replaced an Oracle IO relational database with a data warehousing appliance from Netezza, allowing it to conduct rapid searches of 650 terabytes of data.” SUNY Buffalo researchers reduced the time to perform quintillions of computations from 27 hours to 12 minutes: “Once we had the data on Netezza we were able to do the same analysis and much more complex analysis in minutes. The research draws on medical records, lab results, MRI scans, and patient surveys.” - Dr. Murali Ramanathan, SUNY Buffalo
    33. 33. Concurrency, Performance, I/O efficiency and Manageability. Performance; Management & Efficiency: performance improvements in optimizer efficiency, memory management, communications protocols, and workload management; faster, better, and completely transparent to the end-user; always improving throughput and concurrency for tactical queries; up to 200 queries/second for micro-analytic workloads; Directed Data Processing increases throughput for tactical queries. Resiliency and Fault Tolerance: blade-level resilience for continuous high performance; enhanced automatic system software resilience for enterprise-level requirements.
    34. 34. IBM Netezza Analytics: Built-In Features and Capabilities. Diagram labels: PureData for Analytics AMPP Platform; Software Development Kit; IBM In-Database Analytics; 3rd Party In-Database Analytics; BI Tools; Visualization Tools; IBM InfoSphere Streams; Tanay GPU Appliance by Fuzzy Logix; IBM InfoSphere BigInsights; Cloudera; Apache Hadoop; IBM SPSS; SAS; Revolution Analytics; Eclipse.
    35. 35. Data In: Loading the PureData System for Analytics. Data Integration: Ab Initio, Cloudera, Composite Software, IBM BigInsights, IBM Information Server, IBM InfoSphere Streams, Informatica, Oracle Data Integrator, Oracle GoldenGate, SAP Business Objects. Interfaces: SQL, ODBC, JDBC, OLE-DB.
    36. 36. Data Out: Querying the PureData System for Analytics. Reporting and Analysis: IBM Cognos, IBM SPSS, IBM Unica, Information Builders, Kalido, KXEN, Microsoft Excel, MicroStrategy, Oracle OBIEE, SAP Business Objects, SAS, Actuate. Interfaces: SQL, ODBC, JDBC, OLE-DB.
    37. 37. Netezza Platform Software V7.0 Synergy and Integration with IBM and Partner products  Spatial Support with ESRI inside the box  zLinux support for JDBC and ODBC Connectors for Cognos and InfoServer (DataStage)  Backup/Restore Updated Certifications:  Tivoli Storage Manager  EMC Networker (Legato)  NetBackup (Veritas)  IDAA – DB2 Analytics Accelerator for z/OS  EBCDIC mapping for single byte encoding
    38. 38. IBM Netezza Platform Software (NPS) 7.x Improved Concurrency, Performance, I/O efficiency and Manageability Better Performance Improved Management & Efficiency  Netezza Performance Portal 2.0  More than half a dozen performance improvements in:  Optimizer efficiency  Memory management  Communications protocols  Workload management  Faster, better, and completely transparent to the end-user  NPS 7.x provides much greater throughput for tactical queries vs. NPS 6.x [1]  Directed Data Processing for improved concurrency and better performance  Page Level Zone Maps eliminating unnecessary disk scanning [1] Based on internal benchmarking. NPS 7 refers to IBM Netezza Platform Software v.7 and NPS 6 refers to IBM Netezza Platform Software v.6. NPS is the platform software included with the IBM PureData System for Analytics appliance.
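The "Page Level Zone Maps" bullet above refers to a general technique: keep the minimum and maximum of a column for each storage page, and skip any page whose range cannot satisfy the query predicate. The following is a minimal sketch of that general idea only. It is not Netezza's implementation, and the names `Page`, `build_pages`, and `range_scan` are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Page:
    """One storage page plus its zone-map entry (min and max of the column)."""
    rows: List[int]
    lo: int
    hi: int


def build_pages(values: List[int], page_size: int) -> List[Page]:
    """Split a column into fixed-size pages and record min/max per page."""
    pages = []
    for start in range(0, len(values), page_size):
        chunk = values[start:start + page_size]
        pages.append(Page(rows=chunk, lo=min(chunk), hi=max(chunk)))
    return pages


def range_scan(pages: List[Page], lo: int, hi: int) -> Tuple[List[int], int]:
    """Return values in [lo, hi] and the number of pages actually read."""
    matches: List[int] = []
    pages_read = 0
    for page in pages:
        # Zone-map check: the page's [lo, hi] range does not overlap the
        # predicate range, so no row in it can match and it is skipped.
        if page.hi < lo or page.lo > hi:
            continue
        pages_read += 1
        matches.extend(v for v in page.rows if lo <= v <= hi)
    return matches, pages_read


if __name__ == "__main__":
    column = list(range(100))             # sorted data clusters well into pages
    pages = build_pages(column, page_size=10)
    found, pages_read = range_scan(pages, 25, 34)
    print(f"read {pages_read} of {len(pages)} pages, found {len(found)} rows")
```

The benefit depends on how well the data is clustered on the filtered column: with sorted or time-ordered data most pages are skipped, while randomly ordered data gives wide per-page ranges and little pruning.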
    39. 39. Simplicity and Ease of Administration • Monitor System Resources • Perform System Administration • Understand & Predict Capacity IBM Netezza Performance Portal 2.0 Consolidating WebAdmin and Portal for Simple Admin – Simple web user interface – Part of the PureData System for Analytics – New functional and usability enhancements – Administrative Functions – Hardware view & alerts – Database objects administration – User & Group management – View active sessions – Workload Management – View Events – Table skew/storage search – Capacity Planning – Monitor enhancements – Usability improvements – resize monitors and mark not-monitored periods – Customer requested improvements – Show locks
    40. 40. Integrated by Design IBM Netezza Analytics Version 2.0 Netezza In-Database Analytics 2.0  Transformations  Mathematical  Geospatial  Predictive  Statistics  Time Series  Data Mining  No data movement  Analyze deep and wide data  High performance, parallel computation
    41. 41. Pre-Built In-Database Analytics Transformations:  Data Profiling / Descriptive Statistics+  General Diagnostics  Statistics+  Sampling  Data prep Mathematical:  Basic Math*  Permutation and Combination*  Greatest Common Divisor and Least Common Multiple*  Conversion of Values*  Exponential and Logarithm*  Gamma and Beta Functions  Matrix Algebra+  Area Under Curve*  Interpolation Methods* Time Series:  Autoregressive+  Forecasting* Predictive:  Linear Regression+  Logistic Regression+  Classification  Bayesian  Sampling  Model Testing Geospatial:  Geospatial Data Type  Geometric Functions  Geometric Analysis Statistics:  Descriptive Statistics+  Distance Measures*  Hypothesis Testing*  Chi-Square & Contingency Tables*  Univariate & Multivariate Distributions+  Monte Carlo Simulation* Data Mining:  Association Rules+  Clustering+  Feature Extraction+  Discriminant Analysis* Legend: * Fuzzy Logix DB Lytix capabilities; + Netezza Analytics and Fuzzy Logix DB Lytix capabilities
    42. 42. IBM DB2 Analytics Accelerator Now even faster with PureData System for Analytics N200X  NEW! Smaller entry point for z customers to accelerate analytic workloads  Query data at high speeds—by combining System z and Netezza hardware-accelerated analytics to speed complex query processing.  Extend the capabilities of DB2 for z/OS—to support a cost-effective data warehousing and business intelligence solution.  Lower operating costs—by reducing System z disk requirements and offloading query workloads to a high- performance platform.  Cost effective solution
    43. 43. Consolidate 100+ deployments to ONE analytics environment Business Innovation with Analytics on zEnterprise IBM focuses on the business, not technical constraints, delivering analytics to 450,000+ users, drawing from over 660 data sources and generating more than 55,000 reports daily. “IBM has reduced end to end information delivery up to 400% and database query response times up to 4700%. Using IBM DB2 Analytics Accelerator, end users now have the information they need to make highly informed decisions orders of magnitude faster across key enterprise business processes including orders, billing; invoicing, purchasing and financial management.” James Correa, Senior Manager, IBM Business Analytics Centre of Competency
    44. 44. IBM’s PureData System for Analytics Provides – FASTEST time to value on the market today – Optimized analytics performance for Big Data – Simple administration for fast and agile deployment – Large library of analytic functions to accelerate analytic performance