Big Data
Big Data Presentation Transcript

  • Everything You Need to Know About ‘Big Data’, BI and Data Acceleration
    Adrian Westmoreland
    December 2012
  • Safe harbor statement
    The information in this presentation is confidential and proprietary to SAP and may not be disclosed without the permission of SAP. This presentation is not subject to your license agreement or any other service or subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation and SAP's strategy and possible future developments, products and/or platforms directions and functionality are all subject to change and may be changed by SAP at any time for any reason without notice. The information on this document is not a commitment, promise or legal obligation to deliver any material, code or functionality. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. This document is for informational purposes and may not be incorporated into a contract. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent.
    All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
    © 2012 SAP AG. All rights reserved.
  • Agenda
    Over the next 30 minutes:
      • You’ll gain an understanding of Big Data technologies, opportunities and challenges, including how technology innovations are changing the Big Data landscape.
      • You’ll see a brief overview of SAP’s Big Data architecture.
      • You’ll discover how other companies are utilizing Big Data for their benefit.
  • Introduction to “Big Data”
  • SOCIAL
  • The dimensions of Big Data
    VELOCITY: Worldwide digital content will double in 18 months, and every 18 months thereafter. (IDC)
    VARIABILITY: Configuring modern software can be extremely difficult since a good configuration depends (at least) on the hardware environment, the workload, the load intensity, and the target behavior. (MassConf Paper, ACM)
    VALUE: Empowerment of the End User is the goal of Enterprise Software. (SAP)
    VOLUME: In 2005, humankind created 150 exabytes of information. In 2011, 1,200 exabytes will be created. (The Economist)
    VARIETY: 80% of enterprise data will be unstructured, spanning traditional and non-traditional sources. (Gartner)
    VALIDITY: Ensure that the information was created in accordance with complete understanding of the use cases and includes all the other aspects of data quality. (Gartner)
    Data sources shown: Mobile, Inventory, CRM Data, GPS, Emails, Planning, Demand, Tweets, Instant Messages, Opportunities, Speed, Velocity, Customer, Things, Service Calls, Transactions, Sales Orders
  • IDC “Big Data” definition
    IDC’s “Big Data” definition utilizes criteria and steps to determine whether a use case and associated technology and services should be included in the “Big Data” market sizing. These include the following scenarios:
      • Deployments where the data collected is over 100TB (data collected, not stored, accounts for the use of in-memory technology where data may not be stored on a disk)
      • Deployments of ultra-high-speed messaging technology for real-time, streaming data capture, and monitoring
      • Deployments where the data sets may not be very large today but are growing very rapidly at a rate of 60% or more annually
    Next, IDC evaluates whether, for each of the above scenarios, the technology is deployed on scale-out infrastructure. Finally, IDC evaluates whether the deployments include two or more data types or data sources, and/or include high-speed data sources such as click-stream tracking or monitoring of machine-generated data.
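The IDC criteria above can be read as a three-step check. The following is a minimal sketch of that reading, not IDC's own method: the `Deployment` fields and function names are hypothetical, and treating the three steps as all required is an assumption drawn from the wording "next" and "finally".

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical description of a deployment; field names are illustrative."""
    collected_tb: float                  # data collected (not necessarily stored), in TB
    ultra_high_speed_messaging: bool     # real-time streaming capture and monitoring
    annual_growth_rate: float            # 0.60 means 60% growth per year
    scale_out_infrastructure: bool
    data_type_or_source_count: int
    high_speed_sources: bool             # e.g. click-stream or machine-generated data


def matches_a_scenario(d: Deployment) -> bool:
    """Step 1: at least one of the three scenarios applies."""
    return (
        d.collected_tb > 100
        or d.ultra_high_speed_messaging
        or d.annual_growth_rate >= 0.60
    )


def counts_as_big_data(d: Deployment) -> bool:
    """Steps 1-3 combined (assumed to be cumulative)."""
    has_variety_or_speed = d.data_type_or_source_count >= 2 or d.high_speed_sources
    return matches_a_scenario(d) and d.scale_out_infrastructure and has_variety_or_speed


if __name__ == "__main__":
    small_but_growing = Deployment(
        collected_tb=20, ultra_high_speed_messaging=False, annual_growth_rate=0.75,
        scale_out_infrastructure=True, data_type_or_source_count=3,
        high_speed_sources=False,
    )
    print(counts_as_big_data(small_but_growing))
```

Under this reading, a 20TB deployment still qualifies if it grows fast enough, runs on scale-out infrastructure, and combines several data types.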
  • Open Source Big Data – Even More Confused?
    Present: Analytics? Applications? Mobile?
    Process: Azkaban, Oozie, Pig, Hive, Hadoop MapReduce, S4, Storm
    Store: Voldemort, Cassandra, HBase
    Collect: Kafka, Flume, Scribe
  • New storage and processing techniques required
    One end of the spectrum: Real-time queries, High value data, Targeted data read
    Techniques: In-memory, Columnar, Row, Distributed
    Other end of the spectrum: Batch queries, Flexible data sets, All data read
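The contrast between a "targeted data read" and an "all data read" can be illustrated with a toy comparison of row and columnar layouts. This is a sketch in plain Python, assuming the targeted read corresponds to the columnar layout. The sample records, field names and the "values touched" counter are all hypothetical, and it does not describe how any particular database stores data.

```python
from typing import Dict, List, Tuple

# Hypothetical sales records; field names are illustrative only.
ROWS: List[Dict[str, object]] = [
    {"order_id": 1, "region": "West", "amount": 120.0},
    {"order_id": 2, "region": "East", "amount": 75.5},
    {"order_id": 3, "region": "West", "amount": 310.0},
]


def to_columns(rows: List[Dict[str, object]]) -> Dict[str, list]:
    """Pivot a list of records into one list per field (a columnar layout)."""
    columns: Dict[str, list] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_from_rows(rows: List[Dict[str, object]], field: str) -> Tuple[float, int]:
    """Row layout: model each record as being read in full to reach one field."""
    total = 0.0
    values_touched = 0
    for row in rows:
        values_touched += len(row)
        total += row[field]
    return total, values_touched


def sum_from_columns(columns: Dict[str, list], field: str) -> Tuple[float, int]:
    """Columnar layout: only the requested column is read."""
    column = columns[field]
    return sum(column), len(column)


if __name__ == "__main__":
    print(sum_from_rows(ROWS, "amount"))
    print(sum_from_columns(to_columns(ROWS), "amount"))
```

Both functions return the same total, but the row version counts every field of every record, while the columnar version counts only the values in the one column it needs.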
  • In-Memory computing: Rethink
    Yesterday (CPU, Memory, Disk):
      • Software and data reside on HDD
      • I/O constraint
      • Support many platforms
      • Optimized for none
    Today (Multi-Core CPU, Memory, Disk):
      • Massively parallel, single optimized platform
      • 64-bit address space supports 2TB RAM
      • 100GB/s throughput
      • Row and Column Store, Partitioning, No aggregates, Insert Only on Delta, Compression
      • Logging and Backup: Solid State / Flash / HDD
      • Leverage latest advances in hardware
      • Minimize I/O time
      • Optimized for x86 platform
  • The future of database technology
    Cost per Terabyte        Disk          Memory
    1990                     $9,000,000    $106,000,000
    2012                     $60           $4,900
    1990: Traditional Database Adoption. 2012: In-memory Computing Adoption.
    Falling prices move processing from Disk/SSD to In-Memory over time.
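A short arithmetic sketch of the price change implied by the slide's figures. It uses only the four cost-per-terabyte numbers quoted above; the function names are illustrative.

```python
# Cost-per-terabyte figures as quoted on the slide (US dollars).
COST_PER_TB = {
    1990: {"disk": 9_000_000, "memory": 106_000_000},
    2012: {"disk": 60, "memory": 4_900},
}


def price_drop(medium: str) -> float:
    """How many times cheaper a terabyte became between 1990 and 2012."""
    return COST_PER_TB[1990][medium] / COST_PER_TB[2012][medium]


def memory_premium(year: int) -> float:
    """How many times more a terabyte of memory cost than a terabyte of disk."""
    return COST_PER_TB[year]["memory"] / COST_PER_TB[year]["disk"]


if __name__ == "__main__":
    for medium in ("disk", "memory"):
        print(f"{medium}: {price_drop(medium):,.0f}x cheaper per TB")
    for year in (1990, 2012):
        print(f"{year}: memory costs {memory_premium(year):.1f}x disk per TB")
```

On these figures, a terabyte of memory in 2012 costs far less than a terabyte of disk did in 1990.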
  • The future of database technology
    Main memory reference                              100 ns
    Compress 1K bytes with Zippy                     3,000 ns  =   3 µs
    Send 2K bytes over 1 Gbps network               20,000 ns  =  20 µs
    SSD random read                                150,000 ns  = 150 µs
    Read 1 MB sequentially from memory             250,000 ns  = 250 µs
    Round trip within same datacenter              500,000 ns  = 0.5 ms
    Read 1 MB sequentially from SSD*             1,000,000 ns  =   1 ms
    Disk seek                                   10,000,000 ns  =  10 ms
    Read 1 MB sequentially from disk            20,000,000 ns  =  20 ms
    Send packet Canada->Europe->Canada         150,000,000 ns  = 150 ms
    *Assuming ~1GB/sec SSD
    Data by Jeff Dean. Originally by Peter Norvig.
  • The future of database technology
    Let's multiply all these durations by a billion:
    Hour:
      Main memory reference                 100 s        Brushing your teeth
      Compress 1K bytes with Zippy          50 min       One episode of a TV show (including ad breaks)
    Day:
      Send 2K bytes over 1 Gbps network     5.5 hr       From lunch to end of work day
    Week:
      SSD random read                       1.7 days     A normal weekend
      Read 1 MB sequentially from memory    2.9 days     A long weekend
      Round trip within same datacenter     5.8 days     A vacation
      Read 1 MB sequentially from SSD       11.6 days    A European vacation
    Year:
      Disk seek                             16.5 weeks   A semester in university
      Read 1 MB sequentially from disk      7.8 months
      The above two together                1 year
    Decade:
      Send packet Canada-Europe-Canada      4.8 years    Average time it takes to complete a bachelor's degree
  • SAP HANA – Overview
    What is SAP HANA?
      • A flexible, data source agnostic in-memory analytic appliance to quickly process and analyze large volumes of transactional data in real-time
      • A modern platform that serves as the foundation to develop a new class of real-time applications
      • In-Memory Database that runs under SAP NetWeaver BW for a supercharged data warehouse
    Diagram labels: SAP BusinessObjects BI Solutions, SAP Applications, SAP NetWeaver BW, SAP HANA™, SAP HANA Studio, SAP Information Composer, SAP HANA Database (Calculation Engine, Row & Column In-Memory), Real-Time Data Replication, SAP BusinessObjects Data Integrator, SAP Applications, Non SAP Data sources
  • Next generation SAP Real-time Data Platform
    Consumers: SAP Business Suite, SAP NW BW, SAP Big Data Applications, SAP Custom Apps, SAP Analytics, SAP Mobile Apps, 3rd Party BI Clients
    On Premise / Cloud
    SAP Real-time Data Platform: Open Developer APIs and Protocols; Sybase PowerDesigner; Common Landscape Management; Common Modeling; SAP Sybase SQLA; MPP Scale-Out; SAP Sybase ASE; SAP Sybase IQ; SAP HANA Platform; SAP Sybase ESP
    SAP Enterprise Information Management: SAP Sybase Replication Server, SAP Data Services, SAP MDG and MDM
  • Introducing SAP Big Data Processing Framework
    Provide optimized data management across each phase of the information lifecycle process and deliver real-time, actionable insights.
    Collect:
      • Sybase Replication Server for real-time high value data replication
      • Sybase ESP for collecting stream data
      • SAP BusinessObjects Data Services with Hadoop Connectors for collecting data from disparate sources via batch
    Store:
      • SAP HANA, ASE, or IQ for real-time data store
      • Sybase IQ for near-time data store and multimedia data storage
      • Hadoop for long-term, extended archive
    Process:
      • SAP HANA and Sybase IQ for real-time high value data processing
      • Sybase Event Stream Processor for real-time event data processing
      • Sybase IQ for federated query w/ MapReduce API
      • Hadoop/MapReduce for batch, explorative data processing
    Present:
      • SAP BusinessObjects BI platform to display federated query results across Hadoop and HANA/IQ to provide deep insights (dashboards, visualization, data exploration, predictive analysis, analytic applications, and embedded BI in business applications)
  • Fresh Direct “Our Food is Fresh. Our Customers Are Spoiled”
  • Parking Ticket Optimization
  • McLaren Group Limited
    Automotive Industry (Formula One) – Predict and Transform the outcome of races
    Product: Agile Datamart - POC
    Highlights:
      • 14,000x faster data analysis – from 5 hours to 1 second
      • 99%: predict the outcome of a race
    Business Challenges:
      • Cut costs on expensive data scientists that currently help with the team's data analysts to measure and predict car’s performance
      • Better anticipate, accelerate and differentiate its business from competitors
    Technical Challenges:
      • Turbo-charge both the speed and depth of McLaren’s telemetry technology
      • Process Big Data and act on it rapidly to create the prescriptive intelligence in order to help transform the outcome of races
    Benefits:
      • Real-time analysis of car sensor data – historical data and predictive models
      • Make immediate proactive corrections and avoid costly, dangerous incidents and win the race
      • Provide a technology engine that was integrated, scalable and delivered maximum performance
    “Transforming information into intelligence in real time is a cornerstone for McLaren’s winning formula – and increasingly critical for the future of every company,” Jim Hagemann Snabe, co-CEO, SAP AG
    “Using HANA we can hopefully automate decision making. People have always made decisions based on the data, but we want to get to the point where the system can make the decision,” Stuart Birrell, McLaren CIO
  • SunGard
    Leading software and IT services company
    Highlights:
      • 1 trillion rows of data stored
      • 80 TB of compressed data
    Business Challenges:
      • Enabling the building of newer and larger systems – allowing expansion into new markets and business areas
    Technical Challenges:
      • Handle very large and continuously growing volumes of data without performance degradation
      • Existing system began to experience performance deterioration that was unacceptable to end-users
    Benefit:
      • Slashes query response time regardless of data volumes
      • Enables analytics and reporting against virtually unlimited data
    “SAP Sybase IQ is simple to manage and operate and it’s enabling us to easily build really big systems in a way that is cost-effective, manageable and sustainable… It doesn’t matter what we throw at it, it seems to take it in stride and give us a great response… We feel like it’s a solution that will carry us forward into uncharted territory. We see no limit to how far we can go with it.” Product Architect, SunGard
  • SAP HANA + Hadoop + R
    Highlights:
      • 408,000x faster than traditional disk-based systems in PoC
      • 216x faster DNA analysis results - from 2-3 days to 20 minutes
    Benefits:
      • Reduces time to detect variant DNA
      • In-memory accelerates predictive & correlation analysis
      • Optimized treatment plans based on DNA mutations
      • Long-term study of DNA-based cancer treatment
    “Genomic DNA analysis in real-time will transform how we enable comprehensive patient care to fight against cancer. SAP HANA will be the mission critical and reliable data platform to make real-time cancer analytics into a reality. Separately, our internal technical comparison demonstrated that SAP HANA outperforms a traditional disk-based system by factor of 408,000 when performing other types of data analysis.” Yukihisa Kato, Director & Executive Officer, CTO, Research and Development Center, MITSUI KNOWLEDGE INDUSTRY CO., LTD.
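The "216x" figure can be checked against the quoted "2-3 days to 20 minutes" with simple arithmetic. The sketch below assumes the speedup is the ratio of elapsed times; the helper name is illustrative.

```python
MINUTE = 60
DAY = 24 * 60 * MINUTE


def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio of elapsed time before to elapsed time after."""
    return before_seconds / after_seconds


if __name__ == "__main__":
    print(speedup(3 * DAY, 20 * MINUTE))  # upper end of the quoted range
    print(speedup(2 * DAY, 20 * MINUTE))  # lower end of the quoted range
```

Under that assumption, 216x corresponds to the 3-day end of the range. Starting from 2 days, the ratio is 144x.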
  • Thank You!
    Adrian Westmoreland
    SAP Canada
    Adrian.westmoreland@sap.com
    604 647 8343
  • SAP BusinessObjects BI 4.0 and SAP HANA
  • SAP HANA
    A platform for a new class of real-time analytics and applications
    Consumers: SAP NetWeaver Business Client, SAP BusinessObjects solutions, Microsoft Excel, Others… (Open)
    Real-time analytics; Real-time apps
    SAP HANA: Information Composer & Modeling Studio; Application Services (e.g. HTML 5 Server); Text Search; Planning and Calculation Engine; Predictive Analysis & Business Function Libraries; R & Hadoop integration; In-memory database services
    Data provisioning: Real-time replication; Data services
    Sources: SAP Business Suite; Third-party systems
  • Today's World
    Flow: Real Life Business Transaction → Real-time posting into Transactional System (OLTP) → Batch transfer to Data Warehouse / Marts (OLAP) → Aggregation → Reporting → Analysis and Insight → Action
    Challenges:
      • Limited flexibility due to pre-defined data structures
      • Long query run-times
      • Loss of detail
      • Long wait times for reports
      • Large Volumes, High Impact
  • What if this would all happen real-time?
    SAP HANA IN-MEMORY: No Aggregation / No Data Staging / No Data Marts
    Flow: Real Life Business Transaction → Real-time Loading into SAP HANA → High Performance Large Volume Data Processing → Fast, flexible and detail analytics over large volumes → Analysis and Insight → Action
  • Accelerated BI with SAP BusinessObjects and SAP HANA
    One Unified and Complete BI Suite Addressing the Full Spectrum of BI on SAP HANA
    Discovery and Analysis (Discover. Predict. Create.):
      • Discover areas to optimize your business
      • Adapt data to business needs
      • Tell your story with beautiful visualizations
    Dashboards and Apps (Build Engaging Experiences):
      • Deliver engaging information to users where they need it
      • Track key performance indicators and summary data
      • Build custom experiences so users get what they need quickly
    Reporting (Share Information):
      • Securely distribute information across your organization
      • Give users the ability to ask and answer their own questions
      • Build printable reports for operational efficiency
  • Discovery and Analysis
    Discover. Predict. Create.
    Agility for business analysts and business users:
      • Discover trends, outliers and areas of interest in your business
      • Adapt to business scenarios by combining, manipulating, and enriching data
      • Tell your story with self-service visualizations and analytics
      • Forecast and predict future outcomes
    Portfolio: Visual Intelligence, Explorer, Analysis, Predictive Analysis
  • Dashboards and Apps
    Build Engaging Experiences
    Build engaging, visual dashboards:
      • Powerful environment to build interactive and visually appealing analytics
      • Rich set of controls: buttons, list boxes, drop-down, crosstabs, charts…
      • Use custom code to extend and build workflows
    Portfolio: Design Studio, Dashboards (aka Xcelsius®)
  • Reporting
    Share Information
    High productivity design for report designers:
      • Quickly build formatted reports on any data source
      • Securely distribute reports both internally and externally
      • Minimize IT support costs by empowering end users to easily create and modify their own reports
      • Enhance custom applications with embedded reports
    Portfolio: Web Intelligence, Crystal Reports
  • BI 4 Platform: Open, Agnostic, and Unified
    Access any data, consume information anywhere
    Consumption: Embedded Content, MS Office, Enterprise Portals, On Demand Services, Mobile Devices, Browsers
    BI clients: Discovery and Analysis, Dashboards and Apps, Reporting
    Business Intelligence Platform
    Universe Semantic Layer
    Data sources: ERP, EDW, Personal, Unstructured