Steve Jenkins - Business Opportunities for Big Data in the Enterprise


Presentation by Steve Jenkins of MapR Technologies at our Big Data breakfast conference.

Published in: Technology, Business

  • McKinsey:
  • MapR forms the nexus, both on-site and cloud (Google, AWS)
  • From Big Data via information and signal to insights: this requires the right tool set and skills!
  • MapR has been selected by two of the companies most experienced with MapReduce technology, which is a testament to the technology advantages of MapR’s distribution. Amazon, through its Elastic MapReduce (EMR) service, hosted over 2 million clusters in the past year. Amazon selected MapR to complement EMR as the only commercial Hadoop distribution offered, sold, and supported as a service by Amazon to its customers. Google, the pioneer of MapReduce and the company whose white paper on MapReduce inspired the creation of Hadoop, has also selected MapR to make our distribution available on Google Compute Engine. Hadoop in the cloud makes a great deal of sense: the elastic resource allocation that cloud computing is premised on works well for cluster-based data processing infrastructure used for varying analyses and data sets of indeterminate size. MapR has unique features, such as mirroring between sites and multi-tenancy support, that further enhance cloud deployments.

    1. Steve Jenkins, MapR Technologies: ‘Business Opportunities for Big Data in the Enterprise’
    2. Big Data in the Enterprise. Steve Jenkins, VP EMEA, MapR Technologies (© MapR Technologies - Confidential)
    3. Business Value
    4. Changing landscape
        • 90% of digital data created in the last two years
        • 2.7 zettabytes in 2012, with 7.9 zettabytes predicted for 2015
            – 1 zettabyte = 1,000 exabytes or 1 billion terabytes
        • 6 billion phone subscriptions
            – 87% of the world population
        • 1.011 billion Facebook users
            – 604 million users log in from mobile devices monthly
        • 400 million tweets a day
            – 84 million access by mobile
    5. Too much data? Retail industry
        • 39%: infrequent collection, not fast enough
        • 42%: could not link data at individual level
        • 45%: not using data effectively
        • Only sample 10% of data
    6.
        • “The use of big data will become a key basis of competition and growth for individual firms. In most industries, established competitors and new entrants alike will leverage data-driven strategies to innovate, compete, and capture value from deep and up-to-real-time information.” – McKinsey & Company
        • “The size, complexity of formats and speed of delivery exceeds the capabilities of traditional data management technologies” – Gartner
        • “The bringing together of a vast amount of data from public and private sources, combined with the intuition of business and thought leaders and the speed and affordability of today's computers, is what Big Data is all about.” – IDC
    7. Across all verticals, typical use cases
        • Logistics
        • Fraud
        • Loyalty programmes
        • Sentiment analysis
        • ETA calculations
        • Customer insight
        • Gene sequencing
        • Operations
    8. Biggest challenges
        • 80% Finding talent
        • 72% Training and education
        • 76% Identifying the correct tools
        • 32% Siloed data, non-cooperation
        • Identifying the correct resources and use case
            – Data Scientist
            – Analytical capability
        Based on 300 interviews by Infochimps
    10. From Big Data to Insights
    11. General Observations
        • Analytics becoming a critical component in business environments
        • Base decisions on data
        • Work with existing applications
        • Principle: keep all data around, benefit from all data
            – Human generated
            – Machine generated
        • Pioneered at Google and Amazon
    12. Hadoop Growth
    13. The Hadoop Ecosystem
    14. Case studies – unlocking the power of Big Data
        • Financial Services (customer insights, fraud detection, etc.)
        • Global Telecommunications - Data Warehouse Augmentation
        • Petroleum - Trade, Logistics & Transportation
        • Retail application
    15. Case study – Credit Card Company
        [Diagram labels: Credit card transactions; MapR Big Data Platform; Fraud model; Fraud detection; Fraud investigation tool; Fraud investigator; Recommendation table; Personalized offers; Queries on IT logs]
    16. Arrival of Big Data Impacts Data Warehouse
        • Volume: prohibitively expensive storage costs
        • Variety: inability to process unstructured formats
        • Velocity: faster arrival and processing needs
        • How can a Data Warehouse leverage Big Data?
    17. Case study – Data Warehouse Augmentation
        • Problem:
            – Major telecom vendor
            – Key step in billing pipeline handled by data warehouse (EDW)
            – EDW at maximum capacity
            – Multiple rounds of software optimization already done
        • Revenue-limiting (= career-limiting) bottleneck
        • Solution: Use MapR to off-load ETL processes that don’t fit EDW capabilities
    18. Data Warehouse Augmentation
        [Diagram: Current ETL Pipeline compared with Hybrid Solution Pipeline. Source: Billing Systems. Stages: Extract, Transform, Clean, Conform, Normalize, Present, Access. Platforms labeled: Teradata, DataStage, Hadoop]
    19. Results of TCO Evaluation
        • CapEx: Cost avoidance for annual Teradata adds
        • Storage: 20x storage good for next 5 years
        • Cost: 100x cost reduction
        • Scale Out Architecture: New nodes can be added on the fly
        • No Disruption: Hybrid solution ensures no change to upstream/downstream business systems

        Solution Technology               5-Yr TCO
        Existing Teradata                 $66,950,000
        New Hybrid: Teradata + Hadoop     $33,000,000
        Total Cost Savings                $33,950,000

        One-time Hadoop investment of ~$6.5M provides $33.9M cost savings
    20. Use case – Geo-spatial & time series dashboarding
        • Data sources
            – stock transactions
            – vessel positions
            – weather
        • Goal: provide aggregated overview + drill-down capability in a dashboard
        • Batch-generated overview (Hadoop’s MapReduce, HBase/M7)
        • Interactive, ad-hoc drill-down (HBase/M7, Apache Drill)
    21. Use case – Geo-spatial & time series dashboarding
        [Diagram labels: batch-generated overview; interactive, ad-hoc drill-down; storage and access at scale; dashboard]
    22. Combine Different Data Sources
        [Diagram labels: POS/Online Data; Retail purchase info; Streaming writes to Hadoop; Hadoop; Real-time offers]
    23. MapR
    24. MapR Distribution for Apache Hadoop
        • Complete Hadoop distribution
        • Comprehensive management suite
        • Industry-standard interfaces
        • Combines open source packages with enterprise-grade dependability
        • Higher performance
    25. MapR: The Enterprise Grade Distribution
    26. MapR Supports Broad Set of Customers
        • Log analysis
        • HBase
        • Customer targeting
        • Social media analysis
        • Customer Revenue Analytics
        • ETL Offload
        • Advertising exchange analysis and optimization
        • Clickstream Analysis
        • Quality profiling/field failure analysis
        • Enterprise Grade Platform
        • COOP features
        • Monitoring and measuring online behavior
        • Fraud Detection
        • Channel analytics
        • Recommendation Engine
        • Fraud detection and Prevention
        • Customer Behavior Analysis
        • Brand Monitoring
        • Customer targeting
        • Viewer Behavioral analytics
        • Recommendation Engine
        • Family tree connections
        • Global threat analytics
        • Virus analysis
        • Patient care monitoring
        [Customer labels: Leading Retailer; Global Credit Card Issuer]
        • Intrusion detection & prevention
        • Forensic analysis
    27. Thank You
    28. Industry Leaders Choose MapR in the Cloud
        • Google chose MapR to provide Hadoop on Google Compute Engine
        • Amazon EMR is the largest Hadoop provider in revenue and number of clusters
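
The TCO figures on the "Results of TCO Evaluation" slide can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, the dollar amounts are copied from the slide, and the "savings per dollar invested" ratio is an added convenience metric that the slide does not state. The slide also does not say whether the ~$6.5M Hadoop investment is already included in the hybrid TCO, so the code treats it as a separate, approximate figure.

```python
# Five-year TCO figures quoted on the "Results of TCO Evaluation" slide (USD).
EXISTING_TERADATA_TCO = 66_950_000
HYBRID_TCO = 33_000_000
# The slide says "~$6.5M", so this value is approximate.
HADOOP_ONE_TIME_INVESTMENT = 6_500_000


def tco_savings(existing: int, hybrid: int) -> int:
    """Absolute five-year savings of the hybrid solution over the existing one."""
    return existing - hybrid


def savings_fraction(existing: int, hybrid: int) -> float:
    """Savings expressed as a fraction of the existing five-year TCO."""
    return tco_savings(existing, hybrid) / existing


savings = tco_savings(EXISTING_TERADATA_TCO, HYBRID_TCO)
fraction = savings_fraction(EXISTING_TERADATA_TCO, HYBRID_TCO)
# Hypothetical metric (not on the slide): savings per dollar of Hadoop investment.
ratio = savings / HADOOP_ONE_TIME_INVESTMENT

print(f"Total cost savings: ${savings:,}")
print(f"Reduction in five-year TCO: {fraction:.1%}")
print(f"Savings per dollar of Hadoop investment: {ratio:.1f}")
```

With the slide's numbers, the computed savings match the quoted $33,950,000, which corresponds to roughly half of the existing five-year TCO.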