
When Streaming Becomes Strategic


We’re in the midst of an exciting paradigm shift in how we process event data in real time to better react to business opportunities and risks. To stay ahead of your competition, you need the ability to react to business-critical events as they happen. These critical events come from diverse sources such as social interactions, machine sensors, and customer transactions. How can you understand the meaning and context of the events that ultimately define your business?

Published in: Data & Analytics


  1. © 2016 MapR Technologies When Streaming Becomes Strategic
  2. Today’s Presenters: Jack Norris, SVP – Data Applications (@Norrisjack); Robin Bloor, Chief Analyst and Co-founder (@robinbloor)
  3. Agenda • The data movement issue • Hadoop and the problem of gravity • Streaming architecture in big data • The Event-Insight-Outcome framework • Introduction to MapR Streams • Customer case studies
  4. The Data Movement Issue • It has always been necessary to move data because centralized computing does not scale • Data volumes grow at 50-60% per annum, and hence data has increasing inertia (gravity) • Data is born distributed and its level of distribution will increase with time • Processing data in flight: stream processing is becoming both common and necessary for some applications • Hadoop’s HDFS, the first truly scalable file system, also has scalability limits
  5. The Streaming Dynamic • We no longer process transactions; we process events (click-streams, log files, IoT, etc.) • Batching is a software process used for the sake of efficiency • Batches are becoming micro-batches and streams • Some analytic applications can only work by processing streams • OLTP applications are stream processing of a kind • But fast queries on large data collections require data pools and data lakes with powerful query engines above them • So the “database/data warehouse” does not vanish
  6. What is an Event? An event is an action or occurrence detected by a program. Events can be user actions (such as clicking a link on a web page or selling a stock), sensor actions (such as reading a temperature), or system occurrences (such as a server crash). Examples: • Retail: Item sold, item out of stock, payment accepted, payment rejected, etc. • Telco: Call initiated, call ended, call dropped, etc. • IoT: Temperature reading, pressure reading, moisture level reading, etc. • Healthcare: Vital signs unstable, patient released, patient billed, image taken, etc. • IT: System crash, unauthorized access, login failed, etc. • Automobile: Engine error detected, tyre pressure low, etc.
  7. The Way We Were • This is beginning to fail because it doesn’t scale and it is expensive • The architectural weakness is in the staging and in a central data repository • Staging became a problem because of unstructured data [Diagram labels: Data Warehouse, Data Marts, Transactional Systems, File(s), Data Staging, ETL, Queries]
  8. The Data Lake Concept. Data Lake Applications: • ETL – data acquisition • Data Lineage – for analytic usage • Metadata Discovery – external data (at least) • Metadata Management – the data catalog • Governance – many aspects • Life Cycle Management – to archival or deletion • MDM – business glossary or ontology • ETL – to data engines • Direct Applications. This is far too much work for a single Hadoop instance. [Diagram labels: Collect, Data Prep, Static Data Sources, Data Streams, Data Lake or Hub, Metadata Discovery, Metadata Management, Data Cleansing, Data Lineage, MDM, ETL, Life Cycle Mgt, Govern, ETL, Analytics or BI Apps, Data Warehouse or Big Data DBMS]
  9. The Likely Future • This is a remarkably flexible architecture allowing for both data distribution and concentration • It accommodates streaming, normal application latency, and the high concurrency that occurs with databases • It genuinely scales [Diagram labels: Data Sources, Database Queries, Data Lake Apps, Streaming Apps, Hadoop Instance (×3), Fast Pipe (×2), Urgent Streams]
  10. Event-Insight-Outcome. Increased computing power reduces latencies, compressing time: ● Event = action or transaction ● Insight = analysis ● Outcome = business result. This can move to: ● Event = action ● Insight = predictive analysis of action and response ● Event = transaction ● Insight = analysis of aggregation ● Outcome = different business result. Analytics is gradually becoming part of the business transaction rather than a later activity.
  11. The Event-Insight-Outcome Framework
  12. Events, Insights, and Outcomes – A Framework [Diagram labels: Events, Fast Insights, Historical Perspective, Deep Insights, Real-time Actions, Business Outcomes]
  13. Applying the EIO Framework to a Customer or Vertical. Questions to ask: • What “events” are most relevant? • What business benefits accrue by spotting trends, anomalies, and patterns in real time? • What insights can be gleaned by looking at trends, anomalies, and patterns in a deeper, more historical context? • What business actions should result from these insights?
  14. MapR Streams and the Converged Data Platform
  15. Without a Converged Platform [Diagram labels: Open Source, Database, Streams, Enterprise Storage, Batch Loads, Real Time Apps, Streaming Sources]
  16. The Converged Big Data Platform [Diagram labels: Open Source, Streams, Enterprise Storage, Database, MapR Converged Big Data Platform]
  17. The Converged Big Data Platform [Diagram label: MapR Converged Big Data Platform]
  18. Life with a Converged Platform [Diagram labels: Stream Processing, Bulk Processing, Sources/Apps, Enterprise-Grade Platform Services, Data, Web-Scale Storage, MapR-FS, MapR-DB, MapR Streams, Database, Event Streaming, Global Namespace, High Availability, Data Protection, Self-healing, Unified Security, Real-time, Multi-tenancy]
  19. MapR Streams: Global Pub-sub Event Streaming System for Big Data. “Publish” means writing events to MapR Streams topics. “Subscribe” means reading events from MapR Streams topics. Guaranteed, immediate delivery to all consumers. Tie together geo-dispersed clusters. Worldwide. Standard real-time API (Kafka). Integrates with Spark Streaming, Storm, Apex, and Flink. [Diagram labels: Topic, Stream, Producers, Remote sites and consumers, Batch analytics, Topic Replication, Consumers]
  20. MapR Streams Benefits: Simpler and Faster Architecture • Converged platform with file storage and database reduces data movement, data latency, hardware cost, and administration cost • Event streaming and stream processing in the same cluster enables faster processing • Unified security framework with files and database tables reduces the administration cost of setting up and enforcing security policies • Multi-tenant: topic isolation, quotas, and data placement control allow multiple isolated streaming applications to run on the same cluster, reducing hardware cost and data movement
  21. MapR Streams Benefits • Global data replication enables disaster recovery • One unified view of all data created and distributed • Ingest more events to enable faster insights, and hold on to events longer to enable deeper insights
  22. Use Cases
  23. Altitude Digital
  24. Largest Biometric Database in the World: 1.2B people
  25. Yield Management Optimization: Global Semiconductor Company
  26. National Oilwell Varco (NOV)
  27. Transforming the Health Care Ecosystem. Electronic Medical Records. “The Stream is the System of Record” – Brad Anderson, VP Big Data Informatics [Diagram labels: JSON DB (MapR-DB), Graph DB (Titan on MapR-DB), Search Engine (Elastic-Search)]
  28. Q&A. Engage with us! 1. Whitepaper: When Streaming Becomes Strategic 2. Book: Streaming Architecture – Kafka and MapR Streams 3. Get Answers: MapR Converge Community:
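
The publish/subscribe model on slide 19, applied to events as defined on slide 6, can be illustrated with a small in-memory sketch. This is not the MapR Streams or Kafka API: the `Event`, `Topic`, and `Consumer` classes, their fields, and the `demo` function are all hypothetical names chosen for illustration, and the sketch ignores replication, partitioning, and persistence.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """An action or occurrence detected by a program (hypothetical shape)."""
    source: str    # e.g. "retail", "iot", "it"
    kind: str      # e.g. "item_sold", "temperature_reading"
    payload: dict  # event-specific details


@dataclass
class Topic:
    """Append-only log of events; an in-memory stand-in for a stream topic."""
    name: str
    log: List[Event] = field(default_factory=list)

    def publish(self, event: Event) -> int:
        """'Publish' = write an event to the topic; returns its offset."""
        self.log.append(event)
        return len(self.log) - 1


@dataclass
class Consumer:
    """Reads from a topic and tracks its own position (offset) in the log."""
    name: str
    offset: int = 0

    def poll(self, topic: Topic, max_events: int = 100) -> List[Event]:
        """'Subscribe' = read events this consumer has not yet seen."""
        batch = topic.log[self.offset:self.offset + max_events]
        self.offset += len(batch)
        return batch


def demo():
    topic = Topic("events")
    topic.publish(Event("iot", "temperature_reading", {"celsius": 21.5}))
    topic.publish(Event("iot", "pressure_reading", {"kpa": 101.3}))

    realtime = Consumer("realtime-app")   # polls often
    batch = Consumer("batch-analytics")   # polls rarely

    first = realtime.poll(topic)          # sees the two events so far
    topic.publish(Event("it", "login_failed", {"user": "alice"}))
    second = realtime.poll(topic)         # sees only the new event
    everything = batch.poll(topic)        # sees all three in one read
    return (
        [e.kind for e in first],
        [e.kind for e in second],
        [e.kind for e in everything],
    )
```

Because each consumer keeps its own offset and reading does not remove events from the log, a fast real-time reader and a slow batch reader can share one topic independently. That is one plausible way to read the "faster insights" and "deeper insights" pairing on slide 21, though the deck itself does not describe the mechanism.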