Virdata: lessons learned from the Internet of Things and M2M Cloud Services @ IBM Big Data Developers Meetup

Presentation I gave at the IBM Big Data Developers meetup group in San Jose, CA.

There is also a video available of this talk at:
https://www.youtube.com/watch?v=TSt49yPBmW0&t=7m59s

Published in: Technology, Business

  1. Big Data Developers - Virdata, Internet of Things #virdata. Big Data & IoT: lessons learned. Big Data Developers Meetup, San Jose, CA - June 5, 2014. #virdata | @nathan_gs
  2. Who is Technicolor? Domains: ● Media Services ● Entertainment Services ● Connected Home ● Emerging Ventures ● Technology & Innovations. Who We Are: Technicolor, a worldwide technology leader in the media and entertainment sector, is at the forefront of digital innovation. Our world-class research and innovation laboratories and our creative talent pool enable us to lead the market in delivering advanced services to content creators and distributors. We also benefit from an extensive intellectual property portfolio focused on imaging and sound technologies, supporting our thriving licensing business.
  3. Virdata – OUR CORE CLOUD SERVICES: Device Monitoring, Device Management, Big Data Analytics, Big Data Queries, Application Monitoring, Virdata Cloud APIs (diagram labels: MQTT)
  4. Virdata - 2 COMPONENTS: A CLOUD & A LIBRARY ★ Elastic and scalable cutting-edge technologies ★ APIs for different types of information/data consumption ★ Cloud agnostic through self-built monitoring tools ★ Running on both public & private cloud infrastructure ★ Bi-directional messaging ★ High performance brokers architecture ★ Lightweight and portable library ★ Multiple programming languages ★ Supports multiple transport protocols ★ Available for all HW and OS ★ Supports any type of data in any format/syntax ★ Payload is compressed and encrypted
  5. Virdata - SERVICE ARCHITECTURE: millions of simultaneous persistent bi-directional connections; millions of messages per second; Real-time Complex Event Processing; Distributed Pub/Sub Messaging; Historical Data Archiving; Pre-computed Data; In-Memory real-time Data; REST API; Launch Queries - Launch Jobs; INTEGRATION CUSTOMIZATION NOC, OPERATIONS, MGMT REPORTS, TRENDS ANALYTICS
  6. Virdata - VERTICAL INDUSTRIES. AUTOMOTIVE: ● Fleet Management ● Insurance ● Emergency Services. UTILITIES: ● Remote Meter Management ● Monitor Energy Consumption ● Optimize Subscription Plan. CONSUMER ELECTRONICS: ● Monitoring & Management ● Upsell Services ● Enhanced End User Experience. CUSTOMER CARE: ● Monitor Device & Application ● One Button Care ● Call Avoidance. RETAIL: ● Geo-location Based Adverts ● Heat Mapping ● Individualized Offering. HEALTH: ● Promote Patient Independence ● Time-Series Analysis ● Pro-active Responses
  7. Live Demo: Contact us for a live demo at info@virdata.com or virdata.com.
  8. Connected “Things”
  9. Huge variety in devices and OSs.
  10. Virdata Client Libraries
  11. APIs
  12. Northbound and Southbound API. Northbound API = Cloud API: ● Messaging API ○ REST ○ PUB/SUB ○ MQTT ○ JMS ● Data Processing API ○ SQL ○ JobAPI ○ Query/REST. Southbound API: provided at the device level.
  13. Integration of Virdata into IBM BlueMix. Objectives: • Show the strengths of the Virdata Internet of Things platform • Scalability to support millions of connected devices • Real-time and historical data processing • Cloud APIs powering new data-driven services across vertical markets • Demonstrate the power of the IBM BlueMix solution • Rapid development and deployment of new applications • Platform as a Service marketplace • Highlight the value of combining both • Internet of Things platform as a service. Use-case: • Virdata provides real-time car data • App acts upon car trouble codes • Invokes manufacturer analytics service • Initiates recommended actions, e.g. through Maximo workflow service • Schedules car dealer appointment • Informs the car driver
  14. Messaging & Broker
  15. Messaging Architecture: Device to Platform (diagram: Protocol Adapters, Kafka, Storm, State, API, Data Processing API)
  16. Messaging Architecture: Device to Device(s) (diagram: Protocol Adapters, Kafka, Storm, State, API, Data Processing API)
  17. Messaging Architecture: Large Fan Out (diagram: Protocol Adapters, Kafka, Storm, State, API, Data Processing API)
  18. Messaging: Horizontally scalable … and elastic as well.
  19. Broker: Persistent connections
  20. Real-time bidirectional communication
  21. Protocol Adapter: MQTT Pub/Sub
  22. Protocol Adapter: MQTT QoS levels. QoS 0: best effort. QoS 1: at least once. QoS 2: exactly once.
  23. Kafka Queues
  24. Storm Messaging
  25. Storm: Message passing
  26. Storm: Stream/message partitioning, as well as grouping.
  27. Storm (cluster diagram): Nimbus, ZooKeeper, and three Supervisor worker nodes, each running three Executors
  28. Storm Tuple and Stream (diagram): a TUPLE is a list of fields (Field 1 | Field 2 | Field 3 | Field 4 | Field 5); a STREAM is a sequence of tuples
  29. Storm Spout and Bolt (diagram: a SPOUT, BOLTs, tuples (T) passed between them, and the API)
  30. Storm Grouping (diagram: a spout (S) and bolts (B), with a GROUPING between stages)
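
The grouping idea on the Storm slides above can be illustrated without Storm itself. What follows is a minimal Python sketch, not Storm's API: it assumes a fields-style grouping can be approximated by hashing one chosen field and taking the result modulo the number of downstream tasks, so tuples that share a key always reach the same task. The function name `fields_grouping`, the `device_id` field, and the task count are hypothetical.

```python
import zlib
from collections import defaultdict


def fields_grouping(tuples, key_field, num_tasks):
    """Route each tuple to a task index using a stable hash of one field."""
    routed = defaultdict(list)
    for t in tuples:
        key = str(t[key_field]).encode("utf-8")
        # crc32 is stable across processes, unlike Python's built-in hash()
        task = zlib.crc32(key) % num_tasks
        routed[task].append(t)
    return dict(routed)


# Hypothetical device messages, used only for illustration
events = [
    {"device_id": "car-1", "speed": 50},
    {"device_id": "car-2", "speed": 80},
    {"device_id": "car-1", "speed": 55},
    {"device_id": "car-3", "speed": 30},
    {"device_id": "car-2", "speed": 82},
]

routed = fields_grouping(events, "device_id", num_tasks=3)
for task, batch in sorted(routed.items()):
    print(task, [t["device_id"] for t in batch])
```

Because the routing depends only on the key, per-device state (for example a running average) can live in a single task without coordination between tasks.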
  31. Data Processing
  32. Events: Before. Events used to manipulate the master data.
  33. Events: After. Today, events are the master data.
  34. Data System: Let’s store everything.
  35. Data System: Data is Immutable.
  36. Data System: Data is Time Based.
  37. Query: The data you query is often transformed, aggregated, ... Rarely used in its original form.
  38. Query: Query = function ( all data )
  39. Batch Layer: Functional computation, based on immutable inputs, is idempotent.
  40. Query: Number of cars living in each city. Input (Car, Location, Timestamp): BMW 1, Antwerp, 2008-10-11; Aston Martin, Cologne, 2010-01-23; BMW 2, Antwerp, 2012-09-12; BMW 1, Cologne, 2014-04-29. Result (Location, Count): Antwerp, 1; Cologne, 2.
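
The "Query = function ( all data )" idea and the cars-per-city example can be worked through in a few lines. This is a sketch under one assumption that the slide implies but does not state: a car "lives" in the location of its most recent event. The function name `cars_per_city` is hypothetical.

```python
from collections import Counter

# The append-only events from the slide: (car, location, timestamp)
events = [
    ("BMW 1", "Antwerp", "2008-10-11"),
    ("Aston Martin", "Cologne", "2010-01-23"),
    ("BMW 2", "Antwerp", "2012-09-12"),
    ("BMW 1", "Cologne", "2014-04-29"),
]


def cars_per_city(all_events):
    """Pure function over all events: count cars by their latest location."""
    latest = {}  # car -> (timestamp, location)
    for car, location, timestamp in all_events:
        # ISO 8601 dates compare correctly as plain strings
        if car not in latest or timestamp > latest[car][0]:
            latest[car] = (timestamp, location)
    return dict(Counter(location for _, location in latest.values()))


view = cars_per_city(events)
print(view)
```

BMW 1 moved from Antwerp to Cologne in 2014, so only BMW 2 still counts for Antwerp, which matches the result on the slide. No event is ever updated in place; the move is just another appended fact.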
  41. Query (diagram: All Data, Precomputed View, Query)
  42. Layered Architecture: Batch Layer, Speed Layer, Serving Layer
  43. Layered Architecture (diagram: Incoming Data, Spark, C*, Query)
  44. Batch Layer
  45. Batch Layer (diagram: Incoming Data, Spark, C*)
  46. Batch Layer: The batch layer can calculate anything, given enough time... Unrestrained computation.
  47. Batch Layer: Keep the data in its original format. The batch layer stores the data normalized; the generated views are often, if not always, denormalized.
  48. Batch Layer: Horizontally scalable.
  49. Batch Layer: Stores a master copy of the data set … append only
  50. Batch Layer: High latency. For now, let’s pretend the update latency doesn’t matter.
  51. Batch Layer
  52. Spark: In-memory storage
  53. Spark: Advanced DAG execution engine. Cyclic data flow, in-memory computing.
  54. Spark: Multilanguage support, interactive shells. Scala, Java & Python.
  55. Spark: Write programs in terms of transformations on distributed datasets. RDDs are collections of objects, stored in RAM or on disk. They are built through parallel transformations and are automatically rebuilt on failure.
  56. Spark API: map, reduce
  57. Spark API: map, filter, groupBy, sort, union, join, leftOuterJoin, rightOuterJoin, count, fold, reduceByKey, groupByKey, reduce, cogroup, cross, zip, sample, take, first, partitionBy, mapWith, pipe, save, ...
  58. Spark Ecosystem: Spark, HDFS, Tachyon, Mesos, Spark Streaming, Shark / Spark SQL, GraphX, MLlib, Mahout, MR v1, BlinkDB, Velox, YARN
  59. Batch Layer: Every iteration produces the views from scratch.
  60. Batch View Databases: We need a (read-only) database to store those views.
  61. Example: the automotive market. Real Time Tracking Engine Block Performance Fleet Management 3rd Party API integration Integration with Informix Big Data Visualization 3rd Party Application Creation BlueMix Platform as a Service Process Integrations The Open Source Route Enterprise Integration Bringing Analytics to the Data
  62. Batch Layer: We are not done yet… (timeline diagram, up to Now: data absorbed into Batch Views; not yet absorbed: just a few hours of data)
  63. Speed Layer
  64. Speed Layer (diagram: Incoming Data, Spark, C*, C*)
  65. Speed Layer: Stream processing.
  66. Speed Layer: Continuous computation.
  67. Speed Layer: Storing a limited window of data. Compensating for the last few hours of data.
  68. Speed Layer: All the complexity is isolated in the Speed Layer. If anything goes wrong, it’s auto-corrected.
  69. CAP: You have a choice between: ● Availability ○ Queries are eventually consistent ● Consistency ○ Queries are consistent (diagram: Consistency, Availability, Partition Tolerance)
  70. Eventual accuracy: Some algorithms are hard to implement in real-time. For those cases we could estimate the results.
  71. Speed Layer
  72. Spark Streaming: Micro batches
  73. Spark Streaming: Stateful
  74. Spark Streaming: Exactly once
  75. Spark Streaming: Incremental algorithms
  76. IBM InfoSphere Streams
  77. Serving Layer
  78. Serving Layer (diagram: Incoming Data, Spark, C*, C*, Query)
  79. Serving Layer: Random reads.
  80. Serving Layer: This layer queries the batch & real-time views and merges them.
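
The merge performed by the serving layer can be sketched in a few lines of Python. This is an illustration only, assuming both views hold additive counts keyed the same way (here, messages per device) and that the real-time view covers only data the batch view has not absorbed yet. The names `merge_views`, `batch_view`, and `realtime_view`, and the sample numbers, are hypothetical.

```python
def merge_views(batch_view, realtime_view):
    """Combine a precomputed batch view with the recent real-time deltas."""
    merged = dict(batch_view)  # copy, so the batch view itself stays read-only
    for key, delta in realtime_view.items():
        merged[key] = merged.get(key, 0) + delta
    return merged


# Counts absorbed by the last batch run
batch_view = {"device-a": 120, "device-b": 45}
# Counts for the last few hours, not yet absorbed by the batch layer
realtime_view = {"device-b": 5, "device-c": 2}

merged = merge_views(batch_view, realtime_view)
print(merged)
```

Views that are not simple sums (distinct counts, percentiles) need a merge function suited to them; the additive case is only the simplest one.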
  81. Lambda Architecture
  82. Lambda Architecture: The Lambda Architecture can discard any view, batch and real-time, and just recreate everything from the master data.
  83. Lambda Architecture: Mistakes are corrected via recomputation. Write bad data? Remove the data & recompute. Bug in view generation? Just recompute the view.
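
Correction by recomputation can be shown with a toy example. The sketch below assumes the view is a pure function of the master data, so removing a bad record and rerunning that function is the whole repair. The names `build_view` and `master_data`, and the rule that negative readings are invalid, are hypothetical.

```python
from collections import Counter


def build_view(master_data):
    """Pure function from the full master data set to a view: readings per device."""
    return dict(Counter(event["device"] for event in master_data))


master_data = [
    {"device": "meter-1", "value": 10},
    {"device": "meter-2", "value": 12},
    {"device": "meter-2", "value": -999},  # bad record written by mistake
]

stale_view = build_view(master_data)

# Repair: remove the bad record from the master data, then rebuild from scratch
cleaned = [event for event in master_data if event["value"] >= 0]
fresh_view = build_view(cleaned)

print(stale_view, fresh_view)
```

Nothing in the old view is patched in place: it is discarded and replaced, and rerunning `build_view` on the same input always yields the same view.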
  84. Lambda Architecture: Using a new schema? No problem, keep your data, keep your input F, change your output.
  85. Lambda Architecture: Data storage is highly optimized.
  86. Control Plane
  87. Control Plane: Cloud Agnostic
  88. IBM SoftLayer Experiences & Observations. 1. Smooth migration from SCE 2.2 to SoftLayer in one month's time, including: ■ Development of a SoftLayer-specific FOG abstraction layer expansion to accommodate Virdata’s DevOps tooling (Chef) ■ Complete on-boarding of the Virdata Platform ■ Complete launch of simulation and emulation clusters ■ Very exhaustive and complete API. 2. Very constructive and professional support throughout the complete on-boarding process. 3. Availability of bare metal seen as a differentiator.
  89. Control Plane: Cluster Management & Orchestration. RGOSSIP
  90. Control Plane: Monitoring and Logging
  91. Wrap-up
  92. Virdata - SERVICE ARCHITECTURE: millions of simultaneous persistent bi-directional connections; millions of messages per second; Real-time Complex Event Processing; Distributed Pub/Sub Messaging; Historical Data Archiving; Pre-computed Data; In-Memory real-time Data; REST API; Launch Queries - Launch Jobs; INTEGRATION CUSTOMIZATION NOC, OPERATIONS, MGMT REPORTS, TRENDS ANALYTICS
  93. Questions? @virdata_iot | #virdata | @nathan_gs
  94. Acknowledgements: I would like to thank Nathan Marz for writing a very insightful book, which is where the idea of the Lambda Architecture comes from. Lambda: Big Data - Nathan Marz, published by Manning. Lambda, Storm: A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van Landeghem, at FOSDEM 2013. Spark: Apache Spark website. Spark: Apache Spark - the light at the end of the tunnel? - Michael Hausenblas, MapR, at Data Science Day Berlin 2014.
  95. Thank you. virdata.com | +1 (937) 569 4220 | info@virdata.com | #virdata | @virdata_iot | @nathan_gs | nathan.bijnens@virdata.com
