
Sensing the world with Data of Things

Presented at Structure Data 2016, San Francisco.

Published in: Technology

  1. 1. Sensing the world with Data of Things By: Sriskandarajah Suhothayan (Suho), Technical Lead at WSO2 @suhothayan suho@wso2.com STRUCTURE DATA 2016 MARCH 9 - 10 • SAN FRANCISCO
  2. 2. Any customer can have a car painted any colour that he wants so long as it is black ~ Henry Ford ~
  3. 3. Me Me Me !!! Your customers want to have a personalized experience. We are in the time of ME!
  4. 4. What to do? ● You need to know the customer profile, e.g. historical data, to make a decision ● You need to understand the context in which the customer evolves ● You need to be able to react in real time to certain conditions or patterns
  5. 5. Is IoT New ? • source: http://community.arm.com/groups/internet-of-things/blog/2014/06
  6. 6. Internet of Things source: http://na1.www.gartner.com/imagesrv/newsroom/images/HC_ET_2014.jpg;wadf79d1c8397a49a2
  7. 7. IoT Ecosystem
  8. 8. WSO2 IoT Server M3 : https://goo.gl/nhbxnG http://wso2.com/iot
  9. 9. Concepts of IoT Analytics ● Type of Data ● Distributed Nature ● Event-Drivenness ● Possible Type of Analytics ● Scalability ● Edge Analytics ● Uncertainty
  10. 10. Data Types of Things ● Time based data ○ Continuous monitoring & reporting ○ Time series processing (e.g. energy consumption over time) ○ Specialised DBs - OpenTSDB ● Location based data ○ Things are all over the place & they move ○ Tracked via GPS / iBeacons ○ Geospatial processing (e.g. traffic planning, better route suggestions for vehicles) ○ Geospatially optimised processing engines - GeoTrellis
  11. 11. IoT is Distributed ● Constant changes ○ Components are added and removed ○ Data flows are modified or repurposed ● Data collection needs to support ○ Weak 3G networks to ad-hoc peer-to-peer networks ○ Message Queuing Telemetry Transport (MQTT) ○ Constrained Application Protocol (CoAP) ○ ZigBee or Bluetooth Low Energy (BLE) ● Dynamic scaling ○ Hybrid cloud
  12. 12. IoT Analytics are Event-Driven ● Sensors report data as Event Streams ● Analysis on flowing (or perishable) data ● Realtime Analytics ○ Detect temporal and logical patterns ○ Identify KPIs and Thresholds ○ Send out alerts immediately ○ E.g. Alert when a temperature sensor hits a limit, notify on the car dashboard of low tire pressure ○ Systems : Apache Storm, Google Cloud Dataflow & WSO2 CEP
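The "alert when a temperature sensor hits a limit" example can be sketched in a few lines of Python. This is only an illustration of the idea, not how Storm, Dataflow or WSO2 CEP implement it; the `Reading` type, the `threshold_alerts` function and the message format are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Reading:
    """One sensor event (hypothetical shape, not from the slides)."""
    sensor_id: str
    timestamp: float  # seconds
    value: float


def threshold_alerts(events: Iterable[Reading], limit: float) -> Iterator[str]:
    """Yield an alert when a sensor crosses the limit.

    An alert is raised on the transition from "at or below" to "above",
    so a sensor that stays hot does not flood the consumer with alerts.
    """
    above: Dict[str, bool] = {}  # last known state per sensor
    for event in events:
        is_above = event.value > limit
        if is_above and not above.get(event.sensor_id, False):
            yield f"ALERT {event.sensor_id} t={event.timestamp}: {event.value} > {limit}"
        above[event.sensor_id] = is_above


stream = [
    Reading("room-1", 0.0, 21.5),
    Reading("room-1", 1.0, 35.2),  # crosses the limit: alert
    Reading("room-1", 2.0, 36.0),  # still above: no new alert
    Reading("room-1", 3.0, 22.0),
    Reading("room-1", 4.0, 40.1),  # crosses again: alert
]
alerts = list(threshold_alerts(stream, limit=30.0))
```

Because the function is a generator, each alert is produced as soon as the triggering event is read, which is the property that matters for perishable data.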
  13. 13. History Repeats ● Present vs usual behavior ● Understand the history ● Batch Analytics ○ Perform periodic summarisation/analytics ○ E.g. Average temperature in a room last month, total power usage of the factory last year ○ Systems : Apache Hadoop, Apache Spark + Storage
  14. 14. Deep Investigations ● Ad-Hoc Queries ● Interactive Analytics ○ Provides searchability ○ E.g. Identify fraud rings from simple fraud alerts ○ Systems : Apache Drill, indexed storage systems such as Couchbase, Apache Lucene
  15. 15. Thinking Ahead ● When you don’t know the equations ● Focusing conditions & preventing issues ● Predictive Analytics ○ Incremental Learning ○ E.g. Proactive maintenance, fraud detection and health warnings ○ Systems : Apache Mahout, Apache Spark MLlib, Microsoft Azure Machine Learning, WSO2 ML, Skytree
  16. 16. Technology we’ve chosen Realtime Batch Interactive Predictive
  17. 17. WSO2 Data Analytics Server
  18. 18. Plenty of Data Scalable Data Processing source : http://www.websitemagazine.com/content/blogs/posts/archive/2014/09/25/customer-service-in-2039.aspx
  19. 19. Scalable Realtime Deployment More info : https://docs.wso2.com/display/CEP410/Creating+a+Storm+Based+Distributed+Execution+Plan
  20. 20. Scalable Deployment Interactive, Batch, Realtime & Predictive
  21. 21. Is Every Event Significant? ● Publishing all events is not good! ○ Hardware may not be scalable ○ Network gets flooded ● What we usually need ○ Aggregation over time ○ Trends that exceed thresholds ○ Events matching a rare condition ● Results in ○ Local optimisation ○ Quick detection of issues ○ Instant notification
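"Aggregation over time" and "trends that exceed thresholds" can be combined on the device so that only one summary per window leaves the edge. The sketch below is a plain Python illustration under assumed inputs: `summarise_windows`, the `(timestamp, value)` tuple layout and the fixed tumbling window are choices made for this example, not part of WSO2 Siddhi.

```python
from statistics import mean


def summarise_windows(readings, window_seconds, threshold):
    """Collapse raw readings into one summary per fixed time window.

    readings: iterable of (timestamp_seconds, value) tuples.
    Returns a list of (window_start, average, exceeded) tuples, where
    `exceeded` flags windows whose average is above the threshold.
    """
    buckets = {}
    for ts, value in readings:
        # Align each reading to the start of its window.
        start = int(ts // window_seconds) * window_seconds
        buckets.setdefault(start, []).append(value)
    summaries = []
    for start, values in sorted(buckets.items()):
        avg = mean(values)
        summaries.append((start, avg, avg > threshold))
    return summaries


raw = [(0, 20.0), (30, 22.0), (61, 40.0), (90, 42.0)]
summary = summarise_windows(raw, window_seconds=60, threshold=30.0)
# Four raw events become two summaries; only the second is flagged.
```

A device could publish only the flagged windows, or publish every summary at a much lower rate than the raw stream.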
  22. 22. Edge Analytics Analytics on the Edge with WSO2 Siddhi Push
  23. 23. Outliers ... ● E.g. Anomaly detection, Fraud Analytics ● Alerts for known and unknown frauds and Deep Search Analytics https://goo.gl/TWV5C1
  24. 24. Outliers ● We used: Linear Regression, Markov Models & Credit Scoring
  25. 25. Uncertainty in Data of Things Data can ● Be duplicated ● Arrive out of order ● Not arrive at all ● Contain wrong readings
  26. 26. Events Duplicates & Out of Order … ● Due to redundant sensors & network latency ● Difficult for temporal data processing ○ Time Windows ○ Temporal ordering ● Such as fraud detection: define stream Purchase (price double, cardNo long, place string); from every (a1 = Purchase[price < 10]) -> a2 = Purchase[price > 10000 and a1.cardNo == a2.cardNo] within 1 day select a1.cardNo as cardNo, a2.price as price, a2.place as place insert into PotentialFraud;
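To see why ordering matters for the fraud query above, it helps to spell out the pattern procedurally: a small purchase followed, within a day, by a very large purchase on the same card. The Python below is a rough approximation of that pattern written for illustration; it is not Siddhi's execution model, and `potential_fraud`, the tuple layout and the inclusive one-day boundary are assumptions made here.

```python
from collections import defaultdict

ONE_DAY = 24 * 60 * 60  # seconds


def potential_fraud(purchases):
    """Match a small purchase followed by a large one on the same card.

    purchases: iterable of (timestamp_seconds, price, card_no, place),
    assumed to arrive in timestamp order and without duplicates.
    Returns one dict per match, mirroring the PotentialFraud stream.
    """
    pending = defaultdict(list)  # card_no -> timestamps of small purchases
    matches = []
    for ts, price, card_no, place in purchases:
        if price < 10:
            pending[card_no].append(ts)
        elif price > 10000:
            # Each waiting small purchase is consumed by this large one.
            for small_ts in pending.pop(card_no, []):
                if ts - small_ts <= ONE_DAY:
                    matches.append({"cardNo": card_no, "price": price, "place": place})
    return matches


events = [
    (0, 5.0, 111, "shop"),
    (10, 5.0, 222, "shop"),
    (3600, 20000.0, 111, "jeweller"),      # within a day: match
    (2 * ONE_DAY, 50000.0, 222, "dealer"),  # too late: no match
]
fraud = potential_fraud(events)
```

If the large purchase were delivered before the small one, or the small one were delivered twice, this logic would miss the match or report it twice, which is exactly the difficulty the slide points at.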
  27. 27. Events Arriving Out of Order E.g. Realtime Soccer Analytics (DEBS 2013) https://goo.gl/c2gPrQ ● Identify ball kicks, ball possession, shot on goal & offside ● Solutions : K-Slack Based Algorithms https://www2.informatik.uni-erlangen.de/publication/download/IPDPS2013.pdf
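The core idea behind K-slack can be sketched as a reordering buffer: hold each event until the largest timestamp seen so far is at least K ahead of it, then release events in timestamp order. The version below uses a fixed K and is a simplified illustration only; the linked paper discusses more elaborate schemes, and `k_slack` and its tuple layout are names chosen for this example.

```python
import heapq


def k_slack(events, k):
    """Reorder events that arrive at most k time units late.

    events: iterable of (timestamp, payload) in arrival order.
    Yields (timestamp, payload) in timestamp order, provided no event
    is delayed by more than k. An event delayed by more than k is
    still emitted, but possibly after events with later timestamps.
    """
    heap = []
    clock = float("-inf")  # largest timestamp seen so far
    seq = 0                # tie-breaker so payloads are never compared
    for ts, payload in events:
        clock = max(clock, ts)
        heapq.heappush(heap, (ts, seq, payload))
        seq += 1
        # Anything at least k behind the clock is safe to release.
        while heap and heap[0][0] <= clock - k:
            t, _, p = heapq.heappop(heap)
            yield t, p
    # End of stream: flush whatever is still buffered.
    while heap:
        t, _, p = heapq.heappop(heap)
        yield t, p


arrivals = [(1, "a"), (3, "c"), (2, "b"), (5, "e"), (4, "d")]
ordered = list(k_slack(arrivals, k=2))
```

The trade-off is latency: every event waits up to K before it is released, so a larger K tolerates more disorder at the cost of slower results.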
  28. 28. Missing Data ● Due to network outages ● E.g. Smart Meters (DEBS 2014) ○ Smart home electricity data: 2000 sensors, 40 houses, 4 Billion events in four months ○ Processed 400K events/sec ● Solutions: ○ Approximate using complementary sensor readings ■ Electricity Monitoring ● Frequent Load readings ● Occasional Work readings ○ Fault-tolerant data streams : Google MillWheel
  29. 29. Wrong Sensor Readings ● From GPS ● E.g. TfL Traffic Analysis ○ Using Transport for London open data feeds. ○ http://goo.gl/04tX6k, http://goo.gl/9xNiCm ○ Scales to 500,000 Events/Sec and more ● From iBeacons at shops, ships and airports ● Solution: Kalman Filter
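One way to read "approximate using complementary sensor readings" is that the occasional cumulative work readings bound what the missing load readings must have been. The sketch below assumes load is reported in watts and work as cumulative kilowatt-hours; those units, the function names and the simple "fill with the implied average" strategy are all assumptions made for illustration, not details taken from the slides.

```python
def average_load_watts(work_start_kwh, work_end_kwh, elapsed_seconds):
    """Average load implied by two cumulative work readings."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # kWh -> Wh (x1000), then divide by elapsed hours (seconds / 3600).
    return (work_end_kwh - work_start_kwh) * 1000.0 * 3600.0 / elapsed_seconds


def fill_missing_load(load_readings, work_start_kwh, work_end_kwh, elapsed_seconds):
    """Replace missing load readings (None) with the implied average load."""
    estimate = average_load_watts(work_start_kwh, work_end_kwh, elapsed_seconds)
    return [estimate if value is None else value for value in load_readings]


# 0.5 kWh consumed over one hour implies an average load of 500 W.
filled = fill_missing_load([480.0, None, None, 510.0], 10.0, 10.5, 3600)
```

A more careful version would subtract the energy already accounted for by the readings that did arrive before spreading the remainder over the gap.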
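A Kalman filter blends each new measurement with the running estimate, weighting the two by how uncertain each one is. The one-dimensional sketch below assumes a roughly constant true value and hand-picked noise variances; a real GPS track would need a multi-dimensional state with position and velocity. `kalman_smooth` and its default parameters are illustrative choices, not taken from the talk.

```python
def kalman_smooth(measurements, process_var=1e-3, measurement_var=1.0):
    """Smooth noisy scalar readings with a 1-D Kalman filter.

    process_var: how much the true value is expected to drift per step.
    measurement_var: how noisy each individual reading is.
    Returns one estimate per input measurement.
    """
    estimates = []
    x = None  # current state estimate
    p = None  # variance (uncertainty) of the estimate
    for z in measurements:
        if x is None:
            # Initialise from the first measurement.
            x, p = z, measurement_var
        else:
            # Predict: the state is assumed constant, uncertainty grows.
            p = p + process_var
            # Update: the gain decides how far to move toward the reading.
            gain = p / (p + measurement_var)
            x = x + gain * (z - x)
            p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


noisy = [10.0, 10.2, 9.8, 30.0, 10.1]  # 30.0 is a wrong reading
smoothed = kalman_smooth(noisy)
```

The outlier still pulls the estimate upward, but far less than the raw reading would; raising `measurement_var` makes the filter trust individual readings even less.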
  30. 30. Visualisation ● Per-device & Summarization View ● Ability to group by categories ● Solutions: Composable Dashboard with sampling & indexing
  31. 31. Communicate to Mobile & 3rd Party Apps ● Expose analytics Results as API ○ Mobile Apps, Third Party ● Provides ○ Security, Billing, ○ Throttling, Quotas & SLA ● Solution ○ Write data to database ○ Expose them via secured APIs (E.g. WSO2 API Manager)
  32. 32. Reference Architecture for IoT Analytics
  33. 33. IoT Analytics ● (WSO2 DAS) 3.0.1 ○ Combines all types of analytics. ● (WSO2 CEP) 4.1 ○ For those who need to analyze event streams in realtime. ● (WSO2 ML) 1.1 ○ For building predictive models http://wso2.com/analytics http://wso2.com/iot
  34. 34. Thank You Any Questions ?
  35. 35. Contact us !
