Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

542 views

Published on

Published in: Technology
  • Be the first to comment

  • Be the first to like this

AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices

  1. 1. AdaM: an Adaptive Monitoring Framework for Sampling and Filtering on IoT Devices IEEE International Conference on Big Data 2015 (BigData 2015) Oct 29 – Nov 01, 2015 @ Santa Clara, CA, USA Demetris Trihinas, George Pallis, Marios D. Dikaiakos University of Cyprus {trihinas, gpallis, mdd}@cs.ucy.ac.cy
  2. 2. Demetris Trihinas 2 IoT was initially devices sensing and exchanging data streams with humans or other network-enabled devices IEEE BIGDATA CONFERENCE 2015
  3. 3. Edge-Mining Demetris Trihinas 3 A term coined to reflect data processing and decision-making on “smart” devices that sit at the edge of IoT networks Buy more detergent …our devices just got a little bit more “smarter”… IEEE BIGDATA CONFERENCE 2015
  4. 4. The “Big Data” in IoT Demetris Trihinas 4 • Taming data volume and data velocity with limited processing and network capabilities • IoT devices are usually battery-powered which means intense processing leads to increased energy consumption (less battery-life) Zhang et al., Usenix HotCloud, 2015 Challenges [IDC, Big Data in IoT, 2014] IEEE BIGDATA CONFERENCE 2015
  5. 5. Adaptive Sampling and Filtering Demetris Trihinas 5IEEE BIGDATA CONFERENCE 2015
  6. 6. Metric Stream Demetris Trihinas 6 • A metric stream 𝑀 = {𝑠𝑖}𝑖=0 𝑛 is a large sequence of collected samples, denoted as 𝑠𝑖 where 𝑖 = 0, 1, … , 𝑛 and 𝑛 → ∞ • Each sample 𝑠𝑖 is a tuple 𝑡𝑖, 𝑣𝑖 described by a timestamp 𝑡𝑖 and a value 𝑣𝑖 𝑠𝑖 𝑠𝑖+1 Metric Stream M IEEE BIGDATA CONFERENCE 2015
  7. 7. Periodic Sampling Demetris Trihinas 7 • The process of triggering the collection mechanism of a monitored source every 𝑇 time units such that the 𝑖 𝑡ℎ sample is collected at time 𝑡𝑖 = 𝑖 ∙ 𝑇 𝑠𝑖 𝑠𝑖+1 Metric Stream M sampled every T = 1s Compute resources and energy are wasted while generating large data volumes at a high velocity Metric Stream M sampled every T = 10s Sudden events and significant insights are missed IEEE BIGDATA CONFERENCE 2015
  8. 8. Adaptive Sampling Demetris Trihinas 8 • Dynamically adjust the sampling period 𝑇𝑖 based on some function, containing information of the metric stream evolution 𝑠𝑖 𝑠𝑖+1 𝑇𝑖+1 Metric Stream 𝑀′ dynamically sampledMetric Stream 𝑀 with 𝑇 = 1𝑠 IEEE BIGDATA CONFERENCE 2015 • Adapting the sampling rate reduces unnecessary processing accompanied by computing a new sample and consequently preserves energy!
  9. 9. Metric Filtering Demetris Trihinas 9 • “Cleansing” the metric stream by not transmitting over the network consecutive collected samples of “little” difference • Fixed Range Metric Filters Metric Stream M with fixed filter Range 𝑅 = 1% if (curValue ∈ [prevValue – R, prevValue + R ]) filter(curValue) Although stable phases exist, only 2 metric values are filtered Can we know this R beforehand? Should it keep the same value forever? Is it the same for each and every metric? IEEE BIGDATA CONFERENCE 2015 𝑅 𝑠𝑖+1 𝐹𝑖𝑙𝑡𝑒𝑟𝑒𝑑!𝑠𝑖 x x
  10. 10. Adaptive Filtering Demetris Trihinas 10 • Dynamically adjusting the filter range 𝑅 based on some function containing information of the current variability of the metric stream Metric Stream 𝑀′ with adaptive RangeMetric Stream 𝑀 𝑠𝑖 𝑠𝑖+1 𝑅𝑖+1 𝑅𝑖 IEEE BIGDATA CONFERENCE 2015 • Adapting the filter range to the variability of the metric streams allows for data volume and network traffic to be reduced
  11. 11. The Adaptive Monitoring Framework AdaM Demetris Trihinas 11 Sensing Unit Processing Unit Communication Unit Adaptive Sampling Adaptive Filtering Knowledge Base AdaM IoT Device 𝑠𝑖 𝑇𝑖+1 𝑅𝑖+1 𝑠𝑖 • Software library with no external dependencies embeddable on IoT devices • Reduces processing, network traffic and energy consumption by adapting the monitoring intensity based on the metric stream evolution and variability • Achieves a balance between efficiency and accuracy It’s open-source! Give it a try! https://github.com/dtrihinas/AdaM Raspberry Pi Arduino Beacons IEEE BIGDATA CONFERENCE 2015
  12. 12. Adaptive Sampling Algorithm • Step 1: Compute the distance 𝛿𝑖 between the current two consecutive sample values • Step 2: Compute metric stream evolution based on a Probabilistic Exponential Weighted Moving Average (PEWMA) to estimate 𝛿𝑖+1 and the standard deviation 𝜎𝑖+1: 𝛿𝑖+1 = 𝜇𝑖 = 𝑎 ∙ 𝜇𝑖−1 + 1 − 𝑎 𝛿𝑖 Demetris Trihinas 12 Looks like an exponential moving average, right? But weighting is probabilistically applied! 𝑎 = 𝛼 1 − 𝛽P𝑖 IEEE BIGDATA CONFERENCE 2015
  13. 13. Why use a Probabilistic EWMA? Simple EWMA is volatile to abrupt transient changes • Slow to acknowledge spike after “stable” periods • If “stable” phases follow sudden spikes, then subsequent values are overestimated Demetris Trihinas 13 Sounds a lot like a job for a Gaussian distribution! IEEE BIGDATA CONFERENCE 2015
  14. 14. Adaptive Sampling Algorithm • Step 3: Compute actual standard deviation 𝜎𝑖 • Step 4: Compute the current confidence (𝑐𝑖 ≤ 1) of our approach based on the actual and estimated standard deviation 𝑐𝑖 = 1 − | 𝜎𝑖 − 𝜎𝑖| 𝜎𝑖 • Step 5: Compute the sampling period 𝑇𝑖+1 based on the current confidence and the user-defined imprecision γ Demetris Trihinas 14 … the more “confident” the algorithm is, the larger the outputted sampling period 𝑇𝑖+1 can be… O(1) complexity as all steps only use their previous values! IEEE BIGDATA CONFERENCE 2015
  15. 15. Adaptive Filtering Algorithm • Step 1: Compute the dispersion of the stream based on average 𝜇𝑖 and standard deviation 𝜎𝑖 already computed from adaptive sampling 𝐹𝑖 = 𝜎𝑖 2 𝜇𝑖 • Step 2: Compute the new filter range 𝑅𝑖+1 based on 𝐹𝑖 and the user- defined imprecision 𝛾 Demetris Trihinas 15 …a low 𝐹𝑖 indicates a non dispersed data stream which means a low variability in the metric stream… O(1) complexity as all steps only use their previous values! IEEE BIGDATA CONFERENCE 2015 𝑠𝑖 𝑠𝑖+1 𝑅𝑖+1 𝑠𝑖+2 𝑠𝑖+3 𝐹𝑖𝑙𝑡𝑒𝑟𝑒𝑑!𝑅𝑖+2𝑅𝑖 𝐹 < 𝛾
  16. 16. Evaluation Demetris Trihinas 16IEEE BIGDATA CONFERENCE 2015
  17. 17. AdaM vs State-of-the-Art Adaptive Techniques • i-EWMA: an EWMA adapting sampling period by 1 time unit when the estimated error (ε) is under/over user-defined imprecision • L-SIP[1]: a linear algorithm using a double EWMA to produce estimates of current data distribution based on the rate sample values change => Slow to react to highly transient and abrupt fluctuations in the metric stream • FAST[2]: an (aggressive) framework using a PID controller to compute (large) sampling periods accompanied by a Kalman Filter to predict values at non sampling points Demetris Trihinas 17 [1] Gaura et al., IEEE Sensors Journal, 2013 [2] Fan et al., ACM CIKM, 2012 IEEE BIGDATA CONFERENCE 2015
  18. 18. Datasets Demetris Trihinas 18 Memory Trace Java Sorting Benchmark CPU Trace Carnegie Mellon RainMon Project Disk I/O Trace Carnegie Mellon RainMon Project TCP Port Monitoring Trace Cyber Defence SANS Tech Institute Fitbit Trace Coursera Data Science Course IEEE BIGDATA CONFERENCE 2015
  19. 19. Our Small “Big Data” Testbeds Demetris Trihinas 19 Raspberry Pi 1st Gen. Model B 512MB RAM, Single Core 700MHz Daemon on OS emulates traces while feeding samples to each algorithm Android Emulator + SensorSimulator 512MB RAM, Single Core SensorSimulator script emulates traces by feeding samples to Android emulator for each algorithm Daemon IEEE BIGDATA CONFERENCE 2015
  20. 20. Evaluation Metrics • Estimation Accuracy – Mean Absolute Percentage Error (MAPE) • CPU Cycles • Outgoing Network Traffic • Edge Device Energy Consumption* Demetris Trihinas 20 *Xiao et al., CPSCom, 2010 perf nethogs powerstat Other than error evaluation no other system goes through an overhead study!
  21. 21. Error Comparison Demetris Trihinas 21 Memory Trace CPU Trace Disk I/O Trace TCP Port Monitoring Trace Fitbit Trace For all traces AdaM has the smallest inaccuracy (<10%) Even in a more aggressive configuration (λ=2) AdaM is still comparable to L-SIP AdaM AdaM AdaM AdaM AdaM User-defined accepted imprecision set to γ = 0.1 IEEE BIGDATA CONFERENCE 2015
  22. 22. So How Well Does AdaM Perform? Demetris Trihinas 22 Memory Trace Java Sorting Benchmark CPU Trace Carnegie Mellon RainMon Project Disk I/O Trace Carnegie Mellon RainMon Project TCP Port Monitoring Trace Cyber Defence SANS Tech Institute Fitbit Trace Coursera Data Science Course IEEE BIGDATA CONFERENCE 2015
  23. 23. Overhead Comparison (1) Demetris Trihinas 23 Network Traffic Energy Consumption CPU Cycles • Data volume reduced by 74% • Energy consumption by 71% • Accuracy is always above 89% and 83% with filtering enabled IEEE BIGDATA CONFERENCE 2015
  24. 24. Overhead Comparison (2) Demetris Trihinas 24 Network Traffic Energy Consumption Error (MAPE) FAST’s aggressiveness results in slightly lower energy consumption and network traffic But, this does not come for free… FAST features a large inaccuracy AdaM achieves a balance between efficiency and accuracy IEEE BIGDATA CONFERENCE 2015
  25. 25. Scalability Evaluation • Integrated AdaM to a data streaming system • JCatascopia Cloud Monitoring* • JCatascopia Agents (data sources) use AdaM to adapt monitoring intensity • Archiving time is measured at JCatascopia Server to evaluate data velocity • Compare AdaM over Periodic Sampling Demetris Trihinas 25 * [Trihinas et al, CCGrid 2014] https://github.com/dtrihinas/JCatascopia DB AdaM Agent Agent Agent . . . JCatascopia AdaM AdaM metrics Data velocity is reduced to a linear growth IEEE BIGDATA CONFERENCE 2015
  26. 26. Conclusion and Future Work Demetris Trihinas 26 • Data volume and velocity of IoT-generated data keeps increasing • AdaM reduces processing, network traffic and energy consumption of IoT device by adapting monitoring intensity • AdaM achieves a balance between efficiency and accuracy • AdaM v2.0 support data streaming engines such as Apache Spark and Flink IEEE BIGDATA CONFERENCE 2015 It’s open-source! Give it a try! https://github.com/dtrihinas/AdaM
  27. 27. Demetris Trihinas 27 Laboratory for Internet Computing Department of Computer Science University of Cyprus http://linc.ucy.ac.cy IEEE BIGDATA CONFERENCE 2015

×