ApproxIoT
Approximate Analytics for Edge Computing
https://ApproxIoT.github.io/ApproxIoT/
Zhenyu Wen, Do Le Quoc,
Pramod Bhatotia, Ruichuan Chen, Myungjin Lee
Modern online services
Stream aggregator → Stream analytics system → Useful information
Processing streaming data from different sources
Modern online services
Tension: low latency vs. efficient resource utilization
Approximate computing
Approximate computing
Many applications:
Approximate output is good enough!
Example: live taxi heatmap
A fraction of the data is enough to be useful for this application
Approximate computing
Idea: To achieve low latency, compute over a subset of data items instead of the entire dataset
Approximate computing (sampling) → Analyze → Approximate output ± error bound
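The idea on this slide can be sketched in a few lines of Python. This is only an illustration, not code from the talk: it estimates a mean from a simple random sample and reports a ± error bound. The function name `approximate_mean`, the 95% z-value of 1.96, and the normal-approximation error bound are all assumptions made for the example.

```python
import math
import random

def approximate_mean(items, fraction, z=1.96, seed=0):
    """Estimate the mean of `items` from a random sample of the given fraction.

    Returns (estimate, error), read as "estimate ± error".
    The error bound uses a normal approximation (an assumption of this sketch).
    """
    rng = random.Random(seed)
    k = max(2, int(len(items) * fraction))       # sample size, at least 2
    sample = rng.sample(items, k)                # sampling without replacement
    mean = sum(sample) / k
    variance = sum((x - mean) ** 2 for x in sample) / (k - 1)
    error = z * math.sqrt(variance / k)          # half-width of the interval
    return mean, error

if __name__ == "__main__":
    data = [float(i) for i in range(1000)]
    estimate, error = approximate_mean(data, fraction=0.1)
    print(f"{estimate:.1f} ± {error:.1f}")
```

Only 10% of the items are touched here, which is where the latency saving would come from; the price is the error term.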
State-of-the-art system
StreamApprox [Middleware’17]
S1, S2, …, Sn → Stream aggregator → Data stream → StreamApprox (cloud datacenter) → Approximate output ± error bound
Limitations:
• It wastes bandwidth
• It utilizes only cloud datacenter resources
Edge computing
Figure labels: Cloud, Gateway, Edge node (local processing), Source of data
Allows data to be processed at the edge node before it’s sent to the cloud
Opportunities:
• Providing more computing resources
• Saving bandwidth
Edge infrastructure
Source: https://peering.google.com/#/infrastructure
Azure IoT edge
Watson IoT
AWS IoT
Problem statement
To build a stream analytics system
• By utilizing the cloud and edge computing resources
• By leveraging approximate computing
Design goals
• Efficiency: Efficient utilization of computing resources
• Adaptability: Adaptive execution based on the available resources
• Transparency: Requires neither code changes nor resource management from the user
Outline
• Motivation
• Design
• Implementation
• Evaluation
ApproxIoT: Overview
Sources S1 … Si … Sm … Sn → Edge nodes → Regional edge → Continental node → Central node (Cloud)
Query → ApproxIoT → Approximate output ± error bound
ApproxIoT employs sampling in the distributed environment of edge + cloud
Naïve algorithm
Query → Simple random sampling (SRS) → Approximate output ± error bound
Problem: sub-streams are sampled unfairly; some can be overlooked, which leads to low accuracy
Background: Stratified sampling
Advantage: The sub-streams are sampled fairly
Disadvantage: Requires knowing the size of each sub-stream in advance
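A minimal Python sketch of stratified sampling with proportional allocation makes the disadvantage concrete: the sample size of each sub-stream is computed from its length, so all lengths must be known before sampling starts. The function name `stratified_sample` and the proportional-allocation rule are assumptions for illustration, not something the slides specify.

```python
import random

def stratified_sample(strata, total_sample_size, seed=0):
    """Sample each sub-stream in proportion to its size.

    strata: dict mapping a sub-stream name to the list of its items.
    Note that len(items) is needed up front for every sub-stream.
    """
    rng = random.Random(seed)
    total = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        # Proportional share, but at least one item so no sub-stream is overlooked.
        k = max(1, round(total_sample_size * len(items) / total))
        k = min(k, len(items))
        sample[name] = rng.sample(items, k)
    return sample

if __name__ == "__main__":
    strata = {"big": list(range(90)), "small": list(range(10))}
    picked = stratified_sample(strata, total_sample_size=10)
    print({name: len(items) for name, items in picked.items()})
```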
Background: Reservoir sampling
Reservoir
sampling
Size of reservoir = 4
Advantage:
• No pre-knowledge required of sub-stream size
Disadvantages:
• The sub-streams are sampled unfairly
• Difficult to run on multiple nodes
The 5th item: with probability 4/5, an item in the reservoir is replaced by the 5th item
The 6th item: with probability 4/6, an item in the reservoir is replaced by the 6th item
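The replacement steps above follow the classic reservoir sampling scheme (often called Algorithm R). A short Python sketch, with the function name `reservoir_sample` chosen here for illustration:

```python
import random

def reservoir_sample(stream, size, seed=0):
    """Keep a uniform random sample of `size` items from a stream of unknown length."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream, start=1):
        if i <= size:
            # The first `size` items fill the reservoir directly.
            reservoir.append(item)
        else:
            # The i-th item enters with probability size / i
            # (4/5 for the 5th item and 4/6 for the 6th when size = 4).
            j = rng.randrange(i)          # uniform integer in [0, i)
            if j < size:
                reservoir[j] = item       # evict the item in slot j
    return reservoir

if __name__ == "__main__":
    print(reservoir_sample(range(1, 7), size=4))
```

The stream length never has to be known, which matches the advantage listed on the slide.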
ApproxIoT sampling algorithm
Weighted hierarchical sampling (WHS)
Combining stratified and reservoir sampling
Easy to parallelize, requires no synchronization between sub-streams
Weight: W = C/N if C > N; W = 1 if C <= N
(C: number of items seen in the sub-stream, N: reservoir size)
WHS
Reservoir size N = 4, initial weight W = 1
Figure labels: a sub-stream with C = 6 items (C > N) gets W = 6/4; the others keep W = 1
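One way to read the WHS step on a single node is: keep one reservoir per sub-stream, count the items of each sub-stream, and attach the weight C/N (or 1) to whatever survives. The sketch below follows that reading. It is an assumption-laden illustration rather than the authors' implementation; `whs_sample` and its input format of `(substream_id, value)` pairs are hypothetical.

```python
import random
from collections import defaultdict

def whs_sample(items, reservoir_size, seed=0):
    """Sample one reservoir per sub-stream and attach the weight W.

    items: iterable of (substream_id, value) pairs seen in one interval.
    Returns {substream_id: (sampled_values, weight)}.
    """
    rng = random.Random(seed)
    reservoirs = defaultdict(list)
    counts = defaultdict(int)
    for sid, value in items:
        counts[sid] += 1
        c = counts[sid]
        if c <= reservoir_size:
            reservoirs[sid].append(value)
        else:
            j = rng.randrange(c)              # keep with probability N / c
            if j < reservoir_size:
                reservoirs[sid][j] = value
    result = {}
    for sid, kept in reservoirs.items():
        c = counts[sid]
        # W = C/N if C > N, else 1 (the weight rule from the slide).
        weight = c / reservoir_size if c > reservoir_size else 1.0
        result[sid] = (kept, weight)
    return result

if __name__ == "__main__":
    items = [("a", v) for v in range(6)] + [("b", v) for v in range(3)]
    print(whs_sample(items, reservoir_size=4))
```

Because each sub-stream has its own reservoir and counter, the loop body never needs information from another sub-stream, which is consistent with the "no synchronization" remark.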
WHS on edge nodes
Reservoir size equals 2
Regional edge WHS: inputs carry W=1; outputs W=6/2=3, W=4/2=2, W=1
Continental node WHS: inputs carry W=4, W=1, W=3; outputs W=4*5/2=10, W=1*3/2=3/2, W=3
New weight = carried weight × current weight
Easy to parallelize, requires no synchronization between computing nodes
Hierarchy: Edge nodes → Regional edge → Continental node → Central node (Cloud)
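The weight arithmetic on this slide can be checked with a few lines of Python. The helper names `local_weight` and `carry_weight` are made up for this sketch; the numbers are the ones shown on the slide for reservoir size 2.

```python
from fractions import Fraction

def local_weight(count, reservoir_size):
    """Current weight at one node: C/N if C > N, else 1."""
    if count > reservoir_size:
        return Fraction(count, reservoir_size)
    return Fraction(1)

def carry_weight(carried, count, reservoir_size):
    """New weight = carried weight (from upstream) x current weight (at this node)."""
    return carried * local_weight(count, reservoir_size)

if __name__ == "__main__":
    # Regional edge: carried weight 1, sub-streams with C = 6 and C = 4.
    print(carry_weight(Fraction(1), 6, 2))   # 3
    print(carry_weight(Fraction(1), 4, 2))   # 2
    # Continental node: carried weights 4 and 1, sub-streams with C = 5 and C = 3.
    print(carry_weight(Fraction(4), 5, 2))   # 10
    print(carry_weight(Fraction(1), 3, 2))   # 3/2
```

Exact fractions are used so the printed values match the slide without rounding.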
ApproxIoT in the cloud
Reservoir size equals 1
The weights are carried through WHS:
W=4/3 → W=4/3*6/1=8
W=1 → W=1*4/1=4
W=1 → W=1*2/1=2
Query (sum)
Approximate output: 8*x1 + 4*x2 + 2*x3 ± error bound
(x1, x2, x3 stand for the sampled items, which appear as icons on the slide)
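The sum query on this slide amounts to multiplying each surviving item by its final weight and adding up the products. A small Python sketch, where `approximate_sum` is a hypothetical name and the item values are invented (only the weights 8, 4, and 2 come from the slide):

```python
def approximate_sum(sample):
    """sample: iterable of (value, weight) pairs left after the last WHS step."""
    return sum(value * weight for value, weight in sample)

if __name__ == "__main__":
    # Weights 8, 4, 2 are from the slide; the values 10.0, 5.0, 1.0 are made up.
    sample = [(10.0, 8), (5.0, 4), (1.0, 2)]
    print(approximate_sum(sample))  # 102.0
```

The error bound that accompanies the output is not computed here; the slides do not give its formula.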
Outline
• Motivation
• Design
• Implementation
• Evaluation
Implementation
Components: sources S1, S2, …, Sn; Kafka cluster (stream pub/sub); edge nodes; cloud datacenter
Data flow: data stream from the sources, sampled data streams towards the cloud datacenter
Kafka Streams
See the paper for more details
Experimental setup
• Evaluation questions
  • Accuracy vs. sample size
  • Throughput vs. sample size
• Testbed: 25 nodes
  • 15 nodes for ApproxIoT deployment
  • 10 nodes for Kafka cluster
• Datasets:
  • Synthetic: Poisson and Gaussian distributions
  • Real: Brasov pollution and New York Taxi Ride
See the paper for more results!
Accuracy vs. sample size
[Figure: accuracy loss (%) vs. sampling fraction (%), 10% to 80%, for SRS and ApproxIoT; lower is better]
ApproxIoT: ~2600× higher accuracy than SRS
The average is 0.035%
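The slides do not define "accuracy loss". A common reading, assumed here and not confirmed by the deck, is the relative error of the approximate result against the exact one, expressed as a percentage. Under that assumption, the metric could be computed as:

```python
def accuracy_loss_percent(approx, exact):
    """Relative error in percent (assumed definition, not stated on the slides)."""
    if exact == 0:
        raise ValueError("exact result must be non-zero")
    return abs(approx - exact) / abs(exact) * 100.0

if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(accuracy_loss_percent(approx=99.0, exact=100.0))
```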
Throughput vs. sample size
[Figure: throughput (k items/s) vs. sampling fraction (%), 10% to 100%, for Native, SRS, and ApproxIoT; higher is better]
• ApproxIoT has low overhead compared to the native execution
• ApproxIoT has similar throughput to SRS
Conclusion
ApproxIoT: Approximate analytics for edge computing
Efficiency: Efficient computing and bandwidth resource utilization
Adaptability: Adaptive execution based on the available resources
Transparency: Requires neither code changes nor resource management from the user
Thank you!
More details on the project website:
https://ApproxIoT.github.io/ApproxIoT/
