ApproxIoT
Approximate Analytics for Edge Computing
https://ApproxIoT.github.io/ApproxIoT/
Zhenyu Wen, Do Le Quoc,
Pramod Bhatotia, Ruichuan Chen, Myungjin Lee
Modern online services
Processing streaming data from different sources
Stream aggregator → Stream analytics system → Useful information
Modern online services
Tension: low latency vs. efficient resource utilization
Approximate computing
Approximate computing
Many applications: approximate output is good enough!
Example: live taxi heatmap, where a proportion of the data is enough to be useful
Approximate computing
Idea: To achieve low latency, compute over a subset of data items instead of the entire dataset
Data → Approximate computing (sampling) → Analyze → Approximate output ± error bound
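The idea above can be sketched in a few lines of Python. This is only an illustration, not the system's code: the function name `approximate_mean`, the synthetic Gaussian data, and the use of a normal-approximation confidence interval as the "± error bound" are all assumptions made for this example.

```python
import math
import random
import statistics

def approximate_mean(items, fraction, z=1.96, rng=None):
    """Estimate the mean of `items` from a simple random sample.

    Returns (estimate, error_bound). The bound is the half-width of a
    normal-approximation confidence interval (z=1.96 is roughly 95%).
    """
    rng = rng or random.Random()
    # Use at least 2 items so that a sample variance can be computed.
    k = min(len(items), max(2, int(len(items) * fraction)))
    sample = rng.sample(items, k)
    estimate = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(k)
    return estimate, z * std_err

rng = random.Random(42)
data = [rng.gauss(100.0, 15.0) for _ in range(100_000)]  # synthetic data
estimate, bound = approximate_mean(data, fraction=0.01, rng=rng)
print(f"approximate mean = {estimate:.2f} ± {bound:.2f}")
print(f"exact mean       = {statistics.fmean(data):.2f}")
```

Only 1% of the items are touched, which is where the latency saving comes from; the price is the error bound reported next to the result.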
State-of-the-art system
StreamApprox [Middleware’17]
Data stream (S1, S2, …, Sn) → Stream aggregator → StreamApprox (cloud datacenter) → Approximate output ± error bound
Limitations:
• It wastes bandwidth
• It utilizes only cloud datacenter resources
Edge computing
Components: source of data, edge node (local processing), gateway, cloud
Allows data to be processed at the edge node before it’s sent to the cloud
Opportunities:
• Providing more computing resources
• Saving bandwidth
Edge infrastructure
Source: https://peering.google.com/#/infrastructure
Azure IoT edge
Watson IoT
AWS IoT
Problem statement
To build a stream analytics system
• By utilizing the cloud and edge computing resources
• By leveraging approximate computing
Design goals
• Efficiency: Efficient utilization of computing resources
• Adaptability: Adaptive execution based on the available resources
• Transparency: No code changes or resource management required
Outline
• Motivation
• Design
• Implementation
• Evaluation
ApproxIoT: Overview
ApproxIoT employs sampling in the distributed environment of edge + cloud
Sources (S1, …, Si, …, Sn, …, Sm) → Edge nodes → Regional edge → Continental node → Central node (cloud)
Query → ApproxIoT → Approximate output ± error bound
Naïve algorithm: simple random sampling (SRS)
Query → SRS → Approximate output ± error bound
Problem: sub-streams are sampled unfairly; some are overlooked, which leads to low accuracy
Background: Stratified sampling
Advantage: The sub-streams are sampled fairly
Disadvantage: Requires the knowledge of each sub-stream size
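A minimal sketch of stratified sampling with proportional allocation, to make both points concrete. The function name `stratified_sample` and the three toy sub-streams are hypothetical; note that the code needs `len(items)` for every sub-stream up front, which is exactly the disadvantage stated above.

```python
import random

def stratified_sample(sub_streams, total_sample_size, rng=None):
    """Proportional-allocation stratified sampling.

    `sub_streams` maps a sub-stream name to the full list of its items, so
    the size of every sub-stream must be known before sampling starts.
    """
    rng = rng or random.Random()
    total = sum(len(items) for items in sub_streams.values())
    sample = {}
    for name, items in sub_streams.items():
        # Each stratum gets a share proportional to its size, at least 1 item.
        k = max(1, round(total_sample_size * len(items) / total))
        sample[name] = rng.sample(items, min(k, len(items)))
    return sample

streams = {
    "s1": list(range(1000)),  # large sub-stream
    "s2": list(range(100)),   # medium sub-stream
    "s3": list(range(10)),    # small sub-stream, easily missed by SRS
}
picked = stratified_sample(streams, total_sample_size=111, rng=random.Random(0))
print({name: len(items) for name, items in picked.items()})
# {'s1': 100, 's2': 10, 's3': 1}
```

Every sub-stream is represented, including the smallest one.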
Background: Reservoir sampling
Size of reservoir = 4
• The first 4 items fill the reservoir
• The 5th item: with probability 4/5, a randomly chosen item in the reservoir is replaced by the 5th item
• The 6th item: with probability 4/6, a randomly chosen item in the reservoir is replaced by the 6th item
Advantage:
• No pre-knowledge required of sub-stream size
Disadvantages:
• The sub-streams are sampled unfairly
• Difficult to run on multiple nodes
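The replacement rule above corresponds to the classic reservoir sampling scheme (often called Algorithm R). A short sketch, with the function name `reservoir_sample` chosen for this example:

```python
import random

def reservoir_sample(stream, size, rng=None):
    """Keep a uniform random sample of `size` items from a stream of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for count, item in enumerate(stream, start=1):
        if count <= size:
            # The first `size` items fill the reservoir.
            reservoir.append(item)
        elif rng.random() < size / count:
            # The count-th item is accepted with probability size/count
            # and replaces a randomly chosen item in the reservoir.
            reservoir[rng.randrange(size)] = item
    return reservoir

print(reservoir_sample(range(1, 101), size=4, rng=random.Random(7)))
```

The stream is consumed in a single pass and its length is never needed, which matches the advantage listed above.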
ApproxIoT sampling algorithm
Weighted hierarchical sampling (WHS)
Combining stratified and reservoir sampling
Easy to parallelize, requires no synchronization between sub-streams
Weight: W = C/N if C > N; W = 1 if C ≤ N (C: number of items in the sub-stream, N: reservoir size)
WHS
Reservoir size N = 4, initial weight W = 1
Example: a sub-stream with C = 6 items is sampled down to 4 items with weight W = 6/4; sub-streams with C ≤ 4 keep W = 1
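A sketch of one WHS step on a single sub-stream, under the reading that C is the number of items seen in the sub-stream and N is the reservoir size. The function names `whs_weight` and `sample_sub_stream` are hypothetical, and this is an illustration rather than the authors' implementation.

```python
import random

def whs_weight(count, reservoir_size):
    """Weight rule from the slide: C/N if C > N, otherwise 1."""
    return count / reservoir_size if count > reservoir_size else 1.0

def sample_sub_stream(items, reservoir_size, rng=None):
    """Reservoir-sample one sub-stream and return (sample, weight)."""
    rng = rng or random.Random()
    reservoir = []
    count = 0
    for item in items:
        count += 1
        if count <= reservoir_size:
            reservoir.append(item)
        elif rng.random() < reservoir_size / count:
            reservoir[rng.randrange(reservoir_size)] = item
    return reservoir, whs_weight(count, reservoir_size)

# Six items, reservoir size 4: four items are kept with weight 6/4.
sample, weight = sample_sub_stream(range(6), 4, random.Random(1))
print(len(sample), weight)  # 4 1.5
```

Because each sub-stream has its own reservoir and its own counter, sub-streams can be processed independently of each other.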
WHS on edge nodes
Reservoir size equals 2
Regional edge WHS: incoming weights W=1, W=1, W=1; outgoing weights W=6/2=3, W=4/2=2, W=1
Continental node WHS: incoming weights W=4, W=1, W=3; outgoing weights W=4*5/2=10, W=1*3/2=3/2, W=3
Outgoing weight = carried weight × current weight
Easy to parallelize, requires no synchronization between computing nodes
Hierarchy: Edge nodes → Regional edge → Continental node → Central node (cloud)
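The weight update at one layer of the hierarchy can be sketched as follows. The function name `whs_layer` and the item values are hypothetical; for brevity the sketch samples a finite batch with `random.Random.sample`, which is used here as a stand-in for running a reservoir over the same items. The example input reproduces the continental-node numbers from the slide.

```python
import random
from fractions import Fraction

def whs_layer(sub_streams, reservoir_size, rng=None):
    """One WHS step at a node.

    `sub_streams` maps a name to (items, carried_weight).
    Returns a dict mapping the name to (sample, outgoing_weight), where
    outgoing_weight = carried_weight * current_weight.
    """
    rng = rng or random.Random()
    out = {}
    for name, (items, carried) in sub_streams.items():
        count = len(items)
        if count > reservoir_size:
            sample = rng.sample(items, reservoir_size)
            current = Fraction(count, reservoir_size)  # C/N
        else:
            sample = list(items)
            current = Fraction(1)
        out[name] = (sample, carried * current)
    return out

# Continental node from the slide: reservoir size 2, carried weights 4, 1, 3.
incoming = {
    "a": ([11, 12, 13, 14, 15], Fraction(4)),  # C=5 -> 4 * 5/2 = 10
    "b": ([21, 22, 23], Fraction(1)),          # C=3 -> 1 * 3/2 = 3/2
    "c": ([31, 32], Fraction(3)),              # C=2 -> weight stays 3
}
for name, (sample, weight) in whs_layer(incoming, 2, random.Random(0)).items():
    print(name, len(sample), weight)
```

Each node needs only the items and weights it receives, so no coordination with other nodes is required.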
ApproxIoT in the cloud
Reservoir size equals 1; query: sum
The weights are carried: incoming weights W=4/3, W=1, W=1 become W=4/3*6/1=8, W=1*4/1=4, W=1*2/1=2
Approximate output: 8*x1 + 4*x2 + 2*x3 ± error bound, where x1, x2, x3 are the sampled items of the three sub-streams
Hierarchy: Edge nodes → Regional edge → Continental node → Central node (cloud)
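The sum query at the central node then reduces to a weighted sum over the sampled items. In the sketch below the function name `weighted_sum` and the item values 10, 20, 30 are made up for illustration; the weights are the ones from the slide. The error-bound computation is not shown because the slides do not spell it out.

```python
from fractions import Fraction

def weighted_sum(samples):
    """Approximate SUM: each sampled item counts `weight` times.

    `samples` is an iterable of (sampled_items, weight) pairs, one per sub-stream.
    """
    return sum(weight * sum(items) for items, weight in samples)

# Final weights from the slide (reservoir size 1 at the central node).
w1 = Fraction(4, 3) * Fraction(6, 1)  # 8
w2 = Fraction(1) * Fraction(4, 1)     # 4
w3 = Fraction(1) * Fraction(2, 1)     # 2

# One sampled item per sub-stream; the values are hypothetical.
samples = [([10], w1), ([20], w2), ([30], w3)]
print(weighted_sum(samples))  # 220 = 8*10 + 4*20 + 2*30
```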
Outline
• Motivation
• Design
• Implementation
• Evaluation
Implementation
Sources: S1, S2, …, Sn
Kafka cluster: stream pub/sub
Kafka Streams
Data flow: data stream → edge nodes → sampled data stream → cloud datacenter
See the paper for more details
Experimental setup
• Evaluation questions
• Accuracy vs. sample size
• Throughput vs. sample size
• Testbed: 25 nodes
• 15 nodes for ApproxIoT deployment
• 10 nodes for Kafka cluster
• Datasets:
• Synthetic: Poisson and Gaussian distribution
• Real: Brasov pollution and New York Taxi Ride
See the paper for more results!
Accuracy vs. sample size
[Figure: accuracy loss (%) vs. sampling fraction (%) for SRS and ApproxIoT; lower is better]
ApproxIoT: ~2600X higher accuracy over SRS
The average is 0.035%
Throughput vs. sample size
[Figure: throughput (k items/s) vs. sampling fraction (%) for native execution, SRS, and ApproxIoT; higher is better]
• ApproxIoT has low overhead compared to the native execution
• ApproxIoT has similar throughput to SRS
Conclusion
ApproxIoT: Approximate analytics for edge computing
Efficiency: Efficient computing and bandwidth resource utilization
Adaptability: Adaptive execution based on the available resources
Transparency: Requires no code changes or resource management
Thank you!
More details on the project website:
https://ApproxIoT.github.io/ApproxIoT/
