SlideShare a Scribd company logo
1 of 47
Download to read offline
29th International European Conference on Parallel and Distributed Computing (Euro-Par2023)
Moysis Symeonides∗, Demetris Trihinas†, George Pallis∗, Marios D. Dikaiakos∗
∗Department of Computer Science
University of Cyprus
{msymeo03, pallis, mdd}@ucy.ac.cy
†Department of Computer Science
University of Nicosia
trihinas.d@unic.ac.cy
An Emulation Framework for
Edge-enabled Apache Spark Deployments
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
The Internet of Things
2
● Transforming the physical world into an information system.
● The predictions put the number of connected IoT devices up to 29.4
billion by 2030 and data generated from them to be 73.1 ZB by 2025.
Analytic Insights
cloud
● It only seems “natural” that IoT services offload analytic jobs to the cloud for processing.
https://www.idc.com/getdoc.jsp?containerId=prAP46737220, 2020
https://www.statista.com/statistics/1183457/iot-connected-devices-worldwide/, 2023
Data Analyst
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Apache Spark and Internet of Things
3
Apache Spark democratized the big data processing by hiding the complexity related to
machine communication, resource management, task scheduling, fault tolerance, etc…
● But… data analysis usually come with near real-time requirements and moving data
centrally for processing penalizes analytics timeliness.
cloud
Analytic Insights
Data Analyst
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Bringing Apache Spark to the Edge
4
Analytic Insights
cloud
The “Edge”
Deploying Apache Spark to the Edge and performing data processing in place or in
vicinity of data generating devices offers:
● Efficient processing
● Less network pressure
● Cost reduction
Data Analyst
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
ARM 8 cores @2.2GHz
32GB RAM 512-core GPU
$695
Jetson Nano
Challenges of Edge-enabled Spark Deployments
5
5
ARM 4 cores@1.5GHz
4GB RAM
$54
Plethora of Devices and
Heterogeneity
ARM 4 cores @1.4GHz
4GB RAM + GPU No wifi
$69
AGX Xavier
Raspberry Pi 4
Scalable Geo-distributed
Deployments
Infrastructure and
System Monitoring
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 6
SparkEdgeEmu
An Emulation Framework for Edge-enabled Apache Spark Deployments
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
7
● Statistically Generated Edge Infrastructures
● Automated Apache Spark Deployment
● Query-level Performance Profiler
● Infrastructure & Apache Spark Metrics
SparkEdgeEmu
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 8
SparkEdgeEmu Overview
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 9
SparkEdgeEmu Overview
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 10
SparkEdgeEmu Overview
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Rausch, T., Lachner, C., Frangoudis, P. A., Raith, P., & Dustdar, S. Ether: Synthesizing Plausible Infrastructure Configurations for Evaluating Edge Computing Systems. In 3rd USENIX Workshop on
Hot Topics in Edge Computing (HotEdge 2020). USENIX Association
11
SparkEdgeEmu Overview
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 12
SparkEdgeEmu Overview
.…
.…
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 13
SparkEdgeEmu Overview
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 14
SparkEdgeEmu Overview
M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020)
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 15
SparkEdgeEmu Overview
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 16
SparkEdgeEmu Overview
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 17
Implementation Aspects
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 18
Modeling Abstractions
● Infrastructure components, that include:
○ Devices types that describe workers capabilities
including processor, memory, and disk.
○ Connections types that provide link characteristics
for uplink and downlink QoS, including data rate,
latency, error rate, etc.
● Use case
○ defines a scenario.
○ specifies use case parameters, like number of
devices, regions, etc., and the infrastructure
components.
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 19
Modeling Abstractions
● Infrastructure components, that include:
○ Devices types that describe workers capabilities
including processor, memory, and disk.
○ Connections types that provide link characteristics
for uplink and downlink QoS, including data rate,
latency, error rate, etc.
● Use case
○ defines a scenario.
○ specifies use case parameters, like number of
devices, regions, etc., and the infrastructure
components.
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 20
Modeling Abstractions
● Infrastructure components, that include:
○ Devices types that describe workers capabilities
including processor, memory, and disk.
○ Connections types that provide link characteristics
for uplink and downlink QoS, including data rate,
latency, error rate, etc.
● Use case
○ defines a scenario.
○ specifies use case parameters, like number of
devices, regions, etc., and the infrastructure
components.
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 21
Modeling Abstractions
● Infrastructure components, that include:
○ Devices types that describe workers capabilities
including processor, memory, and disk.
○ Connections types that provide link characteristics
for uplink and downlink QoS, including data rate,
latency, error rate, etc.
● Use case
○ defines a scenario.
○ specifies use case parameters, like number of
devices, regions, etc., and the infrastructure
components.
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 22
Interactive Programming Interface
SparkEdgeEmu offers a programming
interface that allows users to:
● load the modeling abstractions
● deploy the description and create an
emulated Edge-Spark testbed
● connect on Spark and submit queries
● retrieve metrics from infrastructure and
Spark cluster
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 23
Interactive Programming Interface
SparkEdgeEmu offers a programming
interface that allows users to:
● load the modeling abstractions
● deploy the description and create an
emulated Edge-Spark testbed
● connect on Spark and submit queries
● retrieve metrics from infrastructure and
Spark cluster
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 24
Interactive Programming Interface
SparkEdgeEmu offers a programming
interface that allows users to:
● load the modeling abstractions
● deploy the description and create an
emulated Edge-Spark testbed
● connect on Spark and submit queries
● retrieve metrics from infrastructure and
Spark cluster
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 25
Interactive Programming Interface
SparkEdgeEmu offers a programming
interface that allows users to:
● load the modeling abstractions
● deploy the description and create an
emulated Edge-Spark testbed
● connect on Spark and submit queries
● retrieve metrics from infrastructure and
Spark cluster
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 26
Interactive Programming Interface
SparkEdgeEmu offers a programming
interface that allows users to:
● load the modeling abstractions
● deploy the description and create an
emulated Edge-Spark testbed
● connect on Spark and submit queries
● retrieve metrics from infrastructure and
Spark cluster
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 27
Interactive Programming Interface
SparkEdgeEmu offers a programming
interface that allows users to:
● load the modeling abstractions
● deploy the description and create an
emulated Edge-Spark testbed
● connect on Spark and submit queries
● retrieve metrics from infrastructure and
Spark cluster
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 28
Infrastructure Use Case Generator
Ether is a framework for synthesizing plausible edge
infrastructure use cases configurations, grounded on
empirical data, like:
● Smart Cities Scenarios
● Industrial IoT deployment
● Mobile Edge clouds and vehicular networks, etc.
Ether creates an in-memory network with all information about the use case annotated with
resource and network QoS properties.
Rausch, T., Lachner, C., Frangoudis, P. A., Raith, P., & Dustdar, S. Ether: Synthesizing Plausible Infrastructure Configurations for Evaluating Edge Computing Systems. In 3rd USENIX Workshop on
Hot Topics in Edge Computing (HotEdge 2020). USENIX Association
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 29
Use Case to Emulation Model
SparkEdgeEmu generates the
execution configuration files by
traversing and translating the
network’s structure.
…
worker-nuc-0
image: emulator/spark-worker:3.0.0
environment:
- SPARK_WORKER_CORES=4
- SPARK_WORKER_MEMORY=8G
- SPARK_MASTER_HOST=spark-master
…
volumes:
- /data:{data_folder}
…
…
- from_node: nuc-0
to_node: cloudlet-vm-0
network: edge_net
properties:
bandwidth: 125Mbps
latency:
delay: 24.45ms
…
Typically, an Edge emulation framework forces users to define every emulated node, pairwise
connections, Spark configurations, etc…
A topology with 20 nodes generates a
YAML configuration file with more
than 100 configuration blocks!!!
…
- name: nuc
capabilities:
processor:
clock_speed: 2400
cores: 4
memory: 8G
disk: …
…
- label: nuc-0
networks:
- edge_net
node: nuc
replicas: 1
service: worker-nuc-0
…
…
- from_node: cloudlet-vm-0
to_node: nuc-0
network: edge_net
properties:
bandwidth: 125Mbps
latency:
delay: 24.45ms
…
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 30
Emulation & Deployment
M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020)
SparkEdgeEmu utilizes Fogify as a scalable
underlying emulator since it provides:
● Resource Heterogeneity
● Network Link Heterogeneity
● Any-scale Experimentation
● Monitoring Capabilities
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 31
Emulation & Deployment
M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020)
SparkEdgeEmu utilizes Fogify as a scalable
underlying emulator since it provides:
● Resource Heterogeneity
● Network Link Heterogeneity
● Any-scale Experimentation
● Monitoring Capabilities
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 32
Emulation & Deployment
M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020)
SparkEdgeEmu utilizes Fogify as a scalable
underlying emulator since it provides:
● Resource Heterogeneity
● Network Link Heterogeneity
● Any-scale Experimentation
● Monitoring Capabilities
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 33
Emulation & Deployment
M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020)
SparkEdgeEmu utilizes Fogify as a scalable
underlying emulator since it provides:
● Resource Heterogeneity
● Network Link Heterogeneity
● Any-scale Experimentation
● Monitoring Capabilities
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 34
Emulation & Deployment
SparkEdgeEmu utilizes Fogify as a scalable
underlying emulator since it provides:
● Resource Heterogeneity
● Network Link Heterogeneity
● Any-scale Experimentation
● Monitoring Capabilities
M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020)
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 35
Experimental Study
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 36
Smart City Use Case
We utilize the framework’s templates to
emulate a city-scale Apache Spark cluster with
22 heterogeneous Edge nodes deployed in three
regions.
The emulated topology includes 22 nodes:
● One cloudlet server, 8 cores@2.4GHz and 8GB RAM
● 8 raspberry Pi3b, 4 cores@1.2GHz and 1GB RAM
● 6 raspberry Pi4, 4 cores @1.8GHz and 4GB RAM
● 7 Jetson-Nano, 4 cores @1.43 GHz and 4GB RAM
We use the real-world dataset* of NYC For-Hire Vehicle (”FHV”) trip routes in the first half of 2022 from New York city.
*https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 37
Performance Evaluation of Queries
We perform three queries on top of the deployed Spark
cluster, namely:
● Average payment of each drop-off location
● Number of trips per company
● The overall summarized tips
It seems that execution time follows Q1>Q2>Q3, while the
refetching of data harms the performance of the queries.
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
CPU utilization & Task placement
38
Apache Spark tends to assign more
tasks to more powerful nodes, while
these nodes usually are underutilized in
Edge topologies…
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Memory Usage
39
Apache Spark occupied less memory on
edge memory-constrained devices, even
if it assigns to them a similar number of
tasks as the other Edge nodes.
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Network Utilization
40
The health-check and the task-assigning
messages contribute a non-negligible
extra overhead in network traffic.
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023 41
Conclusion
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Conclusion
● We introduced SparkEdgeEmu an open-source emulation framework for Edge-enabled
Apache Spark Deployments, that…
○ features a powerful model specification for Edge use case deployments
○ offers a ease-to-use interactive programming interface
○ generates representative and realistic large-scale deployments
○ provides monitoring capabilities for both emulated infrastructure and Spark cluster
● We provided a detailed description of the implementation aspects, such as modeling,
deployment realization and Spark Edge cluster emulation.
● We demonstrate the use of SparkEdgeEmu utilizing a representative use case scenario
highlighting how it influences the performance of the queries, and their effects on
underlying resources.
42
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Thank You
43
repo: https://github.com/UCY-LINC-LAB/SparkEdgeEmu
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Backup slides
44
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Testbeds & Emulators
45
To get the most accurate results in testing, you need a physical deployment, however…
● to build a physical testbed is costly, time-consuming, and for minor changes requires extra money
and re-configurations
Edge and Fog Emulators can help by providing scalable testbeds with resource and network
heterogeneity[1][2][3] but they do not…
● provide randomized large-scale Edge deployments, with users manually needing to configure and
design every single emulated node
● provide specialized deployments for Big Data Engines (e.g., Apache Spark), and consequently
● offer fine-grained monitoring metrics from the Big Data Engines
[1] Fogbed: A rapid-prototyping emulation environment for fog computing, A. Coutinho et al. IEEE International Conference on Communications (ICC), 2018
[2] Emuedge: A hybrid emulator for reproducible and realistic edge computing experiments, Y. Zeng IEEE International Conference on Fog Computing (ICFC), 2019
[3] MockFog: Emulating Fog Computing Infrastructure in the Cloud, J. Hasenburg et al. IEEE International Conference on Fog Computing (ICFC), 2019
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Use Case Scenario
46
Data Analyst
● The underlying cluster of 48cores@2.450 GHz and 176 GB memory
Moysis Symeonides
msymeo03@ucy.ac.cy
Laboratory for Internet
Computing
Euro-Par 2023
Emulation Accuracy
The emulated computing resources has only a small performance
degradation for workloads approaching 100% CPU usage
Fogify achieves near to real-world network link capabilities, with
only outliers not captured and a slight overhead in low-latency
connections
The emulation results closely follow the real measurements
with a 5%-8% deviation of the overall experiment time.

More Related Content

Similar to SparkEdgeEmu: An Emulation Framework for Edge-enabled Apache Spark Deployments

An Exploration of Grid Computing to be Utilized in Teaching and Research at TU
An Exploration of Grid Computing to be Utilized in Teaching and Research at TUAn Exploration of Grid Computing to be Utilized in Teaching and Research at TU
An Exploration of Grid Computing to be Utilized in Teaching and Research at TU
Eswar Publications
 
A survey of fog computing concepts applications and issues
A survey of fog computing concepts  applications and issuesA survey of fog computing concepts  applications and issues
A survey of fog computing concepts applications and issues
Rezgar Mohammad
 

Similar to SparkEdgeEmu: An Emulation Framework for Edge-enabled Apache Spark Deployments (20)

4 th International Conference on Cloud, Big Data and IoT (CBIoT 2023)
4 th International Conference on Cloud, Big Data and IoT (CBIoT 2023)4 th International Conference on Cloud, Big Data and IoT (CBIoT 2023)
4 th International Conference on Cloud, Big Data and IoT (CBIoT 2023)
 
4th International Conference on Cloud, Big Data and IoT (CBIoT 2023)
4th International Conference on Cloud, Big Data and IoT (CBIoT 2023)4th International Conference on Cloud, Big Data and IoT (CBIoT 2023)
4th International Conference on Cloud, Big Data and IoT (CBIoT 2023)
 
4th International Conference on Cloud, Big Data and IoT (CBIoT 2023)
4th International Conference on Cloud, Big Data and IoT (CBIoT 2023)4th International Conference on Cloud, Big Data and IoT (CBIoT 2023)
4th International Conference on Cloud, Big Data and IoT (CBIoT 2023)
 
Call for Research Articles - 4th International Conference on Cloud, Big Data ...
Call for Research Articles - 4th International Conference on Cloud, Big Data ...Call for Research Articles - 4th International Conference on Cloud, Big Data ...
Call for Research Articles - 4th International Conference on Cloud, Big Data ...
 
Experimental Based Learning and Modeling of Computer Networks
Experimental Based Learning and Modeling of Computer NetworksExperimental Based Learning and Modeling of Computer Networks
Experimental Based Learning and Modeling of Computer Networks
 
An Exploration of Grid Computing to be Utilized in Teaching and Research at TU
An Exploration of Grid Computing to be Utilized in Teaching and Research at TUAn Exploration of Grid Computing to be Utilized in Teaching and Research at TU
An Exploration of Grid Computing to be Utilized in Teaching and Research at TU
 
IoT Interfaces to Cloud + Big Data
 IoT Interfaces to Cloud + Big Data IoT Interfaces to Cloud + Big Data
IoT Interfaces to Cloud + Big Data
 
The Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdfThe Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdf
 
11 th International Conference on Parallel, Distributed Computing Technologi...
11 th International Conference on Parallel, Distributed Computing  Technologi...11 th International Conference on Parallel, Distributed Computing  Technologi...
11 th International Conference on Parallel, Distributed Computing Technologi...
 
Call for paper-11th International Conference on Parallel, Distributed Computi...
Call for paper-11th International Conference on Parallel, Distributed Computi...Call for paper-11th International Conference on Parallel, Distributed Computi...
Call for paper-11th International Conference on Parallel, Distributed Computi...
 
Call for Research Articles - 13th International Conference on Cloud Computing...
Call for Research Articles - 13th International Conference on Cloud Computing...Call for Research Articles - 13th International Conference on Cloud Computing...
Call for Research Articles - 13th International Conference on Cloud Computing...
 
Call for Research Articles - 13th International Conference on Cloud Computing...
Call for Research Articles - 13th International Conference on Cloud Computing...Call for Research Articles - 13th International Conference on Cloud Computing...
Call for Research Articles - 13th International Conference on Cloud Computing...
 
Call for Papers - 13th International Conference on Cloud Computing: Services ...
Call for Papers - 13th International Conference on Cloud Computing: Services ...Call for Papers - 13th International Conference on Cloud Computing: Services ...
Call for Papers - 13th International Conference on Cloud Computing: Services ...
 
A survey of fog computing concepts applications and issues
A survey of fog computing concepts  applications and issuesA survey of fog computing concepts  applications and issues
A survey of fog computing concepts applications and issues
 
11 th International Conference on Parallel, Distributed Computing Technologi...
11 th International Conference on Parallel, Distributed Computing  Technologi...11 th International Conference on Parallel, Distributed Computing  Technologi...
11 th International Conference on Parallel, Distributed Computing Technologi...
 
Cloud Module 1.pptx
Cloud Module 1.pptxCloud Module 1.pptx
Cloud Module 1.pptx
 
From Embedded to IoT and From Cloud to Edge & AIoT -- A computer technology t...
From Embedded to IoT and From Cloud to Edge & AIoT -- A computer technology t...From Embedded to IoT and From Cloud to Edge & AIoT -- A computer technology t...
From Embedded to IoT and From Cloud to Edge & AIoT -- A computer technology t...
 
Web Development in Advanced Threat Prevention
Web Development in Advanced Threat PreventionWeb Development in Advanced Threat Prevention
Web Development in Advanced Threat Prevention
 
Lecture 2_IoT.pptx
Lecture 2_IoT.pptxLecture 2_IoT.pptx
Lecture 2_IoT.pptx
 
11th International Conference on Parallel, Distributed Computing Technologies...
11th International Conference on Parallel, Distributed Computing Technologies...11th International Conference on Parallel, Distributed Computing Technologies...
11th International Conference on Parallel, Distributed Computing Technologies...
 

Recently uploaded

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Recently uploaded (20)

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

SparkEdgeEmu: An Emulation Framework for Edge-enabled Apache Spark Deployments

  • 1. 29th International European Conference on Parallel and Distributed Computing (Euro-Par2023) Moysis Symeonides∗, Demetris Trihinas†, George Pallis∗, Marios D. Dikaiakos∗ ∗Department of Computer Science University of Cyprus {msymeo03, pallis, mdd}@ucy.ac.cy †Department of Computer Science University of Nicosia trihinas.d@unic.ac.cy An Emulation Framework for Edge-enabled Apache Spark Deployments
  • 2. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 The Internet of Things 2 ● Transforming the physical world into an information system. ● The predictions put the number of connected IoT devices up to 29.4 billion by 2030 and data generated from them to be 73.1 ZB by 2025. Analytic Insights cloud ● It only seems “natural” that IoT services offload analytic jobs to the cloud for processing. https://www.idc.com/getdoc.jsp?containerId=prAP46737220, 2020 https://www.statista.com/statistics/1183457/iot-connected-devices-worldwide/, 2023 Data Analyst
  • 3. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Apache Spark and Internet of Things 3 Apache Spark democratized the big data processing by hiding the complexity related to machine communication, resource management, task scheduling, fault tolerance, etc… ● But… data analysis usually come with near real-time requirements and moving data centrally for processing penalizes analytics timeliness. cloud Analytic Insights Data Analyst
  • 4. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Bringing Apache Spark to the Edge 4 Analytic Insights cloud The “Edge” Deploying Apache Spark to the Edge and performing data processing in place or in vicinity of data generating devices offers: ● Efficient processing ● Less network pressure ● Cost reduction Data Analyst
  • 5. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 ARM 8 cores @2.2GHz 32GB RAM 512-core GPU $695 Jetson Nano Challenges of Edge-enabled Spark Deployments 5 5 ARM 4 cores@1.5GHz 4GB RAM $54 Plethora of Devices and Heterogeneity ARM 4 cores @1.4GHz 4GB RAM + GPU No wifi $69 AGX Xavier Raspberry Pi 4 Scalable Geo-distributed Deployments Infrastructure and System Monitoring
  • 6. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 6 SparkEdgeEmu An Emulation Framework for Edge-enabled Apache Spark Deployments
  • 7. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 7 ● Statistically Generated Edge Infrastructures ● Automated Apache Spark Deployment ● Query-level Performance Profiler ● Infrastructure & Apache Spark Metrics SparkEdgeEmu
  • 8. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 8 SparkEdgeEmu Overview
  • 9. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 9 SparkEdgeEmu Overview
  • 10. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 10 SparkEdgeEmu Overview
  • 11. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Rausch, T., Lachner, C., Frangoudis, P. A., Raith, P., & Dustdar, S. Ether: Synthesizing Plausible Infrastructure Configurations for Evaluating Edge Computing Systems. In 3rd USENIX Workshop on Hot Topics in Edge Computing (HotEdge 2020). USENIX Association 11 SparkEdgeEmu Overview
  • 12. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 12 SparkEdgeEmu Overview .… .…
  • 13. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 13 SparkEdgeEmu Overview
  • 14. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 14 SparkEdgeEmu Overview M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020)
  • 15. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 15 SparkEdgeEmu Overview
  • 16. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 16 SparkEdgeEmu Overview
  • 17. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 17 Implementation Aspects
  • 18. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 18 Modeling Abstractions ● Infrastructure components, that include: ○ Devices types that describe workers capabilities including processor, memory, and disk. ○ Connections types that provide link characteristics for uplink and downlink QoS, including data rate, latency, error rate, etc. ● Use case ○ defines a scenario. ○ specifies use case parameters, like number of devices, regions, etc., and the infrastructure components.
  • 19. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 19 Modeling Abstractions ● Infrastructure components, that include: ○ Devices types that describe workers capabilities including processor, memory, and disk. ○ Connections types that provide link characteristics for uplink and downlink QoS, including data rate, latency, error rate, etc. ● Use case ○ defines a scenario. ○ specifies use case parameters, like number of devices, regions, etc., and the infrastructure components.
  • 20. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 20 Modeling Abstractions ● Infrastructure components, that include: ○ Devices types that describe workers capabilities including processor, memory, and disk. ○ Connections types that provide link characteristics for uplink and downlink QoS, including data rate, latency, error rate, etc. ● Use case ○ defines a scenario. ○ specifies use case parameters, like number of devices, regions, etc., and the infrastructure components.
  • 21. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 21 Modeling Abstractions ● Infrastructure components, that include: ○ Devices types that describe workers capabilities including processor, memory, and disk. ○ Connections types that provide link characteristics for uplink and downlink QoS, including data rate, latency, error rate, etc. ● Use case ○ defines a scenario. ○ specifies use case parameters, like number of devices, regions, etc., and the infrastructure components.
  • 22. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 22 Interactive Programming Interface SparkEdgeEmu offers a programming interface that allows users to: ● load the modeling abstractions ● deploy the description and create an emulated Edge-Spark testbed ● connect on Spark and submit queries ● retrieve metrics from infrastructure and Spark cluster
  • 23. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 23 Interactive Programming Interface SparkEdgeEmu offers a programming interface that allows users to: ● load the modeling abstractions ● deploy the description and create an emulated Edge-Spark testbed ● connect on Spark and submit queries ● retrieve metrics from infrastructure and Spark cluster
  • 24. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 24 Interactive Programming Interface SparkEdgeEmu offers a programming interface that allows users to: ● load the modeling abstractions ● deploy the description and create an emulated Edge-Spark testbed ● connect on Spark and submit queries ● retrieve metrics from infrastructure and Spark cluster
  • 25. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 25 Interactive Programming Interface SparkEdgeEmu offers a programming interface that allows users to: ● load the modeling abstractions ● deploy the description and create an emulated Edge-Spark testbed ● connect on Spark and submit queries ● retrieve metrics from infrastructure and Spark cluster
  • 26. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 26 Interactive Programming Interface SparkEdgeEmu offers a programming interface that allows users to: ● load the modeling abstractions ● deploy the description and create an emulated Edge-Spark testbed ● connect on Spark and submit queries ● retrieve metrics from infrastructure and Spark cluster
  • 27. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 27 Interactive Programming Interface SparkEdgeEmu offers a programming interface that allows users to: ● load the modeling abstractions ● deploy the description and create an emulated Edge-Spark testbed ● connect on Spark and submit queries ● retrieve metrics from infrastructure and Spark cluster
  • 28. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 28 Infrastructure Use Case Generator Ether is a framework for synthesizing plausible edge infrastructure use cases configurations, grounded on empirical data, like: ● Smart Cities Scenarios ● Industrial IoT deployment ● Mobile Edge clouds and vehicular networks, etc. Ether creates an in-memory network with all information about the use case annotated with resource and network QoS properties. Rausch, T., Lachner, C., Frangoudis, P. A., Raith, P., & Dustdar, S. Ether: Synthesizing Plausible Infrastructure Configurations for Evaluating Edge Computing Systems. In 3rd USENIX Workshop on Hot Topics in Edge Computing (HotEdge 2020). USENIX Association
  • 29. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 29 Use Case to Emulation Model SparkEdgeEmu generates the execution configuration files by traversing and translating the network’s structure. … worker-nuc-0 image: emulator/spark-worker:3.0.0 environment: - SPARK_WORKER_CORES=4 - SPARK_WORKER_MEMORY=8G - SPARK_MASTER_HOST=spark-master … volumes: - /data:{data_folder} … … - from_node: nuc-0 to_node: cloudlet-vm-0 network: edge_net properties: bandwidth: 125Mbps latency: delay: 24.45ms … Typically, an Edge emulation framework forces users to define every emulated node, pairwise connections, Spark configurations, etc… A topology with 20 nodes generates a YAML configuration file with more than 100 configuration blocks!!! … - name: nuc capabilities: processor: clock_speed: 2400 cores: 4 memory: 8G disk: … … - label: nuc-0 networks: - edge_net node: nuc replicas: 1 service: worker-nuc-0 … … - from_node: cloudlet-vm-0 to_node: nuc-0 network: edge_net properties: bandwidth: 125Mbps latency: delay: 24.45ms …
  • 30. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 30 Emulation & Deployment M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020) SparkEdgeEmu utilizes Fogify as a scalable underlying emulator since it provides: ● Resource Heterogeneity ● Network Link Heterogeneity ● Any-scale Experimentation ● Monitoring Capabilities
  • 31. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 31 Emulation & Deployment M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020) SparkEdgeEmu utilizes Fogify as a scalable underlying emulator since it provides: ● Resource Heterogeneity ● Network Link Heterogeneity ● Any-scale Experimentation ● Monitoring Capabilities
  • 32. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 32 Emulation & Deployment M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020) SparkEdgeEmu utilizes Fogify as a scalable underlying emulator since it provides: ● Resource Heterogeneity ● Network Link Heterogeneity ● Any-scale Experimentation ● Monitoring Capabilities
  • 33. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 33 Emulation & Deployment M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020) SparkEdgeEmu utilizes Fogify as a scalable underlying emulator since it provides: ● Resource Heterogeneity ● Network Link Heterogeneity ● Any-scale Experimentation ● Monitoring Capabilities
  • 34. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 34 Emulation & Deployment SparkEdgeEmu utilizes Fogify as a scalable underlying emulator since it provides: ● Resource Heterogeneity ● Network Link Heterogeneity ● Any-scale Experimentation ● Monitoring Capabilities M. Symeonides, Z. Georgiou, D. Trihinas, G. Pallis and M. D. Dikaiakos, "Fogify: A Fog Computing Emulation Framework," IEEE/ACM Symposium on Edge Computing (SEC2020)
  • 35. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 35 Experimental Study
  • 36. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 36 Smart City Use Case We utilize the framework’s templates to emulate a city-scale Apache Spark cluster with 22 heterogeneous Edge nodes deployed in three regions. The emulated topology includes 22 nodes: ● One cloudlet server, 8 cores@2.4GHz and 8GB RAM ● 8 raspberry Pi3b, 4 cores@1.2GHz and 1GB RAM ● 6 raspberry Pi4, 4 cores @1.8GHz and 4GB RAM ● 7 Jetson-Nano, 4 cores @1.43 GHz and 4GB RAM We use the real-world dataset* of NYC For-Hire Vehicle (”FHV”) trip routes in the first half of 2022 from New York city. *https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page
  • 37. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 37 Performance Evaluation of Queries We perform three queries on top of the deployed Spark cluster, namely: ● Average payment of each drop-off location ● Number of trips per company ● The overall summarized tips It seems that execution time follows Q1>Q2>Q3, while the refetching of data harms the performance of the queries.
  • 38. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 CPU utilization & Task placement 38 Apache Spark tends to assign more tasks to more powerful nodes, while these nodes usually are underutilized in Edge topologies…
  • 39. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Memory Usage 39 Apache Spark occupied less memory on edge memory-constrained devices, even if it assigns to them a similar number of tasks as the other Edge nodes.
  • 40. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Network Utilization 40 The health-check and the task-assigning messages contribute a non-negligible extra overhead in network traffic.
  • 41. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 41 Conclusion
  • 42. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Conclusion ● We introduced SparkEdgeEmu an open-source emulation framework for Edge-enabled Apache Spark Deployments, that… ○ features a powerful model specification for Edge use case deployments ○ offers a ease-to-use interactive programming interface ○ generates representative and realistic large-scale deployments ○ provides monitoring capabilities for both emulated infrastructure and Spark cluster ● We provided a detailed description of the implementation aspects, such as modeling, deployment realization and Spark Edge cluster emulation. ● We demonstrate the use of SparkEdgeEmu utilizing a representative use case scenario highlighting how it influences the performance of the queries, and their effects on underlying resources. 42
  • 43. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Thank You 43 repo: https://github.com/UCY-LINC-LAB/SparkEdgeEmu
  • 44. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Backup slides 44
  • 45. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Testbeds & Emulators 45 To get the most accurate results in testing, you need a physical deployment, however… ● to build a physical testbed is costly, time-consuming, and for minor changes requires extra money and re-configurations Edge and Fog Emulators can help by providing scalable testbeds with resource and network heterogeneity[1][2][3] but they do not… ● provide randomized large-scale Edge deployments, with users manually needing to configure and design every single emulated node ● provide specialized deployments for Big Data Engines (e.g., Apache Spark), and consequently ● offer fine-grained monitoring metrics from the Big Data Engines [1] Fogbed: A rapid-prototyping emulation environment for fog computing, A. Coutinho et al. IEEE International Conference on Communications (ICC), 2018 [2] Emuedge: A hybrid emulator for reproducible and realistic edge computing experiments, Y. Zeng IEEE International Conference on Fog Computing (ICFC), 2019 [3] MockFog: Emulating Fog Computing Infrastructure in the Cloud, J. Hasenburg et al. IEEE International Conference on Fog Computing (ICFC), 2019
  • 46. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Use Case Scenario 46 Data Analyst ● The underlying cluster of 48cores@2.450 GHz and 176 GB memory
  • 47. Moysis Symeonides msymeo03@ucy.ac.cy Laboratory for Internet Computing Euro-Par 2023 Emulation Accuracy The emulated computing resources has only a small performance degradation for workloads approaching 100% CPU usage Fogify achieves near to real-world network link capabilities, with only outliers not captured and a slight overhead in low-latency connections The emulation results closely follow the real measurements with a 5%-8% deviation of the overall experiment time.