SlideShare a Scribd company logo
RIT Computer Science • Capstone Report • 2228
Secure Data Distribution Algorithm for Fog
Computing
Hima Bindu Krovvidi
Department of Computer Science
Golisano College of Computing and Information Sciences
Rochester Institute of Technology
Rochester, NY 14623
hk4233@rit.edu
Abstract – The use of cloud computing is turning into
an increasingly prominent way to access and store data
primarily because of its scalability and simplicity. Several
techniques have been taken to increase cloud infrastruc-
ture security, such as employing a three-layer privacy-
preserving cloud storage system so that the infrastructure
is built, such that even if an insider has access to the
data, they will have access to only a fraction of it and
cannot retrieve the rest. The current distribution algo-
rithm focuses only on providing a fault-tolerant and data
integrity module and has scope for security improvements.
In order to solve this problem, we propose a stronger
data fragmentation and data distribution algorithm to
help increase data confidentiality and enhance encryption.
When this enhancement is combined with the existing
design, it will result in a highly secure and fault-tolerant
system where users may protect their data. We describe
the proposed design’s architecture and evaluate its security
challenges.
Index Terms—Fog Computing, AES, Reed Solomon, Data
Distribution
I. INTRODUCTION
Since the beginning of cloud computing, there have always
been advancements in how to make the cloud more robust
and secure. It has undergone numerous enhancements, making
the cloud stronger than before. With increasing reliance on
the cloud, consumers store the majority of their data on a
cloud framework [1]. As a result, the cloud is critical for
data storage. This implies that cloud security is an incredibly
essential concern. With an increase in breaches comes an
increase in security. But there have always been attacks and
ways to access the data in spite of these extreme security
measures [2]. This gives us a reason to keep upgrading and
improving our data.
Data breaches caused by internal attacks can take place for
several reasons [3]. Employees or contractors may misuse
their privileges and steal or sell this data to unauthorized third-
party users. Reasons like employee negligence where they
leave a system unlocked or fail to follow appropriate security
measures is also a reason for attackers to gain access. Even
backdoor exploitation, where attackers analyze the weaknesses
in the system and exploit them to gain access to the data in
the cloud, is another possibility to gain access to the data in
the cloud.
To protect against these kinds of attacks, there have been
several studies on how one can improve cloud security. Meth-
ods like creating a three-layered architecture [4] or involving
fog computing [5] have all been inclusive in designing a
constructive architecture in improving the security of the
cloud. Studies have focused on implementing the idea that
instead of putting all the data on one server, spreading it across
various other servers would prevent the attacker from having
access to all the data, if at all they do.
The distribution and fragmentation algorithms used in this
paper [4] have focused on dividing the data on three different
servers, such as the local machine, the fog server, and the cloud
server. The distribution the authors have designed is such that
they have 1% of the data on the local machine, 4% on the
fog server, and 95% on the cloud server. They [4] have used
the Hash Soloman code algorithm [4] [6] to separate the data
which primarily focuses on providing a fault-tolerant and data
integrity module. However, when having 95% of the data on a
single cloud server, it is important to understand that although
the attacker does not have access to other servers, they still
have access to 95% of the data which is significantly large. So
adding security and encryption as another layer of protection
within each server is necessary and crucial to ensure that
the data distributed amongst these different servers is highly
encrypted and unbreakable.
Therefore, we propose a stronger, more secure data dis-
tribution module in which the data is divided and encrypted
in combination with the existing data fragmentation algo-
rithm to maintain fault tolerance and add a layer of data
confidentiality. The proposed algorithm focuses on restricting
any kind of access to the data, whether it be a portion on
a server or the entire data. Compared to previous methods
[4], our method gives us an enhanced layer of security in
combination with the Hash Soloman Code algorithm [4], [6]
to provide a comprehensive security solution that addresses
both data confidentiality and integrity. The remainder of the
paper is structured as follows: Section 3 focuses on related
work; Section 4 covers the project methodology; Section 5
describes the evaluation methodologies; Section 6 describes
Rochester Institute of Technology 1 | Page
RIT Computer Science • Capstone Report • 2228
the outcomes; Section 7 describes future work and the scope
of improvements; and Section 8 concludes the paper.
II. APPROACH
The project aims on improving the data confidentiality that
could be provided by the three-layer cloud storage system.
By introducing an enhanced security algorithm to the existing
infrastructure, we aim to create a more robust model to sort
and distribute data across different servers ensuring security.
Adding advanced encryption standard (AES) algorithm [7]in
combination with Reed-Solomon Code algorithm [4], [6]
helps us achieve this requirement. The existing data is pro-
cessed through a distribution algorithm and passed through this
new algorithm generated, to create another layer of security.
To solve this, we take the data sent to the cloud and add
redundant data to ensure data integrity, and then jumble this
data, and divide it into blocks. After this process, each block
will be sent to the particular servers, i.e., cloud server, fog
server, and local machine based on the existing distribution
algorithm. Before the data is sent into the respective servers,
it undergoes AES encryption [7].
The main part of our proposed algorithm lies in its combi-
nation with Reed-Solomon Code algorithm [4], [6] with AES
[7]. For this, we make use of the PKCS7 padding, to ensure
that the input data is appropriately aligned and meets the block
size as required by AES for the encryption and decryption
process. By using this padding, we are able to introduce false
or misleading information that would essentially scramble the
genuine data with the decoiled data, hence, making it difficult
for the malicious attackers to actually find the genuine data,
which is hidden amongst the padding.
Since AES [7] uses symmetric key for encryption and
decryption, it is important that we formulate a method such
that only the user has access to the data on the cloud. To
do this, we generate a private key that is only known to
the user (using a secure key generation algorithm), and the
symmetric key is encrypted using the secret key of the user.
This algorithm ensures a fault-tolerant system, and mitigates
data loss and corruption of data. The usage of AES [7] makes
it more difficult to breach and increases data confidentiality.
Using this method is definitely an improvement to previous
techniques since, this algorithm ensures that the attacker will
not have access to any portion of the data.
III. LITERATURE REVIEW
The importance of security in cloud storage has attracted
a lot of attention in both the industry and from the ed-
ucational perspective. There has been lots of research in
the field of secure cloud storage architectures. We look at
different encryption studies to provide a secure data stor-
age system in combination with a Fog computing server.
Arwa [8] talks about how using a key-exchange protocol
based on attribute-based encryption provides secure inter-
communication amongst different nodes. Xu [9] talks about
another method involving homomorphic encryption, based on
an article [10] and access policies to provide secure data
sharing. Xu [9] also talks about an access control system for
Fog computing environments, such that the data is confidential
and there is selective access only. According to Seol [11],
data leakage can occur when software is exploited or when
employees or administrators misuse sensitive knowledge. The
authors offer a system wherein an encrypted storage service is
used to protect data from potentially compromised software
and administrators. The authors discuss leveraging crypto
processors to provide a safe storage service for consumers,
with details inferences from a paper written by Gaspar [12].
The system is secure since the information that is saved has
been secured and will not be disclosed even with privileged
access.
There have been several security algorithms employed in
various cloud storage systems, and are yet to be utilized to
build a more robust system. Concepts like AES (Advanced
Encryption Standard) [7] focus on providing a strongly built,
symmetric encryption and decryption algorithm. AES aims on
providing data confidentiality. The paper [13] talks about how
it combines AES with quantum cryptography to create a highly
secure application. The authors say with the combination
of AES and quantum security, the application created is of
unprecedented level and can be used in high security requiring
systems such as the nuclear defense system. The paper [13]
mentions how AES is the fastest block cipher and is used for
practical security. On a comparative study on AES, DES, RSA,
and OTP, [14] explains how all four encryption algorithms
are extremely robust and describes AES and DES on their
accuracy of performance.
Reddy [15] was able to create a solution to the emerging
requirement for cloud data security. The technique employs
hybrid cryptography as well as file splitting. Hybrid cryptog-
raphy is an approach that combines the concepts of symmetric
encryption and public-key encryption. The author emphasized
that symmetric encryption is efficient but insecure because the
same key is shared, hence the solution entails encrypting the
symmetric key using the public key. This solution made it
sturdy, secure, and efficient. File splitting entails separating the
file into different pieces and encrypting each part individually,
such that even if one of them is gained by an attack, the
attacker will be unable to decode it.
IV. IMPLEMENTATION
The implementation of the proposed algorithm involves
making changes and enhancing the security of the data dis-
tribution algorithm. The proposed algorithm will focus on
dividing the data in a more secure way, such that the data
exposed on one server will not affect the data or security on
the other server.
The project basically aims to provide a robust and secure
data distribution algorithm for Fog computing architectures.
With the help of encryption, data fragmentation, and access
control methods, we are able to implement this algorithm
successfully. Below are the descriptions of the implementation
in detail:
Rochester Institute of Technology 2 | Page
RIT Computer Science • Capstone Report • 2228
Fig. 1. Architecture of Data Distribution on Cloud Servers
Fig. 2. Data Retrieval from the Cloud Servers
A. System Design
As the paper [4] describes their architecture on dividing
the data into three servers mainly, the cloud, fog, and the local
machine, we have implemented the same architecture.
We use this architecture to deploy our data distribution
algorithm. As seen in the Figure 1, we can find that the data
after going through the distribution algorithm will be divided
into servers, cloud, fog, and local machines by the ratio of
95:4:1. The majority of the data exists in the cloud.
From the Figure 2, we can understand the retrieval process
of data when the user requests information from the cloud
server. When the user requests some information or data from
the cloud server, the data will be extracted from combining
all the information that is stored and distributed on the three
different servers. The process initially begins with the cloud
server receiving the request of the user, then the cloud server
that holds 95% of the data will prepare the data and look for
the information that the user has requested after retrieving the
data. The request that the user has sent will go to the fog server
and it will prepare the data and retrieve it and will combine it
Fig. 3. System Design of Data Encryption and Distribution
with the cloud server’s data giving 99% of the data. Then, the
cloud server sends the request to the local server which has the
remaining 1% of data. This 1% of data will be retrieved and
will then be prepared by a local server and will be combined
with the remaining 99% of data that we retrieve from the other
two servers to get all the data, i.e., 100%. The data that has
been requested by the users will be fetched and will be sent
to the user.
B. System Architecture
The Data Distribution Algorithm is basically responsible
for the entire of data so that it is secure and efficient after
passing through the data distribution algorithm. In addition to
the fault tolerance and data integrity that the Reed-Solomon
algorithm offers, we are adding an important feature, which
is data confidentiality. Data confidentiality is very important,
since the data that is stored on one server, if attacked, will not
allow the attacker to have information about the data on other
servers.
1) Data Fragmentation with Padding: Initially in our algo-
rithm, the data is divided into different blocks and fragments
so that the data is manageable and can be further processed
for security and encryption during the distribution. Padded data
is beneficial as it does not have any meaning or contain any
sensitive information. After this data has been divided into
blocks, we add a layer of padding. The padding we use here
is the PKCS7. This is a very popular padding algorithm, that
is used for AES encryption. We use this particular padding, so
that the total length of the data after we divide it into chunks,
is essentially the multiple of the specified block size, i.e., as
per AES requirement of 16 bytes. This has been depicted in
the Figure 3.
2) Redundancy Addition: To improve fault tolerance, we
use the Reed-Solomon Algorithm. This algorithm is known
for storing all the files safely and when a server goes down for
maintenance, the files will still be available even though they
are on the down server. This is achieved by using redundant
Rochester Institute of Technology 3 | Page
RIT Computer Science • Capstone Report • 2228
Fig. 4. System Design of Data Retrieval
information, also known as parity blocks. The Reed-Solomon
Algorithm is responsible for generating the parity blocks so
that there is efficient recovery of lost or corrupted data.
3) Private-Key Generation: We create a private key that is
known only to the user so that it can be used for the encryption
and decryption of the data that would be put on the cloud. The
key generated will be done through a third-party independent
of the cloud and will allow only the user to have access to
it. This key will be of 16-bytes to support the AES 256-bit
encryption algorithm. It has been designed such that in order
to perform encryption or decryption, the user must input the
private key each time to perform the operation. The secret key
will be generated only once and the user must store it.
4) Encryption Algorithm: Once the data has been padded
and added parity bits to support fault tolerance, the algorithm
applies Advanced Encryption Standard (AES) to each of the
blocks. By encrypting the data blocks, the data is secure
even if an attacker gains unauthorized access to a server. By
using the 256-bit key, we are able to ensure maximum data
confidentiality. After this, the encrypted data will be divided
onto the Cloud server, Fog server, and local machine.
5) Data Retrieval: As discussed in Figure 2, all the data is
combined after retrieval from each cloud server. However, the
restored data is encrypted. In order to retrieve the data, the
user has to enter the 16-byte private key that had been issued
earlier. On using this key for decryption, the plain text is now
visible. However, it may be possible that the data may have
been corrupted or simply lost. In order to recover the files
completely, we process Reed-Solomon to retrieve any missing
files that may have been requested by the user. After this
process, the user has his original data that has been securely
protected.
V. PERFORMANCE EVALUATION
Evaluating the model is a crucial aspect of assessing the
effectiveness and performance of our algorithm. To understand
where our data distribution algorithm is doing well and to
find where it is lacking, we assess our algorithm on several
factors including, security measures, vulnerabilities, attack
vectors, effectiveness of data distribution and confidentiality,
resource utilization, processing speed, and memory usage, and
the overall performance.
The security measures of this algorithm are used to provide
confidentiality, integrity, and authentication. We will also by
analyzing the vulnerabilities of this algorithm and will evaluate
how resistant it is to different attacks, like brute force, or
data tampering attacks. In the case of resource utilization, we
will evaluate CPU usage, memory consumption, and network
bandwidth. Factors such as the processing speed in comparison
with previous algorithms will also be assessed.
A. Evaluation Methodology
To evaluate our designed system, we test the model on
various scales of data such as text files, images, and video files
that range from 480p to a 4K file. This allows our algorithm
to be tested on a large range of data sizes so that we can
understand and analyze its performance on smaller data files
and how the overhead changes when processing a large data
file. We will be evaluating diverse sets of data sets, which
include multi-media files and text documents to assess the
performance of the algorithm.
B. Performance Metrics
To understand the effectiveness of our system and algorithm,
performance metrics are useful. In our paper, several aspects
play a critical role in the functionality and impact of the overall
system. These metrics determine whether our algorithm is
strong, weak, or ineffective. By evaluating the performance
metrics of our system, we are able to understand where our
system is performing well and where it lacks. This will allow
future researchers to improve the system further. Some of the
key performance metrics that we will be taking into account
by trying to evaluate the model are:
1) Encryption and Decryption Speed: This metric quan-
tifies the efficiency of the AES distribution algorithm. By
evaluating this, we are able to understand how well our
encryption algorithm is performing with respect to our data.
2) Data Retrieval Time: Data retrieval time essentially
measures the amount of time it takes for the user to retrieve the
data once uploaded onto the cloud servers. The process of data
retrieval includes several steps once the data is stored on the
cloud server. Retrieval of the data includes the user providing
the private key to perform decryption of the data once the data
is restored from the cloud servers. After the data is decrypted,
measures are taken to verify whether there has been missing,
lost or corrupted data. If this happens to be the case, the Reed-
Solomon algorithm will allow the same restoration of the data.
This is all measured in the data retrieval time.
Rochester Institute of Technology 4 | Page
RIT Computer Science • Capstone Report • 2228
3) Memory Utilization: The memory utilization metric
measures how efficiently the algorithm utilizes memory across
the computing machine. By evaluating the memory utilization,
we are able to see where potential bottlenecks can take place
and are able to carefully analyze and target what is contributing
to this memory utilization.
4) Resource Utilization: Resource utilization is one of the
most prominent performance evaluation metrics that we focus
on in this paper. We take into account several aspects such as
CPU usage, network bandwidth, processing speed, and overall
performance. Having evaluated this, we can understand where
our algorithm lacks and improve on it.
5) Throughput: Throughput is a basic measure of how
much work is done per unit of time. We can test how
well our system is doing by measuring the throughput. A
higher throughput means more work is done, whereas a low
throughput means less work is done.
6) Security Vulnerabilities: Security vulnerabilities of the
system include a comparison of potential attack vectors,
the effectiveness of data encryption, and data confidentiality.
Measuring this allows us to understand how resistant our
encryption scheme is. This will be a qualitative analysis to
understand and compare the proposed model with the existing
ones. We will examine various techniques that could break the
encryption scheme.
7) Error Handling: This evaluation examines how error
correction is performed and how fast it is done. Another aspect
that is tested is the amount of memory that gets consumed. To
test this, we will take into account the time to recover from
server failures. We will measure the overhead taken for when
one server is down, two, and so on. This will be very helpful
to understand how efficient our system is in failure recovery.
This also has a real-time application advantage as corruption
of data or loss is not highly uncommon. Having this feature
in our system proves to be an advantage.
C. Experimental Setup
The experimental setup for evaluating the proposed algo-
rithm involves creating a controlled environment that simu-
lates the computing scenarios. The setup has the necessary
hardware, software, and network components to replicate the
data distribution process environment. The algorithm has been
tested locally on a personal computer.
1) Hardware Components: The setup includes a personal
computing device having 12th Gen Intel(R) Core(TM) i7-
12700H 2.30 GHz x-64 based processor, with a RAM of 16.0
GB and a 64-bit operating system.
2) Software Components: The software components in-
clude the implementation of the proposed algorithm using VS
Code IDE, Docker for containerization, and relevant libraries
to perform encryption, decryption, Reed-Solomon encoding,
and tracking the utilization of resources. The libraries include
ReedSolo and Cryptography modules.
3) Data-sets: We selected a diverse set of different data
types such as multi-media files and text documents with
Fig. 5. Processing Time With Varying File Sizes
varying data volumes so that the algorithm’s performance is
tested across various data sizes.
4) Performance Measurement Tools: To measure and mon-
itor the different performance metrics such as data distribution
time, encryption and decryption speed, data retrieval time,
CPU usage, memory usage, and network bandwidth, we em-
ploy tools such as psutil in Python to collect these metrics. We
also containerized the application by using Docker, which gave
us the application’s performance in an isolated environment
without the interference of any other application running in
the background.
VI. PERFORMANCE RESULTS
After deploying our application into Docker, we measured
various metrics which are crucial to understand the algorithm’s
effectiveness, efficiency, and overall impact on data distribu-
tion and security. Having measured these results, we were pro-
vided valuable insights into how the algorithm is performing
under various conditions and we are able to identify areas for
improvement or optimizations. Following are some of the key
results we recorded:
A. Data Retrieval Time
The first performance metric that we measured is the time
for data retrieval including encryption and decryption of the
data that is stored on the cloud server. We tested the application
in two different environments, one being a 14-core system and
another being an 8-core system. This provided insights into
how quickly the data would be processed and retrieved as
the number of cores increased, resembling a real-time server
having a good number of cores to process large amounts of
data.
For easy understanding and even distribution on the graph,
we employed Log-scale X-axis to represent an increase in file
sizes. The file sizes taken are of various file types ranging
Rochester Institute of Technology 5 | Page
RIT Computer Science • Capstone Report • 2228
Fig. 6. Memory Utilization
from 10 KB up to 1 GB. On the Y-axis, the processing times
are represented in seconds. When the graph is plotted on
both the 14-core and 8-core systems, we observe a common
pattern, i.e., the processing time growing exponentially with
the increase in file size. The processing time lies between 0 to
10 seconds within the first MB of data. However, after that,
there is a steep slope, and the processing time increases.
B. Impact of Memory Utilization
The performance metric we measured next is memory
utilization. We evaluated the memory utilization on a 16-
core computing device and took into account two aspects of
memory consumption including the Memory Percentage and
RAM Memory. The memory percentage was evaluated based
on a 16 GB ram computing device. By evaluating this, we
were given insights on how much memory was consumed,
and whether the overhead is too expensive. Also, the amount
of memory needed to be employed if a large-scale application
was deployed was inferred.
For easy understanding and even distribution on the graph,
we employed Log-scale X-axis to represent an increase in file
sizes. The file sizes taken are of various file types ranging
from 10 KB and up to 1 GB but were represented in the
powers of 10. On the Y-axis, the memory percentage was
recorded and on the minor axis, the RAM memory measured
in terms of GB(s) was recorded. Our evaluation plotted on the
graph recorded these two aspects simultaneously as they are
significantly dependent on one another. There are two graphs
plotted, one for each metric. If we take a look at the memory
percentage graph, there was not much increase until the file
size crossed 1000 MB. And the other graph has a similarity
with the memory percentage graph, where there has been kind
of same utilization for the file sizes under 1 GB, but for the
files above 1 GB, there has been a steep increase, depicting
the usage of the memory.
C. Impact of Resource Utilization
As a part of resource utilization, we analyzed the CPU
utilization of the application as it was running. CPU utilization
Fig. 7. CPU Utilization
is an important aspect as it helps us understand the amount
of resources that are being consumed by the application on
the computing device. The CPU utilization has been recorded
on a personal computing device having 12th Gen Intel(R)
Core(TM) i7-12700H 2.30 GHz x-64 based processor, with
a RAM of 16.0 GB and 64-bit operating system.
The CPU utilization can be understood from the graph,
where the file sizes are expressed logarithmically on the X-
axis to provide even distribution, whereas the Percentage of
CPU usage is represented on the Y-axis. The CPU utilization
seems to be almost steady and below 2% for file sizes up to 1
MB, but after that, there has been an increase in the slope for
file sizes greater than 1 MB. It can also be observed that the
inclination for file sizes greater than 1 MB was just half of
what has been achieved for file sizes greater than 500 MB. It
can be inferred that as larger data files require to be processed,
a device with higher specifications would be required.
D. Qualitative Analysis
In the qualitative analysis of this paper, we discuss the
effectiveness of our data model and how robust it is against
potential attack vectors, and how it provides effective data
encryption and confidentiality against many common threats
in fog computing environments.
Most importantly, by equipping AES [7] to include in
our algorithm, we significantly enhance the data protection in
comparison to the previous methods which lacked any sort
of encryption mechanisms. Especially, since we used AES
in combination with a private key, which enables strict data
access control [16]. This allows only authorized users (in
our case, only the user) to access the data, preventing all
unauthorized parties including the cloud administrators from
accessing any kind of information. This, in turn, provides a
defense against insider attacks.
Another advantage of using AES is the protection against
eavesdroppers during data transmission [17]. This makes
the data protected and indecipherable during data upload and
retrieval.
Overall, in combination with AES with Reed-Solomon, the
system achieves data confidentiality and provides a robust
method to keep the data from undergoing permanent loss [18].
Rochester Institute of Technology 6 | Page
RIT Computer Science • Capstone Report • 2228
Fig. 8. Impact on Recovery of Node Failure
E. Impact of Error Handling
Our algorithm employs the Reed-Solomon code algorithm
to reinforce a fault-tolerant, data integrity-holding model. In
combination with our encryption techniques, it is important
to understand and observe how the Reed-Solomon algorithm
behaves with respect to time and overall performance. To
evaluate this, we performed testing on a 16-core computing
device and employed 10 nodes to hold the data. We observed
how the system is performing when a single node is down,
two, and four, and recorded how long it was taking for the
application to bounce back with the same data, which has
been unchanged and has been successfully recovered.
Similar to the previous graphs, we plotted the file sizes on
the X-axis and the corresponding number of node failures on
the Y-axis. We equipped a heat map to display the recovery
times in comparison to the number of nodes that have been
corrupted. We can see that for a particular file, for example,
of size 10 KB, the recovery time increases with the increase
in number of nodes that have failed. This pattern remains
consistent with all the file sizes. Another observation that can
be made is that with the increase in file size, the time to recover
from node failure also increases. Most importantly, given out
of 10 servers, 4 have failed, the application has successfully
been able to recover all the lost data. The time to recover is
recorded in seconds.
VII. DISCUSSIONS
Fog computing has emerged as a promising platform to
overcome the challenges posed in contemporary computing
settings by the enormous influx of data. The decentralized
nature of fog computing provides faster computing and stores
the data on different servers, where the entire data is not stored
on the cloud server. But even this decentralized architecture
has some security concerns. When we deal with sensitive
data, it is important to ensure that the data is safeguarded
against data exposure and unauthorized access. This is a very
important aspect where there can be more robust and safer
security algorithms used to safeguard the data.
This paper proposes, along with the existing three-layer
storage framework that [4] states, to involve a hybrid approach
where both AES and Reed Solomon Code Algorithm are used.
This ensures that even if a portion of the data on a certain
server is compromised, the attacker will not be able to gain any
information. The proposed algorithm goes beyond traditional
methods and offers another layer of security, where data is
protected during data transmission, and the system has a fault-
tolerant design.
Compared to the conventional data distribution techniques
employed in fog computing environments, the suggested algo-
rithm has significant advantages. Due to the inclusion of both
a data confidentiality algorithm and a data recovery model,
the foremost advantage is that the data cannot be either lost
or hacked.
The additional feature of allowing a user-private password
keeps away everyone other than the user from performing
encryption or decryption on the data that is to be stored on
the data servers. This prevents all of the unauthorized cloud
server access that could take place, and thus the data from
getting stolen.
By leveraging the processing power of edge devices, the
proposed data distribution algorithm further improves effi-
ciency in data retrieval and dissemination while lowering sys-
tem latency. This enables efficient simultaneous data retrieval
from various servers and improves data access times. This
allows the entire system suitable for real-time.
VIII. FUTURE WORK
Some potential avenues for future work and development are
for one, focusing on further enhancing the security measures.
The research done has focused on working on integrating
additional techniques to improve security, but further research
can be done to strengthen data confidentiality by incorporating
quantum-crypto techniques. Another aspect would be to im-
prove the dynamic data distribution by developing dynamically
adjustable data distribution based on the workloads, network
conditions, and device capabilities.
Scalability and performance are other aspects that can
be improved, and also thinking about distributing the data
in an energy-efficient manner is another aspect to look at.
An interesting work that could be worked on is integrating
machine learning-based security to detect any anomaly and or
threats to remove the security threats, thereby improving the
algorithm even more.
IX. CONCLUSION
Therefore, the proposed three-layer storage system for
fog computing, which combines the Hash Solomon Code
algorithm and AES encryption, provides a comprehensive
solution for the security problems that inevitably occur in
Rochester Institute of Technology 7 | Page
RIT Computer Science • Capstone Report • 2228
these circumstances. The method significantly improves the
efficiency, fault tolerance, and data secrecy of fog computing
systems, strengthening their security posture. The successful
development and evaluation of the algorithm in real-world
deployment settings attest to its efficacy and suitability for
data distribution in cloud and edge computing systems.
Ensuring cloud security has remained an issue in the digital
world, whereas the proposed strategy significantly advances
data distribution and encryption approaches, paving the way
for more secure and long-lasting fog computing infrastruc-
tures. By conducting more research and looking into ways to
improve the algorithm’s performance and scalability, it will
only become clearer how useful the algorithm may be as a
fundamental framework for safeguarding private information
in distributed computing systems.
REFERENCES
[1] J. F. Gantz, D. Reinsel, and J. Rydning, “The u.s. datasphere: Consumers
flocking to cloud,” IDC, 2019.
[2] D. Kolevski, K. Michael, R. Abbas, and M. Freeman, “Cloud computing
data breaches: A review of u.s. regulation and data breach notification
literature,” in 2021 IEEE International Symposium on Technology and
Society (ISTAS), 2021, pp. 1–7.
[3] Y. Sun, J. Zhang, Y. Xiong, and G. Zhu, “Data security and privacy
in cloud computing,” International Journal of Distributed Sensor
Networks, vol. 10, no. 7, p. 190903, 2014. [Online]. Available:
https://doi.org/10.1155/2014/190903
[4] T. Wang, J. Zhou, X. Chen, G. Wang, A. Liu, and Y. Liu, “A three-
layer privacy preserving cloud storage scheme based on computational
intelligence in fog computing,” IEEE Transactions on Emerging Topics
in Computational Intelligence, vol. 2, no. 1, pp. 3–12, 2018.
[5] S. Cao, H. Han, J. Wei, Y. Zhao, S. Yang, and L. Yan, “Space
cloud-fog computing: Architecture, application and challenge,” in
Proceedings of the 3rd International Conference on Computer Science
and Application Engineering, ser. CSAE ’19. New York, NY, USA:
Association for Computing Machinery, 2019. [Online]. Available:
https://doi-org.ezproxy.rit.edu/10.1145/3331453.3361637
[6] H. Xu and D. Bhalerao, “Reliable and secure distributed cloud data
storage using reed-solomon codes,” International Journal of Software
Engineering and Knowledge Engineering, vol. 25, pp. 1611–1632, 11
2015.
[7] M. Dworkin, E. Barker, J. Nechvatal, J. Foti, L. Bassham, E. Roback,
and J. Dray, “Advanced encryption standard (aes),” 2001-11-26 2001.
[8] A. Alrawais, A. Alhothaily, C. Hu, X. Xing, and X. Cheng, “An attribute-
based encryption scheme to secure fog communications,” IEEE Access,
vol. PP, pp. 1–1, 05 2017.
[9] Q. Xu, C. Tan, Z. Fan, W. Zhu, Y. Xiao, and F. Cheng, “Secure
data access control for fog computing based on multi-authority
attribute-based signcryption with computation outsourcing and attribute
revocation,” Sensors, vol. 18, no. 5, 2018. [Online]. Available:
https://www.mdpi.com/1424-8220/18/5/1609
[10] X. Yi, R. Paulet, and E. Bertino, Homomorphic Encryption. Cham:
Springer International Publishing, 2014, pp. 27–46.
[11] J. Seol, S. Jin, and S. Maeng, “Secure storage service for iaas
cloud users,” in Proceedings of the 13th IEEE/ACM International
Symposium on Cluster, Cloud, and Grid Computing, ser. CCGRID
’13. IEEE Press, 2013, p. 190–191. [Online]. Available: https://doi-
org.ezproxy.rit.edu/10.1109/CCGrid.2013.31
[12] L. Gaspar, “Crypto-processor - architecture, programming and evaluation
of the security,” 11 2012.
[13] G. Sharma and S. Kalra, “A novel scheme for data security in
cloud computing using quantum cryptography,” in Proceedings of the
International Conference on Advances in Information Communication
Technology & Computing, ser. AICTC ’16. New York, NY, USA:
Association for Computing Machinery, 2016. [Online]. Available:
https://doi-org.ezproxy.rit.edu/10.1145/2979779.2979816
[14] T. Talaei Khoei, E. Ghribi, R. Prakash, and N. Kaabouch, “A perfor-
mance comparison of encryption/decryption algorithms for uav swarm
communications,” 02 2021.
[15] H. S. C. Reddy, V. V. Karthik, D. V, A. Pavan, and S. V, “Data storage on
cloud using split-merge and hybrid cryptographic techniques,” in 2022
International Conference for Advancement in Technology (ICONAT),
2022, pp. 1–5.
[16] S. Duggal, V. Mohindru, P. Vadiya, and S. Sharma, “A comparative
analysis of private key cryptography algorithms: Des, aes and triple
des,” International Journal of Advanced Research in Computer Science
and Software Engineering, vol. 6, p. 1373, 06 2014.
[17] M. Yahaya and A. Ajibola, “Cryptosystem for secure data transmission
using advance encryption standard (aes) and steganography,” Interna-
tional Journal of Scientific Research in Computer Science, Engineering
and Information Technology, vol. Volume 5, pp. 317–322, 12 2019.
[18] D. U. Singh, “Error detection and correction using reed solomon codes,”
Error Detection and Correction Using Reed Solomon Codes, vol. 3, 03
2013.
Rochester Institute of Technology 8 | Page

More Related Content

Similar to Secure_Data_Distribution_Algorithm_for_Fog_Computing.pdf

IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...
IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...
IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...
IRJET Journal
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
ijceronline
 
Enforcing multi user access policies in cloud computing
Enforcing multi user access policies in cloud computingEnforcing multi user access policies in cloud computing
Enforcing multi user access policies in cloud computing
IAEME Publication
 
Dynamic Resource Allocation and Data Security for Cloud
Dynamic Resource Allocation and Data Security for CloudDynamic Resource Allocation and Data Security for Cloud
Dynamic Resource Allocation and Data Security for Cloud
AM Publications
 
Bio-Cryptography Based Secured Data Replication Management in Cloud Storage
Bio-Cryptography Based Secured Data Replication Management in Cloud StorageBio-Cryptography Based Secured Data Replication Management in Cloud Storage
Bio-Cryptography Based Secured Data Replication Management in Cloud Storage
IJERA Editor
 
Effective & Flexible Cryptography Based Scheme for Ensuring User`s Data Secur...
Effective & Flexible Cryptography Based Scheme for Ensuring User`s Data Secur...Effective & Flexible Cryptography Based Scheme for Ensuring User`s Data Secur...
Effective & Flexible Cryptography Based Scheme for Ensuring User`s Data Secur...
ijsrd.com
 
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
cscpconf
 
Improved deduplication with keys and chunks in HDFS storage providers
Improved deduplication with keys and chunks in HDFS storage providersImproved deduplication with keys and chunks in HDFS storage providers
Improved deduplication with keys and chunks in HDFS storage providers
IRJET Journal
 
Cloud security: literature survey
Cloud security: literature surveyCloud security: literature survey
Cloud security: literature survey
IJECEIAES
 
A novel cloud storage system with support of sensitive data application
A novel cloud storage system with support of sensitive data applicationA novel cloud storage system with support of sensitive data application
A novel cloud storage system with support of sensitive data application
ijmnct
 
Iaetsd secured and efficient data scheduling of intermediate data sets
Iaetsd secured and efficient data scheduling of intermediate data setsIaetsd secured and efficient data scheduling of intermediate data sets
Iaetsd secured and efficient data scheduling of intermediate data sets
Iaetsd Iaetsd
 
IJARCCE 20
IJARCCE 20IJARCCE 20
IJARCCE 20
Nahan Rahman
 
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud. A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
IJCERT JOURNAL
 
Revocation based De-duplication Systems for Improving Reliability in Cloud St...
Revocation based De-duplication Systems for Improving Reliability in Cloud St...Revocation based De-duplication Systems for Improving Reliability in Cloud St...
Revocation based De-duplication Systems for Improving Reliability in Cloud St...
IRJET Journal
 
H1803035056
H1803035056H1803035056
H1803035056
IOSR Journals
 
IRJET- Providing Privacy in Healthcare Cloud for Medical Data using Fog Compu...
IRJET- Providing Privacy in Healthcare Cloud for Medical Data using Fog Compu...IRJET- Providing Privacy in Healthcare Cloud for Medical Data using Fog Compu...
IRJET- Providing Privacy in Healthcare Cloud for Medical Data using Fog Compu...
IRJET Journal
 
Encryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing EnvironmentEncryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing Environment
IOSR Journals
 
Encryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing EnvironmentEncryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing Environment
IOSR Journals
 
H017155360
H017155360H017155360
H017155360
IOSR Journals
 
Encryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing EnvironmentEncryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing Environment
IOSR Journals
 

Similar to Secure_Data_Distribution_Algorithm_for_Fog_Computing.pdf (20)

IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...
IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...
IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Enforcing multi user access policies in cloud computing
Enforcing multi user access policies in cloud computingEnforcing multi user access policies in cloud computing
Enforcing multi user access policies in cloud computing
 
Dynamic Resource Allocation and Data Security for Cloud
Dynamic Resource Allocation and Data Security for CloudDynamic Resource Allocation and Data Security for Cloud
Dynamic Resource Allocation and Data Security for Cloud
 
Bio-Cryptography Based Secured Data Replication Management in Cloud Storage
Bio-Cryptography Based Secured Data Replication Management in Cloud StorageBio-Cryptography Based Secured Data Replication Management in Cloud Storage
Bio-Cryptography Based Secured Data Replication Management in Cloud Storage
 
Effective & Flexible Cryptography Based Scheme for Ensuring User`s Data Secur...
Effective & Flexible Cryptography Based Scheme for Ensuring User`s Data Secur...Effective & Flexible Cryptography Based Scheme for Ensuring User`s Data Secur...
Effective & Flexible Cryptography Based Scheme for Ensuring User`s Data Secur...
 
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
 
Improved deduplication with keys and chunks in HDFS storage providers
Improved deduplication with keys and chunks in HDFS storage providersImproved deduplication with keys and chunks in HDFS storage providers
Improved deduplication with keys and chunks in HDFS storage providers
 
Cloud security: literature survey
Cloud security: literature surveyCloud security: literature survey
Cloud security: literature survey
 
A novel cloud storage system with support of sensitive data application
A novel cloud storage system with support of sensitive data applicationA novel cloud storage system with support of sensitive data application
A novel cloud storage system with support of sensitive data application
 
Iaetsd secured and efficient data scheduling of intermediate data sets
Iaetsd secured and efficient data scheduling of intermediate data setsIaetsd secured and efficient data scheduling of intermediate data sets
Iaetsd secured and efficient data scheduling of intermediate data sets
 
IJARCCE 20
IJARCCE 20IJARCCE 20
IJARCCE 20
 
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud. A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
 
Revocation based De-duplication Systems for Improving Reliability in Cloud St...
Revocation based De-duplication Systems for Improving Reliability in Cloud St...Revocation based De-duplication Systems for Improving Reliability in Cloud St...
Revocation based De-duplication Systems for Improving Reliability in Cloud St...
 
H1803035056
H1803035056H1803035056
H1803035056
 
IRJET- Providing Privacy in Healthcare Cloud for Medical Data using Fog Compu...
IRJET- Providing Privacy in Healthcare Cloud for Medical Data using Fog Compu...IRJET- Providing Privacy in Healthcare Cloud for Medical Data using Fog Compu...
IRJET- Providing Privacy in Healthcare Cloud for Medical Data using Fog Compu...
 
Encryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing EnvironmentEncryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing Environment
 
Encryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing EnvironmentEncryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing Environment
 
H017155360
H017155360H017155360
H017155360
 
Encryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing EnvironmentEncryption Technique for a Trusted Cloud Computing Environment
Encryption Technique for a Trusted Cloud Computing Environment
 

Recently uploaded

How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...
Zilliz
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 

Recently uploaded (20)

How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...Building RAG with self-deployed Milvus vector database and Snowpark Container...
Building RAG with self-deployed Milvus vector database and Snowpark Container...
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 

Secure_Data_Distribution_Algorithm_for_Fog_Computing.pdf

  • 1. RIT Computer Science • Capstone Report • 2228 Secure Data Distribution Algorithm for Fog Computing Hima Bindu Krovvidi Department of Computer Science Golisano College of Computing and Information Sciences Rochester Institute of Technology Rochester, NY 14623 hk4233@rit.edu Abstract – The use of cloud computing is turning into an increasingly prominent way to access and store data primarily because of its scalability and simplicity. Several techniques have been taken to increase cloud infrastruc- ture security, such as employing a three-layer privacy- preserving cloud storage system so that the infrastructure is built, such that even if an insider has access to the data, they will have access to only a fraction of it and cannot retrieve the rest. The current distribution algo- rithm focuses only on providing a fault-tolerant and data integrity module and has scope for security improvements. In order to solve this problem, we propose a stronger data fragmentation and data distribution algorithm to help increase data confidentiality and enhance encryption. When this enhancement is combined with the existing design, it will result in a highly secure and fault-tolerant system where users may protect their data. We describe the proposed design’s architecture and evaluate its security challenges. Index Terms—Fog Computing, AES, Reed Solomon, Data Distribution I. INTRODUCTION Since the beginning of cloud computing, there have always been advancements in how to make the cloud more robust and secure. It has undergone numerous enhancements, making the cloud stronger than before. With increasing reliance on the cloud, consumers store the majority of their data on a cloud framework [1]. As a result, the cloud is critical for data storage. This implies that cloud security is an incredibly essential concern. With an increase in breaches comes an increase in security. But there have always been attacks and ways to access the data in spite of these extreme security measures [2]. This gives us a reason to keep upgrading and improving our data. Data breaches caused by internal attacks can take place for several reasons [3]. Employees or contractors may misuse their privileges and steal or sell this data to unauthorized third- party users. Reasons like employee negligence where they leave a system unlocked or fail to follow appropriate security measures is also a reason for attackers to gain access. Even backdoor exploitation, where attackers analyze the weaknesses in the system and exploit them to gain access to the data in the cloud, is another possibility to gain access to the data in the cloud. To protect against these kinds of attacks, there have been several studies on how one can improve cloud security. Meth- ods like creating a three-layered architecture [4] or involving fog computing [5] have all been inclusive in designing a constructive architecture in improving the security of the cloud. Studies have focused on implementing the idea that instead of putting all the data on one server, spreading it across various other servers would prevent the attacker from having access to all the data, if at all they do. The distribution and fragmentation algorithms used in this paper [4] have focused on dividing the data on three different servers, such as the local machine, the fog server, and the cloud server. The distribution the authors have designed is such that they have 1% of the data on the local machine, 4% on the fog server, and 95% on the cloud server. They [4] have used the Hash Soloman code algorithm [4] [6] to separate the data which primarily focuses on providing a fault-tolerant and data integrity module. However, when having 95% of the data on a single cloud server, it is important to understand that although the attacker does not have access to other servers, they still have access to 95% of the data which is significantly large. So adding security and encryption as another layer of protection within each server is necessary and crucial to ensure that the data distributed amongst these different servers is highly encrypted and unbreakable. Therefore, we propose a stronger, more secure data dis- tribution module in which the data is divided and encrypted in combination with the existing data fragmentation algo- rithm to maintain fault tolerance and add a layer of data confidentiality. The proposed algorithm focuses on restricting any kind of access to the data, whether it be a portion on a server or the entire data. Compared to previous methods [4], our method gives us an enhanced layer of security in combination with the Hash Soloman Code algorithm [4], [6] to provide a comprehensive security solution that addresses both data confidentiality and integrity. The remainder of the paper is structured as follows: Section 3 focuses on related work; Section 4 covers the project methodology; Section 5 describes the evaluation methodologies; Section 6 describes Rochester Institute of Technology 1 | Page
  • 2. RIT Computer Science • Capstone Report • 2228 the outcomes; Section 7 describes future work and the scope of improvements; and Section 8 concludes the paper. II. APPROACH The project aims on improving the data confidentiality that could be provided by the three-layer cloud storage system. By introducing an enhanced security algorithm to the existing infrastructure, we aim to create a more robust model to sort and distribute data across different servers ensuring security. Adding advanced encryption standard (AES) algorithm [7]in combination with Reed-Solomon Code algorithm [4], [6] helps us achieve this requirement. The existing data is pro- cessed through a distribution algorithm and passed through this new algorithm generated, to create another layer of security. To solve this, we take the data sent to the cloud and add redundant data to ensure data integrity, and then jumble this data, and divide it into blocks. After this process, each block will be sent to the particular servers, i.e., cloud server, fog server, and local machine based on the existing distribution algorithm. Before the data is sent into the respective servers, it undergoes AES encryption [7]. The main part of our proposed algorithm lies in its combi- nation with Reed-Solomon Code algorithm [4], [6] with AES [7]. For this, we make use of the PKCS7 padding, to ensure that the input data is appropriately aligned and meets the block size as required by AES for the encryption and decryption process. By using this padding, we are able to introduce false or misleading information that would essentially scramble the genuine data with the decoiled data, hence, making it difficult for the malicious attackers to actually find the genuine data, which is hidden amongst the padding. Since AES [7] uses symmetric key for encryption and decryption, it is important that we formulate a method such that only the user has access to the data on the cloud. To do this, we generate a private key that is only known to the user (using a secure key generation algorithm), and the symmetric key is encrypted using the secret key of the user. This algorithm ensures a fault-tolerant system, and mitigates data loss and corruption of data. The usage of AES [7] makes it more difficult to breach and increases data confidentiality. Using this method is definitely an improvement to previous techniques since, this algorithm ensures that the attacker will not have access to any portion of the data. III. LITERATURE REVIEW The importance of security in cloud storage has attracted a lot of attention in both the industry and from the ed- ucational perspective. There has been lots of research in the field of secure cloud storage architectures. We look at different encryption studies to provide a secure data stor- age system in combination with a Fog computing server. Arwa [8] talks about how using a key-exchange protocol based on attribute-based encryption provides secure inter- communication amongst different nodes. Xu [9] talks about another method involving homomorphic encryption, based on an article [10] and access policies to provide secure data sharing. Xu [9] also talks about an access control system for Fog computing environments, such that the data is confidential and there is selective access only. According to Seol [11], data leakage can occur when software is exploited or when employees or administrators misuse sensitive knowledge. The authors offer a system wherein an encrypted storage service is used to protect data from potentially compromised software and administrators. The authors discuss leveraging crypto processors to provide a safe storage service for consumers, with details inferences from a paper written by Gaspar [12]. The system is secure since the information that is saved has been secured and will not be disclosed even with privileged access. There have been several security algorithms employed in various cloud storage systems, and are yet to be utilized to build a more robust system. Concepts like AES (Advanced Encryption Standard) [7] focus on providing a strongly built, symmetric encryption and decryption algorithm. AES aims on providing data confidentiality. The paper [13] talks about how it combines AES with quantum cryptography to create a highly secure application. The authors say with the combination of AES and quantum security, the application created is of unprecedented level and can be used in high security requiring systems such as the nuclear defense system. The paper [13] mentions how AES is the fastest block cipher and is used for practical security. On a comparative study on AES, DES, RSA, and OTP, [14] explains how all four encryption algorithms are extremely robust and describes AES and DES on their accuracy of performance. Reddy [15] was able to create a solution to the emerging requirement for cloud data security. The technique employs hybrid cryptography as well as file splitting. Hybrid cryptog- raphy is an approach that combines the concepts of symmetric encryption and public-key encryption. The author emphasized that symmetric encryption is efficient but insecure because the same key is shared, hence the solution entails encrypting the symmetric key using the public key. This solution made it sturdy, secure, and efficient. File splitting entails separating the file into different pieces and encrypting each part individually, such that even if one of them is gained by an attack, the attacker will be unable to decode it. IV. IMPLEMENTATION The implementation of the proposed algorithm involves making changes and enhancing the security of the data dis- tribution algorithm. The proposed algorithm will focus on dividing the data in a more secure way, such that the data exposed on one server will not affect the data or security on the other server. The project basically aims to provide a robust and secure data distribution algorithm for Fog computing architectures. With the help of encryption, data fragmentation, and access control methods, we are able to implement this algorithm successfully. Below are the descriptions of the implementation in detail: Rochester Institute of Technology 2 | Page
  • 3. RIT Computer Science • Capstone Report • 2228 Fig. 1. Architecture of Data Distribution on Cloud Servers Fig. 2. Data Retrieval from the Cloud Servers A. System Design As the paper [4] describes their architecture on dividing the data into three servers mainly, the cloud, fog, and the local machine, we have implemented the same architecture. We use this architecture to deploy our data distribution algorithm. As seen in the Figure 1, we can find that the data after going through the distribution algorithm will be divided into servers, cloud, fog, and local machines by the ratio of 95:4:1. The majority of the data exists in the cloud. From the Figure 2, we can understand the retrieval process of data when the user requests information from the cloud server. When the user requests some information or data from the cloud server, the data will be extracted from combining all the information that is stored and distributed on the three different servers. The process initially begins with the cloud server receiving the request of the user, then the cloud server that holds 95% of the data will prepare the data and look for the information that the user has requested after retrieving the data. The request that the user has sent will go to the fog server and it will prepare the data and retrieve it and will combine it Fig. 3. System Design of Data Encryption and Distribution with the cloud server’s data giving 99% of the data. Then, the cloud server sends the request to the local server which has the remaining 1% of data. This 1% of data will be retrieved and will then be prepared by a local server and will be combined with the remaining 99% of data that we retrieve from the other two servers to get all the data, i.e., 100%. The data that has been requested by the users will be fetched and will be sent to the user. B. System Architecture The Data Distribution Algorithm is basically responsible for the entire of data so that it is secure and efficient after passing through the data distribution algorithm. In addition to the fault tolerance and data integrity that the Reed-Solomon algorithm offers, we are adding an important feature, which is data confidentiality. Data confidentiality is very important, since the data that is stored on one server, if attacked, will not allow the attacker to have information about the data on other servers. 1) Data Fragmentation with Padding: Initially in our algo- rithm, the data is divided into different blocks and fragments so that the data is manageable and can be further processed for security and encryption during the distribution. Padded data is beneficial as it does not have any meaning or contain any sensitive information. After this data has been divided into blocks, we add a layer of padding. The padding we use here is the PKCS7. This is a very popular padding algorithm, that is used for AES encryption. We use this particular padding, so that the total length of the data after we divide it into chunks, is essentially the multiple of the specified block size, i.e., as per AES requirement of 16 bytes. This has been depicted in the Figure 3. 2) Redundancy Addition: To improve fault tolerance, we use the Reed-Solomon Algorithm. This algorithm is known for storing all the files safely and when a server goes down for maintenance, the files will still be available even though they are on the down server. This is achieved by using redundant Rochester Institute of Technology 3 | Page
  • 4. RIT Computer Science • Capstone Report • 2228 Fig. 4. System Design of Data Retrieval information, also known as parity blocks. The Reed-Solomon Algorithm is responsible for generating the parity blocks so that there is efficient recovery of lost or corrupted data. 3) Private-Key Generation: We create a private key that is known only to the user so that it can be used for the encryption and decryption of the data that would be put on the cloud. The key generated will be done through a third-party independent of the cloud and will allow only the user to have access to it. This key will be of 16-bytes to support the AES 256-bit encryption algorithm. It has been designed such that in order to perform encryption or decryption, the user must input the private key each time to perform the operation. The secret key will be generated only once and the user must store it. 4) Encryption Algorithm: Once the data has been padded and added parity bits to support fault tolerance, the algorithm applies Advanced Encryption Standard (AES) to each of the blocks. By encrypting the data blocks, the data is secure even if an attacker gains unauthorized access to a server. By using the 256-bit key, we are able to ensure maximum data confidentiality. After this, the encrypted data will be divided onto the Cloud server, Fog server, and local machine. 5) Data Retrieval: As discussed in Figure 2, all the data is combined after retrieval from each cloud server. However, the restored data is encrypted. In order to retrieve the data, the user has to enter the 16-byte private key that had been issued earlier. On using this key for decryption, the plain text is now visible. However, it may be possible that the data may have been corrupted or simply lost. In order to recover the files completely, we process Reed-Solomon to retrieve any missing files that may have been requested by the user. After this process, the user has his original data that has been securely protected. V. PERFORMANCE EVALUATION Evaluating the model is a crucial aspect of assessing the effectiveness and performance of our algorithm. To understand where our data distribution algorithm is doing well and to find where it is lacking, we assess our algorithm on several factors including, security measures, vulnerabilities, attack vectors, effectiveness of data distribution and confidentiality, resource utilization, processing speed, and memory usage, and the overall performance. The security measures of this algorithm are used to provide confidentiality, integrity, and authentication. We will also by analyzing the vulnerabilities of this algorithm and will evaluate how resistant it is to different attacks, like brute force, or data tampering attacks. In the case of resource utilization, we will evaluate CPU usage, memory consumption, and network bandwidth. Factors such as the processing speed in comparison with previous algorithms will also be assessed. A. Evaluation Methodology To evaluate our designed system, we test the model on various scales of data such as text files, images, and video files that range from 480p to a 4K file. This allows our algorithm to be tested on a large range of data sizes so that we can understand and analyze its performance on smaller data files and how the overhead changes when processing a large data file. We will be evaluating diverse sets of data sets, which include multi-media files and text documents to assess the performance of the algorithm. B. Performance Metrics To understand the effectiveness of our system and algorithm, performance metrics are useful. In our paper, several aspects play a critical role in the functionality and impact of the overall system. These metrics determine whether our algorithm is strong, weak, or ineffective. By evaluating the performance metrics of our system, we are able to understand where our system is performing well and where it lacks. This will allow future researchers to improve the system further. Some of the key performance metrics that we will be taking into account by trying to evaluate the model are: 1) Encryption and Decryption Speed: This metric quan- tifies the efficiency of the AES distribution algorithm. By evaluating this, we are able to understand how well our encryption algorithm is performing with respect to our data. 2) Data Retrieval Time: Data retrieval time essentially measures the amount of time it takes for the user to retrieve the data once uploaded onto the cloud servers. The process of data retrieval includes several steps once the data is stored on the cloud server. Retrieval of the data includes the user providing the private key to perform decryption of the data once the data is restored from the cloud servers. After the data is decrypted, measures are taken to verify whether there has been missing, lost or corrupted data. If this happens to be the case, the Reed- Solomon algorithm will allow the same restoration of the data. This is all measured in the data retrieval time. Rochester Institute of Technology 4 | Page
  • 5. RIT Computer Science • Capstone Report • 2228 3) Memory Utilization: The memory utilization metric measures how efficiently the algorithm utilizes memory across the computing machine. By evaluating the memory utilization, we are able to see where potential bottlenecks can take place and are able to carefully analyze and target what is contributing to this memory utilization. 4) Resource Utilization: Resource utilization is one of the most prominent performance evaluation metrics that we focus on in this paper. We take into account several aspects such as CPU usage, network bandwidth, processing speed, and overall performance. Having evaluated this, we can understand where our algorithm lacks and improve on it. 5) Throughput: Throughput is a basic measure of how much work is done per unit of time. We can test how well our system is doing by measuring the throughput. A higher throughput means more work is done, whereas a low throughput means less work is done. 6) Security Vulnerabilities: Security vulnerabilities of the system include a comparison of potential attack vectors, the effectiveness of data encryption, and data confidentiality. Measuring this allows us to understand how resistant our encryption scheme is. This will be a qualitative analysis to understand and compare the proposed model with the existing ones. We will examine various techniques that could break the encryption scheme. 7) Error Handling: This evaluation examines how error correction is performed and how fast it is done. Another aspect that is tested is the amount of memory that gets consumed. To test this, we will take into account the time to recover from server failures. We will measure the overhead taken for when one server is down, two, and so on. This will be very helpful to understand how efficient our system is in failure recovery. This also has a real-time application advantage as corruption of data or loss is not highly uncommon. Having this feature in our system proves to be an advantage. C. Experimental Setup The experimental setup for evaluating the proposed algo- rithm involves creating a controlled environment that simu- lates the computing scenarios. The setup has the necessary hardware, software, and network components to replicate the data distribution process environment. The algorithm has been tested locally on a personal computer. 1) Hardware Components: The setup includes a personal computing device having 12th Gen Intel(R) Core(TM) i7- 12700H 2.30 GHz x-64 based processor, with a RAM of 16.0 GB and a 64-bit operating system. 2) Software Components: The software components in- clude the implementation of the proposed algorithm using VS Code IDE, Docker for containerization, and relevant libraries to perform encryption, decryption, Reed-Solomon encoding, and tracking the utilization of resources. The libraries include ReedSolo and Cryptography modules. 3) Data-sets: We selected a diverse set of different data types such as multi-media files and text documents with Fig. 5. Processing Time With Varying File Sizes varying data volumes so that the algorithm’s performance is tested across various data sizes. 4) Performance Measurement Tools: To measure and mon- itor the different performance metrics such as data distribution time, encryption and decryption speed, data retrieval time, CPU usage, memory usage, and network bandwidth, we em- ploy tools such as psutil in Python to collect these metrics. We also containerized the application by using Docker, which gave us the application’s performance in an isolated environment without the interference of any other application running in the background. VI. PERFORMANCE RESULTS After deploying our application into Docker, we measured various metrics which are crucial to understand the algorithm’s effectiveness, efficiency, and overall impact on data distribu- tion and security. Having measured these results, we were pro- vided valuable insights into how the algorithm is performing under various conditions and we are able to identify areas for improvement or optimizations. Following are some of the key results we recorded: A. Data Retrieval Time The first performance metric that we measured is the time for data retrieval including encryption and decryption of the data that is stored on the cloud server. We tested the application in two different environments, one being a 14-core system and another being an 8-core system. This provided insights into how quickly the data would be processed and retrieved as the number of cores increased, resembling a real-time server having a good number of cores to process large amounts of data. For easy understanding and even distribution on the graph, we employed Log-scale X-axis to represent an increase in file sizes. The file sizes taken are of various file types ranging Rochester Institute of Technology 5 | Page
  • 6. RIT Computer Science • Capstone Report • 2228 Fig. 6. Memory Utilization from 10 KB up to 1 GB. On the Y-axis, the processing times are represented in seconds. When the graph is plotted on both the 14-core and 8-core systems, we observe a common pattern, i.e., the processing time growing exponentially with the increase in file size. The processing time lies between 0 to 10 seconds within the first MB of data. However, after that, there is a steep slope, and the processing time increases. B. Impact of Memory Utilization The performance metric we measured next is memory utilization. We evaluated the memory utilization on a 16- core computing device and took into account two aspects of memory consumption including the Memory Percentage and RAM Memory. The memory percentage was evaluated based on a 16 GB ram computing device. By evaluating this, we were given insights on how much memory was consumed, and whether the overhead is too expensive. Also, the amount of memory needed to be employed if a large-scale application was deployed was inferred. For easy understanding and even distribution on the graph, we employed Log-scale X-axis to represent an increase in file sizes. The file sizes taken are of various file types ranging from 10 KB and up to 1 GB but were represented in the powers of 10. On the Y-axis, the memory percentage was recorded and on the minor axis, the RAM memory measured in terms of GB(s) was recorded. Our evaluation plotted on the graph recorded these two aspects simultaneously as they are significantly dependent on one another. There are two graphs plotted, one for each metric. If we take a look at the memory percentage graph, there was not much increase until the file size crossed 1000 MB. And the other graph has a similarity with the memory percentage graph, where there has been kind of same utilization for the file sizes under 1 GB, but for the files above 1 GB, there has been a steep increase, depicting the usage of the memory. C. Impact of Resource Utilization As a part of resource utilization, we analyzed the CPU utilization of the application as it was running. CPU utilization Fig. 7. CPU Utilization is an important aspect as it helps us understand the amount of resources that are being consumed by the application on the computing device. The CPU utilization has been recorded on a personal computing device having 12th Gen Intel(R) Core(TM) i7-12700H 2.30 GHz x-64 based processor, with a RAM of 16.0 GB and 64-bit operating system. The CPU utilization can be understood from the graph, where the file sizes are expressed logarithmically on the X- axis to provide even distribution, whereas the Percentage of CPU usage is represented on the Y-axis. The CPU utilization seems to be almost steady and below 2% for file sizes up to 1 MB, but after that, there has been an increase in the slope for file sizes greater than 1 MB. It can also be observed that the inclination for file sizes greater than 1 MB was just half of what has been achieved for file sizes greater than 500 MB. It can be inferred that as larger data files require to be processed, a device with higher specifications would be required. D. Qualitative Analysis In the qualitative analysis of this paper, we discuss the effectiveness of our data model and how robust it is against potential attack vectors, and how it provides effective data encryption and confidentiality against many common threats in fog computing environments. Most importantly, by equipping AES [7] to include in our algorithm, we significantly enhance the data protection in comparison to the previous methods which lacked any sort of encryption mechanisms. Especially, since we used AES in combination with a private key, which enables strict data access control [16]. This allows only authorized users (in our case, only the user) to access the data, preventing all unauthorized parties including the cloud administrators from accessing any kind of information. This, in turn, provides a defense against insider attacks. Another advantage of using AES is the protection against eavesdroppers during data transmission [17]. This makes the data protected and indecipherable during data upload and retrieval. Overall, in combination with AES with Reed-Solomon, the system achieves data confidentiality and provides a robust method to keep the data from undergoing permanent loss [18]. Rochester Institute of Technology 6 | Page
  • 7. RIT Computer Science • Capstone Report • 2228 Fig. 8. Impact on Recovery of Node Failure E. Impact of Error Handling Our algorithm employs the Reed-Solomon code algorithm to reinforce a fault-tolerant, data integrity-holding model. In combination with our encryption techniques, it is important to understand and observe how the Reed-Solomon algorithm behaves with respect to time and overall performance. To evaluate this, we performed testing on a 16-core computing device and employed 10 nodes to hold the data. We observed how the system is performing when a single node is down, two, and four, and recorded how long it was taking for the application to bounce back with the same data, which has been unchanged and has been successfully recovered. Similar to the previous graphs, we plotted the file sizes on the X-axis and the corresponding number of node failures on the Y-axis. We equipped a heat map to display the recovery times in comparison to the number of nodes that have been corrupted. We can see that for a particular file, for example, of size 10 KB, the recovery time increases with the increase in number of nodes that have failed. This pattern remains consistent with all the file sizes. Another observation that can be made is that with the increase in file size, the time to recover from node failure also increases. Most importantly, given out of 10 servers, 4 have failed, the application has successfully been able to recover all the lost data. The time to recover is recorded in seconds. VII. DISCUSSIONS Fog computing has emerged as a promising platform to overcome the challenges posed in contemporary computing settings by the enormous influx of data. The decentralized nature of fog computing provides faster computing and stores the data on different servers, where the entire data is not stored on the cloud server. But even this decentralized architecture has some security concerns. When we deal with sensitive data, it is important to ensure that the data is safeguarded against data exposure and unauthorized access. This is a very important aspect where there can be more robust and safer security algorithms used to safeguard the data. This paper proposes, along with the existing three-layer storage framework that [4] states, to involve a hybrid approach where both AES and Reed Solomon Code Algorithm are used. This ensures that even if a portion of the data on a certain server is compromised, the attacker will not be able to gain any information. The proposed algorithm goes beyond traditional methods and offers another layer of security, where data is protected during data transmission, and the system has a fault- tolerant design. Compared to the conventional data distribution techniques employed in fog computing environments, the suggested algo- rithm has significant advantages. Due to the inclusion of both a data confidentiality algorithm and a data recovery model, the foremost advantage is that the data cannot be either lost or hacked. The additional feature of allowing a user-private password keeps away everyone other than the user from performing encryption or decryption on the data that is to be stored on the data servers. This prevents all of the unauthorized cloud server access that could take place, and thus the data from getting stolen. By leveraging the processing power of edge devices, the proposed data distribution algorithm further improves effi- ciency in data retrieval and dissemination while lowering sys- tem latency. This enables efficient simultaneous data retrieval from various servers and improves data access times. This allows the entire system suitable for real-time. VIII. FUTURE WORK Some potential avenues for future work and development are for one, focusing on further enhancing the security measures. The research done has focused on working on integrating additional techniques to improve security, but further research can be done to strengthen data confidentiality by incorporating quantum-crypto techniques. Another aspect would be to im- prove the dynamic data distribution by developing dynamically adjustable data distribution based on the workloads, network conditions, and device capabilities. Scalability and performance are other aspects that can be improved, and also thinking about distributing the data in an energy-efficient manner is another aspect to look at. An interesting work that could be worked on is integrating machine learning-based security to detect any anomaly and or threats to remove the security threats, thereby improving the algorithm even more. IX. CONCLUSION Therefore, the proposed three-layer storage system for fog computing, which combines the Hash Solomon Code algorithm and AES encryption, provides a comprehensive solution for the security problems that inevitably occur in Rochester Institute of Technology 7 | Page
  • 8. RIT Computer Science • Capstone Report • 2228 these circumstances. The method significantly improves the efficiency, fault tolerance, and data secrecy of fog computing systems, strengthening their security posture. The successful development and evaluation of the algorithm in real-world deployment settings attest to its efficacy and suitability for data distribution in cloud and edge computing systems. Ensuring cloud security has remained an issue in the digital world, whereas the proposed strategy significantly advances data distribution and encryption approaches, paving the way for more secure and long-lasting fog computing infrastruc- tures. By conducting more research and looking into ways to improve the algorithm’s performance and scalability, it will only become clearer how useful the algorithm may be as a fundamental framework for safeguarding private information in distributed computing systems. REFERENCES [1] J. F. Gantz, D. Reinsel, and J. Rydning, “The u.s. datasphere: Consumers flocking to cloud,” IDC, 2019. [2] D. Kolevski, K. Michael, R. Abbas, and M. Freeman, “Cloud computing data breaches: A review of u.s. regulation and data breach notification literature,” in 2021 IEEE International Symposium on Technology and Society (ISTAS), 2021, pp. 1–7. [3] Y. Sun, J. Zhang, Y. Xiong, and G. Zhu, “Data security and privacy in cloud computing,” International Journal of Distributed Sensor Networks, vol. 10, no. 7, p. 190903, 2014. [Online]. Available: https://doi.org/10.1155/2014/190903 [4] T. Wang, J. Zhou, X. Chen, G. Wang, A. Liu, and Y. Liu, “A three- layer privacy preserving cloud storage scheme based on computational intelligence in fog computing,” IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, pp. 3–12, 2018. [5] S. Cao, H. Han, J. Wei, Y. Zhao, S. Yang, and L. Yan, “Space cloud-fog computing: Architecture, application and challenge,” in Proceedings of the 3rd International Conference on Computer Science and Application Engineering, ser. CSAE ’19. New York, NY, USA: Association for Computing Machinery, 2019. [Online]. Available: https://doi-org.ezproxy.rit.edu/10.1145/3331453.3361637 [6] H. Xu and D. Bhalerao, “Reliable and secure distributed cloud data storage using reed-solomon codes,” International Journal of Software Engineering and Knowledge Engineering, vol. 25, pp. 1611–1632, 11 2015. [7] M. Dworkin, E. Barker, J. Nechvatal, J. Foti, L. Bassham, E. Roback, and J. Dray, “Advanced encryption standard (aes),” 2001-11-26 2001. [8] A. Alrawais, A. Alhothaily, C. Hu, X. Xing, and X. Cheng, “An attribute- based encryption scheme to secure fog communications,” IEEE Access, vol. PP, pp. 1–1, 05 2017. [9] Q. Xu, C. Tan, Z. Fan, W. Zhu, Y. Xiao, and F. Cheng, “Secure data access control for fog computing based on multi-authority attribute-based signcryption with computation outsourcing and attribute revocation,” Sensors, vol. 18, no. 5, 2018. [Online]. Available: https://www.mdpi.com/1424-8220/18/5/1609 [10] X. Yi, R. Paulet, and E. Bertino, Homomorphic Encryption. Cham: Springer International Publishing, 2014, pp. 27–46. [11] J. Seol, S. Jin, and S. Maeng, “Secure storage service for iaas cloud users,” in Proceedings of the 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, ser. CCGRID ’13. IEEE Press, 2013, p. 190–191. [Online]. Available: https://doi- org.ezproxy.rit.edu/10.1109/CCGrid.2013.31 [12] L. Gaspar, “Crypto-processor - architecture, programming and evaluation of the security,” 11 2012. [13] G. Sharma and S. Kalra, “A novel scheme for data security in cloud computing using quantum cryptography,” in Proceedings of the International Conference on Advances in Information Communication Technology & Computing, ser. AICTC ’16. New York, NY, USA: Association for Computing Machinery, 2016. [Online]. Available: https://doi-org.ezproxy.rit.edu/10.1145/2979779.2979816 [14] T. Talaei Khoei, E. Ghribi, R. Prakash, and N. Kaabouch, “A perfor- mance comparison of encryption/decryption algorithms for uav swarm communications,” 02 2021. [15] H. S. C. Reddy, V. V. Karthik, D. V, A. Pavan, and S. V, “Data storage on cloud using split-merge and hybrid cryptographic techniques,” in 2022 International Conference for Advancement in Technology (ICONAT), 2022, pp. 1–5. [16] S. Duggal, V. Mohindru, P. Vadiya, and S. Sharma, “A comparative analysis of private key cryptography algorithms: Des, aes and triple des,” International Journal of Advanced Research in Computer Science and Software Engineering, vol. 6, p. 1373, 06 2014. [17] M. Yahaya and A. Ajibola, “Cryptosystem for secure data transmission using advance encryption standard (aes) and steganography,” Interna- tional Journal of Scientific Research in Computer Science, Engineering and Information Technology, vol. Volume 5, pp. 317–322, 12 2019. [18] D. U. Singh, “Error detection and correction using reed solomon codes,” Error Detection and Correction Using Reed Solomon Codes, vol. 3, 03 2013. Rochester Institute of Technology 8 | Page