This work highlights key concepts of data privacy, techniques for managing data privacy in general and in the cloud, and some recent research trends.
Data security and privacy
Data Privacy and Security
Rajab Ssemwogerere [2019/HD05/29911U] and Wamwoyo Faruk [2019/HD05/25248U]
Department of Computer Science, Makerere University
{srajab@cis.mak.ac.ug, faroukissa85@gmail.com}
Abstract. Data privacy is an intricate problem and is rapidly becoming a major
area of research in information technology and cloud computing. Data is
generated on a massive scale from many sources (smartphones, IP cameras,
wireless sensor networks) and is then collected, shared and disseminated across
many different sectors. As a result, many individuals and organizations have
shunned cloud services despite their considerable benefits. Data security and
privacy are therefore becoming more important for the future development and
improvement of cloud technologies in government, business and industry. Data
privacy protection issues are relevant to both hardware and software in the
cloud architecture. This study reviews data privacy concepts, approaches and
techniques for managing data privacy, data privacy techniques in the cloud, and
some research trends in data privacy.
Keywords: Data privacy; data protection; syntactic privacy; semantic
privacy.
1 Introduction
In this era of disruptive technological innovations such as IoT smart projects,
artificial intelligence, robotics, blockchain technology, advanced virtual
reality and 3D printing, our lives have gradually changed because these
technologies have made things simpler. Despite the enormous benefits these
technologies have brought to our communities and societies, they have also
significantly affected users’ privacy, since more and more personal data is
collected, processed, shared and dispersed. Examples include location
information, personal information on social media (e.g. WhatsApp and Facebook),
and medical data, e.g. the percentage of people who have been diagnosed with and
tested positive for Ebola [1]. Businesses and organizations collect, share and
distribute personal information for different reasons.
Organizations collect data to make informed decisions, to support further
analysis, study and research, and to provide more effective services. They share
personal data to meet a contractual obligation or to pursue a research project,
subject to the restrictions and conditions set by the General Data Protection
Regulation (GDPR): individuals must be aware that their data is being shared,
and sharing must be secure and documented. All this is to ensure that personal
data is adequately protected and properly handled by others.
However, sharing personal data has put many individuals at risk. The research
community has therefore devoted considerable effort to developing appropriate
definitions of privacy, along with data protection techniques specifically
targeted at enforcing them [2]. Several different definitions of data privacy
and associated techniques have been proposed over the years.
De Capitani di Vimercati et al. [2] group these definitions into two categories:
syntactic privacy and semantic privacy. Syntactic privacy definitions capture
the degree of protection enjoyed by data respondents with a numerical value.
Semantic privacy definitions are based on the satisfaction of a semantic privacy
requirement.
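As an illustration of a syntactic definition, k-anonymity (one of the syntactic
definitions surveyed in [2]) expresses protection as a single number k: every
combination of quasi-identifier values in the released table must be shared by
at least k respondents. The following is a minimal Python sketch of how that
number could be computed. The toy records and the names `RECORDS`,
`QUASI_IDENTIFIERS` and `k_of` are illustrative assumptions, not taken from [2].

```python
from collections import Counter

# Hypothetical released table: age range and ZIP prefix are treated as
# quasi-identifiers, diagnosis as the sensitive attribute.
RECORDS = [
    {"age": "20-29", "zip": "256**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "256**", "diagnosis": "malaria"},
    {"age": "30-39", "zip": "257**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "257**", "diagnosis": "ebola"},
    {"age": "30-39", "zip": "257**", "diagnosis": "flu"},
]
QUASI_IDENTIFIERS = ("age", "zip")


def k_of(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same quasi-identifier values (the k of k-anonymity)."""
    groups = Counter(
        tuple(r[q] for q in quasi_identifiers) for r in records
    )
    return min(groups.values()) if groups else 0


k = k_of(RECORDS, QUASI_IDENTIFIERS)
print(f"table is {k}-anonymous")
```

A larger k means each respondent hides in a larger group; generalizing the
quasi-identifiers further (for example, wider age ranges) is the usual way to
raise it.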
The main objective of our paper is to discuss some of the concepts of data privacy,
approaches / techniques of managing data privacy, data privacy techniques in
the cloud and some of the research trends in data privacy.
1.1 What is Data privacy?
Data privacy, also known as information privacy, is a branch of data security
that concerns how a piece of data is properly collected, handled, shared and
used [3-5]. In commerce, consumers’ privacy needs to be protected; in
organizations, privacy entails applying processes, standards and laws for
managing personally identifiable information [6].
1.2 Approaches/ techniques of managing data privacy.
Incorporating ‘privacy by design’ into IT systems: Taking this approach in
your security projects, by incorporating privacy and data protection from the
start, helps your organization comply with global data privacy regulations. It
applies when:
– Deploying any new IT infrastructure that stores or processes personal data.
– Implementing new security policies or strategies.
– Sharing any data with third parties or customers.
– Using data for any analytical purposes.
Conducting a privacy impact assessment (PIA): A PIA is a beneficial
tool used to identify and reduce the risk of poor data privacy practices in your
organization. These assessments reduce your risk of mishandling personal data.
Key stakeholders are involved in a PIA interview which results in identifying
potential privacy problems and offers recommendations on how to address chal-
lenges. Ultimately, a PIA will help an organization and security team develop
better policies and systems for handling sensitive personal data.
Unfortunately, managing data privacy can’t be treated as a check-box ex-
ercise. Global data privacy regulations are often loosely structured and can be
interpreted in many ways. There’s no defined standard of security controls on
how an organization should handle personal data and privacy. In reality, manag-
ing data privacy is about creating a comprehensive governance framework that’s
suited to your business alone.
Demonstrating compliance with global data privacy regulations: Demon-
strating compliance with global data privacy regulations is a long-term outcome
of implementing the right privacy and security controls with your people, pro-
cesses, governance and technology. It requires a steadfast approach to each of
these areas.
Data semantic and data syntactic techniques: Syntactic techniques
traditionally guarantee data protection while preserving the truthfulness of the
released information. Semantic techniques typically add noise to the released
data; this perturbs the original content of the dataset, thus achieving privacy
at the price of truthfulness [2]. Semantic techniques operate in two scenarios.
The non-interactive scenario consists of the release of a data collection; the
interactive scenario consists of evaluating queries over a private data
collection managed by the data holder. In both cases the aim is to guarantee
that the released data or query results collected by data recipients cannot be
exploited to gain information that should be kept secret.
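To make the noise-adding idea concrete, the sketch below answers a counting
query in the interactive scenario using the Laplace mechanism, a standard way
of satisfying differential privacy (the best-known semantic definition). It is
a minimal illustration under stated assumptions: the data, the privacy
parameter `epsilon` and the function names are hypothetical, and a counting
query is assumed to have sensitivity 1 because adding or removing one
respondent changes the count by at most 1.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(values, predicate, epsilon, rng):
    """Answer a counting query with noise calibrated to sensitivity 1."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(42)  # fixed seed only so the demo is repeatable
ages = [23, 35, 41, 29, 52, 61, 38, 27]
answer = noisy_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of respondents aged 40+: {answer:.2f}")
```

A smaller `epsilon` gives stronger privacy but noisier answers, which is the
privacy-versus-truthfulness trade-off described above.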
1.3 Techniques for managing data privacy in the cloud
Encryption of data within the application: This technique implements data
confidentiality directly by encrypting data within the application. From the
end user’s perspective, data is protected against external hackers, internal
attacks and phishing attacks. From the service provider’s perspective, it
significantly reduces risk and makes data protection a joint responsibility
between the provider and the users [7]. Examples of cloud encryption services
include Boxcryptor and uSav. This technique raises two major challenges. First,
how can the encrypted data be stored at the service provider without changing
the implementation of the applications? Second, since many functionalities
supported by an application require data to be in plaintext form (e.g. the
language translation function available in Google Docs), how can such
functionalities continue to be made available to the end user even though the
data is encrypted?
The first challenge was addressed using a variety of format-preserving
encryption techniques, and the second by an extensible middleware called
CloudProtect. CloudProtect enables user-driven application data to be stored on
the service provider side in encrypted form, adding an extra layer of protection
to the user data. It also transforms users’ requests so that they operate on the
encrypted data, and it facilitates key management and secure sharing of
encrypted data.
Fig. 1. RightScale 2018 State of the Cloud Report.
User authentication technique: Access to data stored in the cloud should be
granted only to authorized persons, which makes it crucial to restrict and
monitor who accesses the company’s data. To support user authentication,
organizations should be able to inspect data access logs and audit trails to
confirm that only authorized users view the data. These access logs and audit
trails must themselves be protected against threats and retained for as long as
the company requires them.
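One possible way to protect an audit trail against silent modification is to
chain its entries with a hash, so that each record commits to the one before
it. The sketch below is an illustrative assumption rather than a method
described in the cited works; the field names and the functions `append_entry`
and `verify` are hypothetical. A real deployment would also need to protect the
latest hash itself (for example by signing it or storing it separately),
because someone who can rewrite the entire log could recompute every hash.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body):
    """Hash a record body in a stable (sorted-key) JSON encoding."""
    return hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")
    ).hexdigest()


def append_entry(log, user, action, resource):
    """Append an access record whose hash covers the previous record."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"user": user, "action": action, "resource": resource,
            "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})


def verify(log):
    """Return True if the chain is intact, i.e. no record was altered,
    reordered, or removed from the middle of the log."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("user", "action", "resource", "prev")}
        if entry["prev"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, "alice", "read", "payroll.csv")
append_entry(audit_log, "bob", "write", "payroll.csv")
intact = verify(audit_log)          # chain as written
audit_log[0]["user"] = "mallory"    # simulate tampering with a record
tampered_ok = verify(audit_log)     # verification now fails
print(intact, tampered_ok)
```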
Cryptography technique: This technique protects data using three different
types of algorithms: symmetric-key, asymmetric-key and hashing algorithms.
Symmetric-key algorithms use the same cryptographic key for both encryption of
plaintext and decryption of ciphertext. Their major drawback is that this single
key must be shared between the communicating parties and kept secret by all of
them. These algorithms are primarily used for the bulk encryption of data or
data streams. Once data is encrypted with a given key, there is no fast way to
decrypt it without possessing the same key. Symmetric-key algorithms are divided
into block and stream algorithms: block algorithms encrypt a block of data (many
bytes) at a time, while stream algorithms encrypt one bit or byte at a time.
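The defining property of a symmetric scheme, one key for both directions, can
be shown with a toy stream cipher built only from the Python standard library.
This is a teaching sketch under loud assumptions: the keystream construction
(SHA-256 over key, nonce and a counter) and the names `keystream` and
`xor_crypt` are illustrative, the scheme provides no integrity protection, and
it must not be used to protect real data. A production system would rely on a
vetted algorithm from a maintained cryptographic library.

```python
import hashlib
import secrets


def keystream(key, nonce, length):
    """Derive `length` pseudo-random bytes from key, nonce and a counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(
            hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        )
        counter += 1
    return bytes(out[:length])


def xor_crypt(key, nonce, data):
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    ks = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


key = secrets.token_bytes(32)    # the single shared secret
nonce = secrets.token_bytes(16)  # must never repeat under the same key
ciphertext = xor_crypt(key, nonce, b"patient record 17")
plaintext = xor_crypt(key, nonce, ciphertext)  # same key, same function
print(plaintext)
```

Because the same key both encrypts and decrypts, everyone who needs to read the
data must hold that key, which is the key-distribution drawback noted above.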
Asymmetric-key algorithms were developed after symmetric-key algorithms. Instead
of a single shared key, they use a key pair, a public key and a private key, to
encrypt and decrypt messages when communicating. The public key can be given to
anyone, trusted or not, who might want to send you a message, while the private
key must be kept secret so that only you can read what that message contains.
Asymmetric-key algorithms provide both authentication and confidentiality.
Hashing algorithms do not hide data; they are used only to ensure its
integrity.
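The roles of the two keys, and the integrity check that hashing provides, can
be sketched with textbook RSA on deliberately tiny numbers. This is an
assumption-laden illustration only: the primes, exponent and message are
arbitrary small values chosen for readability, no padding is used, and keys of
this size offer no security. Real systems use large keys and a maintained
library.

```python
import hashlib

# Toy RSA key pair built from tiny primes (insecure; illustration only).
p, q = 61, 53
n = p * q                          # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                             # public exponent
d = pow(e, -1, phi)                # private exponent (Python 3.8+)

message = 65                       # a message encoded as an integer < n
ciphertext = pow(message, e, n)    # anyone can encrypt with (e, n)
recovered = pow(ciphertext, d, n)  # only the holder of d can decrypt

# Authentication: sign with the private key, verify with the public key.
signature = pow(message, d, n)
signature_ok = pow(signature, e, n) == message

# Integrity: the digest changes if even one byte of the data changes.
digest = hashlib.sha256(b"quarterly report v1").hexdigest()
unchanged = hashlib.sha256(b"quarterly report v1").hexdigest() == digest
modified = hashlib.sha256(b"quarterly report v2").hexdigest() == digest

print(recovered == message, signature_ok, unchanged, modified)
```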
Using secured protocols like HTTPS: According to [8], such protocols provide
data confidentiality but do not on their own guarantee data integrity. HTTP
stands for Hypertext Transfer Protocol. HTTPS (Hypertext Transfer Protocol
Secure) is an extension of HTTP that is widely used on the Internet. In HTTPS,
the communication is encrypted using Transport Layer Security (TLS), the
successor of the Secure Sockets Layer (SSL). HTTPS authenticates the accessed
website and protects the privacy and integrity of the data while it is
exchanged. By creating a secure channel over an insecure network, it also
protects against man-in-the-middle attacks, which are likely when a company is
deployed in the public cloud, whose infrastructure is completely managed by a
third party, the service provider.
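On the client side, website authentication depends on certificate and hostname
verification being switched on. The sketch below uses Python's standard `ssl`
module to build such a configuration. The minimum-version choice and the helper
`fetch` are illustrative assumptions; `fetch` is defined but not called here
because it needs network access.

```python
import ssl
import urllib.request

# Default client context: certificate verification and hostname checking
# are enabled, which is what authenticates the accessed website.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older versions


def fetch(url):
    """Hypothetical helper: download a page over HTTPS using the context
    above. An invalid or mismatched certificate raises an exception
    instead of being silently accepted."""
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()


print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Disabling either check would leave the channel encrypted but unauthenticated,
which is exactly the situation a man-in-the-middle attacker exploits.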
VPN (Virtual Private Network): A VPN provides secure communication. Many cloud
service providers offer VPNs to protect the user’s connection to and across the
Internet. A VPN enables users to send and receive data across shared or public
networks as if their computing devices were directly connected to the private
network. VPNs cannot make cloud platforms completely secure, but they increase
data privacy and security. Secure VPN protocols include Internet Protocol
Security (IPsec), Transport Layer Security (SSL/TLS) and Datagram Transport
Layer Security (DTLS), among others; they provide data confidentiality, data
authentication and data integrity.
1.4 Some research trends in data privacy.
Data privacy is a major concern, especially on cloud platforms, motivating some
companies to build their own clouds to escape these issues [9]. The introduction
of cloud computing has brought a number of risks, opportunities and
possibilities for new innovations. In this section we focus on some of the
relevant solutions and developments that have been proposed and their long-term
effects. Common security issues around cloud computing are divided into four
main categories [10]: cloud infrastructure, data, access and compliance. Our
major focus is data privacy, looking at data integrity, data lock-in, data
remanence, provenance, data confidentiality and user-privacy-specific concerns.
Authentication and identity management: Using an identity management (IDM)
mechanism, users can easily access their personal information and make it
available to various services across the Internet. An IDM mechanism can
authenticate users and services based on their credentials and characteristics.
Existing password-based authentication systems have limitations and pose
significant risks, whereas an IDM-based system protects private and sensitive
information related to users and processes [9].
Access control and accounting: Access control services should be flexible
enough to capture dynamic, context-, attribute- or credential-based access
requirements and to enforce the principle of least privilege. Such access
control services might need to integrate privacy-protection requirements
expressed through complex rules. It is important that the access control system
employed in clouds is easily managed and that its privilege distribution is
administered efficiently [9].
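An attribute-based check with a deny-by-default policy is one way such
requirements could be expressed. The following sketch is a hypothetical
illustration: the attributes, roles and rules are assumptions made up for the
example and do not come from [9].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    role: str                   # attribute of the user
    department: str             # attribute of the user
    action: str                 # requested operation
    resource_owner: str         # department that owns the resource
    on_corporate_network: bool  # context attribute


# Each rule grants one narrowly scoped permission (hypothetical policy).
RULES = [
    # Doctors may read records owned by their own department.
    lambda r: (r.role == "doctor" and r.action == "read"
               and r.department == r.resource_owner),
    # Auditors may read any record, but only from the corporate network.
    lambda r: (r.role == "auditor" and r.action == "read"
               and r.on_corporate_network),
]


def is_allowed(request):
    """Deny by default; allow only if some rule explicitly matches."""
    return any(rule(request) for rule in RULES)


own_dept = Request("doctor", "cardiology", "read", "cardiology", False)
other_dept = Request("doctor", "cardiology", "read", "oncology", True)
print(is_allowed(own_dept), is_allowed(other_dept))
```

Because nothing is permitted unless a rule says so, adding a new role or
resource never widens access by accident, which is the essence of least
privilege.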
Consumer activism: Consumer awareness and involvement will give more weight
to data privacy and to how it has been, or will be, applied and implemented.
Consumers will become more eager to know how organizations or companies obtained
their data, e.g. how a consumer’s email address was compromised so that they
started receiving spam emails. Data breaches and court cases over data misuse
have eroded the trust that individuals place in for-profit and non-profit
entities. We firmly believe that this damage can be repaired, but it will take
work on the part of organizations to win trust through transparency.
Ethical questions around automation: Data privacy around automated disruptive
technologies like IoT, mobile and wearable devices, combined with machine
learning and artificial intelligence, is still encountering difficulties. First,
some organizations, e.g. Facebook, still profit from the use of a person’s
information. Second, there is an ethical dilemma around anonymized data.
Consider wearable health devices that track individuals’ health activity
patterns, whose results are analyzed anonymously by healthcare researchers using
AI. If one of these researchers finds a correlation between a certain reading
and a healthcare risk, is there an ethical obligation to inform users who
exhibit this pattern? If the data is fully anonymized, however, this should not
be possible.
1.5 Conclusion
Data privacy is a heterogeneous term that is subject to several definitions. In
this paper, we first defined data privacy, then explained techniques for
managing data privacy in the cloud, and lastly discussed some future research
trends in data privacy, which provide a basis for developing innovation
strategy and future orientation.
References
1. S. I. Okware et al., "Managing Ebola from rural to urban slum settings:
experiences from Uganda," African Health Sciences, vol. 15, no. 1, pp. 312-321,
2015.
2. S. De Capitani di Vimercati, S. Foresti, G. Livraga, and P. Samarati, "Data
privacy: definitions and techniques," International Journal of Uncertainty,
Fuzziness and Knowledge-Based Systems, vol. 20, no. 06, pp. 793-817, 2012.
3. D. E. Robling Denning, Cryptography and Data Security. Addison-Wesley Longman
Publishing Co., Inc., 1982.
4. V. Diamantopoulou, A. Tsohou, and M. Karyda, "General Data Protection
Regulation and ISO/IEC 27001:2013: Synergies of Activities Towards
Organisations’ Compliance," in International Conference on Trust and Privacy in
Digital Business, 2019: Springer, pp. 94-109.
5. M. Li, W. Lou, and K. Ren, "Data security and privacy in wireless body area
networks," IEEE Wireless Communications, vol. 17, no. 1, pp. 51-58, 2010.
6. Y. Sun, J. Zhang, Y. Xiong, and G. Zhu, "Data security and privacy in cloud
computing," International Journal of Distributed Sensor Networks, vol. 10,
no. 7, p. 190903, 2014.
7. M. H. Diallo, B. Hore, E.-C. Chang, S. Mehrotra, and N. Venkatasubramanian,
"Cloudprotect: managing data privacy in cloud applications," in 2012 IEEE Fifth
International Conference on Cloud Computing, 2012: IEEE, pp. 303-310.
8. M. Y. Pandith, "Data security and privacy concerns in cloud computing,"
Internet of Things and Cloud Computing, vol. 2, no. 2, pp. 6-11, 2014.
9. J. Sen, "Security and privacy issues in cloud computing," in Cloud
Technology: Concepts, Methodologies, Tools, and Applications: IGI Global, 2015,
pp. 1585-1630.
10. S. Sengupta, V. Kaulgud, and V. S. Sharma, "Cloud computing security: trends
and research directions," in 2011 IEEE World Congress on Services, 2011: IEEE,
pp. 524-531.