SlideShare a Scribd company logo
1 of 14
Download to read offline
Secure File Sharing on Cloud
Attribute-Based Access Control and Keyword Search over Encrypted Data
Rachita Gupta, Supriya, Qiuxiang Dong
School of Computing, Informatics and Decision Systems Engineering
Arizona State University, Tempe, AZ, USA
{rgupta36, sashokk2, qiuxiang.dong}@asu.edu
Abstract— Cloud computing is a promising new technology where computing resources are provided as a service
to users over the internet. Since cloud storage provides greater flexibility and availability at a much lower investment,
more and more users are attracted to outsource their data on the cloud servers. As more sensitive data are outsourced
on cloud server environment, the security of the data may be compromised. To combat unsolicited accesses from
malicious insiders (e.g., cloud server administrator) and outsiders (e.g., hacker), these sensitive data have to be
encrypted before outsourcing. Although data confidentiality could be protected by traditional encryption schemes
(e.g., AES, RSA), it is difficult to implement data sharing and searches over the encrypted data.
This project aims to implement secure file sharing in cloud storage through utilizing some recently proposed
encryption schemes. Firstly, the data owners encrypt their files by Attribute-Based Encryption (ABE) scheme so that
only authorized data users could decrypt these files. Second, Attribute-Based Keyword Searchable (ABKS)
encryption scheme is implemented, thus enabling the authorized data users to search over the encrypted files to find
their desired files. Finally, some parallel processing approaches will be employed in this project to enhance search
efficiency, when there are huge number of files stored on the cloud servers.
Keywords— Cloud Computing, Data Leakage, Secure Cloud Storage, Attribute-Based Encryption (ABE), Attribute-
Based Keyword Search (ABKS), Access Control, Keyword Search
INTRODUCTION
Cloud computing has emerged as an innovative technology which has changed the way we do computing by
providing all services and applications through the internet. It is an ecosystem which enables on-demand access to pool
resources with greater flexibility and availability at a much lower investment cost by allowing the users to pay only for the
resources and services they use. The main concept of cloud computing is not to own any hardware or software but to avail
the computational resources, storage databases or any other services at a minimal cost to the cloud provider. Cloud
infrastructure incorporates the five characteristics which are on-demand service, broad network access, resource pooling,
rapid elasticity and measured service. Adopting the cloud for enterprises has empowered businesses by ensuring lesser cost,
faster deployment, lesser acquisition of physical assets and much greater customer satisfaction.
Since many IT enterprises are hosting their services, applications and data on cloud, there is a growing concern
regarding the privacy of the sensitive data being compromised. The most obvious solution to this problem is to apply some
encryption techniques before uploading or sharing the data on cloud. But then how the uploaded encrypted data would be
used on cloud is a new challenging problem that needs to be addressed. This project involves the search over encrypted data
but most of the techniques selectively search the data only at a coarse-grained level i.e., by sharing the private key with the
other authorized users. But this is not a feasible solution in the multi-owner and multi-user cloud based file storage system
where owner either has to act as an intermediary and decrypt all the files it wants to share with some other parties or give
them its private decryption key, giving access to all its data. Hence, to enable multiple owners to encrypt and upload their
data files to the cloud storage and make them searchable by other users, a new cryptosystem “fine grained search
authorization” was proposed that only allows the authorized users to search over the encrypted data [1].
“Fine grained” implies that the search authorization is controlled at the granularity of each file and exploits the
attribute-based encryption (ABE) technique introduced by Sahai and Waters [2]. In attribute based encryption the data
owner encrypts the index of each file with a self defined access policy, which dictates that which type of user can search
this index [1]. An access control policy would define the type of users who would have permission to access the documents.
For instance, in an academic institute only the professor of CS department or the TA of that course would have access to
the grade sheet for a particular course. This can be expressed in form of a policy as below. Hence only the users satisfying
this attribute predicate would be able to access the grade sheet.
((Professor ∧ CS dept.)∨(CS student ∧ course TA))
The keywords and attributes are defined separately in the report. Keywords are the actual content of the files and attributes
are the properties of the users[1]. The owner of data creates the index of all the keywords in the data file but encrypts the
index with the access control policy structure based on the attributes of the authorized users [1]. Attribute-Based Keyword
Searchable encryption scheme (ABKS) is a data-owner-enforced search authorization system that only users who meet
the owner-defined access policy can retrieve the valid search results [1]. Since the searchable scope is very large in the
cloud environment and cryptographic search has to be done with linear time complexity, so some parallel computing
approaches are used to enhance search efficiency.
The outcome of this project would be the implementation of attribute-based keyword search, and attribute-based
data access control using parallel processing. Table 1 describes the various tasks taken up by the following group members
in the defined time frame to the project completion.
SYSTEM MODELS
System Model
There are three kinds of servers in our system (ref. Figure 1). The first is key generation server (namely Trusted
Authority, TA for short), which generates private keys for users. The second is metadata server, which is responsible for
managing the metadata and also process data uploading, data search requests from data contributors (owners) and data
readers (users) respectively. The third is data storage server, which stores the files. There are multiple data contributors
(or owners) and multiple data readers. The design goals of this system are as follows.
 Data access control: Enable data-owner-enforced data access control, i.e., only users who meet the owner-defined
access policy can decrypt the files.
 Authorized Keyword Search: Enable data-owner-enforced search authorization, i.e., only users who meet the
owner-defined access policy can obtain the valid search results.
 Multiple Data Owners and Data Readers: accommodate multiple data owners and data readers. Each reader is able
to read and search over the encrypted data contributed by multiple data owners.
 Security Goals: Protect both the contents and the keywords privacy.
 Efficiency Goals: Search operations could be implemented by the cloud server efficiently.
The workflow of our system is presented in Figure 3. The detailed description is as follows.
 System Setup: TA runs the system setup algorithm to generate the public parameters and master secret key. The
public parameters are distributed to the whole system, including the servers, the data owners and the data users. The
master secret key is kept private by TA and is used in the Key Generation phase below to generate private key for
users.
 Key Generation: TA takes public parameters, master secret key and a user attribute list (sent from data user) to
generate private key for the user.
Task assigned Timeline Group Member
Team Formation 25th Aug - 28th Aug
2016
Idea and Design development
Time and task allocation
29th Aug - 8th Sep
2016
Rachita Gupta,
Supriya,
Qiuxiang Dong
Project Proposal Report Writing 9th Sep - 14th Sep
2016
Rachita Gupta,
Supriya,
Qiuxiang Dong
Implementation
Data structure design and Environment Setup
15th Sep - 17th Sep
2016
Rachita Gupta,
Supriya,
Qiuxiang Dong
Algorithm design and
implementation
Client Server Model 18th Sep - 25th Sep
2016
Supriya
Cryptographic
Algorithm
18th Sep - 25th Sep
2016
Qiuxiang Dong
Parallel processing
18th Sep - 25th Sep
2016
Rachita
System Integration
26th Sep - 29th Sep
2016
Rachita Gupta,
Supriya,
Qiuxiang Dong
Testing 30th Sep - 1st Oct 2016 Rachita Gupta,
Supriya,
Qiuxiang Dong
Second Report writing 2nd Oct - 5th Oct 2016 Rachita Gupta,
Supriya,
Qiuxiang Dong
Testing and troubleshooting 6th Oct - 15th Oct 2016 Rachita Gupta,
Supriya,
Qiuxiang Dong
Final Report writing 16th Oct - 26th Oct
2016
Rachita Gupta,
Supriya,
Qiuxiang Dong
Table 1: Project Task allocation and Timeline
 File Upload: When the data owner uploads a file, the following operations are performed. First, encrypt the file f
with a symmetric encryption algorithm (e.g., AES). Second, encrypt the AES key (K) with an Attribute-Based
Encryption algorithm (ABE for short). Third, generate the Keyword List (KL) for the file and encrypt the keywords
with the keyword encryption algorithm of the Attribute-Based Keyword Searchable Encryption scheme (ABKS for
short). Finally, the user sends the ciphertext of the f, K, and KL to the metadata server, i.e., the index and encrypted
file.
Upon receiving the file uploading request from the data contributor, the metadata server will store the following
item (ref. Figure 2) for this request and store the file on the storage server(s).
Figure 1: System architecture
Figure 2: Storage format on metadata server
 Token Generation: When a data user wants to search his/her interested files, he/she will generate a search ToKen
(TK) for his/her interested keyword (e.g., kw) with his/her private key and send the token to the metadata server.
 Search: Upon receiving the search request, the metadata server runs the search algorithm of the ABKS scheme to
check whether a file contains the keyword (i.e., kw is included in some KL). On the one hand, randomness in
included in the ciphertexts, so the complexity of the search will be linear. On the other hand, there are a lot of files.
Therefore, it will be time-consuming to perform keyword search over encrypted data. To solve this problem, some
parallel computing technology will be used on metadata server. If some file(s) contain the interested keyword, the
metadata server will send back the corresponding ABE(K) and encrypted files to the data reader.
 File Decryption: Upon receiving the search results, the data reader decrypts ABE(K) and obtains K, which can be
used to decrypt the files.
Software
PBC [8] and OpenSSL [9] library will be utilized to implement the cryptographic algorithms. On the file system, to
search a particular file, Apache spark will be used. For the communication between the metadata server and the user, the
project uses client- server architecture.
Security Model (optional)
The TA is assumed to be fully trusted. The cloud servers (including the metadata server and the data storage servers) are
semi-honest, which means they will run the pre-defined algorithms or protocols honestly but they will try to find private
information based on background knowledge or other information they could get. The data readers are assumed to be
malicious. They want to access the files of which they are actually not authorized to read. They might collude with each
other.
Key Generation
Figure 3: Message sharing between entities
PROJECT DESCRIPTION
With the advent of cloud computing, a new computing paradigm, more and more users and enterprises are shifting their
data to cloud storage servers. As more sensitive data are outsourced on cloud server environment the new challenging
problem is that confidentiality or privacy of the data may be compromised, and access control over the data becomes difficult
to handle, because the commercially operated clouds are likely to be outside the trusted domain of the users. The obvious
solution to ensure the confidentiality and desired access control of data is to encrypt the data and share the private key with
the authorized users. However this can only implement coarse-grained level access control and is an unscalable solution
involving lot of computation for key distribution and management on the client side. This project explores the scenario of
multiple owners uploading their encrypted data files to the cloud storage and make them searchable to multiple users by
“fined grained search authorization”, which is an access control policy method that makes the data files accessible only to
the authorized users without revealing the actual data content.
Attribute-Based Keyword Searchable Encryption scheme (ABKS), a data-owner-enforced authorization search system,
is used that only allows users who meet the owner-defined access policy to search over the encrypted data [1]. Some parallel
processing approaches will be employed to achieve faster keyword search over encrypted data in the huge searchable space
of cloud environment.
Figure 4: This project will use the above architecture. This architecture is provided by the thoth lab.
RESPONSIBILITY MACHINE
Metadata Server devstack 1
Trusted Authority(TA) devstack2
Data Storage Server devstack 1/2
user VM01
Table 2: Server Machines in system architecture and their responsibilities
Project Overview
Task 1 : Environment Setup
Figure 4 shows the environment setup architecture in thoth lab. The different types of servers needed in this project are
key generation server (i.e., TA), metadata server, and data storage server. Table 2 shows the various machines in the system
and their various responsibilities.
1. Trusted Authority: This server is responsible for setting up cryptographic system, i.e., public parameters and master secret
key, and generating private keys for all the users. After system setup it could be offline until when new users join the system.
2. Metadata server: This server stores metadata of users’ files, including searchable keywords ciphertexts, encrypted key
as well as the references to the files stored on the data storage server. It processes the data uploading request from data
owners and data search requests from data users.
3. Data storage Server: These server stores all the files on the cloud storage. Number of data storage server may vary
depending on the requirement. This project uses two data storage servers. Some cloud storage management platform will
be installed there. The physical machine or VM should provide large storage space, or else open cloud computing platform
could be used.
Task 2: Data structure design
Design data structure of each file’s index stored on the metadata server. Design data structure used in the ABE and ABKS
algorithm.
Task 3: Algorithm design and implementation
Design and implement ABE and ABKS algorithms. Design and implement parallel searching algorithm with some
parallel processing approach.
TEAM MEMBER TASKS PERCENTILE
Rachita Gupta Parallel Computations on the metadata using apache spark. This will help in faster search
by the user on the data stores.
33.3%
Supriya (Team
Leader)
Setting up client server communication between the user and the servers. 33.3%
Qiuxiang Dong Implement file encryption, attribute-based encryption and the attribute-based keyword
encryption schemes.
33.3%
Table 3: Task distribution
Task 4: System development
Integrate all the algorithms and components into a whole system. Web service development will be needed in this part.
Task 5: System Testing
Use some real-world dataset to test whether the system could work effectively and efficiently.
Project Task Allocation
Project will be completed by 3 members of the team. Table 3 provides details of the task allocation of each member in the
project.Once individual tasks are completed all the team members will integrate the entire project to obtain full functionality.
Deliverables
This project will create a software package for the user. This software helps in interacting with the TA to obtain private key.
Also, it will provide encryption algorithms, which will use the user’s private key and encrypt the files, symmetric encryption
key and keyword list. It will also create service for the cloud system which will provide users secure file sharing on cloud.
Project Timeline
Figure 5 shows the Gantt chart for this project depicting the detailed task-time distribution.
Figure 5: Gantt Chart for the project
IMPLEMENTATION DETAILS
System Setup (run by TA)
System setup is done by TA by running the function “SystemSetup()” in our code “Setup.c”. This function takes
use of a cryptographic parameter file a.param as the security parameters for the whole system and generates the public key
as defined by the data structure “PublicKey”, and a master secret key as defined by the data structure “MasterKey”. We
store them in “PK.bin” and “MSK.bin” respectively. “PK.bin” is shared by all the entities in the whole system, and
“MSK.bin” is kept secret by TA himself and will be used to generate private keys for users in the key generation phase.
The algorithm “SystemSetup()” is as follows:
The data structure of PublicKey and MasterKey is as follows.
Key Generation (algorithm run by TA, data transmission between TA and users, storage is needed on client
side)
After setting up, TA has public parameters stored in “PK.bin” and master secret key “MSK.bin”. If one user wants
to obtain a private key for searching over the encrypted files stored on the cloud servers and decrypt the corresponding files,
he/she would send an attribute list to TA. The attribute list is a boolean array. The user defines this attribute list in a file and
sends it to the TA.
After the TA receives the attribute list, it runs the function “ABEKeyGen(int AttributeList[ ], struct PrivateKey*
PrivK)” in “ABEKeyGen.c” to generate a private key for that user. The first input “AttributeList[ ]” is the attribute list
of that user, it is an ATT_NUM-length 0/1 array, where ATT_NUM is defined to be the number of attributes in the whole
system. 0 indicates that the user does not have the corresponding attribute while 1 indicates he/she has that attribute. So this
attribute list defines a user’s access privilege.The second parameter “PrivK” stores the generated private key for that user.
The algorithm of private key generation is as follows.
The private key data structure is defined as follows.
The generated private key would be stored in a file named “PrivK.bin” and sent by TA to the user in a secure way.
To obtain the private key, the data owner requests the TA for the key by first establishing a secure TCP server-client socket
and then sending an attribute list.
1. Create the socket. The three arguments are:
1) Internet domain 2) Stream socket 3) Default protocol (TCP in this case)
2. clientSocket = socket(PF_INET, SOCK_STREAM, 0);
3. Configure settings of the server address struct
4. Set port number, using htons function to use proper byte order
5. serverAddr.sin_port = htons(7891);
6. Set IP address to localhost
7. Set all bits of the padding field to 0
8. Connect the socket to the server using the address struct
9. addr_size = sizeof serverAddr;
10. connect(clientSocket, (struct sockaddr *) &serverAddr, addr_size);
11. Read the attributelistfrom the server into the buffer
12. Reply with a private key file(PKey.bin)
Client
1. Create the socket. The three arguments are
2. 1) Internet domain 2) Stream socket 3) Default protocol (TCP in this case)
3.
4. Address family = Internet
5. Set port number, using htons function to use proper byte order
6. Set IP address to localhost
7. Set all bits of the padding field to 0
8. Bind the address struct to the socket
9. Listen on the socket, with 5 max connection requests queued
10. Accept call creates a new socket for the incoming connection
11. Send attributeList to the socket of the incoming connection
Server
File Upload (run by data owner, transmission between data owner and metadata server, storage is needed
on server side)
For secure file sharing, the data owner could encrypt the file and generate the corresponding index for the file. The
functionality is implemented by codes in “FileUpload.c”. Data owner first generates searchable keyword ciphertexts by
calling the function “SecureIndexGeneration(int Policy[], char* keywords[])”, where the first parameter “Policy” is the
access control policy that data owner defines for searching and decrypting the uploaded file, it is a 0/1/2 array, 1 denotes
that the corresponding attribute is required to exist in a data users’ attribute list and 2 denotes that the corresponding attribute
is required to not exist in a data user’s attribute list, while 0 means the data user could either have or not have the
corresponding attribute. For example, if we set ATT_NUM to be 10, then Policy = [1, 1, 2, 0, 0, 0, 0, 0, 0,0] means that
only the data user who has the first and second attribute and does not have the third attribute will be a legal data user.
Keywords are the keywords chosen by the data owner for that uploaded file.
The data owner also generates an AES key to encrypt that file by calling the function “ SecretKeyGen(char*
filename) ” and uses this key to encrypt the uploaded file by calling “FileEncryption(char* PlaintextFile, char* KeyFile,
char* CiphertextFile)”, the name of the generated ciphertext file is “char * CiphertextFile” (the third parameter of this
function). Then the data owner encrypts this AES key by ABE encryption algorithm by calling the function
“ABEFileKeyEncrypt(char* KeyFile, int Policy[ ])”, which calls twice function “ABEEncrypt(int Policy[], element_t
plaintext, struct ABECiphertext* abeciphertext)”. The ABE ciphertext is defined by the data structure
“ABEAESCiphertext”.
The searchable keyword ciphertext is defined by the data structure:
Once the user finishes generating secure index and the encrypted file, he/she makes a file upload request to the
server. For this request, we have set up a secure server client communication between the user and the server. This is a TCP
connection which uses the IP address of both and a unique port number. After the connection has been set up, the service
installed at the data owner site, sends three items to the server on the TCP socket.
These 3 items are the following:
1. Encrypted file
2. Index 1: Attribute list (Defining his/her access privilege) (plzz change according to the description above, I
remember we have a picture to show what you need to transmit, although the index should be generated and stored
by two parts, one for keywords and the other for AES key, when transmission, one encrypted file corresponds to
one encrypted file, plzz revise accordingly, or else a little bit confusing)
3. Index 2: Keyword search list
DATABASE IMPLEMENTATION
Once the encrypted file and the index is generated , they are sent to the server over the TCP socket and the client connection
is closed. The server socket is always open and waiting for the user’s request. Once the server gets a request it stores the
encrypted file and the indexes to store in the database.
The filename and its two corresponding indexes are stored in the persistent mysql database for storage and faster retrieval
at the metadata server. Here the indices are stored in the database as a String after converting element_t in the String format.
We need to read the database entries which running the search algorithm for Trapdoor, by converting the String back into
element_t.
//Write str in the database by converting element_t ele to String at Metadata server
element_t ele;
element_init_GT(ele, pairing);
element_from_bytes(ele, buffer);
element_snprintf(str ,size_t, "%B",ele);
//Read str from the database in element_t ele for Trapdoor
element_t ele;
element_init_GT(ele, pairing);
element_set_str(ele,str,10);
The database storage is implemented on the metadata server (devstack 1 in the system model). Below are the database
implementation details. The various libraries installed for the mysql are mysql-client, mysql-server, libmysql++-dev and
libmysqlclient-dev. Whenever a file is received from the client(VM01 in the system model) to the metadata server it is
stored on the system and its indexes and the filename are populated in the mysql database.
C implementation details of connecting to the database and querying it are as below.
MYSQL *con = mysql_init(NULL);//Allocates and initializes a MYSQL object
mysql_real_connect(con, "localhost", "root", "pwd","filekey_table", 0, NULL, 0)//connect to the database
mysql_query(con1, "CREATE DATABASE filekey_table");//Query the database
mysql_query(con, "CREATE TABLE filekey_table(Varchar(10) filename, varchar(10) index1, Varchar(10) index2)")
//Create the table
mysql_close(con);//Close connection to the mysql database
We can check the database by running the command and it takes us to the mysql shell.
mysql -u root -p
We can run the following commands in the mysql shell to check the persistent database and it entries.
SHOW DATABASES;
USE db;
SHOW TABLES;
SELECT * FROM table_name;
Token Generation (run by data user, data transmission between data user and metadata server)
This is an algorithm enabling data user to generate a token (or trapdoor) with inputs: private key and interested
keywords. This trapdoor could be treated as a special search request, where the searched keywords are kept private. The
algorithm is as follows:
The data structure of trapdoor is as follows:
The trapdoor is generated by the data user and then sent to the metadata server as follows.
Search (run by metadata server, data transmission between data user and metadata server)
The search algorithm is run by the metadata server when the data user sends his/her search request (i.e., the
trapdoor). There are a large number of indexes stored on the metadata server. Each index has two parts, i.e.,
keyword ciphertext and AES key ciphertext. In the search phase only the first part is needed. Denote the algorithm
as follows, where Indexes denotes all the indexes stored on the metadata server and trapdoor denotes the search
trapdoor received from the data user.
Search( Indexes, trapdoor ):
ENCFILE = empty set;
AESKEYCIPHERTEXT = empty set;
for (each ind (the first part of the index) stored on the metadata server)
{
Read ind from Database stored on Metadata server
if (SearchIndex(ind, trapdoor) = 1)
{
add the corresponding encrypted file into ENCFILE;
add the corresponding AES key ciphertext into AESKEYCIPHERTEXT;
}
}
return ENCFILE and AESKEYCIPHERTEXT;
If the equation above holds, then w=w’, the algorithm return 1, that means the data user is interested in
the corresponding file, so the metadata server should send back the second part (i.e., AES key ciphertext) and the
encrypted file back to the data user. After running the Search algorithm, if ENCFILE and
AESKEYCIPHERTEXT are not empty sets, then they should be sent to the data user by the metadata server.
We parallelize the linear search process as follows:
We send back the encrypted files and the corresponding AES key ciphertext as follows:
At the end of this phase, the data user should receive the encrypted files and the corresponding AES key
ciphertext.
Decrypt (run by the data user locally)
Upon receiving the encrypted files and the corresponding AES key ciphertext. The data user runs the
following algorithm.
For each encrypted file in ENCFILE, and each AES key ciphertext in AESKEYCIPHERTEXT. First
decrypt the AES key ciphertext by running the algorithm “ABEFileKeyDecrypt(struct ABEAESCiphertext
abeaes, char* keyfilename)”, the first input is the AES key ciphertext, the second is the file storing the decrypted
AES key. This algorithm calls “ABEDecypt(stuct ABECiphertext abeciphertext, element_t *m)”, where
abeciphertext is the ciphertext to be decrypted and m is the corresponding plaintext, there is another implicit input
“PrivateKey PrivK” (we set that as a global variable in our program). The ABE decryption algorithm runs as
follows.
The ABE decryption algorithm decrypts the AES key ciphertext and gets the AES key, stores it ino a file,
with name “keyfilename”. By running the AES decryption algorithm with the AES key and the encrypted file as
input, the data user could obtain the plainttext file.
RISK MANAGEMENT OF THE PROJECT
Table 4 shows the various probable risks in the project and the possible mitigation risk plans to avoid them.
Probable Risks Risk Mitigation
Cloud Computing is a new domain for all the members. Complete
knowledge of all the tools and technology is difficult. Having
time as a constraint, research may consume lots of time.
All the members will sit together and discuss with the
professor about each and everything. Devoting proper and
efficient time in research in correct direction as guided by
the professor.
Technology is changing at a very fast pace. Being aware of
everything is not an easy task for anyone. This could be an issue
as choice of softwares, tools and technology may not be updates
or correct.
In depth knowledge of the domain is required to solve this
issue. Lots of research and paper reading reading is to be done
for this.
Project planning if not done properly, estimations made without
right calculations can lead to technical risks. Main issue could be
wrong estimation of schedule and effort.
To solve this issue all the team members will have regular
meetings. These will involve discussions on status update
and project forecast. This will allow us in early mitigation
of the risk.
Table 4: Risks Management Plan
CONCLUSION
This project aims to implement secure file sharing on cloud. The data owners could implement access control and search
capability authorization, so that only authorized users could search or decrypt the encrypted files. This project uses some
existing algorithms. These current schemes do not provide complete solution for data sharing in cloud computing, e.g., they
do not provide effective and efficient user revocation mechanism, which is very important in real-world applications.
Therefore, in the future, based on the established demo system, user revocation functionality might be added. In addition,
in this project, the stored data are files, therefore, keywords search will be implemented. With the popularity of internet of
things (IoT), more and more private sensor data might be stored on the cloud servers, thus making searches over numerical
data, e.g., range search, important. Finally, how to implement heavy cryptographic algorithms on resource-constrained
devices, e.g., smartphones, is also an interesting future work. In our developed system each user’s attributes are revealed
and in the future, work on protecting this information could be done.
ACKNOWLEDGMENT
Thanks Dr. Huang, the instructor of this course for discussions and guiding us through project idea selection. We also
thank Ankur, the TA for his support in idea formulation and helping us with the project environment setup.
REFERENCES
[1] Sun, Wenhai, et al. "Protecting your right: Attribute-based keyword search with fine-grained owner-enforced search authorization in the cloud." IEEE INFOCOM
2014-IEEE Conference on Computer Communications. IEEE, 2014
[2] V. Goyal, O. Pandey, A. Sahai, and B. Waters, “Attribute-based encryption for fine-grained access control of encrypted data,” in Proc. of CCS. ACM, 2006, pp.
89–98.
[3] J. Bethencourt, A. Sahai, and B. Waters, “Ciphertext-policy attribute-based encryption,” in Proc. of S&P. IEEE, 2007, pp. 321–334.
[4] L. Cheung and C. Newport, “Provably secure ciphertext policy abe,” in Proc. of CCS. ACM, 2007, pp. 456–465.
[5] S. Yu, C. Wang, K. Ren, and W. Lou, “Attribute based data sharing with attribute revocation,” in Proc. of ASIACCS. ACM, 2010, pp. 261–270.
[6] S. Yu, C. Wang, K. Ren, and W. Lou, “Achieving secure, scalable, and fine-grained data access control in cloud computing,” in Proc. of INFOCOM. IEEE,
2010, pp. 1–9.
[7] http://www.cs.cityu.edu.hk/~congwang/research.html
[8] https://crypto.stanford.edu/pbc/download.html
[9] https://www.openssl.org
[10] http://www.mysql.com/
[11] http://www.programminglogic.com/example-of-client-server-program-in-c-using-sockets-and-tcp/

More Related Content

What's hot

Parrot Security OS | Introduction to Parrot Security OS | Cybersecurity Train...
Parrot Security OS | Introduction to Parrot Security OS | Cybersecurity Train...Parrot Security OS | Introduction to Parrot Security OS | Cybersecurity Train...
Parrot Security OS | Introduction to Parrot Security OS | Cybersecurity Train...Edureka!
 
Digital Right Management
Digital Right ManagementDigital Right Management
Digital Right ManagementRatul Alahy
 
Trends in Information Security
Trends in Information SecurityTrends in Information Security
Trends in Information SecurityCompTIA
 
NIST Zero Trust Explained
NIST Zero Trust ExplainedNIST Zero Trust Explained
NIST Zero Trust Explainedrtp2009
 
Microsoft Information Protection.pptx
Microsoft Information Protection.pptxMicrosoft Information Protection.pptx
Microsoft Information Protection.pptxChrisaldyChandra
 
Presentation On Steganography
Presentation On SteganographyPresentation On Steganography
Presentation On SteganographyTeachMission
 
project-report-steganography.docx
project-report-steganography.docxproject-report-steganography.docx
project-report-steganography.docxssusere02009
 
Cryptography and steganography
Cryptography and steganographyCryptography and steganography
Cryptography and steganographyJishnu Grandhi
 
Web Services Security Tutorial
Web Services Security TutorialWeb Services Security Tutorial
Web Services Security TutorialJorgen Thelin
 
BI and Dashboarding Best Practices
 BI and Dashboarding Best Practices BI and Dashboarding Best Practices
BI and Dashboarding Best PracticesRocket Software
 
Wi-Fi security – WEP, WPA and WPA2
Wi-Fi security – WEP, WPA and WPA2Wi-Fi security – WEP, WPA and WPA2
Wi-Fi security – WEP, WPA and WPA2Fábio Afonso
 
Basics of Information System Security
Basics of Information System SecurityBasics of Information System Security
Basics of Information System Securitychauhankapil
 
Anti forensic
Anti forensicAnti forensic
Anti forensicMilap Oza
 
Digital Rights Management PPT
Digital Rights Management PPTDigital Rights Management PPT
Digital Rights Management PPTSuresh Khutale
 
Introduction to information security
Introduction to information securityIntroduction to information security
Introduction to information securityjayashri kolekar
 
Breast cancer classification
Breast cancer classificationBreast cancer classification
Breast cancer classificationAshwan Abdulmunem
 

What's hot (20)

Parrot Security OS | Introduction to Parrot Security OS | Cybersecurity Train...
Parrot Security OS | Introduction to Parrot Security OS | Cybersecurity Train...Parrot Security OS | Introduction to Parrot Security OS | Cybersecurity Train...
Parrot Security OS | Introduction to Parrot Security OS | Cybersecurity Train...
 
Digital Right Management
Digital Right ManagementDigital Right Management
Digital Right Management
 
Trends in Information Security
Trends in Information SecurityTrends in Information Security
Trends in Information Security
 
AAA Implementation
AAA ImplementationAAA Implementation
AAA Implementation
 
NIST Zero Trust Explained
NIST Zero Trust ExplainedNIST Zero Trust Explained
NIST Zero Trust Explained
 
CyberSecurity Best Practices for the IIoT
CyberSecurity Best Practices for the IIoTCyberSecurity Best Practices for the IIoT
CyberSecurity Best Practices for the IIoT
 
Microsoft Information Protection.pptx
Microsoft Information Protection.pptxMicrosoft Information Protection.pptx
Microsoft Information Protection.pptx
 
Presentation On Steganography
Presentation On SteganographyPresentation On Steganography
Presentation On Steganography
 
Data security
Data securityData security
Data security
 
project-report-steganography.docx
project-report-steganography.docxproject-report-steganography.docx
project-report-steganography.docx
 
Cryptography and steganography
Cryptography and steganographyCryptography and steganography
Cryptography and steganography
 
Web Services Security Tutorial
Web Services Security TutorialWeb Services Security Tutorial
Web Services Security Tutorial
 
BI and Dashboarding Best Practices
 BI and Dashboarding Best Practices BI and Dashboarding Best Practices
BI and Dashboarding Best Practices
 
Wi-Fi security – WEP, WPA and WPA2
Wi-Fi security – WEP, WPA and WPA2Wi-Fi security – WEP, WPA and WPA2
Wi-Fi security – WEP, WPA and WPA2
 
Basics of Information System Security
Basics of Information System SecurityBasics of Information System Security
Basics of Information System Security
 
Anti forensic
Anti forensicAnti forensic
Anti forensic
 
Digital Rights Management PPT
Digital Rights Management PPTDigital Rights Management PPT
Digital Rights Management PPT
 
Introduction to information security
Introduction to information securityIntroduction to information security
Introduction to information security
 
Breast cancer classification
Breast cancer classificationBreast cancer classification
Breast cancer classification
 
File Encryption
File EncryptionFile Encryption
File Encryption
 

Similar to Secure File Sharing on Cloud

IRJET- Privacy Preserving Encrypted Keyword Search Schemes
IRJET-  	  Privacy Preserving Encrypted Keyword Search SchemesIRJET-  	  Privacy Preserving Encrypted Keyword Search Schemes
IRJET- Privacy Preserving Encrypted Keyword Search SchemesIRJET Journal
 
Retrieving Secure Data from Cloud Using OTP
Retrieving Secure Data from Cloud Using OTPRetrieving Secure Data from Cloud Using OTP
Retrieving Secure Data from Cloud Using OTPAM Publications
 
Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...
Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...
Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...IRJET Journal
 
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...Editor IJCATR
 
Ieeepro techno solutions 2014 ieee java project - query services in cost ef...
Ieeepro techno solutions   2014 ieee java project - query services in cost ef...Ieeepro techno solutions   2014 ieee java project - query services in cost ef...
Ieeepro techno solutions 2014 ieee java project - query services in cost ef...hemanthbbc
 
Ieeepro techno solutions 2014 ieee dotnet project - query services in cost ...
Ieeepro techno solutions   2014 ieee dotnet project - query services in cost ...Ieeepro techno solutions   2014 ieee dotnet project - query services in cost ...
Ieeepro techno solutions 2014 ieee dotnet project - query services in cost ...ASAITHAMBIRAJAA
 
Ranking Efficient Attribute Based Keyword Searching Over Encrypted Data Along...
Ranking Efficient Attribute Based Keyword Searching Over Encrypted Data Along...Ranking Efficient Attribute Based Keyword Searching Over Encrypted Data Along...
Ranking Efficient Attribute Based Keyword Searching Over Encrypted Data Along...IRJET Journal
 
IRJET- Securing Cloud Data Under Key Exposure
IRJET- Securing Cloud Data Under Key ExposureIRJET- Securing Cloud Data Under Key Exposure
IRJET- Securing Cloud Data Under Key ExposureIRJET Journal
 
Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted Data
Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted DataPrivacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted Data
Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted DataCloudTechnologies
 
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
IRJET-  	  Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASCIRJET-  	  Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASCIRJET Journal
 
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASCIRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASCIRJET Journal
 
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...cscpconf
 
Paper id 28201425
Paper id 28201425Paper id 28201425
Paper id 28201425IJRAT
 
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud. A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud. IJCERT JOURNAL
 
E FFICIENT D ATA R ETRIEVAL F ROM C LOUD S TORAGE U SING D ATA M ININ...
E FFICIENT  D ATA  R ETRIEVAL  F ROM  C LOUD  S TORAGE  U SING  D ATA  M ININ...E FFICIENT  D ATA  R ETRIEVAL  F ROM  C LOUD  S TORAGE  U SING  D ATA  M ININ...
E FFICIENT D ATA R ETRIEVAL F ROM C LOUD S TORAGE U SING D ATA M ININ...IJCI JOURNAL
 
Two Level Auditing Architecture to Maintain Consistent In Cloud
Two Level Auditing Architecture to Maintain Consistent In CloudTwo Level Auditing Architecture to Maintain Consistent In Cloud
Two Level Auditing Architecture to Maintain Consistent In Cloudtheijes
 
Enabling secure and efficient ranked keyword
Enabling secure and efficient ranked keywordEnabling secure and efficient ranked keyword
Enabling secure and efficient ranked keywordIMPULSE_TECHNOLOGY
 
Efficient Privacy Preserving Clustering Based Multi Keyword Search
Efficient Privacy Preserving Clustering Based Multi Keyword Search        Efficient Privacy Preserving Clustering Based Multi Keyword Search
Efficient Privacy Preserving Clustering Based Multi Keyword Search IRJET Journal
 
Achieving Secure, sclable and finegrained Cloud computing report
Achieving Secure, sclable and finegrained Cloud computing reportAchieving Secure, sclable and finegrained Cloud computing report
Achieving Secure, sclable and finegrained Cloud computing reportKiran Girase
 

Similar to Secure File Sharing on Cloud (20)

IRJET- Privacy Preserving Encrypted Keyword Search Schemes
IRJET-  	  Privacy Preserving Encrypted Keyword Search SchemesIRJET-  	  Privacy Preserving Encrypted Keyword Search Schemes
IRJET- Privacy Preserving Encrypted Keyword Search Schemes
 
Retrieving Secure Data from Cloud Using OTP
Retrieving Secure Data from Cloud Using OTPRetrieving Secure Data from Cloud Using OTP
Retrieving Secure Data from Cloud Using OTP
 
Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...
Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...
Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...
 
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
A Secure, Scalable, Flexible and Fine-Grained Access Control Using Hierarchic...
 
Ieeepro techno solutions 2014 ieee java project - query services in cost ef...
Ieeepro techno solutions   2014 ieee java project - query services in cost ef...Ieeepro techno solutions   2014 ieee java project - query services in cost ef...
Ieeepro techno solutions 2014 ieee java project - query services in cost ef...
 
Ieeepro techno solutions 2014 ieee dotnet project - query services in cost ...
Ieeepro techno solutions   2014 ieee dotnet project - query services in cost ...Ieeepro techno solutions   2014 ieee dotnet project - query services in cost ...
Ieeepro techno solutions 2014 ieee dotnet project - query services in cost ...
 
Ranking Efficient Attribute Based Keyword Searching Over Encrypted Data Along...
Ranking Efficient Attribute Based Keyword Searching Over Encrypted Data Along...Ranking Efficient Attribute Based Keyword Searching Over Encrypted Data Along...
Ranking Efficient Attribute Based Keyword Searching Over Encrypted Data Along...
 
J017547478
J017547478J017547478
J017547478
 
IRJET- Securing Cloud Data Under Key Exposure
IRJET- Securing Cloud Data Under Key ExposureIRJET- Securing Cloud Data Under Key Exposure
IRJET- Securing Cloud Data Under Key Exposure
 
Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted Data
Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted DataPrivacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted Data
Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted Data
 
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
IRJET-  	  Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASCIRJET-  	  Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
 
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASCIRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
IRJET- Secure Data Sharing Scheme for Mobile Cloud Computing using SEDASC
 
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
A PRACTICAL CLIENT APPLICATION BASED ON ATTRIBUTE-BASED ACCESS CONTROL FOR UN...
 
Paper id 28201425
Paper id 28201425Paper id 28201425
Paper id 28201425
 
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud. A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
 
E FFICIENT D ATA R ETRIEVAL F ROM C LOUD S TORAGE U SING D ATA M ININ...
E FFICIENT  D ATA  R ETRIEVAL  F ROM  C LOUD  S TORAGE  U SING  D ATA  M ININ...E FFICIENT  D ATA  R ETRIEVAL  F ROM  C LOUD  S TORAGE  U SING  D ATA  M ININ...
E FFICIENT D ATA R ETRIEVAL F ROM C LOUD S TORAGE U SING D ATA M ININ...
 
Two Level Auditing Architecture to Maintain Consistent In Cloud
Two Level Auditing Architecture to Maintain Consistent In CloudTwo Level Auditing Architecture to Maintain Consistent In Cloud
Two Level Auditing Architecture to Maintain Consistent In Cloud
 
Enabling secure and efficient ranked keyword
Enabling secure and efficient ranked keywordEnabling secure and efficient ranked keyword
Enabling secure and efficient ranked keyword
 
Efficient Privacy Preserving Clustering Based Multi Keyword Search
Efficient Privacy Preserving Clustering Based Multi Keyword Search        Efficient Privacy Preserving Clustering Based Multi Keyword Search
Efficient Privacy Preserving Clustering Based Multi Keyword Search
 
Achieving Secure, sclable and finegrained Cloud computing report
Achieving Secure, sclable and finegrained Cloud computing reportAchieving Secure, sclable and finegrained Cloud computing report
Achieving Secure, sclable and finegrained Cloud computing report
 

Secure File Sharing on Cloud

  • 1. Secure File Sharing on Cloud Attribute-Based Access Control and Keyword Search over Encrypted Data Rachita Gupta, Supriya, Qiuxiang Dong School of Computing, Informatics and Decision Systems Engineering Arizona State University, Tempe, AZ, USA {rgupta36, sashokk2, qiuxiang.dong}@asu.edu Abstract— Cloud computing is a promising new technology where computing resources are provided as a service to users over the internet. Since cloud storage provides greater flexibility and availability at a much lower investment, more and more users are attracted to outsource their data on the cloud servers. As more sensitive data are outsourced on cloud server environment, the security of the data may be compromised. To combat unsolicited accesses from malicious insiders (e.g., cloud server administrator) and outsiders (e.g., hacker), these sensitive data have to be encrypted before outsourcing. Although data confidentiality could be protected by traditional encryption schemes (e.g., AES, RSA), it is difficult to implement data sharing and searches over the encrypted data. This project aims to implement secure file sharing in cloud storage through utilizing some recently proposed encryption schemes. Firstly, the data owners encrypt their files by Attribute-Based Encryption (ABE) scheme so that only authorized data users could decrypt these files. Second, Attribute-Based Keyword Searchable (ABKS) encryption scheme is implemented, thus enabling the authorized data users to search over the encrypted files to find their desired files. Finally, some parallel processing approaches will be employed in this project to enhance search efficiency, when there are huge number of files stored on the cloud servers. Keywords— Cloud Computing, Data Leakage, Secure Cloud Storage, Attribute-Based Encryption (ABE), Attribute- Based Keyword Search (ABKS), Access Control, Keyword Search INTRODUCTION Cloud computing has emerged as an innovative technology which has changed the way we do computing by providing all services and applications through the internet. It is an ecosystem which enables on-demand access to pool resources with greater flexibility and availability at a much lower investment cost by allowing the users to pay only for the resources and services they use. The main concept of cloud computing is not to own any hardware or software but to avail the computational resources, storage databases or any other services at a minimal cost to the cloud provider. Cloud infrastructure incorporates the five characteristics which are on-demand service, broad network access, resource pooling, rapid elasticity and measured service. Adopting the cloud for enterprises has empowered businesses by ensuring lesser cost, faster deployment, lesser acquisition of physical assets and much greater customer satisfaction. Since many IT enterprises are hosting their services, applications and data on cloud, there is a growing concern regarding the privacy of the sensitive data being compromised. The most obvious solution to this problem is to apply some encryption techniques before uploading or sharing the data on cloud. But then how the uploaded encrypted data would be used on cloud is a new challenging problem that needs to be addressed. This project involves the search over encrypted data but most of the techniques selectively search the data only at a coarse-grained level i.e., by sharing the private key with the other authorized users. But this is not a feasible solution in the multi-owner and multi-user cloud based file storage system where owner either has to act as an intermediary and decrypt all the files it wants to share with some other parties or give them its private decryption key, giving access to all its data. Hence, to enable multiple owners to encrypt and upload their data files to the cloud storage and make them searchable by other users, a new cryptosystem “fine grained search authorization” was proposed that only allows the authorized users to search over the encrypted data [1]. “Fine grained” implies that the search authorization is controlled at the granularity of each file and exploits the attribute-based encryption (ABE) technique introduced by Sahai and Waters [2]. In attribute based encryption the data owner encrypts the index of each file with a self defined access policy, which dictates that which type of user can search this index [1]. An access control policy would define the type of users who would have permission to access the documents. For instance, in an academic institute only the professor of CS department or the TA of that course would have access to
  • 2. the grade sheet for a particular course. This can be expressed in form of a policy as below. Hence only the users satisfying this attribute predicate would be able to access the grade sheet. ((Professor ∧ CS dept.)∨(CS student ∧ course TA)) The keywords and attributes are defined separately in the report. Keywords are the actual content of the files and attributes are the properties of the users[1]. The owner of data creates the index of all the keywords in the data file but encrypts the index with the access control policy structure based on the attributes of the authorized users [1]. Attribute-Based Keyword Searchable encryption scheme (ABKS) is a data-owner-enforced search authorization system that only users who meet the owner-defined access policy can retrieve the valid search results [1]. Since the searchable scope is very large in the cloud environment and cryptographic search has to be done with linear time complexity, so some parallel computing approaches are used to enhance search efficiency. The outcome of this project would be the implementation of attribute-based keyword search, and attribute-based data access control using parallel processing. Table 1 describes the various tasks taken up by the following group members in the defined time frame to the project completion. SYSTEM MODELS System Model There are three kinds of servers in our system (ref. Figure 1). The first is key generation server (namely Trusted Authority, TA for short), which generates private keys for users. The second is metadata server, which is responsible for managing the metadata and also process data uploading, data search requests from data contributors (owners) and data readers (users) respectively. The third is data storage server, which stores the files. There are multiple data contributors (or owners) and multiple data readers. The design goals of this system are as follows.  Data access control: Enable data-owner-enforced data access control, i.e., only users who meet the owner-defined access policy can decrypt the files.  Authorized Keyword Search: Enable data-owner-enforced search authorization, i.e., only users who meet the owner-defined access policy can obtain the valid search results.  Multiple Data Owners and Data Readers: accommodate multiple data owners and data readers. Each reader is able to read and search over the encrypted data contributed by multiple data owners.  Security Goals: Protect both the contents and the keywords privacy.  Efficiency Goals: Search operations could be implemented by the cloud server efficiently. The workflow of our system is presented in Figure 3. The detailed description is as follows.  System Setup: TA runs the system setup algorithm to generate the public parameters and master secret key. The public parameters are distributed to the whole system, including the servers, the data owners and the data users. The master secret key is kept private by TA and is used in the Key Generation phase below to generate private key for users.  Key Generation: TA takes public parameters, master secret key and a user attribute list (sent from data user) to generate private key for the user. Task assigned Timeline Group Member Team Formation 25th Aug - 28th Aug 2016 Idea and Design development Time and task allocation 29th Aug - 8th Sep 2016 Rachita Gupta, Supriya, Qiuxiang Dong Project Proposal Report Writing 9th Sep - 14th Sep 2016 Rachita Gupta, Supriya, Qiuxiang Dong
  • 3. Implementation Data structure design and Environment Setup 15th Sep - 17th Sep 2016 Rachita Gupta, Supriya, Qiuxiang Dong Algorithm design and implementation Client Server Model 18th Sep - 25th Sep 2016 Supriya Cryptographic Algorithm 18th Sep - 25th Sep 2016 Qiuxiang Dong Parallel processing 18th Sep - 25th Sep 2016 Rachita System Integration 26th Sep - 29th Sep 2016 Rachita Gupta, Supriya, Qiuxiang Dong Testing 30th Sep - 1st Oct 2016 Rachita Gupta, Supriya, Qiuxiang Dong Second Report writing 2nd Oct - 5th Oct 2016 Rachita Gupta, Supriya, Qiuxiang Dong Testing and troubleshooting 6th Oct - 15th Oct 2016 Rachita Gupta, Supriya, Qiuxiang Dong Final Report writing 16th Oct - 26th Oct 2016 Rachita Gupta, Supriya, Qiuxiang Dong Table 1: Project Task allocation and Timeline  File Upload: When the data owner uploads a file, the following operations are performed. First, encrypt the file f with a symmetric encryption algorithm (e.g., AES). Second, encrypt the AES key (K) with an Attribute-Based Encryption algorithm (ABE for short). Third, generate the Keyword List (KL) for the file and encrypt the keywords with the keyword encryption algorithm of the Attribute-Based Keyword Searchable Encryption scheme (ABKS for short). Finally, the user sends the ciphertext of the f, K, and KL to the metadata server, i.e., the index and encrypted file. Upon receiving the file uploading request from the data contributor, the metadata server will store the following item (ref. Figure 2) for this request and store the file on the storage server(s).
  • 4. Figure 1: System architecture Figure 2: Storage format on metadata server  Token Generation: When a data user wants to search his/her interested files, he/she will generate a search ToKen (TK) for his/her interested keyword (e.g., kw) with his/her private key and send the token to the metadata server.  Search: Upon receiving the search request, the metadata server runs the search algorithm of the ABKS scheme to check whether a file contains the keyword (i.e., kw is included in some KL). On the one hand, randomness in included in the ciphertexts, so the complexity of the search will be linear. On the other hand, there are a lot of files. Therefore, it will be time-consuming to perform keyword search over encrypted data. To solve this problem, some parallel computing technology will be used on metadata server. If some file(s) contain the interested keyword, the metadata server will send back the corresponding ABE(K) and encrypted files to the data reader.  File Decryption: Upon receiving the search results, the data reader decrypts ABE(K) and obtains K, which can be used to decrypt the files. Software PBC [8] and OpenSSL [9] library will be utilized to implement the cryptographic algorithms. On the file system, to search a particular file, Apache spark will be used. For the communication between the metadata server and the user, the project uses client- server architecture. Security Model (optional) The TA is assumed to be fully trusted. The cloud servers (including the metadata server and the data storage servers) are semi-honest, which means they will run the pre-defined algorithms or protocols honestly but they will try to find private information based on background knowledge or other information they could get. The data readers are assumed to be malicious. They want to access the files of which they are actually not authorized to read. They might collude with each other. Key Generation
  • 5. Figure 3: Message sharing between entities PROJECT DESCRIPTION With the advent of cloud computing, a new computing paradigm, more and more users and enterprises are shifting their data to cloud storage servers. As more sensitive data are outsourced on cloud server environment the new challenging problem is that confidentiality or privacy of the data may be compromised, and access control over the data becomes difficult to handle, because the commercially operated clouds are likely to be outside the trusted domain of the users. The obvious solution to ensure the confidentiality and desired access control of data is to encrypt the data and share the private key with the authorized users. However this can only implement coarse-grained level access control and is an unscalable solution involving lot of computation for key distribution and management on the client side. This project explores the scenario of multiple owners uploading their encrypted data files to the cloud storage and make them searchable to multiple users by “fined grained search authorization”, which is an access control policy method that makes the data files accessible only to the authorized users without revealing the actual data content. Attribute-Based Keyword Searchable Encryption scheme (ABKS), a data-owner-enforced authorization search system, is used that only allows users who meet the owner-defined access policy to search over the encrypted data [1]. Some parallel processing approaches will be employed to achieve faster keyword search over encrypted data in the huge searchable space of cloud environment. Figure 4: This project will use the above architecture. This architecture is provided by the thoth lab.
  • 6. RESPONSIBILITY MACHINE Metadata Server devstack 1 Trusted Authority(TA) devstack2 Data Storage Server devstack 1/2 user VM01 Table 2: Server Machines in system architecture and their responsibilities Project Overview Task 1 : Environment Setup Figure 4 shows the environment setup architecture in thoth lab. The different types of servers needed in this project are key generation server (i.e., TA), metadata server, and data storage server. Table 2 shows the various machines in the system and their various responsibilities. 1. Trusted Authority: This server is responsible for setting up cryptographic system, i.e., public parameters and master secret key, and generating private keys for all the users. After system setup it could be offline until when new users join the system. 2. Metadata server: This server stores metadata of users’ files, including searchable keywords ciphertexts, encrypted key as well as the references to the files stored on the data storage server. It processes the data uploading request from data owners and data search requests from data users. 3. Data storage Server: These server stores all the files on the cloud storage. Number of data storage server may vary depending on the requirement. This project uses two data storage servers. Some cloud storage management platform will be installed there. The physical machine or VM should provide large storage space, or else open cloud computing platform could be used. Task 2: Data structure design Design data structure of each file’s index stored on the metadata server. Design data structure used in the ABE and ABKS algorithm. Task 3: Algorithm design and implementation Design and implement ABE and ABKS algorithms. Design and implement parallel searching algorithm with some parallel processing approach. TEAM MEMBER TASKS PERCENTILE Rachita Gupta Parallel Computations on the metadata using apache spark. This will help in faster search by the user on the data stores. 33.3% Supriya (Team Leader) Setting up client server communication between the user and the servers. 33.3% Qiuxiang Dong Implement file encryption, attribute-based encryption and the attribute-based keyword encryption schemes. 33.3% Table 3: Task distribution Task 4: System development Integrate all the algorithms and components into a whole system. Web service development will be needed in this part. Task 5: System Testing Use some real-world dataset to test whether the system could work effectively and efficiently.
  • 7. Project Task Allocation Project will be completed by 3 members of the team. Table 3 provides details of the task allocation of each member in the project.Once individual tasks are completed all the team members will integrate the entire project to obtain full functionality. Deliverables This project will create a software package for the user. This software helps in interacting with the TA to obtain private key. Also, it will provide encryption algorithms, which will use the user’s private key and encrypt the files, symmetric encryption key and keyword list. It will also create service for the cloud system which will provide users secure file sharing on cloud. Project Timeline Figure 5 shows the Gantt chart for this project depicting the detailed task-time distribution. Figure 5: Gantt Chart for the project IMPLEMENTATION DETAILS System Setup (run by TA) System setup is done by TA by running the function “SystemSetup()” in our code “Setup.c”. This function takes use of a cryptographic parameter file a.param as the security parameters for the whole system and generates the public key as defined by the data structure “PublicKey”, and a master secret key as defined by the data structure “MasterKey”. We store them in “PK.bin” and “MSK.bin” respectively. “PK.bin” is shared by all the entities in the whole system, and “MSK.bin” is kept secret by TA himself and will be used to generate private keys for users in the key generation phase. The algorithm “SystemSetup()” is as follows: The data structure of PublicKey and MasterKey is as follows.
  • 8. Key Generation (algorithm run by TA, data transmission between TA and users, storage is needed on client side) After setting up, TA has public parameters stored in “PK.bin” and master secret key “MSK.bin”. If one user wants to obtain a private key for searching over the encrypted files stored on the cloud servers and decrypt the corresponding files, he/she would send an attribute list to TA. The attribute list is a boolean array. The user defines this attribute list in a file and sends it to the TA. After the TA receives the attribute list, it runs the function “ABEKeyGen(int AttributeList[ ], struct PrivateKey* PrivK)” in “ABEKeyGen.c” to generate a private key for that user. The first input “AttributeList[ ]” is the attribute list of that user, it is an ATT_NUM-length 0/1 array, where ATT_NUM is defined to be the number of attributes in the whole system. 0 indicates that the user does not have the corresponding attribute while 1 indicates he/she has that attribute. So this attribute list defines a user’s access privilege.The second parameter “PrivK” stores the generated private key for that user. The algorithm of private key generation is as follows. The private key data structure is defined as follows. The generated private key would be stored in a file named “PrivK.bin” and sent by TA to the user in a secure way. To obtain the private key, the data owner requests the TA for the key by first establishing a secure TCP server-client socket and then sending an attribute list. 1. Create the socket. The three arguments are: 1) Internet domain 2) Stream socket 3) Default protocol (TCP in this case) 2. clientSocket = socket(PF_INET, SOCK_STREAM, 0); 3. Configure settings of the server address struct 4. Set port number, using htons function to use proper byte order 5. serverAddr.sin_port = htons(7891); 6. Set IP address to localhost 7. Set all bits of the padding field to 0
  • 9. 8. Connect the socket to the server using the address struct 9. addr_size = sizeof serverAddr; 10. connect(clientSocket, (struct sockaddr *) &serverAddr, addr_size); 11. Read the attributelistfrom the server into the buffer 12. Reply with a private key file(PKey.bin) Client 1. Create the socket. The three arguments are 2. 1) Internet domain 2) Stream socket 3) Default protocol (TCP in this case) 3. 4. Address family = Internet 5. Set port number, using htons function to use proper byte order 6. Set IP address to localhost 7. Set all bits of the padding field to 0 8. Bind the address struct to the socket 9. Listen on the socket, with 5 max connection requests queued 10. Accept call creates a new socket for the incoming connection 11. Send attributeList to the socket of the incoming connection Server File Upload (run by data owner, transmission between data owner and metadata server, storage is needed on server side) For secure file sharing, the data owner could encrypt the file and generate the corresponding index for the file. The functionality is implemented by codes in “FileUpload.c”. Data owner first generates searchable keyword ciphertexts by calling the function “SecureIndexGeneration(int Policy[], char* keywords[])”, where the first parameter “Policy” is the access control policy that data owner defines for searching and decrypting the uploaded file, it is a 0/1/2 array, 1 denotes that the corresponding attribute is required to exist in a data users’ attribute list and 2 denotes that the corresponding attribute is required to not exist in a data user’s attribute list, while 0 means the data user could either have or not have the corresponding attribute. For example, if we set ATT_NUM to be 10, then Policy = [1, 1, 2, 0, 0, 0, 0, 0, 0,0] means that only the data user who has the first and second attribute and does not have the third attribute will be a legal data user. Keywords are the keywords chosen by the data owner for that uploaded file. The data owner also generates an AES key to encrypt that file by calling the function “ SecretKeyGen(char* filename) ” and uses this key to encrypt the uploaded file by calling “FileEncryption(char* PlaintextFile, char* KeyFile, char* CiphertextFile)”, the name of the generated ciphertext file is “char * CiphertextFile” (the third parameter of this function). Then the data owner encrypts this AES key by ABE encryption algorithm by calling the function “ABEFileKeyEncrypt(char* KeyFile, int Policy[ ])”, which calls twice function “ABEEncrypt(int Policy[], element_t plaintext, struct ABECiphertext* abeciphertext)”. The ABE ciphertext is defined by the data structure “ABEAESCiphertext”.
  • 10. The searchable keyword ciphertext is defined by the data structure: Once the user finishes generating secure index and the encrypted file, he/she makes a file upload request to the server. For this request, we have set up a secure server client communication between the user and the server. This is a TCP connection which uses the IP address of both and a unique port number. After the connection has been set up, the service installed at the data owner site, sends three items to the server on the TCP socket. These 3 items are the following: 1. Encrypted file 2. Index 1: Attribute list (Defining his/her access privilege) (plzz change according to the description above, I remember we have a picture to show what you need to transmit, although the index should be generated and stored by two parts, one for keywords and the other for AES key, when transmission, one encrypted file corresponds to one encrypted file, plzz revise accordingly, or else a little bit confusing) 3. Index 2: Keyword search list DATABASE IMPLEMENTATION Once the encrypted file and the index is generated , they are sent to the server over the TCP socket and the client connection is closed. The server socket is always open and waiting for the user’s request. Once the server gets a request it stores the encrypted file and the indexes to store in the database.
  • 11. The filename and its two corresponding indexes are stored in the persistent mysql database for storage and faster retrieval at the metadata server. Here the indices are stored in the database as a String after converting element_t in the String format. We need to read the database entries which running the search algorithm for Trapdoor, by converting the String back into element_t. //Write str in the database by converting element_t ele to String at Metadata server element_t ele; element_init_GT(ele, pairing); element_from_bytes(ele, buffer); element_snprintf(str ,size_t, "%B",ele); //Read str from the database in element_t ele for Trapdoor element_t ele; element_init_GT(ele, pairing); element_set_str(ele,str,10); The database storage is implemented on the metadata server (devstack 1 in the system model). Below are the database implementation details. The various libraries installed for the mysql are mysql-client, mysql-server, libmysql++-dev and libmysqlclient-dev. Whenever a file is received from the client(VM01 in the system model) to the metadata server it is stored on the system and its indexes and the filename are populated in the mysql database. C implementation details of connecting to the database and querying it are as below. MYSQL *con = mysql_init(NULL);//Allocates and initializes a MYSQL object mysql_real_connect(con, "localhost", "root", "pwd","filekey_table", 0, NULL, 0)//connect to the database mysql_query(con1, "CREATE DATABASE filekey_table");//Query the database mysql_query(con, "CREATE TABLE filekey_table(Varchar(10) filename, varchar(10) index1, Varchar(10) index2)") //Create the table mysql_close(con);//Close connection to the mysql database We can check the database by running the command and it takes us to the mysql shell. mysql -u root -p We can run the following commands in the mysql shell to check the persistent database and it entries. SHOW DATABASES; USE db; SHOW TABLES; SELECT * FROM table_name; Token Generation (run by data user, data transmission between data user and metadata server) This is an algorithm enabling data user to generate a token (or trapdoor) with inputs: private key and interested keywords. This trapdoor could be treated as a special search request, where the searched keywords are kept private. The algorithm is as follows:
  • 12. The data structure of trapdoor is as follows: The trapdoor is generated by the data user and then sent to the metadata server as follows. Search (run by metadata server, data transmission between data user and metadata server) The search algorithm is run by the metadata server when the data user sends his/her search request (i.e., the trapdoor). There are a large number of indexes stored on the metadata server. Each index has two parts, i.e., keyword ciphertext and AES key ciphertext. In the search phase only the first part is needed. Denote the algorithm as follows, where Indexes denotes all the indexes stored on the metadata server and trapdoor denotes the search trapdoor received from the data user. Search( Indexes, trapdoor ): ENCFILE = empty set; AESKEYCIPHERTEXT = empty set; for (each ind (the first part of the index) stored on the metadata server) { Read ind from Database stored on Metadata server if (SearchIndex(ind, trapdoor) = 1) { add the corresponding encrypted file into ENCFILE; add the corresponding AES key ciphertext into AESKEYCIPHERTEXT; } } return ENCFILE and AESKEYCIPHERTEXT;
  • 13. If the equation above holds, then w=w’, the algorithm return 1, that means the data user is interested in the corresponding file, so the metadata server should send back the second part (i.e., AES key ciphertext) and the encrypted file back to the data user. After running the Search algorithm, if ENCFILE and AESKEYCIPHERTEXT are not empty sets, then they should be sent to the data user by the metadata server. We parallelize the linear search process as follows: We send back the encrypted files and the corresponding AES key ciphertext as follows: At the end of this phase, the data user should receive the encrypted files and the corresponding AES key ciphertext. Decrypt (run by the data user locally) Upon receiving the encrypted files and the corresponding AES key ciphertext. The data user runs the following algorithm. For each encrypted file in ENCFILE, and each AES key ciphertext in AESKEYCIPHERTEXT. First decrypt the AES key ciphertext by running the algorithm “ABEFileKeyDecrypt(struct ABEAESCiphertext abeaes, char* keyfilename)”, the first input is the AES key ciphertext, the second is the file storing the decrypted AES key. This algorithm calls “ABEDecypt(stuct ABECiphertext abeciphertext, element_t *m)”, where abeciphertext is the ciphertext to be decrypted and m is the corresponding plaintext, there is another implicit input “PrivateKey PrivK” (we set that as a global variable in our program). The ABE decryption algorithm runs as follows. The ABE decryption algorithm decrypts the AES key ciphertext and gets the AES key, stores it ino a file, with name “keyfilename”. By running the AES decryption algorithm with the AES key and the encrypted file as input, the data user could obtain the plainttext file. RISK MANAGEMENT OF THE PROJECT Table 4 shows the various probable risks in the project and the possible mitigation risk plans to avoid them.
  • 14. Probable Risks Risk Mitigation Cloud Computing is a new domain for all the members. Complete knowledge of all the tools and technology is difficult. Having time as a constraint, research may consume lots of time. All the members will sit together and discuss with the professor about each and everything. Devoting proper and efficient time in research in correct direction as guided by the professor. Technology is changing at a very fast pace. Being aware of everything is not an easy task for anyone. This could be an issue as choice of softwares, tools and technology may not be updates or correct. In depth knowledge of the domain is required to solve this issue. Lots of research and paper reading reading is to be done for this. Project planning if not done properly, estimations made without right calculations can lead to technical risks. Main issue could be wrong estimation of schedule and effort. To solve this issue all the team members will have regular meetings. These will involve discussions on status update and project forecast. This will allow us in early mitigation of the risk. Table 4: Risks Management Plan CONCLUSION This project aims to implement secure file sharing on cloud. The data owners could implement access control and search capability authorization, so that only authorized users could search or decrypt the encrypted files. This project uses some existing algorithms. These current schemes do not provide complete solution for data sharing in cloud computing, e.g., they do not provide effective and efficient user revocation mechanism, which is very important in real-world applications. Therefore, in the future, based on the established demo system, user revocation functionality might be added. In addition, in this project, the stored data are files, therefore, keywords search will be implemented. With the popularity of internet of things (IoT), more and more private sensor data might be stored on the cloud servers, thus making searches over numerical data, e.g., range search, important. Finally, how to implement heavy cryptographic algorithms on resource-constrained devices, e.g., smartphones, is also an interesting future work. In our developed system each user’s attributes are revealed and in the future, work on protecting this information could be done. ACKNOWLEDGMENT Thanks Dr. Huang, the instructor of this course for discussions and guiding us through project idea selection. We also thank Ankur, the TA for his support in idea formulation and helping us with the project environment setup. REFERENCES [1] Sun, Wenhai, et al. "Protecting your right: Attribute-based keyword search with fine-grained owner-enforced search authorization in the cloud." IEEE INFOCOM 2014-IEEE Conference on Computer Communications. IEEE, 2014 [2] V. Goyal, O. Pandey, A. Sahai, and B. Waters, “Attribute-based encryption for fine-grained access control of encrypted data,” in Proc. of CCS. ACM, 2006, pp. 89–98. [3] J. Bethencourt, A. Sahai, and B. Waters, “Ciphertext-policy attribute-based encryption,” in Proc. of S&P. IEEE, 2007, pp. 321–334. [4] L. Cheung and C. Newport, “Provably secure ciphertext policy abe,” in Proc. of CCS. ACM, 2007, pp. 456–465. [5] S. Yu, C. Wang, K. Ren, and W. Lou, “Attribute based data sharing with attribute revocation,” in Proc. of ASIACCS. ACM, 2010, pp. 261–270. [6] S. Yu, C. Wang, K. Ren, and W. Lou, “Achieving secure, scalable, and fine-grained data access control in cloud computing,” in Proc. of INFOCOM. IEEE, 2010, pp. 1–9. [7] http://www.cs.cityu.edu.hk/~congwang/research.html [8] https://crypto.stanford.edu/pbc/download.html [9] https://www.openssl.org [10] http://www.mysql.com/ [11] http://www.programminglogic.com/example-of-client-server-program-in-c-using-sockets-and-tcp/