Upcoming SlideShare
Loading in …5
×

# Ijnsa050213

176 views
143 views

Published on

A Novel Secure Cosine Similarity Computation Scheme with Malicious Adversaries

Published in: Education, Technology
0 Comments
0 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
Your message goes here
• Be the first to comment

• Be the first to like this

No Downloads
Views
Total views
176
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
1
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

### Ijnsa050213

1. 1. International Journal of Network Security & Its Applications (IJNSA), Vol.5, No.2, March 2013 A NOVEL SECURE COSINE SIMILARITY COMPUTATION SCHEME WITH MALICIOUS ADVERSARIES Dexin Yang1,Chunjing Lin2 and Bo Yang3 1 Department of Information,Guangzhou City Polytechnic,Guangzhou,China,510405 2 Department of Information, Guangdong Baiyun Institute, Guangzhou,China,510460 3 School of Computer Science, Shaanxi Normal University, 710062, byang@snnu.edu.cnABSTRACTSimilarity coefficients play an important role in many aspects.Recently,several schemes were proposed,butthese schemes aimed to compute the similarity coefficients of binary data. In this paper, a novel schemewhich can compute the coefficients of integer is proposed. To the best knowledge of us,this is the firstscheme which canesist malicious adversaries attack.KEYWORDSSimilarity coefficients,Distributed EIGamal encryption,Zero-knowledge proof,Secure two-partycomputation1. INTRODUCTIONCosine similarity is a measure of similarity between two vectors by measuring the cosine of theangle between them. The cosine of 0 is 1, and less than 1 for any other angle; the lowest value ofthe cosine is -1. The cosine of the angle between two vectors thus determines whether two vectorsare pointing in roughly the same direction. Many application domains need this parameter toanalyze data, such as privacy-preserving data mining, biometric matching etc.The functionality of the privacy-preserving cosine similarity for integer data(Denoted by )can be described as follows. Consider has a vector ,has a vector , where . After the computation gets the result thecosine correlative coefficient and gets nothing.Related works Secure two-party computation allows two parties to jointly compute somefunctions with their private inputs, while preserving the privacy of two parties private inputs.Research on the general functionality of secure computation was first proposed in [1]in the semi-honest model. Lately, Goldreich[2], Malkhi[3],Lindell and Pinkas[4,5]extended in the presence ofmalicious adversaries.Even though the general solution of secure multiparty computations has given by Goldreich [6].However, these general solutions are inefficient for practical uses, because these protocols wereconstructed based on the boolean circuit or the arithmetic circuit of the functionality. When thecircuit of the functionality became complex enough, the complexity of this protocol will be tooDOI : 10.5121/ijnsa.2013.5213 171
2. 2. International Journal of Network Security & Its Applications (IJNSA), Vol.5, No.2, March 2013lower to tolerate. Till now, the protocol which can resist to the attacks of malicious adversarieswere the focus works of cryptographers. Therefore, it is necessary to construct the protocol whichcan compute cosine correlative coefficient of two vectors in the malicious model.Kikuchi,Hiroaki et al.[7] gave the first protocol to compute two vectors cosine correlativecoefficient based on zero knowledge proof of range and applied this protocol to biometricauthentication. This protocol is based on zero-knowledge proofs and Fujisaki-Okamotocommitments[8]. Recently, K.S.Wong et al.[9] proposed a new protocol which can compute thesimilarity coefficient of two binary vectors in the presence of malicious adversaries. Later, Bozhang et al.[10] pointed out this scheme is not secure, and another scheme which can overcomethe shortage of Wongs scheme is proposed.Our results In this paper, a new protocol which can compute the cosine correlative coefficient isproposed. Our protocol can resist the attacks of malicious adversaries, and we give the standardsimulation-based security proof.Our main technical tools include distributed EIGamal encryption[11] and zero-knowledge proofsof knowledge. The main property of distributed EIGamal encryption is that the parties mustcooperate while in decrypting stage because each party has partial decrypt key.2.PRELIMINARIES2.1 Cosine Correlative SimilarityLet be two dimensional integer vectors.We consider the cosine similarity between and ,which will be evaluated in privacy-preservingin later section.Definition 1 A cosine correlation is a similarity between and defined asFor normalization ,the cosine correlation can be simplified as where is a norm of .The proposed scheme in this paper is focus on computing the cosine correlation coefficient of twonormalization vectors and .2.2 Distributed ElGamal EncryptionElGamal encryption [11] is a probabilistic and homomorphic public-key crypto system. Let and be two large primes such that divides . denotes unique multiplicative subgroup oforder . All computations in the remainder of this paper are modulo unless otherwise noted. Theprivate key is , and the public key is is a generator). A message isencrypted by computing the ciphertext tuple where is an arbitrary randomnumber in , chosen by the encrypter. 172
3. 3. International Journal of Network Security & Its Applications (IJNSA), Vol.5, No.2, March 2013A message is decrypted by computingElGamal is homomorphic , as the component-wise product of two ciphertextsrepresents an encryption of the plaintexts product .A distributed EIGamal Encryption system[12]is a public-key cryptosysytem whichkey generation algorithm [13]and decryption algorithm is as follows:Distributes key generation: Each participant chooses at random and publishes alongwith a zero-knowledge proof of knowledge of s discrete logarithm. The public keyis , the private key is . This requires multiplications, but thecomputational cost of multiplications is usually negligible in contrast to exponentiations.Distributed decryption: Given an encrypted message , each participantpublishes and proves its correctness by showing the equality of logarithms of and .The plaintext can be derived by computing . Like key generation, decryption can beperformed in a constant number of rounds, requiring multiplications and one exponentiation.Same as Bo zhang et al.[10],we also use an additively homomorphic variation of EIGamalEncryption with distributed decryption over a group in which DDH is hard,i.e, .2.3 Zero Knowledge ProofIn order to obtain security against malicious adversaries, the participants are required to prove thecorrectness of each protocol step. Zero knowledge proof is a primitive in cryptography.In fact, the proposed protocols can be proven correct by only using protocols. A protocolis a three move interaction protocol. In this paper, there are four protocols are used as follows.We denote these associated functionalities by Next, we simplydescribe the associated zero-knowledge protocols: . The prover can prove to the verifier that he knows the knowledge of thesolution to a discrete logarithm. . The prover can prove to the verifier that the solutions of two discrete logarithm problemsare equal. 173
4. 4. International Journal of Network Security & Its Applications (IJNSA), Vol.5, No.2, March 2013 . The prover can prove to the verifier that the generation of EIGamal encryption is valid. . The prover can prove to the verifier that the ciphertext of EIGamal encryption is valid3.THE PROPOSED SCHEMEIn this section,we give out the protocol( ) which computes the coefficient of two integervectors in the presence of malicious adversaries. The ideal functionality of coefficient is asfollows:where denotes gets nothing after the protocol execution, denotes the cosine coefficientbetween two vectors . In the ideal model, sends his private input to the third trustedparty(TTP), similarly sends his private input to TTP. Finally, TTP sends back to ,and nothing to .The building blocks of our protocol include distributed EIGamal encryption and zero-knowledgeproofs.The reason we choose distributed EIGamal encryption rather than original EIGamalencryption is the distributed EIGamal encryption is less complexity in protocol. Fig 1 The proposed schemeThe protocol( )(Fig 1) is as follows: 174
5. 5. International Journal of Network Security & Its Applications (IJNSA), Vol.5, No.2, March 2013 The input of is a dimensional integer vector .Similarly, sinput is .-Auxiliary Inputs: Both parties have the security parameter .-The protocol:1. engage in the protocol to generate the public key ,and the private key ,shared by respectively.2. computes: ,and sends .The partiesrun the zero-knowledge proof of knowledge ,allowing to verify that the ciphertextis valid.3.Upon receiving the from ,the computes: , )where .and sends to party .The parties run thezero-knowledge proof of knowledge ,allowing to verify that the ciphertext isvalid.4.Upon receiving the from , computes using his private keyas: ,and send to .The parties run the zero-knowledge proof of knowledge ,allowing to verify that is valid.5.Upon receiving the from , decrypts and obtains , where is the cosinecoefficient of the two vectors .At last, evaluates as follows.(a)If ,then ;(b)If ,then .4. SECURITY ANALYSISTheorem 1 Assume that are as described in section 2 and that is the EIGamal scheme. The correctly evaluates the consine coefficient of two -dimension variables in the presence of malicious adversaries.The proof of this theorem 1 is obviously.Theorem 2 Assume that are as described in section 2 and that is the EIGamal scheme. The securely evaluates the consine coefficient of two-dimension variables in the presence of malicious adversaries. 175
6. 6. International Journal of Network Security & Its Applications (IJNSA), Vol.5, No.2, March 2013Proof:We prove this theorem in the hybrid model,where a third trusted party is introduced tocompute the ideal functionality . As usual, we analyze twocases as is corrupted and is corrupted separately. is corrupted. Assume that is corrupted by adversary with the auxiliary input in thereal model. we construct a simulator ,who runs in the ideal model with the third trusted partycomputing the functionality works as follows.1. is given s input and auxiliary input, and invokes on these values.2. first emulates the trusted party for as follows. It first two random elements , and hands and the public key .3. receives from encryptions and s input for the trusted party for , then define s inputs as .4.Then sends to the trusted party to compute to complete the simulation in the idealmodel. Let be the returned value from the trusted party.5.Next randomly chooses conditioned on that the cosine coefficient equalsto . completes the execution as the honest party would on inputs .6.If at any step, sends an invalid message, aborts sends to the trusted party forOtherwise, it outputs whatever does.The difference between the above simulation and the real hybrid model is that who does nothave the real s input ,simulates following steps with the randomly chosen under the condition that theoutput of them are the same.The computationally distinguishability of them can be deduced fromthe sematic security of EIGamal encryption. In other words, if can distinguish the simulationfrom the real execution,we can construct a distinguisher to attack the semantic security ofEIGamal encryption. is corrupted. The proof of this part is similar with above.We construct a simulator in theideal model, based on the real adversary in the real model. works as follows.1. is given s input and auxiliary input, and invokes on these values.2. first emulates the trusted party for as follows. It first two random elements , and hands and the public key .3. randomly chooses , then encrypts them using the public key.4.Next, sends the ciphertexts to , and proves to that all the ciphertexts is valid using .5. receives from ciphertexts and s input to the trusted party for , then defines s inputs as .6.The completes the next step as the honest .7.If at any step, sends an invalid message, aborts sends to the trusted party for .Otherwise sends to the trusted party computing ,and outputs whatever does.Similar to the case is corrupted, the difference between the simulation and the real model is 176
7. 7. International Journal of Network Security & Its Applications (IJNSA), Vol.5, No.2, March 2013that uses s input. However, is encrypted by the public key of a semantic securityEIGamal encryption. Same as the above,the analysis of this simulation distribution can be assuredby the definition of zero-knowledge proof and semantic security of a public-key encryption. In summary, we complete the proof of in the presence of malicious adversaries.5. CONCLUSIONSimilarity coefficients (also known as coefficients of association) are important measurementtechniques used to quantify the extent to which objects resemble one another. There are varioussimilarity coefficients which can be used in different fields. Cosine similarity is a measure ofsimilarity between two vectors by measuring the cosine of the angle between them. In this paper,a new scheme which can compute the cosine correlative of two integer vectors in the presence ofmalicious adversaries.ACKNOWLEDGEMENTSThis work is supported by the National Natural Science Foundation of China under Grant Thiswork is supported by the National Natural Science Foundation of China under Grant 61173164and 61272436, and the Natural Science Foundation of Guangdong Province under Grants10351806001000000.REFERENCES[1] Andrew Chi-Chih Yao. How to generate and exchange secrets. In Proceedings of the 27th Annual Symposium on Foundations of Computer Science, SFCS 86, pages162-167, Washington, DC, USA, 1986. IEEE Computer Society.[2] O. Goldreich, S. Micali, and A. Wigderson. How to play any mental game. In Proceedings of the nineteenth annual ACM symposium on Theory of computing,STOC 87, pages 218-229, New York, NY, USA, 1987. ACM.[3] Dahlia Malkhi, Noam Nisan, Benny Pinkas, and Yaron Sella. Fairplay—a secure two-party computation system. In Proceedings of the 13th conference on USENIX Security Symposium - Volume 13, SSYM04, pages 20-20, Berkeley, CA,USA, 2004. USENIX Association.[4] Yehuda Lindell and Benny Pinkas. An efficient protocol for secure two-party computation in the presence of malicious adversaries. In Proceedings of the 26th annual international conference on Advances in Cryptology, EUROCRYPT 07, pages 52-78, Berlin, Heidelberg, 2007. Springer-Verlag.[5] Yehuda Lindell and Benny Pinkas. Secure two-party computation via cut-and-choose oblivious transfer. In Proceedings of the 8th conference on Theory of cryptography, TCC11, pages 329-346, Berlin, Heidelberg, 2011. Springer-Verlag.[6] O.Goldreich. Secure multi-party computation (working draft),1998.http://citeseer.ist.psu.edu/goldreich98secure.html.[7] Hiroaki Kikuchi, Kei Nagai, Wakaha Ogata, and Masakatsu Nishigaki. Privacy-preserving similarity evaluation and application to remote biometrics authentication. Soft Comput., 14(5):529-536, December 2009.[8] S.Wong and M.H.Kim. Privacy-preserving similarity coe_cents for binary data. Computers and Mathematics with Applications, 2012.http://dx.doi.org/10.1016/j.camwa.2012.02.028.[9] Eiichiro Fujisaki and Tatsuaki Okamoto. Statistical zero knowledge protocols to prove modular polynomial relations. In Proceedings of the 17th Annual International Cryptology Conference on Advances in Cryptology, CRYPTO 97, pages 16-30,London, UK, UK, 1997. Springer-Verlag.[10] Bo Zhang and Fangguo Zhang. Secure similarity coefficients computation with malicious adversaries. Cryptology ePrint Archive, Report 2012/202, 2012.http://eprint.iacr.org/.[11] Taher El Gamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In Proceedings of CRYPTO 84 on Advances in cryptology,pages 10-18, New York, NY, USA, 1985. Springer-Verlag New York, Inc. 177
8. 8. International Journal of Network Security & Its Applications (IJNSA), Vol.5, No.2, March 2013[12] Felix Brandt. Efficient cryptographic protocol design based on distributed el gamal encryption. In Proceedings of the 8th international conference on Information Security and Cryptology, ICISC05, pages 32-47, Berlin, Heidelberg, 2006. Springer-Verlag.[13] Torben P. Pedersen. Non-interactive and information-theoretic secure verifiable secret sharing. In Proceedings of the 11th Annual International Cryptology Conference on Advances in Cryptology, CRYPTO 91, pages 129-140, London, UK, UK,1992. Springer-Verlag.AuthorsDexin Yang received his Ph.D. degree at South China Agricultural University. Now he is alecturer of Guangzhou City Polytechnic. His main research topics are cryptographyandinformation security.Bo Yang received his Ph.D. degree at Xidian University in China. Now he is a professor ofShaanxi Normal University. His main research topics are cryptography and information security.Chunjing Lin received his master degree at Xidian University in China. Now he is a professor ofGuangdong Baiyun Institute. His main research topics are embedded system and informationprocessing 178