Private Range Query by Perturbation and Matrix Based Encryption


  1. Private Range Query by Perturbation and Matrix Based Encryption. Junpei Kawamoto and Masatoshi Yoshikawa, Kyoto University, Japan
  2. Sep. 27, 2011. Private Range Query by Perturbation and Matrix Based Encryption
     Cloud database and its security
     • Recent research on the security of cloud computing mainly focuses on service providers:
       • How to analyze data without privacy problems (PPDM)
       • How to share data and manage encryption keys
       • How to execute queries over encrypted data
     • [Figure: User, Client, web, Service Provider; the service provider side is labeled "Recently focused"]
     • Fewer studies address compromise through queries.
       • However, queries (i.e., what a user searched for) carry important information about the user.
       • A security model for this problem was introduced only recently.
  3. Purpose and basic notions
     • Private (range) query
       • We focus on range queries, which include exact match queries as a special case.
       • A private query obtains data without exposing any information about what the user requested to third parties, including service providers.
     • We do not fully trust service providers.
       • Service providers themselves are unlikely to become attackers, but:
       • Servers could be compromised by attackers or physically stolen.
       • Users cannot know what actually happens to their data stored on servers.
       • We should build a database service that does not require users to trust service providers.
     • We assume the database schema is (Key, Value).
       • Users issue queries over the Key attribute only.
  4. Related work
     • Encrypted databases
       • To avoid leaks, all data are encrypted by clients.
       • The main topic is how to handle queries over encrypted data.
       • In our method, clients transform queries, too.
     • [Figure: 1-to-1 mapping (hash function, etc.), e.g. 15:00 is stored as 4hwr2g and 15:12 as teg2b1, and the query 15:00 ~ 15:12 becomes "4hwr2g" ~ "teg2b1"; many-to-1 mapping (k-anonymizer, etc.), with example values 14:45, 15:00, and 15:12]
     • These approaches achieve some kind of private query, but not enough.
  5. Frequency Analysis Attack (FAA)
     • Attackers who know the distribution of queries could guess plain queries from transformed ones.
     • [Figure: a mapping takes plain queries q to transformed queries q*; the distribution of plain queries is shown next to the distribution of transformed queries for a 1-to-1 mapping (e.g., hashing) and for a many-to-1 mapping (e.g., averaging)]
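To make the attack on a 1-to-1 mapping concrete, here is a minimal sketch in Python. It is not taken from the slides: the truncated SHA-256 digest standing in for "hashing", the function names, and the toy workload are all assumptions chosen for illustration. The attacker sees only transformed queries, knows the plain query distribution, and matches the two by frequency rank.

```python
import hashlib
from collections import Counter


def one_to_one(query: str) -> str:
    """Deterministic 1-to-1 transform (assumed stand-in for hashing)."""
    return hashlib.sha256(query.encode()).hexdigest()[:6]


def frequency_attack(observed, known_distribution):
    """Pair transformed queries with plain queries by frequency rank."""
    observed_rank = [t for t, _ in Counter(observed).most_common()]
    plain_rank = sorted(known_distribution, key=known_distribution.get, reverse=True)
    return dict(zip(observed_rank, plain_rank))


# Hypothetical workload: three plain queries with clearly different frequencies.
workload = ["15:00"] * 50 + ["15:12"] * 30 + ["14:45"] * 20
observed = [one_to_one(q) for q in workload]

# The attacker's prior knowledge of the plain query distribution.
known = {"15:00": 0.5, "15:12": 0.3, "14:45": 0.2}

guess = frequency_attack(observed, known)
recovered = all(guess[one_to_one(q)] == q for q in known)
print(recovered)
```

Because the transform is deterministic, the frequencies of the transformed queries are the same as those of the plain queries, which is exactly what the rank matching exploits.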
  6. Key idea for protecting against FAA
     • Use a 1-to-many mapping so that the distribution of transformed queries differs from the original distribution.
     • [Figure: the key 15:00 maps to several transformed values, Tk1(15:00) and Tk2(15:00); the query 15:00 ~ 15:12 maps to several transformed queries, Tq1(15:00-15:12) and Tq2(15:00-15:12); distribution of plain queries q versus distribution of transformed queries q*]
     • To ensure this property, we add perturbations to queries and then encrypt them.
  7. Inner Product Predicate (IPP) method
     • Employs polynomials f(k) as queries in order to add perturbations.
       • A query [a, b] is described as f(k) ≤ 0 with a perturbation r.
       • A different r produces a different query.
       • [Figure: two plots of f(k) against k, one with roots at -r, a, b and one with roots at -r', a, b; the regions are labeled "match" and "NOT match"]
     • Uses matrix based encryption.
       • Matrix based encryption enables query processing without decryption.
       • A query f(k) ≤ 0 is expressed with vectors q and k as $q \cdot k \le 0$.
       • The encryption key is a regular (invertible) matrix M.
       • q and k are encrypted as $M^{\top} q$ and $M^{-1} k$.
       • The inner product is computed as $M^{\top} q \cdot M^{-1} k = q^{\top} M M^{-1} k = q \cdot k$, because $M M^{-1}$ cancels.
  8. Inner Product Predicate (IPP) method
     • Perturbation-added polynomial f(k)
       • $f_r(k) = (k - a)(k - b)(k + r)$, where r is the perturbation.
       • A different r produces a different query.
       • [Figure: plot of f(k) against k with roots at -r, a, and b]
     • Vector form of attribute values and queries
       • Key vector $k = (k^3, k^2, k, 1)^{\top}$
       • Query vector $q = (1, r - a - b, ab - ar - br, abr)^{\top}$
       • The inner product is $q \cdot k = (k - a)(k - b)(k + r)$.
     • Encrypting both vectors with the key matrix M
       • $M^{\top} q \cdot M^{-1} k = q^{\top} M M^{-1} k = q \cdot k$, where $M^{\top} q$ is the encrypted query and $M^{-1} k$ is the encrypted attribute value.
       • The inner product can be computed without decryption.
     • The IPP method also adds perturbation to attribute values.
       • For details, please see our paper.
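The vector form above can be checked with a few lines of Python. This is only a sketch of the unencrypted step: the function names are hypothetical, the perturbation values 7 and 123456 are arbitrary, and the match test assumes k + r > 0 (for example, non-negative keys and a positive r) so that the sign of the polynomial is decided by (k - a)(k - b) alone.

```python
def key_vector(k):
    """Key vector (k^3, k^2, k, 1)."""
    return [k ** 3, k ** 2, k, 1]


def query_vector(a, b, r):
    """Coefficients of (k - a)(k - b)(k + r), highest degree first."""
    return [1, r - a - b, a * b - a * r - b * r, a * b * r]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def matches(a, b, r, k):
    """Range test q . k <= 0; assumes k + r > 0."""
    return dot(query_vector(a, b, r), key_vector(k)) <= 0


a, b = 500, 600
keys = (499, 500, 550, 600, 601)
for r in (7, 123456):  # two arbitrary perturbations for the same range
    q = query_vector(a, b, r)
    for k in keys:
        # The inner product equals the factored polynomial.
        assert dot(q, key_vector(k)) == (k - a) * (k - b) * (k + r)
    print(r, q, [matches(a, b, r, k) for k in keys])
```

The two values of r give different query vectors for the same range [500, 600], yet both accept exactly the keys 500, 550, and 600 from the sample.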
  9. Scheme of IPP method
     • Adding tuples
       • New tuple: (k, v)
       • Transformed tuple: $(Tk_r(k), v)$, where $Tk_r(k) = M^{-1}(k^3, k^2, k, 1)^{\top}$
       • The client sends the transformed tuple over the web, and the service provider stores $(Tk_r(k), v)$.
     • Searching tuples
       • Query: a ≤ k ≤ b
       • Transformed query: $Tq(a \le k \le b) = M^{\top}(-1, a + b - r, ar + br - ab, -abr)^{\top}$
       • This vector is the negation of the query vector q on the previous slide, so a match here corresponds to a non-negative inner product.
       • The service provider computes inner products for all tuples.
     • The server's computational cost is O(n), where n is the number of tuples.
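The add and search flow can be sketched end to end in Python. This is an illustration under stated assumptions, not the authors' implementation: the class and function names are hypothetical, the key matrix is a small random integer matrix, exact rational arithmetic from the standard `fractions` module replaces whatever number representation the paper uses, and the perturbation of attribute values is omitted. The sketch uses the sign convention of slide 8 (match when the inner product is ≤ 0); with the negated coefficients listed on this slide the test would be ≥ 0 instead.

```python
import random
from fractions import Fraction


def invert(m):
    """Gauss-Jordan inverse with exact rationals; raises ValueError if singular."""
    n = len(m)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if a[i][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for i in range(n):
            if i != col and a[i][col] != 0:
                factor = a[i][col]
                a[i] = [x - factor * y for x, y in zip(a[i], a[col])]
    return [row[n:] for row in a]


def mat_vec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


class Client:
    """Holds the secret key matrix M (hypothetical helper class)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        while True:  # draw matrices until one is invertible
            m = [[self.rng.randint(-9, 9) for _ in range(4)] for _ in range(4)]
            try:
                self.m_inv = invert(m)
                break
            except ValueError:
                continue
        self.m_t = [list(col) for col in zip(*m)]  # transpose of M

    def transform_key(self, k):
        """Tk(k) = M^-1 (k^3, k^2, k, 1)^T (attribute perturbation omitted)."""
        return mat_vec(self.m_inv, [k ** 3, k ** 2, k, 1])

    def transform_query(self, a, b):
        """Tq = M^T q with a fresh positive perturbation r per query."""
        r = self.rng.randint(1, 10 ** 6)
        return mat_vec(self.m_t, [1, r - a - b, a * b - a * r - b * r, a * b * r])


def server_search(table, tq):
    """O(n) scan: the server only sees transformed keys and queries."""
    return [v for tk, v in table if sum(x * y for x, y in zip(tq, tk)) <= 0]


client = Client(seed=1)
table = [(client.transform_key(k), "v%d" % k) for k in range(20)]
tq1 = client.transform_query(5, 8)
tq2 = client.transform_query(5, 8)
print(server_search(table, tq1))
print(server_search(table, tq2))
```

Both transformed queries select the tuples with keys 5 through 8, although the vectors sent to the server differ because each one uses its own r. The server loop touches every tuple once, which is the O(n) cost stated on the slide.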
  10. Comparison of necessary memory size
                                Plain    Transformed
        Key attribute values    l_K      12 l_K + 4(l_φ + 3 l_m + l_rk)
        Queries                 2 l_K    8 l_K + 4(l_d + l_m + l_rq)
      • l_K: bit length of key attribute values
      • l_φ: bit length of perturbations for key attribute values
      • l_d: bit length of perturbations for queries
      • l_m: bit length of encryption keys
      • l_rk, l_rq: bit lengths of random values used for encryption
      • Summary
        • Attribute values require 12 times more memory than in the plain case.
        • Queries require four times more memory than in the plain case.
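The size formulas in the table can be evaluated directly. The sketch below assumes every bit length is 32, which the slides state for several parameters but not explicitly for l_d and l_rq; the function names are hypothetical. The factors 12 and 4 in the summary correspond to the l_K terms alone (12 l_K against l_K, and 8 l_K against 2 l_K); the additive terms for perturbations, keys, and random values come on top of that.

```python
def transformed_key_bits(l_k, l_phi, l_m, l_rk):
    """Size of a transformed key attribute value, per the table."""
    return 12 * l_k + 4 * (l_phi + 3 * l_m + l_rk)


def transformed_query_bits(l_k, l_d, l_m, l_rq):
    """Size of a transformed query, per the table."""
    return 8 * l_k + 4 * (l_d + l_m + l_rq)


L = 32  # assumed common bit length for all parameters

key_bits = transformed_key_bits(L, L, L, L)
query_bits = transformed_query_bits(L, L, L, L)

print("key attribute value:", L, "->", key_bits, "bits")
print("query:", 2 * L, "->", query_bits, "bits")
```

With these assumed lengths, a 32-bit key attribute value becomes 1024 bits and a 64-bit plain range query becomes 640 bits.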
  11. Experimental evaluations
      • We conducted experiments to evaluate whether:
        • The correlation between the distribution of plain queries and that of transformed queries is low enough.
        • Query processing time is O(n) in the number of tuples n.
      • Common conditions
        • All programs are implemented in Python (2.6.4).
        • Experiments were performed on a virtual machine with one 2.66 GHz processor and 512 MB of memory, running on VirtualBox.
        • We chose the parameters of the IPP method as l_K = l_φ = l_m = l_rk = l_rp = 32, the default size in many programming languages.
  12. Exp. 1: Correlations of queries
      • Query set
        • 1,000 queries of the form [a, a + 100] (a = 1, 2, ..., 1000).
      • [Figure: transformed queries plotted against the left endpoint of the plain range queries; the graph shows only the first element of each query vector. Annotation: a range query [500, 600] is mapped to 3.0×10^13 transformed queries.]
      • Query vectors were distributed over a wide range, independent of the plain values.
      • Correlation coefficient: 0.014679
  13. Exp. 2: Query processing time
      • Conditions
        • Five databases with different numbers of tuples
        • One million random queries were issued to each database.
      • [Figure: query processing time versus the number of tuples, with "×2" annotations]
      • The query processing time grows as O(n) in the number of tuples n.
  14. Open problems
      • Reducing the computational cost of servers.
        • O(n) is the minimum cost: if servers could prune candidate tuples, it would mean that servers somehow know what users request.
        • There is a trade-off between security and computational cost.
      • Attackers may guess the plain queries and attribute values by gathering and analyzing query results.
        • However, in general, each query result consists of many tuples.
        • Gathering the results requires much more storage space.
        • We think it is also necessary to examine the effectiveness of attacks that use query results.
  15. Conclusion
      • We introduced a new private query method.
        • The transformation algorithms are probabilistic.
        • They provide a 1-to-many mapping for attribute values and queries.
        • The computational cost is O(n).
        • The correlation between transformed distributions and plain ones is low.
        • The IPP method is robust against the frequency analysis attack.
      • Future work
        • Reducing the computational cost of servers.
        • Considering other attacks that use query results.
      Thank you for your attention!
