Frequency-based Constraint Relaxation
for Private Query Processing in Cloud Databases
Junpei Kawamoto (Kyushu University, ...
  1. Frequency-based Constraint Relaxation for Private Query Processing in Cloud Databases
     Junpei Kawamoto (Kyushu University, Japan), Patricia L. Gillett (École Polytechnique de Montréal)
  2. Cloud services and Privacy
     • Cloud services often act as cloud databases.
     • Sometimes people want to keep secret what they request from databases.
     • Examples: location-based services ("Find restaurants near current location x") and forum sites ("Want to read article x").
     (Slide footer: May 6, 2014, Frequency-based Constraint Relaxation for Private Query Processing)
  3. Private query processing
     • Methodologies to obtain data without exposing queries.
     • Several protocols, such as cPIR† and bbPIR††, have been introduced.
     • E.g. location-based services: the user encodes the current location x into a query, and the cloud database computes the query result without decoding the query.
     † Kushilevitz, E. and Ostrovsky, R.: Replication Is Not Needed: Single Database, Computationally-Private Information Retrieval, Proc. of the 38th Annual Symposium on Foundations of Computer Science, pp. 364-373, 1997.
     †† Wang, S., Agrawal, D., and Abbadi, A.: Generalizing PIR for Practical Private Retrieval of Public Data, Proc. of the 24th Annual IFIP WG 11.3 Working Conference on Data and Applications Security and Privacy, pp. 1-16, 2010.
  4. Three ideas of private queries
     • We introduce three ideas for our discussion:
       • Search intention: what users hope to obtain from cloud services,
       • Query: the request users send to servers to obtain data,
       • Handled set: the data set which servers must check to compute the results.
     • E.g. a user wants information associated with location x (search intention) and sends a query to the cloud DB; the items the DB must check to compute the result form the handled set for x.
  5. Existing private querying protocols
     • Most existing protocols impose two constraints:
       1. Queries are encoded in such a way that servers can process them but cannot actually decode them;
       2. Servers are made to check all data in the database when computing any query result.
     • Constraint (2) means servers cannot distinguish which data in the DB a query targets.
     • Servers spend O(n) computational cost executing each query, where the database has n entries.
  6. Constraint Relaxation
     • There are cases in which we do not need to retrieve the entire database to sufficiently obscure search intentions.
       • It may be enough to hide where I am in downtown.
     • What area is enough to ensure our privacy?
       • E.g. if a handled set contains one popular place (shop x) and several unpopular places (shops a, b, c), servers may guess that the search intention is x with high probability.
     • We should consider the frequency of search intentions.
  7. Database model & frequency
     • Database model
       • Database D consists of n items: $D = \{t_1, t_2, \ldots, t_n\}$.
       • Users request the x-th item $t_x$ (the search intention is x).
     • Handled set H(x)
       • The item set servers must check to compute the results of a given query associated with search intention x.
     • Frequency of search intentions
       • Freq(x) denotes the frequency of search intention x.
       • We assume Freq(x) is normalized.
  8. Definition: Query Risk
     • A measure of exposure risk for private queries.
     • The query risk of search intention x and handled set H(x) is
       $$\mathrm{Risk}(x \mid H(x)) = \frac{\mathrm{Freq}(x)}{\sum_{y \in H(x)} \mathrm{Freq}(y)}$$
       • This is the conditional probability that the search intention is x, given that we know the handled set H(x) that was used.
       • The more frequent x is, the higher the risk becomes.
     • E.g. (for the frequencies plotted in the slide's figure)
       • Risk(3|{1,2,3,4,5}) = 3/9
       • Risk(1|{1,2,3,4,5}) = 1/9
       • Risk(1|{1,2}) = 1/3
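The query risk definition on this slide can be checked with a few lines of Python. This is a minimal sketch: the frequency values in `FREQ` are an assumption (the slide's figure is not recoverable), chosen only because they reproduce the three example values above, and the name `query_risk` is hypothetical.

```python
from fractions import Fraction

# Assumed (hypothetical) frequencies for search intentions 1..5.
# They are not taken from the slides; they merely reproduce the examples.
FREQ = {1: 1, 2: 2, 3: 3, 4: 2, 5: 1}


def query_risk(x, handled, freq):
    """Risk(x | H) = Freq(x) / sum of Freq(y) over all y in H."""
    if x not in handled:
        raise ValueError("the search intention must belong to the handled set")
    total = sum(freq[y] for y in handled)
    return Fraction(freq[x], total)


print(query_risk(3, {1, 2, 3, 4, 5}, FREQ))  # equals 3/9
print(query_risk(1, {1, 2, 3, 4, 5}, FREQ))  # equals 1/9
print(query_risk(1, {1, 2}, FREQ))           # equals 1/3
```

Exact fractions are used so the results can be compared with the slide's values without rounding.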
  9. Definition: Privateness
     • The maximum risk of complete protocols (i.e. protocols without relaxation):
       $$\max_{x \in D} \mathrm{Risk}(x \mid H_{\mathrm{complete}}(x))$$
     • We assume query risks should be less than or equal to the maximum risk of complete protocols:
       $$\mathrm{Risk}(x \mid H_{\mathrm{relaxed}}(x)) \le \max_{y \in D} \mathrm{Risk}(y \mid H_{\mathrm{complete}}(y))$$
       where $H_{\mathrm{relaxed}}$ is our handled set and $H_{\mathrm{complete}}$ is the handled set for existing protocols.
       • Complete protocols are widely accepted, so their maximum risk is accepted too.
       • Our approach will also be considered private if it satisfies this condition.
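The privateness condition can be sketched as a simple check, assuming that a complete protocol always uses the whole database as its handled set (as constraint (2) states). The names `max_complete_risk` and `is_private`, and the toy frequencies, are hypothetical and not from the slides.

```python
from fractions import Fraction

# Assumed toy frequencies for search intentions 1..5 (not from the slides).
FREQ = {1: 1, 2: 2, 3: 3, 4: 2, 5: 1}


def risk(x, handled, freq):
    """Risk(x | H) = Freq(x) / sum of Freq(y) over all y in H."""
    return Fraction(freq[x], sum(freq[y] for y in handled))


def max_complete_risk(freq):
    """Maximum risk when every query handles the whole database D."""
    everything = set(freq)
    return max(risk(y, everything, freq) for y in freq)


def is_private(x, handled, freq):
    """True if the relaxed risk does not exceed the complete protocols' maximum."""
    return risk(x, handled, freq) <= max_complete_risk(freq)


print(max_complete_risk(FREQ))      # risk of the most frequent intention over D
print(is_private(1, {1, 2}, FREQ))  # rare item with a small handled set
print(is_private(2, {1, 2}, FREQ))  # more frequent item with the same set
```

With these assumed frequencies, the same two-item handled set is acceptable for the rare intention 1 but not for the more frequent intention 2, which is the effect the frequency-based definition is meant to capture.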
  10. Definition: Query processing cost
     • Query processing costs on servers.
     • We evaluate them by the size of handled sets.
     • The cost of search intention x is $\mathrm{Cost}(x) = |H(x)|$.
  11. The problem
     • Find handled set H such that, for search intention x:
       1. $x \in H(x)$,
       2. $\mathrm{Risk}(x \mid H(x)) \le \max_{y \in D} \mathrm{Risk}(y \mid D)$,
       3. minimize Cost(x),
       4. if multiple solutions have equal cost, choose the one that maximizes $\sum_{y \in H(x)} \mathrm{Freq}(y)$.
     • We solve this problem by dynamic programming.
       • The algorithm runs in O(n); details are in our paper.
     • We also extend the problem and algorithm to
       • range queries in 1D data,
       • exact match queries in 2D data.
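To make the four conditions concrete, here is a small greedy sketch. It is not the dynamic-programming algorithm from the paper (the slides do not give its details). It assumes that a handled set may be any subset of D and that all frequencies are positive integers. Under those assumptions, adding the most frequent remaining items first reaches the required frequency mass with the fewest items and, among sets of that size, the largest frequency sum. The name `relaxed_handled_set` and the toy data are hypothetical.

```python
# Assumed toy frequencies for search intentions 1..5 (not from the slides).
FREQ = {1: 1, 2: 2, 3: 3, 4: 2, 5: 1}


def relaxed_handled_set(x, freq):
    """Greedy stand-in: smallest H containing x with Risk(x|H) <= max_y Risk(y|D)."""
    total = sum(freq.values())
    top = max(freq.values())  # max_y Risk(y | D) equals top / total
    handled = {x}
    mass = freq[x]
    # Other items, most frequent first (ties keep their original order).
    others = sorted((y for y in freq if y != x), key=lambda y: freq[y], reverse=True)
    for y in others:
        # Risk(x | H) <= top / total, written without division.
        if freq[x] * total <= top * mass:
            break
        handled.add(y)
        mass += freq[y]
    return handled


for x in sorted(FREQ):
    h = relaxed_handled_set(x, FREQ)
    print(x, sorted(h), "cost =", len(h))
```

In this toy run the most frequent intention still needs the whole database, while rare intentions get away with two items, which mirrors the min. and max. costs reported in the evaluation.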
  12. The protocol
     • Our protocol employs some existing protocol (PIR, etc.).
     • We assume the frequencies are public information.
     • The user, whose search intention is x:
       1. Computes the optimized handled set H(x) using query frequencies.
       2. Computes a private query for x, assuming the DB has only the items in H(x).
       3. Sends the query and H(x) to the cloud server.
     • The cloud server, on receiving the query:
       1. Considers a sub-database consisting of the items in H(x).
       2. Processes the received query and returns the result to the user.
     • Server cost: O(|H(x)|).
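The message flow of the protocol can be sketched as follows. This is an illustration only: the slides leave the underlying private-query protocol open (PIR, etc.), so the sketch swaps in the trivial scheme in which the server returns every item of the sub-database and the client picks its item locally. All function names and the toy database are hypothetical.

```python
def server_answer(database, handled):
    """Server side: touch only the sub-database made of the items in H(x)."""
    # Trivial stand-in for a real PIR scheme: return the whole sub-database,
    # so the server learns H(x) but not which item inside it is wanted.
    return {i: database[i] for i in sorted(handled)}


def client_fetch(x, handled, database):
    """Client side: send H(x), receive the reply, keep only item x."""
    if x not in handled:
        raise ValueError("the search intention must belong to the handled set")
    reply = server_answer(database, handled)  # stands in for the network round trip
    return reply[x], len(reply)


# Toy database with five items and an assumed handled set for x = 1.
database = {i: f"item-{i}" for i in range(1, 6)}
item, touched = client_fetch(1, {1, 3}, database)
print(item, "server touched", touched, "of", len(database), "items")
```

The server's work is proportional to |H(x)| rather than to the size of the database, which is the cost saving the slide states.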
  13. Evaluation
     • Dataset
       • Query logs from †.
       • Sampled 100,000 songs and 1,800,145 queries.
     • Figure: frequency of search intention x (x-axis: search intentions x; y-axis: number of times users requested item $t_x$).
     † http://www.dtic.upf.edu/~ocelma/MusicRecommendationDataset/lastfm-1K.html
  14. Evaluation
     • Comparison of query risks.
       • In most cases, risks are bigger than those of the complete protocols.
       • They do not exceed the maximum risk of the complete protocols.
     • Query risk of search intention x:

       |         | min.                 | max.                 | avg.                 |
       |---------|----------------------|----------------------|----------------------|
       | comp.   | $5.6 \times 10^{-7}$ | $3.9 \times 10^{-3}$ | $1.0 \times 10^{-5}$ |
       | relaxed | $3.4 \times 10^{-3}$ | $3.9 \times 10^{-3}$ | $3.5 \times 10^{-3}$ |

     • relaxed (avg.) is computed by $\sum_{x \in D} \mathrm{Freq}(x) \times \mathrm{Risk}(x \mid H(x))$.
  15. Evaluation
     • Comparison of query costs.
       • Our relaxation methodologies reduce costs in most cases.
       • The average cost is 6.5% of that of complete protocols.
     • Query cost of search intention x:

       |         | min.   | max.   | avg.   |
       |---------|--------|--------|--------|
       | comp.   | 100000 | 100000 | 100000 |
       | relaxed | 2      | 100000 | 6417   |

     • relaxed (avg.) is computed by $\sum_{x \in D} \mathrm{Freq}(x) \times \mathrm{Cost}(x)$.
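The averages in the evaluation tables are frequency-weighted sums. The sketch below shows that computation on assumed toy data. The frequencies and per-intention costs are hypothetical and not taken from the dataset, and the function normalizes the frequencies itself so raw counts can be passed in.

```python
from fractions import Fraction

# Assumed toy data (not from the slides): raw request counts and handled-set sizes.
FREQ = {1: 1, 2: 2, 3: 3, 4: 2, 5: 1}
COST = {1: 2, 2: 3, 3: 5, 4: 3, 5: 2}


def weighted_average(freq, values):
    """Sum over x of Freq(x) * values[x], with Freq normalized to sum to 1."""
    total = sum(freq.values())
    return sum(Fraction(freq[x], total) * values[x] for x in freq)


avg_cost = weighted_average(FREQ, COST)
print(float(avg_cost))  # average relaxed cost for the toy data
print(float(avg_cost / len(FREQ)))  # fraction of the complete cost n
```

The same function applies to the risk table by passing per-intention risks instead of costs.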
  16. Conclusion
     • We introduced a frequency-based constraint relaxation methodology for private queries.
       • We relaxed constraint (2) of the complete protocols (servers must check all data in the database when computing any query result) so that only a subset of the database is retrieved for each query.
     • We evaluated our proposal using a real dataset of query logs.
       • Our protocol can reduce computational costs on servers in most cases.
       • The risk of a query being exposed is not bigger than the maximum risk in complete protocols.
  17. Acknowledgement
     • This work is partly supported by
       • The Nakajima Foundation,
       • Artificial Intelligence Research Promotion Foundation,
       • Grant-in-Aid for Young Scientists (B) (26730065), Japan Society for the Promotion of Science (JSPS).
     Thank you for your attention!
