Sigmod11 outsource shortest path



    1. 1. Neighborhood-Privacy Protected Shortest Distance Computing in Cloud. Jun Gao, Jeffrey Yu Xu, Ruoming Jin, Jiashuai Zhou, Tengjiao Wang, Dongqing Yang. 14 Jun, 2011, Greece, SIGMOD 2011
    2. 2. Outline <ul><li>Motivation </li></ul><ul><li>Related work </li></ul><ul><li>Our solution </li></ul><ul><ul><li>1-neighborhood-d-radius graph </li></ul></ul><ul><ul><li>Graph transformation with exact answer </li></ul></ul><ul><ul><li>Graph transformation with approximate answer </li></ul></ul><ul><li>Experiment </li></ul><ul><li>Conclusion & Future work </li></ul>
    3. 3. Graph data management in cloud [Figure: coauthor network] <ul><li>Graph data applications </li></ul><ul><ul><li>Social networks, knowledge networks, ... </li></ul></ul><ul><li>Time-consuming graph operations </li></ul><ul><ul><li>Shortest distance computation takes O(n²) </li></ul></ul><ul><ul><li>Breadth-first search requires O(n+m) </li></ul></ul><ul><ul><li>... </li></ul></ul>Cloud computing <ul><li>Advantages of cloud computing </li></ul><ul><ul><li>High computational power </li></ul></ul><ul><ul><li>Easy maintenance </li></ul></ul><ul><ul><li>Easy re-provisioning of resources </li></ul></ul><ul><ul><li>... </li></ul></ul>Can we use a cloud server to manage graph data, for example to answer shortest distance queries?
    4. 4. Security issues in graph outsourcing <ul><li>Attacks on the outsourced graph </li></ul><ul><ul><li>Structural pattern attack </li></ul></ul><ul><ul><ul><li>Use a sub-graph to re-identify the target part </li></ul></ul></ul><ul><ul><li>Reconstruction attack </li></ul></ul><ul><ul><ul><li>Recover the original graph from the outsourced one </li></ul></ul></ul><ul><li>Security leakage </li></ul><ul><ul><li>Regulations on sensitive data are violated </li></ul></ul><ul><ul><li>Untrusted answers are produced by the cloud server </li></ul></ul>We have to strike a balance between security and the computational cost saved by using a cloud server.
    5. 5. Framework of graph outsourcing <ul><li>(1) A reasonable security model on the outsourced graph </li></ul><ul><li>(2) An efficient method to transform the original graph into the outsourced graph </li></ul><ul><li>(3) An approach to rewrite the query and combine the results </li></ul>[Figure: framework diagram showing the client side, the cloud server, and the components original graph, graph transformation, link graph, outsourced graph, query, query rewriting, query evaluation, result combination, and results, labeled (1) to (3)]
    7. 7. Structural Anonymization <ul><ul><li>Structural anonymization in publishing </li></ul></ul><ul><ul><ul><li>1-neighborhood [icde 08], k-degree [sigmod 08], k-automorphism [vldb 09], k-isomorphism [sigmod 10], etc. </li></ul></ul></ul><ul><ul><ul><li>Use the least amount of modification to the original graph </li></ul></ul></ul>[Figure: original graph and its 4-isomorphism version; the attacker's query finds 4 sub-graphs] Limitations: no shortest distance preservation; no consideration of edge weights.
    8. 8. Feature-preserving graph transformation <ul><li>Eigenvalue preservation [sdm 08] </li></ul><ul><ul><li>Randomly add, remove, or switch edges </li></ul></ul><ul><ul><li>Theoretically prove that the eigenvalue can be preserved </li></ul></ul><ul><li>Shortest path preservation [icde 10] </li></ul><ul><ul><li>Express shortest path preservation as inequality rules </li></ul></ul><ul><ul><li>Use linear programming to find a solution to these rules </li></ul></ul><ul><ul><li>Requires O(dn²) rules to preserve all shortest paths </li></ul></ul>Limitations: no support for exact distance computation; no explicit security guarantee.
    9. 9. Shortest distance index <ul><li>Multiple-level index [tkde 98] </li></ul><ul><ul><li>Select nodes to build a higher-level graph </li></ul></ul><ul><ul><li>Exploit the shortest paths in the higher-level graph to guide path searching at the lower level </li></ul></ul><ul><li>Landmark index [cikm 09, jacm 09] </li></ul><ul><ul><li>Select landmark nodes and precompute the shortest paths to them </li></ul></ul><ul><ul><li>Exploit the triangle inequality to estimate distances </li></ul></ul><ul><li>2-HOP index [soda 02] </li></ul><ul><ul><li>Annotate incoming and outgoing labels on each node </li></ul></ul><ul><ul><li>Compute the distance between two nodes from the intersection of their labels </li></ul></ul>Limitation: no security consideration.
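The landmark idea on this slide can be sketched in a few lines. This is a minimal illustration, not the cited papers' method: it assumes an undirected, connected, weighted graph stored as a dict of dicts, and the names `dijkstra`, `landmark_bounds`, `PATH` and `LANDMARK_DISTS` are hypothetical. The triangle inequality gives |δ(l,u) − δ(l,v)| ≤ δ(u,v) ≤ δ(l,u) + δ(l,v) for every landmark l.

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source; adj is {node: {neighbor: weight}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def landmark_bounds(landmark_dists, u, v):
    """Lower and upper bounds on the distance between u and v.

    landmark_dists holds one distance table per landmark (assumed layout).
    """
    lower = max(abs(d[u] - d[v]) for d in landmark_dists)
    upper = min(d[u] + d[v] for d in landmark_dists)
    return lower, upper

# Hypothetical example: the unit-weight path a-b-c-d-e with landmark c.
PATH = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "d": 1},
    "d": {"c": 1, "e": 1},
    "e": {"d": 1},
}
LANDMARK_DISTS = [dijkstra(PATH, "c")]
print(landmark_bounds(LANDMARK_DISTS, "a", "b"))  # (1, 3); the true distance is 1
```

The bounds tighten as more landmarks are added; with a single landmark in the middle of the path, the upper bound is exact only for pairs on opposite sides of it.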
    11. 11. 1-Neighborhood-d-Radius Graph <ul><li>Intuition </li></ul><ul><ul><li>Protect the neighborhood information and the close relationships between nodes. </li></ul></ul><ul><li>Privacy protection </li></ul><ul><ul><li>Any query pattern from an attacker finds no meaningful results </li></ul></ul>(1-neighborhood): for any node pair u and v ∈ V_o, (u, v) ∉ E. (d-radius): for any node pair u and v ∈ V_o, δ_G(u, v) ≥ d. [Figure: original graph, attacker's query, and 2-radius graph]
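The two conditions can be checked directly. The sketch below assumes that E and δ_G refer to the original graph G and that the graph is undirected with non-negative weights; the function names and the example graph are hypothetical.

```python
import heapq

def shortest_distances(adj, source):
    """Dijkstra over adj = {node: {neighbor: weight}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def is_1_neighborhood_d_radius(adj, outsourced_nodes, d):
    """True if no two outsourced nodes are adjacent or closer than d in adj."""
    nodes = sorted(outsourced_nodes)
    for i, u in enumerate(nodes):
        dist = shortest_distances(adj, u)
        for v in nodes[i + 1:]:
            if v in adj[u]:
                return False  # violates 1-neighborhood
            if dist.get(v, float("inf")) < d:
                return False  # violates d-radius
    return True

# Hypothetical example: the unit-weight path a-b-c-d-e.
PATH = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "d": 1},
    "d": {"c": 1, "e": 1},
    "e": {"d": 1},
}
print(is_1_neighborhood_d_radius(PATH, {"a", "c", "e"}, 2))  # True
```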
    12. 12. Is the 1-Neighborhood-d-Radius Graph too strong? <ul><li>Can we hide only the neighbors and the relationships with distance less than d, and add direct edges among the others? </li></ul><ul><li>No: the triangle inequality can be used to find the “hidden” edges </li></ul><ul><ul><li>Reconstruction attack </li></ul></ul>[Figure: original graph and non-2-radius graph]
    13. 13. Utilization: Shortest Distance Computation <ul><li>Given a node pair u and v, the shortest distance can be discovered with </li></ul>…… u v
    14. 14. Graph Transformation Problem <ul><li>Given a graph G = (V, E) and d, the graph transformation produces outsourced graphs G_o = {G_1, ..., G_j} and a local link graph G_l, which achieve the following objectives: </li></ul><ul><ul><li>Security </li></ul></ul><ul><ul><ul><li>Each outsourced graph is a 1-neighborhood-d-radius graph; </li></ul></ul></ul><ul><ul><li>Utility </li></ul></ul><ul><ul><ul><li>The union of G_o and G_l can answer the shortest distance in the original graph; </li></ul></ul></ul><ul><ul><li>Local computational cost </li></ul></ul><ul><ul><ul><li>The space cost of G_l and the cost of the shortest distance computation on the client side are minimized. </li></ul></ul></ul>
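The utility objective can be illustrated by answering a query on the union of the two edge sets. This is only a sketch: it runs everything in one place, whereas in the framework the outsourced part is evaluated on the cloud server and the results are combined on the client. The edge lists and function names are hypothetical.

```python
import heapq

def union_graph(*edge_lists):
    """Merge undirected (u, v, weight) edge lists, keeping the lightest duplicate."""
    adj = {}
    for edges in edge_lists:
        for u, v, w in edges:
            adj.setdefault(u, {})
            adj.setdefault(v, {})
            w = min(w, adj[u].get(v, float("inf")))
            adj[u][v] = w
            adj[v][u] = w
    return adj

def dijkstra(adj, source):
    """Shortest distances from source over adj = {node: {neighbor: weight}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def shortest_distance(outsourced_edges, link_edges, u, v):
    """Distance between u and v in the union of outsourced and link edges."""
    adj = union_graph(outsourced_edges, link_edges)
    return dijkstra(adj, u).get(v, float("inf"))

# Hypothetical data: x and y are outsourced nodes; u and v stay local.
OUTSOURCED_EDGES = [("x", "y", 5)]
LINK_EDGES = [("u", "x", 1), ("y", "v", 2)]
print(shortest_distance(OUTSOURCED_EDGES, LINK_EDGES, "u", "v"))  # 8
```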
    15. 15. Naive Method <ul><li>Steps </li></ul><ul><ul><li>Enumerate the different forms of candidate solutions </li></ul></ul><ul><ul><ul><li>One local link graph and a set of outsourced graphs. </li></ul></ul></ul><ul><ul><li>Find the one with the minimal space cost for the local graph. </li></ul></ul><ul><li>Search space </li></ul><ul><ul><li>The nodes of an outsourced graph are a subset of those in the original graph, so there can be O(2^n) different outsourced graphs </li></ul></ul><ul><ul><li>The brute-force strategy leads to exponential time cost </li></ul></ul>
    16. 16. Greedy Method <ul><li>Basic idea </li></ul><ul><ul><li>Generate more “expressive” outsourced graphs, which can answer more shortest paths. </li></ul></ul><ul><ul><ul><li>Edges in the link graph can be reused, so the space cost of the link graph is reduced </li></ul></ul></ul><ul><li>Challenges </li></ul><ul><ul><li>How to find “expressive” outsourced nodes? </li></ul></ul><ul><ul><li>How to build a d-radius graph from the selected nodes? </li></ul></ul><ul><li>Steps </li></ul><ul><ul><li>1. Enumerate all shortest paths, find possible candidate outsourced nodes, and assign benefits to nodes </li></ul></ul><ul><ul><li>2. Generate outsourced graphs according to node benefit </li></ul></ul>
    17. 17. Step 1: Enumerate shortest paths and assign benefits <ul><li>Candidate outsourced node pair </li></ul><ul><ul><li>A node pair (x, y) can be used to answer the shortest distance between (u, v) </li></ul></ul><ul><ul><li>(x, y) should satisfy the d-radius condition </li></ul></ul><ul><ul><li>x is close to u, and y is close to v </li></ul></ul><ul><li>Benefit function </li></ul><ul><ul><li>Record how frequently a node (or node pair) can be outsourced </li></ul></ul>
    18. 18. Step 2: Generate one outsourced graph <ul><li>Node selection </li></ul><ul><ul><li>Select the node with the next-highest benefit that is not yet in any cluster </li></ul></ul><ul><ul><li>Build a d-radius cluster for the selected node </li></ul></ul><ul><li>Edge building </li></ul><ul><ul><li>The edge weight is the shortest distance between cluster centers </li></ul></ul>
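A minimal sketch of this step follows, under assumptions the slide does not state: benefits are given as a plain dict, a cluster is every node at distance less than d from its center, and an edge is added between every pair of centers. All names are hypothetical. Because a node is skipped once it lies in an earlier cluster, any two selected centers are at least d apart.

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source over adj = {node: {neighbor: weight}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def build_outsourced_graph(adj, benefit, d):
    """Greedily pick cluster centers by benefit, then weight edges between them."""
    clustered = set()
    centers = []
    dist_from = {}
    for node in sorted(benefit, key=benefit.get, reverse=True):
        if node in clustered:
            continue  # already covered by an earlier center's cluster
        dist = dijkstra(adj, node)
        dist_from[node] = dist
        centers.append(node)
        # Assumed cluster definition: all nodes closer than d to the center.
        clustered.update(n for n, dn in dist.items() if dn < d)
    edges = {}
    for i, u in enumerate(centers):
        for v in centers[i + 1:]:
            if v in dist_from[u]:
                edges[(u, v)] = dist_from[u][v]  # shortest distance between centers
    return centers, edges

# Hypothetical example: the unit-weight path a-b-c-d-e with made-up benefits.
PATH = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "d": 1},
    "d": {"c": 1, "e": 1},
    "e": {"d": 1},
}
BENEFIT = {"c": 5, "a": 4, "e": 3, "b": 2, "d": 1}
CENTERS, EDGES = build_outsourced_graph(PATH, BENEFIT, 2)
print(CENTERS)  # ['c', 'a', 'e']
```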
    19. 19. Graph transformation with approximate answer <ul><li>Graph transformation with exact answer requires at least the enumeration of all shortest paths. </li></ul><ul><li>Approximate distances are acceptable in many domains </li></ul><ul><li>Approximate distance can be measured by </li></ul><ul><li>Basic idea </li></ul><ul><ul><li>Transform the graph to achieve α = 1 and a given average additive error β </li></ul></ul><ul><li>Main steps </li></ul><ul><ul><li>Construct the outsourced graph in a relaxed way </li></ul></ul><ul><ul><li>Estimate the average additive error </li></ul></ul>
    20. 20. Relaxed outsourced graph construction <ul><li>Select outsourced nodes randomly. </li></ul><ul><li>Relax edge weight assignment </li></ul><ul><ul><li>Build k shortest path trees </li></ul></ul><ul><ul><li>In each tree, link each outsourced node to its lowest ancestor with an edge. </li></ul></ul>
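One possible reading of the tree-based edge construction is sketched below. It assumes that the k tree roots are supplied by the caller, that "lowest ancestor" means the nearest ancestor that is itself outsourced, and that the edge weight is the tree path length between the two nodes. None of this is confirmed by the slide, and the names are hypothetical.

```python
import heapq

def shortest_path_tree(adj, root):
    """Dijkstra from root, returning distances and tree parent pointers."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent

def relaxed_edges(adj, outsourced, roots):
    """Link each outsourced node to its nearest outsourced ancestor in each tree."""
    edges = {}
    for root in roots:
        dist, parent = shortest_path_tree(adj, root)
        for node in outsourced:
            if node not in dist:
                continue  # unreachable from this root
            anc = parent[node]
            while anc is not None and anc not in outsourced:
                anc = parent[anc]
            if anc is None:
                continue  # no outsourced ancestor in this tree
            key = tuple(sorted((node, anc)))
            # Tree path length; a subpath of a shortest path from the root.
            weight = dist[node] - dist[anc]
            edges[key] = min(weight, edges.get(key, float("inf")))
    return edges

# Hypothetical example: the unit-weight path a-b-c-d-e, one tree rooted at a.
PATH = {
    "a": {"b": 1},
    "b": {"a": 1, "c": 1},
    "c": {"b": 1, "d": 1},
    "d": {"c": 1, "e": 1},
    "e": {"d": 1},
}
RELAXED = relaxed_edges(PATH, {"a", "c", "e"}, ["a"])
print(RELAXED)
```

Under this reading each edge weight is an exact distance, and the approximation error comes from outsourced pairs that are not linked in any of the k trees.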
    21. 21. Estimation of average additive error <ul><li>The error for a distance query (u, v) varies according to whether u and v have been outsourced </li></ul><ul><li>β can be computed as follows: </li></ul><ul><ul><li>We estimate the percentage of each category under the random node selection assumption </li></ul></ul><ul><ul><li>The average additive error can be estimated by sampling </li></ul></ul>
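The sampling step can be sketched as follows. This simplified version samples node pairs uniformly and ignores the per-category weighting mentioned above; `exact_distance` and `approx_distance` are hypothetical callables supplied by the caller.

```python
import random

def estimate_average_additive_error(nodes, exact_distance, approx_distance,
                                    samples, seed=0):
    """Mean of (approximate - exact) distance over randomly sampled node pairs."""
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    total = 0.0
    for _ in range(samples):
        u, v = rng.sample(nodes, 2)  # two distinct nodes
        total += approx_distance(u, v) - exact_distance(u, v)
    return total / samples

# Hypothetical example: nodes on a line, approximation always 1 too long.
NODES = list(range(10))

def exact(u, v):
    return abs(u - v)

def approx(u, v):
    return abs(u - v) + 1

print(estimate_average_additive_error(NODES, exact, approx, samples=50))  # 1.0
```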
    22. 22. Heuristic outsourced node selection <ul><li>Single outsourced graph </li></ul><ul><ul><li>Degree-based construction </li></ul></ul><ul><ul><ul><li>Select nodes with higher degree first </li></ul></ul></ul><ul><ul><li>Cluster-size-based construction </li></ul></ul><ul><ul><ul><li>Select nodes with more nodes in their cluster first </li></ul></ul></ul><ul><li>Multiple outsourced graphs </li></ul><ul><ul><li>Avoid outsourcing the same graph. </li></ul></ul>
    24. 24. Experiment <ul><li>Measures: </li></ul><ul><ul><li>transformation time cost </li></ul></ul><ul><ul><li>space cost of link graph </li></ul></ul><ul><ul><li>average additive error </li></ul></ul><ul><ul><li>local overhead ratio = (time cost with cloud server) / (time cost without cloud server) </li></ul></ul><ul><li>Competitor </li></ul><ul><ul><li>LP-based edge weight anonymization in ICDE 2010 </li></ul></ul><ul><li>Datasets: </li></ul>
    25. 25. Results for exact answers <ul><li>Scalability </li></ul><ul><ul><li>Better than the LP-based method </li></ul></ul><ul><li>Impact of increasing d </li></ul><ul><ul><li>Strengthens the security of the outsourced graphs </li></ul></ul><ul><ul><li>Increases the transformation time cost and the space cost of the link graph </li></ul></ul>
    26. 26. Results for exact answers (cont.) <ul><li>Benefit function </li></ul><ul><ul><li>The vertex-pair-based method works better </li></ul></ul><ul><li>Local overhead ratio </li></ul><ul><ul><li>Very low </li></ul></ul><ul><ul><li>Goes down as the graph size increases </li></ul></ul>
    27. 27. Results for approximate answers <ul><li>Scalability </li></ul><ul><ul><li>Supports large graphs </li></ul></ul><ul><li>Impact of increasing the error bound </li></ul><ul><ul><li>Decreases the space cost and time cost of outsourcing </li></ul></ul>
    28. 28. Results for approximate answers (cont.) <ul><li>Additive error bound </li></ul><ul><ul><li>Achieves the given additive error quite well </li></ul></ul><ul><li>Local overhead ratio </li></ul><ul><ul><li>Declines as the number of nodes increases </li></ul></ul>
    30. 30. Conclusion & Future work <ul><li>Conclusion: </li></ul><ul><ul><li>A 1-neighborhood-d-radius security model </li></ul></ul><ul><ul><li>A greedy method to transform a graph with exact answers </li></ul></ul><ul><ul><li>A method to transform a graph with approximate answers </li></ul></ul><ul><ul><li>Extensive experimental results on real and synthetic data </li></ul></ul><ul><li>Future work: </li></ul><ul><ul><li>More graph operations </li></ul></ul><ul><ul><li>A stronger security model </li></ul></ul><ul><ul><li>Incremental graph outsourcing over dynamic graphs </li></ul></ul>