
Privacy-Preserving Schema Reuse


Authors: Nguyen Quoc Viet Hung, Do Son Thanh, Nguyen Thanh Tam, and Karl Aberer; EPFL, Switzerland


Privacy-Preserving Schema Reuse

  1. Privacy-Preserving Schema Reuse. Nguyen Quoc Viet Hung, Do Son Thanh, Nguyen Thanh Tam, and Karl Aberer; EPFL, Switzerland
  2. Schema Reuse
     [Figure: users query the schema repositories (schema.org, factual.com), receive output, and contribute schemas]
     • Traditional approach: shows all original schemas
     • Our approach: shows an anonymized (unified) schema
     DASFAA Security, privacy & trust DASFAA | 04.2014
  3. Motivation
     • Schema reuse offers many benefits:
       – Reduces development complexity:
         • New schemas require only small modifications → copy and adapt existing schemas
         • Large repositories exist: schema.org, freebase.com, factual.com, niem.gov
       – Increases interoperability:
         • Share common standards
     • But privacy needs to be considered:
       – Leaked schema information → potential attacks (e.g. SQL injection)
       – Maintaining competitiveness: some parts of schemas are a source of revenue and business strategy
  4. Challenges
     • How to define privacy constraints?
     • How to define an anonymized schema from multiple schemas?
     • How to define a utility function for a certain anonymized schema?
     • How to find an anonymized schema that satisfies privacy constraints and maximizes the utility function?
     [Figure: contributors supply privacy constraints; a query returns the anonymized schema. Our approach: shows an anonymized (unified) schema]
  5. Challenge 1 – Define privacy constraints
     • Need to identify two elements:
       – Sensitive information
         • Attributes
       – Privacy requirement
         • Prevent leaking the provenance of sensitive attributes
         • Use presence constraints
     • Definition: a presence constraint $\gamma$ is a triple $\langle s, D, \theta \rangle$, where $s$ is a schema, $D$ is a set of attributes, and $\theta$ is a specified threshold. An anonymized schema $\hat{S}$ satisfies the presence constraint $\gamma$ if $\Pr(D \in s \mid \hat{S}) \le \theta$.
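The presence constraint on this slide can be written down as a small data structure. The sketch below is only illustrative: the slides do not say how $\Pr(D \in s \mid \hat{S})$ is computed, so the estimator is passed in as a hypothetical callable, and the names `PresenceConstraint`, `PresenceEstimator`, and `satisfies` are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class PresenceConstraint:
    """The triple <s, D, theta> from the slide."""
    schema: str                 # s: identifier of the contributing schema
    attributes: FrozenSet[str]  # D: the sensitive attributes
    threshold: float            # theta: maximum tolerated probability


# Hypothetical signature: (D, s) -> estimated Pr(D in s | anonymized schema).
# The slides do not define the estimator, so it is supplied by the caller.
PresenceEstimator = Callable[[FrozenSet[str], str], float]


def satisfies(constraint: PresenceConstraint, estimate: PresenceEstimator) -> bool:
    """Return True when the estimated presence probability is at most theta."""
    return estimate(constraint.attributes, constraint.schema) <= constraint.threshold


constraint = PresenceConstraint("s1", frozenset({"CC"}), threshold=0.5)
print(satisfies(constraint, lambda attrs, schema: 0.4))  # True
print(satisfies(constraint, lambda attrs, schema: 0.6))  # False
```

Keeping the estimator separate from the constraint means the same check works for whatever adversary model is eventually chosen.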
  6. Challenge 2 – Define anonymized schema
     • How to define an “anonymized schema” given a set of schemas
       – Enough information to understand, but not overwhelming
     • An anonymized schema contains a set of “abstract” attributes
       – An abstract attribute is a set of similar attributes
     [Figure: original schemas with attributes Name, Num, Name, CC, Holder, CC; the anonymized schema consists of the abstract attributes {Name, Holder} and {CC, Num}]
  7. Challenge 3 – Define utility function
     • How to define a utility function for a certain “anonymized schema”
       – Importance: sum of the popularity of attributes
         • A schema that contains more popular attributes is better
         • An attribute that appears in more schemas is more popular
       – Completeness: number of abstract attributes
         • The more abstract attributes, the better
     • Let $\Sigma$ be the set of all possible anonymized schemas. The utility function $u: \Sigma \to \mathbb{R}$ measures the amount of information in each anonymized schema:
       $u(\hat{S}) = \mathrm{importance}(\hat{S}) + \mathrm{weight} \cdot \mathrm{completeness}(\hat{S})$
     [Figure: example anonymized schemas S1, S2, S3 built from abstract attributes such as {Holder}, {CC}, {Name, Holder}, {CC, Num}, compared by importance and completeness]
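A minimal sketch of the utility function, under stated assumptions: popularity is taken to be the number of schemas an attribute appears in, importance is the plain sum of those counts over every attribute kept in the anonymized schema, and the three example schemas are invented for illustration. The slides give only the overall form, so the exact aggregation here is a guess.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def popularity(schemas: Dict[str, Set[str]]) -> Counter:
    """Count in how many schemas each attribute appears."""
    counts: Counter = Counter()
    for attributes in schemas.values():
        counts.update(set(attributes))
    return counts


def utility(anonymized: Iterable[Set[str]],
            schemas: Dict[str, Set[str]],
            weight: float = 1.0) -> float:
    """importance + weight * completeness, as on the slide."""
    abstract_attributes = list(anonymized)
    pop = popularity(schemas)
    # Importance: sum of popularity over all attributes that are kept.
    importance = sum(pop[a] for abstract in abstract_attributes for a in abstract)
    # Completeness: number of abstract attributes.
    completeness = len(abstract_attributes)
    return importance + weight * completeness


# Hypothetical schema group (not taken from the slides).
schemas = {"s1": {"Name", "CC"}, "s2": {"Holder", "CC"}, "s3": {"Name", "Num"}}

print(utility([{"Name", "Holder"}, {"CC", "Num"}], schemas))  # 8.0
print(utility([{"Holder"}], schemas))                         # 2.0
```

With `weight=1.0`, the first anonymized schema scores higher because it keeps more popular attributes and more abstract attributes.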
  8. Challenge 4 – Optimization problem (1)
     • Maximizing Anonymized Schema: given a schema group $S$ and a set of privacy constraints $\Gamma$, construct an anonymized schema $S^*$ such that $S^*$ satisfies all constraints in $\Gamma$ and has the highest utility value.
     • NP-hard problem …
  9. Challenge 4 – Optimization problem (2)
     • Problem modeling
       – Schema group: affinity matrix
       – Anonymized schema: affinity instance
         • An affinity instance is an affinity matrix with some empty cells
     • Example (from the figure), with schemas $s_1 = \{a1, a2\}$, $s_2 = \{b1, b2\}$, $s_3 = \{c1, c2\}$:
       – Affinity matrix: rows (a1, b1, c1) and (a2, b2, c2)
       – Affinity instance with rows (a1, b1, empty) and (a2, b2, c2) = anonymized schema {a1, b1}, {a2, b2, c2}
       – Affinity instance with rows (a1, b1, c1) and (empty, b2, empty) = anonymized schema {a1, b1, c1}, {b2}
     → Need to find an affinity instance that satisfies the privacy constraints and has the highest utility value
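The affinity-matrix modeling can be sketched as follows. This assumes one row per group of similar attributes and one column per schema, with `None` standing for an empty cell; the function names are introduced here for illustration and do not come from the slides.

```python
from typing import FrozenSet, List, Optional

Matrix = List[List[Optional[str]]]


def is_instance_of(instance: Matrix, matrix: Matrix) -> bool:
    """An affinity instance equals the affinity matrix except for emptied cells."""
    if len(instance) != len(matrix):
        return False
    for inst_row, full_row in zip(instance, matrix):
        if len(inst_row) != len(full_row):
            return False
        for cell, full_cell in zip(inst_row, full_row):
            if cell is not None and cell != full_cell:
                return False
    return True


def to_anonymized_schema(instance: Matrix) -> List[FrozenSet[str]]:
    """Turn each non-empty row into one abstract attribute."""
    abstract_attributes = []
    for row in instance:
        kept = frozenset(cell for cell in row if cell is not None)
        if kept:
            abstract_attributes.append(kept)
    return abstract_attributes


affinity_matrix = [["a1", "b1", "c1"],
                   ["a2", "b2", "c2"]]
instance = [["a1", "b1", None],
            ["a2", "b2", "c2"]]

print(is_instance_of(instance, affinity_matrix))  # True
print(to_anonymized_schema(instance))
```

The second call yields the two abstract attributes {a1, b1} and {a2, b2, c2}, matching the first example on the slide.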
  10. Challenge 4 – Optimization problem (4)
      • Overall solution:
        – Meta-heuristic with 2 steps
          • Greedy algorithm: find a possible solution
          • Randomized local search: find optimal solution
        – Improve performance
          • Divide and conquer: partition the set of constraints into independent sets → satisfy each set independently
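The two-step meta-heuristic can be illustrated with a generic stand-in. The slides do not give the authors' greedy rule or neighbourhood, so the sketch below assumes a simple "add cells while feasible" greedy pass followed by random single-cell toggles; `is_feasible` and `utility` are hypothetical callables supplied by the caller, and an empty instance is assumed to be feasible.

```python
import random
from typing import Callable, List, Optional

Matrix = List[List[Optional[str]]]


def greedy_then_local_search(matrix: Matrix,
                             is_feasible: Callable[[Matrix], bool],
                             utility: Callable[[Matrix], float],
                             steps: int = 200,
                             seed: int = 0) -> Matrix:
    rng = random.Random(seed)
    rows, cols = len(matrix), len(matrix[0])

    # Step 1 (greedy): start from an empty instance and keep every cell
    # whose addition leaves the instance feasible.
    current: Matrix = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            current[i][j] = matrix[i][j]
            if not is_feasible(current):
                current[i][j] = None
    best = [row[:] for row in current]

    # Step 2 (randomized local search): toggle one random cell and accept
    # the move if it stays feasible and does not lower the utility.
    for _ in range(steps):
        i, j = rng.randrange(rows), rng.randrange(cols)
        candidate = [row[:] for row in current]
        candidate[i][j] = None if candidate[i][j] is not None else matrix[i][j]
        if is_feasible(candidate) and utility(candidate) >= utility(current):
            current = candidate
            if utility(current) > utility(best):
                best = [row[:] for row in current]
    return best


def filled(instance: Matrix) -> int:
    """Toy utility: number of non-empty cells."""
    return sum(cell is not None for row in instance for cell in row)


matrix = [["a1", "b1", "c1"], ["a2", "b2", "c2"]]
# Toy constraint: at most four cells may be revealed.
result = greedy_then_local_search(matrix, lambda m: filled(m) <= 4, filled)
print(filled(result))  # 4
```

Like any local search, this offers no guarantee of reaching the optimum, which is consistent with the problem being NP-hard.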
  11. Experiments - Setting
      • Datasets:
        – Real data: 117 schemas
        – Synthetic data: vary the number of schemas and the number of attributes
      • Evaluation metrics:
        – Utility loss: measures the reduction in utility caused by the privacy constraints
          • $\Delta u = \frac{u_\emptyset - u_\Gamma}{u_\emptyset}$, where $u_\emptyset$ is the utility without constraints and $u_\Gamma$ is the utility with a set of constraints $\Gamma$
        – Privacy loss: measures the disagreement between actual privacy $P = \{p_i\}$ and expected privacy $\Theta = \{\theta_i\}$
          • $\Delta p = KL(P \parallel \Theta) = \sum_i p_i \log \frac{p_i}{\theta_i}$
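The two evaluation metrics translate directly into code. In this sketch the logarithm is assumed to be natural (the slide does not state a base), terms with $p_i = 0$ are skipped by the usual convention, and every $\theta_i$ is assumed to be positive.

```python
import math
from typing import Sequence


def utility_loss(u_unconstrained: float, u_constrained: float) -> float:
    """Relative utility reduction: (u_empty - u_gamma) / u_empty."""
    return (u_unconstrained - u_constrained) / u_unconstrained


def privacy_loss(actual: Sequence[float], expected: Sequence[float]) -> float:
    """KL(P || Theta) = sum_i p_i * log(p_i / theta_i), natural log assumed."""
    return sum(p * math.log(p / theta)
               for p, theta in zip(actual, expected)
               if p > 0)


print(utility_loss(10.0, 8.0))                 # 0.2
print(privacy_loss([0.5, 0.5], [0.5, 0.5]))    # 0.0
```

When the actual probabilities equal the thresholds, the privacy loss is zero; it grows as the actual values rise above the expected ones.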
  12. Experiments – Computation Time
      • 100 schemas, 50 attributes, 1500 constraints → running time is about 6 s
      [Figure: computation time (log2 of msec.)]
  13. Experiment – Privacy & Utility
      • Validate the trade-off between privacy and utility
      • Evaluation procedure
        – Relax constraints: increase each privacy threshold $\theta$ to $(1 + r)\theta$, where $r$ is the relaxing ratio
      • Observation
        – The more privacy is enforced, the greater the utility loss
      • Both utility loss and privacy loss are normalized to [0, 1]:
        $\Delta u = \frac{\Delta u - \min_{\Delta u}}{\max_{\Delta u} - \min_{\Delta u}}$, $\Delta p = \frac{\Delta p - \min_{\Delta p}}{\max_{\Delta p} - \min_{\Delta p}}$
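The relaxation and normalization steps of this evaluation procedure can be sketched as below. The function names are introduced here; returning zeros when all values are equal is an assumption made to avoid division by zero, not something the slides specify.

```python
from typing import List, Sequence


def relax(thresholds: Sequence[float], r: float) -> List[float]:
    """Raise every privacy threshold theta to (1 + r) * theta."""
    return [(1 + r) * theta for theta in thresholds]


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale measured losses to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series carries no spread, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(relax([0.2, 0.4], r=0.5))
print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

Normalizing both loss series the same way puts them on a common scale, so the privacy and utility curves can be compared on one plot.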
  14. Conclusion
      • Introduced schema reuse with privacy constraints
      • Defined privacy constraints
      • Defined an anonymized schema from multiple schemas
      • Defined a utility function for a certain anonymized schema
      • Constructed an anonymized schema that satisfies privacy constraints and maximizes the utility function
  15. Thank you! Questions?
