A Structure Preserving Approach for Securing XML Documents

The talk I did at TrustCol 2007, NY, USA

  1. 1. A Structure Preserving Approach for Securing XML Documents • TrustCol-2007 • Mohamed Nabeel (nabeel@cs.purdue.edu) • The Department of Computer Science, Purdue University
  2. 2. Outline • Introduction and Basic Concepts • Annotation and Encoding Scheme • Enforcing and Verifying Security Requirements • Experimental Results • Conclusion and Future Work
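  3. 3. Secure Sharing • Hierarchical Data such as XML • Correct Data • Access Control [Figure: Alice's document tree with nodes A to L, parts of which are shared with Bob]
  4. 4. Secure Sharing – Access Control • Apply Access Control Policy [Figure: the full tree (A; B, C, D; E to J; K, L) and the subtree Bob receives after the policy is applied (B; E, F; K, L)]
  5. 5. Secure Sharing – Correct Data • Eve has modified the values • Eve has dropped elements [Figure: Bob's subtree rooted at B as sent, as received with values replaced by X and Y, and as received with elements missing]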
  6. 6. Why Preserve Structure • Partial access to secured documents • Applying content filters • Querying secured documents [Diagram labels: Late Processing, High Scalability]
  7. 7. Message Level Security • Point-to-point (P2P) vs. end-to-end (E2E) – Transport level security (HTTPS, IPSec, etc.) is sufficient to provide P2P security – But E2E security requires more than transport level security – We need message level security [Figure: Source → Intermediary → Destination; P2P spans a single hop, E2E spans source to destination]
  8. 8. Typical Distributed Setting • Three-tier architecture: Document Source(s), Intermediaries, Clients [Diagram labels: Scalable Systems, Message Level Security]
  9. 9. XML Node Orderings • Two types of ordering 1. Hierarchical ordering 2. Sibling ordering • Which orderings are significant? • What is the relationship between them? • How do schema validation tools treat these orderings?
  10. 10. XML Node Orderings • Is hierarchical ordering significant? – Yes, it is! • Is sibling ordering significant? – It depends on the application • Two orderings, hence two-level structural integrity
  11. 11. XML Node Orderings
       Document 1: <Review> <p>Einstein is a <b>genius</b>; <b>ordinary</b> people may not understand his work.</p> </Review>
       After XSLT: Einstein is a genius; ordinary people may not understand his work.
       Document 2 (the two <b> siblings swapped): <Review> <p>Einstein is a <b>ordinary</b>; <b>genius</b> people may not understand his work.</p> </Review>
       After XSLT: Einstein is a ordinary; genius people may not understand his work.
       • Sibling ordering in document centric applications is significant
  12. 12. XML Node Orderings
       person table: columns firstname, country, major; row: nabeel, sri lanka, cs
       <person> <firstname>nabeel</firstname> <country>sri lanka</country> <major>cs</major> </person>
       <person> <country>sri lanka</country> <firstname>nabeel</firstname> <major>cs</major> </person>
       Class Person { String firstname; String country; String major; };
       • Sibling ordering in data centric applications may not be significant
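A minimal sketch of the point made on this slide, assuming a data-centric consumer that maps child elements to named fields. The helper names `to_record` and `child_order` are illustrative and not part of the talk. Two `person` documents that differ only in sibling order produce the same record.

```python
import xml.etree.ElementTree as ET

DOC_A = ("<person><firstname>nabeel</firstname>"
         "<country>sri lanka</country><major>cs</major></person>")
DOC_B = ("<person><country>sri lanka</country>"
         "<firstname>nabeel</firstname><major>cs</major></person>")


def to_record(xml_text):
    """Map each child element name to its text, discarding sibling order."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def child_order(xml_text):
    """Return the child element names in document order."""
    return [child.tag for child in ET.fromstring(xml_text)]


record_a = to_record(DOC_A)
record_b = to_record(DOC_B)
same_record = record_a == record_b                        # field-wise comparison
same_order = child_order(DOC_A) == child_order(DOC_B)     # order-sensitive comparison
```

Here `same_record` holds while `same_order` does not, which is why an application of this kind may choose to protect only hierarchical ordering.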
  13. 13. Information Leakage • Direct Leakage • Indirect Leakage • No Leakage [Figure: three versions of a tree with nodes A to L, with parts encrypted under Key K1 and Key K2; labels: Bob only knows K1, Hiding the existence, No Information Leakage]
  14. 14. One Example • Delta-publishing [Figure: First Message at t1, Second Message at t2, Delta-Message at t2] • The smallest unit of change: an element
  15. 15. Our Approach • Recognize the two-level ordering • Provide E2E security for hierarchical data • Reason about security at the level of the smallest possible change • Minimal indirect information leakage
  16. 16. Next • Introduction and Basic Concepts • Annotation and Encoding Scheme • Enforcing and Verifying Security Requirements • Experimental Results • Conclusion and Future Work
  17. 17. XML Document • A Graph G = {V, v, E, f, g}
       – V = V_e ∪ V_a ∪ V_r where V_e = {x | x is an element}, V_a = {x | x is an attribute}, V_r = {x | x is a node not in V_e ∪ V_a}
       – v = document root
       – E = E_e ∪ E_a ∪ E_r where E_e = {e | e is an edge representing an element-element connection or a link}, E_a = {e | e is an edge representing an element-attribute connection}, E_r = {e | e is an edge not in E_e ∪ E_a but starts from an element}
       – f: E → L where L = {l | l is a node name or an attribute name or a pre-defined label}; f is called the labeling function
       – g: (V_e, i) → V_er where g returns the ith child of an element in V_e, and V_er = V_e ∪ V_r
  18. 18. XML Document • Example
       <?xml version="1.0" encoding="UTF-8" ?> <quote type='bid'> <market>NY</market> <price cur='USD' size='5m'>750</price> </quote>
       [Figure: the graph of this document rooted at v. Circles are elements (quote, market, price), squares are attributes (type = bid, cur = USD, size = 5m), ellipses are other nodes (text NY, text 750)]
  19. 19. Properties of the Annotation Scheme • Two independent annotation schemes, one for hierarchical ordering and one for sibling ordering • Time complexity = O(height of the XML DOM tree) • Provides the flexibility to annotate incrementally
  20. 20. Concurrent Visitor Pattern
  21. 21. Hierarchical Ordering • Should be able to unambiguously identify parent-child relationships • Annotate each element with its parent HID • Element HIDs need not be unique • Example: using XPaths as HIDs – Element x is the parent of y – Annotate y with h(XP_x || name of y), where h is a collision-resistant hash function and XP_x is the XPath of x • XPath sequence numbers are left out in order to prevent indirect information leakage.
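The XPath-based annotation on this slide can be sketched as follows. This is an illustration under stated assumptions, not the talk's library: SHA-256 stands in for the collision-resistant hash, the parent XPath and the element name are joined with "/", and the attribute name `hid` is hypothetical. Positional predicates such as `[2]` are never added to the path, matching the slide's remark about sequence numbers.

```python
import hashlib
import xml.etree.ElementTree as ET


def hid(parent_xpath, child_name):
    """Hash of the parent's XPath joined with the child's name (SHA-256 assumed)."""
    data = f"{parent_xpath}/{child_name}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def annotate(element, parent_xpath=""):
    """Attach a hypothetical 'hid' attribute to every element, top down."""
    element.set("hid", hid(parent_xpath, element.tag))
    own_xpath = f"{parent_xpath}/{element.tag}"  # no positional index is added
    for child in element:
        annotate(child, own_xpath)


root = ET.fromstring(
    "<quote><market>NY</market><price>750</price><price>765</price></quote>"
)
annotate(root)
prices = root.findall("price")
```

Both `price` siblings receive the same value, which illustrates why HIDs need not be unique: the value ties an element to its position in the hierarchy and says nothing about how many siblings it has.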
  22. 22. Sibling Ordering • Maintain the following condition – Given that elements x and y are siblings and x is to the left of y, seq_x < seq_y, where seq_x and seq_y are secure random numbers assigned to x and y respectively • Secure random numbers make it difficult to infer anything about hidden elements, thus preventing indirect information leakage.
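One way to satisfy the sibling-ordering condition is sketched below. The slide does not say how the numbers are generated, so the approach here (accumulating random gaps drawn with Python's `secrets` module) and the constant `MAX_GAP` are assumptions made for illustration.

```python
import secrets

MAX_GAP = 2**32  # hypothetical upper bound on the random gap between siblings


def sibling_sequence(count):
    """Return `count` strictly increasing numbers separated by random gaps."""
    seqs = []
    current = 0
    for _ in range(count):
        current += 1 + secrets.randbelow(MAX_GAP)  # gap is at least 1
        seqs.append(current)
    return seqs


def in_sibling_order(seqs):
    """Check that each number is smaller than the one to its right."""
    return all(left < right for left, right in zip(seqs, seqs[1:]))


seqs = sibling_sequence(5)
visible = seqs[::2]  # a receiver who may only see siblings 1, 3 and 5
```

The visible subset still passes `in_sibling_order`, and because the gaps are random, their sizes give the receiver little basis for guessing how many siblings were withheld.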
  23. 23. Encoding Scheme [Figure: the example quote graph before encoding (elements and non-elements) and after encoding (only elements, with text moved into content attributes)] • High reduction in |V| and |E| for document-centric applications.
  24. 24. Encoding Scheme • New Graph G' = {V', v, E', f', g'} • V' = V ∪ {x | x is an attribute for ID, seq or content} - V_r • E' = E ∪ {e | e is an element-attribute edge for ID, seq or content} - E_r • f': V' → L' where L' = L ∪ {ID, seq, content} • g': (V_e, i) → V_e where V_e consists only of elements
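The effect of the encoding on the running quote example can be sketched as below. This is a simplified illustration and not the talk's encoder: it handles only the `content` attribute (not ID or seq), and it assumes each element holds either text or child elements, so text that follows a child element in mixed content is simply dropped.

```python
import xml.etree.ElementTree as ET


def encode(element):
    """Move element text into a 'content' attribute so only elements remain."""
    text = (element.text or "").strip()
    if text:
        element.set("content", text)
    element.text = None
    for child in element:
        encode(child)
        child.tail = None  # sketch only: mixed-content tail text is discarded
    return element


root = encode(ET.fromstring(
    "<quote type='bid'><market>NY</market>"
    "<price cur='USD' size='5m'>750</price></quote>"
))
encoded = ET.tostring(root, encoding="unicode")
```

After encoding, every node reachable from the root is an element carrying attributes, so the ordering function only ever has to range over elements.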
  25. 25. Next • Introduction and Basic Concepts • Annotation and Encoding Scheme • Enforcing and Verifying Security Requirements • Experimental Results • Conclusion and Future Work
  26. 26. Integrity • Two types of integrity – Structural integrity – Content integrity • Introduce a new attribute (signed) • Attribute value = h(E.attrs || E.content) – h: hash function – E.attrs: concatenation of attribute name-value pairs of element E – E.content: content of element E • Merkle hash vs. our approach
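A minimal sketch of the per-element `signed` attribute described on this slide. Several details are assumptions made for illustration: SHA-256 as the hash, sorting attributes by name before concatenating them, the `name=value;` serialization, and leaving the `signed` attribute itself out of the input. A plain hash, as written on the slide, only detects changes; binding the value to a sender would need a keyed signature on top.

```python
import hashlib
import xml.etree.ElementTree as ET


def element_digest(element):
    """Hash of the element's attribute pairs followed by its content."""
    attrs = "".join(
        f"{name}={value};"
        for name, value in sorted(element.attrib.items())
        if name != "signed"  # the stored digest is not part of its own input
    )
    content = element.text or ""
    return hashlib.sha256((attrs + "|" + content).encode("utf-8")).hexdigest()


def sign(element):
    """Store the digest in a 'signed' attribute on the element itself."""
    element.set("signed", element_digest(element))


def verify(element):
    """Recompute the digest and compare it with the stored one."""
    return element.get("signed") == element_digest(element)


price = ET.fromstring("<price cur='USD' size='5m'>750</price>")
sign(price)
ok_before = verify(price)
price.text = "765"          # tamper with the content
ok_after = verify(price)
```

Because the digest covers a single element, checking or refreshing it never requires touching the rest of the tree, in contrast to a Merkle hash where a change propagates to every ancestor.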
  27. 27. Integrity [Figure: the original tree (A; B, C, D; E to J; K, L) and four tampered versions of the subtree that Bob receives, labeled: Content Integrity is violated, Sibling Integrity is violated, Completeness is violated, Hierarchical Integrity is violated]
  28. 28. Confidentiality • Content of each element is encrypted • Introduce a new attribute (encrypted) • Attribute value = key_s(key_r || key_r(E.attrs || E.content || E.signed)) – key_r: a randomly generated key – key_s: shared key – E.attrs: concatenation of attribute name-value pairs of element E – E.content: content of element E – E.signed: digital signature computed for E
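The two-layer structure of the `encrypted` attribute (a fresh random key encrypts the element payload, and the shared key encrypts the random key together with that ciphertext) can be sketched as follows. The cipher here is a toy SHA-256 keystream written only so the example runs with the standard library; it is NOT secure and is an assumption of this sketch, as are all function names and the key and nonce sizes. A real implementation would use a vetted cipher such as AES.

```python
import hashlib
import secrets

KEY_LEN = 32    # assumed key size in bytes
NONCE_LEN = 16  # assumed nonce size in bytes


def _keystream(key, nonce, length):
    """Toy keystream: SHA-256 over key, nonce and a counter. Not secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key, plaintext):
    """Prefix a fresh nonce, then XOR the plaintext with the keystream."""
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def toy_decrypt(key, ciphertext):
    """Split off the nonce and XOR the body with the same keystream."""
    nonce, body = ciphertext[:NONCE_LEN], ciphertext[NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def encrypt_element(shared_key, payload):
    """Shared key wraps (random key || payload encrypted under the random key)."""
    random_key = secrets.token_bytes(KEY_LEN)
    inner = toy_encrypt(random_key, payload)
    return toy_encrypt(shared_key, random_key + inner)


def decrypt_element(shared_key, encrypted):
    """Unwrap with the shared key, recover the random key, decrypt the payload."""
    opened = toy_decrypt(shared_key, encrypted)
    random_key, inner = opened[:KEY_LEN], opened[KEY_LEN:]
    return toy_decrypt(random_key, inner)


shared_key = secrets.token_bytes(KEY_LEN)
payload = b"cur=USD;size=5m;|750|signed-value"  # attrs, content, signed (illustrative)
encrypted = encrypt_element(shared_key, payload)
recovered = decrypt_element(shared_key, encrypted)
```

Each element gets its own random key, so re-encrypting one element after an update leaves the ciphertexts of all other elements untouched.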
  29. 29. Verifying and Updating • Each element can be verified independently • Hierarchical and Sibling integrity can be verified independently • Each element can be updated independently • Structure can be updated without affecting the existing values
  30. 30. Example: Updating
       <?xml version="1.0" encoding="UTF-8" ?> <quote type='bid'> <market>NY</market> <price cur='USD' size='5m'>765</price> </quote>
       • Re-calculate signed and encrypted attributes only for this element
       [Figure: the encoded tree (v, quote, market, price) with signed and encrypted attributes attached to each element]
  31. 31. Next • Introduction and Basic Concepts • Annotation and Encoding Scheme • Enforcing and Verifying Security Requirements • Experimental Results • Conclusion and Future Work
  32. 32. Global vs. Local Annotation [Chart: time taken to annotate (ms, axis 0 to 400) against number of elements in the XML document (in units of 500, 1 to 8); series: Local Annotation, Global Annotation]
  33. 33. Updating XML Document [Chart: time taken (ms, axis 0 to 800) against percentage of the document updated (1 to 5); series: Our Scheme, W3C Scheme]
  34. 34. Division of Labor [Chart: time taken (axis 0 to 45000, unit not given) against number of elements in the XML document (in units of 500, 1 to 8); series: encoding, signing, encrypting]
  35. 35. Outline • Introduction and Basic Concepts • Annotation and Encoding Scheme • Enforcing and Verifying Security Requirements • Implementation and Experimental Results • Conclusion and Future Work
  36. 36. Conclusion and Future Work • We presented an approach to securing XML documents while preserving their structure • We plan to extend the work presented to – Explore ways to reduce the signing time – Explore possible hybrid combinations of our approach and the standard approach • We are planning to publish the library under the ASF license
  37. 37. Questions
  38. 38. Thank You!
  39. 39. Merkle Hash
  40. 40. Visitor Pattern
  41. 41. W3C Digital Signature
