Security on the Web: A Semantic-Aware 
Authorization Framework 
for Secure Data Sharing 
PhD Dissertation Defense 
July 7, 2008 
Amit Jain 
Research Advisor: Dr. Csilla Farkas 
Center for Information Assurance Engineering 
Department of Computer Science & Engineering 
University of South Carolina
Presentation Agenda 
¯ Introduction: 
ª Web From Past to Present 
ª Future Trends 
¯ Background 
ª Research Challenges 
² Security Reliance on XML Syntax 
² XML to RDF ontology mappings 
² RDF Ontology Security 
¯ Proposed Solution 
ª Contribution: Semantic Aware Secure Data Sharing Framework 
ª RDF Authorization Model 
ª Semantic Mappings between XML and domain Ontologies 
ª Authorization Policies derivation for XML data 
¯ Prototype 
¯ Conclusion & Future Work 
¯ References
Introduction 
State of Security 
¯ Security is usually an afterthought in application 
development 
ª Recent data breaches and ID theft 
ª Unauthorized access to secure information 
¯ Developing an authorization model requires understanding 
ª The data model 
ª Existing access control models 
ª The inadequacies of existing work 
ª Software application development trends
Security Terms: 
¯ Authentication 
ª "where does this (part of a) message come from?" 
¯ Authorization (access control) 
ª "may this message be disclosed to the 
requesting party?" 
¯ Confidentiality 
ª "who can read this (part of a) message?" 
¯ Integrity 
ª "has this (part of a) message been tampered with?" 
¯ Audit 
ª "what happened?" 
¯ Administration 
ª "how do I manage this?"
Software Application Evolution 
¯ Past Web 
ª Static HTML Web Pages – rendering focus 
ª Data consumed by Humans 
¯ Move towards Web Applications 
ª Web Apps are the trend [1] 
ª Wide audience and reach 
ª Thin clients, only browser required 
¯ Successful trend making applications 
ª Google search engine 
ª Mash Ups 
ª Social Networking Applications (MySpace, Facebook) 
ª Multimedia sharing Applications (YouTube, Slideshare)
Web Applications 
¯ Some Web Architectures 
ª Web Services 
ª Web 2.0 
ª Semantic Web
Future Web Applications 
Characteristics 
¯ From Static to Data Centric & Automated 
¯ Data & Information Sharing 
ª Enterprise applications sharing data on the web 
ª Data exchanged by service oriented applications 
ª Reconciliation/ Interoperation of distributed data 
¯ Web with a meaning 
ª Data & Information annotated with Semantics 
ª Machine-understandable information 
ª Intelligent Software & Agents 
ª Automated usage
Open Data Standards 
¯ Some data sharing initiatives underway 
ª Data portability - taking person’s data and 
friends from one site to another. 
(DataPortability.org) 
ª OpenID- portable identity; single sign-on 
ª OpenSocial - Google initiative for social 
networks, enabling developers to create widgets 
with one set of code; MySpace, Facebook 
ª APML - growing ‘Attention’ standard; Person’s 
Attention Data is all the information online about 
what one reads, writes, shares and consumes 
Web Services Architecture 
¯ Loosely coupled 
¯ An application performs a function and is exposed to the network 
ª Flight Departure Service 
ª Geo-Location Service 
ª Flight Departure Monitoring Service 
ª Data Transformation-Interchange Service 
¯ Web Services advertise & communicate using standard protocols 
ª WSDL (Web Service Description Language) 
ª SOAP (Simple Object Access Protocol) 
¯ Applications assembled from services dynamically 
¯ Uses XML for data format 
¯ Exchange data and results 
¯ Platform Independent & Language Neutral 
Web Services 
[Figure: Web Service 1, Web Service 2, a Registry, and a Backend Database]
A Web Service Request – XML 
Format 
<?xml version="1.0"?> 
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"> 
  <SOAP-ENV:Body> 
    <s:GetWeatherForecast xmlns:s="http://www.WeatherService.com/"> 
      <!-- Parameters passed with the method call, such as the ZIP code --> 
    </s:GetWeatherForecast> 
  </SOAP-ENV:Body> 
</SOAP-ENV:Envelope>
Semantic Web 
¯ Machines talking to machines 
¯ Making the Web more 'intelligent' 
¯ Bottom Up = annotate, metadata, RDF! 
¯ Top Down = Simple 
Top-down: 
• Leverage existing web 
information 
• Apply specific, vertical 
semantic knowledge 
• Deliver the results as a 
consumer-centric web 
app
Semantic Apps 
What is a Semantic App? 
- Not necessarily W3C Semantic Web 
- An app that determines the meaning of text and other data, and then creates 
connections for users 
- Data portability and connectability are key (ref: Nova Spivack) 
Example: Calais 
Reuters, the international business 
and financial news giant, launched an 
API called Open Calais in Feb 08. 
The API does a semantic markup on 
unstructured HTML documents - 
recognizing people, places, 
companies, and events. 
Ref: 
Reuters Wants The World To Be 
Tagged; Alex Iskold, ReadWriteWeb, 
Feb 08 
More Semantic Apps 
Other Products to watch: 
¯ Twine 
¯ Freebase 
¯ Powerset 
¯ Talis 
¯ TrueKnowledge 
¯ AdaptiveBlue 
¯ TripIt 
¯ Spock 
¯ Quintura 
¯ Hakia 
Ref: 10 Semantic Apps to Watch; Richard MacManus, 
ReadWriteWeb, Nov 07
Semantic Web - Ontologies 
¯ Ontologies 
ª Represent Semantics of data in a domain 
ª Enables Knowledge Management 
ª Consist of resources, their attributes and 
relationships 
ª Languages: 
² Resource Description Framework (RDF) 
² Web Ontology Language (OWL)
Information Management & 
Security Issues 
¯ Information Integration 
ª How to reconcile data from disparate sources? 
¯ Mediation Layer 
ª How to provide a global view of local data 
sources? 
ª Performance, Restructuring, Mapping Issues? 
¯ Security Issues 
ª Securely share data among applications/agents? 
ª Accountability? 
ª Trust among interacting agents?
Research Problem: Data & 
Information Security 
Framework 
[Figure: framework architecture] 
RDF Authorization Component: RDF Data, RDF Security Policies, RDF Security Cover 
XML Authorization Component: XML – RDF Mappings, Propagation Policy & Conflict Resolution, Secure XML View Generation, XML Access Control 
Input: XML Data, with Semantic Mappings: XML -> RDF
SOA/Corporate Data 
Restructuring 
¯ XML data restructuring / remapping 
ª Requirement in several cases like BI, Dynamic Web 
Service Composition 
ª AquaLogic based on Xquery 
ª May be done on the fly for data interchange with partners 
unseen before 
20 
¯ Corporate Merging 
ª Data from different heterogeneous schemas 
ª How to decide security permissions for them ? 
¯ XML data creation from legacy systems 
ª Nodes with non-meaningful labels may be given 
inconsistent security
Web Service Data Sharing Scenario 
[Figure: a Hospital (Health Provider) and an Insurance Company exchanging XML; XML to RDB at the Insurance Company DB]
Data Sharing Scenario contd. 
[Figure: two XML trees holding the same patient data in different structures, each with its own policy] 
XML 1 nodes: Patient, Medical Data (Illness, Prescription), Personal Data (BirthDate, SSN) 
XML 2 nodes: Patient, Data, Health Records (Diagnosis, Drug), Information (SSN, BirthDate) 
Policy 1: <WS1, Patient/MedicalData/Illness, TS>, <WS1, Patient/MedicalData/Prescription, TS> 
Policy 2: <WS2, Patient/Data/HealthRecords/Drug, S>, <WS2, Patient/Data/HealthRecords/Diagnosis, S>
Existing XML Security Methods 
¯ Web Data Security & Assurance 
¯ Several Security Standards Available 
ª XML Signature 
ª XML Encryption 
ª XML Access Control Models 
² Bertino et al, Kudo et al, Damiani et al, etc. 
¯ Web Services Security 
ª WS-Security 
ª SAML 
ª WS-*
Research Problem 1: 
¯ How can XML data be restructured so that the 
security policies are unaffected and no data 
leak occurs? 
¯ Intuitive Solution: Use the meaning/semantics of 
the data while restructuring 
¯ Is it possible to find a standardized way to express 
the intended semantics of XML, to be used for 
interoperation and security?
RDF : Information Overload 
Management solution 
¯ Use RDF to represent semantics embedded in XML 
documents. 
¯ RDF (Resource Description Framework): seen as 
the solution for Information Overload 
¯ Provides Universal Sharing (using URI) 
¯ Syntax independent 
¯ Provides Semantics 
¯ Can be used to say anything about anything 
¯ Critical tool in Linking data (Not just the documents) 
between the applications on the data web 
Commercial RDF Applications 
¯ NASA uses it extensively for document management 
¯ Companies are moving towards RDF-backed data creation 
and social networking apps 
¯ Millions of data triples asserted to bootstrap the knowledge 
sources (DBpedia, Twine, Freebase) 
ª Freebase stores millions of information entities in RDF 
format [2] 
ª TWINE: RDF based social networking application [3] 
¯ Several Government Agencies are using RDF 
¯ Social Networking Graphs 
ª Each person is a data generation warehouse 
Securing RDF 
¯ Commercial RDF applications underscore the 
security need 
¯ Agencies sharing their meta information RDF 
models: Need to securely share partial data 
¯ Providing users with a fine grained access control 
to their own data: Still missing from all the 
applications 
RDF characteristics 
¯ Properties provide link between the connected 
entities 
¯ Subclasses, Sub properties need to be considered 
¯ Type to Class relationships 
¯ Same resource may have different security 
requirements in different roles 
¯ Entailments need to be considered 
Database Integration 
¯ Usually deals with legacy databases integrated 
into a single view 
¯ Focuses on the tuples at the instance level 
¯ RDF deals with the conceptual schema 
ª Classes and their properties 
ª Instances 
¯ RDF can be used as a conceptual layer on top of 
integrated database views
RDF Resource Associations 
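[Figure: RDF schema and instance graph whose nodes and edges carry security labels S and P. Labels in the RDF Schema part include Business Entity, Commercial, Corporate, Start Up, ExternalFunded, Fully Funded, Research Company, Military Research Wing, Labs, Facilities, Fusion Capabilities, Chemicals, and Toxic, connected by subclass, contains, Consumes, and are edges. The RDF Instance part contains XYZ and Uranium, attached by rdf:type.]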
RDF Entailments 
[Figure: the RDF schema and instance graph from the previous slide (Business Entity, Commercial, Corporate, Start Up, ExternalFunded, Fully Funded, Research Company, Military Research Wing, Labs, Facilities, Fusion Capabilities, Chemicals, XYZ, Uranium; edges subclass, contains, Consumes, rdf:type), shown with additional S and P security labels.]
Research Problem 2 
¯ How can RDF data be assigned authorizations that 
consider its semantics and entailment requirements? 
¯ How can security requirements for RDF ontology 
data be expressed such that the model 
ª Is syntax independent 
ª Considers RDF semantics 
ª Incorporates entailment?
Existing Literature 
¯ RDF security 
ª Qin and Atluri [4]: A concept-level access control for web data, where 
access control is defined on ontological concepts and instances of 
these concepts inherit the access control of the concepts they belong 
to. 
ª Finin et al. [5]: A policy based access control model for RDF data in 
an RDF store. Provides control over the different action modes 
supported by the RDF store like inserting a set of triples, deleting a 
triple, and querying a triple. 
ª Dietzold and Auer [6]: An access control model for RDF Triple Wikis. 
Their model allows the specification of custom rules that can be used 
for securing access to the store. 
ª Kaushik et al. [7]: A logic based policy language for securing full or 
partial ontologies.
Research Problem 3: Semantics 
Definition and Security Derivation 
¯ Based on Previous Research Problems 
ª How to use RDF to provide authorization 
permissions for XML 
¯ Intuitive Solution: Derive mappings between the 
XML and RDF data and use the mappings to 
carry security policies from RDF domain 
ontologies over to XML data, securing it in a 
syntax independent way
Existing Literature 
¯ Semantic Data Integration 
ª Xiao & Cruz [8]: 
² An ontology-based approach for integration of 
heterogeneous XML sources. 
² Converts XML data sources into an RDF ontology. 
Local ontologies are merged to create a global 
ontology. 
ª Several engineering-based works 
² Gloze [9] 
² WEESA [10] 
² Try to induce semantics in XML based on discovery.
Characteristics & Shortcomings of 
Current Security Methods 
¯ XML is the de facto standard 
ª No semantics embedded in XML 
ª XML Security Methods Syntax Oriented 
ª No Data or application semantics 
ª Syntax tampering allows unauthorized access 
37 
¯ No Association Protection 
ª Data appearing together might need to be classified 
¯ No Entailment Consideration 
ª Data entailments may allow incorrect security labeling 
¯ No Protection Mechanism for Syntax Independent 
Ontologies
Proposed Research: Semantic Aware 
Secure Data Sharing Framework 
Research Goal: Semantic Aware 
XML Access Control Model 
¯ Develop a Comprehensive Security Framework 
ª Works in open and distributed environment settings 
ª Uses data & application semantics for data sharing 
ª Flexible Security Policies (Fine Granularity) 
ª Provides Security on Metadata (Semantics) 
ª Semantic enhancement of XML web data 
ª Provides access control independent of the XML data 
syntax 
ª Has properties such as completeness and consistency
RDF Security Policy 
¯ Typical Security Policy Components: 
ª (Subject, Object, Privilege/Security label) 
ª <s,o,±pri/sl> 
¯ RDF security object / pattern [x, y, z] 
¯ Security classification ([x, y, z], TS) 
¯ Security Objects Subsumption 
¯ Association protection 
¯ Fine grained – Individual elements/two elements 
¯ Policies in RDF format 
Pattern Mapping 
¯ Pattern mapping from an RDF triple to a group of 
triples 
ª Generates Security Cover 
ª Conflict Resolution 
ª Consistent security labeling 
Security Cover 
¯ Materialized view of the secure RDF/S database 
¯ Consists of pairs (t,sl) 
ª Minimal 
ª Complete
RDF Security Architecture 
[Figure: RDF security architecture. Components shown: RDF/S Native Database; Policies Database (Security Policies, Rules); Simple Conflict Resolution producing the RDF/S Security Cover; Inference Engine (JENA) with Inference Rules; Inference Conflict Resolution producing the Entailed Security Cover; Querying & Security Monitor with Analysis and History. Flows shown: Query, Forwarded Query, Returned Query Results, Answer, Denial.]
Mapping a Default Policy 
¯ Default policy 
² ([x, y, z], TS) 
(Student, rdfs:subClassOf, Person) 
(University, rdfs:subClassOf, GovAgency) 
(studiesAt, rdfs:domain, Student) 
(studiesAt, rdfs:range, University) 
(John, studiesAt, USC)
Simple Conflict Resolution 
¯ Subsuming patterns have less restrictive security 
classifications 
¯ Based on the “more restrictive takes precedence” 
resolution 
Conflict resolution: Pattern 
Mapping 
¯ Conflict Resolution 
ª ([Student, studiesAt, University], P) 
ª ([John, studiesAt, USC], S) 
ª Resulting security object: ([John, studiesAt, USC], S)
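The resolution above can be sketched as taking the least upper bound of all labels that apply to a triple. This Python sketch assumes the labels form a total order P < S < TS (Public, Secret, TopSecret), in which case the least upper bound is simply the maximum; the names `LEVELS` and `resolve` are illustrative.

```python
# Illustrative sketch: "more restrictive takes precedence" over an assumed
# total order of labels, P < S < TS.
LEVELS = {"P": 0, "S": 1, "TS": 2}


def least_upper_bound(labels):
    """In a total order, the least upper bound is the highest label."""
    return max(labels, key=LEVELS.__getitem__)


def resolve(triple, labels):
    """Build the security object (triple, label) from all applicable labels."""
    return (triple, least_upper_bound(labels))


# Two patterns apply to the same triple, one labeled P and one labeled S.
resolved = resolve(("John", "studiesAt", "USC"), ["P", "S"])
```

A lattice with incomparable labels would need a real join operation instead of `max`.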
RDF Reification 
[Figure: reified statement stmt1 (rdf:type rdf:Statement) with rdf:subject John, rdf:predicate studiesAt, and rdf:object USC, annotated with confidence High, madeOnDate 07/07/2008, and madeBy Mark; nodes and edges carry security labels S and P]
RDF/S entailment 
¯ RDF/S triples entailment 
ª Inference rules application on data to infer new 
triples 
ª Generated triples are assigned security labels 
ª Inference Conflict Resolution 
Inference Conflict Resolution 
¯ The generated triple may already exist 
ª A higher security 
¯ The policy may require existing triple be classified 
at a higher level
Semantic Mappings: From XML to 
RDF Ontologies 
¯ Establish mappings between XML & RDF 
¯ Semantic Enhancement of XML using the 
mappings 
¯ Good database design entails an ER schema as 
the starting point 
¯ Use ER as the intermediate semantic 
representation model 
¯ Define mappings between XML data and the ER 
conceptual model
[Figure: Conceptual Schemas 1 to m, each with its Relational Database Schema, related through mappings α1, α2, ..., αn to the XML documents XML1, ..., XMLn]
Equivalence Classes 
¯ Foreign keys in a relation point to the Primary 
Key entity 
¯ They should be mapped only once 
¯ Equivalence classes consist of primary key and 
foreign key pairs, relation schemas and relational 
attributes
Mapping Properties 
¯ Structure Preserving Tags (SPT) 
¯ Element ordering and cardinality constraints 
¯ Mapping Function: Many to One 
¯ The mappings are 
ª Complete: contains one pair (vi,CEi) for every 
node vi of the XML schema tree X 
ª Consistent: does not have two pairs (vi,CEi) and 
(vj ,CEj) such that vi = vj and CEi != CEj , i.e., 
there is a single XML node corresponding to an 
equivalence class. 
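The two mapping properties can be checked mechanically. The Python sketch below assumes the mapping is a list of (XML node, equivalence class) pairs; the node paths and class names are hypothetical placeholders.

```python
# Illustrative sketch: the mapping as a list of (xml_node, equivalence_class) pairs.

def mapping_is_complete(pairs, schema_nodes):
    """Complete: every node of the XML schema tree appears in some pair."""
    mapped_nodes = {v for v, _ in pairs}
    return all(v in mapped_nodes for v in schema_nodes)


def mapping_is_consistent(pairs):
    """Consistent: no node is paired with two different equivalence classes."""
    assigned = {}
    for v, ce in pairs:
        if assigned.setdefault(v, ce) != ce:
            return False
    return True


# Hypothetical node paths and equivalence class names.
nodes = ["Patient/PersonalData/SSN", "Patient/PersonalData/BirthDate"]
pairs = [(nodes[0], "CE_ssn"), (nodes[1], "CE_birthdate")]

complete = mapping_is_complete(pairs, nodes)
consistent = mapping_is_consistent(pairs)
conflicting = mapping_is_consistent(pairs + [(nodes[0], "CE_birthdate")])
```

Because the mapping function is many to one, several nodes may share an equivalence class without violating consistency.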
[Figure: the same Conceptual Schemas 1 to m, Relational Database Schemas, and XML documents XML1, ..., XMLn (mappings α1, ..., αn), now related through mappings β1, β2, ..., βn to a Federated Schema/Meta-Ontology]
Mappings 
¯ Let X be the XML schema tree and O be the RDF ontology. The XML to 
RDF mappings are defined in the following way: 
• An XML leaf node vi is mapped to an RDF property p ∈ P, δ: p → (c1, 
c2), such that the datatype of the leaf node element { Ri.ai } 
corresponds to the object datatype RDF class c2 ∈ C. 
• A non-leaf node vi with sub-element nodes is mapped to an RDF class 
c ∈ C. 
• A non-leaf node vi with sub-element nodes is mapped to an RDF 
property p ∈ P such that μ(vj) = cj , μ(vk) = ck and δ(p) = (cj , ck). Here 
vj and vk are the ancestor and descendant of node vi, respectively. 
• A pair of XML nodes (vi, vj) is mapped to an RDF triple [s, p, o] where 
ed(vi, vj) is an unlabeled edge in the XML tree and p is rdf:type, 
rdfs:subClassOf, or rdfs:subPropertyOf.
Mapping Properties 
¯ Structure Preserving Classes 
¯ XML-RDF mappings are 
ª Complete: contains one pair (vi, ci) or (vi, pi) for 
every XML node vi of the XML schema tree, i.e., 
each node is associated with an RDF class or 
property. 
ª Consistent: does not have two pairs (vi, ri) and 
(vj , rj) where ri is either an RDF property pi or 
class ci such that (vi = vj) and (ri != rj), i.e., for an 
equivalence class there is a single corresponding 
RDF class or property. 
Example : XML Ontology 
Mappings 
[Figure: mappings μ1 and μ2 from XML 1 and XML 2 to a shared RDF ontology] 
RDF ontology labels: Person, Patient, Address, Date of Birth, Unique Identifier, Name, Company, Hospital, Business, Health Records, Diseases, Prescription; edges has, isa, contains, Works_for 
XML node labels (XML 1 and XML 2): Patient, Medical Data, Personal Data, Data, Information, Records, Illness, Disease, Prescription, Drug, DOB, BirthDate, SSN, PID
XML Authorizations : Policy 
derivation for XML Data 
¯ Apply XML to RDF mappings on RDF 
Authorizations to derive simple XML access control 
policies 
¯ Generated XML access control permissions have 
properties: 
ª consistency, and 
ª completeness
XML authorizations contd. 
¯ XML policies are generated as pairs of an XPath 
expression and a security label. 
¯ XML access control models can use it as an input 
for more fine grained policies 
¯ Use of meta policies like conflict resolution and 
propagation policies 
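A minimal Python sketch of the derivation idea, under stated assumptions: the XPath-to-concept mappings and the TS labels on the ontology concepts are illustrative guesses based on the running patient example, and `derive_xml_policies` is a hypothetical helper, not the algorithm from the dissertation. It shows how structurally different XML documents end up with the same labels once both map to the same ontology concepts.

```python
# Assumed XML-to-RDF mappings (XPath -> ontology concept); illustrative only.
XML_TO_RDF = {
    "/Patient/MedicalData/Illness": "Diseases",
    "/Patient/MedicalData/Prescription": "MedicinesTaken",
    "/Patient/Data/HealthRecords/Diagnosis": "Diseases",
    "/Patient/Data/HealthRecords/Drug": "MedicinesTaken",
}

# Assumed security labels on the ontology concepts.
RDF_LABELS = {"Diseases": "TS", "MedicinesTaken": "TS"}


def derive_xml_policies(xml_to_rdf, rdf_labels):
    """Return (XPath, security label) pairs, sorted by XPath."""
    return [
        (xpath, rdf_labels[concept])
        for xpath, concept in sorted(xml_to_rdf.items())
    ]


xml_policies = derive_xml_policies(XML_TO_RDF, RDF_LABELS)
labels_by_path = dict(xml_policies)
```

The resulting pairs could then be handed to an existing XML access control model as its input policies.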
Mapping Example continued 
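[Figure: ontology concepts mapped to the two XML documents, with the derived labels] 
Ontology labels: Person, Medical Records, Diseases, Medicines Taken; edges has, contains 
XML 1: Patient, Medical Data, Illness, Prescription 
XML 2: Patient, Data, Health Records, Diagnosis, Drug 
Derived labels: Illness TS, Prescription TS, Diagnosis TS, Drug TS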
Prototype 
¯ Java 1.6 as the platform 
¯ Jena 2.5 
ª Java RDF API for reading, writing, and 
manipulating RDF data 
¯ NG4J 
ª Named Graphs API for Jena 
ª Jena extension for attaching provenance to 
RDF triples 
ª Security labels are stored as the context
RAF contd. 
¯ Apache Derby for storing RDF data 
ª Java based Relational database 
ª Schema managed through Jena 
¯ Jena Rule Reasoner 
ª Inferencing model for entailment generation 
ª Applies the RDF/S entailment rules 
ª Can be used for applying business rules 
¯ IsaViz 
ª Graph library for displaying RDF/S as a graph
RAF Admin 
¯ Load RDF files, RDFS files and policy files 
ª Multiple schema, instance and policies 
63 
¯ Execute 
ª Pattern mapping 
ª Security cover generation 
¯ Run the entailment and apply security labels 
¯ Display graphical display of the ontologies 
¯ Launch SPARQL query interface
RAF Prototype 
RAF contd. 
¯ SPARQL Query Interface 
ª SPARQL RDF query protocol for querying RDF 
data 
² Type query or choose from pre-built queries 
ª Users given MAC security clearances 
ª Username, password authentication 
ª Results displayed based on user security 
clearance 
¯ System messages display to warn of conflicts 
SPARQL Querying : Public 
Querying: Secret Clearance 
Querying : TopSecret clearance. 
Dissertation Contributions 
¯ Architecture for Semantic Aware Access Control Model for 
XML data 
¯ Formal properties of the Semantic Aware Access Control 
Model 
¯ Authorization Framework for securing RDF Data 
ª RDF security policy to RDF/S data mapping algorithm 
ª RDF Entailment procedure & algorithm to check for illegal 
inferences 
ª Formal Properties of RDF Authorization model 
¯ Semantic Enhancement of XML data 
ª XML to RDF ontology mapping definitions 
ª XML to RDF correspondences properties 
¯ XML Authorization Derivation 
ª Algorithm for propagating XML authorizations 
Conclusion 
¯ Web is tending towards real time automated data 
collaboration 
¯ Secure data sharing is a challenge 
¯ Inclusion of Semantics can help 
¯ Provide security for the XML data semantics 
represented by Ontologies 
¯ Map the XML data to the domain Ontologies 
¯ Use the mappings and ontology security policies to 
create authorization permissions for XML data 
Future Direction 
¯ Extend the model to handle Updates 
¯ Use of business rules for entailment 
¯ Extend the prototype to generate the XML 
authorization derivations 
¯ Performance and results with large data-sets 
¯ Use Policy Languages like Rei, Protune for a totally 
distributed ontology based authorization system 
¯ Comparison of Security Policies in XML format 
Future Direction 
¯ Authorization model for OWL 
ª Semantics of OWL properties 
¯ Extend the mapping from XML to other structured or 
semi-structured data 
¯ Extend the direct mapping to more realistic 
scenarios: 
ª property links, subclass and subproperty links 
between the mapped entities
Mappings in a Global Scenario 
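[Figure: a Global Mapping server relating the data of web services WS1, WS2, WS3, ..., WSm to domain ontologies Ont1, Ont2, ..., Ontn through mappings υ1, υ2, ..., υm; the layers shown are Domain Ontologies, Exchanged Web Data, and Security]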
My Publications 
¯ “From XML to RDF: Syntax, Semantics, Security and Integrity” (with 
C. Farkas, V. Gowadia, and D. Roy), In Proceedings of IFIP TC-11 
WG 11.1 & WG 11.5 Joint Working Conference on Security 
Management, Integrity in Info Systems, Fairfax, Virginia, 2005. 
¯ “Semantic-Aware Data Protection in Web Services” (with C. 
Farkas, D. Wijesekera, A. Singhal and B. Thuraisingham), In 
Proceedings of IEEE Workshop on Web Service Security, Oakland, 
California, 2006. 
¯ “Secure Resource Description Framework: an Access Control 
Model” (with C. Farkas), In Proceedings of SACMAT06 (ACM 
Symposium on Access Control Models And Technologies), Lake 
Tahoe, California, 2006. 
¯ “RDF Authorization Framework: Secure Data Sharing for Web 
Services” (with C. Farkas), Under Journal Revision. 
¯ “Secure Semantic Based Data Sharing in XML Web Services” 
(with C. Farkas, D. Wijesekera, A. Singhal and B. Thuraisingham), 
Under Journal Review.
References 
1. “Some Trends in Web Application Development”, Jazayeri, Mehdi. In 
Proceedings of Future of Software Engineering (FOSE '07), USA, 2007. 
2. “Freebase data dumps”, Metaweb Technologies. 
http://download.freebase.com/datadumps/, 2008. 
3. “TWINE: The Smartest Way To Organize, Share and Discover Information 
About Your Interests”, Radar Networks, 2008. http://www.twine.com/. 
4. “Concept-level access control for the Semantic Web”. Li Qin and V. Atluri. 
In XMLSEC ’03: Proceedings of the 2003 ACM workshop on XML security, 
New York, NY, USA, 2003. ACM Press. 
5. “Policy-Based Access Control for an RDF Store”. P Reddivari, T. Finin, 
and A. Joshi. In Proceedings of the IJCAI-07 Workshop on Semantic Web 
for Collaborative Knowledge Acquisition, January 2007. 
6. “Access control on RDF triple stores from a semantic Wiki perspective”. S. 
Dietzold and S. Auer. volume 183 of CEUR Workshop Proceedings ISSN 
1613-0073, June 2006. 
7. “Policy-based dissemination of partial web-ontologies”. S. Kaushik, D. 
Wijesekera, and P. Ammann. In SWS ’05: Proceedings of the 2005 
workshop on SWS, New York, NY, USA, 2005. ACM Press. 
References 
8. “Integrating and Exchanging XML Data Using Ontologies”. H. Xiao, I. 
Cruz., J. Data Semantics VI 2006, 67-89. 
9. “Gloze: XML to RDF and back again”, Steve Battle, First Jena User 
Conference, 2006. 
10. “WEESA: Web engineering for semantic web applications”, Gerald 
Reif, Harald Gall, and Mehdi Jazayeri, In Proceedings of the 14th 
International Conference on World Wide Web, pages 722–729, New 
York, NY, USA, 2005. ACM Press.
Questions 
Appendix 
Definitions 
RDF - Patterns 
¯ An RDF pattern pt, is a triple represented as pt = [r, 
p, v], where each component of the pattern is 
either 
ª A data constant such that r є R, p є PR, and v є 
R U L, or 
ª The symbol ”-” representing the empty element 
of the triple, or 
ª A variable represented as a symbol starting 
with ?, corresponding to any value for the triple 
element 
RDF Security Policy 
¯ The security policy SP is a set of pairs SP = 
{sp1, . . . , spn} U {spdef} such that every spi has the 
form (pti, sli) and λ(pti) = sli where pti is an RDF 
pattern, sli is a security label in SL and λ is the 
security labeling function. spdef = (ptdef , sldef ) 
represents the default policy where ptdef = [?x1, ?x2, ?x3] 
is a pattern with all variables and sldef is the 
default security label such that sldef !≥ sli & sldef !≤ sli 
for any sli in SL.
RDF Pattern Mapping 
¯ Let pt = [r, p, v] and pt′ = [r′, p′, v′] be two RDF patterns and 
R be the set of Resources. Let ST and DT be the RDF 
Schema and Instance respectively. For all pattern elements 
e and e′ where e is either r,p, or v and e′ is either r’,p’, or v’ 
respectively, the pattern mapping ν: pt → pt′ is defined as: 
ª ν maps a variable e to a resource e′ Є R. 
ª ν preserves all constants (i.e., (c) = c), where c is a 
constant 
ª ν maps an empty element “-” to 
81 
² an empty element “-”. 
² a variable e′ 
² a constant e′ in ST U DT 
ª ν maps a constant e in DT (data instance) to a constant 
e’ such that e = e′, i.e., it is an identity mapping
Security Cover 
¯ Security Cover: Security Cover is a finite set SC = 
{s1, s2, . . . , sn} where si = (ti, sli), ti is an RDF/S 
triple and sli є SL is a security label. Given a set SC 
of security objects of this form, an SC is 
ª Minimal, that is no two objects (t,sl) and (t’, sl’) 
exist such that t = t’ 
ª Complete, i.e., there is no pair (t,sl) where sl is 
empty 
Security Policy Properties 
¯ The Security Policy is complete, that is, every triple 
in the security cover gets a security label, i.e., ∀ti Є 
DT U ST , there is a (ti, sli) є SC, where SC is the 
security cover. 
¯ The Security Policy is consistent, 
Conflict Policy Mapping 
¯ Let pt1, . . . , ptk be the RDF patterns and sl1, . . . , 
slk be their security labels, respectively. Let ν1, . . . , 
νk be the mappings from pt1, . . . , ptk to an RDF 
triple t. The security label sl of the triple t is defined as the 
least upper bound, i.e., sl = LUB[sl1, . . . , slk], and 
the corresponding security object is (t, sl).
XML-ER : Equivalence Class 
¯ Given a relational database schema RS, an 
equivalence class CE is defined as follows: 
A member el of CE is 
ª a set of a single relation name {Ri} such that Ri 
is a relation name in RS or 
ª a set of attribute names {R1.a1, . . . ,Rn.an} such 
that for all Ri.ai, (i = 1, . . . , n), Ri is a relation 
name in RS, aj Є sort(Ri), and there is a foreign 
key constraint between any two or more 
attributes in CE. 
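One way to picture the attribute case of this definition is to group attributes that are linked by foreign key constraints. The Python sketch below does this with a small union-find. The relation and attribute names are hypothetical, and grouping by connectivity is an assumption about how the classes would be built, not a method stated in the slides.

```python
# Illustrative sketch: group attributes linked by foreign key constraints.

def equivalence_classes(foreign_keys):
    """foreign_keys: iterable of (referencing_attr, referenced_attr) pairs."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for fk_attr, pk_attr in foreign_keys:
        parent[find(fk_attr)] = find(pk_attr)

    groups = {}
    for attr in list(parent):
        groups.setdefault(find(attr), set()).add(attr)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical schema: two relations reference Patient.PID, one references Company.CID.
fks = [
    ("Records.PID", "Patient.PID"),
    ("Prescription.PID", "Patient.PID"),
    ("Employee.CID", "Company.CID"),
]
classes = equivalence_classes(fks)
```

Each resulting group would be mapped once, which is the point of the "mapped only once" requirement for foreign keys.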
Mapping Rule Set 
¯ Let X = (V,E, θ,L) be an XML schema tree, O = (C, 
P,δ ,≤) be an RDF ontology schema and μ : X → O 
be a mapping function. A mapping rule set Mxo 
containing XML to RDF components’ 
correspondences is defined as Mxo = {(x1, r1) . . . 
(xk, rk)} such that xi is either an XML node vi Є V or 
a pair of nodes (vi, vj) and ri is either an RDF class 
ci Є C, an RDF property pi Є P or an RDF triple ti = 
[si, pi, oi]. 
SOAP Request 
POST /stock HTTP/1.1 
Host: www.kbcafe.com 

<?xml version="1.0"?> 
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope" 
    xmlns:m="http://www.kbcafe.com/stock"> 
  <soap:Header> 
    <m:DeveloperKey>1234</m:DeveloperKey> 
  </soap:Header> 
  <soap:Body> 
    <m:GetStockPrice> 
      <m:StockName>HUMC</m:StockName> 
      <m:QuoteTime>EST</m:QuoteTime> 
      <m:Exchange>NYSE,NASDAQ</m:Exchange> 
    </m:GetStockPrice> 
  </soap:Body> 
</soap:Envelope>
SOAP Response 
HTTP/1.1 200 OK 

<?xml version="1.0"?> 
<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope" 
    xmlns:m="http://www.kbcafe.com/stock"> 
  <soap:Body> 
    <m:GetStockPriceResponse> 
      <m:Price> 
        <m:Value>27.66</m:Value> 
        <m:QuoteTime>12:46PM</m:QuoteTime> 
        <m:Exchange>NYSE</m:Exchange> 
      </m:Price> 
    </m:GetStockPriceResponse> 
  </soap:Body> 
</soap:Envelope>
REST Request 
GET /stock?StockName=HUMC HTTP/1.1 
Host: www.kbcafe.com 
REST Response 
HTTP/1.1 200 OK 
<?xml version="1.0"?> 
<m:Price xmlns:m="http://www.kbcafe.com/stock"> 
<m:Value>27.66</m:Value> 
<m:QuoteTime>12:46PM</m:QuoteTime> 
<m:Exchange>NYSE</m:Exchange> 
</m:Price>
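For illustration, a client could read the REST response body with Python's standard `xml.etree.ElementTree`. This is a sketch that parses the payload shown above from a string rather than contacting a live service; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The response body from the slide, held in a string instead of fetched over HTTP.
response_body = """<?xml version="1.0"?>
<m:Price xmlns:m="http://www.kbcafe.com/stock">
  <m:Value>27.66</m:Value>
  <m:QuoteTime>12:46PM</m:QuoteTime>
  <m:Exchange>NYSE</m:Exchange>
</m:Price>"""

# Prefix-to-namespace map used when searching for qualified element names.
ns = {"m": "http://www.kbcafe.com/stock"}

root = ET.fromstring(response_body)
price = float(root.find("m:Value", ns).text)
quote_time = root.find("m:QuoteTime", ns).text
exchange = root.find("m:Exchange", ns).text
```

The lookups depend on the element names and namespace, which is the syntax dependence that the semantic mappings are meant to remove.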
XML Semantic Normal Form (SNF) 
¯ XML Semantic Normal Form represents the 
meaning of XML data in a document 
¯ Applications can convert exchanged XML 
documents into their standard semantic form and 
compare them 
¯ Since structurally different but semantically similar 
documents would have an equivalent SNF, their 
authorization policies would be similar 
¯ Properties of the XML Semantic Normal Form 
¯ Algorithm to convert an XML document in its 
Semantic Normal Form 

  • 9.
    Open Data Standards ¯ Some data sharing initiatives underway ª Data portability - taking person’s data and friends from one site to another. (DataPortability.org) ª OpenID- portable identity; single sign-on ª OpenSocial - Google initiative for social networks, enabling developers to create widgets with one set of code; MySpace, Facebook ª APML - growing ‘Attention’ standard; Person’s Attention Data is all the information online about what one reads, writes, shares and consumes 9
  • 10.
    Web Services Architecture ¯ Loosely coupled ¯ Application performs a function and exposed to network ª Flight Departure Service ª Geo-Location Service ª Flight Departure Monitoring Service ª Data Transformation-Interchange Service ¯ Web Services advertise & communicate using standard protocols ª WSDL (Web Service Description Language) ª SOAP (Simple Object Access Protocol) ¯ Applications assembled from services dynamically ¯ Uses XML for data format ¯ Exchange data and results ¯ Platform Independent & Language Neutral 10
  • 11.
    Web Services 11 Registry Backend Database Web Service 1 Web Service 2
  • 12.
    A Web ServiceRequest – XML Format <?xml version=”1.0”?> <SOAP-ENV:Envelope xmlns:SOAP-ENV=”http://schemas.xmlsoap.org/ 12 soap/envelope/”> <SOAP-ENV:Body> <s:GetWeatherForecasxmlns:s=“http:// www.WeatherService.com/”> <!--Parameters passed with the method call Like ZIP CODE--> </s:GetWeatherForecast> </SOAP-ENV:Body> </SOAP-ENV:Envelope>
  • 13.
    Semantic Web ¯Machines talking to machines ¯ Making the Web more 'intelligent’ ¯ Bottom Up = annotate, metadata, RDF! ¯ Top Down = Simple 13 Image credit: dullhunk Top-down: • Leverage existing web information • Apply specific, vertical semantic knowledge • Deliver the results as a consumer-centric web app
  • 14.
    Semantic Apps Whatis a Semantic App? - Not necessarily W3C Semantic Web - An app that determines the meaning of text and other data, and then creates connections for users - Data portability and connectibility are keys (ref: Nova Spivack) Example: Calais Reuters, the international business and financial news giant, launched an API called Open Calais in Feb 08. The API does a semantic markup on unstructured HTML documents - recognizing people, places, companies, and events. Ref: Reuters Wants The World To Be Tagged; Alex Iskold, ReadWriteWeb, Feb 08 14
  • 15.
    More Semantic Apps 15 Other Products to watch: ¯ Twine ¯ Freeset ¯ Powerset ¯ Talis ¯ TrueKnowledge ¯ AdaptiveBlue ¯ TripIt ¯ Spock ¯ Quintura ¯ Hakia Ref: 10 Semantic Apps to Watch; Richard MacManus, ReadWriteWeb, Nov 07
  • 16.
    Semantic Web -Ontologies 16 ¯ Ontologies ª Represent Semantics of data in a domain ª Enables Knowledge Management ª Consist of resources, their attributes and relationships ª Languages: ² Resource Description Framework (RDF) ² Web Ontology Language (OWL)
  • 17.
    Information Management & Security Issues 17 ¯ Information Integration ª How to reconcile data from disparate sources? ¯ Mediation Layer ª How to provide a global view of local data sources? ª Performance, Restructuring, Mapping Issues? ¯ Security Issues ª Securely share data among applications/agents? ª Accountability? ª Trust among interacting agents?
  • 18.
    Research Problem: Data& Information Security 18 ¯ Introduction: ª Web From Past to Present ª Future Trends ¯ Background ª Research Challenges ² Security Reliance on XML Syntax ² XML to RDF ontology mappings ² RDF Ontology Security ¯ Proposed Solution ª Contribution: Semantic Aware Secure Data Sharing Framework ª RDF Authorization Model ª Semantic Mappings between XML and domain Ontologies ª Authorization Policies derivation for XML data ¯ Prototype ¯ Conclusion & Future Work ¯ References
  • 19.
    Framework XML –RDF Mappings Propagation Policy & Conflict Resolution Secure XML View Generation XML Access Control XML Authorization Component RDF Authorization Component RDF Security Cover RDF Data RDF security Policies XML Data Semantic Mappings: XML -> RDF
  • 20.
    SOA/Corporate Data Restructuring ¯ XML data restructuring / remapping ª Requirement in several cases like BI, Dynamic Web Service Composition ª AquaLogic based on Xquery ª May be done on the fly for data interchange with partners unseen before 20 ¯ Corporate Merging ª Data from different heterogeneous schemas ª How to decide security permissions for them ? ¯ XML data creation from legacy systems ª Nodes with non-meaningful labels may be given inconsistent security
  • 21.
    Web Service DataSharing Scenario XML XML Insurance Company 21 Hospital XML RDB Insurance Company DB Health Provider
  • 23.
    Data Sharing Scenariocontd. Patient Illness Health Records XML 1 XML 2 <WS1, Patient/MedicalData/Illness, TS > <WS2, Patient/Data/HealthRecords/Drug, S > 23 Data - Patient Medical Data Prescription Personal Data BirthDate SSN Diagnosis Information SSN BirthDate Drug <WS1, Patient/MedicalData/Prescription, TS> <WS2, Patient/Data/HealthRecords/Diagnosis, S > Policy 1 Policy 2
  • 24.
    Existing XML SecurityMethods ¯ Web Data Security & Assurance ¯ Several Security Standards Available ª XML Signature ª XML Encryption ª XML Access Control Models ² Bertino et al, Kudo et al, Damiani et al, etc. 24 ¯ Web Services Security ª WS-Security ª SAML ª WS-*
  • 25.
    Research Problem 1: ¯ How can XML data be restructured in a way that the security policies are unaffected and no data leak occurs. ¯ Intuitive Solution: Use the meaning/semantics of the data while restructuring ¯ Is it possible to find a standardized way to express the intended semantics of XML to be used for inter-operation 25 and security?
  • 26.
    RDF : InformationOverload Management solution ¯ Use RDF to represent semantics embedded in XML documents. ¯ RDF (Resource Description Framework): seen as the solution for Information Overload ¯ Provides Universal Sharing (using URI) ¯ Syntax independent ¯ Provides Semantics ¯ Can be used to say anything about anything ¯ Critical tool in Linking data (Not just the documents) between the applications on the data web 26
  • 27.
    Commercial RDF Applications ¯ NASA uses it extensively for document management ¯ Companies are moving towards a RDF backed data creation and social networking apps ¯ Millions of data triples asserted to bootstrap the knowledge sources (DbPedia, Twine, Freebase) ª Freebase stores millions of information entities in RDF format [2] ª TWINE: RDF based social networking application [3] ¯ Several Government Agencies are using RDF ¯ Social Networking Graphs ª Each person is a data generation warehouse 27
  • 28.
    Securing RDF ¯Commercial RDF applications underscore the security need ¯ Agencies sharing their meta information RDF models: Need to securely share partial data ¯ Providing users with a fine grained access control to their own data: Still missing from all the applications 28
  • 29.
    RDF characteristics ¯Properties provide link between the connected entities ¯ Subclasses, Sub properties need to be considered ¯ Type to Class relationships ¯ Same resource may have different security requirements in different roles ¯ Entailments need to be considered 29
  • 30.
    Database Integration ¯Usually deals with the view of legacy databases integrated into a view ¯ Focuses on the tuples at the instance level ¯ RDF deals with the conceptual schema ª Classes and their properties ª Instances ¯ RDF can be use as a conceptual layer on top of database integrated views 30
  • 31.
    RDF Resource Associations Business Entity Commercial Corporate Start Up ExternalFunded Fusion Capabilities contains Facilities Fully Funded Labs Military Research Wing Research Company Chemicals Consumes Consumes XYZ Uranium subclass subclass rdf:type rdf:type S S S S P P P P RDF Schema RDF Instance S Toxic are P P
  • 32.
    Business Entity Commercial Corporate Start Up RDF Entailments ExternalFunded Fusion Capabilities contains S S S Facilities Fully Funded Labs Military Research Wing Research Company S Chemicals Consumes Consumes S XYZ Uranium subclass subclass rdf:type rdf:type S S S P P P P P P P P RDF Schema P S RDF Instance
  • 33.
    Research Problem 2 ¯ How can RDF data be assigned authorizations that considers its semantic and entailment requirements ¯ How can I express security requirements for RDF ontology data such that the model is ª Syntax independent ª Considers RDF semantics ª Incorporates entailment? 33
  • 34.
    Existing Literature 34 ¯ RDF security ª Qin and Atluri [4]: A concept-level access control for web data, where access control is defined on ontological concepts and instances of these concepts inherit the access control of the concepts they belong to. ª Finin et al. [5]: A policy based access control model for RDF data in an RDF store. Provides control over the different action modes supported by the RDF store like inserting a set of triples, deleting a triple, and querying a triple. ª Dietzold and Auer [6]: An access control model for RDF Triple Wikis. Their model allows the specification of custom rules that can be used for securing access to the store. ª Kaushik et al. [7]: A logic based policy language for securing full or partial ontologies.
  • 35.
    Research Problem 3:Semantics Definition and Security Derivation ¯ Based on Previous Research Problems ª How to use RDF to provide authorization permissions for XML ¯ Intuitive Solution: Derive mappings between the XML and RDF data and use the mappings to enforce security policies from RDF domain ontologies to secure XML data in a syntax independent way? 35
  • 36.
    Existing Literature ¯Semantic Data Integration 36 ª Xiao & Cruz [8]: ² An ontology-based approach for integration of heterogeneous XML sources. ² Converts XML data sources into a RDF ontology. Local ontologies are merged to create a global ontology. ª Several Engineering based works ² Gloze [9] ² WEESA [10] ² Try to induct semantics in XML based on discovery.
  • 37.
    Characteristics & Shortcomingsof Current Security Methods ¯ XML is the de facto standard ª No semantics embedded in XML ª XML Security Methods Syntax Oriented ª No Data or application semantics ª Syntax tampering allows unauthorized access 37 ¯ No Association Protection ª Data appearing together might need to be classified ¯ No Entailment Consideration ª Data entailments may allow incorrect security labeling ¯ No Protection Mechanism for Syntax Independent Ontologies
  • 38.
    Proposed Research: SemanticAware Secure Data Sharing Framework 38 ¯ Introduction: ª Web From Past to Present ª Future Trends ¯ Background ª Research Challenges ² Security Reliance on XML Syntax ² XML to RDF ontology mappings ² RDF Ontology Security ¯ Proposed Solution ª Contribution: Semantic Aware Secure Data Sharing Framework ª RDF Authorization Model ª Semantic Mappings between XML and domain Ontologies ª Authorization Policies derivation for XML data ¯ Prototype ¯ Conclusion & Future Work ¯ References
  • 39.
    Research Goal: SemanticAware XML Access Control Model ¯ Develop a Comprehensive S ecurity Framework ª Works in open and distributed environment settings ª Use of data & application semantics for data sharing ª Flexible Security Policies (Fine Granularity) ª Provides Security on Metadata (Semantics) ª Semantically enhancement of XML web data ª Provides access control independent of the XML data syntax ª Has Properties like completeness and consistency 39
  • 40.
    RDF Security Policy ¯ Typical Security Policy Components: ª (Subject, Object, Privilege/Security label) ª <s,o,±pri/sl> ¯ RDF security object / pattern [x, y, z] ¯ Security classification ([x, y, z], TS) ¯ Security Objects Subsumption ¯ Association protection ¯ Fine grained – Individual elements/two elements ¯ Policies in RDF format 40
  • 41.
    Pattern Mapping ¯Pattern mapping from an RDF triple to a group of triples ª Generates Security Cover ª Conflict Resolution ª Consistent security labeling 41
  • 42.
    Security Cover ¯Materialized view of the secure RDF/S database ¯ Consists of pairs (t,sl) 42 ª Minimal ª Complete
  • 43.
    RDF Security Architecture 43 Simple Conflict Data Resolution Inference Engine Inference Rules (JENA) Inference Conflict Resolution RDF/S Security Cover Entailed Security Cover RDF/S Native Database Rules Security Policies Policies Database Analysis History Querying & Security Monitor Query Denial Answer Forwarded Query Returned Query Results
  • 44.
    Mapping a DefaultPolicy 44 ¯ Default policy ² ([x, y, z], TS) (Student, rdfs:subClassOf, Person) (University, rdfs:subClassOf, GovAgency) (studiesAt, rdfs:domain, Student) (studiesAt, rdfs:range,University) (John, studiesAt, USC)
  • 45.
    Simple Conflict Resolution ¯ Subsuming patterns have less restrictive security classifications ¯ Based on the “more restrictive takes precedence” resolution 45
  • 46.
    Conflict resolution: Pattern Mapping 46 ¯ Conflict Resolution ª ([Student, studiesAt, University], P) ª ([John, studiesAt, USC], S) P S ( [John, studiesAt, USC], S)
  • 47.
    RDF Reification SS S studiesAt John USC rdf:subject confidence madeOnDate madeBy rdf:type stmt1 rdf:Statement rdf:object rdf:predicate High 07/07/2008 Mark S S S S P P
  • 48.
    RDF/S entailment ¯RDF/S triples entailment ª Inference rules application on data to infer new triples ª Generated triples are assigned security labels ª Inference Conflict Resolution 48
  • 49.
    Inference Conflict Resolution ¯ The generated triple may already exist 49 ª A higher security ¯ The policy may require existing triple be classified at a higher level
  • 50.
    Semantic Mappings: FromXML to RDF Ontologies ¯ Establish mappings between XML & RDF ¯ Semantic Enhancement of XML using the mappings ¯ Good database design entails an ER schema as the starting point ¯ Uses ER as the intermediate semantic representation model ¯ Define mappings between an XML data and ER conceptual Model 50
  • 51.
    Conceptual Schema 1 Relational Database Schema 1 α1 α2 α3 α4 α5 αn XML1 Conceptual Schema 2 Relational Database Schema 2 XML2 XML3 XML4 XML5 Conceptual Schema m Relational Database Schema m XMLn
  • 52.
    Equivalence Classes ¯Foreign keys in a relation point to the Primary Key entity ¯ They should be mapped only once ¯ Equivalence classes consist of the primary key, foreign keys pairs, relation schema and relational attributes 52
  • 53.
    Mapping Properties ¯Structure Preserving Tags (SPT) ¯ Element ordering and cardinality constraints ¯ Mapping Function: Many to One ¯ The mappings are ª Complete: contains one pair (vi,CEi) for every node vi of the XML schema tree X ª Consistent: does not have two pairs (vi,CEi) and (vj ,CEj) such that vi = vj and CEi != CEj , i.e., there is a single XML node corresponding to an equivalence class. 53
  • 54.
    Conceptual Schema 1 Relational Database Schema 1 α1 α2 α3 α4 α5 αn XML1 Conceptual Schema 2 Relational Database Schema 2 XML2 XML3 XML4 XML5 Conceptual Schema m Relational Database Schema m XMLn Federated Schema/ Meta-Ontology β1 β2 βn
  • 55.
    Mappings ¯ LetX be the XML schema tree and O be the RDF ontology. The XML to RDF mappings are defined in the following way: • An XML leaf node vi is mapped to a RDF property p Є P, : p → (c1, c2) such that the datatype of the leaf node element { Ri.ai } corresponds to the object datatype RDF Class c2 Є C, • A non-leaf node vi with sub element nodes is mapped to a RDF class c Є C. • A non-leaf node vi with sub element nodes is mapped to a RDF property p Є P such that μ(vj) = cj , μ(vk) = ck and (p) = (cj , ck). Here vj and vk are the ancestor and descendant of node vi, respectively. • A pair of XML nodes (vi, vj) is mapped to an RDF triple [s, p, o] where ed(vi, vj) is an unlabeled edge in XML tree and (p) = rdf:type, rdfs:subClassOf, or rdfs:subPropertyOf. 55
  • 56.
    Mapping Properties ¯Structure Preserving Classes ¯ XML-RDF mappings are ª Complete: contains one pair (vi, ci) or (vi, pi) for every XML node vi of the XML schema tree, i.e., each node is associated with an RDF class or property. ª Consistent: does not have two pairs (vi, ri) and (vj , rj) where ri is either an RDF property pi or class ci such that (vi = vj) and (ri != rj), i.e., for an equivalence class there is a single corresponding RDF class or property. 56
  • 57.
    Example : XMLOntology Mappings has RDF ONTOLOGY μ2 57 has has Address Person Date of Birth Patient μ1 Unique Identifier Works_for Company Hospital Business has isa isa isa has Name has Health Records contains has Diseases Patient - Medical Data DOB Illness Patient Data Information BirthDate SSN Drug Prescription Prescription Disease Personal Data PID Records XML 1 XML 2
  • 58.
    XML Authorizations :Policy derivation for XML Data ¯ Apply XML to RDF mappings on RDF Authorizations to derive simple XML access control policies ¯ Generated XML access control permissions have properties: 58 ª consistency, and ª completeness
  • 59.
    XML authorizations contd. ¯ XML policies are generated in the form of a pair with XPATH and a security label. ¯ XML access control models can use it as an input for more fine grained policies ¯ Use of meta policies like conflict resolution and propagation policies 59
  • 60.
    Mapping Example continued Patient Medical Data has Patient Medicines Taken Data Health Records Person has Medical Records contains Diseases TS TS TS TS Illness Prescription Diagnosis Drug XML 1 XML 2
  • 61.
    Prototype 61 ¯Java 1.6 for platform ¯ Jena 2.5 ª Java RDF API for reading, writing, and manipulating RDF data ¯ NG4J ª Named Graph for Jena ª Jena extension for providing a provenance to RDF triples ª Security Labels are stored as the context
  • 62.
    RAF contd. ¯Apache Derby for storing RDF data ª Java based Relational database ª Schema managed through Jena 62 ¯ Jena Rule Reasoner ª Inferencing model for entailment generation ª Applies the RDF/S entailment rule ª Can be used for applying business rules ¯ ISAVIZ ª Graph Library for RDF/S display as a graph
  • 63.
    RAF Admin ¯Load RDF files, RDFS files and policy files ª Multiple schema, instance and policies 63 ¯ Execute ª Pattern mapping ª Security cover generation ¯ Run the entailment and apply security labels ¯ Display graphical display of the ontologies ¯ Launch SPARQL query interface
  • 64.
  • 65.
    RAF contd. ¯SPARQL Query Interface ª SPARQL RDF query protocol for querying RDF data ² Type query or choose from pre-built queries ª Users given MAC security clearances ª Username, password authentication ª Results displayed based on user security clearance ¯ System messages display to warn of conflicts 65
  • 66.
  • 67.
  • 68.
    Querying : TopSecretclearance. 68
  • 69.
    Dissertation Contributions ¯Architecture for Semantic Aware Access Control Model for XML data ¯ Formal properties of the Semantic Aware Access Control Model ¯ Authorization Framework for securing RDF Data ª RDF security policy to RDF/S data mapping algorithm ª RDF Entailment procedure & algorithm to check for illegal inferences ª Formal Properties of RDF Authorization model ¯ Semantically Enhancement of XML data ª XML to RDF ontology mapping definitions ª XML to RDF correspondences properties ¯ XML Authorization Derivation ª Algorithm for propagating XML authorizations 69
  • 70.
    Conclusion ¯ Webis tending towards real time automated data collaboration ¯ Secure data sharing is a challenge ¯ Inclusion of Semantics can help ¯ Provide security for the XML data semantics represented by Ontologies ¯ Map the XML data to the domain Ontologies ¯ Use the mappings and ontology security policies to create authorization permissions for XML data 70
  • 71.
    Future Direction ¯Extend the model to handle Updates ¯ Use of business rules for entailment ¯ Extend the prototype to generate the XML authorization derivations ¯ Performance and results with large data-sets ¯ Use Policy Languages like Rei, Protune for a totally distributed ontology based authorization system ¯ Comparison of Security Policies in XML format 71
  • 72.
    Future Direction ¯Authorization model for OWL ª Semantics of OWL properties ¯ Extend the mapping from XML to other structure or semi structured data ¯ Extend the direct mapping to more realistic scenario: ª property links, subclass and subproperty links between the mapped entities 72
  • 73.
    Mappings in aGlobal Scenario Ont1 Ont2 WS1 Data Global Mapping sever WS3 Data υ1 υ2 Ontn υ m WSm Data υ2 WS2 Data Domain Ontologies Exchanged Web Data Security
  • 74.
    My Publications ¯“ From XML to RDF: Syntax, Semantics, Security and Integrity” (with C. Farkas, V. Gowadia, and D. Roy), In Proceedings of IFIP TC-11 WG 11.1 & WG 11.5 Joint Working Conference on Security Management, Integrity in Info Systems, Fairfax, Virginia, 2005 ¯ “Semantic-Aware Data Protection in Web Services” (with C. Farkas, D. Wijesekera, A. Singhal and B. Thuraisingham), In Proceedings of IEEE Workshop on Web Service Security, Oakland, California, 2006. 74 ¯ “ Secure Resource Description Framework: an Access Control Model” (with C. Farkas), In Proceedings of SACMAT06, ACM Symposium on Access Control Models And Technologies), Lake- Tahoe, California, 2006. ¯ "RDF Authorization Framework: Secure Data Sharing for Web Services", (with C. Farkas), Under Journal Revision. ¯ “Secure Semantic Based Data Sharing in XML Web Services”, (with C. Farkas, D. Wijesekera, A. Singhal and B. Thuraisingham) Under Journal Review
  • 75.
    References 1. “SomeTrends in Web Application Development”, Jazayeri, Mehdi. In Proceedings of : Future of Software Engineering, 2007. FOSE '07, USA, 2. “Freebase data dumps”, Metaweb Technologies. http:// download.freebase.com/datadumps/, 2008 3. “TWINE: The Smartest Way To Organize, Share and Discover Information About Your Interests”, Radar Networks, 2008. http://www.twine.com/. 4. “Concept-level access control for the Semantic Web”. Li Qin and V. Atluri. In XMLSEC ’03: Proceedings of the 2003 ACM workshop on XML security, New York, NY, USA, 2003. ACM Press. 5. “Policy-Based Access Control for an RDF Store”. P Reddivari, T. Finin, and A. Joshi. In Proceedings of the IJCAI-07 Workshop on Semantic Web for Collaborative Knowledge Acquisition, January 2007. 6. “Access control on RDF triple stores from a semantic Wiki perspective”. S. Dietzold and S. Auer. volume 183 of CEUR Workshop Proceedings ISSN 1613-0073, June 2006. 7. “Policy-based dissemination of partial web-ontologies”. S. Kaushik, D. Wijesekera, and P. Ammann. In SWS ’05: Proceedings of the 2005 workshop on SWS, New York, NY, USA, 2005. ACM Press. 75
  • 76.
    References 8. “Integratingand Exchanging XML Data Using Ontologies”. H. Xiao, I. Cruz., J. Data Semantics VI 2006, 67-89. 9. “Gloze: XML to RDF and back again”, Steve Battle, First Jena User 76 Conference, 2006. 10. “WEESA: Web engineering for semantic web applications”, Gerald Reif, Harald Gall, and Mehdi Jazayeri, In Proceedings of the 14th International Conference on World Wide Web, pages 722–729, New York, NY, USA, 2005. ACM Press.
  • 77.
  • 78.
  • 79.
    RDF - Patterns ¯ An RDF pattern pt, is a triple represented as pt = [r, p, v], where each component of the pattern is either ª A data constant such that r є R, p є PR, and v є R U L, or ª The symbol ”-” representing the empty element of the triple, or ª A variable represented as a symbol starting with ?, corresponding to any value for the triple element 79
  • 80.
    RDF Security Policy ¯ The security policy SP is a set of pairs SP = {sp1, . . . , spn} U {spdef} such that every spi has the form (pti, sli) and λ(pti) = sli where pti is an RDF pattern, sli is a security label in SL and λ is the security labeling function. spdef = (ptdef , sldef ) represents the default policy where ptdef = [?x1, ? x2, ?x3] is a pattern with all variables and sldef is the default security label such that sldef !≥ sli & sldef !≤ sli for any sli in SL. 80
  • 81.
    RDF Pattern Mapping ¯ Let pt = [r, p, v] and pt′ = [r′, p′, v′] be two RDF patterns and R be the set of Resources. Let ST and DT be the RDF Schema and Instance respectively. For all pattern elements e and e′ where e is either r,p, or v and e′ is either r’,p’, or v’ respectively, the pattern mapping ν: pt → pt′ is defined as: ª ν maps a variable e to a resource e′ Є R. ª ν preserves all constants (i.e., (c) = c), where c is a constant ª ν maps an empty element “-” to 81 ² an empty element “-”. ² a variable e′ ² a constant e′ in ST U DT ª ν maps a constant e in DT (data instance) to a constant e’ such that e = e′, i.e., it is an identity mapping
  • 82.
    Security Cover ¯Security Cover: Security Cover is a finite set SC = {s1, s2, . . . , sn} where si = (ti, sli), ti is an RDF/S triple and sli є SL is a security label. Given a set SC of security objects of this form, an SC is ª Minimal, that is no two objects (t,sl) and (t’, sl’) exist such that t = t’ ª Complete, i.e., there is no pair (t,sl) where sl is empty 82
  • 83.
    Security Policy Properties ¯ The Security Policy is complete, that is, every triple in the security cover gets a security label, i.e., ∀ti Є DT U ST , there is a (ti, sli) є SC, where SC is the security cover. ¯ The Security Policy is consistent, 83
  • 84.
    Conflict Policy Mapping ¯ Let pt1, . . . , ptk be the RDF patterns and sl1, . . . , slk be their security labels, respectively. Let ν1, . . . , vk be the mappings from pt1, . . . , ptk to an RDF triple t. The security label sl of a triple t is defined as least upper bound,i.e., sl = LUB[sl1, . . . , slk] and the corresponding security object is (t, sl). 84
  • 85.
    XML-ER : EquivalenceClass ¯ Given a relational database schema RS, an equivalence class CE is defined as follows: A member el of CE is ª a set of a single relation name {Ri} such that Ri is a relation name in RS or ª a set of attribute names {R1.a1, . . . ,Rn.an} such that for all Ri.ai, (i = 1, . . . , n), Ri is a relation name in RS, aj Є sort(Ri), and there is a foreign key constraint between any two or more attributes in CE. 85
  • 86.
    Mapping Rule Set ¯ Let X = (V,E, θ,L) be an XML schema tree, O = (C, P,δ ,≤) be an RDF ontology schema and μ : X → O be a mapping function. A mapping rule set Mxo containing XML to RDF components’ correspondences is defined as Mxo = {(x1, r1) . . . (xk, rk)} such that xi is either an XML node vi Є V or a pair of nodes (vi, vj) and ri is either an RDF class ci Є C, an RDF property pi Є P or an RDF triple ti = [si, pi, oi]. 86
  • 87.
    SOAP Request GET/stock HTTP/1.1 Host: www.kbcafe.com <?xml version="1.0"?> <soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope" xmlns:m="http://www.kbcafe.com/stock"> <soap:Header> <m:DeveloperKey>1234</DeveloperKey> </soap:Header> <soap:Body> <m:GetStockPrice> <m:StockName>HUMC</m:StockName> <m:QuoteTime>EST</m:QuoteTime> <m:Exchange>NYSE,NASDAQ</m:Exchange> </m:GetStockPrice> </soap:Body> </soap:Envelope> 87
  • 88.
    SOAP Response HTTP/1.1200 OK <?xml version="1.0"?> <soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope" xmlns:m="http://www.kbcafe.com/stock"> <soap:Body> <m:GetStockPriceResponse> 88 <m:Price> <m:Value>27.66</m:Value> <m:QuoteTime>12:46PM</m:QuoteTime> <m:Exchange>NYSE</m:Exchange> </m:Price> </m:GetStockPriceResponse> </soap:Body> </soap:Envelope>
  • 89.
    REST Request GET/stock?StockName=HUMC HTTP/1.1 Host: www.kbcafe.com 89
  • 90.
    REST Response 90 HTTP/1.1 200 OK <?xml version="1.0"?> <m:Price xmlns:m="http://www.kbcafe.com/stock"> <m:Value>27.66</m:Value> <m:QuoteTime>12:46PM</m:QuoteTime> <m:Exchange>NYSE</m:Exchange> </m:Price>
  • 91.
    XML Semantic NormalForm (SNF) ¯ XML Semantic Normal Form represents the meaning of XML data in a document ¯ Applications can convert exchanged XML documents into their standard semantic form and compare them ¯ Since structurally different but semantically similar documents would have an equivalent SNF, their authorization policies would be similar ¯ Properties of the XML Semantic Normal Form ¯ Algorithm to convert an XML document in its Semantic Normal Form 91