Data Validation with
               OWL Integrity Constraints

                                   Evren Sirin, CTO
                                  Clark & Parsia, LLC
                                evren@clarkparsia.com



Wednesday, September 22, 2010
Who are we?
         • Clark & Parsia is a semantic software startup
               – HQ in Washington, DC & office in Boston
         • Provides software development and integration
           services
         • Specializing in Semantic Web, web services, and
           advanced AI technologies for federal and
           enterprise customers
                                  http://clarkparsia.com/
                                  Twitter: @candp
Overview
         • Data validation with OWL
               – Representation and validation of integrity constraints
         • Use cases
               – Examples, issues, workarounds
         • OWL Integrity Constraints
               – Syntax, semantics, validation
         • Comparison with other approaches
               – Epistemic DLs, Epistemic QLs, Rules
         • Implementation and performance
Some Applications
         • Customer and product data
               – Find which customer would be interested in buying a
                 certain product
         • System and component descriptions
               – Configure components to build a desired system
         • Workforce and employee data
               – Locate employees with desired expertise
         • Patient history and drug data
               – Detect and prevent potentially harmful drug interactions

Common Theme
         • There is data and lots of it!
         • Adding semantics to the data helps a lot
               – Sometimes simple taxonomies, but other times,
                 complex ontologies
         • We have complete knowledge about the domain
         • Errors in the data cause problems
               – Failures in applications, errors in decision making,
                 potential loss of revenue, security vulnerabilities, etc.


Data Validation
         • Fundamental data management problem
               – Verify data integrity and correctness
               – Enforce validity of updates
         • Relevant in many scenarios
               – Storing data for stand-alone applications
               – Exchanging data in distributed settings
         • Solved (to some degree) in RDBMSs
               – Harder to achieve as data semantics increase and/or
                 more expressive integrity conditions are required
Disclaimer
         • Data validity not important for every use case
               – Invalid data may be fine for an application
               – Invalidity may even be a requirement
         • Focus of this talk is cases where data consistency
           and integrity are crucial




Building Semantic Apps
         • Represent data as RDF triples
               – First step for accomplishing data integration and
                 analysis
         • Enrich data with more semantics (RDFS, OWL)
               – Infer implicit information from explicit assertions
         • Ensure data validity
               – Detect errors in the data
         • Do something cool with the data
               – Obviously...
Reasoning Example
         • Input ontology
                       # Schema: every supervisor is an employee
                       Supervisor subClassOf Employee
                       # Instance data: Person085 is a supervisor
                       Person085 type Supervisor
         • Output inferences
                       # Person085 is an employee
                       Person085 type Employee
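
The inference step above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: triples are plain string arrays, the only rule applied is subClassOf propagation, and the class name `SubclassInference` is hypothetical. A real reasoner such as Pellet works very differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: a naive fixpoint over string triples, not how Pellet reasons.
class SubclassInference {

    // Applies (x type C) + (C subClassOf D) => (x type D) until no new fact appears.
    static Set<String> infer(List<String[]> schema, List<String[]> data) {
        Set<String> facts = new LinkedHashSet<>();
        for (String[] t : data) {
            facts.add(t[0] + " " + t[1] + " " + t[2]);
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            // Iterate over a copy so that new facts can be added safely.
            for (String fact : new ArrayList<>(facts)) {
                String[] t = fact.split(" ");
                if (!t[1].equals("type")) {
                    continue;
                }
                for (String[] axiom : schema) {
                    if (axiom[1].equals("subClassOf") && axiom[0].equals(t[2])) {
                        changed |= facts.add(t[0] + " type " + axiom[2]);
                    }
                }
            }
        }
        return facts;
    }

    // The slide's example: one schema axiom and one instance assertion.
    static Set<String> demo() {
        List<String[]> schema = new ArrayList<>();
        schema.add(new String[] {"Supervisor", "subClassOf", "Employee"});
        List<String[]> data = new ArrayList<>();
        data.add(new String[] {"Person085", "type", "Supervisor"});
        return infer(schema, data);
    }

    public static void main(String[] args) {
        for (String fact : demo()) {
            System.out.println(fact);
        }
    }
}
```

The returned set holds both the asserted fact and the inferred one, which mirrors the split between input ontology and output inferences on the slide.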


Validating RDF Data
         • Common misunderstanding
               – RDFS/OWL is to RDF what XML Schema is to XML
               – Describe integrity conditions in RDFS or OWL
                     • Typing constraints - RDFS domain/range
                     • Participation constraints - OWL someValuesFrom restrictions
                     • Uniqueness constraints - OWL cardinality restrictions
               – Use a reasoner to find inconsistencies
         • Problem: Open World Assumption


Closed vs. Open World
         • Two different views on truth:
               – CWA: Any statement that is not known to be true is false
               – OWA: A statement is false only if it is known to be false
         • Used in different contexts
               – Databases use CWA because (typically) they contain
                 complete information
               – Ontologies use OWA because (typically) they don't...
                 that is, they contain incomplete information
         • Data validation results significantly different when
           using CWA instead of OWA
Typing Constraint
            • Only supervisors can supervise employees
            • Input ontology
                 o   supervises domain Supervisor
                 o   Person085 supervises Person173

                          OWA                           CWA
          Consistent      true                          false
          Reason          Infer that                    Assume that
                          Person085 type Supervisor     Person085 type not Supervisor
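
The two readings of the domain axiom can be contrasted in a small Java sketch. It assumes a toy data model (edges as string pairs, a set of individuals already known to be supervisors); the class and method names are hypothetical and do not come from any library.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative contrast between the OWA and CWA readings of a domain axiom.
class TypingConstraint {

    // OWA reading of "supervises domain Supervisor":
    // every subject of a supervises edge is inferred to be a Supervisor.
    static Set<String> owaSupervisors(List<String[]> supervises, Set<String> knownSupervisors) {
        Set<String> result = new TreeSet<>(knownSupervisors);
        for (String[] edge : supervises) {
            result.add(edge[0]);
        }
        return result;
    }

    // CWA (integrity constraint) reading:
    // a subject that is not already known to be a Supervisor is a violation.
    static List<String> cwaViolations(List<String[]> supervises, Set<String> knownSupervisors) {
        List<String> violations = new ArrayList<>();
        for (String[] edge : supervises) {
            if (!knownSupervisors.contains(edge[0])) {
                violations.add(edge[0]);
            }
        }
        return violations;
    }

    // The slide's data: one supervises edge and no type assertions.
    static List<String[]> sampleEdges() {
        List<String[]> edges = new ArrayList<>();
        edges.add(new String[] {"Person085", "Person173"});
        return edges;
    }

    static Set<String> demoOwa() {
        return owaSupervisors(sampleEdges(), new HashSet<String>());
    }

    static List<String> demoCwa() {
        return cwaViolations(sampleEdges(), new HashSet<String>());
    }

    public static void main(String[] args) {
        System.out.println("OWA infers supervisors: " + demoOwa());
        System.out.println("CWA reports violations: " + demoCwa());
    }
}
```

Under OWA the data stays consistent and gains a fact; under CWA the same data is flagged, which is the behavior wanted for validation.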




Participation Constraint
            • Each supervisor must supervise at least
              one employee
            • Input axioms
                  o   Supervisor subClassOf supervises some Employee
                  o   Person085 type Supervisor

                          OWA                           CWA
          Consistent      true                          false
          Reason          Infer that                    Assume that
                          Person085 supervises _:b      Person085 supervises _:b
                          _:b type Employee             does not exist
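
A closed-world check of this participation constraint might look like the following Java sketch. The data model (string sets and string-pair edges) and all names are assumptions made for illustration; the check only looks at facts that are explicitly present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative CWA check for "Supervisor subClassOf supervises some Employee".
class ParticipationConstraint {

    // Returns every supervisor with no known supervises edge to a known employee.
    static List<String> violations(Set<String> supervisors,
                                   List<String[]> supervises,
                                   Set<String> employees) {
        List<String> out = new ArrayList<>();
        for (String s : supervisors) {
            boolean satisfied = false;
            for (String[] edge : supervises) {
                if (edge[0].equals(s) && employees.contains(edge[1])) {
                    satisfied = true;
                    break;
                }
            }
            if (!satisfied) {
                out.add(s);
            }
        }
        return out;
    }

    // The slide's data: Person085 is a Supervisor and nothing else is known.
    static List<String> demo() {
        Set<String> supervisors = new TreeSet<>();
        supervisors.add("Person085");
        return violations(supervisors, new ArrayList<String[]>(), new TreeSet<String>());
    }

    // Same data plus an explicit edge to a known employee: no violation remains.
    static List<String> demoFixed() {
        Set<String> supervisors = new TreeSet<>();
        supervisors.add("Person085");
        List<String[]> supervises = new ArrayList<>();
        supervises.add(new String[] {"Person085", "Person173"});
        Set<String> employees = new TreeSet<>();
        employees.add("Person173");
        return violations(supervisors, supervises, employees);
    }

    public static void main(String[] args) {
        System.out.println("Violations: " + demo());
        System.out.println("After adding the edge: " + demoFixed());
    }
}
```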

Uniqueness Constraint
            • Employees can have at most one supervisor
            • Input axioms
                  o   supervises InverseFunctional
                  o   Person085 supervises Person173
                  o   Person632 supervises Person173


                          OWA                           CWA
         Consistent       true                          false
         Reason           Infer that                    Assume that
                          Person085 sameAs Person632    Person085 sameAs Person632
                                                        does not hold
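
The closed-world reading of the uniqueness constraint can be sketched as a simple grouping step in Java. The sketch assumes that no sameAs facts are known, so two different names are treated as two different individuals; the class name and data layout are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative CWA check for "supervises InverseFunctional".
// Assumption: no sameAs facts are known, so distinct names denote distinct individuals.
class UniquenessConstraint {

    // Groups supervisors by employee and reports employees with more than one.
    static List<String> violations(List<String[]> supervises) {
        Map<String, Set<String>> byEmployee = new TreeMap<>();
        for (String[] edge : supervises) {
            Set<String> supervisors = byEmployee.get(edge[1]);
            if (supervisors == null) {
                supervisors = new TreeSet<>();
                byEmployee.put(edge[1], supervisors);
            }
            supervisors.add(edge[0]);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : byEmployee.entrySet()) {
            if (entry.getValue().size() > 1) {
                out.add(entry.getKey() + " has supervisors " + entry.getValue());
            }
        }
        return out;
    }

    // The slide's data: two different names supervise Person173.
    static List<String> demo() {
        List<String[]> supervises = new ArrayList<>();
        supervises.add(new String[] {"Person085", "Person173"});
        supervises.add(new String[] {"Person632", "Person173"});
        return violations(supervises);
    }

    public static void main(String[] args) {
        for (String v : demo()) {
            System.out.println(v);
        }
    }
}
```

An OWA reasoner given the same input would instead conclude that Person085 and Person632 are the same individual and report no problem.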



Workarounds for CW
         • Manually close the world
               – Declare all individuals different from each other
               – Count existing property values and add a max
                 cardinality restriction
               – Make all disjointness statements explicit and add
                 negated types to individuals
         • Drawbacks
               – Can be computationally expensive
               – Likely to be error-prone

Problem Summary
         • Definitions in an OWL schema may have two
           purposes
               – Infer new statements
               – Check if existing statements are valid
         • Using OWA for validation is undesirable
               – Not always but in many cases
         • In a problem domain we may have:
               – Complete knowledge about some parts of the domain
               – Incomplete knowledge about the other parts
Integrity Constraint Solution
         • We defined an alternative semantics for OWL
               – Integrity Constraint (IC) semantics use CWA
               – Can be combined with regular inference axioms
         • Ontology developer chooses which axioms will
           be interpreted with...
               – OWA - regular OWL axiom, or
               – CWA - integrity constraint



IC Extension
         • Syntax specification
               – How do we syntactically say an axiom is an IC and
                 not a regular OWL axiom?
         • Semantics specification
               – How do we exactly interpret an IC?
         • Validation algorithm
               – Given the semantics how do we check for IC
                 violations?


IC Syntax
         • Similar approach to using owl:imports
         • Define a new annotation property in a new
           namespace

                                Ont1 owl:imports Ont2
                                Ont1 ic:imports IC1

         • Backward compatible, requires minimum change
           in tools
Use Case: SKOS
         • Simple Knowledge Organization System (SKOS)
         • SKOS provides a model for expressing the basic
           structure and content of concept schemes
               – Thesauri, classification schemes, subject heading lists,
                 taxonomies, folksonomies, etc.
         • SKOS data model specification
               – Informal (Text): http://www.w3.org/TR/skos-reference/
               – Formal (OWL): http://www.w3.org/2004/02/skos/core.rdf


SKOS Example
       # skos-reference.ttl: SKOS reference ontology that contains inference rules
       skos:broaderTransitive Transitive
       skos:broader subPropertyOf skos:broaderTransitive

       # skos-constraints.ttl: natural-language constraints from the SKOS
       # specification expressed as ICs
       skos:related propertyDisjointWith skos:broaderTransitive

       # skos-invalid.ttl: SKOS data that violates the SKOS data model
       [] a owl:Ontology ; owl:imports skos-reference.ttl ;
                           ic:imports skos-constraints.ttl .

       A skos:broader B ; skos:related C .
       B skos:broader C .

       IC validation requires OWL reasoning

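
Why this check needs reasoning can be shown with a small Java sketch: the violating pair (A, C) only appears after skos:broader is propagated to skos:broaderTransitive and closed under transitivity. The string-pair data model and the class name are assumptions for illustration, and the symmetry of skos:related is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative check of "related propertyDisjointWith broaderTransitive".
// Pairs are encoded as "X>Y" strings; symmetry of skos:related is not modeled.
class SkosDisjointCheck {

    // broader pairs become broaderTransitive pairs, then the set is closed transitively.
    static Set<String> broaderTransitive(List<String[]> broader) {
        Set<String> closure = new TreeSet<>();
        for (String[] edge : broader) {
            closure.add(edge[0] + ">" + edge[1]);
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            List<String> snapshot = new ArrayList<>(closure);
            for (String first : snapshot) {
                for (String second : snapshot) {
                    String[] x = first.split(">");
                    String[] y = second.split(">");
                    if (x[1].equals(y[0])) {
                        changed |= closure.add(x[0] + ">" + y[1]);
                    }
                }
            }
        }
        return closure;
    }

    // A related pair that is also a broaderTransitive pair violates the constraint.
    static List<String> violations(List<String[]> broader, List<String[]> related) {
        Set<String> closure = broaderTransitive(broader);
        List<String> out = new ArrayList<>();
        for (String[] r : related) {
            if (closure.contains(r[0] + ">" + r[1])) {
                out.add(r[0] + " related " + r[1]);
            }
        }
        return out;
    }

    // The slide's data: A broader B, B broader C, A related C.
    static List<String> demo() {
        List<String[]> broader = new ArrayList<>();
        broader.add(new String[] {"A", "B"});
        broader.add(new String[] {"B", "C"});
        List<String[]> related = new ArrayList<>();
        related.add(new String[] {"A", "C"});
        return violations(broader, related);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Checking only the asserted broader pairs would miss the violation, since A broader C is never stated directly.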
Another SKOS Example
        # skos-xl.ttl: SKOS-XL ontology with a cardinality restriction
        skosxl:Label subClassOf
                         skosxl:literalForm cardinality 1

         # skos-invalid.ttl: SKOS data that violates the SKOS data model when
         # SKOS ontology is imported as ICs as well
         [] a owl:Ontology ; owl:imports skos-xl.ttl ;
                              ic:imports skos-xl.ttl .

         A skosxl:labelRelation LabelA .
         LabelA type skosxl:Label .

         Same ontology can be both a regular OWL import and an IC import

IC Semantics
         • OWL semantics based on model theory
               – Similar to First Order Logic
               – Formal, precise, and unambiguous
         • IC semantics specification
               – Extends OWL model theory
               – Change a couple of basic definitions, everything else
                 follows
         • Details published in technical papers
               – We are submitting a W3C member submission soon
IC Interpretations
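         • A regular OWL interpretation $I = (\Delta_I, \Delta_D, \cdot^{C}, \cdot^{OP}, \cdot^{DP}, \cdot^{I}, \cdot^{DT}, \cdot^{LT}, \cdot^{FA})$
           is a 9-tuple
            – $\cdot^{C}$ is the class interpretation function that assigns to each class
              $C \in V_C$ a subset $(C)^{C} \subseteq \Delta_I$
         • An OWL IC interpretation $\Gamma = (\Delta_I, \Delta_D, I, U, \cdot^{C}, \cdot^{OP}, \cdot^{DP}, \cdot^{I}, \cdot^{DT}, \cdot^{LT}, \cdot^{FA})$
           is an 11-tuple where $I$ and the elements of $U$ are
           regular OWL interpretations
            – $(C)^{C} = \{ x^{I} \mid x \in V_I \text{ and for each } U_j \in U \text{ we have that } x^{I_{U_j}} \in (C)^{C_j} \}$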
         • More details available in references at the end
Other Approaches
         •   IC semantics of Motik et al. [WWW2007]
         •   Epistemic DLs
         •   Epistemic QLs
         •   Rules with negation as failure operators




Validation Algorithm
         • An automated translation algorithm
         • Automatically maps an OWL IC to a SPARQL
           query
               – Query must be evaluated with OWL entailment regime
         • ICs can be mapped to RIF rules too
               – Use SPARQL and Datalog correspondence
               – RIF defines Negation-as-Failure operator
         • Many different implementation possibilities
         • Off-the-shelf tools can be used for IC validation
SPARQL Translation
                 Supervisor subClassOf supervises some Employee



                                SELECT * {
                                   ?x type Supervisor.
                                   FILTER NOT EXISTS {
                                      ?x supervises ?y.
                                      ?y type Employee.
                                   }
                                }
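
The translation shown above can be mimicked for this one axiom shape with a short Java sketch. It is a hypothetical string-building helper that assumes SPARQL 1.1 `FILTER NOT EXISTS` syntax and prefixed names supplied by the caller; it is not the Pellet IC validator's translator, which handles arbitrary OWL axioms.

```java
// Illustrative translator for one IC shape only:
//   subClass subClassOf (property some filler)
// Names are passed as already-prefixed terms, e.g. ":Supervisor".
class IcToSparql {

    // Builds a query whose results are the individuals violating the constraint.
    static String someValuesFromQuery(String subClass, String property, String filler) {
        return "SELECT ?x WHERE {\n"
             + "  ?x a " + subClass + " .\n"
             + "  FILTER NOT EXISTS {\n"
             + "    ?x " + property + " ?y .\n"
             + "    ?y a " + filler + " .\n"
             + "  }\n"
             + "}";
    }

    public static void main(String[] args) {
        System.out.println(someValuesFromQuery(":Supervisor", ":supervises", ":Employee"));
    }
}
```

An empty result means the constraint holds. As the Validation Algorithm slide notes, the query has to be evaluated under the OWL entailment regime so that inferred types and property values are taken into account.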

RIF Translation
                 Supervisor subClassOf supervises some Employee



                                Forall ?x ?y (
                                  invalid() :- And (
                                     ?x[type -> Supervisor]
                                     Naf And (
                                        ?x[supervises -> ?y]
                                        ?y[type -> Employee] )))


Solution Summary
         • Separate ICs from regular OWL axioms
               – No new syntax
               – Import-based mechanism
         • Alternative semantics for ICs
               – Extends OWL model theory
               – Provides the meanings of ICs formally
         • Validation algorithm
               – Translate ICs to another formalism
               – SPARQL or RIF engines can be used
Explanations
         • Explanations for positive atoms well-understood
               – Smallest subset of the ontology that entails the atom
               – Precise & laconic explanation
               – Lemma generation
         • Explanations for IC violations are tricky
               – Need to explain negation (i.e. missing values)
               – Lemma generation even more crucial
         • Simple solution
               – Explanation represented as a tree where each node
                 represents an existing or missing axiom
Explanation Example (1)
     VIOLATION: A violates related propertyDisjointWith broaderTransitive
           INFERRED: A related C
                 ASSERTED: A related C
           INFERRED: A broaderTransitive C
                 ASSERTED: A broader B
                 ASSERTED: B broader C
                 ASSERTED: broader subPropertyOf broaderTransitive
                 ASSERTED: broaderTransitive Transitive
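
An explanation tree of this kind can be represented with a very small recursive data structure. The following Java sketch is an assumption about one possible representation, with a hypothetical class name; it rebuilds the tree above and prints it with one indentation level per depth.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative explanation tree: each node is an existing or missing axiom.
class ExplanationNode {
    private final String label;
    private final List<ExplanationNode> children = new ArrayList<>();

    ExplanationNode(String label) {
        this.label = label;
    }

    // Adds a child and returns this node so that calls can be chained.
    ExplanationNode add(ExplanationNode child) {
        children.add(child);
        return this;
    }

    // Renders the tree as text, indenting four spaces per level.
    String render() {
        StringBuilder sb = new StringBuilder();
        render(sb, 0);
        return sb.toString();
    }

    private void render(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) {
            sb.append("    ");
        }
        sb.append(label).append('\n');
        for (ExplanationNode child : children) {
            child.render(sb, depth + 1);
        }
    }

    // Rebuilds the explanation for the propertyDisjointWith violation.
    static ExplanationNode demo() {
        return new ExplanationNode(
                "VIOLATION: A violates related propertyDisjointWith broaderTransitive")
            .add(new ExplanationNode("INFERRED: A related C")
                .add(new ExplanationNode("ASSERTED: A related C")))
            .add(new ExplanationNode("INFERRED: A broaderTransitive C")
                .add(new ExplanationNode("ASSERTED: A broader B"))
                .add(new ExplanationNode("ASSERTED: B broader C"))
                .add(new ExplanationNode("ASSERTED: broader subPropertyOf broaderTransitive"))
                .add(new ExplanationNode("ASSERTED: broaderTransitive Transitive")));
    }

    public static void main(String[] args) {
        System.out.print(demo().render());
    }
}
```

Missing values fit the same structure: a node labeled NOT INFERRED or NOT ASSERTED is just another label.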



Explanation Example (2)
     VIOLATION: A violates Label subClassOf literalForm cardinality 1
           INFERRED: A type Label
                 ASSERTED: A type Label
           INFERRED: A labelRelation LabelA
                 ASSERTED: A labelRelation LabelA
           NOT INFERRED: LabelA literalForm
                 NOT ASSERTED: LabelA literalForm




Performance
         • Using ICs can improve performance!
         • Expressive OWL reasoning is not easy
         • Profiles of OWL defined for tractable reasoning
               – OWL 2 QL, OWL 2 EL, OWL 2 RL
               – Less expressive but more efficient
         • Modeling some OWL axioms as ICs may reduce
           the expressivity where OWL reasoning is used


Prototype
         • Pellet IC validator
               –   Translates ICs into SPARQL queries automatically
               –   Executes SPARQL queries with Pellet
               –   Query results show constraint violations
               –   Automatically explain constraint violations
         • Free download
               – http://clarkparsia.com/pellet/icv



Code Example
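         // create an inferencing model using Pellet reasoner
         // (Reasoner, InfModel, ModelFactory are Jena classes;
         //  PelletReasonerFactory is in org.mindswap.pellet.jena)
         Reasoner r = PelletReasonerFactory.theInstance().create();
         InfModel dataModel = ModelFactory.createInfModel(r,
                                  ModelFactory.createDefaultModel());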

         // load the schema and instance data to Pellet
         dataModel.read( "file:data.rdf" );
         dataModel.read( "file:schema.owl" );

         // Create the IC validator and associate it with the dataset
         JenaICValidator validator = new JenaICValidator(dataModel);

         // Load the constraints into the IC validator
         validator.getConstraints().read("file:constraints.owl");

         // Get the constraint violations
         Iterator<ConstraintViolation> violations =
                                               validator.getViolations();

Next Steps
         • W3C Member submission for IC semantics
         • Robust IC validator implementation
               – Incremental validation
               – Multi-threaded validation
         • Support for IC editing
         • Integration with PelletDb
               – Scalable reasoning + validation


References
          • Evren Sirin, Michael Smith, Evan Wallace
            Opening, Closing Worlds - On Integrity Constraints
            OWL: Experiences and Directions Workshop
            (OWLED '08), October 2008.
          • Evren Sirin, Jiao Tao
            Towards Integrity Constraints in OWL
            OWL: Experiences and Directions Workshop
            (OWLED '09), October 2009.
          • Jiao Tao, Evren Sirin, Jie Bao, Deborah L. McGuinness
            Integrity Constraints in OWL
            To appear, the 24th AAAI Conference on Artificial
            Intelligence (AAAI '10), July 2010.

Questions




IC Semantics of Motik et al.
         • Same motivation and similar approach
               – Separate ICs from regular OWL axioms
               – Use ICs for validation only
         • Semantics based on outer skolemization on first order
           formula and entailment in minimal Herbrand models
         • Several features of the semantics made it unsuitable
           for us
               – ICs can be satisfied by existential variables
               – Disjunction can cause false positives

Epistemic DLs
         • DLs extended with epistemic operator K
         • ICs can be represented as epistemic queries over a
           regular KB
               – Aligns with Reiter’s original characterization of ICs
         • Example:
               – KSupervisor subClassOf Ksupervises some KEmployee
         • Only major differences
               – We are not using K operator explicitly
               – No Unique Name Assumption in OWL ICs
                                                                  44


Wednesday, September 22, 2010

More Related Content

Similar to RR2010 Keynote

Validating Linked Data with OWL
Validating Linked Data with OWLValidating Linked Data with OWL
Validating Linked Data with OWLClark & Parsia LLC
 
DCSF19 Containerized Databases for Enterprise Applications
DCSF19 Containerized Databases for Enterprise ApplicationsDCSF19 Containerized Databases for Enterprise Applications
DCSF19 Containerized Databases for Enterprise ApplicationsDocker, Inc.
 
Nephos Technologies - Extending Security to the Cloud - Cloud Expo 2013
Nephos Technologies - Extending Security to the Cloud - Cloud Expo 2013Nephos Technologies - Extending Security to the Cloud - Cloud Expo 2013
Nephos Technologies - Extending Security to the Cloud - Cloud Expo 2013Lee Biggenden
 
Nephos technologies lee_biggenden_c_expo13_v2.0
Nephos technologies lee_biggenden_c_expo13_v2.0Nephos technologies lee_biggenden_c_expo13_v2.0
Nephos technologies lee_biggenden_c_expo13_v2.0lbiggenden
 
RightScale Webinar: Security Monitoring in the Cloud: How RightScale Does It
RightScale Webinar: Security Monitoring in the Cloud: How RightScale Does ItRightScale Webinar: Security Monitoring in the Cloud: How RightScale Does It
RightScale Webinar: Security Monitoring in the Cloud: How RightScale Does ItRightScale
 
Objectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL DatabaseObjectivity/DB: A Multipurpose NoSQL Database
Objectivity/DB: A Multipurpose NoSQL DatabaseInfiniteGraph
 
2009.10.22 S308460 Cloud Data Services
2009.10.22 S308460  Cloud Data Services2009.10.22 S308460  Cloud Data Services
2009.10.22 S308460 Cloud Data ServicesJeffrey T. Pollock
 
Building a Lightweight Discovery Interface for China's Patents@NYC Solr/Lucen...
Building a Lightweight Discovery Interface for China's Patents@NYC Solr/Lucen...Building a Lightweight Discovery Interface for China's Patents@NYC Solr/Lucen...
Building a Lightweight Discovery Interface for China's Patents@NYC Solr/Lucen...OpenSource Connections
 
Provenance Management to Enable Data Sharing
Provenance Management to Enable Data SharingProvenance Management to Enable Data Sharing
Provenance Management to Enable Data SharingUniversity of Arizona
 
Introduction to Machine Learning for Oracle Database Professionals
Introduction to Machine Learning for Oracle Database ProfessionalsIntroduction to Machine Learning for Oracle Database Professionals
Introduction to Machine Learning for Oracle Database ProfessionalsAlex Gorbachev
 
Internal Investigations and the Cloud
Internal Investigations and the CloudInternal Investigations and the Cloud
Internal Investigations and the CloudDan Michaluk
 
Enterprise Machine Learning Governance
Enterprise Machine Learning Governance Enterprise Machine Learning Governance
Enterprise Machine Learning Governance Terence Siganakis
 
IIW-11 Beyond Attribute Exchange
IIW-11 Beyond Attribute ExchangeIIW-11 Beyond Attribute Exchange
IIW-11 Beyond Attribute ExchangeJayUnger
 
EDF2012: The Web of Data and its Five Stars
EDF2012: The Web of Data and its Five StarsEDF2012: The Web of Data and its Five Stars
EDF2012: The Web of Data and its Five StarsRichard Cyganiak
 
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Lucidworks (Archived)
 
What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?What Drove Wordnik Non-Relational?
What Drove Wordnik Non-Relational?DATAVERSITY
 
Building a Data Driven Culture and AI Revolution With Gregory Little | Curren...
Building a Data Driven Culture and AI Revolution With Gregory Little | Curren...Building a Data Driven Culture and AI Revolution With Gregory Little | Curren...
Building a Data Driven Culture and AI Revolution With Gregory Little | Curren...HostedbyConfluent
 
Self-service consumption Data Catalog
Self-service consumption Data CatalogSelf-service consumption Data Catalog
Self-service consumption Data CatalogDenodo
 
Searching Chinese Patents Presentation at Enterprise Data World
Searching Chinese Patents Presentation at Enterprise Data WorldSearching Chinese Patents Presentation at Enterprise Data World
Searching Chinese Patents Presentation at Enterprise Data WorldOpenSource Connections
 
Agile Data Rationalization for Operational Intelligence
Agile Data Rationalization for Operational IntelligenceAgile Data Rationalization for Operational Intelligence
Agile Data Rationalization for Operational IntelligenceInside Analysis
 

Similar to RR2010 Keynote (20)

Validating Linked Data with OWL
Validating Linked Data with OWLValidating Linked Data with OWL
Validating Linked Data with OWL
 
DCSF19 Containerized Databases for Enterprise Applications
DCSF19 Containerized Databases for Enterprise ApplicationsDCSF19 Containerized Databases for Enterprise Applications
DCSF19 Containerized Databases for Enterprise Applications
 
Nephos Technologies - Extending Security to the Cloud - Cloud Expo 2013
Nephos Technologies - Extending Security to the Cloud - Cloud Expo 2013Nephos Technologies - Extending Security to the Cloud - Cloud Expo 2013
Nephos Technologies - Extending Security to the Cloud - Cloud Expo 2013
 
Nephos technologies lee_biggenden_c_expo13_v2.0
Nephos technologies lee_biggenden_c_expo13_v2.0Nephos technologies lee_biggenden_c_expo13_v2.0
Nephos technologies lee_biggenden_c_expo13_v2.0
 
RightScale Webinar: Security Monitoring in the Cloud: How RightScale Does It
RightScale Webinar: Security Monitoring in the Cloud: How RightScale Does ItRightScale Webinar: Security Monitoring in the Cloud: How RightScale Does It
RR2010 Keynote

  • 1. Data Validation with OWL Integrity Constraints
      Evren Sirin, CTO
      Clark & Parsia, LLC
      evren@clarkparsia.com
      Wednesday, September 22, 2010
  • 2. Who are we?
      • Clark & Parsia is a semantic software startup
          – HQ in Washington, DC & office in Boston
      • Provides software development and integration services
      • Specializing in Semantic Web, web services, and advanced AI technologies for federal and enterprise customers
      http://clarkparsia.com/
      Twitter: @candp
  • 3. Overview
      • Data validation with OWL
          – Representation and validation of integrity constraints
      • Use cases
          – Examples, issues, workarounds
      • OWL Integrity Constraints
          – Syntax, semantics, validation
      • Comparison with other approaches
          – Epistemic DLs, Epistemic QLs, Rules
      • Implementation and performance
  • 4. Some Applications
      • Customer and product data
          – Find which customer would be interested in buying a certain product
      • System and component descriptions
          – Configure components to build a desired system
      • Workforce and employee data
          – Locate employees with desired expertise
      • Patient history and drug data
          – Detect and prevent potentially harmful drug interactions
  • 5. Common Theme
      • There is data and lots of it!
      • Adding semantics to the data helps a lot
          – Sometimes simple taxonomies, but other times, complex ontologies
      • We have complete knowledge about the domain
      • Errors in the data cause problems
          – Failures in applications, errors in decision making, potential loss of revenue, security vulnerabilities, etc.
  • 6. Data Validation
      • Fundamental data management problem
          – Verify data integrity and correctness
          – Enforce validity of updates
      • Relevant in many scenarios
          – Storing data for stand-alone applications
          – Exchanging data in distributed settings
      • Solved (to some degree) in RDBMSs
          – Harder to achieve as data semantics increase and/or more expressive integrity conditions are required
  • 7. Disclaimer
      • Data validity is not important for every use case
          – Invalid data may be fine for an application
          – Invalidity may even be a requirement
      • The focus of this talk is cases where data consistency and integrity are crucial
  • 8. Building Semantic Apps
      • Represent data as RDF triples
          – First step for accomplishing data integration and analysis
      • Enrich data with more semantics (RDFS, OWL)
          – Infer implicit information from explicit assertions
      • Ensure data validity
          – Detect errors in the data
      • Do something cool with the data
          – Obviously...
  • 11. Reasoning Example
      • Input ontology
          # Schema: every supervisor is an employee
          Supervisor subClassOf Employee
          # Instance data: Person085 is a supervisor
          Person085 type Supervisor
      • Output inferences
          # Person085 is an employee
          Person085 type Employee
  • 12. Validating RDF Data
      • Common misunderstanding
          – RDFS/OWL is to RDF what XML Schema is to XML
          – Describe integrity conditions in RDFS or OWL
              • Typing constraints - RDFS domain/range
              • Participation constraints - OWL some values restrictions
              • Uniqueness constraints - OWL cardinality restriction
          – Use a reasoner to find inconsistencies
      • Problem: Open World Assumption
  • 13. Closed vs. Open World
      • Two different views on truth:
          – CWA: Any statement that is not known to be true is false
          – OWA: A statement is false only if it is known to be false
      • Used in different contexts
          – Databases use CWA because (typically) they contain complete information
          – Ontologies use OWA because (typically) they don't... that is, they contain incomplete information
      • Data validation results differ significantly when using CWA instead of OWA
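The two views on truth can be sketched with a toy triple store. This is a minimal illustration and not part of the talk: the `ask_cwa` and `ask_owa` helpers and the explicit `known_false` set are hypothetical, and a real OWL reasoner would derive falsity from axioms rather than from a hand-written list.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Explicit assertions: what the data says is true.
known_true: Set[Triple] = {("Person085", "supervises", "Person173")}
# Hypothetical stand-in for statements a reasoner could prove false.
known_false: Set[Triple] = {("Person173", "supervises", "Person085")}


def ask_cwa(t: Triple) -> bool:
    """Closed world: anything not known to be true is false."""
    return t in known_true


def ask_owa(t: Triple) -> Optional[bool]:
    """Open world: True, False, or None when the truth is unknown."""
    if t in known_true:
        return True
    if t in known_false:
        return False
    return None


q = ("Person632", "supervises", "Person173")
print(ask_cwa(q))  # False: not asserted, so treated as false
print(ask_owa(q))  # None: not asserted, but not refuted either
```

The only difference is how a missing statement is treated, which is what changes the validation outcome.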
  • 14. Typing Constraint
      • Only supervisors can supervise employees
      • Input ontology
          o supervises domain Supervisor
          o Person085 supervises Person173

                       OWA                          CWA
          Consistent   true                         false
          Reason       Infer that                   Assume that
                       Person085 type Supervisor    Person085 type not Supervisor
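The two readings of the domain axiom can be contrasted in a few lines of Python. This is a sketch under simplifying assumptions: triples are plain tuples, and `infer_domain` and `check_domain` are hypothetical helper names, not part of any OWL tool.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

data: Set[Triple] = {("Person085", "supervises", "Person173")}


def infer_domain(triples: Set[Triple], prop: str, cls: str) -> Set[Triple]:
    """OWA reading: every subject of `prop` is inferred to be a `cls`."""
    return {(s, "type", cls) for (s, p, _) in triples if p == prop}


def check_domain(triples: Set[Triple], prop: str, cls: str) -> List[str]:
    """CWA reading: subjects of `prop` not already typed as `cls` are violations."""
    return sorted({s for (s, p, _) in triples
                   if p == prop and (s, "type", cls) not in triples})


inferred = infer_domain(data, "supervises", "Supervisor")
violations = check_domain(data, "supervises", "Supervisor")
print(inferred)    # {('Person085', 'type', 'Supervisor')}
print(violations)  # ['Person085']
```

Once the missing type assertion is added to the data, `check_domain` reports no violations.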
  • 15. Participation Constraint
      • Each supervisor must supervise at least one employee
      • Input axioms
          o Supervisor subClassOf supervises some Employee
          o Person085 type Supervisor

                       OWA                          CWA
          Consistent   true                         false
          Reason       Infer that                   Assume that
                       Person085 supervises _:b     Person085 supervises _:b
                       _:b type Employee            does not exist
  • 16. Uniqueness Constraint
      • Employees can have at most one supervisor
      • Input axioms
          o supervises InverseFunctional
          o Person085 supervises Person173
          o Person632 supervises Person173

                       OWA                          CWA
          Consistent   true                         false
          Reason       Infer that                   Assume that
                       Person085 sameAs Person632   Person085 sameAs Person632
                                                    does not hold
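A small Python sketch of the two readings of the inverse functional axiom follows. The helper names are hypothetical and the data model (tuples plus a set of known `sameAs` pairs) is an assumption made for illustration only.

```python
from collections import defaultdict
from itertools import combinations

data = {
    ("Person085", "supervises", "Person173"),
    ("Person632", "supervises", "Person173"),
}


def subjects_by_object(triples, prop):
    """Group the subjects of `prop` by the object they point to."""
    groups = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            groups[o].add(s)
    return groups


def infer_same_as(triples, prop):
    """OWA reading: subjects sharing an object must denote the same individual."""
    return {frozenset(pair)
            for subs in subjects_by_object(triples, prop).values()
            for pair in combinations(sorted(subs), 2)}


def check_unique(triples, prop, known_same):
    """CWA reading: pairs not already known to be the same are violations."""
    return sorted(tuple(sorted(pair))
                  for pair in infer_same_as(triples, prop)
                  if pair not in known_same)


print(infer_same_as(data, "supervises"))        # one pair: Person085, Person632
print(check_unique(data, "supervises", set()))  # [('Person085', 'Person632')]
```

If the data explicitly states that the two names denote the same individual, the check passes.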
  • 17. Workarounds for CW
      • Manually close the world
          – Declare all individuals different from each other
          – Count existing property values and add a max cardinality restriction
          – Make all disjointness statements explicit and add negated types to individuals
      • Drawbacks
          – Can be computationally expensive
          – Likely to be error-prone
  • 18. Problem Summary
      • Definitions in an OWL schema may have two purposes
          – Infer new statements
          – Check if existing statements are valid
      • Using OWA for validation is undesirable
          – Not always but in many cases
      • In a problem domain we may have:
          – Complete knowledge about some parts of the domain
          – Incomplete knowledge about the other parts
  • 19. Integrity Constraint Solution
      • We defined an alternative semantics for OWL
          – Integrity Constraint (IC) semantics use CWA
          – Can be combined with regular inference axioms
      • Ontology developer chooses which axioms will be interpreted with...
          – OWA - regular OWL axiom, or
          – CWA - integrity constraint
  • 20. IC Extension
      • Syntax specification
          – How do we syntactically say an axiom is an IC and not a regular OWL axiom?
      • Semantics specification
          – How exactly do we interpret an IC?
      • Validation algorithm
          – Given the semantics, how do we check for IC violations?
  • 21. IC Syntax
      • Similar approach to using owl:imports
      • Define a new annotation property in a new namespace
          Ont1 owl:imports Ont2
          Ont1 ic:imports IC1
      • Backward compatible, requires minimum change in tools
  • 22. Use Case: SKOS
      • Simple Knowledge Organization System (SKOS)
      • SKOS provides a model for expressing the basic structure and content of concept schemes
          – Thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, etc.
      • SKOS data model specification
          – Informal (Text): http://www.w3.org/TR/skos-reference/
          – Formal (OWL): http://www.w3.org/2004/02/skos/core.rdf
  • 23. SKOS Example
      # skos-reference.ttl: SKOS reference ontology that contains inference rules
      skos:broaderTransitive Transitive
      skos:broader subPropertyOf skos:broaderTransitive

      # skos-constraints.ttl: NL constraints from SKOS specification expressed as ICs
      skos:related propertyDisjointWith skos:broaderTransitive

      # skos-invalid.ttl: SKOS data that violates the SKOS data model
      [] a owl:Ontology ; owl:imports skos-reference.ttl ;
                          ic:imports skos-constraints.ttl .
      A skos:broader B ; skos:related C .
      B skos:broader C .

      IC validation requires OWL reasoning
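Why this check needs reasoning can be sketched in Python. The sketch assumes, as in the SKOS data model and the explanation slides, that skos:broader is a subproperty of the transitive property skos:broaderTransitive; the pair-based data model and the `transitive_closure` helper are illustrative only.

```python
def transitive_closure(pairs):
    """Naive fixpoint: keep adding (a, d) whenever (a, b) and (b, d) are present."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Asserted data from the slide
broader = {("A", "B"), ("B", "C")}
related = {("A", "C")}

# Inference (OWA): broader subPropertyOf broaderTransitive, which is transitive
broader_transitive = transitive_closure(broader)

# Constraint (CWA): related propertyDisjointWith broaderTransitive
without_reasoning = sorted(related & broader)
with_reasoning = sorted(related & broader_transitive)

print(without_reasoning)  # []
print(with_reasoning)     # [('A', 'C')]
```

The violation only appears after the inferred `A broaderTransitive C` statement is taken into account.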
  • 24. Another SKOS Example
      # skos-xl.ttl: SKOS-XL ontology with a cardinality restriction
      skosxl:Label subClassOf skosxl:literalForm cardinality 1

      # skos-invalid.ttl: SKOS data that violates the SKOS data model when
      # the SKOS ontology is imported as ICs as well
      [] a owl:Ontology ; owl:imports skos-xl.ttl ;
                          ic:imports skos-xl.ttl .
      A skosxl:labelRelation LabelA .
      LabelA type skosxl:Label .

      Same ontology can be both a regular OWL import and an IC import
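Read as an integrity constraint, the cardinality restriction becomes a count over the values that are actually present. The following Python sketch shows that reading; `check_exact_cardinality` is a hypothetical helper, the `LabelB` entry is made-up data, and counting distinct names as distinct values is a simplification.

```python
def check_exact_cardinality(triples, cls, prop, n):
    """Return {member: count} for members of `cls` whose number of `prop` values is not n."""
    members = sorted(s for (s, p, o) in triples if p == "type" and o == cls)
    violations = {}
    for m in members:
        count = len({o for (s, p, o) in triples if s == m and p == prop})
        if count != n:
            violations[m] = count
    return violations


data = {
    ("A", "skosxl:labelRelation", "LabelA"),
    ("LabelA", "type", "skosxl:Label"),
    # Hypothetical extra label that does satisfy the constraint
    ("LabelB", "type", "skosxl:Label"),
    ("LabelB", "skosxl:literalForm", '"b"@en'),
}

result = check_exact_cardinality(data, "skosxl:Label", "skosxl:literalForm", 1)
print(result)  # {'LabelA': 0}
```

Under the regular OWL reading the same axiom would instead let a reasoner conclude that `LabelA` has some unknown literal form.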
  • 25. IC Semantics
      • OWL semantics based on model theory
          – Similar to First Order Logic
          – Formal, precise, and unambiguous
      • IC semantics specification
          – Extends OWL model theory
          – Change a couple of basic definitions, everything else follows
      • Details published in technical papers
          – We are submitting a W3C member submission soon
  • 26. IC Interpretations
      • A regular OWL interpretation I = (Δ_I, Δ_D, ·^C, ·^OP, ·^DP, ·^I, ·^DT, ·^LT, ·^FA) is a 9-tuple
          – ·^C is the class interpretation function that assigns to each class C ∈ V_C a subset (C)^C ⊆ Δ_I
      • An OWL IC interpretation Γ = (Δ_I, Δ_D, I, U, ·^C, ·^OP, ·^DP, ·^I, ·^DT, ·^LT, ·^FA) is an 11-tuple where I and the elements of U are regular OWL interpretations
          – (C)^C = { x^I | x ∈ V_I and for each U_j ∈ U we have x^{I_{U_j}} ∈ (C)^{C_j} }
      • More details available in references at the end
  • 27. Other Approaches
      • IC semantics of Motik et al. [WWW2007]
      • Epistemic DLs
      • Epistemic QLs
      • Rules with negation as failure operators
  • 28. Validation Algorithm
      • An automated translation algorithm
      • Automatically maps an OWL IC to a SPARQL query
          – Query must be evaluated with OWL entailment regime
      • ICs can be mapped to RIF rules too
          – Use SPARQL and Datalog correspondence
          – RIF defines Negation-as-Failure operator
      • Many different implementation possibilities
      • Off-the-shelf tools can be used for IC validation
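The idea of mapping an IC to a query can be sketched for a single axiom shape, `C subClassOf p some D`. This is not the translation algorithm from the papers: the function name, the `http://example.org/` namespace, and the restriction to one axiom shape are all assumptions, and the sketch emits the SPARQL 1.1 `FILTER NOT EXISTS` form.

```python
def some_values_ic_to_sparql(sub_class: str, prop: str, filler: str) -> str:
    """Build a query whose answers are the individuals violating
    `sub_class subClassOf prop some filler` when read as an IC."""
    return (
        "PREFIX : <http://example.org/>\n"  # hypothetical namespace
        "SELECT * WHERE {\n"
        f"  ?x a {sub_class} .\n"
        "  FILTER NOT EXISTS {\n"
        f"    ?x {prop} ?y .\n"
        f"    ?y a {filler} .\n"
        "  }\n"
        "}"
    )


query = some_values_ic_to_sparql(":Supervisor", ":supervises", ":Employee")
print(query)
```

Every binding of `?x` returned by such a query is a constraint violation; an empty result means the constraint is satisfied.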
  • 29. SPARQL Translation
      Supervisor subClassOf supervises some Employee

      SELECT * {
        ?x type Supervisor .
        NOT EXISTS {
          ?x supervises ?y .
          ?y type Employee .
        }
      }
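What the query computes can be mimicked directly in Python. The sketch below assumes the triple set already contains every inferred statement (the role of the OWL entailment regime); the `violations_some_values` helper and the `Person632` data are made up for illustration.

```python
def violations_some_values(triples, sub_class, prop, filler):
    """Members of `sub_class` with no `prop` value that is typed as `filler`."""
    result = []
    for (x, p, c) in triples:
        if p == "type" and c == sub_class:
            satisfied = any(
                s == x and q == prop and (o, "type", filler) in triples
                for (s, q, o) in triples
            )
            if not satisfied:
                result.append(x)
    return sorted(result)


data = {
    ("Person085", "type", "Supervisor"),
    ("Person632", "type", "Supervisor"),
    ("Person632", "supervises", "Person173"),
    ("Person173", "type", "Employee"),
}

found = violations_some_values(data, "Supervisor", "supervises", "Employee")
print(found)  # ['Person085']
```

`Person632` passes because a matching `supervises` value exists in the data, while `Person085` is reported because none does.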
  • 30. RIF Translation
      Supervisor subClassOf supervises some Employee

      Forall ?x ?y (
        invalid() :- And (
          ?x[type -> Supervisor]
          Naf And (
            ?x[supervises -> ?y]
            ?y[type -> Employee] )))
  • 31. Solution Summary
      • Separate ICs from regular OWL axioms
          – No new syntax
          – Import-based mechanism
      • Alternative semantics for ICs
          – Extends OWL model theory
          – Provides the meanings of ICs formally
      • Validation algorithm
          – Translate ICs to another formalism
          – SPARQL or RIF engines can be used
  • 32. Explanations
      • Explanations for positive atoms are well understood
          – Smallest subset of the ontology that entails the atom
          – Precise & laconic explanation
          – Lemma generation
      • Explanations for IC violations are tricky
          – Need to explain negation (i.e. missing values)
          – Lemma generation even more crucial
      • Simple solution
          – Explanation represented as a tree where each node represents an existing or missing axiom
  • 35. Explanation Example (1)
      VIOLATION: A violates related propertyDisjointWith broaderTransitive
        INFERRED: A related C
          ASSERTED: A related C
        INFERRED: A broaderTransitive C
          ASSERTED: A broader B
          ASSERTED: B broader C
          ASSERTED: broader subPropertyOf broaderTransitive
          ASSERTED: broaderTransitive Transitive
  • 36. Explanation Example (2)
      VIOLATION: A violates Label subClassOf literalForm cardinality 1
        INFERRED: A type Label
        INFERRED: A labelRelation LabelA
        NOT INFERRED: LabelA literalForm
      Missing values are represented as
  • 37. Explanation Example (2)
      VIOLATION: A violates Label subClassOf literalForm cardinality 1
        INFERRED: A type Label
          ASSERTED: A type Label
        INFERRED: A labelRelation LabelA
          ASSERTED: A labelRelation LabelA
        NOT INFERRED: LabelA literalForm
          NOT ASSERTED: LabelA literalForm
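An explanation tree of this kind can be modeled with a small recursive data structure. The sketch below is an assumption about one possible representation, not the validator's own: the `Node` class and `render` function are hypothetical, and the tree rebuilds Explanation Example (1).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str      # VIOLATION, INFERRED, ASSERTED, NOT INFERRED, NOT ASSERTED
    statement: str  # the existing or missing axiom
    children: List["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> List[str]:
    """Flatten the tree into indented text lines, one per node."""
    lines = ["  " * depth + f"{node.label}: {node.statement}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


tree = Node("VIOLATION", "A violates related propertyDisjointWith broaderTransitive", [
    Node("INFERRED", "A related C", [
        Node("ASSERTED", "A related C"),
    ]),
    Node("INFERRED", "A broaderTransitive C", [
        Node("ASSERTED", "A broader B"),
        Node("ASSERTED", "B broader C"),
        Node("ASSERTED", "broader subPropertyOf broaderTransitive"),
        Node("ASSERTED", "broaderTransitive Transitive"),
    ]),
])

lines = render(tree)
print("\n".join(lines))
```

A missing value fits the same structure as a node whose label is `NOT INFERRED` or `NOT ASSERTED`.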
  • 38. Performance
      • Using ICs can improve performance!
      • Expressive OWL reasoning is not easy
      • Profiles of OWL defined for tractable reasoning
          – OWL 2 QL, OWL 2 EL, OWL 2 RL
          – Less expressive but more efficient
      • Modeling some OWL axioms as ICs may reduce the expressivity where OWL reasoning is used
  • 39. Prototype
      • Pellet IC validator
          – Translates ICs into SPARQL queries automatically
          – Executes SPARQL queries with Pellet
          – Query results show constraint violations
          – Automatically explains constraint violations
      • Free download
          – http://clarkparsia.com/pellet/icv
  • 40. Code Example
      import java.util.Iterator;
      import com.hp.hpl.jena.rdf.model.InfModel;
      import com.hp.hpl.jena.rdf.model.ModelFactory;
      // JenaICValidator and ConstraintViolation come from the Pellet IC validator.
      // `r` is not defined on the slide; it is assumed to be a Pellet-backed
      // inference graph created earlier.

      // Create an inferencing model using the Pellet reasoner
      InfModel dataModel = ModelFactory.createInfModel(r);
      // Load the schema and instance data into Pellet
      dataModel.read( "file:data.rdf" );
      dataModel.read( "file:schema.owl" );
      // Create the IC validator and associate it with the dataset
      JenaICValidator validator = new JenaICValidator(dataModel);
      // Load the constraints into the IC validator
      validator.getConstraints().read("file:constraints.owl");
      // Get the constraint violations
      Iterator<ConstraintViolation> violations = validator.getViolations();
  • 41. Next Steps
      • W3C Member submission for IC semantics
      • Robust IC validator implementation
          – Incremental validation
          – Multi-threaded validation
      • Support for IC editing
      • Integration with PelletDb
          – Scalable reasoning + validation
  • 42. References
      • Evren Sirin, Michael Smith, Evan Wallace. Opening, Closing Worlds - On Integrity Constraints. OWL: Experiences and Directions Workshop (OWLED '08), October 2008.
      • Evren Sirin, Jiao Tao. Towards Integrity Constraints in OWL. OWL: Experiences and Directions Workshop (OWLED '09), October 2009.
      • Jiao Tao, Evren Sirin, Jie Bao, Deborah L. McGuinness. Integrity Constraints in OWL. To appear, The 24th AAAI Conference on Artificial Intelligence (AAAI '10), July 2010.
  • 43. Questions
  • 44. Other Approaches
      • IC semantics of Motik et al. [WWW2007]
      • Epistemic DLs
      • Epistemic QLs
      • Rules
  • 45. IC Semantics of Motik et al.
      • Same motivation and similar approach
          – Separate ICs from regular OWL axioms
          – Use ICs for validation only
      • Semantics based on outer skolemization of first-order formulas and entailment in minimal Herbrand models
      • Several features of the semantics made it unsuitable for us
          – ICs can be satisfied by existential variables
          – Disjunction can cause false positives
  • 46. Epistemic DLs
      • DLs extended with epistemic operator K
      • ICs can be represented as epistemic queries over a regular KB
          – Aligns with Reiter’s original characterization of ICs
      • Example:
          – KSupervisor subClassOf Ksupervises some KEmployee
      • Only major differences
          – We are not using K operator explicitly
          – No Unique Name Assumption in OWL ICs