Refactoring Metadata:
                 Finding architectural
                 compatibility through
                structural comparisons
                          Baden Hughes
       Department of Computer Science and Software Engineering
                      The University of Melbourne




May 2-6, 2004          Copyright © 2004 Baden Hughes             1
Agenda
•   Motivations for Refactoring Metadata
•   Setting the Context
•   Identifying Points of Comparison
•   Goals of Structural Comparison
•   Methods for Structural Comparison
•   Refactoring in Practice
•   Principles for Robust Instances
•   Conclusion



Motivations for Refactoring Metadata
• The need to address the problem of metadata volatility is
  conceptually juxtaposed with the motivation for metadata
  creation
• XML technologies have become pervasive within the
  metadata domain
• Different communities = different standards = different
  degrees of maturity in metadata adoption resulting in
  metadata being (highly?) variable even within an organization
• Systematically determining similarity and difference is the key
  to effective refactoring of metadata
• Automatically determining similarity and difference is the key
  to efficient refactoring of metadata



Setting the Context
• XML-based metadata from natural language engineering
  and digital libraries
• Wide variety of
     –   traditions of metadata development
     –   technologies for metadata implementation
     –   objects described by metadata
     –   granularity of metadata descriptions
• Motivated by interoperability analysis
• Seeking to leverage processes not dissimilar to
  database schema comparisons




Identifying Points of Comparison
• Robust instances require both syntactic and semantic
  analysis
• Points of comparison
     –   XML Document Instances
     –   DTDs
     –   Schemata
     –   Namespaces
     –   RDF Instances
     –   Ontologies
• Likely that different methods are required for each
  different input



Goals of Structural Comparison
• While validation of XML based metadata does contribute
  to the quality of metadata, it does not necessarily assist
  in determining architectural compatibility
• Systematic, iterative evaluation of metadata
  architectures can contribute to maturity of XML based
  metadata
• Quantifying the degree of syntactic and semantic
  similarity is an important first step in the refactoring
  process – it may in fact demonstrate viability




Methods for Structural Comparison
• Different methods for structural comparison
  depending on the input
     –   XML documents: trees
     –   DTDs: regexps and feature structures
     –   XML Namespaces: feature structures
     –   XML Schemas: regexps and graph matching
     –   RDF Instances: graph matching
     –   Ontologies: feature structures and graph matching



Tree Matching
• Common conception of an XML document as a tree
  structure
• Tree matching is a widely used IE/IR technique for
  structured data, and is applicable to XML based
  metadata
• Tree matching is largely derived from pattern
  matching, and is largely independent of syntactic or
  semantic constraints
• While tree matching can provide basic information about
  the similarity of two documents, for architectural
  compatibility a deeper analysis is required
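
A minimal sketch of how a basic tree comparison might look, assuming Python and its standard xml.etree.ElementTree module. It uses the overlap of root-to-node element paths as a crude stand-in for full tree matching; it is not the method used in the talk. The two records and the function names are invented for illustration.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Return the set of root-to-node element paths in an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def path_similarity(xml_a, xml_b):
    """Jaccard overlap of the two documents' element-path sets (0.0 to 1.0)."""
    a, b = element_paths(xml_a), element_paths(xml_b)
    return len(a & b) / len(a | b)


# Hypothetical metadata records, invented for illustration.
RECORD_A = "<record><title>Corpus</title><creator>Doe</creator></record>"
RECORD_B = "<record><title>Lexicon</title><date>2004</date></record>"

score = path_similarity(RECORD_A, RECORD_B)  # 2 shared paths out of 4 -> 0.5
```

The score ignores element order, repetition, attributes and text content, which is one reason such a measure only gives basic information about similarity.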



Regular Expression Matching
• DTD syntax is derived from regular expressions
• Well known evaluation methodologies for regexps are
  applicable to DTDs
• In contrast to pure syntactic comparison, regexp
  matching allows the discovery of the legal constituents of
  syntactic structures
• Regexp evaluation is a highly efficient exercise even
  on large metadata collections, and widely implemented
  in common programming languages
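
The relationship between DTD content models and regular expressions can be sketched as follows, assuming Python's standard re module. The translation covers only element content (names, sequence, choice, and the ?, *, + operators); mixed content, EMPTY and ANY are not handled. The content model and the helper names are hypothetical.

```python
import re

# A content-model token is either an element name or one operator character.
TOKEN = re.compile(r"[A-Za-z_][\w.-]*|[()|?*+,]")


def content_model_to_regex(model):
    """Translate a simple DTD content model into a compiled Python regex."""
    parts = []
    for tok in TOKEN.findall(model):
        if tok == ",":
            continue  # a DTD sequence is plain concatenation in a regex
        if tok in "()|?*+":
            parts.append(tok)  # same meaning in DTDs and regexes
        else:
            # Each element name becomes a token terminated by one space.
            parts.append("(?:" + re.escape(tok) + " )")
    return re.compile("".join(parts))


def children_allowed(model, child_names):
    """Check whether a sequence of child element names satisfies the model."""
    regex = content_model_to_regex(model)
    text = "".join(name + " " for name in child_names)
    return regex.fullmatch(text) is not None


# Hypothetical content model, as in <!ELEMENT record (title, creator+, date?)>
MODEL = "(title, creator+, date?)"

ok = children_allowed(MODEL, ["title", "creator", "creator"])  # True
bad = children_allowed(MODEL, ["title", "date"])               # False
```

Once two content models are expressed this way, the same child sequences can be tested against both to see where their legal constituents diverge.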




Feature Structure Matching
• Typed feature structures are widely used for
  deriving controlled vocabularies – XML attribute
  instances are typically able to be reduced to
  typed feature structures for comparison
• Evaluation of the semantic content of feature
  structures is well grounded in formal logic
• Feature structure comparisons can also reveal
  syntactic constraints expressed as
  dimensionality of feature matrices
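
As a rough illustration of comparing feature structures, the sketch below unifies two structures written as nested Python dicts. It is untyped and has no reentrancy, so it is simpler than the typed feature structures named above. The attribute names and values are invented.

```python
def unify(fs_a, fs_b):
    """Unify two feature structures given as nested dicts.

    Returns the merged structure, or None when two atomic values clash.
    """
    result = dict(fs_a)
    for feature, value_b in fs_b.items():
        if feature not in result:
            result[feature] = value_b
            continue
        value_a = result[feature]
        if isinstance(value_a, dict) and isinstance(value_b, dict):
            merged = unify(value_a, value_b)
            if merged is None:
                return None
            result[feature] = merged
        elif value_a != value_b:
            return None  # incompatible values: the structures do not unify
    return result


# Hypothetical structures derived from XML attributes such as
# <language code="en" script="Latn"/>
FS_A = {"language": {"code": "en", "script": "Latn"}}
FS_B = {"language": {"code": "en"}, "type": "text"}
FS_C = {"language": {"code": "fr"}}

compatible = unify(FS_A, FS_B)  # merged structure
clash = unify(FS_A, FS_C)       # None, because "en" and "fr" conflict
```

A successful unification suggests the two attribute sets are compatible; a failure points at the feature where they disagree.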



Graph Matching
• Rich XML representations such as RDF can be
  construed as a series of arcs and nodes, allowing the
  adoption of graph theory techniques for the
  determination of isomorphism
• Finding the minimum and maximum common subgraphs
  is a technique which can be used to determine
  architectural compatibility in the syntactic domain
• Graph matching is primarily syntactic, although it can
  also be applied to semantic analysis on sources such as
  ontologies
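
The isomorphism check can be sketched in Python for small graphs, assuming each RDF graph is held as a set of (subject, predicate, object) string triples and that blank nodes are the terms starting with "_:". This is a brute-force illustration whose cost grows factorially with the number of blank nodes, not a production algorithm. The triples are invented.

```python
from itertools import permutations


def is_blank(term):
    """Treat terms written as '_:name' as blank nodes (an assumption)."""
    return term.startswith("_:")


def isomorphic(graph_a, graph_b):
    """Brute-force isomorphism check on two sets of triples.

    Tries every bijection between the blank nodes of the two graphs and
    reports whether one of them makes the triple sets identical.
    """
    if len(graph_a) != len(graph_b):
        return False
    blanks_a = sorted({t for triple in graph_a for t in triple if is_blank(t)})
    blanks_b = sorted({t for triple in graph_b for t in triple if is_blank(t)})
    if len(blanks_a) != len(blanks_b):
        return False
    for candidate in permutations(blanks_b):
        mapping = dict(zip(blanks_a, candidate))
        renamed = {tuple(mapping.get(t, t) for t in triple) for triple in graph_a}
        if renamed == graph_b:
            return True
    return False


# Hypothetical metadata graphs that differ only in blank-node names.
G1 = {("_:x", "dc:title", "Corpus"),
      ("_:x", "dc:creator", "_:y"),
      ("_:y", "foaf:name", "Doe")}
G2 = {("_:p", "dc:creator", "_:q"),
      ("_:q", "foaf:name", "Doe"),
      ("_:p", "dc:title", "Corpus")}
G3 = {("_:p", "dc:creator", "_:q"),
      ("_:q", "foaf:name", "Roe"),
      ("_:p", "dc:title", "Corpus")}

same_shape = isomorphic(G1, G2)  # True
different = isomorphic(G1, G3)   # False
```

When every node is named by a URI, the check reduces to comparing the two triple sets directly, since no renaming is needed.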



Refactoring in Practice
•   XML Documents
•   DTDs
•   Namespaces
•   Schemata
•   RDF instances
•   Ontologies
•   See http://www.cs.mu.oz.au/~badenh/projects/metadata-comparison
    for demo materials




Principles for Robust Instances
• Both syntactic and semantic analysis are required
• Initiate comparisons at the highest level, and proceed
  downwards – higher level incompatibilities are more
  complex to resolve
• Quantifying degree of similarity is extremely important as
  it impacts directly on the complexity of refactoring
  processes
• Accurately identified commonalities at both syntactic and
  semantic levels can be leveraged efficiently




Conclusion
• Adopting and permuting a range of techniques
  for structural comparison from a variety of other
  disciplines can lead to efficient methods for
  metadata structural analysis and consequently
  refactoring
• Large scale metadata management requires an
  automated approach to both syntactic and
  semantic evaluation in order to contribute to ROI


Acknowledgements
• National Science Foundation Grant
  Number 9910603 (International Standards
  in Language Engineering)
• National Science Foundation Grant
  Number 0317826 (Querying Linguistic
  Databases)




More Related Content

What's hot

Automatically converting tabular data to
Automatically converting tabular data toAutomatically converting tabular data to
Automatically converting tabular data to
IJwest
 
Going for GOLD - Adventures in Open Linked Metadata
Going for GOLD - Adventures in Open Linked MetadataGoing for GOLD - Adventures in Open Linked Metadata
Going for GOLD - Adventures in Open Linked Metadata
EDINA, University of Edinburgh
 
Xml based data exchange in the
Xml based data exchange in theXml based data exchange in the
Xml based data exchange in the
IJwest
 
master_thesis_greciano_v2
master_thesis_greciano_v2master_thesis_greciano_v2
master_thesis_greciano_v2
M. Christian Greciano, MSc
 
Paper id 25201463
Paper id 25201463Paper id 25201463
Paper id 25201463
IJRAT
 
Jarrar: Architectural solutions in Data Integration
Jarrar: Architectural solutions in Data IntegrationJarrar: Architectural solutions in Data Integration
Jarrar: Architectural solutions in Data Integration
Mustafa Jarrar
 
Using linguistic analysis to translate
Using linguistic analysis to translateUsing linguistic analysis to translate
Using linguistic analysis to translate
IJwest
 
Development of a new indexing technique for XML document retrieval
Development of a new indexing technique for XML document retrievalDevelopment of a new indexing technique for XML document retrieval
Development of a new indexing technique for XML document retrieval
Amjad Ali
 
Semantic Web Nature
Semantic Web NatureSemantic Web Nature
Semantic Web Nature
Constantin Stan
 
The Data Web and PLM
The Data Web and PLMThe Data Web and PLM
The Data Web and PLM
Koneksys
 
A category theoretic model of rdf ontology
A category theoretic model of rdf ontologyA category theoretic model of rdf ontology
A category theoretic model of rdf ontology
IJwest
 
Jarrar: Data Schema Integration
Jarrar: Data Schema Integration Jarrar: Data Schema Integration
Jarrar: Data Schema Integration
Mustafa Jarrar
 
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
csandit
 
Website Performance at Client Level
Website Performance at Client LevelWebsite Performance at Client Level
Website Performance at Client Level
Constantin Stan
 
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational DatabasesSemantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
ijsrd.com
 
Heterogeneous fuzzy xml data integration based on structrual and semantic sim...
Heterogeneous fuzzy xml data integration based on structrual and semantic sim...Heterogeneous fuzzy xml data integration based on structrual and semantic sim...
Heterogeneous fuzzy xml data integration based on structrual and semantic sim...
Amir Shokri
 
{Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Compone...
{Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Compone...{Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Compone...
{Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Compone...
Amit Sheth
 

What's hot (17)

Automatically converting tabular data to
Automatically converting tabular data toAutomatically converting tabular data to
Automatically converting tabular data to
 
Going for GOLD - Adventures in Open Linked Metadata
Going for GOLD - Adventures in Open Linked MetadataGoing for GOLD - Adventures in Open Linked Metadata
Going for GOLD - Adventures in Open Linked Metadata
 
Xml based data exchange in the
Xml based data exchange in theXml based data exchange in the
Xml based data exchange in the
 
master_thesis_greciano_v2
master_thesis_greciano_v2master_thesis_greciano_v2
master_thesis_greciano_v2
 
Paper id 25201463
Paper id 25201463Paper id 25201463
Paper id 25201463
 
Jarrar: Architectural solutions in Data Integration
Jarrar: Architectural solutions in Data IntegrationJarrar: Architectural solutions in Data Integration
Jarrar: Architectural solutions in Data Integration
 
Using linguistic analysis to translate
Using linguistic analysis to translateUsing linguistic analysis to translate
Using linguistic analysis to translate
 
Development of a new indexing technique for XML document retrieval
Development of a new indexing technique for XML document retrievalDevelopment of a new indexing technique for XML document retrieval
Development of a new indexing technique for XML document retrieval
 
Semantic Web Nature
Semantic Web NatureSemantic Web Nature
Semantic Web Nature
 
The Data Web and PLM
The Data Web and PLMThe Data Web and PLM
The Data Web and PLM
 
A category theoretic model of rdf ontology
A category theoretic model of rdf ontologyA category theoretic model of rdf ontology
A category theoretic model of rdf ontology
 
Jarrar: Data Schema Integration
Jarrar: Data Schema Integration Jarrar: Data Schema Integration
Jarrar: Data Schema Integration
 
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
 
Website Performance at Client Level
Website Performance at Client LevelWebsite Performance at Client Level
Website Performance at Client Level
 
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational DatabasesSemantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
Semantic Conflicts and Solutions in Integration of Fuzzy Relational Databases
 
Heterogeneous fuzzy xml data integration based on structrual and semantic sim...
Heterogeneous fuzzy xml data integration based on structrual and semantic sim...Heterogeneous fuzzy xml data integration based on structrual and semantic sim...
Heterogeneous fuzzy xml data integration based on structrual and semantic sim...
 
{Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Compone...
{Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Compone...{Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Compone...
{Ontology: Resource} x {Matching : Mapping} x {Schema : Instance} :: Compone...
 

Viewers also liked

Week 35 Sponge
Week 35 SpongeWeek 35 Sponge
Week 35 Sponge
Corey Topf
 
Good to Great - Journal
Good to Great - JournalGood to Great - Journal
Good to Great - Journal
Gopal Thiruvenkadam
 
Week 23 Sponges
Week 23 SpongesWeek 23 Sponges
Week 23 Sponges
Corey Topf
 
0708 Usability Test Methodes
0708 Usability Test Methodes0708 Usability Test Methodes
0708 Usability Test MethodesHans Kemp
 
Promozione dei contenuti web sui motori di ricerca - Eupolis Regione Lombardia
Promozione dei contenuti web sui motori di ricerca - Eupolis Regione LombardiaPromozione dei contenuti web sui motori di ricerca - Eupolis Regione Lombardia
Promozione dei contenuti web sui motori di ricerca - Eupolis Regione Lombardia
Claudio Celeghin
 
Elasticity 1
Elasticity 1Elasticity 1
Elasticity 1
Corey Topf
 
Techo may9th
Techo may9thTecho may9th
Techo may9th
Corey Topf
 
For Sale 7914 Skyview St Slideshow
For Sale 7914 Skyview St SlideshowFor Sale 7914 Skyview St Slideshow
For Sale 7914 Skyview St Slideshow
rteam
 
Adult learning
Adult learningAdult learning
Adult learning
Jorge E. Valdez
 
0809 UXD minors graduation information
0809 UXD minors graduation information0809 UXD minors graduation information
0809 UXD minors graduation information
Hans Kemp
 
Gardeners Not Gate Keepers
Gardeners Not Gate KeepersGardeners Not Gate Keepers
Gardeners Not Gate Keepers
calevans
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
Hans Kemp
 
Iad2 0910 q1 hoorcollege 4
Iad2 0910 q1 hoorcollege 4Iad2 0910 q1 hoorcollege 4
Iad2 0910 q1 hoorcollege 4Hans Kemp
 
Zappos - Connect 09 - 5-13-09
Zappos - Connect 09 - 5-13-09Zappos - Connect 09 - 5-13-09
Zappos - Connect 09 - 5-13-09
zappos
 
Encoding and Presenting Interlinear Text Using XML Technologies
Encoding and Presenting Interlinear Text Using XML TechnologiesEncoding and Presenting Interlinear Text Using XML Technologies
Encoding and Presenting Interlinear Text Using XML Technologies
Baden Hughes
 
Iad2 0809Q3 Feedback Kwartaalopdracht
Iad2 0809Q3 Feedback KwartaalopdrachtIad2 0809Q3 Feedback Kwartaalopdracht
Iad2 0809Q3 Feedback KwartaalopdrachtHans Kemp
 
1st Trimester Sponges
1st Trimester Sponges1st Trimester Sponges
1st Trimester Sponges
Corey Topf
 
Interactieve datavisualisaties
Interactieve datavisualisatiesInteractieve datavisualisaties
Interactieve datavisualisatiesHans Kemp
 
Actividad 3
Actividad 3Actividad 3

Viewers also liked (20)

Week 35 Sponge
Week 35 SpongeWeek 35 Sponge
Week 35 Sponge
 
Good to Great - Journal
Good to Great - JournalGood to Great - Journal
Good to Great - Journal
 
Week 23 Sponges
Week 23 SpongesWeek 23 Sponges
Week 23 Sponges
 
0708 Usability Test Methodes
0708 Usability Test Methodes0708 Usability Test Methodes
0708 Usability Test Methodes
 
Promozione dei contenuti web sui motori di ricerca - Eupolis Regione Lombardia
Promozione dei contenuti web sui motori di ricerca - Eupolis Regione LombardiaPromozione dei contenuti web sui motori di ricerca - Eupolis Regione Lombardia
Promozione dei contenuti web sui motori di ricerca - Eupolis Regione Lombardia
 
Elasticity 1
Elasticity 1Elasticity 1
Elasticity 1
 
Techo may9th
Techo may9thTecho may9th
Techo may9th
 
For Sale 7914 Skyview St Slideshow
For Sale 7914 Skyview St SlideshowFor Sale 7914 Skyview St Slideshow
For Sale 7914 Skyview St Slideshow
 
Adult learning
Adult learningAdult learning
Adult learning
 
0809 UXD minors graduation information
0809 UXD minors graduation information0809 UXD minors graduation information
0809 UXD minors graduation information
 
Gardeners Not Gate Keepers
Gardeners Not Gate KeepersGardeners Not Gate Keepers
Gardeners Not Gate Keepers
 
Multimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi PlatformMultimodal, Crossmedia, Multi Platform
Multimodal, Crossmedia, Multi Platform
 
Iad2 0910 q1 hoorcollege 4
Iad2 0910 q1 hoorcollege 4Iad2 0910 q1 hoorcollege 4
Iad2 0910 q1 hoorcollege 4
 
Zappos - Connect 09 - 5-13-09
Zappos - Connect 09 - 5-13-09Zappos - Connect 09 - 5-13-09
Zappos - Connect 09 - 5-13-09
 
Encoding and Presenting Interlinear Text Using XML Technologies
Encoding and Presenting Interlinear Text Using XML TechnologiesEncoding and Presenting Interlinear Text Using XML Technologies
Encoding and Presenting Interlinear Text Using XML Technologies
 
Iad2 0809Q3 Feedback Kwartaalopdracht
Iad2 0809Q3 Feedback KwartaalopdrachtIad2 0809Q3 Feedback Kwartaalopdracht
Iad2 0809Q3 Feedback Kwartaalopdracht
 
s
ss
s
 
1st Trimester Sponges
1st Trimester Sponges1st Trimester Sponges
1st Trimester Sponges
 
Interactieve datavisualisaties
Interactieve datavisualisatiesInteractieve datavisualisaties
Interactieve datavisualisaties
 
Actividad 3
Actividad 3Actividad 3
Actividad 3
 

Similar to Refactoring Metadata:

Creating Linked Data from Relational Databases
Creating Linked Data from Relational DatabasesCreating Linked Data from Relational Databases
Creating Linked Data from Relational Databases
Nikolaos Konstantinou
 
[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...
[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...
[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...
DataScienceConferenc1
 
Evaluation criteria for nosql databases
Evaluation criteria for nosql databasesEvaluation criteria for nosql databases
Evaluation criteria for nosql databases
Ebenezer Daniel
 
Metadata mapping
Metadata mappingMetadata mapping
Metadata mapping
Vlad Vega
 
Towards an automatic semantic integration of information
Towards an automatic semantic integration of informationTowards an automatic semantic integration of information
Towards an automatic semantic integration of information
tmra
 
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
IEEEMEMTECHSTUDENTSPROJECTS
 
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
ASIS&T
 
Taxonomy Quality Assessment
Taxonomy Quality AssessmentTaxonomy Quality Assessment
Taxonomy Quality Assessment
Semantic Web Company
 
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
Axel Reichwein
 
22 owl section 1
22 owl    section 122 owl    section 1
22 owl section 1
Sharat Jagannath
 
Geospatial Ontologies and GeoSPARQL Services
Geospatial Ontologies and GeoSPARQL ServicesGeospatial Ontologies and GeoSPARQL Services
Geospatial Ontologies and GeoSPARQL Services
Stephane Fellah
 
Instance Matching Benchmarks in the ERA of Linked Data - ISWC2017
Instance Matching Benchmarks in the ERA of Linked Data - ISWC2017Instance Matching Benchmarks in the ERA of Linked Data - ISWC2017
Instance Matching Benchmarks in the ERA of Linked Data - ISWC2017
Holistic Benchmarking of Big Linked Data
 
REST and Linked Data: a match made for domain driven development?
REST and Linked Data: a match made for domain driven development?REST and Linked Data: a match made for domain driven development?
REST and Linked Data: a match made for domain driven development?
ruyalarcon
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
Giorgos Santipantakis
 
Bridging the gap between the semantic web and big data: answering SPARQL que...
Bridging the gap between the semantic web and big data:  answering SPARQL que...Bridging the gap between the semantic web and big data:  answering SPARQL que...
Bridging the gap between the semantic web and big data: answering SPARQL que...
IJECEIAES
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Jenn Riley
 
ICPW2007.Paschke
ICPW2007.PaschkeICPW2007.Paschke
ICPW2007.Paschke
pragmaticweb
 
Making the Conceptual Layer Real via HTTP based Linked Data
Making the Conceptual Layer Real via HTTP based Linked DataMaking the Conceptual Layer Real via HTTP based Linked Data
Making the Conceptual Layer Real via HTTP based Linked Data
Kingsley Uyi Idehen
 
A_Logical_Design_Methodology_for_Relational_Databa.pdf
A_Logical_Design_Methodology_for_Relational_Databa.pdfA_Logical_Design_Methodology_for_Relational_Databa.pdf
A_Logical_Design_Methodology_for_Relational_Databa.pdf
XANDERHERNANDEZ5
 
Mc0077 – advanced database systems
Mc0077 – advanced database systemsMc0077 – advanced database systems
Mc0077 – advanced database systems
Rabby Bhatt
 

Similar to Refactoring Metadata: (20)

Creating Linked Data from Relational Databases
Creating Linked Data from Relational DatabasesCreating Linked Data from Relational Databases
Creating Linked Data from Relational Databases
 
[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...
[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...
[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...
 
Evaluation criteria for nosql databases
Evaluation criteria for nosql databasesEvaluation criteria for nosql databases
Evaluation criteria for nosql databases
 
Metadata mapping
Metadata mappingMetadata mapping
Metadata mapping
 
Towards an automatic semantic integration of information
Towards an automatic semantic integration of informationTowards an automatic semantic integration of information
Towards an automatic semantic integration of information
 
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
 
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
RDAP 15: Beyond Metadata: Leveraging the “README” to support disciplinary Doc...
 
Taxonomy Quality Assessment
Taxonomy Quality AssessmentTaxonomy Quality Assessment
Taxonomy Quality Assessment
 
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
Open Services for Lifecycle Collaboration (OSLC) - Extending REST APIs to Con...
 
22 owl section 1
22 owl    section 122 owl    section 1
22 owl section 1
 
Geospatial Ontologies and GeoSPARQL Services
Geospatial Ontologies and GeoSPARQL ServicesGeospatial Ontologies and GeoSPARQL Services
Geospatial Ontologies and GeoSPARQL Services
 
Instance Matching Benchmarks in the ERA of Linked Data - ISWC2017
Instance Matching Benchmarks in the ERA of Linked Data - ISWC2017Instance Matching Benchmarks in the ERA of Linked Data - ISWC2017
Instance Matching Benchmarks in the ERA of Linked Data - ISWC2017
 
REST and Linked Data: a match made for domain driven development?
REST and Linked Data: a match made for domain driven development?REST and Linked Data: a match made for domain driven development?
REST and Linked Data: a match made for domain driven development?
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
Bridging the gap between the semantic web and big data: answering SPARQL que...
Bridging the gap between the semantic web and big data:  answering SPARQL que...Bridging the gap between the semantic web and big data:  answering SPARQL que...
Bridging the gap between the semantic web and big data: answering SPARQL que...
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
 
ICPW2007.Paschke
ICPW2007.PaschkeICPW2007.Paschke
ICPW2007.Paschke
 
Making the Conceptual Layer Real via HTTP based Linked Data
Making the Conceptual Layer Real via HTTP based Linked DataMaking the Conceptual Layer Real via HTTP based Linked Data
Making the Conceptual Layer Real via HTTP based Linked Data
 
A_Logical_Design_Methodology_for_Relational_Databa.pdf
A_Logical_Design_Methodology_for_Relational_Databa.pdfA_Logical_Design_Methodology_for_Relational_Databa.pdf
A_Logical_Design_Methodology_for_Relational_Databa.pdf
 
Mc0077 – advanced database systems
Mc0077 – advanced database systemsMc0077 – advanced database systems
Mc0077 – advanced database systems
 

More from Baden Hughes

Closing the Gap: Data Models for Documentary Linguistics
Closing the Gap: Data Models for Documentary LinguisticsClosing the Gap: Data Models for Documentary Linguistics
Closing the Gap: Data Models for Documentary Linguistics
Baden Hughes
 
Managing Perl Installations: A SysAdmin's View
Managing Perl Installations: A SysAdmin's ViewManaging Perl Installations: A SysAdmin's View
Managing Perl Installations: A SysAdmin's View
Baden Hughes
 
If We're Not There Yet, How Far Do We Have To Go ? Web Metadata at The Univer...
If We're Not There Yet, How Far Do We Have To Go ? Web Metadata at The Univer...If We're Not There Yet, How Far Do We Have To Go ? Web Metadata at The Univer...
If We're Not There Yet, How Far Do We Have To Go ? Web Metadata at The Univer...
Baden Hughes
 
Building Computational Grids with Apple’s Xgrid Middleware
Building Computational Grids with Apple’s Xgrid MiddlewareBuilding Computational Grids with Apple’s Xgrid Middleware
Building Computational Grids with Apple’s Xgrid Middleware
Baden Hughes
 
Functional Requirements for an Interlinear Text Editor
Functional Requirements for an Interlinear Text EditorFunctional Requirements for an Interlinear Text Editor
Functional Requirements for an Interlinear Text Editor
Baden Hughes
 
Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Pro...
Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Pro...Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Pro...
Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Pro...
Baden Hughes
 
Disambiguating Advanced Computing for Humanities Researchers
Disambiguating Advanced Computing for Humanities ResearchersDisambiguating Advanced Computing for Humanities Researchers
Disambiguating Advanced Computing for Humanities Researchers
Baden Hughes
 
Metadata Quality Evaluation: Experience from the Open Language Archives Commu...
Metadata Quality Evaluation: Experience from the Open Language Archives Commu...Metadata Quality Evaluation: Experience from the Open Language Archives Commu...
Metadata Quality Evaluation: Experience from the Open Language Archives Commu...
Baden Hughes
 
Towards a Web Search Service for Minority Language Communities
Towards a Web Search Service for Minority Language CommunitiesTowards a Web Search Service for Minority Language Communities
Towards a Web Search Service for Minority Language Communities
Baden Hughes
 
Change Management and Versioning in Ontologies
Change Management and Versioning in OntologiesChange Management and Versioning in Ontologies
Change Management and Versioning in Ontologies
Baden Hughes
 
Object Reuse and Exchange (ORE) : Experience in the Open Language Archives Co...
Object Reuse and Exchange (ORE) : Experience in the Open Language Archives Co...Object Reuse and Exchange (ORE) : Experience in the Open Language Archives Co...
Object Reuse and Exchange (ORE) : Experience in the Open Language Archives Co...
Baden Hughes
 
The Effects of Cross-Pollination : How non-library mass market services are c...
The Effects of Cross-Pollination : How non-library mass market services are c...The Effects of Cross-Pollination : How non-library mass market services are c...
The Effects of Cross-Pollination : How non-library mass market services are c...
Baden Hughes
 
Why Digitization Increases the Value of Print Collections
Why Digitization Increases the Value of Print CollectionsWhy Digitization Increases the Value of Print Collections
Why Digitization Increases the Value of Print Collections
Baden Hughes
 

Refactoring Metadata:

  • 1. Refactoring Metadata: Finding architectural compatibility through structural comparisons
      Baden Hughes
      Department of Computer Science and Software Engineering, The University of Melbourne
      May 2-6, 2004. Copyright © 2004 Baden Hughes
  • 2. Agenda
      • Motivation for Refactoring Metadata
      • Setting the Context
      • Identifying Points of Comparison
      • Goals for Structural Comparison
      • Methods for Structural Comparison
      • Refactoring in Practice
      • Principles for Robust Instances
      • Conclusion
  • 3. Motivations for Refactoring Metadata
      • The need to address the problem of metadata volatility is conceptually juxtaposed with the motivation for metadata creation
      • XML technologies have become pervasive within the metadata domain
      • Different communities = different standards = different degrees of maturity in metadata adoption, resulting in metadata being (highly?) variable even within an organization
      • Systematically determining similarity and difference is the key to effective refactoring of metadata
      • Automatically determining similarity and difference is the key to efficient refactoring of metadata
  • 4. Setting the Context
      • XML-based metadata from natural language engineering and digital libraries
      • Wide variety of
          – traditions of metadata development
          – technologies for metadata implementation
          – objects described by metadata
          – granularity of metadata descriptions
      • Motivated by interoperability analysis
      • Seeking to leverage processes not dissimilar to database schema comparisons
  • 5. Identifying Points of Comparison
      • Robust instances require both syntactic and semantic analysis
      • Points of comparison
          – XML Document Instance
          – DTDs
          – Schemata
          – Namespaces
          – RDF Instances
          – Ontologies
      • Likely that different methods are required for each different input
  • 6. Goals of Structural Comparison
      • While validation of XML-based metadata does contribute to the quality of metadata, it does not necessarily assist in determining architectural compatibility
      • Systematic, iterative evaluation of metadata architectures can contribute to the maturity of XML-based metadata
      • Quantifying the degree of syntactic and semantic similarity is an important first step in the refactoring process; it may in fact demonstrate viability
  • 7. Methods for Structural Comparison
      • Different methods for structural comparison depending on the input
          – XML documents: trees
          – DTDs: regexps and feature structures
          – XML Namespaces: feature structures
          – XML Schemas: regexps and graph matching
          – RDF Instances: graph matching
          – Ontologies: feature structures and graph matching
  • 8. Tree Matching
      • Common conception of an XML document as a tree structure
      • Tree matching is a widely used IE/IR technique for structured data, and is applicable to XML-based metadata
      • Tree matching is largely derived from pattern matching, and is largely independent of syntactic or semantic constraints
      • While tree matching can provide basic information about the similarity of two documents, architectural compatibility requires a deeper analysis
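
As an illustration of the "basic information about similarity" that tree matching can give, here is a minimal sketch in Python. It is not the method from the talk: it swaps in a simple path-overlap measure (Jaccard similarity over root-to-element tag paths) for a full tree-matching algorithm, and the function names and sample records are hypothetical.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the set of root-to-element tag paths found in an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        paths.add(path)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return paths


def path_similarity(xml_a, xml_b):
    """Jaccard similarity of the two documents' tag-path sets (0.0 to 1.0)."""
    a, b = tag_paths(xml_a), tag_paths(xml_b)
    return len(a & b) / len(a | b)


# Hypothetical metadata records, invented for illustration only.
RECORD_A = "<record><title>t</title><creator>c</creator><date>d</date></record>"
RECORD_B = "<record><title>t</title><creator>c</creator><subject>s</subject></record>"

if __name__ == "__main__":
    print(path_similarity(RECORD_A, RECORD_B))
```

The measure ignores element order, repetition, attributes and text content, which is one reason a score like this is only a first indication and not evidence of architectural compatibility.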
  • 9. Regular Expression Matching
      • DTD syntax is derived from regular expressions
      • Well-known evaluation methodologies for regexps are applicable to DTDs
      • In contrast to pure syntactic comparison, regexp matching allows the discovery of the legal constituents of syntactic structures
      • Regexp evaluation is highly efficient even on large metadata collections, and widely implemented in common programming languages
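
The link between DTD content models and regular expressions can be sketched as follows. This is a hedged illustration, not the talk's implementation: it assumes a simplified content-model syntax (element names, `,`, `|`, parentheses and the `?`, `*`, `+` operators, with no `#PCDATA`, `ANY` or `EMPTY`), and the helper names and sample models are hypothetical.

```python
import re


def content_model_to_regex(model):
    """Translate a simplified DTD content model into a compiled Python regex.

    Each element name becomes a group matching the name followed by a space,
    so a sequence of children is encoded as "title creator date ".
    """
    tokens = re.findall(r"[A-Za-z_][\w.\-]*|[(),|?*+]", model)
    parts = []
    for tok in tokens:
        if tok == ",":
            continue  # a DTD sequence is plain concatenation in a regex
        elif tok == "(":
            parts.append("(?:")
        elif tok in ")|?*+":
            parts.append(tok)  # same meaning in both notations
        else:
            parts.append("(?:" + re.escape(tok) + " )")
    return re.compile("".join(parts))


def accepts(pattern, children):
    """True if the list of child element names is legal under the model."""
    encoded = "".join(name + " " for name in children)
    return pattern.fullmatch(encoded) is not None


# Hypothetical content models for the same element in two DTDs.
MODEL_A = content_model_to_regex("(title, creator+, date?)")
MODEL_B = content_model_to_regex("(title, creator*, (date | issued)?)")

if __name__ == "__main__":
    for seq in (["title", "creator", "date"], ["title"], ["title", "creator", "issued"]):
        print(seq, accepts(MODEL_A, seq), accepts(MODEL_B, seq))
```

Running child sequences drawn from real instances through both patterns shows which structures are legal under one DTD but not the other, which is the kind of "legal constituents" information the slide refers to.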
  • 10. Feature Structure Matching
      • Typed feature structures are widely used for deriving controlled vocabularies
          – XML attribute instances can typically be reduced to typed feature structures for comparison
      • Evaluation of the semantic content of feature structures is well grounded in formal logic
      • Feature structure comparisons can also reveal syntactic constraints expressed as dimensionality of feature matrices
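
A common operation for comparing feature structures is unification, which succeeds when two structures carry compatible information and fails on a clash. The sketch below is a simplification under stated assumptions: feature structures are untyped nested dictionaries rather than the typed structures the slide mentions, and the element and attribute names in the examples are hypothetical.

```python
def unify(fs_a, fs_b):
    """Unify two feature structures given as nested dicts.

    Returns a new merged dict, or None if the structures clash
    (the same feature carries two different atomic values).
    """
    result = dict(fs_a)
    for feature, value_b in fs_b.items():
        if feature not in result:
            result[feature] = value_b
            continue
        value_a = result[feature]
        if isinstance(value_a, dict) and isinstance(value_b, dict):
            merged = unify(value_a, value_b)
            if merged is None:
                return None
            result[feature] = merged
        elif value_a != value_b:
            return None
    return result


# Hypothetical attribute sets reduced to feature structures.
FS_A = {"element": "language", "attrs": {"scheme": "ISO639", "code": "en"}}
FS_B = {"element": "language", "attrs": {"scheme": "ISO639"}, "refine": "subject"}
FS_C = {"element": "language", "attrs": {"scheme": "RFC3066"}}

if __name__ == "__main__":
    print(unify(FS_A, FS_B))  # compatible: merged structure
    print(unify(FS_A, FS_C))  # clash on attrs.scheme: None
```

A successful unification suggests the two attribute sets can be folded into one description, while a failure pinpoints a feature that would need mapping or conversion during refactoring.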
  • 11. Graph Matching
      • Rich XML representations such as RDF can be construed as a series of arcs and nodes, allowing the adoption of graph theory techniques for the determination of isomorphism
      • Finding the minimum and maximum common subgraphs is a technique which can be used to determine architectural compatibility in the syntactic domain
      • Graph matching is primarily syntactic, although it can also be applied to semantic analysis on sources such as ontologies
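
To make the isomorphism idea concrete, the following is a minimal brute-force sketch, assuming each graph is a set of (subject, predicate, object) triples. It treats every node name as freely renameable (as if all nodes were blank nodes) and keeps predicates fixed; real RDF tooling would rename only blank nodes, and the exhaustive search is practical only for very small graphs. The function names and sample graphs are hypothetical, and the sketch checks whole-graph isomorphism only, not common subgraphs.

```python
from itertools import permutations


def nodes_of(triples):
    """Sorted list of all nodes appearing as subject or object."""
    return sorted({n for s, _, o in triples for n in (s, o)})


def find_isomorphism(graph_a, graph_b):
    """Return a node mapping that turns graph_a into graph_b, or None.

    Tries every assignment of graph_a's nodes to graph_b's nodes.
    """
    a, b = set(graph_a), set(graph_b)
    nodes_a, nodes_b = nodes_of(a), nodes_of(b)
    if len(a) != len(b) or len(nodes_a) != len(nodes_b):
        return None
    for perm in permutations(nodes_b):
        mapping = dict(zip(nodes_a, perm))
        if {(mapping[s], p, mapping[o]) for s, p, o in a} == b:
            return mapping
    return None


# Hypothetical graphs, invented for illustration only.
G1 = {("r1", "dc:creator", "p1"), ("r1", "dc:language", "l1")}
G2 = {("x", "dc:creator", "y"), ("x", "dc:language", "z")}
G3 = {("x", "dc:creator", "y"), ("y", "dc:language", "z")}

if __name__ == "__main__":
    print(find_isomorphism(G1, G2))
    print(find_isomorphism(G1, G3))
```

Here G1 and G2 have the same shape under renaming, while G3 hangs the language arc off a different node and so does not match.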
  • 12. Refactoring in Practice
      • XML Documents
      • DTDs
      • Namespaces
      • Schemata
      • RDF instances
      • Ontologies
      • See http://www.cs.mu.oz.au/~badenh/projects/metadata-comparison for demo materials
  • 13. Principles for Robust Instances
      • Both syntactic and semantic analysis are required
      • Initiate comparisons at the highest level and proceed downwards, since higher-level incompatibilities are more complex to resolve
      • Quantifying the degree of similarity is extremely important, as it directly affects the complexity of refactoring processes
      • Accurately identified commonalities at both syntactic and semantic levels can be leveraged efficiently
  • 14. Conclusion
      • Adopting and permuting a range of techniques for structural comparison from a variety of other disciplines can lead to efficient methods for metadata structural analysis and consequently refactoring
      • Large scale metadata management requires an automated approach to both syntactic and semantic evaluation in order to contribute to ROI
  • 15. Acknowledgements
      • National Science Foundation Grant Number 9910603 (International Standards in Language Engineering)
      • National Science Foundation Grant Number 0317826 (Querying Linguistic Databases)