SlideShare a Scribd company logo
1 of 7
Download to read offline
STI INNSBRUCK
COMBINING AND EASING THE
ACCESS OF THE ESWC SEMANTIC
WEB DATA
Dieter Fensel, and Alex Oberhauser
STI Innsbruck, University of Innsbruck,
Technikerstraße 21a, 6020 Innsbruck, Austria
firstname.lastname@sti2.at
SEMANTIC TECHNOLOGY INSTITUTE INNSBRUCK
STI INNSBRUCK
Technikerstraße 21a
A - 6020 Innsbruck
Austria
http://www.sti-innsbruck.a
Combining and easing the access of the
ESWC Semantic Web Data
Alex Oberhauser, Dieter Fensel
Semantic Technology Institute (STI)
6020 Innsbruck, Austria
{firstname}.{lastname}@sti2.at
Abstract. The following paper describes the combination of conference
metadata stored on the conference website with Semantic Web Dog Food
metadata retrieved from a paper submission system. The main goal is to
increase the visibility of publicly available research papers and related
conference information.
The outcome of this paper is an alignment algorithm and a related
implementation that supports the alignment process of the two metadata
sets.
Keywords: ESWC, easychair.org, conference, linked open data, metadata
1 Introduction
In order to increase the visibility of research papers and conference information on the web
this paper presents an approach that aligns conference metadata, retrieved from the official
conference website, with Semantic Web Dog Food (SWDF) metadata, retrieved from the
paper submission system easychair.org. Where applicable the SWDF metadata vocabularies
are changed to vocabularies that are understandable by all major search engines1. This step
increases further the discoverable via search engines.
The alignment process is supported by a mapping algorithm and a related
implementation that should ease the combination of the two metadata sets. Additional it
describes how to setup the content management system (CMS) of the conference website in
order to define the mappings from the CMS to RDF/RDFa/Microdata.
This paper describes the technical approach with an implementation in section 3 and the
mapping of the vocabularies in section 4. The last section, section 5, concludes the paper.
2 Motivation
The main motivation behind this approach is to increase the visibility of research publications
by combining the two related, but physically separated metadata sources and publishing them
in a easy accessible form. The first data set is the metadata from the paper submission system
and the second is the metadata provided by the conference website. As shown in Figure 1, the
two data sets are mainly overlapping, but described with different vocabularies. This makes
1 schema.org was developed by Bing, Google, Yahoo! and Yandex
1
the automatic alignment, with current technology, impossible and decreases the visibility and
automatic discoverable. It increases also the effort of an human reader to check if concepts, in
the two sets, are equal.
Figure 1. Overlapping of metadata sets
Additional the import of paper submissions into the conference website was in the past
a time consuming, manual effort that results in most cases to human-only readable data. The
presented approach overcomes these weaknesses with the help of an mapping algorithm that
takes the available information from the Semantic Web Dog Food website and imports them to
the content management system of the related conference website. This approach is possible,
because the Semantic Web Dog Food dataset gets the information from the widely used
conference system easychair.org. Easychair is used by the most conferences for the submit
process.
The import of semantic-enriched information has additional the advantage that it could
be used for the search engine optimization process.
3 Conceptual Metadata Integration
In order to be able to import the data into the CMS and later one back to the Semantic Web Dog
Food (SWDF) a mapping has to be defined. Figure 2 visualizes the different metadata set. For
this purpose the input source (SWDF data dump) is mapped with the help of the following rules:
1. schema.org vocabulary for the mapping from SWDF vocabulary.
2. If no concept in schema.org, fallback to sti2/oc vocabulary.
3. If no concept in sti2/oc vocabulary, fallback to original ontology.
Table 1 and 2 outline a few, representative examples how such a mapping looks like.
Semantic Web Dog Food Drupal Content Type RDF Concept
foaf:Organization Organization schema:Organization
foaf:Person Person schema:Person
swc:ConferenceEvent ConferenceEvent sti2:ESWC
swc:KeynoteTalk KeynoteTalk swc:KeynoteTalk
2
swc:Chair Chair swc:Chair
Table 1. Class mapping examples
Semantic Web Dog Food Drupal Content Type RDF Concept
rdfs:label title schema:name
ical:description body[Body] schema:description
swrc:affiliation affiliation schema:affiliation
ical:summary body[Summary] ical:summary
swc:booktitle booktitle swc:booktitle
Table 2. Property mapping examples
Figure 2. Metadata Sets
The reason why schema.org was chosen for the primary mapping is the use of this
vocabulary by search engines. After the import into the CMS the metadata could be used for
search engine optimization (SEO). The good tool support for microdata makes the metadata
easy publicly accessible. As intermediate step the data has to be imported into the CMS. For
this purpose the SWDF data has to be mapped to the internal representation. This mapping
must match with the internal class structure of the implementation. The CMS internals are not
visible from the outside and have no effect of the end result. Important are only the annotation of
the properties and classes. That is the reason why the ContentTypes in the CMS have the same
name as the concepts in the source vocabulary, but without the namespace.
4 Technical Integration
3
The ESWC to CMS process consists of different steps that are partially supported by an
implementation. The first step is to get the conference data from the Semantic Web Dog Food
platform. For the current implementation the following URL is used: http://data.semanticweb.org/
conference/eswc/2012/complete. During the development the RDF/XML dump was imported to
OWLIM and exported as N3. This newly created dump can be used by the implementation that
takes this data and imports it to the CMS Drupal 7.x. For this purpose the CMS must be setup
correctly. That means that the ContentTypes of the CMS have to match with the ContentTypes
used by the implementation. Another step that must be performed in the CMS is the annotation
of the properties with the right vocabularies. This step is supported by a script that is able to
extract all classes from the dump.
In order to use the conference integration implementation the content management
system Drupal 7.x has to be enriched by the following modules:
RESTful Web Service [2]: This module provides a REST API that is used for the import
process.
RDFx [1]: Enriches the CMS with RDF capabilities for the mapping from internal
properties to RDF vocabularies. In combination with the RESTful Web Services module
it allows the serialization in different formats, e.g. RDF/XML, NTriples, Turtle.
SPARQL [3]: The SPARQL module allows the data extraction with the help of SPARQL
1.0 queries. This extension is helpful for the export of the metadata.
Figure 3. Implementation Architecture Diagram
The related implementation extracts from the gained RDF dump the needed concepts and
export them via REST API into the CMS. For this purpose an in-memory triplestore and
SPARQL is used. Each class instance with related properties is mapped (with the help of
SPARQL SELECT queries) to the code internal classes. Each class represents the mapping
4
from one class from the ontology to one ContentType in the CMS. The instantiated objects
are then converted to an XML message that is understandable by the content management
systems.
Such an XML messages represents the internal ContentType with related fields. The
architecture diagram in Figure 3 describes graphical the import process.
5 Conclusion
The approach presented in this paper outlines a strategy how to improve the visibility of
conference data. For this purpose a basic alignment process was defined and supported by an
implementation that should ease the alignment. The alignment algorithm describes a strategy
how to improve the visibility and could be adapted to other use cases.
As data sources the data from Semantic Web Dog Food and the internal data from the
conference website was used. The algorithm was demonstrated on the base of easychair.org
data and the content management system Drupal 7.x. The implementation was written as Ruby
library. This allows the integration of the implementation into a bigger platform.
A Mapping Definitions
Prefix Namespace
swdf http://data.semanticweb.org/ns/swc/ontology#
foaf http://xmlns.com/foaf/0.1/
swrc http://swrc.ontoware.org/ontology#
oc - (Online Communication Ontology, no URL yet)
Semantic Web Dog Food Drupal Content Type RDF Concept
swdf:ConferenceEvent ConferenceEvent oc:ConferenceSeries (or for
ESWC: sti2:EWSC)
foaf:Organization Organization schema:Organization
foaf:Person Person schema:Person
swdf:KeynoteTalk KeynoteTalk swdf:KeynoteTalk
swrc:InProceedings InProceedings swrc:InProceedings
swdf:Chair Chair swdf:Chair
swdf:Presenter Presenter swdf:Presenter
swrc:Proceedings Proceedings swrc:Proceedings
5
Table 3. Class Mapping
References
1. Drupal RDFx module. http://drupal.org/project/rdfx, accessed: 14.05.2012
2. Drupal RESTful Web Services module. http://drupal.org/project/restws, accessed:
14.05.2012
3. Drupal SPARQL module. http://drupal.org/project/sparql, accessed: 14.05.2012
4. Semantic web dog food. http://data.semanticweb.org/, accessed: 14.05.2012
6

More Related Content

What's hot

Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfolioguestc38d4b
 
An Extension of RETRO Framework: Translating SQL Insert, Update and Delete Qu...
An Extension of RETRO Framework: Translating SQL Insert, Update and Delete Qu...An Extension of RETRO Framework: Translating SQL Insert, Update and Delete Qu...
An Extension of RETRO Framework: Translating SQL Insert, Update and Delete Qu...Editor IJCATR
 
Ssis sql ssrs_sp_hb_li
Ssis sql ssrs_sp_hb_liSsis sql ssrs_sp_hb_li
Ssis sql ssrs_sp_hb_liHong-Bing Li
 
Software Systems Modularization
Software Systems ModularizationSoftware Systems Modularization
Software Systems Modularizationchiao-fan yang
 
Browser Extension to Removing Dust using Sequence Alignment and Content Matc...
Browser Extension to Removing Dust using Sequence Alignment and Content  Matc...Browser Extension to Removing Dust using Sequence Alignment and Content  Matc...
Browser Extension to Removing Dust using Sequence Alignment and Content Matc...IRJET Journal
 
Data Management Systems for Government Agencies - with CKAN
Data Management Systems for Government Agencies - with CKANData Management Systems for Government Agencies - with CKAN
Data Management Systems for Government Agencies - with CKANSteven De Costa
 
A Novel Data Extraction and Alignment Method for Web Databases
A Novel Data Extraction and Alignment Method for Web DatabasesA Novel Data Extraction and Alignment Method for Web Databases
A Novel Data Extraction and Alignment Method for Web DatabasesIJMER
 
Drupal, CKAN and Public Data. DrupalGov 08 february 2016
Drupal, CKAN and Public Data. DrupalGov 08 february 2016Drupal, CKAN and Public Data. DrupalGov 08 february 2016
Drupal, CKAN and Public Data. DrupalGov 08 february 2016Steven De Costa
 
SSIS Project Profile
SSIS Project ProfileSSIS Project Profile
SSIS Project Profiletthompson0421
 

What's hot (12)

p27
p27p27
p27
 
Business Intelligence Portfolio
Business Intelligence PortfolioBusiness Intelligence Portfolio
Business Intelligence Portfolio
 
An Extension of RETRO Framework: Translating SQL Insert, Update and Delete Qu...
An Extension of RETRO Framework: Translating SQL Insert, Update and Delete Qu...An Extension of RETRO Framework: Translating SQL Insert, Update and Delete Qu...
An Extension of RETRO Framework: Translating SQL Insert, Update and Delete Qu...
 
Ssis sql ssrs_sp_hb_li
Ssis sql ssrs_sp_hb_liSsis sql ssrs_sp_hb_li
Ssis sql ssrs_sp_hb_li
 
Software Systems Modularization
Software Systems ModularizationSoftware Systems Modularization
Software Systems Modularization
 
Browser Extension to Removing Dust using Sequence Alignment and Content Matc...
Browser Extension to Removing Dust using Sequence Alignment and Content  Matc...Browser Extension to Removing Dust using Sequence Alignment and Content  Matc...
Browser Extension to Removing Dust using Sequence Alignment and Content Matc...
 
Data Management Systems for Government Agencies - with CKAN
Data Management Systems for Government Agencies - with CKANData Management Systems for Government Agencies - with CKAN
Data Management Systems for Government Agencies - with CKAN
 
A Novel Data Extraction and Alignment Method for Web Databases
A Novel Data Extraction and Alignment Method for Web DatabasesA Novel Data Extraction and Alignment Method for Web Databases
A Novel Data Extraction and Alignment Method for Web Databases
 
Rss Join Engine
Rss Join EngineRss Join Engine
Rss Join Engine
 
Universe
UniverseUniverse
Universe
 
Drupal, CKAN and Public Data. DrupalGov 08 february 2016
Drupal, CKAN and Public Data. DrupalGov 08 february 2016Drupal, CKAN and Public Data. DrupalGov 08 february 2016
Drupal, CKAN and Public Data. DrupalGov 08 february 2016
 
SSIS Project Profile
SSIS Project ProfileSSIS Project Profile
SSIS Project Profile
 

Similar to Combining and easing the access of the eswc semantic web data 0

Document Based Data Modeling Technique
Document Based Data Modeling TechniqueDocument Based Data Modeling Technique
Document Based Data Modeling TechniqueCarmen Sanborn
 
CSE681 – Software Modeling and Analysis Fall 2013 Project .docx
CSE681 – Software Modeling and Analysis Fall 2013 Project .docxCSE681 – Software Modeling and Analysis Fall 2013 Project .docx
CSE681 – Software Modeling and Analysis Fall 2013 Project .docxfaithxdunce63732
 
Adcom2006 Full 6
Adcom2006 Full 6Adcom2006 Full 6
Adcom2006 Full 6umavanth
 
Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfan
 
An Implementation of a New Framework for Automatic Generation of Ontology and...
An Implementation of a New Framework for Automatic Generation of Ontology and...An Implementation of a New Framework for Automatic Generation of Ontology and...
An Implementation of a New Framework for Automatic Generation of Ontology and...IJCSIS Research Publications
 
Sem tech 2011 v8
Sem tech 2011 v8Sem tech 2011 v8
Sem tech 2011 v8dallemang
 
Adopting AnswerModules ModuleSuite
Adopting AnswerModules ModuleSuiteAdopting AnswerModules ModuleSuite
Adopting AnswerModules ModuleSuiteAnswerModules
 
The "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInThe "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInSam Shah
 
The “Big Data” Ecosystem at LinkedIn
The “Big Data” Ecosystem at LinkedInThe “Big Data” Ecosystem at LinkedIn
The “Big Data” Ecosystem at LinkedInKun Le
 
A SEMANTIC BASED APPROACH FOR INFORMATION RETRIEVAL FROM HTML DOCUMENTS USING...
A SEMANTIC BASED APPROACH FOR INFORMATION RETRIEVAL FROM HTML DOCUMENTS USING...A SEMANTIC BASED APPROACH FOR INFORMATION RETRIEVAL FROM HTML DOCUMENTS USING...
A SEMANTIC BASED APPROACH FOR INFORMATION RETRIEVAL FROM HTML DOCUMENTS USING...cscpconf
 
A semantic based approach for information retrieval from html documents using...
A semantic based approach for information retrieval from html documents using...A semantic based approach for information retrieval from html documents using...
A semantic based approach for information retrieval from html documents using...csandit
 
IRJET- Lightweight MVC Framework in PHP
IRJET- Lightweight MVC Framework in PHPIRJET- Lightweight MVC Framework in PHP
IRJET- Lightweight MVC Framework in PHPIRJET Journal
 
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...csandit
 
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...cscpconf
 
A semantic based approach for knowledge discovery and acquistion from multipl...
A semantic based approach for knowledge discovery and acquistion from multipl...A semantic based approach for knowledge discovery and acquistion from multipl...
A semantic based approach for knowledge discovery and acquistion from multipl...csandit
 
IRJET- Data Mining - Secure Keyword Manager
IRJET- Data Mining - Secure Keyword ManagerIRJET- Data Mining - Secure Keyword Manager
IRJET- Data Mining - Secure Keyword ManagerIRJET Journal
 

Similar to Combining and easing the access of the eswc semantic web data 0 (20)

Document Based Data Modeling Technique
Document Based Data Modeling TechniqueDocument Based Data Modeling Technique
Document Based Data Modeling Technique
 
CSE681 – Software Modeling and Analysis Fall 2013 Project .docx
CSE681 – Software Modeling and Analysis Fall 2013 Project .docxCSE681 – Software Modeling and Analysis Fall 2013 Project .docx
CSE681 – Software Modeling and Analysis Fall 2013 Project .docx
 
Adcom2006 Full 6
Adcom2006 Full 6Adcom2006 Full 6
Adcom2006 Full 6
 
IT6701-Information management question bank
IT6701-Information management question bankIT6701-Information management question bank
IT6701-Information management question bank
 
cakephp UDUYKTHA (1)
cakephp UDUYKTHA (1)cakephp UDUYKTHA (1)
cakephp UDUYKTHA (1)
 
Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -Aucfanlab Datalake - Big Data Management Platform -
Aucfanlab Datalake - Big Data Management Platform -
 
An Implementation of a New Framework for Automatic Generation of Ontology and...
An Implementation of a New Framework for Automatic Generation of Ontology and...An Implementation of a New Framework for Automatic Generation of Ontology and...
An Implementation of a New Framework for Automatic Generation of Ontology and...
 
Sem tech 2011 v8
Sem tech 2011 v8Sem tech 2011 v8
Sem tech 2011 v8
 
Adopting AnswerModules ModuleSuite
Adopting AnswerModules ModuleSuiteAdopting AnswerModules ModuleSuite
Adopting AnswerModules ModuleSuite
 
The "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedInThe "Big Data" Ecosystem at LinkedIn
The "Big Data" Ecosystem at LinkedIn
 
The “Big Data” Ecosystem at LinkedIn
The “Big Data” Ecosystem at LinkedInThe “Big Data” Ecosystem at LinkedIn
The “Big Data” Ecosystem at LinkedIn
 
A SEMANTIC BASED APPROACH FOR INFORMATION RETRIEVAL FROM HTML DOCUMENTS USING...
A SEMANTIC BASED APPROACH FOR INFORMATION RETRIEVAL FROM HTML DOCUMENTS USING...A SEMANTIC BASED APPROACH FOR INFORMATION RETRIEVAL FROM HTML DOCUMENTS USING...
A SEMANTIC BASED APPROACH FOR INFORMATION RETRIEVAL FROM HTML DOCUMENTS USING...
 
A semantic based approach for information retrieval from html documents using...
A semantic based approach for information retrieval from html documents using...A semantic based approach for information retrieval from html documents using...
A semantic based approach for information retrieval from html documents using...
 
IRJET- Lightweight MVC Framework in PHP
IRJET- Lightweight MVC Framework in PHPIRJET- Lightweight MVC Framework in PHP
IRJET- Lightweight MVC Framework in PHP
 
E0322035037
E0322035037E0322035037
E0322035037
 
Final paper
Final paperFinal paper
Final paper
 
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
 
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
A SEMANTIC BASED APPROACH FOR KNOWLEDGE DISCOVERY AND ACQUISITION FROM MULTIP...
 
A semantic based approach for knowledge discovery and acquistion from multipl...
A semantic based approach for knowledge discovery and acquistion from multipl...A semantic based approach for knowledge discovery and acquistion from multipl...
A semantic based approach for knowledge discovery and acquistion from multipl...
 
IRJET- Data Mining - Secure Keyword Manager
IRJET- Data Mining - Secure Keyword ManagerIRJET- Data Mining - Secure Keyword Manager
IRJET- Data Mining - Secure Keyword Manager
 

More from STIinnsbruck

More from STIinnsbruck (20)

Unister
UnisterUnister
Unister
 
Twoo
TwooTwoo
Twoo
 
Twibes
TwibesTwibes
Twibes
 
Tweet deck 2012-01-02
Tweet deck 2012-01-02Tweet deck 2012-01-02
Tweet deck 2012-01-02
 
Tv handbook revised_100120141
Tv handbook revised_100120141Tv handbook revised_100120141
Tv handbook revised_100120141
 
Tv feratel 13032014
Tv feratel 13032014Tv feratel 13032014
Tv feratel 13032014
 
Tv evaluation 12032014
Tv evaluation 12032014Tv evaluation 12032014
Tv evaluation 12032014
 
T vb publication_rules_11032014
T vb publication_rules_11032014T vb publication_rules_11032014
T vb publication_rules_11032014
 
T vb mapping_implementation_25032014
T vb mapping_implementation_25032014T vb mapping_implementation_25032014
T vb mapping_implementation_25032014
 
T vb alignment_022814_0
T vb alignment_022814_0T vb alignment_022814_0
T vb alignment_022814_0
 
Ttr 20130701
Ttr 20130701Ttr 20130701
Ttr 20130701
 
Ttg mapping to_schema.org_
Ttg mapping to_schema.org_Ttg mapping to_schema.org_
Ttg mapping to_schema.org_
 
Ttb 08042014
Ttb 08042014Ttb 08042014
Ttb 08042014
 
Trust you
Trust youTrust you
Trust you
 
Tripwolf
TripwolfTripwolf
Tripwolf
 
Tripbirds
TripbirdsTripbirds
Tripbirds
 
Traveltainment
TraveltainmentTraveltainment
Traveltainment
 
Travelaudience
TravelaudienceTravelaudience
Travelaudience
 
Tourismuszukunft
TourismuszukunftTourismuszukunft
Tourismuszukunft
 
Tourismusverband innsbruck 24.09.2013
Tourismusverband innsbruck 24.09.2013Tourismusverband innsbruck 24.09.2013
Tourismusverband innsbruck 24.09.2013
 

Recently uploaded

Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...Sebastiano Panichella
 
GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024GESCO SE
 
proposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeegerproposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeegerkumenegertelayegrama
 
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...Sebastiano Panichella
 
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxEngaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxAsifArshad8
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRachelAnnTenibroAmaz
 
Internship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SEInternship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SESaleh Ibne Omar
 
Application of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptxApplication of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptxRoquia Salam
 
cse-csp batch4 review-1.1.pptx cyber security
cse-csp batch4 review-1.1.pptx cyber securitycse-csp batch4 review-1.1.pptx cyber security
cse-csp batch4 review-1.1.pptx cyber securitysandeepnani2260
 
General Elections Final Press Noteas per M
General Elections Final Press Noteas per MGeneral Elections Final Press Noteas per M
General Elections Final Press Noteas per MVidyaAdsule1
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptxogubuikealex
 
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptxerickamwana1
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRRsarwankumar4524
 
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunityDon't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunityApp Ethena
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...漢銘 謝
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEMCharmi13
 
A Guide to Choosing the Ideal Air Cooler
A Guide to Choosing the Ideal Air CoolerA Guide to Choosing the Ideal Air Cooler
A Guide to Choosing the Ideal Air Coolerenquirieskenstar
 

Recently uploaded (17)

Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
Testing and Development Challenges for Complex Cyber-Physical Systems: Insigh...
 
GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024GESCO SE Press and Analyst Conference on Financial Results 2024
GESCO SE Press and Analyst Conference on Financial Results 2024
 
proposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeegerproposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeeger
 
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...Testing with Fewer Resources:  Toward Adaptive Approaches for Cost-effective ...
Testing with Fewer Resources: Toward Adaptive Approaches for Cost-effective ...
 
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxEngaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
 
Internship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SEInternship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SE
 
Application of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptxApplication of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptx
 
cse-csp batch4 review-1.1.pptx cyber security
cse-csp batch4 review-1.1.pptx cyber securitycse-csp batch4 review-1.1.pptx cyber security
cse-csp batch4 review-1.1.pptx cyber security
 
General Elections Final Press Noteas per M
General Elections Final Press Noteas per MGeneral Elections Final Press Noteas per M
General Elections Final Press Noteas per M
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptx
 
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
05.02 MMC - Assignment 4 - Image Attribution Lovepreet.pptx
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
 
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunityDon't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
Don't Miss Out: Strategies for Making the Most of the Ethena DigitalOpportunity
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEM
 
A Guide to Choosing the Ideal Air Cooler
A Guide to Choosing the Ideal Air CoolerA Guide to Choosing the Ideal Air Cooler
A Guide to Choosing the Ideal Air Cooler
 

Combining and easing the access of the eswc semantic web data 0

  • 1. STI INNSBRUCK COMBINING AND EASING THE ACCESS OF THE ESWC SEMANTIC WEB DATA Dieter Fensel, and Alex Oberhauser STI Innsbruck, University of Innsbruck, Technikerstraße 21a, 6020 Innsbruck, Austria firstname.lastname@sti2.at SEMANTIC TECHNOLOGY INSTITUTE INNSBRUCK STI INNSBRUCK Technikerstraße 21a A - 6020 Innsbruck Austria http://www.sti-innsbruck.a
  • 2. Combining and easing the access of the ESWC Semantic Web Data Alex Oberhauser, Dieter Fensel Semantic Technology Institute (STI) 6020 Innsbruck, Austria {firstname}.{lastname}@sti2.at Abstract. The following paper describes the combination of conference metadata stored on the conference website with Semantic Web Dog Food metadata retrieved from a paper submission system. The main goal is to increase the visibility of publicly available research papers and related conference information. The outcome of this paper is an alignment algorithm and a related implementation that supports the alignment process of the two metadata sets. Keywords: ESWC, easychair.org, conference, linked open data, metadata 1 Introduction In order to increase the visibility of research papers and conference information on the web this paper presents an approach that aligns conference metadata, retrieved from the official conference website, with Semantic Web Dog Food (SWDF) metadata, retrieved from the paper submission system easychair.org. Where applicable the SWDF metadata vocabularies are changed to vocabularies that are understandable by all major search engines1. This step increases further the discoverable via search engines. The alignment process is supported by a mapping algorithm and a related implementation that should ease the combination of the two metadata sets. Additional it describes how to setup the content management system (CMS) of the conference website in order to define the mappings from the CMS to RDF/RDFa/Microdata. This paper describes the technical approach with an implementation in section 3 and the mapping of the vocabularies in section 4. The last section, section 5, concludes the paper. 2 Motivation The main motivation behind this approach is to increase the visibility of research publications by combining the two related, but physically separated metadata sources and publishing them in a easy accessible form. The first data set is the metadata from the paper submission system and the second is the metadata provided by the conference website. As shown in Figure 1, the two data sets are mainly overlapping, but described with different vocabularies. This makes 1 schema.org was developed by Bing, Google, Yahoo! and Yandex 1
  • 3. the automatic alignment, with current technology, impossible and decreases the visibility and automatic discoverable. It increases also the effort of an human reader to check if concepts, in the two sets, are equal. Figure 1. Overlapping of metadata sets Additional the import of paper submissions into the conference website was in the past a time consuming, manual effort that results in most cases to human-only readable data. The presented approach overcomes these weaknesses with the help of an mapping algorithm that takes the available information from the Semantic Web Dog Food website and imports them to the content management system of the related conference website. This approach is possible, because the Semantic Web Dog Food dataset gets the information from the widely used conference system easychair.org. Easychair is used by the most conferences for the submit process. The import of semantic-enriched information has additional the advantage that it could be used for the search engine optimization process. 3 Conceptual Metadata Integration In order to be able to import the data into the CMS and later one back to the Semantic Web Dog Food (SWDF) a mapping has to be defined. Figure 2 visualizes the different metadata set. For this purpose the input source (SWDF data dump) is mapped with the help of the following rules: 1. schema.org vocabulary for the mapping from SWDF vocabulary. 2. If no concept in schema.org, fallback to sti2/oc vocabulary. 3. If no concept in sti2/oc vocabulary, fallback to original ontology. Table 1 and 2 outline a few, representative examples how such a mapping looks like. Semantic Web Dog Food Drupal Content Type RDF Concept foaf:Organization Organization schema:Organization foaf:Person Person schema:Person swc:ConferenceEvent ConferenceEvent sti2:ESWC swc:KeynoteTalk KeynoteTalk swc:KeynoteTalk 2
  • 4. swc:Chair Chair swc:Chair Table 1. Class mapping examples Semantic Web Dog Food Drupal Content Type RDF Concept rdfs:label title schema:name ical:description body[Body] schema:description swrc:affiliation affiliation schema:affiliation ical:summary body[Summary] ical:summary swc:booktitle booktitle swc:booktitle Table 2. Property mapping examples Figure 2. Metadata Sets The reason why schema.org was chosen for the primary mapping is the use of this vocabulary by search engines. After the import into the CMS the metadata could be used for search engine optimization (SEO). The good tool support for microdata makes the metadata easy publicly accessible. As intermediate step the data has to be imported into the CMS. For this purpose the SWDF data has to be mapped to the internal representation. This mapping must match with the internal class structure of the implementation. The CMS internals are not visible from the outside and have no effect of the end result. Important are only the annotation of the properties and classes. That is the reason why the ContentTypes in the CMS have the same name as the concepts in the source vocabulary, but without the namespace. 4 Technical Integration 3
  • 5. The ESWC to CMS process consists of different steps that are partially supported by an implementation. The first step is to get the conference data from the Semantic Web Dog Food platform. For the current implementation the following URL is used: http://data.semanticweb.org/ conference/eswc/2012/complete. During the development the RDF/XML dump was imported to OWLIM and exported as N3. This newly created dump can be used by the implementation that takes this data and imports it to the CMS Drupal 7.x. For this purpose the CMS must be setup correctly. That means that the ContentTypes of the CMS have to match with the ContentTypes used by the implementation. Another step that must be performed in the CMS is the annotation of the properties with the right vocabularies. This step is supported by a script that is able to extract all classes from the dump. In order to use the conference integration implementation the content management system Drupal 7.x has to be enriched by the following modules: RESTful Web Service [2]: This module provides a REST API that is used for the import process. RDFx [1]: Enriches the CMS with RDF capabilities for the mapping from internal properties to RDF vocabularies. In combination with the RESTful Web Services module it allows the serialization in different formats, e.g. RDF/XML, NTriples, Turtle. SPARQL [3]: The SPARQL module allows the data extraction with the help of SPARQL 1.0 queries. This extension is helpful for the export of the metadata. Figure 3. Implementation Architecture Diagram The related implementation extracts from the gained RDF dump the needed concepts and export them via REST API into the CMS. For this purpose an in-memory triplestore and SPARQL is used. Each class instance with related properties is mapped (with the help of SPARQL SELECT queries) to the code internal classes. Each class represents the mapping 4
  • 6. from one class from the ontology to one ContentType in the CMS. The instantiated objects are then converted to an XML message that is understandable by the content management systems. Such an XML messages represents the internal ContentType with related fields. The architecture diagram in Figure 3 describes graphical the import process. 5 Conclusion The approach presented in this paper outlines a strategy how to improve the visibility of conference data. For this purpose a basic alignment process was defined and supported by an implementation that should ease the alignment. The alignment algorithm describes a strategy how to improve the visibility and could be adapted to other use cases. As data sources the data from Semantic Web Dog Food and the internal data from the conference website was used. The algorithm was demonstrated on the base of easychair.org data and the content management system Drupal 7.x. The implementation was written as Ruby library. This allows the integration of the implementation into a bigger platform. A Mapping Definitions Prefix Namespace swdf http://data.semanticweb.org/ns/swc/ontology# foaf http://xmlns.com/foaf/0.1/ swrc http://swrc.ontoware.org/ontology# oc - (Online Communication Ontology, no URL yet) Semantic Web Dog Food Drupal Content Type RDF Concept swdf:ConferenceEvent ConferenceEvent oc:ConferenceSeries (or for ESWC: sti2:EWSC) foaf:Organization Organization schema:Organization foaf:Person Person schema:Person swdf:KeynoteTalk KeynoteTalk swdf:KeynoteTalk swrc:InProceedings InProceedings swrc:InProceedings swdf:Chair Chair swdf:Chair swdf:Presenter Presenter swdf:Presenter swrc:Proceedings Proceedings swrc:Proceedings 5
  • 7. Table 3. Class Mapping References 1. Drupal RDFx module. http://drupal.org/project/rdfx, accessed: 14.05.2012 2. Drupal RESTful Web Services module. http://drupal.org/project/restws, accessed: 14.05.2012 3. Drupal SPARQL module. http://drupal.org/project/sparql, accessed: 14.05.2012 4. Semantic web dog food. http://data.semanticweb.org/, accessed: 14.05.2012 6