Towards the Integration of a Research Group Website into the Web of Data
Mikel Emaldi, David Buján, Diego López de Ipiña
{m.emaldi, dbujan, dipina}@deusto.es
Deusto Institute of Technology - DeustoTech (DeustoTech - Internet)
November 2011

Table of Contents
1 Motivation
2 Our Solution
    First Approach
    Solution Overview
    Data Extraction
    System Architecture
3 Linked Data Extension
4 Conclusions
5 Future Work

Motivation
We wanted to offer the data of our research group website (http://www.morelab.deusto.es) as Linked Data
  Our website runs on the Joomla! CMS
  The data is unstructured
We chose our publications section as a first attempt
  Almost 100 publications
  Possibility of linking them to external datasets
We saw the opportunity to centralize the group's FOAF files

First Approach
A solution based on a Python web script (mod_python)
The core code of Joomla! had to be modified
This caused a major problem: whenever a security update was installed, Joomla! overwrote our custom code

Solution Overview: Joomla! Extension
A solution based on an extension for Joomla!
  Component
  Plugin
It offers a feasible way to analyze published publications and generate the corresponding Linked Data

Data Extraction: Joomla! Content Example
TALISMAN+: Intelligent System for Follow-Up and Promotion of Personal Autonomy
David Ausín, Diego López-de-Ipiña, José Bravo, Miguel Ángel Valero, Francisco Flórez. TALISMAN+: Intelligent System for Follow-Up and Promotion of Personal Autonomy. III International Workshop on Ambient Assisted Living - IWAAL 2011. Málaga, Spain. June 2011.
The TALISMAN+ project, financed by the Spanish Ministry of Science and Innovation, aims to research and demonstrate innovative solutions transferable to society which offer services and products based on information and communication technologies in order to promote personal autonomy in prevention and monitoring scenarios. It will solve critical interoperability problems among systems and emerging technologies in a context where heterogeneity brings about accessibility barriers not yet overcome and demanded by the scientific, technological or social-health settings.
Download

Data Extraction: Overview
Data is extracted in three ways:
  User-defined regular expression
  DBLP SPARQL endpoint
  Google Scholar search engine

Data Extraction: Regex I
The user defines a regular expression to parse the content
The user has to define the ontologies used and their prefixes in the admin control panel
The regex tags are designed to be easy to read
  The ontology properties to be mapped are enclosed in {}
  Every delimiter (including the {}) is identified by a
  The term {dummy} can be used to ignore content

Data Extraction: Regex II
David Ausín, Diego López-de-Ipiña, José Bravo, Miguel Ángel Valero, Francisco Flórez. TALISMAN+: Intelligent System for Follow-Up and Promotion of Personal Autonomy. III International Workshop on Ambient Assisted Living - IWAAL 2011. Málaga, Spain. June 2011.
The TALISMAN+ project, financed by the Spanish Ministry of Science and Innovation, aims to research and demonstrate innovative solutions transferable to society which offer services and products based on information and communication technologies in order to promote personal autonomy in prevention and monitoring scenarios. It will solve critical interoperability problems among systems and emerging technologies in a context where heterogeneity brings about accessibility barriers not yet overcome and demanded by the scientific, technological or social-health settings.
Download

  {dc:creator, sep(,)}. {dc:title}. {swrc:series}. {swrc:location}. {dc:date}. {bibo:abstract} Download$

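The tag notation above can be read as a template that is translated into an ordinary regular expression. The following is a minimal Python sketch of that idea, not the extension's actual code. The names `compile_template` and `parse` and the sample entry are hypothetical. The sketch assumes that the literal text between tags acts as the delimiter, that `sep(,)` splits a captured value on commas, and that a trailing `$` anchors the end of the content.

```python
import re

# Matches one {...} tag in the template.
TAG = re.compile(r"\{([^{}]*)\}")


def compile_template(template):
    """Translate a tag template into (compiled regex, [(property, separator)])."""
    anchored = template.endswith("$")  # assumption: trailing $ means end of content
    if anchored:
        template = template[:-1]
    parts, fields, pos = [], [], 0
    for tag in TAG.finditer(template):
        # Literal text between tags is treated as a delimiter (assumption).
        parts.append(re.escape(template[pos:tag.start()]))
        name, _, option = tag.group(1).partition(",")
        sep = re.fullmatch(r"\s*sep\((.*)\)\s*", option)
        fields.append((name.strip(), sep.group(1) if sep else None))
        parts.append("(.+?)")  # shortest text up to the next delimiter
        pos = tag.end()
    parts.append(re.escape(template[pos:]))
    if anchored:
        parts.append(r"\s*$")
    return re.compile("".join(parts), re.DOTALL), fields


def parse(template, text):
    """Return {property: value or list of values}, or None if nothing matches."""
    regex, fields = compile_template(template)
    match = regex.match(text.strip())
    if match is None:
        return None
    record = {}
    for (name, sep), value in zip(fields, match.groups()):
        if name == "dummy":  # {dummy} content is ignored
            continue
        if sep:
            record[name] = [v.strip() for v in value.split(sep)]
        else:
            record[name] = value.strip()
    return record


TEMPLATE = ("{dc:creator, sep(,)}. {dc:title}. {swrc:series}. "
            "{swrc:location}. {dc:date}. {bibo:abstract} Download$")
# Hypothetical entry, laid out like the publication shown above.
SAMPLE = ("Ann Author, Bob Builder. A Sample Title. Sample Workshop 2011. "
          "Bilbao, Spain. June 2011. A short abstract. Download")

record = parse(TEMPLATE, SAMPLE)
print(record)
```

Non-greedy groups keep each field as short as the next delimiter allows, so a title or author name that itself contains ". " would be split too early. That is one reason the other extraction sources are useful alongside the regex.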
Data Extraction: DBLP I
Digital Bibliography & Library Project
More than 1.3 million articles
SPARQL endpoint at:
  http://dblp.l3s.de/d2r/sparql/
  http://dblp.l3s.de/d2r/snorql/

Data Extraction: DBLP II
The DBLP SPARQL endpoint is used to search for data about publications
  SELECT DISTINCT ?uri ?p ?o WHERE {?uri dc:title "title-of-article"^^<http://www.w3.org/2001/XMLSchema#string>}
The retrieved data is enriched with our own data and saved into the RDF store
We also link the members' FOAF profiles to the DBLP author data
  <http://www.morelab.deusto.es/resource/dipina> owl:sameAs <http://dblp.l3s.de/d2r/resource/authors/Diego López-de-Ipiña> ;

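As an illustration of the lookup by title, the sketch below builds the query string and the GET request URL in Python without sending anything over the network. It is not the extension's code. The function names are hypothetical, the `dc` prefix URI and the extra `?uri ?p ?o` pattern (added so that `?p` and `?o` are bound) are assumptions, and the endpoint is the one listed on the previous slide.

```python
from urllib.parse import urlencode

DBLP_ENDPOINT = "http://dblp.l3s.de/d2r/sparql/"  # as listed on the DBLP I slide
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def title_query(title):
    """Build a SPARQL query that looks up a publication by its exact title."""
    # Escape backslashes and double quotes so the title stays a valid literal.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return (
        # Assumption: dc is bound to Dublin Core elements 1.1.
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
        "SELECT DISTINCT ?uri ?p ?o WHERE {\n"
        '  ?uri dc:title "' + escaped + '"^^<' + XSD_STRING + "> .\n"
        # Assumption: this pattern binds ?p and ?o to the publication's data.
        "  ?uri ?p ?o .\n"
        "}"
    )


def request_url(query, endpoint=DBLP_ENDPOINT):
    """Encode the query as a GET request URL (SPARQL protocol 'query' parameter)."""
    return endpoint.rstrip("/") + "?" + urlencode({"query": query})


query = title_query("title-of-article")
url = request_url(query)
print(query)
print(url)
```

Keeping query construction separate from the HTTP call makes the escaping easy to check, and the same query string can be pasted into the Snorql page for manual inspection.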
Data Extraction: Google Scholar I
A simple way to broadly search for scholarly literature
  http://scholar.google.com
It exports data in different formats
  BibTeX
  EndNote
  RefMan
  RefWorks
  WenXianWang

Data Extraction: Google Scholar II
The data from Google Scholar is extracted via BibTeX scraping
  An HTTP request using a specific cookie asks for the BibTeX data
  The BibTeX data is retrieved
  The BibTeX data is mapped to RDF

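The last step, mapping BibTeX to RDF, can be sketched as follows. This is only an illustration under stated assumptions: the field-to-property mapping, the title-based resource URI, the helper names, and the sample entry are all hypothetical, and the parser handles only simple entries with one `field = {value}` pair per line. The HTTP request and cookie handling are deliberately left out.

```python
import re

# One "field = {value}" or 'field = "value"' pair per line (simplifying assumption).
FIELD = re.compile(r'(\w+)\s*=\s*[{"](.*?)[}"]\s*,?\s*$', re.MULTILINE)

# Hypothetical mapping from BibTeX fields to the properties used in the regex slides.
MAPPING = {
    "title": "dc:title",
    "author": "dc:creator",
    "booktitle": "swrc:series",
    "year": "dc:date",
}

BASE = "http://www.morelab.deusto.es/resource/"


def slugify(title):
    """Lower-case the title and join its words with hyphens (assumed URI scheme)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def bibtex_to_triples(entry):
    """Return (subject, property, value) tuples for one simple BibTeX entry."""
    fields = {
        key.lower(): value.replace("{", "").replace("}", "").strip()
        for key, value in FIELD.findall(entry)
    }
    subject = BASE + slugify(fields["title"])
    triples = []
    for key, prop in MAPPING.items():
        if key not in fields:
            continue
        # BibTeX separates several authors with the word "and".
        values = fields[key].split(" and ") if key == "author" else [fields[key]]
        triples.extend((subject, prop, value.strip()) for value in values)
    return triples


# Made-up entry for illustration only.
SAMPLE = """@inproceedings{sample2011,
  title={A Sample Title},
  author={Author, Ann and Builder, Bob},
  booktitle={Sample Workshop},
  year={2011}
}"""

triples = bibtex_to_triples(SAMPLE)
for triple in triples:
    print(triple)
```

A production mapping would more likely rely on a full BibTeX parser, since real entries may span several lines per field or nest braces more deeply than this pattern tolerates.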
Data Extraction: FOAF
Every member of our group has their own FOAF file
  http://www.morelab.deusto.es/resource/member-alias
Every publication is linked to its authors' URIs
  <http://www.morelab.deusto.es/resource/imhotep-an-approach-to-user-and-device-conscious-mobile-applications> dc:creator <http://www.morelab.deusto.es/resource/dipina>
This is done automatically by looking for the authors' nicknames

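One way to picture the nickname lookup is a table of known name variants per member alias, compared after normalizing accents, case, and punctuation. The sketch below is an assumption about how such matching could work, not a description of the extension. The `NICKNAMES` table, its variants, and the function names are hypothetical, and only the `dipina` alias comes from the slides.

```python
import unicodedata

BASE = "http://www.morelab.deusto.es/resource/"

# Hypothetical table: member alias -> name variants seen in publication lists.
NICKNAMES = {
    "dipina": ["Diego López-de-Ipiña", "Diego López de Ipiña"],
}


def normalize(name):
    """Strip accents, lower-case, and unify hyphens, dots, and spacing."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().replace("-", " ").replace(".", " ").split())


def author_uri(name, nicknames=NICKNAMES):
    """Return the member's resource URI, or None for authors outside the group."""
    target = normalize(name)
    for alias, variants in nicknames.items():
        if target in {normalize(variant) for variant in variants}:
            return BASE + alias
    return None


print(author_uri("Diego Lopez-de-Ipina"))
print(author_uri("Ann Author"))
```

Authors who resolve to `None` would keep a plain literal as their `dc:creator` value, while group members get a link to their FOAF resource.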
Data Extraction: Flowchart
[Figure: data extraction flowchart]

System Architecture: Overview
[Figure: system architecture overview]

System Architecture: Joseki + SDB
Joseki
  A SPARQL server for Jena
  Stores data in RDF files and relational databases
  It allows SPARQL Updates
  It is private to our system
SDB
  A component of Jena
  It provides:
    Scalable storage
    Querying of RDF datasets using conventional SQL databases

System Architecture: Pubby
Pubby adds Linked Data interfaces to SPARQL endpoints
It supports content negotiation among these formats:
  HTML
  RDF/XML
  N3

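Content negotiation means that the same resource URI is served in whichever of these formats the client's `Accept` header prefers. The sketch below illustrates the selection logic in Python. It is a simplified model, not Pubby's implementation. The `negotiate` function is hypothetical, the media types chosen for the three formats are assumptions, and wildcards such as `*/*` simply fall back to the default.

```python
# Assumed media types for the three formats listed above.
FORMATS = {
    "text/html": "HTML",
    "application/rdf+xml": "RDF/XML",
    "text/n3": "N3",
}


def negotiate(accept_header, default="text/html"):
    """Pick the supported media type with the highest q-value in an Accept header."""
    candidates = []
    for position, item in enumerate(accept_header.split(",")):
        media, *params = [part.strip() for part in item.split(";")]
        quality = 1.0  # a missing q parameter means full preference
        for param in params:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if media in FORMATS and quality > 0:
            # Sort by quality first, then by order of appearance.
            candidates.append((-quality, position, media))
    return min(candidates)[2] if candidates else default


print(negotiate("application/rdf+xml;q=0.9, text/html;q=0.5"))
print(negotiate("text/n3"))
print(negotiate("*/*"))
```

A browser that sends `text/html` first would therefore get the HTML page, while an RDF client asking for `application/rdf+xml` would get the machine-readable description of the same resource.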
System Architecture: Snorql
An AJAXy front-end for exploring RDF SPARQL endpoints
More usable than Joseki
It is MoreLab's public SPARQL endpoint

Admin Overview
Dataset Creation: [Figure]
Ontology Prefix Definition: [Figure]
Regex Definition: [Figure]

User Overview
[Figure: user view]

Conclusions
This solution integrates our data into the Web of Data easily
It provides a reusable solution
It opens the door to more extensible solutions

Future Work
Link our datasets with more external datasets
  DBpedia
  GeoNames
RDF and SPARQL search form
Externalize linked data sources
Build the extension modularly
