Towards the Integration of a Research Group
      Website into the Web of Data

 Mikel Emaldi, David Buján, Diego López de Ipiña
        {m.emaldi, dbujan, dipina}@deusto.es

        Deusto Institute of Technology - DeustoTech




                    November 2011
Motivation              Our Solution              Linked Data Extension    Conclusions           Future Work




       1 Motivation

       2 Our Solution
               First Approach
               Solution Overview
               Data Extraction
               System Architecture

       3 Linked Data Extension

       4 Conclusions

       5 Future Work


Mikel Emaldi, David Buján, Diego López de Ipiña                                          DeustoTech - Internet
Motivation


              We want to offer the data of our research group website
              (http://www.morelab.deusto.es) as Linked Data
                      Our website runs on the Joomla! CMS
                              The data is unstructured
              We chose our publications section as a first attempt
                      Almost 100 publications
                      Possibility of linking them to external datasets
                      We saw the opportunity to centralize the group's FOAF files




First Approach



                 A solution based on a Python web script (mod_python)
                 The core code of Joomla! had to be modified
                 This caused a major problem:
                      Whenever a security update was installed, Joomla! overwrote
                      our custom code




Solution Overview


Joomla! Extension




              A solution based on an Extension for Joomla!
                      Component
                      Plugin
                      It offers a feasible way to analyze published publications
                      and to generate the corresponding Linked Data




Data Extraction


Joomla! Content Example


              TALISMAN+: Intelligent System for Follow-Up and
              Promotion of Personal Autonomy
              David Ausín, Diego López-de-Ipiña, José Bravo, Miguel Ángel Valero, Francisco Flórez. TALISMAN+:
              Intelligent System for Follow-Up and Promotion of Personal Autonomy. III International Workshop on
              Ambient Assisted Living - IWAAL 2011. Málaga, Spain. June 2011.

              The TALISMAN+ project, financed by the Spanish Ministry of Science and Innovation, aims to research
              and demonstrate innovative solutions transferable to society which offer services and products based on
              information and communication technologies in order to promote personal autonomy in prevention and
              monitoring scenarios. It will solve critical interoperability problems among systems and emerging
              technologies in a context where heterogeneity brings about accessibility barriers not yet overcome and
              demanded by the scientific, technological or social-health settings.

              Download




Overview




              Data is extracted in three ways:
                      User-defined regular expression
                      DBLP SPARQL endpoint
                      Google Scholar search engine




Regex I


              The user defines a regular expression to parse the content
              The user has to declare the ontologies used and their prefixes in the
              admin control panel
              The regex tags are easy to read
                      The ontology properties to be mapped are enclosed in {}
                      Every delimiter (also the {}) is identified by a 
                      The term {dummy} can be used to ignore content




Regex II

              {dc:creator,sep(,)}. {dc:title}.
              {swrc:series}. {swrc:location}.
              {dc:date}. {bibo:abstract} Download$
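As a rough illustration of how such a template could be applied, the Python sketch below turns each {...} tag into a capture group and matches the result against a shortened version of the example entry. It is not the extension's actual code: the function names, the treatment of sep(...) as a list separator, the treatment of a trailing $ as an end-of-text anchor, and the shortened abstract are all assumptions made for this example.

```python
import re

# Assumed tag grammar: {prefix:property} or {prefix:property,sep(X)}
TAG = re.compile(r"\{\s*([\w:]+)\s*(?:,\s*sep\((.*?)\))?\s*\}")


def compile_template(template):
    """Convert a tagged template into a compiled regex plus its field list."""
    anchored = template.endswith("$")  # assumption: trailing $ means end of text
    if anchored:
        template = template[:-1]
    pattern, fields, pos = "", [], 0
    for tag in TAG.finditer(template):
        pattern += re.escape(template[pos:tag.start()])  # literal delimiter
        pattern += "(.+?)"                               # value of this tag
        fields.append((tag.group(1), tag.group(2)))
        pos = tag.end()
    pattern += re.escape(template[pos:])
    return re.compile(pattern + ("$" if anchored else ""), re.DOTALL), fields


def extract(template, text):
    """Return {property: value} for one entry, or None if it does not match."""
    regex, fields = compile_template(template)
    match = regex.match(text)
    if match is None:
        return None
    record = {}
    for (name, sep), value in zip(fields, match.groups()):
        if name == "dummy":  # {dummy} content is ignored
            continue
        if sep:
            record[name] = [item.strip() for item in value.split(sep)]
        else:
            record[name] = value.strip()
    return record


TEMPLATE = (
    "{dc:creator,sep(,)}. {dc:title}. {swrc:series}. {swrc:location}. "
    "{dc:date}. {bibo:abstract} Download$"
)

# Example entry from the slides, with the abstract shortened for brevity.
SAMPLE = (
    "David Ausín, Diego López-de-Ipiña, José Bravo, Miguel Ángel Valero, "
    "Francisco Flórez. TALISMAN+: Intelligent System for Follow-Up and "
    "Promotion of Personal Autonomy. III International Workshop on Ambient "
    "Assisted Living - IWAAL 2011. Málaga, Spain. June 2011. "
    "The TALISMAN+ project aims to promote personal autonomy. Download"
)

record = extract(TEMPLATE, SAMPLE)
```

Lazy groups keep each field as short as possible, so a value that itself contains the delimiter (for example a title with a full stop followed by a space) would be split in the wrong place; a real implementation would need a stricter grammar for such cases.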




DBLP I



              Digital Bibliography & Library Project
              > 1.3 million articles
              SPARQL endpoint at:
                      http://dblp.l3s.de/d2r/sparql/
                      http://dblp.l3s.de/d2r/snorql/




DBLP II


              The DBLP SPARQL endpoint is used to look up data about
              publications
                      SELECT DISTINCT ?uri ?p ?o WHERE {?uri dc:title
                      "title-of-article"^^<http://www.w3.org/2001/XMLSchema#string>}

              This data is enriched with our own data and saved into the RDF
              store
              We also link members' FOAF profiles to DBLP author data
                      <http://www.morelab.deusto.es/resource/dipina> owl:sameAs
                      <http://dblp.l3s.de/d2r/resource/authors/Diego_López-de-Ipiña> ;
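The lookup by title can be sketched in Python as follows. This only builds the query text and the GET URL defined by the SPARQL protocol; it does not contact the server, which may no longer be online. The helper names, the explicit dc prefix declaration and the extra `?uri ?p ?o` pattern (added so that ?p and ?o are actually bound) are assumptions for this example, not part of the original slides.

```python
from urllib.parse import urlencode

# Endpoint URL as given on the DBLP I slide.
DBLP_ENDPOINT = "http://dblp.l3s.de/d2r/sparql/"


def build_title_query(title):
    """Build a SPARQL query that finds a publication by its exact title."""
    # Escape backslashes and double quotes so the title is a valid literal.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
        "SELECT DISTINCT ?uri ?p ?o WHERE {\n"
        f'  ?uri dc:title "{escaped}"'
        "^^<http://www.w3.org/2001/XMLSchema#string> .\n"
        "  ?uri ?p ?o .\n"  # assumption: also fetch every property of the match
        "}"
    )


def build_request_url(title, endpoint=DBLP_ENDPOINT):
    """Return the GET URL that carries the query in the 'query' parameter."""
    return endpoint.rstrip("/") + "?" + urlencode({"query": build_title_query(title)})


query = build_title_query("title-of-article")
url = build_request_url("title-of-article")
```

The resulting URL could then be fetched with any HTTP client, with the result format chosen through the Accept header.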




Google Scholar I


              A simple way to broadly search for scholarly literature
              http://scholar.google.com
              It exports data in different formats
                      BibTeX
                      EndNote
                      RefMan
                      RefWorks
                      WenXiangWang




Google Scholar II


              The data from GS is extracted via BibTeX scraping
                      An HTTP request using a specific cookie asks for BibTeX
                      data
                      BibTeX data is retrieved
                      The BibTeX data is mapped to RDF
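The last step, mapping BibTeX to RDF, might look roughly like the Python sketch below. It handles only simple one-line `key={value}` fields and emits a Turtle fragment without prefix declarations. The field-to-property table, the function names, the hand-written sample entry and the subject URI are hypothetical; the slides do not specify which properties the extension actually uses for BibTeX fields.

```python
import re

# Matches simple one-line fields such as:  title={Some title},
FIELD = re.compile(r'(\w+)\s*=\s*[{"](.*?)[}"]\s*,?\s*$', re.MULTILINE)

# Hypothetical mapping from BibTeX fields to ontology properties.
FIELD_TO_PROPERTY = {
    "title": "dc:title",
    "booktitle": "swrc:series",
    "year": "dc:date",
}


def parse_bibtex_fields(entry):
    """Return the fields of a single BibTeX entry as a dict."""
    return {key.lower(): value.strip() for key, value in FIELD.findall(entry)}


def literal(value):
    """Quote a string as a Turtle literal."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def bibtex_to_turtle(entry, subject_uri):
    """Map one BibTeX entry to a Turtle statement about subject_uri."""
    fields = parse_bibtex_fields(entry)
    # BibTeX separates authors with the word "and".
    pairs = [
        ("dc:creator", author.strip())
        for author in fields.get("author", "").split(" and ")
        if author.strip()
    ]
    pairs += [
        (prop, fields[key]) for key, prop in FIELD_TO_PROPERTY.items() if key in fields
    ]
    body = " ;\n".join(f"    {prop} {literal(value)}" for prop, value in pairs)
    return f"<{subject_uri}>\n{body} ."


# Hand-written sample entry, not actual Google Scholar output.
ENTRY = """@inproceedings{ausin2011talisman,
  title={TALISMAN+: Intelligent System for Follow-Up and Promotion of Personal Autonomy},
  author={Ausín, David and López-de-Ipiña, Diego},
  booktitle={III International Workshop on Ambient Assisted Living - IWAAL 2011},
  year={2011}
}"""

turtle = bibtex_to_turtle(
    ENTRY, "http://www.morelab.deusto.es/resource/example-publication"
)
```

Real BibTeX allows nested braces and multi-line values, so a production mapping would be better served by a dedicated BibTeX parser than by a regular expression.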




FOAF

              Every member of our group has their own FOAF file
                      http://www.morelab.deusto.es/resource/member-alias
              Every publication is linked to its authors' URIs
                      <http://www.morelab.deusto.es/resource/imhotep-an-approach-to-user-and-device-conscious-mobile-applications>
                      dc:creator
                      <http://www.morelab.deusto.es/resource/dipina>

                      This is done automatically by looking for the authors' nicknames
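One way such automatic linking could work is sketched below in Python. The slides do not describe the matching procedure, so the nickname table, the name normalization and the function names are all assumptions; only the base URI, the dipina alias and the publication slug come from the slides.

```python
import re

BASE = "http://www.morelab.deusto.es/resource/"

# Hypothetical lookup table: member alias -> name variants seen in citations.
NICKNAMES = {
    "dipina": ["Diego López-de-Ipiña", "D. López-de-Ipiña"],
}


def normalize(name):
    """Lower-case a name and collapse punctuation so variants compare equal."""
    return re.sub(r"[\W_]+", " ", name).casefold().strip()


def creator_triples(publication_slug, authors, nicknames=NICKNAMES):
    """Emit one dc:creator triple per author that matches a group member."""
    index = {
        normalize(variant): alias
        for alias, variants in nicknames.items()
        for variant in variants
    }
    triples = []
    for author in authors:
        alias = index.get(normalize(author))
        if alias is not None:  # authors outside the group are skipped
            triples.append(
                f"<{BASE}{publication_slug}> dc:creator <{BASE}{alias}> ."
            )
    return triples


triples = creator_triples(
    "imhotep-an-approach-to-user-and-device-conscious-mobile-applications",
    ["David Ausín", "Diego López de Ipiña"],
)
```

Because hyphens and spaces are normalized away, "Diego López de Ipiña" and "Diego López-de-Ipiña" resolve to the same member URI.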




Flowchart




System Architecture


Overview




Joseki + SDB


              Joseki
                      A SPARQL server for Jena
                      Stores data in RDF files and relational databases
                      It supports SPARQL Update
                      It is private to our system
              SDB
                      A component of Jena
                      It provides:
                              Scalable storage
                              Querying of RDF datasets using conventional SQL databases



Pubby
              Pubby adds Linked Data interfaces to SPARQL endpoints
              It supports content negotiation between these formats:
                      HTML
                      RDF/XML
                      N3
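From a client's point of view, content negotiation means asking for the same resource URI with a different Accept header. The Python sketch below only constructs such a request and does not send it. The media type table and the helper name are assumptions for this example; in particular, a given Pubby deployment may expect `text/rdf+n3` rather than `text/n3` for N3.

```python
from urllib.request import Request

# Assumed media types for the three formats listed above.
MEDIA_TYPES = {
    "html": "text/html",
    "rdfxml": "application/rdf+xml",
    "n3": "text/n3",
}


def negotiated_request(resource_uri, fmt):
    """Build an HTTP request that asks for resource_uri in the given format."""
    return Request(resource_uri, headers={"Accept": MEDIA_TYPES[fmt]})


# Ask for the RDF/XML description of a group member's resource.
request = negotiated_request(
    "http://www.morelab.deusto.es/resource/dipina", "rdfxml"
)
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch.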




Snorql

              An AJAXy front-end for exploring RDF SPARQL endpoints
              More usable than Joseki
              It is MoreLab’s public SPARQL endpoint




Admin Overview


              Dataset Creation:




              Ontology Prefix Definition:




              Regex Definition:




User Overview




Conclusions




              This solution easily integrates our data into the Web of Data
              It provides a reusable solution
              It opens the door to more extensible solutions




Future Work



              Link our datasets with more external datasets
                      DBpedia
                      GeoNames
              RDF and SPARQL search form
              Externalize linked data sources
                      Building the Extension modularly




Mikel Emaldi, David Buj´n, Diego L´pez de Ipi˜a
                       a          o          n                                           DeustoTech - Internet
Towards the Integration of a Research Group Website into the Web of Data
Towards the Integration of a Research Group
      Website into the Web of Data

 Mikel Emaldi David Buj´n Diego L´pez de Ipi˜a
                         a            o      n
        {m.emaldi, dbujan, dipina}@deusto.es

        Deusto Institute of Technology - DeustoTech




                    November 2011

More Related Content

Similar to Towards the Integration of Research Group Website into the Web of Data

Research paper on big data and hadoop
Research paper on big data and hadoopResearch paper on big data and hadoop
Research paper on big data and hadoop
Shree M.L.Kakadiya MCA mahila college, Amreli
 
Department of Commerce App Challenge: Big Data Dashboards
Department of Commerce App Challenge: Big Data DashboardsDepartment of Commerce App Challenge: Big Data Dashboards
Department of Commerce App Challenge: Big Data Dashboards
Brand Niemann
 
Toward a System Building Agenda for Data Integration(and Dat.docx
Toward a System Building Agenda for Data Integration(and Dat.docxToward a System Building Agenda for Data Integration(and Dat.docx
Toward a System Building Agenda for Data Integration(and Dat.docx
juliennehar
 
FAIR data_ Superior data visibility and reuse without warehousing.pdf
FAIR data_ Superior data visibility and reuse without warehousing.pdfFAIR data_ Superior data visibility and reuse without warehousing.pdf
FAIR data_ Superior data visibility and reuse without warehousing.pdf
Alan Morrison
 
Research Methodology (how to choose Datasets ).pptx
Research Methodology (how to choose Datasets ).pptxResearch Methodology (how to choose Datasets ).pptx
Research Methodology (how to choose Datasets ).pptx
Zainab Alhassani
 
What Is Mike2.0
What Is Mike2.0What Is Mike2.0
What Is Mike2.0
sean.mcclowry
 
Big data
Big dataBig data
Big data
Nimish Kochhar
 
Big data
Big dataBig data
Big data
Nimish Kochhar
 
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Inside Analysis
 
Mike2.0 Methodology Overview
Mike2.0 Methodology OverviewMike2.0 Methodology Overview
Mike2.0 Methodology Overview
sean.mcclowry
 
How To Connect To Your Customers, Partners Securely, Privately and Effectively
How To Connect To Your Customers, Partners Securely, Privately and EffectivelyHow To Connect To Your Customers, Partners Securely, Privately and Effectively
How To Connect To Your Customers, Partners Securely, Privately and Effectively
Andy Harjanto
 
Business Intelligence for normal people
Business Intelligence for normal peopleBusiness Intelligence for normal people
Business Intelligence for normal people
mark madsen
 
Kbk group presentation
Kbk group presentationKbk group presentation
Kbk group presentation
AlexeyKudashov
 
Final Year Project Guidance
Final Year Project GuidanceFinal Year Project Guidance
Final Year Project Guidance
Varad Meru
 
IRJET- Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop FrameworkIRJET-  	  Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET- Youtube Data Sensitivity and Analysis using Hadoop Framework
IRJET Journal
 
Forecast 2012 Panel: Big Data in the Cloud Das Kamhout
Forecast 2012 Panel: Big Data in the Cloud Das KamhoutForecast 2012 Panel: Big Data in the Cloud Das Kamhout
Forecast 2012 Panel: Big Data in the Cloud Das Kamhout
Open Data Center Alliance
 
Good enough
Good enoughGood enough
Good enough
Bethany Davis
 
GeoNode Motivation, Design, and Challenges
GeoNode Motivation, Design, and ChallengesGeoNode Motivation, Design, and Challenges
GeoNode Motivation, Design, and Challenges
Sebastian Benthall
 
Big data and its impact on SOA
Big data and its impact on SOABig data and its impact on SOA
Big data and its impact on SOA
Demed L'Her
 
Response needed 1The paper is well placed on the issues of the.docx
Response needed 1The paper is well placed on the issues of the.docxResponse needed 1The paper is well placed on the issues of the.docx
Response needed 1The paper is well placed on the issues of the.docx
audeleypearl
 

Towards the Integration of Research Group Website into the Web of Data

  • 1. Towards the Integration of a Research Group Website into the Web of Data. Mikel Emaldi, David Buján, Diego López de Ipiña, {m.emaldi, dbujan, dipina}@deusto.es. Deusto Institute of Technology - DeustoTech. November 2011
  • 2. Motivation · Our Solution · Linked Data Extension · Conclusions · Future Work. 1 Motivation. 2 Our Solution: First Approach, Solution Overview, Data Extraction, System Architecture. 3 Linked Data Extension. 4 Conclusions. 5 Future Work. Mikel Emaldi, David Buján, Diego López de Ipiña, DeustoTech - Internet. Towards the Integration of a Research Group Website into the Web of Data
  • 3. Table of Contents: 1 Motivation; 2 Our Solution (First Approach, Solution Overview, Data Extraction, System Architecture); 3 Linked Data Extension; 4 Conclusions; 5 Future Work
  • 4-5. Motivation: We want to offer the data of our research group website (http://www.morelab.deusto.es) as Linked Data. The site runs on the Joomla! CMS, and its data is unstructured. We chose our publications section as a first attempt: it holds almost 100 publications, which can be linked to external datasets. We also saw the opportunity to centralize the group's FOAF files.
  • 7. First Approach: A solution based on a Python web script (mod_python). The core code of Joomla! had to be modified. This caused a major problem: whenever a security update was installed, Joomla! destroyed our custom code.
  • 8-11. Solution Overview: Joomla! Extension. A solution based on an extension for Joomla!, shown on the slides as a Component and a Plugin. It offers a feasible way to analyze the published publications and to generate the corresponding Linked Data.
  • 12. Data Extraction: Joomla! Content Example. TALISMAN+: Intelligent System for Follow-Up and Promotion of Personal Autonomy. David Ausín, Diego López-de-Ipiña, José Bravo, Miguel Ángel Valero, Francisco Flórez. TALISMAN+: Intelligent System for Follow-Up and Promotion of Personal Autonomy. III International Workshop on Ambient Assisted Living - IWAAL 2011. Málaga, Spain. June 2011. The TALISMAN+ project, financed by the Spanish Ministry of Science and Innovation, aims to research and demonstrate innovative solutions transferable to society which offer services and products based on information and communication technologies in order to promote personal autonomy in prevention and monitoring scenarios. It will solve critical interoperability problems among systems and emerging technologies in a context where heterogeneity brings about accessibility barriers not yet overcome and demanded by the scientific, technological or social-health settings. Download
  • 13-16. Data Extraction: Overview. Data is extracted in three ways: a user-defined regular expression, the DBLP SPARQL endpoint, and the Google Scholar search engine.
  • 17. Data Extraction: Regex I. The user defines a regular expression to parse the content. The user has to define the ontologies used and their prefixes in the admin control panel. The regex tags are clearly understandable. The ontology properties to be mapped are tagged between {}. Every delimiter (also the {}) is identified by a [...]. The term {dummy} can be used to ignore content.
  • 18-19. Data Extraction: Regex II. Content: David Ausín, Diego López-de-Ipiña, José Bravo, Miguel Ángel Valero, Francisco Flórez. TALISMAN+: Intelligent System for Follow-Up and Promotion of Personal Autonomy. III International Workshop on Ambient Assisted Living - IWAAL 2011. Málaga, Spain. June 2011. The TALISMAN+ project, financed by the Spanish Ministry of Science and Innovation, aims to research and demonstrate innovative solutions transferable to society which offer services and products based on information and communication technologies in order to promote personal autonomy in prevention and monitoring scenarios. It will solve critical interoperability problems among systems and emerging technologies in a context where heterogeneity brings about accessibility barriers not yet overcome and demanded by the scientific, technological or social-health settings. Download. Template: {dc:creator, sep(,)}. {dc:title}. {swrc:series}. {swrc:location}. {dc:date}. {bibo:abstract} Download$
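The tagged template on the Regex slides can be read as a pattern in which each `{prefix:property}` tag captures one field. The following is a minimal Python sketch of that idea, not the extension's actual code (the slides do not show it). It assumes that literal text between tags is matched verbatim, that a trailing `$` anchors the end of the content, and that `sep(x)` splits a captured value on `x`; the function names and the shortened sample text are hypothetical.

```python
import re

# A tag is {prefix:property}, optionally with ", sep(X)"; {dummy} captures nothing.
TAG = re.compile(r"\{\s*([^{},\s]+)\s*(?:,\s*sep\((.*?)\))?\s*\}")


def compile_template(template):
    """Turn a tagged template into (compiled regex, [(property, separator), ...])."""
    anchored = template.endswith("$")
    if anchored:
        template = template[:-1]
    pattern, fields, pos = "", [], 0
    for m in TAG.finditer(template):
        # Literal text between tags is escaped so it is matched verbatim.
        pattern += re.escape(template[pos:m.start()])
        prop, sep = m.group(1), m.group(2)
        if prop == "dummy":
            pattern += r".+?"  # matched but ignored
        else:
            pattern += rf"(?P<f{len(fields)}>.+?)"  # lazy capture for this property
            fields.append((prop, sep))
        pos = m.end()
    pattern += re.escape(template[pos:])
    if anchored:
        pattern += r"\s*$"
    return re.compile(pattern, re.DOTALL), fields


def extract(template, text):
    """Return {property: value or list of values}, or None if the text does not match."""
    regex, fields = compile_template(template)
    m = regex.match(text)
    if m is None:
        return None
    result = {}
    for i, (prop, sep) in enumerate(fields):
        value = m.group(f"f{i}").strip()
        result[prop] = [v.strip() for v in value.split(sep)] if sep else value
    return result


TEMPLATE = ("{dc:creator, sep(,)}. {dc:title}. {swrc:series}. "
            "{swrc:location}. {dc:date}. {bibo:abstract} Download$")
TEXT = ("David Ausín, Diego López-de-Ipiña. TALISMAN+: Intelligent System. "
        "IWAAL 2011. Málaga, Spain. June 2011. "
        "The project promotes personal autonomy. It solves interoperability problems. Download")

record = extract(TEMPLATE, TEXT)
```

Lazy captures keep each field as short as possible, so the first five fields stop at the first following ". " delimiter, while the end anchor lets the abstract span several sentences.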
  • 20. Data Extraction: DBLP I. Digital Bibliography & Library Project. More than 1.3 million articles. SPARQL endpoint at: http://dblp.l3s.de/d2r/sparql/ and http://dblp.l3s.de/d2r/snorql/
  • 21. Data Extraction: DBLP II. The DBLP SPARQL endpoint is used to search for data about publications: SELECT DISTINCT ?uri ?p ?o WHERE {?uri dc:title "title-of-article"^^<http://www.w3.org/2001/XMLSchema#string>} The data is enriched with our own data and saved into the RDF store. We also link members' FOAF files to DBLP author data: <http://www.morelab.deusto.es/resource/dipina> owl:sameAs <http://dblp.l3s.de/d2r/resource/authors/Diego_López-de-Ipiña> ;
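As a rough illustration of the DBLP lookup, the sketch below builds the query text, the GET request URL that the standard SPARQL protocol would use, and an `owl:sameAs` link. It sends nothing over the network. The added `?uri ?p ?o` triple pattern is an assumption (the query on the slide leaves `?p` and `?o` unbound), the function names are hypothetical, and the endpoint named on the slides may no longer be online.

```python
from urllib.parse import urlencode

DBLP_ENDPOINT = "http://dblp.l3s.de/d2r/sparql"  # endpoint from the slides
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def title_query(title):
    """Build a SPARQL query that looks a publication up by its exact title."""
    # Escape backslashes and double quotes so the title is a valid string literal.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
        "SELECT DISTINCT ?uri ?p ?o WHERE {\n"
        f'  ?uri dc:title "{escaped}"^^<{XSD_STRING}> .\n'
        "  ?uri ?p ?o .\n"  # assumption: also fetch every property of the match
        "}"
    )


def request_url(title):
    """URL for a SPARQL-protocol GET request carrying the query."""
    return DBLP_ENDPOINT + "?" + urlencode({"query": title_query(title)})


def same_as(member_uri, dblp_uri):
    """One N-Triples line linking a member's URI to a DBLP author URI."""
    return f"<{member_uri}> <{OWL_SAME_AS}> <{dblp_uri}> ."


query = title_query('A "quoted" title')
url = request_url("title-of-article")
link = same_as("http://www.morelab.deusto.es/resource/dipina",
               "http://example.org/dblp/author")  # placeholder author URI
```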
  • 22. Data Extraction: Google Scholar I. A simple way to broadly search for scholarly literature: http://scholar.google.com. It exports data in different formats: BibTeX, EndNote, RefMan, RefWorks, WenXianWang.
  • 23-26. Data Extraction: Google Scholar II. The data from Google Scholar is extracted via BibTeX scraping, in three steps: an HTTP request using a specific cookie asks for BibTeX data; the BibTeX data is retrieved; the BibTeX data is mapped to RDF.
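The last step, mapping BibTeX to RDF, could look roughly like the sketch below. It covers only the mapping, not the retrieval from Google Scholar. The field-to-property table is an assumption chosen to match the vocabularies that appear in the regex template (dc, swrc). The parser handles only simple one-line `key={value}` fields, and the sample entry and subject URI are made up.

```python
import re

# One "key = {value}" or "key = "value"" field per line; greedy so nested braces survive.
FIELD = re.compile(r'^[ \t]*(\w+)[ \t]*=[ \t]*[{"](.*)[}"],?[ \t]*$', re.MULTILINE)

# Assumed mapping from BibTeX fields to ontology properties.
MAPPING = {
    "title": "dc:title",
    "author": "dc:creator",
    "year": "dc:date",
    "booktitle": "swrc:series",
}


def bibtex_to_triples(bibtex, subject):
    """Return (subject, property, value) tuples for the mapped fields of one entry."""
    triples = []
    for key, value in FIELD.findall(bibtex):
        prop = MAPPING.get(key.lower())
        if prop is None:
            continue  # unmapped fields are skipped
        # BibTeX separates multiple authors with " and ".
        values = value.split(" and ") if key.lower() == "author" else [value]
        for v in values:
            triples.append((subject, prop, v.strip()))
    return triples


ENTRY = """@inproceedings{example2011,
  title={An Example Publication},
  author={Doe, Jane and Roe, Richard},
  booktitle={Example Workshop},
  pages={1--8},
  year={2011}
}"""

SUBJECT = "http://example.org/resource/an-example-publication"
triples = bibtex_to_triples(ENTRY, SUBJECT)
```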
  • 27. Data Extraction: FOAF. Every member of our group has their own FOAF file: http://www.morelab.deusto.es/resource/member-alias. Every publication is linked to its authors' URIs: <http://www.morelab.deusto.es/resource/imhotep-an-approach-to-user-and-device-conscious-mobile-applications> dc:creator <http://www.morelab.deusto.es/resource/dipina> This is done automatically by looking for the authors' nicknames.
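One way to picture the automatic author linking is a lookup from name variants to member aliases, as in the sketch below. The slides do not describe the matching rule, so the alias table, the accent-insensitive normalization, and the function names are all assumptions made for illustration.

```python
import unicodedata

BASE = "http://www.morelab.deusto.es/resource/"

# Hypothetical alias table: member alias -> name variants seen in publications.
MEMBERS = {
    "dipina": ["Diego López-de-Ipiña", "Diego López de Ipiña", "D. López-de-Ipiña"],
}


def normalize(name):
    """Lowercase, strip accents, and treat hyphens and dots as spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(unaccented.lower().replace("-", " ").replace(".", " ").split())


def creator_triples(publication_slug, authors):
    """Emit one dc:creator statement per author who matches a group member."""
    index = {normalize(v): alias for alias, variants in MEMBERS.items() for v in variants}
    subject = f"<{BASE}{publication_slug}>"
    triples = []
    for author in authors:
        alias = index.get(normalize(author))
        if alias:  # authors outside the group are left unlinked
            triples.append(f"{subject} dc:creator <{BASE}{alias}> .")
    return triples


links = creator_triples(
    "imhotep-an-approach-to-user-and-device-conscious-mobile-applications",
    ["Diego Lopez-de-Ipina", "Someone Else"],
)
```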
  • 28. Data Extraction: Flowchart
  • 29-34. System Architecture: Overview
  • 35-36. System Architecture: Joseki + SDB. Joseki: a SPARQL server for Jena; it stores data in RDF files and relational databases, allows SPARQL Updates, and is private to our system. SDB: a component of Jena that provides scalable storage and querying of RDF datasets using conventional SQL databases.
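Since the store accepts SPARQL Updates, saving extracted triples might amount to sending an `INSERT DATA` request. The sketch below only builds the update text, using the SPARQL 1.1 Update form. The helper names and the sample URIs are hypothetical, and the exact update syntax the deployed Joseki version accepted is not stated on the slides.

```python
def literal(value):
    """Format a plain string as an escaped SPARQL literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def insert_data(triples, graph=None):
    """Build an INSERT DATA request from (subject URI, predicate URI, object term) tuples.

    The object term must already be formatted, e.g. with literal() or as <uri>.
    """
    body = "\n".join(f"  <{s}> <{p}> {o} ." for s, p, o in triples)
    if graph:
        body = f"  GRAPH <{graph}> {{\n{body}\n  }}"
    return f"INSERT DATA {{\n{body}\n}}"


update = insert_data([
    ("http://example.org/resource/pub-1",
     "http://purl.org/dc/elements/1.1/title",
     literal('A "sample" title')),
])
```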
  • 37. System Architecture: Pubby. Pubby adds Linked Data interfaces to SPARQL endpoints. It allows content negotiation among these formats: HTML, RDF/XML, N3.
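Content negotiation means the server picks a representation based on the client's `Accept` header. The following simplified sketch shows the idea for the three formats listed. It is not Pubby's implementation: it ignores partial wildcards such as `text/*`, and the media type chosen for N3 (`text/n3`) is an assumption.

```python
# Supported representations, in server preference order.
SUPPORTED = ["text/html", "application/rdf+xml", "text/n3"]


def negotiate(accept_header, supported=SUPPORTED):
    """Return the supported media type with the highest q-value, or None."""
    best, best_q = None, 0.0
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type, q = parts[0].lower(), 1.0  # q defaults to 1.0 when absent
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # a malformed q-value is treated as "not acceptable"
        if media_type == "*/*":
            candidates = supported
        else:
            candidates = [m for m in supported if m == media_type]
        if candidates and q > best_q:
            best, best_q = candidates[0], q
    return best
```

With this sketch, a browser that sends `text/html` gets the HTML page, while an RDF client that sends `application/rdf+xml` gets RDF/XML from the same URI.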
  • 38. System Architecture: Snorql. An AJAXy front-end for exploring RDF SPARQL endpoints. More usable than Joseki. It is MoreLab's public SPARQL endpoint.
  • 40. Admin Overview: Dataset Creation
  • 41. Admin Overview: Ontology Prefix Definition; Regex Definition
  • 42. User Overview
  • 44. Conclusions: This solution integrates our data into the Web of Data easily, provides a reusable solution, and opens the door to more extendable solutions.
  • 46. Future Work: Link our datasets with more external datasets (DBpedia, GeoNames). RDF and SPARQL search form. Externalize Linked Data sources. Build the extension modularly.
  • 47. Towards the Integration of a Research Group Website into the Web of Data. Mikel Emaldi, David Buján, Diego López de Ipiña, {m.emaldi, dbujan, dipina}@deusto.es. Deusto Institute of Technology - DeustoTech. November 2011