Query Expansion Methods and
                    Performance Evaluation
                              for
                Reusing Linking Open Data of the
              European Public Procurement Notices

                               José María Álvarez Rodríguez
                                   WESO-Universidad de Oviedo
                                   http://purl.org/weso/moldeas/
                      Tecnologías de Linked Data y sus aplicaciones en España (TLDE)
                                       CAEPIA 2011-Tenerife (Spain)
                                          8th of November, 2011

Code: TSI-020100-2010-919
Overview
 Use case & Context

SPARQL & Performance

     Next Steps
Objective




Creation of a pan-European
 e-procurement platform
E-procurement
   Long Tail
 TED
 BOE (official bulletin of the Spanish Government)
 BOPA (official bulletin of the Asturian Government)
To be able to answer…



     Which public procurement notices are
relevant to Dutch companies (only SMEs) that
  want to tender for contracts announced by
 local authorities with a total value lower than
 170K € to procure “Road bridge construction
  work” and a two year duration in the Dutch-
    speaking region of Flanders (Belgium)?
Structuring public procurement notices
[Diagram; recoverable labels: "LOD enrichment", "Transforming government classifications", "Easing the access to the published data using the LOD approach", "Providing new semantic-based services".]
Preliminary Results

Semantic-based
         Services


      Problem of
«Query Expansion»
depending on the kind of
  information variable
Methods of «Query Expansion»
Remembering…



     Which public procurement notices are
relevant to Dutch companies (only SMEs) that
  want to tender for contracts announced by
 local authorities with a total value lower than
 170K € to procure “Road bridge construction
  work” and a two year duration in the Dutch-
    speaking region of Flanders (Belgium)?
Query…
[Diagram: a public procurement notice (example values: cpv:45221111-3, NL) described through ppn:nutsCode, ppn:hasDuration, cpv:CodeIn2008, ppn:hasAmount and org:classification.]
Applying Query Expansion…
[Diagram: the same notice (example values: cpv:45221111-3, NL) after query expansion, described through ppn:nutsCode, ppn:hasDuration, cpv:CodeIn2008, ppn:hasAmount and org:classification.]
Example of SPARQL
                         query
SELECT DISTINCT * WHERE {
   ?ppn  rdf:type         <http://purl.org/weso/ppn/def#ppn> .
   ?ppn  ppn:nutsCode     ?nutsCode .
   ?ppn  cpv:codeIn2008   ?cpvCode .
   ?ppn  ppn:hasDuration  ?duration .
   ?ppn  dc:identifier    ?id .
   ?ppn  dc:date          ?date .
   ?ppn  ppn:hasAmount    ?amount .
   FILTER (?cpvCode = cpv:45221111-3 ... ) .
   # Range operators (>=, <=, &&) are assumed: they are not legible in the slide.
   FILTER (
       (xsd:double(?amount) >= xsd:long(170000)) &&
       (xsd:double(?amount) <= xsd:long(200000)) ) .
   FILTER (?nutsCode = nuts:B3 ... ) .
   FILTER (
       (xsd:long(?duration) >= xsd:long(2)) &&
       (xsd:long(?duration) <= xsd:long(3)) ) .
}
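The two numeric filters in the query above can be generated rather than written by hand. The following is a minimal Python sketch, assuming the filters express closed ranges (the comparison operators are not legible in the slide); the helper name `range_filter` and its parameters are illustrative and do not come from the talk.

```python
def range_filter(variable: str, low: int, high: int, datatype: str = "xsd:long") -> str:
    """Return a SPARQL FILTER that keeps values of ?variable within [low, high].

    Assumption: the expanded query bounds the value by a closed range.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    cast = f"{datatype}(?{variable})"
    return f"FILTER ({cast} >= {low} && {cast} <= {high}) ."


# Values taken from the example query: amount 170,000 to 200,000, duration 2 to 3.
amount_filter = range_filter("amount", 170000, 200000, datatype="xsd:double")
duration_filter = range_filter("duration", 2, 3)

print(amount_filter)
print(duration_filter)
```

Keeping the bounds as parameters makes it easy to widen or narrow the expansion without touching the rest of the query.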
Context

Performance of SPARQL
       Queries

     ~30 sec.
Hardware & Software
Dell PC, 2 GB RAM and 30 GB hard disk
      VirtualBox (version 4.0.6)

Linux 2.6.35-22-server #33-Ubuntu 2 SMP
           x86_64 GNU/Linux
              Ubuntu 10.10

   OpenLink Virtuoso Opensource-6-20110218
Question?
How can query execution time be
 decreased without modifying the
hardware or using any
 vendor-specific feature?
TripleStore
    25 graphs
20 M RDF triples
       But…
     8 graphs
11 M RDF triples
Focus on…
The generation of SPARQL
         queries
Let’s start…


9 SPARQL Queries

  3 executions
[Table of test configurations, 15 rows; the column headers appear to include LIMIT, FILTER and GRAPHS; cell values were lost in extraction.]
Simple SPARQL query

SELECT DISTINCT * WHERE {
   ?ppn    rdf:type         <http://purl.org/weso/ppn/def#ppn> .
   ?ppn    ppn:nutsCode     ?nutsCode .
   ?ppn    cpv:codeIn2008   ?cpvCode .
   ?ppn    ppn:hasDuration  ?duration .
   ?ppn    dc:identifier    ?id .
   ?ppn    dc:date          ?date .
   ?ppn    ppn:hasAmount    ?amount .
   FILTER (?cpvCode = cpv:15331137) .
   FILTER (?nutsCode = nuts:UK) .
}
Simple Query

    1 CPV Code
   1 NUTS Code


Time: ~3.29 sec.
T1

Rewrite SPARQL queries:
Match triples from specific to
           general
  Filter as soon as possible
T2

Use the LIMIT clause

 Value set to 10,000
Rewrite SPARQL query

SELECT DISTINCT * WHERE {
   ?ppn    rdf:type         <http://purl.org/weso/ppn/def#ppn> .
   ?ppn    cpv:codeIn2008   ?cpvCode .
   FILTER (?cpvCode = cpv:15331137) .
   ?ppn    ppn:nutsCode     ?nutsCode .
   FILTER (?nutsCode = nuts:UK) .
   ?ppn    ppn:hasDuration  ?duration .
   ?ppn    dc:identifier    ?id .
   ?ppn    dc:date          ?date .
   ?ppn    ppn:hasAmount    ?amount .
}
LIMIT 10000
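Since the talk focuses on the generation of SPARQL queries, the rewritten form above (filters placed right after the pattern that binds their variable, plus a LIMIT) can be produced by a small generator. This is a sketch under stated assumptions: the function name `build_simple_query` is hypothetical, and PREFIX declarations are omitted just as they are in the slides.

```python
PPN_TYPE = "<http://purl.org/weso/ppn/def#ppn>"


def build_simple_query(cpv_code: str, nuts_code: str, limit: int = 10000) -> str:
    """Build a query for one CPV code and one NUTS code.

    Each FILTER follows the triple pattern that binds its variable
    ("filter as soon as possible"), and the result set is capped with LIMIT.
    """
    lines = [
        "SELECT DISTINCT * WHERE {",
        f"  ?ppn rdf:type {PPN_TYPE} .",
        "  ?ppn cpv:codeIn2008 ?cpvCode .",
        f"  FILTER (?cpvCode = {cpv_code}) .",
        "  ?ppn ppn:nutsCode ?nutsCode .",
        f"  FILTER (?nutsCode = {nuts_code}) .",
        "  ?ppn ppn:hasDuration ?duration .",
        "  ?ppn dc:identifier ?id .",
        "  ?ppn dc:date ?date .",
        "  ?ppn ppn:hasAmount ?amount .",
        "}",
        f"LIMIT {limit}",
    ]
    return "\n".join(lines)


simple_query = build_simple_query("cpv:15331137", "nuts:UK")
print(simple_query)
```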
Results T2

    1 CPV Code
   1 NUTS Code



Time: ~3.26 sec.
Evaluation

  There are no significant
changes in execution time
        or gain…
           and
    we are interested in
    “enhanced queries”
T3

Execution of enhanced
       queries
Enhanced SPARQL
            query
SELECT DISTINCT * WHERE {
   ?ppn    rdf:type         <http://purl.org/weso/ppn/def#ppn> .
   ?ppn    ppn:nutsCode     ?nutsCode .
   ?ppn    cpv:codeIn2008   ?cpvCode .
   ?ppn    ppn:hasDuration  ?duration .
   ?ppn    dc:identifier    ?id .
   ?ppn    dc:date          ?date .
   ?ppn    ppn:hasAmount    ?amount .
   FILTER (?cpvCode = cpv:15331137 || ?cpvCode = cpv:48611000 ||
           ?cpvCode = cpv:48611000 || ?cpvCode = cpv:50531510 ||
           ?cpvCode = cpv:15871210) .
   FILTER (?nutsCode = nuts:B3 || ?nutsCode = nuts:PL ||
           ?nutsCode = nuts:RO) .
}
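An enhanced query matches a variable against a set of expanded codes. One way to generate such a filter without relying on SPARQL 1.1 or vendor features is a plain disjunction of equality tests. The sketch below is illustrative only: `any_of_filter` is a hypothetical helper, and dropping repeated codes is a choice made here, not something the slides state.

```python
from typing import Sequence


def any_of_filter(variable: str, values: Sequence[str]) -> str:
    """Return a FILTER that accepts ?variable when it equals any of the values."""
    unique = list(dict.fromkeys(values))  # drop repeated codes, keep their order
    if not unique:
        raise ValueError("at least one value is required")
    alternatives = " || ".join(f"?{variable} = {value}" for value in unique)
    return f"FILTER ({alternatives}) ."


# Codes as listed in the enhanced query (one CPV code appears twice there).
cpv_filter = any_of_filter(
    "cpvCode",
    ["cpv:15331137", "cpv:48611000", "cpv:48611000", "cpv:50531510", "cpv:15871210"],
)
nuts_filter = any_of_filter("nutsCode", ["nuts:B3", "nuts:PL", "nuts:RO"])

print(cpv_filter)
print(nuts_filter)
```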
Results T3

    5 CPV Codes
   3 NUTS Codes
       1 query


Time: ~20.65 sec.
T4

Rewrite SPARQL queries
           +
 Use the LIMIT clause
Results T4 wrt T3

    5 CPV Codes
   3 NUTS Codes
       1 query


Time: ~20.55 sec.
Info

     8 graphs

11 M RDF triples
T5

Rewrite SPARQL queries
            +
  Use the LIMIT clause
            +
 Named Graphs (FROM)
Results T5 wrt T3

    5 CPV Codes
   3 NUTS Codes
       1 query


Time: ~20.65 sec.
T6
Rewrite SPARQL queries
             +
  Use the LIMIT clause
             +
 Named Graphs (FROM)
             +
Split into simple queries
Results T6 wrt T3
    5 CPV Codes
   3 NUTS Codes
      4 Graphs
  4 simple queries

Time: ~20.60 sec.
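Splitting by named graph, as in T6, can be sketched as one query per graph using a FROM clause. The graph IRIs below are placeholders (the slides do not name the four graphs), the WHERE block is shortened for brevity, and `split_by_graph` is a hypothetical helper.

```python
from typing import List

# Shortened WHERE block; the full pattern list is in the queries shown earlier.
WHERE_BLOCK = """{
  ?ppn rdf:type <http://purl.org/weso/ppn/def#ppn> .
  ?ppn cpv:codeIn2008 ?cpvCode .
  ?ppn ppn:nutsCode ?nutsCode .
}"""

# Placeholder graph IRIs: an assumption made only for this example.
GRAPHS = [f"http://example.org/graph/{i}" for i in range(1, 5)]


def split_by_graph(where_block: str, graphs: List[str], limit: int = 10000) -> List[str]:
    """Return one query per graph, each restricted to that graph with FROM."""
    return [
        f"SELECT DISTINCT * FROM <{graph}> WHERE {where_block} LIMIT {limit}"
        for graph in graphs
    ]


graph_queries = split_by_graph(WHERE_BLOCK, GRAPHS)
print(len(graph_queries))
print(graph_queries[0])
```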
T6-1
        Rewrite SPARQL queries
                    +
          Use the LIMIT clause
                    +
         Named Graphs (FROM)
                    +
Split enhanced query into simple queries
                    +
   Parallelization of query execution
          (ad-hoc map/reduce)
Results T6-1 wrt T3
    5 CPV Codes
   3 NUTS Codes
      4 Graphs
  4 simple queries

Time: ~11.93 sec.
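The slides only say "ad-hoc map/reduce", so the following is one possible reading rather than the implementation used in the tests: the map step runs each simple query in a thread pool, and the reduce step merges the partial results while dropping duplicate rows. `fake_endpoint` is a stand-in for a real SPARQL endpoint call and returns canned rows so that the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Row = Tuple[str, ...]


def run_parallel(
    queries: Sequence[str],
    run_query: Callable[[str], List[Row]],
    max_workers: int = 4,
) -> List[Row]:
    """Run the queries concurrently and merge their rows without duplicates."""
    # Map step: each simple query is executed in its own worker thread.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_results = list(pool.map(run_query, queries))

    # Reduce step: concatenate the partial results, keeping the first occurrence.
    merged: List[Row] = []
    seen = set()
    for rows in partial_results:
        for row in rows:
            if row not in seen:
                seen.add(row)
                merged.append(row)
    return merged


def fake_endpoint(query: str) -> List[Row]:
    """Hypothetical stand-in for sending a query to the triple store."""
    canned = {
        "q1": [("ppn:1",), ("ppn:2",)],
        "q2": [("ppn:2",), ("ppn:3",)],
    }
    return canned[query]


merged_rows = run_parallel(["q1", "q2"], fake_endpoint)
print(merged_rows)
```

Threads are a reasonable fit here because the workers mostly wait on the triple store; the deduplication plays the role of DISTINCT across the split queries.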
T7
        Rewrite SPARQL queries
                   +
         Use the LIMIT clause
                   +
Split enhanced query into simple queries
Results T7 wrt T3
   1 CPV Code (5)
    3 NUTS Codes
  5 simple queries


Time: ~15.81 sec.
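The T7 split (one simple query per CPV code, each still carrying all NUTS codes) can be sketched as follows. The CPV identifiers are placeholders chosen so that five distinct codes are available, and `split_by_cpv` is a hypothetical helper; the WHERE block is shortened.

```python
from typing import List, Sequence


def split_by_cpv(
    cpv_codes: Sequence[str], nuts_codes: Sequence[str], limit: int = 10000
) -> List[str]:
    """Return one query per CPV code; every query keeps the full NUTS disjunction."""
    nuts_alternatives = " || ".join(f"?nutsCode = {code}" for code in nuts_codes)
    queries = []
    for cpv in cpv_codes:
        queries.append(
            "\n".join(
                [
                    "SELECT DISTINCT * WHERE {",
                    "  ?ppn rdf:type <http://purl.org/weso/ppn/def#ppn> .",
                    "  ?ppn cpv:codeIn2008 ?cpvCode .",
                    f"  FILTER (?cpvCode = {cpv}) .",
                    "  ?ppn ppn:nutsCode ?nutsCode .",
                    f"  FILTER ({nuts_alternatives}) .",
                    "}",
                    f"LIMIT {limit}",
                ]
            )
        )
    return queries


# Placeholder CPV codes (assumption); the NUTS codes are those of the enhanced query.
CPV_CODES = [f"cpv:code{i}" for i in range(1, 6)]
NUTS_CODES = ["nuts:B3", "nuts:PL", "nuts:RO"]

cpv_queries = split_by_cpv(CPV_CODES, NUTS_CODES)
print(len(cpv_queries))
```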
T7-1
        Rewrite SPARQL queries
                    +
          Use the LIMIT clause
                    +
Split enhanced query into simple queries
                    +
   Parallelization of query execution
          (ad-hoc map/reduce)
Results T7-1 wrt T3
   1 CPV Code (5)
   3 NUTS Codes
  5 simple queries


Time: ~10.55 sec.
T8
Rewrite SPARQL queries
             +
  Use the LIMIT clause
             +
 Named Graphs (FROM)
             +
Split into simple queries
Results T8 wrt T3
   1 CPV Code (5)
   3 NUTS Codes
      4 Graphs
  20 simple queries

Time: ~32.34 sec.
T8-1
        Rewrite SPARQL queries
                    +
          Use the LIMIT clause
                    +
         Named Graphs (FROM)
                    +
Split enhanced query into simple queries
                    +
   Parallelization of query execution
          (ad-hoc map/reduce)
Results T8-1 wrt T3
   1 CPV Code (5)
   3 NUTS Codes
      4 Graphs
  20 simple queries

Time: ~18.45 sec.
T9
        Rewrite SPARQL queries
                    +
          Use the LIMIT clause
                    +
Split enhanced query into simple queries
       (1 CPV code+1 NUTS code)
Results T9 wrt T3
    1 CPV Code (5)
   1 NUTS Code (3)
  15 simple queries


Time: ~22.462 sec.
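For the T9 split, every simple query carries exactly one CPV code and one NUTS code, so the set of queries is the Cartesian product of the two code lists. A minimal sketch, again with placeholder CPV codes and a hypothetical helper name:

```python
from itertools import product
from typing import List, Sequence

# Doubled braces are literal braces in str.format templates.
QUERY_TEMPLATE = """SELECT DISTINCT * WHERE {{
  ?ppn rdf:type <http://purl.org/weso/ppn/def#ppn> .
  ?ppn cpv:codeIn2008 ?cpvCode .
  FILTER (?cpvCode = {cpv}) .
  ?ppn ppn:nutsCode ?nutsCode .
  FILTER (?nutsCode = {nuts}) .
}}
LIMIT {limit}"""


def split_by_pair(
    cpv_codes: Sequence[str], nuts_codes: Sequence[str], limit: int = 10000
) -> List[str]:
    """Return one query for each (CPV code, NUTS code) combination."""
    return [
        QUERY_TEMPLATE.format(cpv=cpv, nuts=nuts, limit=limit)
        for cpv, nuts in product(cpv_codes, nuts_codes)
    ]


# Placeholder CPV codes (assumption); NUTS codes as in the enhanced query.
pair_queries = split_by_pair(
    [f"cpv:code{i}" for i in range(1, 6)],
    ["nuts:B3", "nuts:PL", "nuts:RO"],
)
print(len(pair_queries))
```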
T9-1
        Rewrite SPARQL queries
                    +
          Use the LIMIT clause
                    +
Split enhanced query into simple queries
       (1 CPV code+1 NUTS code)
                    +
   Parallelization of query execution
           (ad-hoc map/reduce)
Results T9-1 wrt T3
    1 CPV Code (5)
   1 NUTS Code (3)
  15 simple queries


Time: ~12.77 sec.
T10
 Rewrite SPARQL queries
               +
   Use the LIMIT clause
               +
  Named Graphs (FROM)
               +
  Split into simple queries
(1 CPV code+1 NUTS code)
Results T10 wrt T3
    1 CPV Code (5)
   1 NUTS Code (3)
       4 Graphs
  60 simple queries

Time: ~71.17 sec.
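The number of simple queries in each split test follows directly from the dimensions being split: 5 CPV codes, 3 NUTS codes and 4 graphs. The short check below reproduces the counts reported on the slides; the dictionary layout is only an illustration.

```python
from math import prod

CPV_CODES, NUTS_CODES, GRAPHS = 5, 3, 4

# Dimensions along which each test splits the enhanced query.
split_dimensions = {
    "T6": [GRAPHS],
    "T7": [CPV_CODES],
    "T8": [CPV_CODES, GRAPHS],
    "T9": [CPV_CODES, NUTS_CODES],
    "T10": [CPV_CODES, NUTS_CODES, GRAPHS],
}

# The number of simple queries is the product of the split dimensions.
query_counts = {test: prod(dims) for test, dims in split_dimensions.items()}

for test, count in query_counts.items():
    print(f"{test}: {count} simple queries")
```

The multiplicative growth is what makes T10, with 60 queries, the slowest configuration.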
T10-1
        Rewrite SPARQL queries
                    +
          Use the LIMIT clause
                    +
         Named Graphs (FROM)
                    +
Split enhanced query into simple queries
       (1 CPV code+1 NUTS code)
                    +
   Parallelization of query execution
           (ad-hoc map/reduce)
Results T10-1 wrt T3
    1 CPV Code (5)
   1 NUTS Code (3)
       4 Graphs
  60 simple queries

Time: ~35.13 sec.
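Table of Results

Test     Simple queries   Time (sec.)
Simple         1            ~3.29
T2             1            ~3.26
T3             1            ~20.65
T4             1            ~20.55
T5             1            ~20.65
T6             4            ~20.60
T6-1           4            ~11.93
T7             5            ~15.81
T7-1           5            ~10.55
T8            20            ~32.34
T8-1          20            ~18.45
T9            15            ~22.462
T9-1          15            ~12.77
T10           60            ~71.17
T10-1         60            ~35.13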
Discussion
•       The number of queries is a key factor
•       More CPV codes imply more
        execution time
•       Parallelization improves execution
        time
•       T7-1 is the best execution in terms of
        time:
    •     Rewrite SPARQL queries
    •     Use the LIMIT clause
    •     Split enhanced query into simple queries
    •     Parallelization of query execution
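The points above can be checked against the times reported on the result slides. The sketch below computes each test's speed-up relative to T3 and the effect of parallelization for each sequential/parallel pair. Expressing gain as a ratio of times is an assumption made here; the slides do not define how gain was measured.

```python
# Times in seconds, as reported on the result slides for the enhanced query.
TIMES = {
    "T3": 20.65, "T4": 20.55, "T5": 20.65,
    "T6": 20.60, "T6-1": 11.93,
    "T7": 15.81, "T7-1": 10.55,
    "T8": 32.34, "T8-1": 18.45,
    "T9": 22.462, "T9-1": 12.77,
    "T10": 71.17, "T10-1": 35.13,
}
BASELINE = "T3"

# Speed-up relative to the unoptimized enhanced query (values above 1 are faster).
speedup_vs_baseline = {
    test: TIMES[BASELINE] / seconds for test, seconds in TIMES.items()
}

# Effect of parallelization: sequential time divided by its "-1" counterpart.
parallel_speedup = {
    test: TIMES[test] / TIMES[f"{test}-1"]
    for test in TIMES
    if f"{test}-1" in TIMES
}

fastest = min(TIMES, key=TIMES.get)

print("fastest:", fastest)
for test, ratio in parallel_speedup.items():
    print(f"{test} -> {test}-1: x{ratio:.2f}")
```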
Further Steps

• Distribute graphs in different nodes
  (HW improvement)
• Use of other triple stores
  (SW comparison)
• Add SPARQL 1.1 new features
  (Expressiveness improvement)
• Cache of queries (SW improvement)
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
CAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on BlockchainCAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on Blockchain
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 

WESO CAEPIA-20111108

  • 1. Query Expansion Methods and Performance Evaluation for Reusing Linking Open Data of the European Public Procurement Notices José María Álvarez Rodríguez WESO-Universidad de Oviedo http://purl.org/weso/moldeas/ Tecnologías de Linked Data y sus aplicaciones en España (TLDE) CAEPIA 2011-Tenerife (Spain) 8th of November, 2011 Code: TSI-020100-2010-919
  • 2. Overview Use case & Context SPARQL & Performance Next Steps
  • 3. Objective Creation of a pan-European e-procurement platform
  • 4. E-procurement Long Tail TED BOE (official bulletin of the Spanish Government) BOPA (official bulletin of the Asturian Government)
  • 5. To be able to answer… Which public procurement notices are relevant to Dutch companies (only SMEs) that want to tender for contracts announced by local authorities, with a total value lower than 170K € and a two-year duration, to procure “Road bridge construction work” in the Dutch-speaking region of Flanders (Belgium)?
  • 6. Structuring public procurement notices; Providing new semantic-based services; LOD enrichment; Easing the access to the published data using the LOD approach; Transforming government classifications
  • 7. Preliminary Results
  • 8. Semantic-based Services Problem of «Query Expansion» depending on the kind of information variable
  • 9. Methods of «Query Expansion»
  • 10. Remembering… Which public procurement notices are relevant to Dutch companies (only SMEs) that want to tender for contracts announced by local authorities, with a total value lower than 170K € and a two-year duration, to procure “Road bridge construction work” in the Dutch-speaking region of Flanders (Belgium)?
  • 11. Query… cpv:45221111-3 NL NUTS SME ppn:nutsCode ppn:hasDuration cpv:CodeIn2008 ppn:hasAmount org:classification
  • 12. Applying Query Expansion… cpv:45221111-3 NL NUTS SME ppn:nutsCode ppn:hasDuration cpv:CodeIn2008 ppn:hasAmount org:classification
  • 13. Example of SPARQL query SELECT DISTINCT * WHERE { ?ppn rdf:type <http://purl.org/weso/ppn/def#ppn> . ?ppn ppn:nutsCode ?nutsCode . ?ppn cpv:codeIn2008 ?cpvCode . ?ppn ppn:hasDuration ?duration . ?ppn dc:identifier ?id . ?ppn dc:date ?date . ?ppn ppn:hasAmount ?amount . FILTER(?cpvCode = cpv:45221111-3 ... ) . FILTER ( (xsd:double(?amount) = xsd:long(170,000)) (xsd:double(?amount) = xsd:long(200,000)) ) . FILTER(?nutsCode = nuts:B3 ... ) . FILTER ( (xsd:long(?duration) = xsd:long(2)) (xsd:long(?duration) = xsd:long(3)) ) . }
  • 14. Context Performance of SPARQL Queries ~30 sec.
  • 15. Hardware: DELL PC, 2GB RAM and 30GB hard disk. Software: Virtual Box (version 4.0.6); Linux 2.6.35-22-server #33-Ubuntu 2 SMP x86_64 GNU/Linux; Ubuntu 10.10; OpenLink Virtuoso Opensource-6-20110218
  • 16. Question? How to decrease query execution time without modifying the hardware or using any vendor-specific feature?
  • 17. TripleStore 25 graphs 20 M of RDF Triples But… 8 graphs 11 M of RDF Triples
  • 18. Focus on.. The generation of SPARQL queries
  • 19. Let’s start… 9 SPARQL Queries 3 executions
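A measurement loop for this kind of experiment could be sketched as follows. The slide only states that each query was executed 3 times; averaging the runs is an assumption, and `run_query` is a hypothetical callable standing in for a real call to the SPARQL endpoint.

```python
import statistics
import time
from typing import Callable, List

def time_query(run_query: Callable[[], object], executions: int = 3) -> float:
    """Return the mean wall-clock time in seconds over several executions."""
    timings: List[float] = []
    for _ in range(executions):
        start = time.perf_counter()
        run_query()  # hypothetical: sends one query to the endpoint
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)

# Stand-in for a real endpoint call so the sketch runs without a triple store.
calls: List[int] = []

def fake_query() -> list:
    calls.append(1)
    return []

mean_seconds = time_query(fake_query, executions=3)
```

Using `time.perf_counter` keeps the measurement independent of system clock adjustments; network latency to the endpoint would still be included in each run.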
  • 20. [Table: overview of the test configurations; contents not recoverable]
  • 21. Simple SPARQL query SELECT DISTINCT * WHERE { ?ppn rdf:type <http://purl.org/weso/ppn/def#ppn> . ?ppn ppn:nutsCode ?nutsCode . ?ppn cpv:codeIn2008 ?cpvCode . ?ppn ppn:hasDuration ?duration . ?ppn dc:identifier ?id . ?ppn dc:date ?date . ?ppn ppn:hasAmount ?amount . FILTER(?cpvCode = cpv:15331137) . FILTER(?nutsCode = nuts:UK) . }
  • 22. Simple Query 1 CPV Code 1 NUTS Code Time: ~3,29 sec.
  • 23. T1 Rewrite SPARQL queries: Match triples from specific to general Filter as soon as possible
  • 24. T2 Use the LIMIT clause Value set to 10,000
  • 25. Rewrite SPARQL query SELECT DISTINCT * WHERE { ?ppn rdf:type <http://purl.org/weso/ppn/def#ppn> . ?ppn cpv:codeIn2008 ?cpvCode . FILTER(?cpvCode = cpv:15331137) . ?ppn ppn:nutsCode ?nutsCode . FILTER(?nutsCode = nuts:UK) . ?ppn ppn:hasDuration ?duration . ?ppn dc:identifier ?id . ?ppn dc:date ?date . ?ppn ppn:hasAmount ?amount . } LIMIT 10000
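The rewriting rules of T1 and T2 (most selective patterns first, each filter right after the pattern that binds its variable, plus a LIMIT) can be sketched as a small query builder. This is an illustrative sketch, not the author's generator: the function name `build_rewritten_query` is hypothetical, and prefix declarations are omitted as on the slides.

```python
PPN_TYPE = "<http://purl.org/weso/ppn/def#ppn>"

def build_rewritten_query(cpv_code: str, nuts_code: str, limit: int = 10000) -> str:
    """Emit each FILTER directly after the triple pattern that binds its variable."""
    lines = [
        "SELECT DISTINCT * WHERE {",
        f"  ?ppn rdf:type {PPN_TYPE} .",
        "  ?ppn cpv:codeIn2008 ?cpvCode .",
        f"  FILTER(?cpvCode = {cpv_code}) .",   # filter as soon as ?cpvCode is bound
        "  ?ppn ppn:nutsCode ?nutsCode .",
        f"  FILTER(?nutsCode = {nuts_code}) .",  # filter as soon as ?nutsCode is bound
        "  ?ppn ppn:hasDuration ?duration .",    # unconstrained patterns go last
        "  ?ppn dc:identifier ?id .",
        "  ?ppn dc:date ?date .",
        "  ?ppn ppn:hasAmount ?amount .",
        "}",
        f"LIMIT {limit}",
    ]
    return "\n".join(lines)

query = build_rewritten_query("cpv:15331137", "nuts:UK")
```

Whether the ordering changes the execution plan depends on the triple store's optimizer, which may reorder patterns on its own.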
  • 26. Results T2 1 CPV Code 1 NUTS Code Time: ~3,26 sec.
  • 27. Evaluation There are no significant changes in execution time or gain… and we are interested in “enhanced queries”
  • 29. Enhanced SPARQL query SELECT DISTINCT * WHERE { ?ppn rdf:type <http://purl.org/weso/ppn/def#ppn> . ?ppn ppn:nutsCode ?nutsCode . ?ppn cpv:codeIn2008 ?cpvCode . ?ppn ppn:hasDuration ?duration . ?ppn dc:identifier ?id . ?ppn dc:date ?date . ?ppn ppn:hasAmount ?amount . FILTER(?cpvCode = {cpv:15331137, cpv:48611000, cpv:48611000, cpv:50531510, cpv:15871210}) . FILTER(?nutsCode = {nuts:B3, nuts:PL, nuts:RO}) . }
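The `{…}` notation in the filters above reads as shorthand for "any of these codes". SPARQL 1.0 has no set-membership operator (`IN` arrived with SPARQL 1.1), so one standard way to express it is a disjunction of equality tests. The helper below is a hypothetical sketch of that expansion, not necessarily the form used in the experiments.

```python
from typing import List

def equality_filter(variable: str, codes: List[str]) -> str:
    """Expand a list of codes into a SPARQL 1.0 disjunction of equality tests."""
    if not codes:
        raise ValueError("at least one code is required")
    tests = " || ".join(f"{variable} = {code}" for code in codes)
    return f"FILTER({tests})"

nuts_filter = equality_filter("?nutsCode", ["nuts:B3", "nuts:PL", "nuts:RO"])
single_filter = equality_filter("?cpvCode", ["cpv:15331137"])
```

With a single code the helper degenerates to the plain equality filter of the simple query.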
  • 30. Results T3 5 CPV Codes 3 NUTS Codes 1 query Time: ~20,65 sec.
  • 31. T4 Rewrite SPARQL queries + Use the LIMIT clause
  • 32. Results T4 wrt T3 5 CPV Codes 3 NUTS Codes 1 query Time: ~20,55 sec.
  • 33. Info 8 graphs 11 M of RDF Triples
  • 34. T5 Rewrite SPARQL queries + Use the LIMIT clause + Named Graphs (FROM)
  • 35. Results T5 wrt T3 5 CPV Codes 3 NUTS Codes 1 query Time: ~20,65 sec.
  • 36. T6 Rewrite SPARQL queries + Use the LIMIT clause + Named Graphs (FROM) + Split into simple queries
  • 37. Results T6 wrt T3 5 CPV Codes 3 NUTS Codes 4 Graphs 4 simple queries Time: ~20,60 sec.
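The figure of 4 graphs and 4 simple queries suggests one query per named graph. A minimal sketch under that assumption is shown below; the graph URIs are placeholders, since the slides do not name the graphs, and `queries_per_graph` is a hypothetical helper.

```python
from typing import List

def queries_per_graph(where_block: str, graph_uris: List[str], limit: int = 10000) -> List[str]:
    """Return one query per graph, each restricted to that graph with FROM."""
    return [
        f"SELECT DISTINCT *\nFROM <{uri}>\nWHERE {{\n{where_block}\n}}\nLIMIT {limit}"
        for uri in graph_uris
    ]

# Placeholder graph URIs (assumption): the real graph names are not given.
GRAPHS = [f"http://example.org/graph/{n}" for n in range(1, 5)]
WHERE = "  ?ppn cpv:codeIn2008 ?cpvCode ."
queries = queries_per_graph(WHERE, GRAPHS)
```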
  • 38. T6-1 Rewrite SPARQL queries + Use the LIMIT clause + Named Graphs (FROM) + Split enhanced query into simple queries + Parallelization of query execution (ad-hoc map/reduce)
  • 39. Results T6-1 wrt T3 5 CPV Codes 3 NUTS Codes 4 Graphs 4 simple queries Time: ~11,93 sec.
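The "ad-hoc map/reduce" step is not detailed on the slides. One plausible reading is: run the simple queries concurrently (map), then merge the partial result sets and drop duplicates (reduce). The sketch below follows that reading with a thread pool; `execute` is a hypothetical callable that would wrap the endpoint call, and deduplicating on the `ppn` binding is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, str]

def run_parallel(queries: List[str],
                 execute: Callable[[str], List[Row]],
                 max_workers: int = 4) -> List[Row]:
    """Execute the queries concurrently and merge rows, keeping one row per notice."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_results = list(pool.map(execute, queries))  # map step
    merged: Dict[str, Row] = {}
    for rows in partial_results:  # reduce step
        for row in rows:
            merged.setdefault(row["ppn"], row)  # first occurrence wins
    return list(merged.values())

# In-memory stand-in for the endpoint so the sketch runs without a triple store.
fake_store = {
    "q1": [{"ppn": "n1"}, {"ppn": "n2"}],
    "q2": [{"ppn": "n2"}, {"ppn": "n3"}],
}
results = run_parallel(["q1", "q2"], lambda q: fake_store[q])
```

Threads fit here because the work is I/O-bound: each worker mostly waits for the endpoint to answer.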
  • 40. T7 Rewrite SPARQL queries + Use the LIMIT clause + Split enhanced query into simple queries
  • 41. Results T7 wrt T3 1 CPV Code (5) 3 NUTS Codes 5 simple queries Time: ~15,81 sec.
  • 42. T7-1 Rewrite SPARQL queries + Use the LIMIT clause + Split enhanced query into simple queries + Parallelization of query execution (ad-hoc map/reduce)
  • 43. Results T7-1 wrt T3 1 CPV Code (5) 3 NUTS Codes 5 simple queries Time: ~10,55 sec.
  • 44. T8 Rewrite SPARQL queries + Use the LIMIT clause + Named Graphs (FROM) + Split into simple queries
  • 45. Results T8 wrt T3 1 CPV Code (5) 3 NUTS Codes 4 Graphs 20 simple queries Time: ~32,34 sec.
  • 46. T8-1 Rewrite SPARQL queries + Use the LIMIT clause + Named Graphs (FROM) + Split enhanced query into simple queries + Parallelization of query execution (ad-hoc map/reduce)
  • 47. Results T8-1 wrt T3 1 CPV Code (5) 3 NUTS Codes 4 Graphs 20 simple queries Time: ~18,45 sec.
  • 48. T9 Rewrite SPARQL queries + Use the LIMIT clause + Split enhanced query into simple queries (1 CPV code+1 NUTS code)
  • 49. Results T9 wrt T3 1 CPV Code (5) 1 NUTS Code (3) 15 simple queries Time: ~22,462 sec.
  • 50. T9-1 Rewrite SPARQL queries + Use the LIMIT clause + Split enhanced query into simple queries (1 CPV code+1 NUTS code) + Parallelization of query execution (ad-hoc map/reduce)
  • 51. Results T9-1 wrt T3 1 CPV Code (5) 1 NUTS Code (3) 15 simple queries Time: ~12,77 sec.
  • 52. T10 Rewrite SPARQL queries + Use the LIMIT clause + Named Graphs (FROM) + Split into simple queries (1 CPV code+1 NUTS code)
  • 53. Results T10 wrt T3 1 CPV Code (5) 1 NUTS Code (3) 4 Graphs 60 simple queries Time: ~71,17 sec.
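The query counts reported for T7 to T10 (5, 20, 15 and 60) match the sizes of simple Cartesian products over CPV codes, NUTS codes and graphs. The sketch below reproduces those counts under that reading; the graph names are placeholders, and the CPV list is copied from the enhanced query, where cpv:48611000 appears twice.

```python
from itertools import product

# Codes as listed in the enhanced query (the repeated CPV code is kept as-is).
CPV_CODES = ["cpv:15331137", "cpv:48611000", "cpv:48611000",
             "cpv:50531510", "cpv:15871210"]
NUTS_CODES = ["nuts:B3", "nuts:PL", "nuts:RO"]
GRAPHS = ["g1", "g2", "g3", "g4"]  # placeholder names (assumption)

t7_splits = [(cpv, NUTS_CODES) for cpv in CPV_CODES]   # 1 CPV code, all NUTS codes
t8_splits = list(product(GRAPHS, CPV_CODES))            # per graph, 1 CPV code
t9_splits = list(product(CPV_CODES, NUTS_CODES))        # 1 CPV code + 1 NUTS code
t10_splits = list(product(GRAPHS, CPV_CODES, NUTS_CODES))

counts = (len(t7_splits), len(t8_splits), len(t9_splits), len(t10_splits))
```

The finer the split, the more round trips to the endpoint, which is consistent with T10 being the slowest sequential configuration.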
  • 54. T10-1 Rewrite SPARQL queries + Use the LIMIT clause + Named Graphs (FROM) + Split enhanced query into simple queries (1 CPV code+1 NUTS code) + Parallelization of query execution (ad-hoc map/reduce)
  • 55. Results T10-1 wrt T3 1 CPV Code (5) 1 NUTS Code (3) 4 Graphs 60 simple queries Time: ~35,13 sec.
  • 56. Table of Results
  • 57. Discussion • The number of queries is a key factor • A larger number of CPV codes implies more execution time • Parallelization improves execution time • T7-1 is the best execution in terms of time: • Rewrite SPARQL queries • Use the LIMIT clause • Split the enhanced query into simple queries • Parallelization of query execution
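The comparison behind these points can be recomputed from the times reported on the result slides. The sketch below takes T3 (~20,65 sec.) as the baseline and derives a speed-up factor per configuration; the speed-up metric itself is an illustration and does not appear on the slides.

```python
# Times in seconds, copied from the result slides (decimal commas converted).
T3_SECONDS = 20.65
timings = {
    "T4": 20.55, "T5": 20.65, "T6": 20.60, "T6-1": 11.93,
    "T7": 15.81, "T7-1": 10.55, "T8": 32.34, "T8-1": 18.45,
    "T9": 22.462, "T9-1": 12.77, "T10": 71.17, "T10-1": 35.13,
}

# Speed-up relative to T3: values above 1 are faster than the baseline.
speedup = {name: T3_SECONDS / seconds for name, seconds in timings.items()}

best = min(timings, key=timings.get)
slower_than_baseline = sorted(name for name, s in speedup.items() if s < 1)
```

Every parallel variant is faster than its sequential counterpart, while the configurations that issue 20 or more queries sequentially fall behind the single enhanced query.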
  • 58. Further Steps • Distribute graphs in different nodes (HW improvement) • Use of other triple stores (SW comparison) • Add SPARQL 1.1 new features (Expressiveness improvement) • Cache of queries (SW improvement)
  • 59. Some References… • http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/results/index.html#comparison • http://www.slideshare.net/olafhartig/an-overview-on-linked-data-management-and-sparql-querying-isslod2011 • http://squin.sourceforge.net/ • http://www2.informatik.hu-berlin.de/~hartig/files/Slides_Hartig_ISSLOD2011.pdf • http://www2008.org/papers/pdf/p595-stocker1.pdf • http://www.informatik.uni-freiburg.de/~mschmidt/docs/diss_final01122010.pdf • http://mayor2.dia.fi.upm.es/oeg-upm/files/sparql-dqp/eswc11-bac-ext.pdf • http://www.slideshare.net/olafhartig/the-sparql-query-graph-model-for-query-optimization-1259536 • http://www.w3.org/TR/sparql-features/