EDF2014: Talk of Krzysztof Wecel, Assistant professor, Poznan University of Economics, Poland: Advanced Exploration of Public Procurement Data in Linked Data Paradigm

Selected Talk of Krzysztof Wecel, Assistant professor, Poznan University of Economics, Poland at the European Data Forum 2014, 19 March 2014 in Athens, Greece: Advanced Exploration of Public Procurement Data in Linked Data Paradigm


  1. EU-FP7 LOD2 Project Overview
     Creating Knowledge out of Interlinked Data
     http://lod2.eu
     Advanced Exploration of Public Procurement Data in Linked Data Paradigm
     Krzysztof Wecel - I2G / Poznan University of Economics
     Vojtech Svatek, Jindrich Mynarz - University of Economics, Prague
  2. Motivation
     Starting point:
       • potential benefits for a wide range of players
       • mandatory publication of public contract notices
     Weak points:
       • restrictions of search interfaces
       • GUI in local languages
       • no wider analyses available: aggregations, trends, patterns…
       • lack of machine reasoning
       • geographical context not leveraged
       • no links to external information
     Opportunity:
       • representation in the form of linked data
  3. Public Contracts Ontology
     Classes shown in the diagram: pc:Contract, pc:Notice, pc:Tender, pc:AwardCriteriaCombination, gr:BusinessEntity, gr:Offering, gr:PriceSpecification, vc:VCard
     Other diagram labels: kind, procedure, CPV
  4. Data sources and transformation
     Public contract notices:
       • HTML – navigation, scraping
       • XML – modeling approach
       • mapping issues
     Additional data – the real value:
       • business entities (ARES, CEIDG) and their codes (ICO, NIP, REGON)
       • geographical codes (NUTS, TERYT)
       • geographical coordinates (geocoding)
       • CPV and other vocabularies
       • optional external information
           • Czech Trade Inspection Authority
           • rulings of the Polish National Board of Appeal
  5. Polish dataset characteristics (2013)
       • number of triples: 28.8M
       • notices: 413,382
       • offerings: 922,038
       • contracting authorities: 17,648
       • contractors: 177,136
       • business entities: 194,784
       • unique CPV codes: 11,341
  6. Polish dataset characteristics (2013)
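The dataset figures from slide 5 can be cross-checked with a little arithmetic. The sketch below is only a sanity check. The variable names are illustrative, and the reading that business entities are the sum of contracting authorities and contractors is an observation from the numbers, not something the slides state.

```python
# Figures copied from the "Polish dataset characteristics (2013)" slide.
stats = {
    "notices": 413_382,
    "offerings": 922_038,
    "contracting_authorities": 17_648,
    "contractors": 177_136,
    "business_entities": 194_784,
}

# Observation (not stated on the slide): authorities plus contractors
# add up exactly to the reported number of business entities.
entities_sum = stats["contracting_authorities"] + stats["contractors"]

# Average number of offerings per notice, a rough density indicator.
offerings_per_notice = stats["offerings"] / stats["notices"]

print(entities_sum == stats["business_entities"])
print(round(offerings_per_notice, 2))
```

If the sum really is how the business entity count was derived, it would suggest that no entity was counted in both roles, but the slides do not confirm this.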
  7. Data mining
     Leverage analytical and data mining techniques to find patterns, trends and anomalies in public contract data
     Specific problems of graph data:
       • multidimensionality
       • large number of potential attributes
       • overlap of the classes
       • unbalanced learning data (different counts of classes)
       • loss of information during transformation from graph to tabular data
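The last problem on slide 7, loss of information when a graph is flattened into a table, can be illustrated with a minimal sketch. It assumes a naive flattening that keeps one value per column. The `ex:` identifiers and property names are hypothetical and are not taken from the Public Contracts Ontology, and the slides do not say how the authors performed the transformation.

```python
from collections import defaultdict

# Hypothetical triples; identifiers and property names are illustrative only.
triples = [
    ("ex:contract1", "ex:cpv", "45000000"),
    ("ex:contract1", "ex:cpv", "71000000"),
    ("ex:contract1", "ex:bidder", "ex:companyA"),
    ("ex:contract1", "ex:bidder", "ex:companyB"),
    ("ex:contract2", "ex:cpv", "30000000"),
]


def to_graph(triples):
    """Group objects by subject and predicate, keeping every value."""
    graph = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        graph[s][p].append(o)
    return graph


def flatten(graph, columns):
    """Build one row per subject with one cell per column.

    Only the first value of a multi-valued property is kept, and a
    missing property becomes None.
    """
    rows = []
    for subject, props in graph.items():
        row = {"subject": subject}
        for col in columns:
            values = props.get(col, [])
            row[col] = values[0] if values else None
        rows.append(row)
    return rows


graph = to_graph(triples)
table = flatten(graph, ["ex:cpv", "ex:bidder"])

# Count how many original statements survive in the table.
kept = sum(
    1 for row in table for key, value in row.items()
    if key != "subject" and value is not None
)
lost = len(triples) - kept
print(table)
print(lost)
```

In this toy case two of the five statements disappear, and the second contract gets an empty cell. Alternatives such as one column per value or one row per value combination avoid the loss but inflate the number of attributes or rows, which relates to the other problems listed on the slide.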
  8. CLUSTERING: looking for similar contracts
     beneficiary: bidder
     case:
       • identify contracts from the past that would be most suitable
       • monitor new notices similar to the contracts they have already realised
       • more expressive than typical search language
     beneficiary: contracting authority
     case:
       • help in preparation of specific contract notice (not only CPV)
       • aggregated demand opportunities
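The slide does not say which similarity measure or clustering algorithm was used. As a stand-in, the sketch below ranks past contracts by Jaccard similarity of their feature sets to a new notice, which is the basic building block of the "monitor new notices" case. All contract identifiers and feature values are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical contracts described by sets of features (CPV code, region,
# procedure type). All identifiers and values are made up for illustration.
past_contracts = {
    "c1": {"cpv:45000000", "nuts:PL41", "procedure:open"},
    "c2": {"cpv:30000000", "nuts:PL41", "procedure:open"},
    "c3": {"cpv:45000000", "nuts:PL12", "procedure:open"},
}
new_notice = {"cpv:45000000", "nuts:PL41", "procedure:restricted"}

# Rank past contracts from most to least similar to the new notice.
ranked = sorted(
    ((jaccard(new_notice, features), cid) for cid, features in past_contracts.items()),
    reverse=True,
)
print(ranked[0])
```

Using several features rather than the CPV code alone reflects the "not only CPV" remark on the slide. A real clustering step would group contracts on top of such pairwise similarities.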
  9. ASSOCIATIONS: ties between various market players
     beneficiary: supervisory bodies
     case:
       • discover anomalies
       • contractor-product association: stability of the offer; the tighter the relationship, the more reliable the contractor is
       • contractor-authority association: signals the need to check for corruption
       • analysis of the depth of the market
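One common way to quantify a contractor-authority association is with the support, confidence and lift measures from association rule mining. The slides do not state which measures the authors used, so the sketch below is an assumption, and the award data in it is hypothetical.

```python
from collections import Counter

# Hypothetical awarded contracts as (contracting authority, contractor) pairs.
awards = [
    ("authorityA", "contractorX"),
    ("authorityA", "contractorX"),
    ("authorityA", "contractorX"),
    ("authorityA", "contractorY"),
    ("authorityB", "contractorY"),
    ("authorityB", "contractorZ"),
    ("authorityC", "contractorY"),
    ("authorityC", "contractorZ"),
]


def association(awards, authority, contractor):
    """Return (support, confidence, lift) for the rule authority -> contractor."""
    n = len(awards)
    pair = Counter(awards)[(authority, contractor)]
    by_authority = sum(1 for a, _ in awards if a == authority)
    by_contractor = sum(1 for _, c in awards if c == contractor)
    if n == 0 or by_authority == 0 or by_contractor == 0:
        return 0.0, 0.0, 0.0
    support = pair / n                 # share of all awards made to this pair
    confidence = pair / by_authority   # share of the authority's awards won by the contractor
    lift = confidence / (by_contractor / n)  # above 1: more often than the contractor's overall share
    return support, confidence, lift


print(association(awards, "authorityA", "contractorX"))
```

A high lift is only a signal that a pair deserves a closer look, in the spirit of the slide's "signals the need to check". It is not evidence of wrongdoing by itself.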
  10. PREDICTIVE MODELS: number of bidders
      leverages the link between notices concerning the same procurement process
      beneficiary: contracting authority
      case:
       • the larger the number of bidders, the better
       • one bidder can mean an overspecified contract notice
  11. Geography vs. value association
  12. Number of tenders - long tail
       • 38% of contracts had just one offer
       • one bidder = overspecified contract notice?
       • one contract had 610 offers
       • in some cases large numbers of rejected offers: 298, 245, 111
      [Chart: count by number of tenders, both axes logarithmic]
      Number of tenders   Count     Percentage
      1                   129910    38.2%
      2                   67929     20.0%
      3                   46412     13.6%
      4                   29850     8.8%
      5                   19367     5.7%
      6                   12679     3.7%
      7                   8277      2.4%
      8                   5639      1.7%
      9                   3920      1.2%
      10                  2799      0.8%
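The table on slide 12 lists only the first ten rows, but the percentages allow a rough estimate of the size of the long tail. The sketch below derives an implied total from the first row and checks that the other rows are consistent with it. The implied total is an estimate based on rounded percentages, not a figure from the slides.

```python
# (number of tenders, count of contracts, percentage as printed on the slide)
rows = [
    (1, 129910, 38.2),
    (2, 67929, 20.0),
    (3, 46412, 13.6),
    (4, 29850, 8.8),
    (5, 19367, 5.7),
    (6, 12679, 3.7),
    (7, 8277, 2.4),
    (8, 5639, 1.7),
    (9, 3920, 1.2),
    (10, 2799, 0.8),
]

listed_count = sum(count for _, count, _ in rows)
listed_share = sum(pct for _, _, pct in rows)

# Estimate the total number of contracts from the first row
# (approximate, because the printed percentage is rounded).
implied_total = rows[0][1] / (rows[0][2] / 100)

# Check that every printed percentage matches the implied total.
consistent = all(
    round(count / implied_total * 100, 1) == pct for _, count, pct in rows
)

# Share of contracts with more than ten tenders: the long tail.
tail_share = 100 - listed_share

print(listed_count, round(listed_share, 1), round(implied_total), consistent)
print(round(tail_share, 1))
```

Under these assumptions roughly 96% of contracts received ten tenders or fewer, and the remaining few percent form the tail that reaches up to the 610 offers mentioned on the slide.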
  13. Some other discoveries
      Type of contract vs. number of bidding entities
       • supply contracts are the most popular
       • construction work was the least popular
       • areas of low competitiveness are more susceptible to abuse
      Number of contractors to number of notices ratio
       • should be compared to the typical value for a group of similar contracting authorities
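The comparison of the contractors-to-notices ratio against a peer group could look roughly like the sketch below. It assumes the median as the "typical value" and an arbitrary threshold of half the median. Both choices, and all the counts, are hypothetical and not taken from the slides.

```python
from statistics import median

# Hypothetical per-authority counts: (notices published, distinct contractors).
peer_group = {
    "authorityA": (120, 90),
    "authorityB": (80, 56),
    "authorityC": (200, 150),
    "authorityD": (100, 20),
}

# Ratio of distinct contractors to notices for each authority.
ratios = {
    name: contractors / notices
    for name, (notices, contractors) in peer_group.items()
}

# Assumption: the median of the peer group serves as the typical value.
typical = median(ratios.values())

# Flag authorities whose ratio is below half of the typical value.
# The 0.5 factor is an arbitrary illustrative threshold.
flagged = [name for name, ratio in ratios.items() if ratio < 0.5 * typical]

print(typical, flagged)
```

A low ratio means that few distinct contractors win many of an authority's contracts, which ties in with the slide's remark that areas of low competitiveness are more susceptible to abuse.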
  14. Conclusions
      Public procurement in Czech Republic and Poland
       • ontology has been elaborated
       • significant amount of data gathered
       • we are looking for other interested parties
      Data mining
       • specific issues of graph data have to be addressed
       • old and new tools applied
           • similar contracts by clustering
           • ties between various market players
           • prediction of the number of bidders
