Linked Open Data for Public Contracts

  1. 1. Linked Open Data for Public Contracts
     Martin Nečaský
     Faculty of Mathematics and Physics, Charles University in Prague
     Faculty of Informatics and Statistics, University of Economics in Prague
     13.6.2013, Publications Office of the European Union, Luxembourg
  2. 2. Outline
     • Introduction to Linked Data
     • What benefits does Linked Data bring for TED and public procurement in the EU?
     • What does it mean for TED and others to publish their data as Linked Data?
     • What have we already done in the LOD2 project?
  3. 3. Linked Data - Introduction
  4. 4. Web Applications Eco-system
     • Linked Data helps to create an eco-system of web applications that publish, enrich and consume data about things in one shared global data space.
     [Diagram: applications App 1 to App 5 connected to the Shared Global Data Space on the Web (Web of Data)]
  5. 5. Architecture of Web of Documents
     A shared global space of documents, built on top of several simple principles:
     1. HTML as a format for publishing documents
     2. URLs as unique global identifiers of documents
     3. HTTP for locating and accessing documents by their URLs
     4. Hyperlinks between documents
     Two kinds of applications work in this space of documents:
     • web browsers (locating and browsing documents through hyperlinks)
     • search engines (indexing and full-text searching of documents)
     [Diagram: a web browser and a search engine accessing HTML documents over HTTP]
  6. 6. Web of Documents
     The current Web (of Documents) provides a lot of data about Prague.
     Problems:
     • Data about Prague is encoded in documents distributed across the Web.
     • The documents are intended for humans, not computers.
     • Documents about Prague or related things are not linked.
     • Therefore, computers are not able to process the data about Prague published on the Web.
     Example sources:
     • http://monitor.statnipokladna.cz (Prague budget)
     • http://registry.czso.cz (Basic info about Prague)
     • http://www.praha.eu (Prague public contracts)
     • http://www.czso.cz (Demography of Prague)
     • http://www.risy.cz (EU funded projects in Prague)
  7. 7. Web of Documents
     Try to search for this information on the current Web:
     • Top 100 suppliers of Prague with headquarters outside of the Prague region.
     • Money spent in Prague on new children's playgrounds in the last 5 years, per child.
     • Organizations in Prague funded by EU structural funds and their top 100 suppliers.
     Example sources:
     • http://monitor.statnipokladna.cz (Prague budget)
     • http://registry.czso.cz (Basic info about Prague)
     • http://www.praha.eu (Prague public contracts)
     • http://www.czso.cz (Demography of Prague)
     • http://www.risy.cz (EU funded projects in Prague)
  8. 8. Linked Data
     Data published on the Web according to four simple principles (introduced by Sir Tim Berners-Lee):
     1. Use URIs as names for things.
     2. Use HTTP URIs so that people can look up those names.
     3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
     4. Include links to other URIs so that they can discover more things.
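
Principles 2 and 3 can be illustrated with a minimal Python sketch. It assumes that the publisher supports HTTP content negotiation and can return Turtle, which the slide does not state. The function name `rdf_request` is hypothetical, and the contract URI is the illustrative one used on the later slides. The request is only constructed here, not sent.

```python
from urllib.request import Request


def rdf_request(thing_uri: str) -> Request:
    """Build an HTTP GET request asking for an RDF (Turtle) description of a thing."""
    # Assumption: the server honours the Accept header (content negotiation).
    return Request(thing_uri, headers={"Accept": "text/turtle"})


# Illustrative URI from the talk; it is not expected to resolve.
req = rdf_request("http://praha.eu/contract/7302")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup of the name.
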
  9. 9. Things as first-class citizens
     [Diagram: the things of interest shown as separate nodes: Project CZ.2.16/2.1.00/22189, Prague City, Prague Council, Prague Demography, Prague Budget, Contract DIL/23/07/007302/2010]
  10. 10. HTTP URIs for Things
      [Diagram: HTTP URIs of things, grouped by publisher; labelled thing: Project CZ.2.16/2.1.00/22189]
      • praha.eu (Prague): http://praha.eu/contract/7302, http://praha.eu/council, http://praha.eu/city
      • mfcr.cz (Ministry of Finance): http://mfcr.cz/prague/budget, http://mfcr.cz/prague
      • risy.cz (Regional Information Service): http://risy.cz/location/prague, http://risy.cz/contract/22189-01, http://risy.cz/project/22189
      • czso.cz (Czech Statistical Office): http://registry.czso.cz/prague, http://czso.cz/prague, http://czso.cz/prague/demogstat
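  11. 11. Data about Things in RDF
      Client request: http://praha.eu/contract/7302
      Human-readable view: Playground Revitalization; Authority: Prague; Delivery date: 31.8.2011; Price: 28 444 000 CZK; ...
      [Diagram: the same data as an RDF graph]
      • http://praha.eu/contract/7302, dcterms:title, Playground Revitalization
      • http://praha.eu/contract/7302, pc:contractingAuthority, http://praha.eu/council
      • http://praha.eu/contract/7302, pc:estimatedEndDate, 31.8.2011
      • http://praha.eu/contract/7302, pc:agreedPrice, http://praha.eu/contract/7302/price
      • http://praha.eu/contract/7302/price, gr:hasCurrency, CZK
      • http://praha.eu/contract/7302/price, gr:hasCurrencyValue, 28444000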
  12. 12. Data about Things in RDF
      Client request: http://praha.eu/contract/7302
      Human-readable view: Playground Revitalization; Authority: Prague; Delivery date: 31.8.2011; Price: 28 444 000 CZK; ...
      RDF view:

      <http://www.praha.eu/contract/7302>
          dcterms:title "Playground Revitalization" ;
          pc:estimatedEndDate "31.8.2011" ;
          pc:agreedPrice <http://www.praha.eu/contract/7302/price> ;
          pc:contractingAuthority <http://www.praha.eu/council> .

      <http://www.praha.eu/contract/7302/price>
          gr:hasCurrency "CZK" ;
          gr:hasCurrencyValue "28444000" .
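
The slide's triples can also be produced programmatically. The following is a minimal sketch in plain Python that writes them as N-Triples, under two assumptions that are not on the slide: the `pc:` namespace is taken to be the ontology URL from the next slide with a trailing `#`, and the end date is converted to an ISO 8601 `xsd:date` literal. The helper names `iri` and `literal` are hypothetical.

```python
from datetime import datetime

DCTERMS = "http://purl.org/dc/terms/"
GR = "http://purl.org/goodrelations/v1#"
# Assumption: '#' namespace derived from the Public Contracts Ontology URL.
PC = "http://purl.org/procurement/public-contracts#"
XSD = "http://www.w3.org/2001/XMLSchema#"

contract = "http://www.praha.eu/contract/7302"
price = contract + "/price"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str, datatype: str = "") -> str:
    """Quote a literal, escaping backslashes and quotes; optionally add a datatype."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    suffix = f"^^<{datatype}>" if datatype else ""
    return f'"{escaped}"{suffix}'


# The slide writes the date as "31.8.2011"; converting it is a choice of this sketch.
end_date = datetime.strptime("31.8.2011", "%d.%m.%Y").date().isoformat()

triples = [
    (iri(contract), iri(DCTERMS + "title"), literal("Playground Revitalization")),
    (iri(contract), iri(PC + "estimatedEndDate"), literal(end_date, XSD + "date")),
    (iri(contract), iri(PC + "agreedPrice"), iri(price)),
    (iri(contract), iri(PC + "contractingAuthority"), iri("http://www.praha.eu/council")),
    (iri(price), iri(GR + "hasCurrency"), literal("CZK")),
    (iri(price), iri(GR + "hasCurrencyValue"), literal("28444000")),
]

ntriples = "\n".join(f"{s} {p} {o} ." for s, p, o in triples)
print(ntriples)
```
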
  13. 13. Vocabularies
      • Published RDF data would be hard to interpret if each publisher used its own proprietary predicates.
      • Therefore, standardized (or at least widely used) predicates should be preferred over proprietary ones,
        • e.g. Dublin Core, GoodRelations, FOAF, schema.org, ...
        • or more specific ones for public procurement, e.g. the Public Contracts Ontology (http://purl.org/procurement/public-contracts).
      • Predicates are defined in so-called vocabularies (or ontologies).
      • Note: an ontology is a special case of a vocabulary; it contains more detailed reasoning rules, which are out of the scope of this lecture.
      • Note: not only predicates but also classes (= types of things) are defined in vocabularies/ontologies.
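
A prefixed predicate such as `dcterms:title` is shorthand for a full URI in a shared vocabulary. The following minimal Python sketch expands such names. The `PREFIXES` table and the function `expand` are illustrative, and the `pc` namespace (ontology URL plus `#`) is an assumption.

```python
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "gr": "http://purl.org/goodrelations/v1#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    # Assumption: namespace derived from the ontology URL on the slide.
    "pc": "http://purl.org/procurement/public-contracts#",
}


def expand(name: str) -> str:
    """Expand a prefixed name like 'dcterms:title' into a full predicate URI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {name!r}")
    return PREFIXES[prefix] + local


print(expand("dcterms:title"))
print(expand("gr:hasCurrency"))
```

Two publishers that both use `dcterms:title` therefore refer to exactly the same predicate, which is what lets a consumer interpret their data uniformly.
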
  14. 14. Linking URIs of Related Things
      [Diagram: the URIs of the four publishers, now connected by links]
      • praha.eu (Prague): http://praha.eu/contract/7302, http://praha.eu/city, http://praha.eu/council
      • mfcr.cz (Ministry of Finance): http://mfcr.cz/prague/budget, http://mfcr.cz/prague
      • risy.cz (Regional Information Service): http://risy.cz/location/prague, http://risy.cz/contract/22189-01, http://risy.cz/project/22189
      • czso.cz (Czech Statistical Office): http://registry.czso.cz/prague, http://czso.cz/prague, http://czso.cz/prague/demogstat
      • Link predicates shown between things: a:fundedBy, b:hasBudget, c:hasBeneficiary, d:hasDemography
  15. 15. Linking URIs of Related Things
      [Diagram: the linked URIs of the four publishers, with two additional owl:sameAs links]
      • praha.eu (Prague): http://praha.eu/contract/7302
      • mfcr.cz (Ministry of Finance): http://mfcr.cz/prague/budget, http://mfcr.cz/prague
      • risy.cz (Regional Information Service): http://risy.cz/contract/22189-01, http://risy.cz/project/22189
      • czso.cz (Czech Statistical Office): http://czso.cz/prague/demogstat
      • Nodes drawn around the two owl:sameAs links: http://praha.eu/city, http://risy.cz/location/prague, http://registry.czso.cz/prague, http://czso.cz/prague, http://praha.eu/council
      • Link predicates shown between things: a:fundedBy, b:hasBudget, c:hasBeneficiary, d:hasDemography
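
A consumer can treat URIs connected by owl:sameAs as names for one thing. The following minimal Python sketch groups such URIs with a union-find structure. The function name `same_as_clusters` is hypothetical, and the concrete `SAME_AS` pairs are an assumption for illustration, because the transcript does not show which URIs each owl:sameAs link connects.

```python
def same_as_clusters(pairs):
    """Group URIs connected (directly or transitively) by owl:sameAs links."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    clusters = {}
    for uri in parent:
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())


# Hypothetical owl:sameAs statements between the publishers' URIs for Prague.
SAME_AS = [
    ("http://praha.eu/city", "http://risy.cz/location/prague"),
    ("http://risy.cz/location/prague", "http://registry.czso.cz/prague"),
    ("http://registry.czso.cz/prague", "http://czso.cz/prague"),
]

for cluster in same_as_clusters(SAME_AS):
    print(sorted(cluster))
```
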
  16. 16. Linked Data for TED – What are the benefits?
  17. 17. Benefits of Publishing TED as LD
      • Problem: It is hard to get a unified view of a chosen thing (e.g. a contracting authority, supplier, contract, contract notice, tender, ...) from TED.
        • The data about the thing is distributed across several contract notices.
      • LD solution: Each thing has a unique TED HTTP URI which third-party applications can use to get all TED data about this thing.
        • The data is represented as an RDF graph that respects openly defined vocabularies shared across developers and communities.
        • The data includes links to URIs of other things on TED.
        • TED can flexibly and continuously extend the data provided for the thing.
  18. 18. Benefits of Publishing TED as LD
      [Diagram: User, Web application (?detail=http://ted.eu/contract/CZ/54782145), TED LD Service (http://ted.eu/contract/CZ/54782145); returned graph nodes: http://praha.eu/contract/7302, http://praha.eu/contract/7302/price, http://praha.eu/council]
      TED easily assembles the data related to the requested contract and returns it as an interconnected graph to the requesting web application.
  19. 19. Benefits of Publishing TED as LD
      [Diagram: the User clicks in the Web application (?detail=http://ted.eu/org/CZ/00064581), which calls the TED LD Service (http://ted.eu/org/CZ/00064581); returned graph nodes: http://praha.eu/contract/7302, http://praha.eu/contract/7302/price, http://praha.eu/council]
      TED easily assembles the data related to the requested authority and returns it as an interconnected graph to the requesting web application.
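
How a service might assemble such a graph can be sketched in a few lines of Python. This is an illustration only, not TED's implementation: the function `describe`, the in-memory triple list and the hop limit are assumptions, and the two entries marked hypothetical are invented for the example.

```python
from collections import deque

CONTRACT = "http://praha.eu/contract/7302"
PRICE = CONTRACT + "/price"
COUNCIL = "http://praha.eu/council"

TRIPLES = [
    (CONTRACT, "dcterms:title", "Playground Revitalization"),
    (CONTRACT, "pc:agreedPrice", PRICE),
    (CONTRACT, "pc:contractingAuthority", COUNCIL),
    (PRICE, "gr:hasCurrency", "CZK"),
    (PRICE, "gr:hasCurrencyValue", "28444000"),
    (COUNCIL, "dcterms:title", "Prague Council"),  # hypothetical
    ("http://praha.eu/contract/9999", "dcterms:title", "Unrelated"),  # hypothetical
]


def describe(triples, start, max_hops=1):
    """Collect the triples about `start` and about things it links to."""
    subjects = {s for s, _, _ in triples}
    seen = {start}
    queue = deque([(start, 0)])
    result = []
    while queue:
        node, hops = queue.popleft()
        for s, p, o in triples:
            if s != node:
                continue
            result.append((s, p, o))
            # Follow the link only if the object is itself a described thing.
            if o in subjects and o not in seen and hops < max_hops:
                seen.add(o)
                queue.append((o, hops + 1))
    return result


for triple in describe(TRIPLES, CONTRACT):
    print(triple)
```

Because every thing has its own URI, the same function serves a contract, an authority or a supplier; only the starting URI changes.
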
  20. 20. Problems with HTTP URIs
      • Today, public procurement data is collected from contracting authorities in the form of contract notices (calls for tender, contract award notices, etc.).
      • Notices usually do not contain explicit identifiers of contracting authorities and suppliers.
      • These organizations are usually identified in the notices only by names and addresses, which are often misspelled or incorrect.
      • Therefore, if we create an HTTP URI for an organization from one notice, it is often very hard to recognize whether an organization from another notice is the same one or not.
      • This raises serious questions: What should the HTTP URI of an organization (contracting authority/supplier) look like? How should an organization be identified in a notice so that we can recognize it unambiguously?
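
The difficulty of matching organizations by name can be shown with a small Python sketch. The name variants below are hypothetical examples, not taken from TED notices, and `normalize` is an illustrative helper: it removes case, accent and punctuation differences, but it cannot reconcile abbreviations.

```python
import re
import unicodedata


def normalize(name: str) -> str:
    """Lower-case a name, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", without_accents.lower())
    return " ".join(cleaned.split())


# Hypothetical spellings of one contracting authority in different notices.
variants = ["Hlavní město Praha", "HLAVNI MESTO PRAHA", "Hl. m. Praha"]
for variant in variants:
    print(repr(variant), "->", repr(normalize(variant)))
```

The first two variants normalize to the same string, while the abbreviated third one does not, so name-based matching still leaves open whether the notices refer to the same organization.
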
  21. 21. Problems with HTTP URIs
      • There are two possible solutions. Both are very simple from the technical point of view but very complex from the political point of view (enforcement in all EU countries).
      • 1st solution:
        • Some countries define unique mandatory identifiers for organizations (for both private companies and public institutions).
        • These identifiers should be present in the notices to identify contracting authorities and suppliers.
        • We can then use them to recognize organizations and associate them with the corresponding HTTP URIs.
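
With a mandatory national identifier in each notice, deriving a stable URI becomes mechanical. The following minimal Python sketch assumes the URI pattern of the illustrative URI http://ted.eu/org/CZ/00064581 from slide 19 (country code, then identifier). The function `org_uri` and its validation rules are hypothetical.

```python
import re

# Assumption: pattern copied from the illustrative URI on slide 19.
BASE = "http://ted.eu/org"


def org_uri(country: str, identifier: str) -> str:
    """Derive an organization URI from a country code and a national identifier."""
    country = country.strip().upper()
    identifier = identifier.strip()
    if not re.fullmatch(r"[A-Z]{2}", country):
        raise ValueError(f"expected a two-letter country code, got {country!r}")
    if not re.fullmatch(r"[A-Za-z0-9]+", identifier):
        raise ValueError(f"unexpected characters in identifier {identifier!r}")
    return f"{BASE}/{country}/{identifier}"


# Two notices that spell the organization's name differently still yield one URI.
print(org_uri("cz", " 00064581 "))
print(org_uri("CZ", "00064581"))
```
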
  22. 22. Problems with HTTP URIs
      • 2nd solution:
        • Each organization involved in public procurement should have its own public profile on the Web with its own HTTP URI.
        • The public profile can be a simple HTML web page which also contains a few data items encoded in RDF (technically, this is very simple).
        • The public profile can be part of the official web site of the organization, e.g. http://praha.eu/public-profile
        • Alternatively, the organization can use a service that manages public web profiles of organizations. Such services already exist, e.g. http://opencorporates.org
          • This service already contains profiles of many organizations, associates them with HTTP URIs and provides basic RDF data about them (title, address, etc.).
        • The HTTP URI of the profile should become a part of the notice.
        • This solution also saves time and money because details about the organization do not have to be repeated in each notice: each notice links to the HTTP URI where the information is available.
          • A valid objection is that the profile holds only current information, which can differ from the information that was valid at the time of earlier notices. This can be solved technically (e.g. TED and other authorities responsible for collecting public procurement data can archive the information).
  23. 23. Problems with HTTP URIs
      • 2nd solution:
      [Diagram: the notices http://ted.europa.eu/notice/574832 and http://ted.europa.eu/notice/575833 link to public profiles; edge label shown: pc:contractingAuthority]
      • praha.eu (Prague): http://praha.eu/public-profile
      • company-a.cz (Company A): http://company-a.cz/public-profile
      • opencorporates.org: http://opencorporates.org/company-b/public-profile, http://opencorporates.org/company-c/public-profile, ...
  24. 24. Benefits of Publishing TED as LD
      • Problem: It is hard to find information related to public contracts, contracting authorities and suppliers which is published outside of TED, elsewhere on the Web, e.g.
        • data from the post-award phase
        • public contracts not published on TED
        • profiles of contracting authorities and suppliers
      • LD solution: TED publishes the basic data infrastructure of HTTP URIs of public contracts, contracting authorities, suppliers, etc.
        • Others can enrich this basic infrastructure with their own data.
        • The enriched TED datasets can be consumed by third-party applications and even by TED itself.
  25. 25. Benefits of Publishing TED as LD
      [Diagram: in the Shared Global Data Space (Web), the TED Linked Data Basic Infrastructure is enriched by a publisher of profiles of CZ suppliers and a publisher of post-award data of GE contracts]
  26. 26. Benefits of Publishing TED as LD
      [Diagram: example applications built on the data]
      • PC Filing Application: Suitable suppliers for a contract? Contracts similar to a contract
      • Public spending in Czech Republic "HeatMap" Application: Public spending per inhabitant in 2010
  27. 27. Benefits of Publishing TED as LD
      • Problem: Other authorities must copy TED data into their own databases if they want to use it (which also means republishing TED data).
        • The repeated work of building and maintaining such databases is paid from public budgets (!)
      • LD solution: Other public authorities link their primary data (represented as Linked Data, not necessarily published) to TED without the need to copy, integrate and maintain TED data in their own databases.
        • Anyone who works with the data of such a public authority can get the data directly from TED if necessary.
  28. 28. Benefits of Publishing TED as LD
      • Our planned experiment in the Czech Republic, in cooperation with the Czech Ministry of Finance (MoF), using data about public contracts
      • Goal: to show that institutions can share data by linking it instead of copying it
      [Diagram: linked datasets: CZ Public Budgets (MoF), NUTS & LAU CZ regions, CZ Public Contracts, Demography (Czech Stat. Office). Example question: Public contracts in Prague with Prague budget and demography statistics?]
  29. 29. Benefits for Stakeholders: Contracting Authorities and Suppliers
      • Unified global data space covering various aspects of public procurement across all EU countries.
      • Contracting authorities
        • They can find contracts similar to their own.
        • They can group their calls with other authorities to achieve better offers from suppliers.
        • They can check their requirements against those of other buyers to increase the quality and completeness of their requirements and ask for better prices.
        • They can search for suitable suppliers who have successfully carried out similar contracts in the past.
      • Suppliers
        • They can get the necessary information about open calls for tenders.
        • They can better inform potential customers about their offers.
        • They can analyze previous contracts in their market to better target their tenders and improve the quality of the services they offer.
        • They can group with other suppliers with complementary offers for joint tendering.
  30. 30. Benefits for Stakeholders: EU and Citizens
      • EU saves money
        • Only the basic infrastructure is built and only primary data is published.
          • Related data is published and linked by third parties.
        • There is no need to build and pay for complex applications and services.
          • These will be built by third parties, not only for citizens but also for contracting authorities and suppliers, solely on the basis of their demand.
        • There is no need to duplicate data in different public administration services and applications.
          • Data is linked instead of copied.
      • EU supports building a common market and interoperability (ISA)
      • EU supports transparency
        • Citizens can more easily monitor what public administrations buy in their city/country, from whom and for how much.
        • They can also more easily compare the purchases of their city/country with those of other cities/countries.
  31. 31. Linked Data for TED – What needs to be done to adopt LD principles?
  32. 32. LOD lifecycle
      [Diagram: lifecycle stages: Extraction; Storage, querying; Manual revision, authoring; Interlinking, fusing; Quality analysis; Evolution, repair; Search, browsing, exploration]
      The LOD lifecycle is supported by the LOD2 Stack: http://stack.lod2.eu
  33. 33. Public Procurement and LOD2 Project
      • Vocabulary for publishing public contracts as Linked Data
        • a combination of existing broadly adopted vocabularies and their extension for public procurement (GoodRelations, Payments Ontology, schema.org, Dublin Core, SKOS)
      • Public Contracts filing application
        • a web application for contracting authorities and suppliers
        • It enables publishing data about public contracts as Linked Data.
        • Contracting authorities can search for similar contracts and suitable suppliers.
      • Experimental Linked Data from Czech Republic, Great Britain and TED
  34. 34. Experimental Linked Data from Czech Republic, Great Britain and TED, created as part of the LOD2 project
      [Diagram: linked datasets]
      • CZ Public Contracts
      • Common Procurement Vocabulary
      • CZ Business Entities
      • CZ Demography Stats
      • CZ Public Budgets
      • DBpedia
      • TED Public Contracts and Organizations
      • SDMX
      • CZ LAU Regions
      • NUTS Regions (RAMON)
      • GB Public Contracts and Organizations
      • Products Ontology
  35. 35. Thank You for Your Attention
