Open statistics Belgium

  1. Open Statistics
     - Open Belgium, 6 March 2017
     - Statistics Belgium
     - Lucia Decuyper, Youri Baeyens
  2. Open Statistics: Agenda
     - Statistics Belgium => Open Data
       - Statistics Belgium
       - Open Data Start
       - Statbel Open Data Portal
       - Statistics Belgium in the EU
     - Open Data => Linked Open Data
       - 5*****?
       - RDF
       - LOD
       - Semantic Web
       - Ontologies for statisticians
       - LOD in the NSIs
       - RDF@Statbel
     - Questions
     - Contact
  3. Open Statistics: Statistics Belgium 1
     - Statistics Belgium?
       - National Statistical Institute (formerly NIS)
       - Largest producer of official statistics in Belgium
     - What do we do?
       - Collect data: administrative sources (registers) or surveys
       - Process and analyse data: common methodology and definitions (national, European)
       - Publish data: about 400 releases per year on Statbel
  4. Open Statistics: Statistics Belgium 2
     - One of our core tasks is making all produced statistics available to everyone (European Statistics Code of Practice)
       - Statbel website since 1997
       - Free (re-)use, with source attribution
       - 'Open by default'
     - About 100 statistics
       - The main fields covered are population, society, work, economy, real estate, construction, mobility and transport
       - Census
  5. Open Statistics: Open Data Start?
     - Why?
       - 2nd PSI Directive
       - Belgian Federal Open Data Strategy 2015
       - Digital Agenda (EU)
       - Eurostat => EU Open Data Portal
       - Crossroads Bank for Enterprises (KBO) company register
       - Users
     - Benefits
  6. Open Statistics: Statbel Open Data Portal 1
     - Open Data Portal on the Statbel website since Q4 2015: www.statbel.fgov.be/opendata
       - Population & Census
       - Labour market & living conditions
         - Fiscal statistics on income
       - Environment
       - Prices
         - CPI
       - Tools
         - Geography
         - Codes and Classifications
  7. Open Statistics: Statbel Open Data Portal 2
     - About 110 datasets
     - Formats
       - XLSX => Excel => pivot tables
       - CSV, TXT => R, SAS, ..., PostgreSQL
       - GML, SHP => QGIS, ArcGIS, ...
       - JSON, XML, CSV, XLSX: be.STAT => dynamic databank of Statbel
     - Special care
       - Privacy
       - Continuity
     - Goal: 1 new dataset per month
       - Next: population, households, real estate
  8. Open Statistics: Statistics Belgium in the EU
     - European Statistical System = Eurostat + NSIs
       - Key provider of public open data
       - Draft Open Data Strategy (Feb 2017)
     - [Diagram: portals connected by metadata harvesting]
       - Statistics Belgium: Statbel.fgov.be/opendata
       - Eurostat: key contributor to the open data portals
       - EU Open Data Portal: Data.europa.eu/euodp
       - Belgium: Data.gov.be
       - European Data Portal: www.europeandataportal.eu
  9. Open Statistics: 5*****?
     - Statistics Belgium => Open Data
     - [Figure labels: "Statbel: current situation" and "Statbel: ambition"]
  10. Open Statistics: RDF
      - Resource Description Framework (RDF)
  11. Open Statistics: RDF - Uniform Resource Identifier (URI)
      - Use URIs to identify things, so that people can point at your stuff
        - A URI identifies a concept.
        - Example of a URI for the commune of Rixensart: http://vocab.belgif.be/refnis/25091#id
        - In general, a URI is associated with a web page that documents the concept. For Rixensart: http://vocab.belgif.be/refnis/25091
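The two Rixensart URIs on this slide differ only by the `#id` fragment. Fragments are not sent to the server in an HTTP request, so a client that looks up the concept URI ends up fetching the documentation page. A minimal Python sketch of that split, using only the standard library (the function name `split_concept_uri` is made up for illustration):

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_concept_uri(uri: str) -> Tuple[str, str]:
    """Return (document URL, fragment) for a concept URI such as .../25091#id."""
    document_url, fragment = urldefrag(uri)
    return document_url, fragment


# URI taken from the slide: identifies the concept "commune of Rixensart".
concept_uri = "http://vocab.belgif.be/refnis/25091#id"
document_url, fragment = split_concept_uri(concept_uri)

print(document_url)  # the page that documents the concept
print(fragment)      # the part that distinguishes the concept from its page
```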
  12. Open Statistics: Resource Description Framework (RDF)
      - RDF files store triples of the form "subject-predicate-object"
      - In RDF files,
        - subjects are URIs.
        - predicates are URIs.
        - objects are URIs or literals.
      - Example (nomenclature): <http://vocab.belgif.be/refnis/25091#id> <http://www.w3.org/2004/02/skos/core#prefLabel> "Rixensart"@fr .
      - There are "standard vocabularies" (rules for forming triples). SKOS is one of them.
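The subject-predicate-object structure can be sketched as a small data type in Python. This is only an illustration using the standard library: the class and function names (`LangLiteral`, `Triple`, `to_ntriples`) are invented here, and a real project would more likely rely on an RDF library. The sketch rebuilds the Rixensart example line from the slide.

```python
from typing import NamedTuple, Optional, Union


class LangLiteral(NamedTuple):
    """A plain literal with an optional language tag, e.g. "Rixensart"@fr."""
    value: str
    lang: Optional[str] = None


class Triple(NamedTuple):
    subject: str                  # a URI
    predicate: str                # a URI
    obj: Union[str, LangLiteral]  # a URI (str) or a literal


def format_term(term: Union[str, LangLiteral]) -> str:
    """Write a URI in angle brackets and a literal in double quotes."""
    if isinstance(term, LangLiteral):
        escaped = term.value.replace("\\", "\\\\").replace('"', '\\"')
        tag = f"@{term.lang}" if term.lang else ""
        return f'"{escaped}"{tag}'
    return f"<{term}>"


def to_ntriples(triple: Triple) -> str:
    """Serialise one triple as a single N-Triples style line."""
    return " ".join(format_term(term) for term in triple) + " ."


rixensart_label = Triple(
    subject="http://vocab.belgif.be/refnis/25091#id",
    predicate="http://www.w3.org/2004/02/skos/core#prefLabel",
    obj=LangLiteral("Rixensart", "fr"),
)
line = to_ntriples(rixensart_label)
print(line)
```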
  13. Open Statistics: Resource Description Framework (RDF)
      - It's possible to use "prefixes" to "abbreviate" URIs in RDF files
      - Example:
        @prefix refnis: <http://vocab.belgif.be/refnis/> .
        @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
        refnis:25091#id skos:prefLabel "Rixensart"@fr .
        refnis:25091#id skos:broader refnis:25000#id .
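Prefix abbreviation is plain string substitution: the prefix is swapped for its namespace URI and the local part is appended. A minimal Python sketch, assuming the two prefixes from the slide (the names `PREFIXES` and `expand` are illustrative). It works on plain strings only; a strict Turtle parser would expect the `#` in a local name such as `25091#id` to be escaped as `\#`.

```python
# Prefix table taken from the slide's @prefix declarations.
PREFIXES = {
    "refnis": "http://vocab.belgif.be/refnis/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand(name: str, prefixes: dict = PREFIXES) -> str:
    """Expand a prefixed name such as 'skos:broader' into a full URI."""
    prefix, separator, local = name.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {name!r}")
    return prefixes[prefix] + local


for short in ("refnis:25091#id", "skos:prefLabel", "skos:broader"):
    print(short, "=>", expand(short))
```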
  14. Open Statistics: Resource Description Framework (RDF)
      - Sample RDF file to describe a study (metadata):
        ddi:Study_1 a disco:Study .
        ddi:Study_1 dcterms:title "National Population and Housing Census, 1980"@en .
        ddi:Study_1 dcterms:identifier "ARG_1980_PHC_v01_A_IPUMS" .
      - This description uses the "ddi-rdf" vocabulary (disco):
        - DDI-RDF is "A vocabulary for publishing metadata about data sets (research and survey data) into the Web of Linked Data"
        - Described here: http://rdf-vocabulary.ddialliance.org/discovery.html
  15. Open Statistics: Resource Description Framework (RDF)
      - RDF = forming triples
      - There are several syntaxes to form them
        - Turtle
        - N-Triples
        - RDF/XML
        - ...
  16. Open Statistics: Linked Open Data (LOD)
  17. Open Statistics: Linked Open Data (LOD)
      - It's possible to link several RDF sources. This is referred to as Linked Open Data (LOD). Examples of LOD sites to link to:
        - DBpedia
        - Wikidata
        - GeoNames
      - A simple way to link to another database is to re-use its URIs
  18. Open Statistics: Linked Open Data (LOD)
      - Example of LOD (nomenclature):
        @prefix refnis: <http://vocab.belgif.be/refnis/> .
        @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
        refnis:25091#id skos:prefLabel "Rixensart"@fr .
        refnis:25091#id skos:broader refnis:25000#id .
        refnis:25091#id skos:exactMatch <http://sws.geonames.org/2787990> .
        refnis:25091#id skos:exactMatch <http://www.wikidata.org/entity/Q630478> .
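Once the links are expressed as `skos:exactMatch` triples, a program can follow them from the REFNIS code to the external sources. The Python sketch below holds the slide's triples in a plain list and groups the outgoing links by host name. The in-memory list and the function name `exact_matches_by_host` are assumptions for illustration, not how Statbel stores or queries its data.

```python
from collections import defaultdict
from urllib.parse import urlparse

SKOS = "http://www.w3.org/2004/02/skos/core#"
REFNIS = "http://vocab.belgif.be/refnis/"
RIXENSART = REFNIS + "25091#id"

# (subject, predicate, object) triples copied from the slide, with prefixes expanded.
TRIPLES = [
    (RIXENSART, SKOS + "broader", REFNIS + "25000#id"),
    (RIXENSART, SKOS + "exactMatch", "http://sws.geonames.org/2787990"),
    (RIXENSART, SKOS + "exactMatch", "http://www.wikidata.org/entity/Q630478"),
]


def exact_matches_by_host(triples, subject):
    """Group the skos:exactMatch targets of a subject by the host they point to."""
    links = defaultdict(list)
    for s, p, o in triples:
        if s == subject and p == SKOS + "exactMatch":
            links[urlparse(o).netloc].append(o)
    return dict(links)


links = exact_matches_by_host(TRIPLES, RIXENSART)
for host, uris in links.items():
    print(host, uris)
```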
  19. Open Statistics: Semantic web
  20. Open Statistics: Semantic web
      - All the "subject-predicate-object" statements of the different LOD sources form a giant "knowledge graph" whose size is increasing rapidly
  21. Open Statistics: Semantic web
  22. Open Statistics: Ontologies for statisticians
      - Standard vocabularies
  23. Open Statistics: Standard vocabularies
      - Classifications
        - SKOS: classifications (nomenclatures)
        - XKOS: SKOS extension (for NACE, ...)
      - Document a list of files (catalog)
        - DCAT
        - StatDCAT-AP
        - GeoDCAT-AP
  24. Open Statistics: Standard vocabularies
      - Metadata:
        - Dublin Core
        - DDI-RDF
      - Data:
        - RDF Data Cube Vocabulary
  25. Open Statistics: Standard vocabularies
      - Other interesting vocabularies recommended by Eurostat
        - The Organization Ontology
        - The PROV Ontology
        - Time Ontology in OWL
        - Dublin Core
        - ISA Core Vocabularies in RDF (Person, Public Organisation, Business, Public Service, Location)
        - Vocabulary of Interlinked Datasets (VoID)
  26. Open Statistics: Nomenclatures
      - Some nomenclatures, "controlled vocabularies" & thesauri recommended by Eurostat:
        - INSPIRE code lists
        - EuroVoc thesaurus
        - Named Authority Lists (NAL)
  27. Open Statistics: LOD in the NSIs
      - Some NSIs already have LOD:
        - Insee: some code tables + legal population
        - Istat
        - ONS + Geoportal UK
        - Census 2011 in Ireland
  28. Open Statistics: RDF@Statbel
      - What to publish as LOD?
      - Priorities for publication as LOD:
        - Nomenclatures (create URIs for NACEBEL, REFNIS, ... + create files that expose hierarchies, ...)
        - Catalog of the data (to let the 'machines' all over the world know that our datasets are available in CSV, ...)
        - Metadata
        - A selection of datasets (for example: legal population of municipalities)
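A catalog entry that tells machines a dataset is available as CSV could look roughly like the output of the Python sketch below, which emits a few DCAT triples in N-Triples form. The DCAT and Dublin Core term URIs are the published ones. The dataset URI, the title, the download URL and the `/csv` naming convention are hypothetical placeholders, not actual Statbel identifiers.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
CSV_MEDIA_TYPE = "https://www.iana.org/assignments/media-types/text/csv"


def catalog_entry(dataset_uri: str, title: str, csv_url: str) -> list:
    """Describe one dataset and its CSV distribution as N-Triples lines."""
    distribution_uri = dataset_uri + "/csv"  # hypothetical naming convention
    escaped_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return [
        f"<{dataset_uri}> <{RDF_TYPE}> <{DCAT}Dataset> .",
        f'<{dataset_uri}> <{DCT}title> "{escaped_title}"@en .',
        f"<{dataset_uri}> <{DCAT}distribution> <{distribution_uri}> .",
        f"<{distribution_uri}> <{RDF_TYPE}> <{DCAT}Distribution> .",
        f"<{distribution_uri}> <{DCAT}downloadURL> <{csv_url}> .",
        f"<{distribution_uri}> <{DCAT}mediaType> <{CSV_MEDIA_TYPE}> .",
    ]


# All three arguments are made-up placeholders for illustration.
lines = catalog_entry(
    dataset_uri="http://example.org/dataset/legal-population",
    title="Legal population of municipalities",
    csv_url="http://example.org/files/legal-population.csv",
)
print("\n".join(lines))
```

A harvester such as a national or European data portal would read entries of this kind rather than the data files themselves.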
  29. Open Statistics: Questions
  30. Open Statistics: Contact
      - Check out our websites
      - Explore our datasets
      - Re-use our data
      - Contact us!
      - For questions, please contact:
        - statbel.opendata@economie.fgov.be
        - Lucia.Decuyper@economie.fgov.be
        - Youri.Baeyens@economie.fgov.be
      - To find out more, check:
        - http://statbel.fgov.be
        - https://bestat.statbel.fgov.be
        - http://statbel.fgov.be/opendata/
        - http://statbel.fgov.be/en/statistics/opendata/licence/
      - Follow Statbel on Twitter
