Open statistics Belgium

Presentation by Lucia Decuyper and Youri Baeyens at Open Belgium 2017.

Published in: Data & Analytics

  1. 1. Open Statistics – Open Belgium, 6 March 2017 – Statistics Belgium – Lucia Decuyper, Youri Baeyens
  2. 2. Open Statistics – Agenda ▪ Statistics Belgium => Open Data • Statistics Belgium • Open Data Start • Statbel Open Data Portal • Statistics Belgium in the EU ▪ Open Data => Linked Open Data • 5*****? • RDF • LOD • Semantic Web • Ontologies for statisticians • LOD in the NSIs • RDF@Statbel ▪ Questions ▪ Contact
  3. 3. Open Statistics – Statistics Belgium 1 ▪ Statistics Belgium? – National Statistical Institute (formerly NIS) • largest producer of official statistics in Belgium ▪ What do we do? – Collect data: administrative sources (registers) or surveys – Process and analyse data: • common methodology, definitions (national, European) – Publish data • => +/- 400 yearly releases on Statbel
  4. 4. Open Statistics – Statistics Belgium 2 ▪ One of the core tasks is to make all produced statistics available to everyone (European Statistics Code of Practice) – Statbel website since 1997 – Free (re-)use => source – ‘open by default’ ▪ +/- 100 statistics – The main fields covered are population, society, work, economy, real estate, construction, mobility and transport. – Census
  5. 5. Open Statistics – Open Data Start? ▪ Why? • 2nd PSI Directive • Belgian Federal Open Data strategy 2015 • Digital Agenda (EU) • Eurostat => EU Open Data Portal • Crossroads Bank for Enterprises (KBO) company register • Users ▪ Benefits
  6. 6. Open Statistics – Statbel Open Data Portal 1 ▪ Open Data Portal on the Statbel website since Q4 2015: www.statbel.fgov.be/opendata – Population & Census – Labour market & living conditions • Fiscal statistics on income – Environment – Prices • CPI – Tools • Geography • Codes and Classifications
  7. 7. Open Statistics – Statbel Open Data Portal 2 ▪ +/- 110 datasets ▪ Formats • XLSX → Excel → Pivot tables • CSV, TXT → R, SAS, …, PostgreSQL, … • GML, SHP → QGIS, ArcGIS, … • JSON, XML, CSV, XLSX → be.STAT => dynamic databank of Statbel ▪ Special care – Privacy – Continuity ▪ Goal: 1 new dataset/month – Next: population, households, real estate
  8. 8. Open Statistics – Statistics Belgium in the EU ▪ European Statistical System = Eurostat + NSIs – Key provider of public open data – Draft Open Data Strategy (Feb 2017) Statistics Belgium • Statbel.fgov.be/opendata Eurostat • Key contributor to the open data portals EU Open Data Portal • Data.europa.eu/euodp Belgium • Data.gov.be Metadata harvesting European Data Portal • www.europeandataportal.eu Metadata harvesting
  9. 9. Open Statistics – 5*****? ▪ Statistics Belgium => Open Data Statbel: current situation Statbel: ambition
  10. 10. Open Statistics – RDF Resource description framework (RDF)
  11. 11. Open Statistics – RDF - Uniform resource identifier (URI) ▪ Use URIs to identify things, so that people can point at your stuff – A URI identifies a concept. – Example of a URI for the municipality of Rixensart: http://vocab.belgif.be/refnis/25091#id – In general, a URI is associated with a web page that documents the concept. For Rixensart: http://vocab.belgif.be/refnis/25091
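The relation between a concept URI and the page that documents it can be sketched in a few lines. This assumes, as the Rixensart example suggests, that the concept URI is the page URL with a `#id` fragment appended; the helper name `document_url` is hypothetical.

```python
from urllib.parse import urldefrag


def document_url(concept_uri: str) -> str:
    """Return the URL of the page that documents a concept URI.

    Assumption: the concept is identified by a fragment (here "#id")
    appended to the URL of its documentation page.
    """
    url, _fragment = urldefrag(concept_uri)
    return url


rixensart = "http://vocab.belgif.be/refnis/25091#id"
print(document_url(rixensart))
```

Keeping the concept and its documentation under two distinct but related URIs lets statements be made about the municipality itself rather than about the web page.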
  12. 12. Open Statistics – Resource description framework (RDF) ▪ RDF files store triples of the form “subject-predicate-object” ▪ In RDF files, – subjects are URIs. – predicates are URIs. – objects are URIs or literals. ▪ Example (nomenclature): <http://vocab.belgif.be/refnis/25091#id> <http://www.w3.org/2004/02/skos/core#prefLabel> "Rixensart"@fr . ▪ There are "standard vocabularies" (rules for forming triples). SKOS is one of them.
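As a rough illustration of the subject-predicate-object structure, the sketch below rebuilds the Rixensart statement from its three parts and prints it as one N-Triples line. The `Literal` class and the `term` and `ntriple` helpers are hypothetical names, and the string escaping is deliberately simplified (only backslashes and double quotes).

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """A literal value with an optional language tag."""
    value: str
    lang: Optional[str] = None


def term(t) -> str:
    """Serialize a URI (plain str) or a Literal in N-Triples notation."""
    if isinstance(t, Literal):
        # Simplified escaping: backslashes and double quotes only.
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"' + (f"@{t.lang}" if t.lang else "")
    return f"<{t}>"


def ntriple(subject: str, predicate: str, obj) -> str:
    """Format one subject-predicate-object statement as an N-Triples line."""
    return f"{term(subject)} {term(predicate)} {term(obj)} ."


line = ntriple(
    "http://vocab.belgif.be/refnis/25091#id",
    "http://www.w3.org/2004/02/skos/core#prefLabel",
    Literal("Rixensart", "fr"),
)
print(line)
```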
  13. 13. Open Statistics – Resource description framework (RDF) ▪ It’s possible to use "prefixes" to "abbreviate" URIs in RDF files ▪ Example: @prefix refnis: <http://vocab.belgif.be/refnis/> . @prefix skos: <http://www.w3.org/2004/02/skos/core#> . refnis:25091\#id skos:prefLabel "Rixensart"@fr . refnis:25091\#id skos:broader refnis:25000\#id .
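Prefix abbreviation is plain string substitution, which a short sketch can make concrete. The `expand` helper below is a hypothetical name, not part of any RDF library. It also removes backslash escapes, because Turtle only accepts a `#` inside the local part of a prefixed name when it is written as `\#`.

```python
import re

PREFIXES = {
    "refnis": "http://vocab.belgif.be/refnis/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand(prefixed_name: str, prefixes: dict) -> str:
    """Expand a prefixed name such as 'skos:prefLabel' into a full URI."""
    prefix, local = prefixed_name.split(":", 1)
    # Drop Turtle backslash escapes (for example "\#") before concatenating.
    local = re.sub(r"\\(.)", r"\1", local)
    return prefixes[prefix] + local


print(expand("skos:prefLabel", PREFIXES))
print(expand(r"refnis:25091\#id", PREFIXES))
```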
  14. 14. Open Statistics – Resource description framework (RDF) ▪ Sample RDF file describing a study (metadata): – ddi:Study_1 a disco:Study . – ddi:Study_1 dcterms:title "National Population and Housing Census, 1980"@en . – ddi:Study_1 dcterms:identifier "ARG_1980_PHC_v01_A_IPUMS" . ▪ This description uses the "ddi-rdf" vocabulary (disco): – DDI-RDF is “A vocabulary for publishing metadata about data sets (research and survey data) into the Web of Linked Data” – Described here: http://rdf-vocabulary.ddialliance.org/discovery.html
  15. 15. Open Statistics – Resource description framework (RDF) ▪ RDF = forming triples ▪ There are several syntaxes for writing them – Turtle, – N-Triples, – RDF/XML, – …
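To show that the syntax is independent of the triple itself, the sketch below writes the Rixensart label statement in the XML syntax, using only the Python standard library. It is a minimal illustration that builds the XML by hand and is not a general RDF serializer.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML = "http://www.w3.org/XML/1998/namespace"

# Ask ElementTree to use readable prefixes instead of ns0, ns1, ...
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)

root = ET.Element(f"{{{RDF}}}RDF")
# The subject becomes an rdf:Description element ...
description = ET.SubElement(
    root,
    f"{{{RDF}}}Description",
    {f"{{{RDF}}}about": "http://vocab.belgif.be/refnis/25091#id"},
)
# ... the predicate a child element, and the literal object its text.
label = ET.SubElement(description, f"{{{SKOS}}}prefLabel", {f"{{{XML}}}lang": "fr"})
label.text = "Rixensart"

rdf_xml = ET.tostring(root, encoding="unicode")
print(rdf_xml)
```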
  16. 16. Open Statistics – Linked Open Data (LOD) Linked open-data (LOD)
  17. 17. Open Statistics – Linked Open Data (LOD) ▪ Several RDF sources can be linked together. This is referred to as Linked Open Data (LOD). Examples of LOD sites to link to: – DBpedia – Wikidata – GeoNames ▪ A simple way to link to another database is to re-use its URIs
  18. 18. Open Statistics – Linked Open Data (LOD) ▪ Example of LOD (nomenclature): @prefix refnis: <http://vocab.belgif.be/refnis/> . @prefix skos: <http://www.w3.org/2004/02/skos/core#> . refnis:25091\#id skos:prefLabel "Rixensart"@fr . refnis:25091\#id skos:broader refnis:25000\#id . refnis:25091\#id skos:exactMatch <http://sws.geonames.org/2787990> . refnis:25091\#id skos:exactMatch <http://www.wikidata.org/entity/Q630478> .
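One way to picture what these links give a data consumer: starting from the REFNIS URI, a program can look up which URIs identify the same municipality in other sources. The sketch below holds the two `skos:exactMatch` statements as plain tuples; the `external_links` helper is a hypothetical name.

```python
from urllib.parse import urlparse

EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
RIXENSART = "http://vocab.belgif.be/refnis/25091#id"

# The two linking statements from the slide, as (subject, predicate, object).
triples = [
    (RIXENSART, EXACT_MATCH, "http://sws.geonames.org/2787990"),
    (RIXENSART, EXACT_MATCH, "http://www.wikidata.org/entity/Q630478"),
]


def external_links(subject, triples):
    """Map each external host to the URI that `subject` exactly matches there."""
    return {
        urlparse(obj).netloc: obj
        for s, p, obj in triples
        if s == subject and p == EXACT_MATCH
    }


for host, uri in external_links(RIXENSART, triples).items():
    print(host, "->", uri)
```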
  19. 19. Open Statistics – Semantic web Semantic web
  20. 20. Open Statistics – Semantic web ▪ All the "subject-predicate-object" statements of the different LOD sources form a giant "knowledge graph" that is growing rapidly
  21. 21. Open Statistics – Semantic web
  22. 22. Open Statistics – Ontologies for statisticians Standard vocabularies
  23. 23. Open Statistics – Standard vocabularies ▪ Classifications – SKOS: Classifications (nomenclatures) – XKOS: SKOS extension (for NACE, …) ▪ Document a list of files (catalog) – DCAT – StatDCAT-AP – GeoDCAT-AP
  24. 24. Open Statistics – Standard vocabularies ▪ Metadata: – Dublin Core – DDI-RDF ▪ Data: – RDF Data Cube Vocabulary
  25. 25. Open Statistics – Standard vocabularies ▪ Other interesting vocabularies recommended by Eurostat – The Organization Ontology – The PROV Ontology – Time Ontology in OWL – Dublin Core – ISA Core Vocabularies in RDF (Person, Public Organisation, Business, Public Service, Location) – Vocabulary of Interlinked Datasets (VoID)
  26. 26. Open Statistics – Nomenclatures ▪ Some nomenclatures, "controlled vocabularies" & thesauri recommended by Eurostat: – INSPIRE code lists – EuroVoc thesaurus – Named Authority Lists (NAL)
  27. 27. Open Statistics – LOD in the NSIs ▪ Some NSIs already have LOD: – Insee: Some code tables + legal population – Istat – ONS + Geoportal UK – Census 2011 in Ireland
  28. 28. Open Statistics – RDF@Statbel ▪ What to publish as LOD? ▪ Priorities for publication as LOD: – Nomenclatures (create URIs for NACEBEL, REFNIS, … + create files that expose hierarchies, …) – Catalog of the data (to let the ‘machines’ all over the world know that our datasets are available in CSV, …) – Metadata – A selection of datasets (for example: legal population of municipalities)
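A machine-readable catalog entry of the kind described above could look roughly like the sketch below, which uses the DCAT vocabulary to state that a dataset exists and has a CSV distribution. All `example.org` URIs and the `catalog_entry` helper are placeholders for illustration and are not Statbel's actual identifiers. The title is inserted without escaping, so it must not contain quotes.

```python
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CSV_MEDIA_TYPE = "https://www.iana.org/assignments/media-types/text/csv"


def catalog_entry(dataset_uri, title, csv_url):
    """Return N-Triples lines describing one dataset with a CSV distribution."""
    dist_uri = dataset_uri + "/csv"  # hypothetical naming convention
    triples = [
        (f"<{dataset_uri}>", f"<{RDF_TYPE}>", f"<{DCAT}Dataset>"),
        (f"<{dataset_uri}>", f"<{DCT}title>", f'"{title}"@en'),
        (f"<{dataset_uri}>", f"<{DCAT}distribution>", f"<{dist_uri}>"),
        (f"<{dist_uri}>", f"<{RDF_TYPE}>", f"<{DCAT}Distribution>"),
        (f"<{dist_uri}>", f"<{DCAT}downloadURL>", f"<{csv_url}>"),
        (f"<{dist_uri}>", f"<{DCAT}mediaType>", f"<{CSV_MEDIA_TYPE}>"),
    ]
    return [" ".join(t) + " ." for t in triples]


lines = catalog_entry(
    "http://example.org/dataset/legal-population",      # placeholder URI
    "Legal population of municipalities",
    "http://example.org/files/legal-population.csv",    # placeholder URL
)
print("\n".join(lines))
```

A harvester that understands DCAT can read such statements and list the dataset in its own portal without any manual registration.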
  29. 29. Open Statistics – Questions
  30. 30. Open Statistics – Contact ▪ Check out our websites ▪ Explore our datasets ▪ Re-use our data ▪ And contact us! ▪ For questions please contact: statbel.opendata@economie.fgov.be Lucia.Decuyper@economie.fgov.be Youri.Baeyens@economie.fgov.be ▪ To find out more check: http://statbel.fgov.be https://bestat.statbel.fgov.be http://statbel.fgov.be/opendata/ http://statbel.fgov.be/en/statistics/opendata/licence/ ▪ Follow Statbel on Twitter
