www.kdz.or.at
August 17th, 2019
Bernhard Krabina
Rethinking public sector data
ecosystems -
Open Government Data, Semantic MediaWiki and Wikidata
Introduction
 KDZ – Centre for Public
Administration Research
 Open (Government) Data
 Semantic MediaWiki
Open Data – a short introduction
 Open Government Data means data from the public
sector that is published for everybody to use
without restrictions
 Several Open Government initiatives exist at the
municipal, regional and federal levels
 The European Data Portal collects the
metadata published by member states
 So you need:
 metadata describing the data (type, format, license…) -> data.gv.at
 a place where you can actually retrieve the data -> URL
 Very often open data is the source for
Wikidata/Wikipedia
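The two ingredients listed above (a metadata record and a retrieval URL) can be sketched as follows. This is only an illustration: the record layout is modelled on CKAN-style catalogue entries, which is an assumption, and the title, licence id and URLs are hypothetical.

```python
from __future__ import annotations

from typing import Iterable

# Hypothetical metadata record, shaped like a CKAN-style catalogue entry.
# Title, licence id and URLs are made up for illustration.
SAMPLE_RECORD = {
    "title": "Population by district",
    "license_id": "cc-by-4.0",
    "resources": [
        {"format": "CSV", "url": "https://example.org/data/population.csv"},
        {"format": "PDF", "url": "https://example.org/data/population.pdf"},
    ],
}

MACHINE_READABLE = ("CSV", "JSON", "RDF", "XML")


def download_urls(record: dict, formats: Iterable[str] = MACHINE_READABLE) -> list[str]:
    """Return the URLs of all resources offered in one of the given formats."""
    wanted = {fmt.upper() for fmt in formats}
    return [
        resource["url"]
        for resource in record.get("resources", [])
        if resource.get("format", "").upper() in wanted
    ]


if __name__ == "__main__":
    # The metadata says what the data is and how it is licensed ...
    print(SAMPLE_RECORD["title"], SAMPLE_RECORD["license_id"])
    # ... and the resource URL says where to actually retrieve it.
    print(download_urls(SAMPLE_RECORD))
```

The PDF resource is filtered out by default because it is not machine-readable, which is the distinction the metadata is meant to make visible.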
Semantic MediaWiki – a short
introduction
 „Older sibling“ of Wikidata
 Ecosystem of extensions around managing
data in your own MediaWiki
 Enterprise-ready, used in many large wikis
(open or closed) outside of WMF
 Commercial support available
 SMW installations can be a data source for
Wikidata
The problem:
We live in the age of
big data…
…but still a lot of data is
missing, closed or old,
especially data from „official“
sources (public sector data)
Examples of data needed to
monitor SDGs (at the local level)
 Children in child care facilities (SDG 4)
 Number of male/female officials/public
servants (SDG 5)
 Expenditure for renewable energy (SDG 7)
 Quality of housing (SDG 10)
 Public parks (SDG 11)
 Amount of waste managed (SDG 12)
 Bike routes (SDG 13)
 …
Example: Modal split
 Wikidata: entity, but no data
 Nothing in Wikipedia city articles
 Manual list in article „Modal share“
https://en.wikipedia.org/wiki/Modal_share
https://de.wikipedia.org/wiki/Modal_Split
 Data from 2006, 2016 and 2006 in
 Nothing in Open Data Portals
 Nothing at Statistics Austria
Data ecosystems: central register
Example: inhabitants
 Cities register new residents in a central
database (ZMR)
 ZMR delivers data quarterly to Statistics Austria
 Some cities publish their own statistics on open
data portals (data.gv.at)
 data has to be added to Wikidata manually
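The manual step could in principle be reduced by generating batch commands from published figures. The sketch below assumes the tab-separated text syntax of the QuickStatements tool, which the slides do not mention; P1082 (population), P585 (point in time) and S854 (reference URL as a source) are Wikidata identifiers, while the item id, the number and the source URL in the usage lines are placeholders, not real data.

```python
from __future__ import annotations

from datetime import date

POPULATION = "P1082"     # Wikidata property "population"
POINT_IN_TIME = "P585"   # qualifier "point in time"
REFERENCE_URL = "S854"   # "reference URL", written with S because it is a source


def population_command(item_id: str, inhabitants: int, as_of: date, source_url: str) -> str:
    """Build one tab-separated batch line for a population statement."""
    # "/11" marks day precision in the assumed time syntax.
    point_in_time = f"+{as_of.isoformat()}T00:00:00Z/11"
    fields = [
        item_id,
        POPULATION,
        str(inhabitants),
        POINT_IN_TIME,
        point_in_time,
        REFERENCE_URL,
        f'"{source_url}"',
    ]
    return "\t".join(fields)


if __name__ == "__main__":
    # Placeholder item id and figure; replace with the city's item and
    # the officially published number before using this for real.
    print(population_command("Q4115189", 12345, date(2019, 1, 1), "https://example.org/stats"))
```

Generated lines would still need human review before being submitted, so this shortens the manual work rather than removing it.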
Possible solution: API, open data
 City delivers data to central register
 Register offers (part of the) data via an API
 Statistics Austria, open data portal and
other users (Wikidata) can access data
 Probability: somewhat likely
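What "part of the data" could mean is sketched below: the register keeps person-level entries internally and exposes only aggregated counts. The field names, the sample entries and the payload layout are hypothetical; the slides do not describe an actual ZMR interface.

```python
from __future__ import annotations

import json
from collections import Counter

# Hypothetical register entries. A real residents' register holds personal
# data, which is why only an aggregated part is exposed below.
REGISTER = [
    {"person_id": 1, "municipality": "Town A"},
    {"person_id": 2, "municipality": "Town A"},
    {"person_id": 3, "municipality": "Town B"},
]


def public_part(entries: list[dict]) -> list[dict]:
    """Aggregate register entries into inhabitant counts per municipality."""
    counts = Counter(entry["municipality"] for entry in entries)
    return [
        {"municipality": name, "inhabitants": count}
        for name, count in sorted(counts.items())
    ]


def api_payload(entries: list[dict], reference_date: str) -> str:
    """Serialise the public part as the JSON body an API could return."""
    return json.dumps({"reference_date": reference_date, "data": public_part(entries)})


if __name__ == "__main__":
    print(api_payload(REGISTER, "2019-01-01"))
```

Statistics offices, open data portals and Wikidata tooling could all read the same payload instead of each receiving a separate delivery.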
Data ecosystems: finances
03/2019 (data from 2018) 10/2019
 City agrees on spending report and sends spending data
to regional government
 Regional government sends data to Statistics Austria
 KDZ buys data from Statistics Austria and publishes
it on www.offenerhaushalt.at
 Metadata gets delivered to open data portal (data.gv.at)
Possible solution: publish often
 City delivers data several times
 Probability:
 high for open data (already done)
 unlikely for Wikidata (not going to happen!)
Diagram labels: mandatory, voluntary, voluntary (?)
Possible solution:
municipal data infrastructure
 Cities operate a data infrastructure with
data input and API/export possibilities
 data can be exported to/consumed by many
 probability: likely for some data
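A municipal data infrastructure of this kind needs little more than validated input and several export paths. The sketch below is a toy in-memory version; the class name, field names and the sample indicator are hypothetical and only illustrate the idea of "enter once, export to many".

```python
from __future__ import annotations

import csv
import io
import json

FIELDS = ["municipality", "indicator", "year", "value"]


class IndicatorStore:
    """Toy stand-in for a municipal data store with input and export."""

    def __init__(self) -> None:
        self._rows: list[dict] = []

    def add(self, municipality: str, indicator: str, year: int, value: float) -> None:
        """Data input: reject obviously invalid values before storing."""
        if value < 0:
            raise ValueError("indicator values must not be negative")
        self._rows.append(
            {"municipality": municipality, "indicator": indicator, "year": year, "value": value}
        )

    def to_json(self) -> str:
        """Export for API consumers."""
        return json.dumps(self._rows)

    def to_csv(self) -> str:
        """Export for spreadsheet users and open data portals."""
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()
        writer.writerows(self._rows)
        return buffer.getvalue()


if __name__ == "__main__":
    store = IndicatorStore()
    store.add("Example Town", "bike_route_km", 2018, 12.5)  # made-up figure
    print(store.to_csv())
    print(store.to_json())
```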
Example: Publication
„Austrian Cities in Figures“
3 data sources
 Statistics Austria
 regional governments
 questionnaire for participating
municipalities
Result: „Austrian Cities in Figures“

 Only a PDF publication is
planned
 ~75 cities are included
(of 2098 municipalities)
 Data from 2018 will be
published in the document
„Cities in Figures 2019“ that
will be available in early 2020
 Data will not be available
electronically
 No open license

 SMW has proven to be a
reliable infrastructure
 Easy form-based entering
of data
 Own publication process
(„approve data“), without
wiki know-how
 CSV, RDF, JSON export
for reuse of data
 Visualisation options
(“result formats”)
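The export options can also be reached programmatically. The sketch below assumes the `ask` module that Semantic MediaWiki adds to the MediaWiki API (`api.php?action=ask`); the wiki address, the category, the property name and the sample response are hypothetical, and the response layout shown is a simplified version of what such a query returns.

```python
from __future__ import annotations

from urllib.parse import urlencode


def ask_url(api_base: str, condition: str, printouts: list[str], limit: int = 50) -> str:
    """Build a URL for an Ask query: condition, printouts and limit joined by '|'."""
    parts = [condition] + [f"?{name}" for name in printouts] + [f"limit={limit}"]
    params = {"action": "ask", "query": "|".join(parts), "format": "json"}
    return f"{api_base}?{urlencode(params)}"


def rows(response: dict) -> list[dict]:
    """Flatten an Ask response into one dict per result page."""
    results = response.get("query", {}).get("results", {})
    if not isinstance(results, dict):
        # An empty result set may be serialised as a list instead of an object.
        return []
    flattened = []
    for page, entry in results.items():
        row = {"page": page}
        row.update(entry.get("printouts", {}))
        flattened.append(row)
    return flattened


# Hypothetical response for illustration; the figure is made up.
SAMPLE_RESPONSE = {
    "query": {
        "results": {
            "Example Town": {
                "printouts": {"Population": [12345]},
                "fulltext": "Example Town",
            }
        }
    }
}

if __name__ == "__main__":
    print(ask_url("https://wiki.example.org/api.php", "[[Category:City]]", ["Population"]))
    print(rows(SAMPLE_RESPONSE))
```

Printout values arrive as lists because a property can hold several values per page, so the sketch keeps them as lists rather than guessing which one to pick.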
How to improve the situation:
Basic steps
1. Convince public institutions to open up data
you need: open license, machine-readable
format & infrastructure
2. Publish metadata on open data portals
infrastructure is there, DCAT-AP
3. Put (some of) their data into Wikidata – as
an additional use case
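Step 2 can be illustrated with a metadata record that uses DCAT vocabulary terms. This is a simplified sketch, not a record validated against the DCAT-AP profile; the function name and all sample values are hypothetical, while the `dcat:` and `dct:` namespaces are the standard W3C DCAT and Dublin Core Terms namespaces.

```python
from __future__ import annotations

import json


def dcat_dataset(title: str, description: str, license_url: str,
                 download_url: str, file_format: str) -> dict:
    """Build a minimal JSON-LD description of one dataset with one distribution."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                # Where the data can actually be retrieved.
                "dcat:accessURL": {"@id": download_url},
                # Simplified: a plain label instead of a controlled-vocabulary IRI.
                "dct:format": file_format,
                # The open license that permits reuse.
                "dct:license": {"@id": license_url},
            }
        ],
    }


if __name__ == "__main__":
    record = dcat_dataset(
        title="Bike routes",                      # made-up example dataset
        description="Length of bike routes per year",
        license_url="https://creativecommons.org/licenses/by/4.0/",
        download_url="https://example.org/data/bike-routes.csv",
        file_format="CSV",
    )
    print(json.dumps(record, indent=2))
```

The record carries exactly the three requirements named in step 1: an open license, a machine-readable format and a place to fetch the data.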
How to improve the situation:
Things to consider
If they need to build an open infrastructure,
there are 4 options:
1. Open up existing infrastructure (API)
2. Change to a different infrastructure
3. Donate to open infrastructure (Wikidata,
Commons)
4. Build up open infrastructure
Options?
Wikibase:
 Main use case: runs Wikidata
 Complex installation
 User interface for entering and querying data too complex
 Where provenance metadata is essential
 For integration with Wikibase
Semantic MediaWiki:
 Main use case: manage data in your own wiki
 Simple editing using forms
 Editing restricted to specific user groups
 Managing text and data
 https://www.semantic-mediawiki.org/wiki/Help:Reference_and_provenance_data
More by Denny Vrandecic
https://twitter.com/vrandezo/status/1149012495497138178
Summary
 Current data ecosystems are very complex,
cumbersome and old-fashioned, and
therefore slow
 There is not enough open data available
Let's work together on getting
the best data out of public
institutions so everyone can use
it as simply as possible!
Contact:
Bernhard Krabina
krabina@kdz.or.at
www.kdz.or.at
@krabina
www.semantic-mediawiki.org

Rethinking public sector data ecosystems - Open Government Data, Semantic MediaWiki and Wikidata - Wikimania 2019
