5. 16. GOVERNMENT TRANSPARENCY
The Government believes that we need to throw open the doors of public bodies, to
enable the public to hold politicians and public bodies to account. We also
recognise that this will help to deliver better value for money in public spending,
and help us achieve our aim of cutting the record deficit. Setting government data
free will bring significant economic benefits by enabling businesses and
non-profit organisations to build innovative applications and websites.
We will require public bodies to publish online the job titles of every member of
staff and the salaries and expenses of senior officials paid more than the lowest
salary permissible in Pay Band 1 of the Senior Civil Service pay scale, and
organograms that include all positions in those bodies.
We will ensure that all data published by public bodies is published in an open
and standardised format, so that it can be used easily and with minimal cost by
third parties.
7. Formats for people
Focused on presentation or typographic layout
Look good, but hard to access the underlying data
Formats for machines
Focused on data interchange between computers
Look dreadful, hard for people to understand but easy to import into other
systems and use
8. Single source of data
Formats for people
Focused on presentation or typographic layout
Formats for machines
Focused on data interchange between computers
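A minimal sketch of the single-source idea: the same records are rendered once for people (HTML) and once for machines (CSV, JSON). The sample rows reuse transaction figures that appear elsewhere in this deck; the field names and function names are illustrative assumptions, not an existing tool.

```python
import csv
import html
import io
import json

# Single source of data: sample rows (field names are assumptions).
RECORDS = [
    {"transaction": "A-1263", "supplier": "Spottiswoode & Co", "amount": 2345},
    {"transaction": "A-1264", "supplier": "JSB & Sons", "amount": 2111},
]
FIELDS = ["transaction", "supplier", "amount"]


def to_html(records):
    """Format for people: an HTML table, focused on presentation."""
    rows = "".join(
        "<tr><td>{}</td><td>{}</td><td>£{:,}</td></tr>".format(
            html.escape(r["transaction"]), html.escape(r["supplier"]), r["amount"]
        )
        for r in records
    )
    return "<table>{}</table>".format(rows)


def to_csv(records):
    """Format for machines: plain CSV with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_json(records):
    """Format for machines: JSON array of objects."""
    return json.dumps(records, ensure_ascii=False)
```

Because both outputs are generated from RECORDS, the human-readable and machine-readable versions cannot drift apart.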
9. Download
Good for static information
Small files
Used for export/import
Easy for publishers
Most of the data registered on data.gov.uk
Programmatic access
Good for dynamic or real-time information or very large datasets
Lets developers select and use just the information they need
Retains more control for the publisher
More complicated to implement but much more powerful
Vital for many useful datasets
11. He also developed the first industrially
practical screw-cutting lathe in 1800, allowing
standardisation of screw thread sizes for the
first time. This allowed the concept of
interchangeability (an idea that was already
taking hold) to be practically applied to nuts
and bolts. Before this, all nuts and bolts had to
be made as matching pairs only. This meant
that when machines were disassembled,
careful account had to be kept of the
matching nuts and bolts ready for when
reassembly took place.
http://en.wikipedia.org/wiki/Henry_Maudslay
12. In 1841, Joseph Whitworth created a design
that, through its adoption by many British
railroad companies, became a national
standard for the United Kingdom called
British Standard Whitworth. During the
1840s through 1860s, this standard was
often used in the United States and Canada
as well, in addition to myriad intra- and
inter-company standards.
http://en.wikipedia.org/wiki/Screw_thread#
History_of_standardization
13. * make your stuff available on the Web
(whatever format) under an open licence
** make it available as structured data (e.g.,
Excel instead of image scan of a table)
*** use non-proprietary formats (e.g., CSV
instead of Excel)
**** use URIs to identify things, so that people
can point at your stuff
***** link your data to other data to provide
context
15. Give names, or web identifiers (URIs), to
things
Publish information about them as Web
Resources
Use RDF triples (subject, property, value)
Link to other data about those things
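The four steps above can be sketched with plain Python tuples standing in for RDF triples. All URIs under example.org are hypothetical, and telling links from literals by their "http://" prefix is a simplification for the sketch; only the rdfs:label property URI is a real W3C identifier.

```python
# Hypothetical identifiers; example.org is a placeholder domain.
BASE = "http://example.org/id/"
POST_IN = "http://example.org/def/postIn"  # made-up property
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

dept = BASE + "department/example"
post = BASE + "post/director-general"

# Each statement is a (subject, property, value) triple.
triples = {
    (post, RDFS_LABEL, "Director General"),
    (post, POST_IN, dept),  # link to another named thing
    (dept, RDFS_LABEL, "Example Department"),
}


def describe(subject, graph):
    """Collect {property: [values]} for one subject."""
    description = {}
    for s, p, v in sorted(graph):
        if s == subject:
            description.setdefault(p, []).append(v)
    return description


def linked_things(subject, graph):
    """Values that are themselves URIs, i.e. where to look next."""
    return sorted(
        v for s, _, v in graph if s == subject and v.startswith("http://")
    )
```

Calling linked_things(post, triples) yields the department URI, which can then be passed to describe() in turn.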
16. Enables web-scale data publishing - distributed
publication with web-based discovery mechanisms
Everything is a resource – follow your nose to
discover more about properties, classes, or codes
within a code list
Everything can be annotated - make comments
about observations, data series, points on a map
Easy to extend - create new properties as required,
no need to plan everything up-front
Easy to merge - slot together RDF graphs, no need
to worry about name clashes
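The "easy to merge" point can be illustrated by treating each graph as a set of triples, so that merging is a set union. This is a sketch under assumptions: both publishers and all URIs are hypothetical.

```python
# Two hypothetical publishers, each with its own namespace.
ORG = "http://example.org/org/"
PAY = "http://example.org/pay/"

post = ORG + "post/1"

org_graph = {
    (post, ORG + "title", "Director General"),
    (post, ORG + "reportsTo", ORG + "post/0"),
}
pay_graph = {
    # Same local name "title", different full URI: no clash.
    (post, PAY + "title", "Senior pay record"),
    # A statement both publishers happen to make.
    (post, ORG + "title", "Director General"),
}

# Merging is a union; the repeated statement collapses to one triple.
merged = org_graph | pay_graph

title_properties = sorted(p for _, p, _ in merged if p.endswith("title"))
```

The two "title" properties stay distinct after the merge because each is identified by its full URI rather than its short name.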
18. developing standards for responsible
publishing of key types of data (financial
data, organisation data, aggregate statistics,
location data)
developing guidance, practices and tools
that make it easy to publish data in Linked
Data form, at low cost
making it easy for people to consume data
in a programmatic way
19. Director General
Director (Operations)    Director (Strategy)
Deputy Director (A)      Deputy Director (A)

     2008    2009    2010
A    1,345   1,456   2,301
B    2,112   3,543   2,111
C    2,345   2,987   2,455
D    6,342   6,256   6,123
E    7,435   7,432   8,102

Transaction   Date         Supplier            Amount
A-1263        09/09/2010   Spottiswoode & Co   £2,345
A-1264        09/09/2010   JSB & Sons          £2,111
A-1265        09/09/2010   BLG Ltd             £2,455
A-1266        09/09/2010   Spottiswoode & Co   £6,123
A-1267        09/09/2010   BLG Ltd             £8,102
20. URI = uniform resource identifier
Everything starts with HTTP – which gives us
actionable names
There is choice about how to make URIs
We are using
{sector}.data.gov.uk/id/{something}
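A small sketch of filling in the {sector}.data.gov.uk/id/{something} pattern. The validation rule for the sector name and the example path are assumptions made for illustration; the slides only give the pattern itself.

```python
import re
from urllib.parse import quote

# Assumed rule: a sector is a lowercase DNS-style label.
_SECTOR = re.compile(r"^[a-z][a-z0-9-]*$")


def make_id_uri(sector, something):
    """Build an identifier URI of the form {sector}.data.gov.uk/id/{something}."""
    if not _SECTOR.match(sector):
        raise ValueError("unexpected sector name: {!r}".format(sector))
    # Percent-encode each path segment but keep the slashes between them.
    path = "/".join(quote(part, safe="") for part in something.split("/"))
    return "http://{}.data.gov.uk/id/{}".format(sector, path)
```

For example, make_id_uri("reference", "department/example") gives http://reference.data.gov.uk/id/department/example, where the path is hypothetical.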
23. If you visit legislation.gov.uk you will see we
have taken great care with naming things
Returns an HTML document for United Kingdom Public General Act (ukpga), 2005,
Chapter 14, Section 1
Returns an HTML document with a list from all legislation types where the title
contains “wildlife”
24. UK Public General Act (ukpga)
1981
Chapter 69
Section 5
As it extends to England
As it stood on 30th January 2001
Displayed as an HTML document with the timeline
on
Although URIs are opaque, this type of
design changes how people use the service
26. Everything on legislation.gov.uk is available
as open data under the terms of our Open
Government Licence
To access the data, visit any page and add:
/data.xml
/data.rdf
/data.xht
For lists
/data.feed
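The suffix rule above can be sketched as a helper that derives the data URLs from a page URL. The example page path is an assumption inferred from the type/year/chapter/section naming described in this deck, not copied from the slides, and the sketch assumes list pages offer only the feed suffix.

```python
# Suffixes taken from the slide above.
DOCUMENT_SUFFIXES = ["/data.xml", "/data.rdf", "/data.xht"]
LIST_SUFFIX = "/data.feed"


def data_urls(page_url, is_list=False):
    """Return the open-data URLs for a legislation.gov.uk page URL."""
    base = page_url.rstrip("/")
    if is_list:
        return [base + LIST_SUFFIX]
    return [base + suffix for suffix in DOCUMENT_SUFFIXES]


# Assumed example path (UK Public General Act, 1981, chapter 69, section 5).
EXAMPLE_PAGE = "http://www.legislation.gov.uk/ukpga/1981/69/section/5"
```

Calling data_urls(EXAMPLE_PAGE) returns three URLs ending in /data.xml, /data.rdf and /data.xht.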
27. Re-use where we can, create where we must
Small, high level, light weight vocabularies
Examples include datacube, organization,
provenance
Create local specialisations
Examples include payments, central-government
Post hoc linking
33. http://reference.data.gov.uk/id/day/2011-06-1
There are similar URIs for seconds, minutes,
hours, weeks, months, quarters, years
We were a bit slow (170 years) to move from the
Julian to Gregorian Calendar (see the Calendar
Act, 1750)
To transition, we lost 11 days in 1752
Convoluted explanation of why the tax year in
the UK starts on the 6th April
Our URIs for time intervals work this way too
and the British time intervals URI Set is linked
to the legislation
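A sketch of both points: building a day URI in the reference.data.gov.uk style, and the date arithmetic behind the 6th April. The URI layout is assumed to be /id/day/ followed by an ISO date; the example URI on this slide is cut off, so a different date is used. The tax-year steps follow the commonly given account (old year start on 25 March, shifted by the 11 lost days, then by one more day in 1800) and are offered as an illustration, not an authoritative history.

```python
from datetime import date, timedelta


def day_uri(d):
    """Assumed pattern: /id/day/ followed by the ISO 8601 date."""
    return "http://reference.data.gov.uk/id/day/" + d.isoformat()


# In Britain, 2 September 1752 was followed by 14 September 1752,
# so the dates 3 to 13 September were skipped.
DAYS_LOST = 14 - 2 - 1

# Commonly given account: the year used to start on 25 March (Lady Day).
lady_day = date(1753, 3, 25)

# Keeping the year a full 365 days long moves the start by the lost days.
after_shift = lady_day + timedelta(days=DAYS_LOST)

# 1800 was a leap year in the Julian calendar but not in the Gregorian,
# so, on this account, the start moved one further day.
tax_year_start = date(1800, after_shift.month, after_shift.day) + timedelta(days=1)
```

Python's date type uses the Gregorian calendar throughout, which is fine here because every date in the sketch falls after the 1752 switch.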
35. Malcolm Gladwell article on Ron Popeil from 2000 in
the New Yorker:
“And how do you persuade people to disrupt their lives?
Not merely by ingratiation or sincerity, and not by being
famous or beautiful. You have to explain the invention
to consumers - not once or twice but three or four times,
with a different twist each time. You have to show them
exactly how it works and why it works, and make them
follow your hands as you chop liver with it, and then tell
them precisely how it fits into their routine, and, finally,
sell them on the paradoxical fact that, revolutionary as
the gadget is, it's not at all hard to use.”
40. Open Standard
Generic approach for creating APIs from
Linked Data
Sits on top of a Linked Data store
Several implementations; the most mature is
Puelia
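A sketch of what a request to such an API might look like. The endpoint and the grade filter are hypothetical; the underscore-prefixed paging parameters (_page, _pageSize) and the format extension reflect my understanding of the Linked Data API specification and should be checked against the implementation in use.

```python
from urllib.parse import urlencode


def api_url(endpoint, fmt="json", page=0, page_size=10, **filters):
    """Build a list-endpoint URL with a format extension, paging and filters."""
    params = dict(filters)
    params["_page"] = page           # reserved parameter (assumed name)
    params["_pageSize"] = page_size  # reserved parameter (assumed name)
    # Sort for a stable, predictable query string.
    query = urlencode(sorted(params.items()))
    return "{}.{}?{}".format(endpoint, fmt, query)


# Hypothetical endpoint and filter, for illustration only.
url = api_url("http://example.org/api/posts", page_size=25, grade="SCS1")
```

The same endpoint with fmt="xml" or fmt="html" would ask for the same list in a different format, which is the "formats for people and machines from one source" idea again.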
45. We will require public bodies to publish online
the job titles of every member of staff and the
salaries and expenses of senior officials paid
more than the lowest salary permissible in Pay
Band 1 of the Senior Civil Service pay scale, and
organograms that include all positions in those
bodies.
46. October 2010
CSV template and PDFs of organograms,
typically authored using PowerPoint
Emphasis on visual appearance led to
inconsistent datasets that are very hard to
re-use
No relationship between the organogram and
data
Not using web standards
48. “The Government has published
the most comprehensive
organisational charts of the UK
Civil Service ever released online,
taking another step towards its
goal of being the most transparent
government in the world and
opening up the structure of the
Civil Service to public scrutiny”
49. 100s of UK Government Organisations have published
their organisation data as Linked Data
Distributed data publishing
It is the largest number of organisations joining the Web
of Linked Data in a single day!
The data is deeply linked (Departments, Grades,
Professions, date of the snapshot)
Cross dataset queries are perhaps the most
interesting
Proves Linked Data is moving from research topic to
commodity publishing
We can now extend this approach to other types of
dataset and link our transparency data
50. Make it as simple as possible for people in Departments to
create Linked Data
Create high quality, consistent data that matches the
policy intent and guidance
Distributed capture and publishing
Create open data in open standards using open source
tools
Human readable and machine readable from single source
Provide download and API access in different formats
(CSV, XML, JSON, RDF, HTML)
Evolutionary route to create longitudinal datasets,
reconciling against previous data
Enable everyone to publish 5 Star Linked Data
51. Capture organisation data using a
spreadsheet, which verifies policy rules and
datatypes
Upload spreadsheet
Preview organogram
Download RDF and two CSVs
Publish on your website and register with
data.gov.uk
52. It’s the tool most Civil Servants have
This *does* also work in LibreOffice / OpenOffice
etc.
57. Organogram publishing pipeline (architecture diagram)
Input: Excel file
1. Upload Excel (Organogram, PHP)
2. Create CSVs (Senior CSV, Junior CSV)
3. Create Mapping (Mapping TriG)
4. Query (SPARQL) (XLWrap)
5. Create RDF (RDF file)
6. Load (RDF Store: Sesame, TDB)
7. Query (SPARQL) (Linked Data API, API Config)
Output: HTML, XML, JSON; Organogram (HTML, CSS & JavaScript)
Reconciliation
59. Implicit properties are made explicit (person,
role, person in a role)
Reconciliation adds value by automatic
linking to other data
Provenance
Example data
Explicit open licence
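A sketch of making implicit properties explicit: one spreadsheet-style row becomes separate, named resources for the person, the role, and the person in that role (the post). Column names, URIs and property names are all hypothetical; the actual mapping is not shown in the slides.

```python
import re

BASE = "http://example.org/id/"  # hypothetical identifier namespace
DEF = "http://example.org/def/"  # hypothetical vocabulary namespace


def slug(text):
    """Lowercase and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def row_to_triples(row):
    """Turn one flat row into triples about three distinct things."""
    person = BASE + "person/" + slug(row["name"])
    role = BASE + "role/" + slug(row["job_title"])
    post = BASE + "post/" + slug(row["post_ref"])
    return {
        (person, DEF + "name", row["name"]),
        (role, DEF + "label", row["job_title"]),
        # The post is the "person in a role": it points at both.
        (post, DEF + "heldBy", person),
        (post, DEF + "role", role),
    }


row = {"name": "A. N. Other", "job_title": "Director (Strategy)", "post_ref": "P-001"}
triples = row_to_triples(row)
```

Once the person, role and post each have their own URI, reconciliation can link any of them to other datasets independently.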
62. Linked Data is essential to realising the promise
of Open Government Data
Using Linked Data means working on
Standards
Reference Data
Production
Publishing
Lots of opportunities for international
collaboration
Best advice: just start