In the 21st century, data is infrastructure for our economy, just like roads. In this session, Jeni will talk about the big challenges of building a strong data infrastructure: challenges of equality of access, challenges of privacy and trust, and the technical challenges of discovery and interoperability.
2. We connect, equip and inspire
people around the world
to innovate with data
Sir Tim Berners-Lee, President
Sir Nigel Shadbolt, Chairman
Neelie Kroes, ODI Board
Martha Lane-Fox, ODI Board
4. Unlock £trillions & impact everyone
Link insight from countries, millions of companies, billions
of people and things
Machine-readable data, sensors and the internet of everything
A robust data infrastructure will enable open innovation at web-scale
What is the future of
the web of data?
25 years of the web
of documents
17. Design for open
Build with the web
Respect privacy
Benefit everyone
Think big but start small
Design to adapt
Encourage open innovation
theodi.org/guides/principles-for-strengthening-our-data-infrastructure
42. Open Banking Working Group
Set up in September 2015
at the request of HM Treasury to
explore how data could be used
to help people transact, save,
borrow, lend and invest their
money
45. May 2016, CMA requires banks
to release and make available
transaction data through an
open API and particular
reference and product
information as open data
First change: new sources of data as the world becomes more instrumented. We have the internet of things, carry location trackers in our pockets, post updates on social media, and so on. Any talk about the future has to say something about what a massive change this is and how much more data there is.
Over the past 25 years, the web of documents has grown to billions of pages. The web of data will dwarf the existing web.
But it's not just about the quantity of data. There is a second swathe of changes that come about because of the availability of data: changes in the relationships between the state and both citizens and businesses.
I use the analogy that data is like a road – pretty uninteresting in and of itself, but gets you somewhere
roads connect together, like data connects together, but the other thing about roads is that they don't just appear out of nowhere: we make them
Roads can be poor quality, in which case they're hard to navigate, just like poor quality data is hard to use. But you can still use them.
Most roads start like this, just like most data starts out not being particularly high quality
we choose what roads we make, we choose which roads we invest in to make them wider or easier to travel on, where we put junctions and where bridges
and they make everyone's lives easier
Our road infrastructure is more than just the physical roads, it's also things like the rules of the road
In the same way, our data infrastructure is more than a list of datasets…
not just startups taking advantage of free data sources
large companies like Arup gaining greater insights
companies like Thomson Reuters producing open data to give better value to customers
companies like Syngenta releasing data as part of their accountability
not just digital products – where Sainsbury's places their stores, how they manage the quality of produce they get from their farmers
how councils like Camden plan interventions to tackle child obesity
how parents choose which school to send their child to
how you decide which train to take to work
Datopolis is a board game that we created to help look at the way data works in a society, how we can use it and what's important about the decisions we make about the data infrastructure we build
The game is set in the city of Sheridan, which you're trying to keep functioning well.
You keep track of Social, Economic & Environmental health of the city on the dashboard.
If any of the scores reaches -4, you have one last chance to save the city or you all lose the game.
If they all reach +4, the city has reached nirvana.
You'll be building a data infrastructure together, using data tiles of these different sorts.
You'll be creating applications based on the data that's available in the data infrastructure, and how it's linked together.
Three kinds of cards:
events are drawn at the end of your turn, and they show things happening to the City of Sheridan
tools are ideas you have in your hand, that you can build during your turn if the data infrastructure contains the right data tiles in the right configuration. When you build a tool, you can increase one of the scores on the City Dashboard, depending on what's highlighted at the bottom of the card. Each tool that you build is worth the same number of points as tiles used to create the tool. When someone has built 10 points of tools, it's the end of the game.
role cards tell you what you're aiming for. There are businesses who aim to be first to create 10 points of tools. There are third sector groups who want the score in the area they care about to be highest at the end of the game. And there are public sector groups who care about the data in the data infrastructure being as open as possible.
Turns are detailed on the turn card.
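The scoring rules above can be sketched in a few lines of Python. This is only an illustration of the thresholds mentioned in these notes (-4, +4 and 10 points), not the official rules: the class and function names are made up for the example, and clamping scores to the -4 to +4 range is an assumption.

```python
from dataclasses import dataclass, field

CRISIS_AT = -4    # from the notes: any score at -4 puts the city in danger
NIRVANA_AT = 4    # from the notes: all scores at +4 means nirvana
END_POINTS = 10   # from the notes: the game ends at 10 points of tools


@dataclass
class Dashboard:
    """Hypothetical model of the City Dashboard for Sheridan."""
    scores: dict = field(default_factory=lambda: {
        "social": 0, "economic": 0, "environmental": 0,
    })

    def adjust(self, area, delta):
        # Assumption: scores are clamped to the range -4..+4.
        new_value = self.scores[area] + delta
        self.scores[area] = max(CRISIS_AT, min(NIRVANA_AT, new_value))

    def in_crisis(self):
        # True when any single score has dropped to -4.
        return any(s <= CRISIS_AT for s in self.scores.values())

    def at_nirvana(self):
        # True only when every score has reached +4.
        return all(s >= NIRVANA_AT for s in self.scores.values())


def tool_points(tiles_used):
    # A tool is worth as many points as the tiles used to create it.
    return len(tiles_used)


def game_over(points_by_player):
    # The game ends once someone has built 10 points of tools.
    return any(p >= END_POINTS for p in points_by_player.values())
```

For example, a tool built from three data tiles would be worth `tool_points(["transport", "health", "weather"])`, i.e. 3 points (the tile names here are invented).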
You can't talk about data without talking about people's personal data and how that is handled, so let's get that out of the way first
We have a model that's mostly based on consent, but this is problematic
people often don't understand what they're really consenting to, or have no option but to consent
keeping track of consent is hard, especially when it involves nuances about use
we draw wrong conclusions if we only use data on those who consent (for whatever reason)
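A toy calculation shows how the last point plays out. The numbers below are entirely made up for illustration: a small population in which the people who consent happen to differ from those who do not.

```python
from statistics import mean

# Hypothetical records: (value we want to measure, did the person consent?)
# The figures are invented purely to illustrate the effect.
PEOPLE = [
    (10, True), (12, True), (11, True),
    (30, False), (28, False), (32, False),
]


def average_value(records):
    """Average of the measured value across the given records."""
    return mean(value for value, _ in records)


def consenting_only(records):
    """Keep only the records of people who gave consent."""
    return [record for record in records if record[1]]


true_average = average_value(PEOPLE)                        # whole population
observed_average = average_value(consenting_only(PEOPLE))   # what we can see

print(true_average, observed_average)
```

In this invented population the average over everyone is 20.5, but the average over those who consented is 11, so any decision based only on consented data would be well off the mark.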
What is ownership of data?
Ownership isn't even clear for physical things. When you "own" a flat you might co-own it with someone else, or have taken out a mortgage to pay for it, so that a bank has rights of possession. There are limits to what you can do in your flat, both legally and socially. Others have the right to enter your flat under certain circumstances. Sometimes you might be able to sublet or rent out a room.
Ownership is complicated.
picking blackberries from your land: if I do all the harvesting & packaging work, how much do I owe you?
if ownership is complicated in the real world, it's even more complicated for data
data is not like tangible things, it's infinitely reproducible; my use does not impact your use
Perhaps what matters more is what is done with the data about us: what decisions are made about us based on that data, that then have an impact on our lives.
This is a particular concern where the decisions are made through artificial intelligence.
There are new ways of providing access to data that don't involve copying data wholesale between different organisations, e.g. APIs, microsegmentation
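As a rough sketch of the API idea, the data holder below keeps the raw records to itself and only answers narrow questions about them. The class, its methods and the sample data are hypothetical, chosen for illustration; a real service would sit behind a network API with authentication and auditing.

```python
class AccountData:
    """Hypothetical data holder that never hands out its raw records."""

    def __init__(self, transactions):
        # Each transaction is an (account_id, amount) pair, kept private.
        self._transactions = list(transactions)

    def total_spend(self, account_id):
        # Answers one aggregate question instead of exporting every record.
        return sum(
            amount
            for account, amount in self._transactions
            if account == account_id
        )

    def has_spent_over(self, account_id, threshold):
        # An even narrower answer: a yes/no, with no amounts revealed.
        return self.total_spend(account_id) > threshold


store = AccountData([("a", 5.0), ("a", 7.5), ("b", 1.0)])
print(store.has_spent_over("a", 10))
```

The other organisation gets the answer it needs without ever holding a copy of the underlying transactions.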
We need to have a debate at a societal level about what uses of data are acceptable.
It is a cultural thing.
Transparency around data use helps inform that debate.
We can move data. We can open it and we can close it.
CKAN
Socrata
data.world
LiveStories
European Data Portal
CSV on the web aims to be something that could underpin those changes:
Encourages publication of data in ways that can be discoverable.
Enables provision of metadata, but also the use of data within CSV files.
Supports linking between datasets for both discovery and more sophisticated ranking.
Provides mechanisms for informed presentation.
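To make this concrete, here is a small sketch of the kind of metadata CSV on the Web describes, built as a Python dictionary and serialised to JSON. The property names (`@context`, `url`, `tableSchema`, `columns`, `primaryKey`, `foreignKeys`) come from the CSVW vocabulary; the file names, column names and title are invented for the example, and the header check is a simplified illustration rather than a full validator.

```python
import csv
import io
import json

# Hypothetical metadata for an invented file called schools.csv.
metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "schools.csv",
    "dc:title": "Schools by council (example)",
    "tableSchema": {
        "columns": [
            {"name": "id", "titles": "School ID", "datatype": "string"},
            {"name": "name", "titles": "Name", "datatype": "string"},
            {"name": "council", "titles": "Council", "datatype": "string"},
        ],
        "primaryKey": "id",
        # A link to another (equally hypothetical) dataset.
        "foreignKeys": [{
            "columnReference": "council",
            "reference": {"resource": "councils.csv", "columnReference": "code"},
        }],
    },
}


def header_matches(csv_text, table_metadata):
    """Simplified check: does the CSV header row equal the declared titles?"""
    header = next(csv.reader(io.StringIO(csv_text)))
    columns = table_metadata["tableSchema"]["columns"]
    return header == [column["titles"] for column in columns]


sample_csv = "School ID,Name,Council\n1,Example School,C1\n"
print(header_matches(sample_csv, metadata))
print(json.dumps(metadata, indent=2))
```

The JSON output would typically be published alongside the CSV file, so that tools can discover the data, interpret its columns and follow the link to the related dataset.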