A presentation by Daniel Lewis of the Open Knowledge Foundation.
Delivered at the Cataloguing and Indexing Group Scotland (CIGS) Linked Open Data (LOD) Conference which took place Fri 21 September 2012 at the Edinburgh Centre for Carbon Innovation.
2. OKFN in a nutshell...
Promotes Open Knowledge because:
•Better Governance
•Better Culture
•Better Research
•Better Economy
3. OKFN in a nutshell...
How?
Regional Chapters
•Cambridge
•London
•Edinburgh
•Germany
•Finland
•Brazil
•South Africa
•...
Projects
•CKAN
•OpenSpending
•School of Data
•Textus
•Public Domain Review
•Open Shakespeare
•...
Working Groups
•Open GLAM
•Open Economics
•Open Linguistics
•...
4. Your Presenter...
Daniel Lewis
At Present:
• Full Time PhD Student @ Bristol - Intelligent Systems Lab, investigating Association Rules and Fuzzy Concept Lattices
• Part-Time "Community Coordinator, Data Wrangler and Linked Data Consultant" @ Open Knowledge Foundation. Primarily assisting with Linked Data and Linked Open Data.
History:
• Various positions - Web Developer, Software Engineer, Self-Employed Consultant
• ... Technology Evangelist for OpenLink Software, promoting the use of the Semantic Web & Linked Data
• ... Research Intern at the Knowledge Media Institute (The Open University) looking at Social Semantic Tools and Taxonomies
5. OKFN + Linked Data...
Why Linked Data?
We feel... Linked Open Data
is data in the community
it has context
and allows collaboration
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
6. OKFN + Linked Data...
What are we doing about it?
ckan
A catalogue of datasets
Powers data.gov.uk and other government dataset websites around the world
For various formats, not just Linked Data... so CSV, Web Services, PDF, MS Excel...
7. OKFN + Linked Data...
ckan features
Link to datasets
Store datasets
Search, Query and Filter
Visualisation and Analytics
Access control
http://ckan.org
8. OKFN + Linked Data...
What are we doing about it?
The Open Knowledge Foundation are official partners in the LOD2 project.
•EU Funded (FP7)
•Industry and Academic Members
•Promote LOD to business and the public sector
•Advance LOD tools
•Advance LOD techniques
•Advance LOD datasets
9. OKFN + LOD2
What are we doing about it?
ckan is one of the official LOD2 tools....
10. What are we doing about it?
publicdata.eu
Dataset Catalogue
Aggregation of all public data in the EU.
Enhanced with Linked Data Methods - RDF, SPARQL, Visualisation.
11. Features
Everything that ckan and data.gov.uk have, plus...
EU Applications
* About Corporations
* About taxes
* About legislation
LOD Availability Map
publicdata.eu
12. Who else?
Universities & R&D Organisations:
University of Economics in the Czech Republic
National University of Ireland, Galway
Universität Leipzig
Freie Universität Berlin
Korea Advanced Institute of Science and Technology (KAIST)
Instytut Informatyki Gospodarczej in Poland
Institute Mihajlo Pupin in Serbia
Centrum Wiskunde & Informatica in the Netherlands
Business & Industry:
Open Knowledge Foundation
OpenLink Software
TenForce
Exalead
Zemanta
Wolters Kluwer
13. Who has it helped directly?
2010 Publink:
•Umweltbundesamt GmbH, Austria
•Greater London Authority, U.K.
•The City of Vienna, Austria
•Instituto Canario de Estadística (ISTAC), Canary Islands
•EU Digital Agenda Scoreboard, Belgium
•"Deutsche Biographie" project of the Historische Kommission, Germany
o www.deutsche-biographie.de
14. Who has it helped directly?
2011 Publink:
•Food and Agriculture Organization of the United Nations, Italy
•European Environment Agency, Denmark
•Birmingham City Council, England
•Municipality of Udine, Italy
•Statistical Office of the Republic of Serbia
•Bezirksverordnetenversammlung Berlin-Kreuzberg, Germany
22. Interesting Links...
• OKFN: http://okfn.org/
• LOD2: http://lod2.eu/
• CKAN: http://ckan.org/
• PublicData.eu: http://publicdata.eu/
• Digitised Manuscripts to Europeana: http://dm2e.eu/
• Open GLAM: http://openglam.org/
• Open Bibliography: http://openbiblio.net/
• Open Humanities: http://humanities.okfn.org/
• Public Domain Review: http://publicdomainreview.org/
• Open Shakespeare: http://openshakespeare.org/
• Scotland OKFN mailing list: http://lists.okfn.org/mailman/listinfo/ok-scotland
23. Thank You
Daniel Lewis :: daniel.lewis@okfn.org
presenting on behalf of the
Open Knowledge Foundation :: http://okfn.org/
and this has been my talk on
Linked Open Data in Europe
Thank you...
Editor's Notes
The Open Knowledge Foundation is a non-profit organisation which exists to promote open knowledge to organisations. It does this through community work and through developing open source software and services. OKFN believes that Open Knowledge provides... Better governance: openness improves governance through increased transparency and engagement. Better culture: openness means greater access, sharing and participation in relation to cultural material and activities. Better research: for research to function effectively, and for society to reap the full benefits from research activities, research outputs should be open. Better economy: openness permits easier and more rapid reuse of material, and open data and content are the key raw ingredients for the development of new innovative tools and services.
How do we do this... well, we have a few paid staff, some full time and some part time (such as myself). We have many volunteers. If we have enough people in one area then we create a regional chapter, so that things can happen more efficiently. Our main hubs are in Cambridge and London, but we have chapters in Edinburgh, Germany, Finland, Switzerland and around the world. I happen to live and work in the Bristol area down in the South West. We have a number of projects - including CKAN, OpenSpending, School of Data, Textus, Public Domain Review and Open Shakespeare... and we have a number of interest groups which work on getting open data in certain subject areas - they include Open Galleries, Libraries, Archives and Museums (or OpenGLAM for short), Open Economics, Open Linguistics and Open Humanities. So, just to show you a little bit about my own personal background, hopefully to prove to you that you can trust me as your presenter :-)
Until a few weeks ago I worked full time for the Open Knowledge Foundation as their "Community Coordinator, Data Wrangler and Linked Data Consultant", and so I have worked on various projects with them, mostly ckan and LOD2. Some of my work with OKFN is presented in this presentation. A few weeks ago I moved from a full time contract at OKFN to a part-time freelance contract with them, because I won a full time PhD position at the University of Bristol in their Intelligent Systems Lab, where I am investigating extraction of Fuzzy Concept Lattices from log files using Association Rule Mining. It is related in some way to Linked Data and the Semantic Web, because Formal Concept Analysis (or just Concept Lattices) shares a common structure with the kind of metadata that you get in Linked Data and the Semantic Web, and so the logic behind it is incredibly similar and there is much overlap in the research community. That aside, I've also worked as a Semantic Web Technology Evangelist for OpenLink Software, and a Research Intern at the Knowledge Media Institute of The Open University where I looked at Social Semantic tools and taxonomies. OK, so let's go back to looking at OKFN's involvement with Linked Data.
At OKFN, we feel that Linked Open Data, that is Linked Data which is also Open Data, allows data to be placed in context and allows for collaboration over the world wide web. Linked Data, without the "open", provides structure and meaning to plain data, and places it in a distributed environment (i.e. the world-wide web) which is friendly for all kinds of organisations and users. Open Data allows data to be released to the world, under a licence. Combine the two and you have got the whole reason why Open Knowledge exists. Linked Open Data provides all the properties required for "Open Knowledge", and that is why the Open Knowledge Foundation has got involved - we believe that Open Knowledge provides better governance, culture, research and economy. So Linked Open Data is one of the areas that OKFN supports.
So, what are we, at the Open Knowledge Foundation, doing about Linked Open Data? Well, firstly, we've developed ckan (which is an acronym for Comprehensive Knowledge Archive Network). It is an open source tool for cataloguing datasets. And by datasets, I mean any collection of data - it could be in a record-based format such as a spreadsheet (CSV, or Excel) or it could be in a PDF or Word document, or whatever. It could even be a web service. The data could be under an open licence, a closed licence or no licence at all. One of the most famous installations of ckan in the UK is the government-backed data.gov.uk, which has been used to store and link to various data sources - many of which are CSV files, but there is also a huge push, both from community activists and from people within the public sector, to release data as pure RDF (which is one of the main modelling frameworks used for Linked Open Data). Let's have a quick summary of the features...
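As a rough illustration of the catalogue idea described in this note, the sketch below shows how a client might ask a ckan site for datasets and list the formats each one offers. It assumes the CKAN Action API (version 3) and its `package_search` action, which may differ from what was deployed at the time of the talk. The site URL and the sample response are made up for illustration, and no network request is made.

```python
import json
from urllib.parse import urlencode


def build_search_url(site_url, query, rows=5):
    """Build a package_search URL for a CKAN site (Action API v3 assumed)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{site_url.rstrip('/')}/api/3/action/package_search?{params}"


def summarise_results(response_text):
    """Return (dataset name, sorted resource formats) for each dataset found."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    summary = []
    for dataset in payload["result"]["results"]:
        # A dataset can point at several resources (CSV, RDF, PDF, ...).
        formats = sorted(
            {(res.get("format") or "unknown").upper()
             for res in dataset.get("resources", [])}
        )
        summary.append((dataset["name"], formats))
    return summary


# Hand-written sample shaped like a package_search response (not real data).
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "example-spending",
             "resources": [{"format": "csv"}, {"format": "RDF"}]},
            {"name": "example-report",
             "resources": [{"format": "PDF"}]},
        ],
    },
})

if __name__ == "__main__":
    # http://example.org/ is a placeholder, not a real ckan installation.
    print(build_search_url("http://example.org/", "transport spending", rows=2))
    for name, formats in summarise_results(SAMPLE_RESPONSE):
        print(name, formats)
```

To try this against a live catalogue, the URL from `build_search_url` would be fetched with any HTTP client and the response body passed to `summarise_results`.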
The Open Knowledge Foundation, a few years back, started participating in an EU Funded project called LOD2. The LOD2 project was created to bring together organisations in both industry and academia, in order to promote Linked Open Data to businesses and to government, as well as to advance LOD tools to become production ready, to improve and make clearer LOD creation techniques, and to create LOD datasets themselves. There are also a number of LOD2 programmes where the members of the consortium work with organisations to release their datasets as linked open data, and I'll come back to that in a bit. So, how about OKFN's involvement?
The idea of OKFN taking part in the LOD2 project is to provide our ckan tool to the community, and also to offer our services in open source consultancy and community coordination. So ckan is the official tool in the technology stack for the cataloguing of datasets in the LOD2 project.... and this introduces....
publicdata.eu is essentially a data.gov.uk for the whole of Europe. It aggregates data catalogues, including data.gov.uk, into one central European repository. This ckan installation provides the features that we've developed at OKFN to the Linked Open Data community. At the same time we have been working with the other tool creators in the LOD2 Consortium, meaning that we've started to integrate ckan with software such as Virtuoso Universal Server by OpenLink Software, and also started enhancing linked data visualisation techniques in ckan. Some of these enhanced integrations should be appearing as optional features in the standalone ckan installation within the next year or two. Let's have a quick overview of the features of publicdata.eu that are available now...
But who else is in the LOD2 consortium?
Universities & R&D Organisations:
•University of Economics in the Czech Republic
•National University of Ireland, Galway
•Universität Leipzig
•Freie Universität Berlin
•Korea Advanced Institute of Science and Technology (KAIST)
•Instytut Informatyki Gospodarczej in Poland (Institute of Business Informatics)
•Institute Mihajlo Pupin in Serbia (Mihajlo Pupin Institute)
•Centrum Wiskunde & Informatica (Centre for Mathematics and Computer Science) in the Netherlands
Business & Industry:
•Open Knowledge Foundation
•OpenLink Software
•TenForce
•Exalead
•Zemanta
•Wolters Kluwer
Each year there is a publink competition, where an organisation can bid for help from the LOD2 consortium, to release their data as Linked Open Data or to integrate a Linked Open Data tool. So in 2010 the LOD2 Consortium helped, under the publink project:
•Umweltbundesamt GmbH, Austria (Environment Agency Austria)
•Greater London Authority, U.K.
•Deutsche Biographie project of the Historische Kommission, Germany (German Biography Project of the History Commission)
•The City of Vienna, Austria
•Instituto Canario de Estadística (ISTAC), Canary Islands (Canarian Institute of Statistics)
•EU Digital Agenda Scoreboard, Belgium
One of the organisations here is particularly interesting to library and information professionals... and that is the German Biography Project. It might be good if I just read you the official blurb: "The German Biography is an online project of the Historical Commission at the Bavarian Academy of Science. The original print version of two biographical lexica contains information of about 47,000 biographies, 45,000 additional people and at least 12,000 places." This data is being transformed into Linked Open Data with the help of the LOD2 consortium. It is a project much bigger than just Linked Open Data, as it also involves digitisation, and so it is funded by the German Research Foundation (Deutsche Forschungsgemeinschaft). More information is publicly available at http://www.deutsche-biographie.de/ .
In 2011 the publink consultation happened again, this time helping...
•Food and Agriculture Organization of the United Nations, Italy
•European Environment Agency, Denmark
•Birmingham City Council, Great Britain
•Municipality of Udine, Italy
•Statistical Office of the Republic of Serbia
•Bezirksverordnetenversammlung Berlin-Kreuzberg, Germany (Borough Assembly of Berlin-Kreuzberg)
And so for more information on LOD2 please visit lod2.eu. OK, so before I start wrapping up, I thought that it would be a good idea to take you through the history and future of linked open data, from my own perspective.
Start with a few words on past and future... I think that there is very little point in trying to hide the fact that "Linked Data" comes straight from the "Semantic Web". But is Linked Data just a rebranding of the Semantic Web, or is it something completely different?
The reason why I am saying this is because I have met people who have just looked at me and blindly proclaimed that the Semantic Web has never taken off and never will. To which I say: why? We can certainly see that there are still academic conferences on the Semantic Web, including ISWC and ESWC. There are also a few semantic web interest groups around. Plus some businesses still say that they are Semantic Web consultants. So the Semantic Web isn't really dead...
What is going wrong then? I would say that perhaps it is about target audience... the Semantic Web, and I can put my hand on my heart on this one, was essentially a buzzword which took off in the academic world but didn't take off in the corporate world. There are plenty of instances in academia where Semantic Web frameworks and formats have been used, for instance in biological taxonomies, for chemical materials, for engineering information, even for things like natural language processing, and not to mention bibliographic information... but for some reason it just didn't capture the imagination of business, and got stuck being taught in computer and information degrees at university, where students wouldn't be able to see past the pizza and wine ontologies. However, the Semantic Web is still useful, and there are still groups which are working on the semantic web, hence the conferences I already mentioned. Some of the most useful research in the semantic web field is on ontologies, which are complicated structures, built on pure logic - they can be seen as metadata. There is work on how to extract ontologies from plain text like in books, and how to extract ontologies from more structured texts such as spreadsheets and databases. But what about the more business-friendly and user-friendly world... how does that fit in... well, it comes in the form of Linked Data.
The way that I see it is that Linked Data takes all the tools and techniques of the Semantic Web, and throws away the Semantic Web. It brings the benefits of a distributable structure, and places it in human-friendly terms. It readies it for business, and yet keeps its foot in community. This leads to happy users, as they don't need to know any of the complicated or overly academic stuff regarding the Semantic Web. Linked Data is the Semantic Web in a new context; it has thrown away the baggage of the Semantic Web.... and purely focuses on being useful.
Hence we see:
•Easier to use tools
•Easier data transformation and creation methodologies
•Better integration
•Understandable ontologies, as "schema"
It is about information, rather than logic... and so we see less of a reliance on languages such as RDF and SPARQL... because we know that it is possible to automate or semi-automate the transformation of data held in formats such as CSV and in relational databases into Linked Data.
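To make the point about automated transformation a little more concrete, here is a minimal sketch of turning CSV rows into RDF triples, written out in N-Triples syntax. It is not the LOD2 tooling and not how ckan does it. The `http://example.org/` namespace, the column names and the sample rows are all invented for illustration, and a real conversion would normally map columns onto an agreed vocabulary rather than minting a property per column.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def escape_literal(value):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text, id_column, base=BASE):
    """Turn each non-empty CSV cell into one triple: row -> column -> value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # The identifier column names the thing the row is about.
        subject = f"<{base}resource/{quote(row[id_column], safe='')}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue  # skip the identifier itself and empty cells
            prop = quote(column.strip().replace(" ", "_"), safe="")
            predicate = f"<{base}property/{prop}>"
            lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


# Invented sample data: the second row has an empty "birth place" cell.
SAMPLE_CSV = "id,name,birth place\n1,Ada Lovelace,London\n2,Alan Turing,\n"

if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE_CSV, id_column="id"):
        print(line)
```

Every value is emitted as a plain string literal here. Deciding which values should instead be links to other resources is the part that usually stays semi-automated.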
... and so projects such as LOD2, and organisations such as the Open Knowledge Foundation, provide the world with a capability to understand data. This is a very powerful thing, and certainly has the potential to make life a lot easier in certain respects. This is all community work, and so we can all have a say. It is quite easy to get in touch with the LOD2 project coordinators, and the Open Knowledge Foundation network.
Here are some interesting links... feel free to see me afterwards if you don't catch them now... The OKFN main site and the LOD2 site will link on to most of the others. I imagine that Open GLAM and Open Biblio might be of interest to some of the people here. And so all that now remains...
... is to say thank you, you've been a lovely audience... there's my email if you'd like to get in touch.