Hello. First of all, I would like to thank the organisers for this very interesting conference and for all their efforts in promoting open knowledge for everyone through the availability of open data. It is largely thanks to the open government movement that we are able to talk about open culture data today, and we are inspired by the dynamism of all of you. My name is Georgia Angelaki, I work for Europeana, and I will present what is probably the largest effort to date to open up data from the cultural heritage sector in Europe. I will try to keep my presentation to 20 minutes so that we can have some discussion at the end.
First of all, I wonder if you all know Europeana.
If you don’t know Europeana, you may know her: she is the European Commissioner for the Digital Agenda, who tweeted last year that Europeana is the most visible representation of Europe.
Europeana.eu is this portal.
It started back in 2005, when six member states wrote to European Commission President Barroso to say that European culture is at stake if Google monopolises search on the web. This led to the creation and, in 2008, the launch of “a digital library, that is a single, direct and multilingual access point to the European cultural heritage”. It is a project that has been heavily supported by the European Commission and the member states.
The value of Europeana is mainly created by its network of content providers and aggregators who deliver data to Europeana. These are mainly museums, libraries, archives and audiovisual collections from all over Europe. There are, for example, the Louvre museum in Paris and the British Library in England delivering data. There are pan-European thematic aggregators funded by the European Commission, such as Hope, the portal for social history in Europe, the European Film Gateway, and BHL-Europe, a project on the biodiversity heritage of Europe. There are also local, regional and national aggregators such as BAM, a digital library project in Baden-Württemberg, CulturaItalia and culture.fr. Currently there are 91 direct data providers, but these represent a few hundred memory institutions across Europe. Over the past 2.5 years they have delivered more than 19 million items to Europeana.
Before going further, I’d like to clarify what we mean by data, metadata, items or content in Europeana. Here is a brief results page for a search on the Girl with the Pearl Earring in Europeana. I click on the thumbnail and
I am directed to the detailed view. In Europeana we make a distinction between the digital object itself, which always resides on the content provider’s site, and what we call the metadata. The metadata is information about the object; it is more or less what libraries call a bibliographic record. So Europeana harvests the metadata and a link to the digital object, from which it caches a thumbnail. The user is always directed to the provider’s site to view the object itself in its original “context”. Europeana is therefore a gateway to the providers’ sites. Needless to say, these 19 million records are diverse, often rich and mostly of high quality, and the datasets have been harmonised and enriched by Europeana and intermediaries. For this reason they constitute a very valuable pool of information…
about the artistic, scientific, social and political history of Europe and beyond, and of everyday culture.
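As a rough illustration of the split between metadata and digital object described above, here is a minimal Python sketch. The class name, field names and URLs are all hypothetical and chosen only for illustration; they do not reflect Europeana's actual data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical shape of a harvested record (illustrative field names)."""

    title: str
    provider: str
    object_url: str  # link to the digital object on the provider's own site
    thumbnail_url: Optional[str] = None  # preview cached by the portal


def landing_url(record: MetadataRecord) -> str:
    """Return where a user is sent to view the object itself.

    The portal holds only the metadata and a cached thumbnail, so the
    full object is always viewed on the provider's site.
    """
    return record.object_url


# Placeholder values; none of these URLs are real endpoints.
record = MetadataRecord(
    title="Girl with the Pearl Earring",
    provider="Example Museum",
    object_url="https://provider.example/objects/123",
    thumbnail_url="https://portal.example/thumbs/123.jpg",
)
print(landing_url(record))
```

The point of the sketch is only that the portal-side record carries a pointer to the object rather than the object itself, which is why the talk can describe Europeana as a gateway to the providers' sites.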
So, business was going well for us over the past couple of years: the providers were happily digitising and delivering metadata to Europeana, and everyone wanted to be visible on Europeana. We have largely supported this model of vertical and horizontal aggregation of cultural data, which proved an interesting business model, and lots of new aggregators sprang up. There was also one very important reason for this, which gave them a feeling of safety: we had an agreement with them based on the principles of the Creative Commons Attribution Non-Commercial license, although it was not quite a CC-BY-NC license.
And last year we set out on a mission to change the agreement; basically to change, as we were accused, the rules of the game while it was being played, and to go for something radical: 19 million items of high artistic and scientific value, normalised, enriched and put in the public domain for everyone to re-use without restrictions. Naturally, we opened a can of worms.
Why did we want to do that?
Last year, as part of defining our strategy for the coming years, we asked our stakeholders what value Europeana should be delivering to them. Our users said they wanted a trusted source that would be easy to use and re-use in their school and leisure projects, and that they could reach from their regular workflows and customary interfaces; that is, they didn’t want to have to go to europeana.eu to access cultural data.
The cultural institutions said they wanted more visibility with end-users and politicians, and the development of new services with their data that could potentially bring them more revenue.
European politicians said they wanted Europeana to contribute to social and economic inclusion by making culture more accessible. Primarily this would happen by embedding cultural data more widely in the educational sector in Europe. They also expected Europeana to take the lead in innovating the cultural heritage sector. Through innovation in the cultural sector they expected Europeana to contribute to economic growth in Europe.
We also asked commercial players, such as telecoms, technology companies, search engines and interactive whiteboard developers. They said they wanted a one-stop shop for access to data and to the cultural heritage sector in Europe, that they would be willing to pay for premium services, and that they valued the brand association with the cultural sector that Europeana was offering.
So it was clear that Europeana cannot be a destination portal any more and that we need to bring the data to where the users are: to share it on our providers’ sites, with Wikipedia, on academics’ blogs, on commercial sites that can help our providers generate income, and on apps developed for education and tourism; and to publish it as linked open data in order to make full use of the potential of the semantic web to improve the richness of the data and the functionality around it. Ultimately we want to do all this to stimulate users’ engagement with culture, in order to create more culture, more knowledge and hopefully more creative projects and money for everyone. To do all this at the same time we need a very open license that provides unobstructed re-use of the data. We can do very little under the current agreement with our providers, which imposes a non-commercial clause on re-use.
Why do we want to drop NC? Mainly because most of the metadata is factual information, and there should be no copyright on Mozart’s date of birth. We ask the institutions to withhold information if they think it is rich enough to carry copyright, and to give us only what they feel comfortable sharing. Most of it has been created with taxpayers’ money, and everyone should have the right to use it for all sorts of purposes. The German national library, for example, is making 2 million euros per year by selling metadata to other public libraries; that is public money paid twice for the same product. It is in any case very difficult to define the boundaries of NC. Is all commercial activity unwanted? Why should we restrict the national audiovisual archive of France from re-using Europeana data because it sells content via its website? Is Europeana the appropriate body to police what re-use is made of the data? We believe that there is more to gain overall by giving up some metadata, and I will come back to this later. And why drop attribution? This is a very sensitive issue for the memory institutions. We believe that attribution is very hard to enforce on the web, especially when a long chain of intermediaries is involved and they all want to be credited, which has been called vanity publishing. Reality has shown, though, that attribution helps raise the value of the information and that it is common practice in many communities such as Wikipedia. So overall we believe in a standardised license that will allow a minimum threshold for re-use.
Now I’ll come to the arguments for and against open data. I’m pretty sure you’ve heard all this before; these are the same arguments public institutions give everywhere. We explored them over many workshops with memory institutions in the past year, when we were trying to think through with them the risks and rewards of openly sharing metadata. The institutions acknowledge the following. When metadata is openly shared, users find the data more easily, there are many new uses of the data, and this drives traffic to the cultural institution’s site. Providers can still commercially exploit their own metadata: the British Library, for example, which made the UK’s national bibliography available under CC0, declared that it doesn’t expect this to harm its normal exploitation routes, and it is using different packaging methods and richer metadata for commercial purposes. Memory institutions will help populate the LOD cloud and the web with trusted, quality resources, from which the cultural heritage community as a whole will benefit, as the information they make available becomes more contextualised through increased data interlinking, and providers can use the enriched data to further develop their own applications. The cultural sector will thus be a pioneer, and by innovating it is more likely to stay relevant in the digital age and to attract funding.
We also explored the risks. Memory institutions are afraid of loss of control and of potential income, and of potential damage to their reputation, because apparently everyone will want to publish Viagra ads next to the Van Goghs, and so on. It was obvious that a lot of the arguments were based on inherent fears about change rather than real risks.
It is not easy to convince the whole cultural heritage sector in Europe of the necessity and value of giving up their metadata for free. The cultural sector has not been the most cutting-edge sector or the most open to change. Museums and libraries are usually places where change is slow and where there is a possessive feeling towards data that has been created, organised and curated over many decades of meticulous research and work. Yet these organisations are also mandated to promote people’s access to knowledge, which creates a paradox. There is also very little evidence of good practice in open cultural data on the web, and of course a catch-22 situation: institutions are hesitant to open their data because there are few examples of data re-use. So what we did over the past months was a lot of talking, talking and talking, and listening, at dedicated workshops with museums, libraries and archives to assess with them the risks and rewards of opening up access to metadata. We held two rounds of consultations with our network on our new license, and I will talk about the results later. We had to raise awareness of existing initiatives, like the British Library publishing its data under CC0, and of concepts that are new to the cultural sector, such as linked open data and business models for open data. We had to create evidence of good re-use of the data, so we ran four hackathons, also for commercial applications, and an LOD pilot with some pioneers. We had to do some research: we commissioned a paper on the compliance of CC0 with national jurisdictions. And last but not least, we created a website to explain the reasoning behind our open data activities and our new agreements, and to keep everyone informed of developments.
I will briefly run you through these activities now. This is the website we created about our new agreements.
We have published there some metadata principles that Europeana has adopted, where we state, for example, that Europeana itself doesn’t plan to monetise the use of the metadata.
We’ve published there our guidelines, still in draft, for the use of the metadata, where we encourage users to give credit where credit is due. These guidelines will be embedded, for example, in our LOD.
You will find related documents there, like the Comité des Sages report that was published this year and which recommends that
Metadata related to the digitised objects produced by the cultural institutions should be widely and freely available for re-use.
We link to other open data initiatives in the cultural sector because, of course, we are not the first or the only ones to be promoting open data in this sector.
We also link to our Linked Open Data pilot which, as I said, comes out of a need to create evidence of good practice with regard to open data in the cultural heritage sector. Practically by word of mouth, we invited some of our partners to allow us to publish their data as LOD. These 3.5 million records are now online, but we are still in the process of clearing CC0 for them. If you want to know more, check Data.europeana.eu.
Also in the context of creating evidence, we ran four hackathons in four cities in June. Europeana has an API whose use, because of our current agreement, we have restricted to our network of providers. Through the hackathons we wanted to showcase what kind of applications can be developed with Europeana data when developers get creative. There were 48 prototypes developed in 4 categories, including a category for commercial potential. The prizes were given by Mrs Kroes at the Digital Agenda Day, at a joint ceremony with the Open Knowledge Foundation’s Open Data Challenge, which shows the Commission’s interest in supporting open data as part of the Digital Agenda strategy.
The first prize in the commercial potential category went to this application, which I hope to be able to show you. It’s an app for Android developed in Poland. You take a picture of an artwork; the app uses image recognition, fetches the Europeana record and reads it out loud for you if you want.
The innovation award went to this app called timemash, which is based on geolocation. A user can search on his phone for items in Europeana that are located around him. He comes across a building of which there is an old photo in Europeana, and the app helps him take a picture of it today from the same angle. It then creates these then-and-now comparisons. It geotags the new picture so that other users can easily locate the item and have a try as well.
The Audience Award went to Timebook, a Facebook-like application for historical people whose works are in Europeana. It mashes up content with DBpedia and shows a portrait of the artist, links to his friends, namely other contemporary artists, and his works.
So, where are we now?
In May this year we held a second round of consultations with our network of cultural institutions to see what the chances are that they will sign our new CC0 agreement. We got 104 replies, of which 45% were a straight yes and another 43% a yes subject to approval by the lawyers. Needless to say, this picture shows incredible progress since last year, when we set off to convince the cultural institutions.
There are some big differences, though, among cultural institutions in their readiness to accept the new agreement. Only 8% of the libraries said no.
But this number rises to 36% when it comes to museums.
So, given the overall positive result we will go ahead with launching the new agreement and we hope to have everyone sign by the end of the year.
But we are not quite there yet, and we definitely need to keep up the work of convincing the institutions. There will be more workshops and individual meetings with providers to explain the goals and the means. We are creating an animation about LOD. We’ll also try to operationalise some of the apps that have been created, in order to show the cool stuff that can be done. We will be publishing a paper on the business models of open data for the cultural heritage sector, as a result of the workshop we did with the directors of major cultural institutions. And we’ll keep up the advocacy towards politicians. Whether or not cultural institutions are included in the Public Sector Information Directive, the Commission and governments should impose, for example, the requirement to deliver open data on all publicly funded digitisation or aggregation projects.
If you are interested further, you’ll find more on our new agreement page or you can write to me. Thank you for your time and I’ll be glad to take any questions now.
Europeana and Open Cultural Heritage Data [email_address] OKCon 1st July 2011
Market: straightforward route to content; access to the network; premium services; brand association
Why drop NC and BY? <ul><li>Most of the metadata is factual information </li></ul><ul><li>Most has been created with taxpayers’ money and everyone should have the right to use it </li></ul><ul><li>It is very difficult to define the boundaries of NC </li></ul><ul><li>There is much more to gain by giving up something </li></ul><ul><li>Attribution is very hard to enforce especially when a long chain of intermediaries is involved </li></ul><ul><li>We believe in a standardised license that will allow a minimum threshold for re-use </li></ul>
The Rewards <ul><li>Increased data use & visibility drives traffic to content holder’s site. </li></ul><ul><li>Providers can still commercially exploit own metadata. </li></ul><ul><li>Europeana LOD helps populate linked data cloud with trusted, quality resource . </li></ul><ul><li>Enhances context of information through increased data interlinking. </li></ul><ul><li>Enriched data back to provider, for own applications & users. </li></ul><ul><li>Shows cultural heritage organisations at vanguard of innovation & stimulating digital research. Leads to funding </li></ul><ul><li>Reinforces relevance of their cultural heritage to new generations. </li></ul>
The Risks <ul><li>Loss of control over the channels of access and of authority </li></ul><ul><li>Loss of potential income </li></ul><ul><li>Loss of reputation </li></ul><ul><li>Loss of branding </li></ul><ul><li>Loss of context and of the control over the integrity of the data </li></ul><ul><li>Additional work required </li></ul><ul><li>… workshop participants acknowledged that rather than real risks, these are fears related to change… </li></ul>
The Process <ul><li>Workshops on risks and rewards of open licenses – (September 2010-December 2010) </li></ul><ul><li>Workshops and presentations (APENET, ATHENA, EFG, EUSCREEN) </li></ul><ul><li>Workshop with directors of museums, libraries, archives and av on the business models of open data </li></ul><ul><li>Online consultation with the network between December 2010 and January 2011 </li></ul><ul><li>Second round of consultation with whole network in May </li></ul><ul><li>4 Hackathons in June (Barcelona, Poznan, London, Stockholm) </li></ul><ul><li>LOD pilot </li></ul><ul><li>Paper commissioned on the compatibility of CC0 with German jurisdiction </li></ul><ul><li>Dedicated website about open data and our new agreement </li></ul>
Metadata related to the digitised objects produced by the cultural institutions should be widely and freely available for re-use . Key recommendations, p5
Europeana Linked Open Data Pilot <ul><li>9 direct providers representing </li></ul><ul><li>300 libraries, museums, archives and av collections </li></ul><ul><li>16 countries </li></ul><ul><li>3,5 m records </li></ul><ul><li>Pilot went live in June </li></ul><ul><li>Proof that nothing bad will happen </li></ul><ul><li>It’s a pilot- it’s still subject to change </li></ul><ul><li>We are in the process of clearing CC0 for this data </li></ul><ul><li>Check it out: Data.europeana.eu </li></ul>
Digital Agenda Day API Hackathons <ul><li>Hack4Europe! </li></ul><ul><li>About 85 developers participated </li></ul><ul><li>With a majority being independent developers or representing SMEs </li></ul><ul><li>Creating 48 prototypes </li></ul><ul><li>Why: to showcase the social and commercial value of open cultural data </li></ul><ul><li>With 14 winners in the categories and local special awards </li></ul>
Winner of the Commercial Potential Award: Art4Europe http://www.youtube.com/watch?v=C6PEz2d7OLE
My organization will sign the new DEA (N=104): Overall response clearly positive
My organization will sign the new DEA (libraries N=51) Libraries
Museums My organization will sign the new DEA (museums N=14)
<ul><li>New CC0 agreement will take effect from September onwards </li></ul>
More activities… <ul><li>LOD animation </li></ul><ul><li>Individual meetings with providers </li></ul><ul><li>Lots of workshops </li></ul><ul><li>Operationalise some of the apps </li></ul><ul><li>Paper on the Business Models of Open Data for the Cultural Heritage Sector </li></ul><ul><li>Advocacy towards politicians: i.e. all EU-funded projects have to make their data available under open licenses, and publicly funded digitisation should deliver CC0 metadata </li></ul>