1. Are Memory Institutions Ready for Open Data and
Crowdsourcing?
Results of a Pilot Survey from Switzerland
Beat Estermann, 5 August 2013 – OpenSym/WikiSym, Hong Kong
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
2. Recent Trends in the GLAM sector…
Coordinated Digitization Efforts
Single-Point-of-Access
EU: Lund Action Plan for Digitization (2001)
Increased cooperation and coordination among GLAMs:
- common catalogues
- virtual libraries
- coordination of digitization efforts
- long-term archiving
Wikimedia Commons, User:Dvortygirl (CC-by-sa)
Source: http://www.europeana.eu/
4.
Free Licensing / Open Data
Open Data / Content:
- «freely» re-usable
- machine readable
Linked Open Data
«Web of Data» / Semantic Web:
- RDF triples
- unique URLs
Crowdsourcing / Collaborative Content Creation
Crowdsourcing approaches (see Oomen / Aroyo 2011):
- Correction
- Classification
- Contextualisation
- Co-curation
- Complementing collections
- Crowdfunding
Source: https://commons.wikimedia.org/wiki/Commons:Bundesarchiv and http://www.flickr.com/groups/greatwararchive
Source: http://www.wikiarthistory.info (CC-by-sa)
Source: http://www.creativecommons.org
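The «RDF triples» and «unique URLs» bullets on this slide can be made concrete with a minimal Python sketch. It assumes a hypothetical memory object published under http://example.org/ and borrows two Dublin Core property URIs as predicates; the object, its title and the creator URL are invented for illustration and do not come from the survey. The output format is a simplified N-Triples-style line that does not escape special characters in literals.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement of the «Web of Data»."""
    subject: str                   # URL identifying the thing described
    predicate: str                 # URL identifying the property
    obj: str                       # URL of another thing, or a literal value
    obj_is_literal: bool = False


def to_ntriples(triple: Triple) -> str:
    """Render a triple as a simplified N-Triples-style line (no escaping)."""
    if triple.obj_is_literal:
        obj = f'"{triple.obj}"'
    else:
        obj = f"<{triple.obj}>"
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


# Hypothetical object and values, for illustration only.
OBJECT_URL = "http://example.org/object/1"
triples = [
    Triple(OBJECT_URL, "http://purl.org/dc/terms/title", "Panorama of Bern", True),
    Triple(OBJECT_URL, "http://purl.org/dc/terms/creator", "http://example.org/person/7"),
]

for t in triples:
    print(to_ntriples(t))
```

Because the creator is itself a URL rather than a text string, another institution could attach its own statements to the same person, which is what makes the data «linked».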
5. Where do Swiss GLAMs stand today with regard to…?
…Digitization?
…Exchange of metadata in multilateral cooperations?
…Open Data?
…Crowdsourcing?
…Linked Open Data?
What are the perceived risks and opportunities? (drivers vs. hindering factors)
What are the expected benefits? Who are the beneficiaries?
Awareness → Interest → Evaluation → Trial → Adoption
(Innovation Diffusion Model, Everett Rogers, 1962)
6. Pilot Study among Swiss GLAMs
GLAMs in Switzerland:
• ca. 600-700 independent GLAMs of national or regional significance
• ca. 1000 independent GLAMs organized in three umbrella organizations
Our sample: memory institutions of national significance in the German-speaking
part of Switzerland
• 197 organisations contacted (233 e-mail addresses)
• 72 questionnaires completed (34% of the contacted organisations)
Caveats:
• The sample is rather small (results are not very precise with regard to the
entire Swiss GLAM population, large confidence intervals apply)
• Archives are over-represented in the sample (higher response rate);
museums and «other institutions» are under-represented; libraries are about
average.
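To give a feel for the «large confidence intervals» mentioned in the caveats, here is a minimal Python sketch of a Wilson score interval for a proportion. The counts (43 of 72, i.e. roughly 60%) are illustrative and not taken from the study data, and the calculation assumes simple random sampling; it ignores the uneven response rates across institution types noted above.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Illustrative only: roughly 60% of 72 responses (43 of 72).
low, high = wilson_interval(43, 72)
print(f"observed {43 / 72:.0%}, interval from {low:.0%} to {high:.0%}")
```

With a sample of this size the interval spans on the order of ten percentage points either side of the observed share, so small differences between percentages on the following slides should be read with caution.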
7. Innovation Diffusion among Swiss GLAMs: The Overall Picture
A critical mass has been reached.
How about the laggards?
Will we see a higher rate of adoption for
Open Data than for Crowdsourcing?
Some institutions are starting to think
about Linked Data…
8. Digitization and Availability on the Internet
Availability on the Internet (in % of institutions, N=71)
• Metadata: 42% "is the case", 17% "is partly the case"
• Reproductions of memory objects: 23% "is the case", 37% "is partly the case"
• Background information: 11% "is the case", 32% "is partly the case"
60% of institutions make metadata and reproductions at least partly
available on the Internet. 40% still don’t!
9. Exchange of Metadata / Cooperation in Networks
61% of the responding GLAMs exchange metadata with other institutions. 39% don’t.
30% do so in the context of bilateral cooperation; 43% in the context of multilateral
cooperation.
For 29% the exchange of metadata is part of their core mission. 17% say this is partly the
case.
Do you exchange metadata with other institutions? (in % of institutions; N=72)
• yes: 61%
• no: 39%
The exchange of metadata is important for us... (in % of institutions; N=72)
Data labels in chart order:
• "is the case": 15%, 35%, 29%, 3%
• "is partly the case": 15%, 8%, 17%, 3%
10. Metadata: Need for Improvement
Metadata: Need for improvement? (in % of institutions; N=71)
Response categories: urgent need; need in the medium term; no need; no answer
Series:
• Quality of metadata (accuracy, completeness, up-to-dateness, clarity, availability)
• Interoperability of metadata (availability in digital format, conformity with standards)
Data labels in chart order: 11%, 42%, 21%, 25%; 10%, 43%, 23%, 24%
Ca. 50% of GLAMs perceive a need to improve their metadata.
The needs to improve metadata quality and interoperability are highly
correlated. – Does the envisioned exchange of metadata lead to higher
quality requirements?
25% of responding GLAMs couldn’t answer this question. – What does
this mean?
11. Metadata: What needs to be improved?
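Metadata: What needs to be improved? (in % of institutions; N=43)
• accuracy: 9% "is the case", 60% "is partly the case"
• completeness: 51% "is the case", 33% "is partly the case"
• up-to-dateness: 16% "is the case", 40% "is partly the case"
• clarity: 30% "is the case", 37% "is partly the case"
• availability: 40% "is the case", 23% "is partly the case"
• digitization: 42% "is the case", 26% "is partly the case"
• conformity with current exchange formats: 26% "is the case", 28% "is partly the case"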
The main challenges: completeness, availability, digitization
12. Open Data Readiness
The memory objects are available on the Internet... (in % of institutions; N=68)
• for charitable projects, such as Wikipedia, which also permit commercial use: 7% "freely" accessible; 32% accessible at no charge (but you are not allowed to modify them); 21% not accessible for free
• for users who are intending to commercially exploit them: 1% "freely" accessible; 7% accessible at no charge (but you are not allowed to modify them); 51% not accessible for free
Between 1% and 7% of responding GLAMs make scans/photographs of their
heritage objects «freely» available on the Internet. Over half of them make them
available on the Internet, but with restrictions. 40% don’t make them available at all.
Over 50% of the GLAMs which make their heritage objects available on the Internet
do not understand that you cannot make works available for Wikipedia and
simultaneously prevent their modification and/or their commercial use!
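The incompatibility described above can be written down as a simple rule: content counts as «freely» licensed, and is therefore usable on Wikipedia, only if both modification and commercial use are permitted. The Python sketch below applies that rule to three Creative Commons licences; the class and function names are hypothetical, and the licence list is an illustrative selection rather than something covered by the survey.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Licence:
    name: str
    allows_modification: bool
    allows_commercial_use: bool


def is_free_licence(licence: Licence) -> bool:
    """«Free» in the sense used here: modification AND commercial use allowed."""
    return licence.allows_modification and licence.allows_commercial_use


# Illustrative selection of Creative Commons licences.
LICENCES = [
    Licence("CC BY-SA", allows_modification=True, allows_commercial_use=True),
    Licence("CC BY-NC", allows_modification=True, allows_commercial_use=False),
    Licence("CC BY-ND", allows_modification=False, allows_commercial_use=True),
]

for licence in LICENCES:
    verdict = "usable" if is_free_licence(licence) else "not usable"
    print(f"{licence.name}: {verdict} on Wikipedia")
```

An institution that answers «available for Wikipedia, but modification not allowed» is in effect describing the CC BY-ND case, which fails the rule.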
13. Desirability and Importance of Open Data
Desirability of Open Data (in % of institutions, N=71)
Scale from -10 to +10 (negative: risks prevail; positive: opportunities prevail)
• -10 to -8: 0%
• -8 to -6: 1%
• -6 to -4: 6%
• -4 to -2: 6%
• -2 to 0: 7%
• 0 to 2: 36%
• 2 to 4: 25%
• 4 to 6: 11%
• 6 to 8: 6%
• 8 to 10: 3%
Importance / Desirability of Open Data (in % of institutions; N=71)
Response categories: very important; important; neither, nor; unimportant; no answer
Series: risks prevail; opportunities prevail
Data labels in chart order: 1%, 8%, 7%, 3%, 21%, 31%, 8%, 14%, 6%
For over 80% of responding GLAMs the opportunities outweigh the risks of
Open Data.
Over 50% think Open Data is an important issue; almost all of these believe
that the opportunities outweigh the risks.
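The «over 80%» figure can be retraced from the data labels of the «Desirability of Open Data» chart. The short Python sketch below assumes that the ten labels map onto the ten bins in the order in which they appear on the slide, and that the bin «0 to 2» counts as «opportunities prevail».

```python
# Share of institutions (in %) per bin, keyed by the lower bound of the bin.
# Assumption: data labels are assigned to bins in slide order.
desirability = {-10: 0, -8: 1, -6: 6, -4: 6, -2: 7, 0: 36, 2: 25, 4: 11, 6: 6, 8: 3}

opportunities_prevail = sum(share for lower, share in desirability.items() if lower >= 0)
risks_prevail = sum(share for lower, share in desirability.items() if lower < 0)

print(f"opportunities prevail: {opportunities_prevail}%")
print(f"risks prevail: {risks_prevail}%")
```

The two shares add up to slightly more than 100 because the individual labels are rounded.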
14. Open Data / “Free” Licensing of Content
Conditions under which they would make memory objects freely accessible on the Internet (in % of institutions; N=70)
• For private use: 59% "is the case", 21% "is partly the case"
• For education and research: 76% "is the case", 19% "is partly the case"
• For charitable projects: 60% "is the case", 23% "is partly the case"
• For charitable projects, such as Wikipedia, which also permit commercial use: 29% "is the case", 26% "is partly the case"
• For users who are intending to commercially exploit them: 7% "is the case", 9% "is partly the case"
• Only if the name of the institution remains attached to the data: 69% "is the case", 20% "is partly the case"
• Only if the work will be re-used in unmodified form: 40% "is the case", 34% "is partly the case"
• One further data label (1%) has no recoverable category.
Most GLAMs wouldn’t readily agree to «freely» license their content – even in
the absence of third party rights: they would like to prevent the commercial use
at no charge as well as the modification of works.
15. Crowdsourcing
Are any of your staff members engaging in projects which support open data or collaborative projects on the Internet? (in % of institutions; N=71)
Categories: Wikipedia; Wikimedia Commons; Flickr Commons; others
Series: in their spare time; as part of their professional activity
Data labels in chart order: 11%, 4%, 14%, 3%, 6%, 1%
11% of responding GLAMs have staff members who contribute to Wikipedia as
part of their professional activity.
10% of responding GLAMs say that online volunteering partly plays an
important role for them.
Interestingly, no correlation was found between the two variables.
16. Desirability and Importance of Crowdsourcing
Desirability of Crowdsourcing (in % of institutions; N=69)
Scale from -10 to +10 (negative: risks prevail; positive: opportunities prevail)
Bins: -10 to -8; -8 to -6; -6 to -4; -4 to -2; -2 to 0; 0 to 2; 2 to 4; 4 to 6; 6 to 8; 8 to 10
Data labels in chart order: 4%, 15%, 19%, 11%, 43%, 3%, 3%, 1%
Importance / Desirability of Crowdsourcing (in % of institutions; N=69)
Response categories: very important; important; neither, nor; unimportant; no answer
Series: risks prevail; opportunities prevail
Data labels in chart order: 10%, 25%, 14%, 29%, 16%, 3%, 1%, 1%
For over 90% of the responding GLAMs the risks of Crowdsourcing are at least
as great as the opportunities. For half of them the risks clearly prevail.
Among GLAMs which think that Crowdsourcing is an important issue, the risk
perception is equally high.
17. Linked Data / Semantic Web
Is “Linked Data” / “Semantic Web” an issue for your institution? (in % of institutions; N=71)
• Yes, we have already planned projects in this area: 6%
• Yes, it is an issue, but we haven't planned any projects yet: 23%
29% of responding GLAMs say that Linked Data is an issue for them.
None of them has a running project.
18. Recapitulation
• Metadata available on the Internet: 59%
• Photos/scans of memory objects available on the Internet: 60%
• Exchange of metadata takes place and is important: 43%
• Open Data is important: 53%
• Open Data is desirable: 81%
• Readiness to make data available for Wikipedia: 7%
• Readiness to make data available for commercial use: 1%
• Crowdsourcing is important: 38%
• Crowdsourcing is desirable: 7%
• Importance of online-volunteer work: 10%
• Professional engagement in Wikipedia: 11%
• Linked Data is an issue: 29%
Different dynamics for Open Data and Crowdsourcing.
60% of responding GLAMs are technically ready for Open Data.
20. Open Data: Risks
What are the risks of open data for your institution? (in % of institutions; N=71)
• Time effort and expense for making them available: 66% "is the case", 20% "is partly the case"
• The use of the data cannot be controlled: 34% "is the case", 34% "is partly the case"
• Copyright infringements: 32% "is the case", 34% "is partly the case"
• Infringements of data protection regulations: 28% "is the case", 23% "is partly the case"
• Divulgation of classified information: 18% "is the case", 17% "is partly the case"
• Increased time effort in order to respond to enquiries: 25% "is the case", 34% "is partly the case"
• Loss of revenues: 3% "is the case", 11% "is partly the case"
Major risk: extra time effort and expenses
Considerable risks: loss of control, copyright, data protection, secrecy infringements
Almost no risk: Loss of revenues
21. Crowdsourcing: Opportunities
What are the opportunities of crowdsourcing for your institution? (in % of institutions; N=71)
Categories: Correction and transcription tasks; Enhancement and expansion of texts; Completion of collections (contribution / identification of additional objects); Classification / completion of metadata; Co-curators; Crowdfunding (fundraising)
Series: "is the case"; "is partly the case"
Data labels in chart order: 6%, 1%, 4%, 11%, 3%, 24%, 24%, 21%, 20%, 14%, 21%
Crowdsourcing is most likely to be employed for classification tasks.
22. Crowdsourcing: Risks
What are the risks of crowdsourcing from your point of view? (in % of institutions; N=69)
• Unforeseeable results: 35% "is the case", 26% "is partly the case"
• Considerable time/effort needed for preparation and follow-up: 42% "is the case", 30% "is partly the case"
• Difficulties in estimating the time-effort: 35% "is the case", 35% "is partly the case"
• No guarantee concerning long-term data maintenance: 38% "is the case", 28% "is partly the case"
• Low level of planning reliability: 30% "is the case", 30% "is partly the case"
• Fears among employees (job loss, changing roles and tasks): 6% "is the case", 17% "is partly the case"
All the enumerated risks are rated about the same, except for fears among
employees which seem to play a minor role.
23. Economic Considerations
• Extra time effort and expenses are seen as the greatest
risks/shortcomings of Open Data and Crowdsourcing.
• Expected losses of revenue play virtually no role.
• The sale of image rights is evaluated at < 0.5 % of overall revenues
• Lending fees at 1% of overall revenues
• While the responding GLAMs may perceive at least some efficiency
gains related to Open Data, they do not (yet) perceive any potential
savings associated with Crowdsourcing.
24. Outlook / Next Steps
• Promote the study among GLAMs and political actors in Switzerland
• Orient GLAM outreach activities in the light of the findings
• Promote “free” licensing at a large scale, cf. OpenGLAM Principles
• Foster mutual learning in the area of Crowdsourcing and (Linked) Open
Data (OpenGLAM Network); make sure that benefits are achieved and
documented; improve coordination along the supply-chain
• Examine ways to improve digitization coverage
• Evaluate the demand for follow-up studies:
• Study with a larger sample in Switzerland
• Longitudinal study in Switzerland
(e.g. similar survey in 2014 to measure the changes)
• International benchmark study
Please contact me if you are interested!
25. Thanks for Your Attention!
Full study report:
English: http://tinyurl.com/SwissGLAMsurvey
Deutsch: http://tinyurl.com/GLAMStudie
Contact details:
Beat Estermann
E-mail: beat.estermann@bfh.ch
Phone: +41 31 848 34 38
Affiliations:
Research Associate, E-Government Institute, Bern University of Applied Sciences
Member of opendata.ch (Swiss Chapter of the Open Knowledge Foundation)
Member of Digitale Allmend (Swiss Chapter of CreativeCommons)
Member of Wikimedia CH
Editor's Notes
Coordinated Digitization Efforts (2000), Single-Point-of-Access Offers (2000..), Web 2.0, Personalization (2005..), Crowdsourcing (2006..), Open Data (2009..), Linked Open Data (2010..)
Q: There is a trend among memory institutions to make reproductions / content of their objects freely available on the internet. Under which conditions could you imagine making reproductions / content of your objects available on the internet free of charge, without earning any extra money? (Provided that the contents are already available in digital format and are free from third parties’ copyright claims or confidentiality restrictions.)