Crowdsourcing for Libraries
- by Sarah J. McCarthy
Crowdsourcing Defined
   Webster’s dictionary defines
    crowdsourcing as “the practice of
    obtaining needs, services, ideas, or
    content by soliciting contributions
    from a large group of people and
    especially from the online
    community rather than from
    traditional employees or suppliers”
    (Crowdsourcing).
Library Crowdsourcing Origins
   Consider the Meteorological Project initiated in 1848
    by the Smithsonian’s first Secretary, Joseph Henry.
   150 volunteers across the country submitted monthly
    weather reports. The reports were published in an
    1861 compilation of climatic data and storm
    observations (Bruno, 2011).
   This project is considered the precursor to the National
    Weather Bureau (SIA RU000060, Smithsonian
    Institution Meteorological Project, 2011).
Crowdsourcing Benefits
   Accomplish tasks the library doesn’t have staff, time, or finances to achieve.
   Add value to/improve the quality of data.
   Engage the community with the library/collection and make use of their expertise.
   Allow data to be discoverable in new ways.
   Validate the worth of the library by allowing for a high level of public involvement.
   Encourage a sense of public ownership and responsibility for the collections
    (Holley, 2010).
Case Studies


   Tag / Identify Materials
   Build Upon / Create Data
   Transcribe Text
Tag / Identify Materials
Smithsonian Institution
•   Flickr Commons visitors can add keywords and tags to images; these are then vetted.
•   Select words are added to the metadata, enhancing the search function and making
    images easier to find (Oliver, 2011).
•   Sometimes this leads to new collection materials. For example, Flickr photographs
    from the Scopes Monkey Trial prompted a donation of 10 additional photos by
    Henriatta S. Jenrette, daughter of a trial attendee
    (Smithsonian National Museum of Natural History, 2011).
Build Upon / Create Data
Transcribe Text
Further Examples
Fast Code Design   LibAnswers   Pay-to-Scan
Challenges with Crowdsourcing

   Maintaining Interest
   Error Prevention / Data Control
   Design & Technology Concerns
   Limited Resources
   What to Do With the Data
Maintaining Interest
Digitalkoot:
Mole Bridge and Mole
Hunt

“We want to compete against
others, we seek reward and
positive feedback…we want to
demonstrate we can solve tricky
problems and prove how clever
we are.”

--Claudia Pelzer, Crowdsourcing.com
founder
Error Prevention / Data Control
Australian Newspapers Digitization Program (ANDP)
   Newspaper quality was not good, and the OCR did not provide good
    results
   Crowdsourcing was found to be a good option to re-key the text
   Vandalism and lack of familiarity with the project were their primary
    concerns
   Vandalism was addressed by showing end users the original
    image, having a roll-back plan, and requiring login
   Lack of familiarity with the project was addressed by using familiar
    terms, providing reference points, and offering technical help
   It was a huge success: by the end of the first 6 months, over 2 million
    lines of text in more than 100,000 articles had already been corrected
Design and Technology Concerns
Limited Resources
   Time and resources are often quite limited, but there are collaborative options that can help.
   Flickr Commons is one option, especially useful for identifying or tagging images.
   Clickworker.com is another: libraries can price out projects to be undertaken by its community
    of over 115,000 individuals. This might be good for time-sensitive projects.
   The George Eastman House currently has a Clickworker project to help catalog and tag over
    400,000 of its images (George Eastman House taps Clickworker for iconic global
    crowdsourcing project).
What To Do With the Data
“In asking for the public's help in extracting the menus data, we are making an implicit
promise to do something interesting and useful with it… What it means fundamentally is
re-imagining the very roles of librarians and curators, positioning them not only as
custodians of physical collections, but as leaders of online communities.”


-Ben Vershbow, Director of NYPL Labs (Gan, 2011)
CONCLUSION
Note: for the full list of references, see my research paper on this topic.
And that concludes my presentation…
Any questions?
References, p. 1
   (2010). Retrieved November 5, 2011, from Dickens Journals Online: http://www.djo.org.uk/
   Capturing the power of the crowd and the challenge of community collections. (2010, July 27). Retrieved November 3, 2011, from JISC:
    http://www.jisc.ac.uk/publications/programmerelated/2010/communitycollections.aspx
   Old Weather FAQ. (2010). Retrieved November 11, 2011, from OldWeather.com: http://www.oldweather.org/faq
   WWI ships to chart past climate. (2010, October 13). Retrieved November 5, 2011, from bbc.co.uk:
    http://www.bbc.co.uk/news/science-environment-11532534
   Leafsnap, a new mobile app that identifies plants by leaf shape, is launched by Smithsonian and collaborators. (2011, May 2). Retrieved October
    23, 2011, from Smithsonian Science:
    http://smithsonianscience.org/2011/05/new-mobile-app-that-identifies-plants-by-leaf-shape-launched-by-smithsonian-and-columbia-and-maryland-universities/
   SIA RU000060, Smithsonian Institution Meteorological Project. (2011). Retrieved December 9, 2011, from Smithsonian Institution Archives.
   About us. (n.d.). Retrieved October 23, 2011, from Transcribe Bentham, UCL: http://www.ucl.ac.uk/transcribe-bentham/about/
   Benetti, T. D. (2011, June 16). The secrets of Digitalkoot: Lessons learned crowdsourcing data entry to 50,000 people (for free). Retrieved November
    4, 2011, from Microtask.com:
    http://blog.microtask.com/2011/06/the-secrets-of-digitalkoot-lessons-learned-crowdsourcing-data-entry-to-50000-people-for-free/
   Boesveld, S. (2011, May 30). Can Wikipedia improve students’ work? Retrieved October 25, 2011, from NationalPost.com:
    http://news.nationalpost.com/2011/05/30/can-wikipedia-improve-students-work/
   Bruno, E. (2011, April 14). Smithsonian crowdsourcing since 1847! Retrieved October 5, 2011, from Smithsonian Institution Archives:
    http://siarchives.si.edu/blog/smithsonian-crowdsourcing-1847
References, p. 2
   Crowdsourcing. (n.d.). Retrieved October 15, 2011, from M-W.com: http://www.merriam-webster.com/dictionary/crowdsourcing?show=0&t=1319399161
   Davey, N. (2010, January 7). Ross Dawson: Six tools to kickstart your crowdsourcing strategy. Retrieved November 11, 2011, from mycustomer.com:
    http://www.mycustomer.com/topic/customer-intelligence/ross-dawson-six-tools-start-your-crowdsourcing-strategy/109914#
   Deacon, H. (2010, April 10). Involving archive users in digitising archival collections. Retrieved October 12, 2011, from The Archival Platform:
    http://www.archivalplatform.org/blog/entry/involve_users/
   Friedman, S. (2011, July 14). Finding the future: inside NYPL's all-night scavenger hunt. Retrieved October 15, 2011, from LibraryJournal.com:
    http://www.libraryjournal.com/lj/community/libraryculture/890973-271/finding_the_future_inside_nypls.html.csp
   Front page news and development. (n.d.). Retrieved November 1, 2011, from Distributed Proofreaders: http://www.pgdp.net/c/
   Gan, V. (2011, September 16). All hands on deck: NYPL turns to the crowd to develop digital collections. Retrieved September 17, 2011, from
    HuffingtonPost.com: http://www.huffingtonpost.com/the-new-york-public-library/all-hands-on-deck-nypl-tu_b_966057.html
References, p. 3
   http://lens.blogs.nytimes.com/2011/07/22/using-new-tools-mapping-old-brooklyn/
   Oliver, B. (2011, January 28). Tag! You're it! And a survey... Retrieved October 10, 2011, from Smithsonian Libraries:
    http://smithsonianlibraries.si.edu/smithsonianlibraries/2011/01/tag-youre-it.html
   Pelzer, C. (2011, September 3). How gamification drives crowdsourcing. Retrieved November 5, 2011, from Crowdsourcing.org:
    http://www.crowdsourcing.org/editorial/how-gamification-drives-crowdsourcing/6419
   Progress update, 15 to 21 October 2011. (n.d.). Retrieved October 23, 2011, from Transcribe Bentham, UCL: http://www.ucl.ac.uk/transcribe-bentham/
   Smithsonian National Museum of Natural History. (2011, April 18). Field book project now on Flickr Commons. Retrieved October 10, 2011, from Field
    book project: http://nmnh.typepad.com/fieldbooks/2011/04/flickr.html
   Sommer, L. (2011, August 16). Mapping project reveals pre-1906 quake San Francisco. Retrieved October 21, 2011, from KQED News:
    http://blogs.kqed.org/newsfix/2011/08/16/mapping-project-reveals-pre-earthquake-san-francisco/
   Springer, M., Dulabahn, B., Michel, P., Natanson, B., Reser, D., Woodward, D., et al. (2008, October 30). For the common good: the Library of Congress
    Flickr pilot project. Retrieved November 5, 2011, from LOC.gov: http://www.loc.gov/rr/print/flickr_report_final.pdf
   Stein, B. (2011, April 12). Crowdsourcing science history: NIST digital archives seeks help in identifying mystery artifacts. Retrieved October
    23, 2011, from NIST Tech Beat: http://www.nist.gov/director/archives-041211.cfm
References, p. 4
   The Great War Archive. (n.d.). Retrieved November 1, 2011, from The First World War poetry digital archive: http://www.oucs.ox.ac.uk/ww1lit/gwa
   The University of Iowa. (2011). Civil War diaries and letters transcription project. Retrieved November 5, 2011, from The University of Iowa Libraries:
    http://digital.lib.uiowa.edu/cwd/transcripts.html
   Tischler, L. (n.d.). How Twitter And Facebook helped Bing Thom design a public library. Retrieved November 11, 2011, from FastCodeDesign.com:
    http://www.fastcodesign.com/1664711/how-twitter-and-facebook-helped-bing-thom-design-a-public-library
   University of Oxford Ancient Lives. (n.d.). Retrieved November 11, 2011, from AncientLives.com: http://www.ancientlives.org/transcribe
   What is AquaBrowser Library? (n.d.). Retrieved October 31, 2011, from Serial Solutions Media Lab: http://www.medialab.nl/aquabrowser.html
   What's on the menu? (n.d.). Retrieved October 5, 2011, from NYPL.org: http://menus.nypl.org/about
