Crowd-sourcing the creation of "articles" within the Biodiversity Heritage Library

An analysis of crowd-sourced "article" creation and user-generated metadata for a digital repository of biodiversity literature

Published in: Technology, Education
  • Add link: http://biodiversitylibrary.org/item/54249
  • Highlight row? Show article in CB
  • Crowd-sourcing the creation of "articles" within the Biodiversity Heritage Library

    1. Crowd-sourcing the creation of “articles” within the Biodiversity Heritage Library
       Bianca Crowley, crowleyb@si.edu
       Trish Rose-Sandler, trish.rose-sandler@mobot.org
    2. The BHL is…
       • A consortium of 13 natural history and botanical libraries and research institutions
       • An open access digital library for legacy biodiversity literature
       • An open data repository of taxonomic names and bibliographic information
       • An increasingly global effort
       BHL, LITA 2011
    3. Problem: Books vs. Articles
       • Librarians manage books
       • Users need articles
    4. Solution: “Article-ization”
       • Creating articles manually, with the help of our users: BHL PDF Generator
       • Creating articles through automated means: BioStor, http://biostor.org/issn/0006-324X
       • Page, R. (2011). Extracting scientific articles from a large digital archive: BioStor and the Biodiversity Heritage Library. BMC Bioinformatics, 12(187). Retrieved from http://www.biomedcentral.com/1471-2105/12/187
    5. (no slide text)
    6. Create-your-own PDF
    7. CiteBank today: http://citebank.org
    8. What is an “article” anyway?
    9. The Good, the Bad, the Ugly
    10. The Good, the Bad, the Ugly
    11. The Good, the Bad, the Ugly
    12. Questions for Data Analysis
        • What is the quality, or accuracy, of user-provided metadata?
        • What kinds of content are users creating?
        • How can we improve the PDF Generator interface?
    13. Stats
        • Jan 2010-Apr 2011
        • Approx. 60,000 PDFs created with the PDF Generator
        • 40% of those (approx. 24,000) were ingested into CiteBank (PDFs without user-contributed metadata excluded)
        • 5 reviewers analyzed 945 PDFs (approx. 3.9% of the 24,000+ articles going into CiteBank)
        • Thanks to reviewers Gilbert Borrego, Grace Costantino, and Sue Graves from the Smithsonian Institution
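The sample-size figures on the Stats slide can be re-derived with a few lines of arithmetic. This is only a check on the rounded numbers quoted on the slide; the variable names are ours, and because the slide gives the ingested total as "24,000+", the computed review rate is an approximation.

```python
# Figures as reported on the Stats slide (all approximate).
pdfs_created = 60_000      # PDFs made with the PDF Generator, Jan 2010-Apr 2011
ingested_share = 0.40      # share ingested into CiteBank
pdfs_reviewed = 945        # PDFs rated by the reviewers

# 40% of 60,000 gives the ingested total quoted on the slide.
ingested = round(pdfs_created * ingested_share)

# Share of ingested PDFs that reviewers actually examined.
review_rate = pdfs_reviewed / ingested

print(ingested)
print(f"{review_rate:.1%}")
```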
    14. Methodological approach
        • Quantitative: numerical rating system
        • Rated titles, authors, and beginning/ending pages
        • An article’s “findability” within CiteBank search often determined how it was rated
    15. Ratings System: Title
        • 1 = has all characters in the title, letter for letter
        • 2 = does not have all characters in the title letter for letter, but is still findable in CiteBank search
        • 3 = does not have all characters in the title letter for letter and is NOT findable via CiteBank search
    16. Ratings System: Author
        • 1 = has all characters in the author(s) last name(s), letter for letter
        • 2 = has at least one author’s last name spelled correctly
        • 3 = has no authors, or none of the authors’ last names are spelled correctly
    17. Ratings System: Article beginning and ending pages
        • 1 = has all text pages for an article, from start to end
        • 2 = a subset of pages from a larger article
        • 3 = a set of pages where the intellectual content has been compromised
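The title and author rubrics above can be written down as small functions. This is a minimal sketch, not the reviewers' actual procedure: the ratings in the study were assigned by people, "findable" was judged by trying the CiteBank search by hand (here it is simply passed in as a boolean), and the function names and exact-string comparison are our assumptions. The page rating is left out because it rests on a reviewer's reading of the content.

```python
def rate_title(entered: str, actual: str, findable_in_search: bool) -> int:
    """Rate a user-entered title on the 1-3 scale from the Title slide."""
    if entered == actual:
        return 1  # letter-for-letter match
    # Not exact: the rating depends on whether a search still finds it.
    return 2 if findable_in_search else 3


def rate_authors(entered_last_names: list[str], actual_last_names: list[str]) -> int:
    """Rate user-entered author last names on the 1-3 scale from the Author slide."""
    entered = set(entered_last_names)
    actual = set(actual_last_names)
    if entered and entered == actual:
        return 1  # every last name present and spelled correctly
    if entered & actual:
        return 2  # at least one last name spelled correctly
    return 3      # no authors given, or none spelled correctly


print(rate_title("Notes on fern", "Notes on ferns", findable_in_search=True))
print(rate_authors(["Smith"], ["Smith", "Jones"]))
```

Both example calls describe "good enough" metadata: imperfect, but still a 2 on the scale.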
    18. Analysis steps
    19. Results
    20. What did we learn?
        • Ratings were better than we expected
        • Many users took the time to create decent metadata
        • “Good enough” is not great but is still “findable”
    21. But of course… there’s always room for improvement
        • Other factors
        • BHL-Australia’s new portal: http://bhl.ala.org.au/
    22. Changes we made to the UI so far
        • Asking users if they want to contribute their article to CiteBank
        • Making article title a required field and validating that it is at least 2 characters long
        • Review button for users to review page selections and metadata (inspired by BHL-AUS)
        • Reduced text and added more intuitive graphics (inspired by BHL-AUS)
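The title check described on this slide (required, at least 2 characters) might look something like the following. This is a hypothetical sketch: the slides do not show the PDF Generator's code, so the function name, the constant, and the decision to ignore surrounding whitespace are all our assumptions.

```python
MIN_TITLE_LENGTH = 2  # threshold stated on the slide


def is_valid_title(title: str) -> bool:
    """Return True if the title is present and at least 2 characters long.

    Leading and trailing whitespace is ignored (an assumption), so a
    title made only of spaces counts as missing.
    """
    return len(title.strip()) >= MIN_TITLE_LENGTH


print(is_valid_title(""))
print(is_valid_title("On the flora of Madagascar"))
```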
    26. Brief survey of proposed changes
        • Overwhelmingly positive response to proposed changes
        • But of course… there’s always room for improvement
    27. Success Factors
        • Monitor the creation of the metadata to look at user behavior and patterns
        • Engage with your users
        • Incentivize your users
    28. http://biodiversitylibrary.org
        • @BioDivLibrary
        • /pages/Biodiversity-Heritage-Library/63547246565
        • /photos/biodivlibrary/sets/
        • /group/biodiversity-heritage-library
        Bianca Crowley, crowleyb@si.edu
        Trish Rose-Sandler, trish.rose-sandler@mobot.org