Small pieces loosely joined: towards a unified theory of biodiversity for the web

An invited talk at the American Museum of Natural History, given as part of the Richard Gilder Graduate School Program. New York, U.S.A. November 24, 2008.

  1. 1. Small pieces loosely joined Towards a unified theory of biodiversity for the web Vincent S. Smith
  2. 2. Macro taxonomy The big picture of taxonomic research Goal… • Inventory the Earth’s species • Document their relationships • “Publish” these data Data set… • 1.8 M described spp. (10M names) • 300M pages (over last 250 years) • 1.5-3B specimens People… • 4-6,000 scientists • 30-40,000 “pro-amateurs” • Many more citizen scientists?
  3. 3. Micro taxonomy The practice of taxonomic research Sociology… • Parochial • Specialized experts • Fragmented & distributed Methodology… • Different (domain specific) • Communities of practice • Non-transferable skills Output… • Heterogeneous & scattered • High volume, low impact • Hard to find (use) How do we integrate micro & macro taxonomy for the Web?
  4. 4. http://Scratchpads.eu
  5. 5. What is a Scratchpad? A website for you & your community 1 2 3 Your data Uploaded & Published & reviewed tagged on your site
  6. 6. What is a Scratchpad? A website for you & your community 1 2 3 Your data Uploaded & Published & reviewed tagged on your site Fast Intuitive Fit for use
  7. 7. What can Scratchpads do? Import, manage, search & browse: Specimens DNA & Phylogenies Literature Images
  8. 8. What can Scratchpads do? Integration & connectivity within & between sites Specimens DNA & Phylogenies Taxonomy Literature Images
  9. 9. What can Scratchpads do? In summary: +Administration -Change your site information -Change your front page -Change your logo -Activity and access logs +Backup -Backing up your data -Restoring your data +Bibliography -Creating a record -Importing from a ref. manager -Exporting to a reference manager +Blog -Creating and adding a blog +Custom Content -Defining a CCK -Importing from a spreadsheet -Creating a custom view +Fileshare -Creating and using a fileshare +Forum -Altering the forum settings -Creating a container for a forum -Creating a new forum -Creating a new topic inside a forum +Groups -Creating a group -Subscribing to a group +Image -Uploading & basic annotation -Linking image & location records -Linking image & specimen records -Linking image & publication records -Overlay annotations on images +Layout -Change your theme -Menus -Blocks and sidebars +Locations -Creating a record -Importing from a spreadsheet +Pages -Creating, editing, cloning & deleting -Configuring the panels template +Panels -Adding & configuring content -Creating a new panel -Citing a Panels page +Phylogeny -Adding a phylogenetic tree +Specimens -Creating a record -Importing from a spreadsheet -Linking specimen & location records -Linking specimen & pub. records +Tasks -Creating a tasklist +Taxonomy -Importing from a spreadsheet -Importing from ClassificationBank -Starting from scratch -Taxonomy manager -Displaying a classification -Adding names -Deleting names -Taxonomy & panels +Users -Your settings -Adding a new user -User roles and permissions -Adding and editing user profile fields -Logging in +Webform -Creating and using webforms
  10. 10. What can Scratchpads do? Visual taskguide
  11. 11. Current Scratchpads • Sites: 70+ • Users: 850+ • Pages: 130k • Since March 2007. Ants, Bees, Beetles, Big-headed flies, Birds, Blackflies, Ciliates, Cockroaches, Dragon Trees, Dung Beetles, False Buttonweed, Flat worms, Flies, Foraminifera, Fossil Insects, Fungus Gnats, Holometabola, Leaf-miner Flies, Lice, Lichens of Bermuda, Malvaceae, Megalastrum ferns, Milichiid flies, Mosquitoes, Mosses, Nannotax fossils, Nepticuloid moths, Palms, Pearl oysters, Polychaete worms, Scaleworms, Stick insects, Sulawesi Ferns, Termites, Triticid grasses, Weevils, Wood Ferns
  12. 12. Scratchpad visitors Tracking visitors across sites Key monthly statistics - 50,000 page views - 6,000 visitors - 8 minutes on site - 50% returning visits (average per month, ’08)
  13. 13. Scratchpad applications A multipurpose, flexible technology eBooks 4th Edition Howard & Moore, Birds of the world (fact checking, data compilation, 2010, funding)
  14. 14. Scratchpad applications A multipurpose, flexible technology eJournals European Mosquito Bulletin (ISSN 1460-6127), Phasmid Studies (ISSN 0966-0011) (submission, review, & dissemination of articles)
  15. 15. Scratchpad applications A multipurpose, flexible technology Image galleries Nanno fossils, Cockroaches, Stick insects, Flatworms, Grasses, Lichens & many more… (rapid upload, annotation, & display of images)
  16. 16. How do Scratchpads work? Getting a Scratchpad Requirements • Biological focus • Agree to T&Cs (click-thru) • CC license “by-nc-sa” Application http://scratchpads.eu/apply • Maintainer • Scope/Mission/API Keys • (Sub)domain name Content • Unrestricted (overlapping) • No branding (focus on authors) • Value added
  17. 17. How do Scratchpads work? Using a Scratchpad Management • User categories (maintainer, editor, contributor) • Public / private content (flexible groups) • Admin. page (site settings & behavior) Data Input • Content types (biblio, maps, “page” etc.) • Forms, managers, Excel, EndNote etc. • Custom content (add or extend data types) Tagging (indexing) • Taxonomy terms (2M+) • Multiple classifications • Auto-tagging
  18. 18. Autotagging Indexing data to make it findable 1. Create content (e.g. reference) 2. Find terms (Autotag) 3. Submit (Index) Annotation: Journal citation mentions taxon name
  19. 19. Autotagging Indexing data to make it findable 1. Create content (e.g. reference) 2. Find terms (Autotag) 3. Submit (Index) Annotation: Matches taxonomy term (Drag & Drop)
  20. 20. Autotagging Indexing data to make it findable 1. Create content (e.g. reference) 2. Find terms (Autotag) 3. Submit (Index) Annotation: Page tagged (indexed) with taxon name
  21. 21. How do Scratchpads work? Indexing data to make it findable • Tagged data can be presented differently • For example as part of a traditional bibliography • Or as small windows or “panels” of data
  22. 22. How do Scratchpads work? Integrating data & “publishing” in a Scratchpad Types of Scratchpad Panel… Built with “tagged data” • Personalized instructions • Common names • Bibliographic literature • Taxonomic hierarchies • Files and documents • Photographs & illustrations • Specimen records • Customized content • Phylogenetic trees
  23. 23. How do Scratchpads work? Integrating data & “publishing” in a Scratchpad Dynamically built species pages
  24. 24. How do Scratchpads work? Integrating data & “publishing” in a Scratchpad Browsed through a taxonomy
  25. 25. How do Scratchpads work? Integrating data & “publishing” in a Scratchpad Including 3rd party content
  26. 26. How do Scratchpads work? Integrating data & “publishing” in a Scratchpad With data curation tools
  27. 27. How do Scratchpads work? Integrating data & “publishing” in a Scratchpad Listing all “authors”
  28. 28. How do Scratchpads work? Integrating data & “publishing” in a Scratchpad Dated, permanent & citable
  29. 29. How do Scratchpads work? Adjusting the panels layout Choose which panels to display
  30. 30. How do Scratchpads work? An example based on the Catalogue of Life classification 2 million taxon pages Open curation at http://catlife.myspecies.info
  31. 31. Biodiversity on the Web The informatics landscape
  32. 32. Biodiversity on the Web Scratchpads are personalizing biodiversity science
  33. 33. A unified theory of biodiversity? BHL, EOL and scholarly journals Biodiversity Heritage Library • Digitising heritage literature Encyclopedia of Life • A web page for every species Scholarly Journals • Traditional publishing
  34. 34. Biodiversity Heritage Library “Digitizing biodiversity literature” • Biodiversity publications since 1469 - 5.4 million books - 800,000 monographs - 40,000 periodicals • Held by Natural History libraries E.g., NHM holds more than 1M books, 250k monographs & periodicals, 0.5M artworks • BHL partnership of 10 Nat. Hist. libraries • Sharing the digitisation of contents • Focus on out of copyright materials • Partnership with “Internet Archive” • Make the contents “findable”
  35. 35. Biodiversity Heritage Library “Digitizing biodiversity literature” 1. Scan (photograph) 2. Extract text (OCR) 3. Find keywords - Taxonomic names - Author names - Citations - Collection data - Morphological data - Descriptions - Identification keys - Illustrations - Photographs 1 scribe machine, 3,500 pages per shift per day 34 scribe machines now in operation
  36. 36. Biodiversity Heritage Library “Digitizing biodiversity literature” 1. Scan 2. Extract text (OCR) 3. Find keywords - Taxonomic names - Author names - Citations - Collection data - Morphological data - Descriptions - Identification keys - Illustrations - Photographs Example: Palma, R.L., and R.L.C. Pilgrim. 2002. A revision of the genus Naubates (Insecta: Phthiraptera: Philopteridae). J. R. Soc. N.Z. 32:7-60.
  37. 37. Biodiversity Heritage Library “Digitizing biodiversity literature” 1. Scan 2. Extract text (OCR) 3. Find keywords - Taxonomic names - Author names - Citations - Collection data - Morphological data - Descriptions - Identification keys - Illustrations - Photographs 4. Index 5. Put on the web 6. 10M pp. to date Example: Palma, R.L., and R.L.C. Pilgrim. 2002. A revision of the genus Naubates (Insecta: Phthiraptera: Philopteridae). J. R. Soc. N.Z. 32:7-60.
  38. 38. Scratchpads and BHL Creating a community-built virtual taxonomic library Yes / Not yet? Scratchpads as a tool to add articles (and markup) to BHL?
  39. 39. Encyclopedia of Life “A web page for every species” • A web page for all 1.8M species • $25m funding (5 years) - MacArthur and Sloan Foundations • Multiple audiences - Science & outreach • Megascience mashup - Aggregating data from the web • 10 years to complete - First draft 2008, “finished” 2017! • Struggling to find an identity? - Competition, vetting, growth, credit • A possible publishing platform? - LifeDesks / Scratchpads
  40. 40. Journals Articles Scholarly communication in taxonomy & systematics • Fragmented • Mostly commercial • Data poor • Fixed audience - Hard to repurpose • Possible role for EoL? - Web publishing platform (cf. Wikipedia) • Zootaxa - 15% of new spp.; 50 spp. a week! • Scratchpads / EoL / Zootaxa - MS Word Template (markup) - Simultaneous publication Diagram label: Biodiversity Journals
  41. 41. Summary “Small pieces loosely joined” 1. Bringing data together Biodiversity studies are data rich, poorly archived & ever changing 2. Bringing people together Biodiversity researchers are few in number, fragmented & highly distributed 3. Bringing science together Biodiversity science demands a different approach to addressing BIG questions BIG IS DIFFERENT New opportunities & new challenges!
  42. 42. Thanks… Simon Rycroft Dave Roberts Kehan Harman Ben Scott Edward Baker Irina Brake Vladimir Blagoderov
  43. 43. Questions?
  44. 44. Scratchpad management Scalable & sustainable technology Hardware, software & user support Virtual machine, open-source software, self-archiving, backed-up, multi-site configuration (easy to move & upgrade, secure & reliable, citable, screencasts, low admin., low marginal costs)
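
The autotagging steps on slides 18-20 (create content, find terms, submit and index) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Scratchpads implementation: the `Node` class, the `find_terms`, `autotag` and `build_index` functions, and the four-term vocabulary are all hypothetical, and matching here is plain case-insensitive whole-word search against a small in-memory list rather than a lookup against a 2M+ term classification.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """A piece of content (hypothetical stand-in for a Scratchpad page)."""
    title: str
    body: str
    tags: set = field(default_factory=set)


def find_terms(text, vocabulary):
    """Step 2: return the vocabulary terms mentioned in the text."""
    found = set()
    for term in vocabulary:
        # Whole-word, case-insensitive match; re.escape guards odd characters.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(term)
    return found


def autotag(node, vocabulary):
    """Attach every matching taxonomy term to the node as a tag."""
    node.tags |= find_terms(node.title + " " + node.body, vocabulary)
    return node


def build_index(nodes):
    """Step 3: map each taxon name to the titles of the pages tagged with it."""
    index = {}
    for node in nodes:
        for term in node.tags:
            index.setdefault(term, []).append(node.title)
    return index


# Step 1: create content (the reference used as an example on slide 36).
vocabulary = {"Naubates", "Philopteridae", "Phthiraptera", "Culicidae"}
reference = Node(
    title="A revision of the genus Naubates",
    body="Palma & Pilgrim 2002 (Insecta: Phthiraptera: Philopteridae).",
)

autotag(reference, vocabulary)
index = build_index([reference])
print(sorted(reference.tags))  # ['Naubates', 'Philopteridae', 'Phthiraptera']
```

Once content is indexed this way, a species page or panel for a taxon can be assembled by looking the name up in `index`, which is the idea behind the dynamically built pages on slides 21-23.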
