Published in: Education, Technology

  1. 1. Encyclopedia of Life Redefining Publication and Access to the Primary Literature David P. Shorthouse Marine Biological Laboratory Woods Hole, MA
  2. 2. Imagine an electronic page for each species of organism on Earth, available everywhere by single access on command. The page contains the scientific name of the species, a pictorial or genomic presentation of the primary type specimen on which its name is based, and a summary of its diagnostic traits. The page opens out directly or by linkage with other databases such as ARKive, Ecoport, and GenBank. It comprises a summary of everything known about the species’ genome, proteome, geographic distribution, phylogenetic position, habitat, ecological relationships, and, not least, its perceived practical importance for humanity.
  3. 3. Steering Committee Biodiversity Heritage Library Executive Smithsonian Marine Biological Laboratory Biodiversity Informatics Atlas of Living Australia Missouri Botanical Garden Plants Harvard University Education and outreach Field Museum Research Community MacArthur Foundation Sloan Foundation
  4. 4. Not the first time…
       • Tree of Life
       • Catalogue of Life
       • SpeciesBase
       • Discover Life
  5. 5. What makes this distinct…
       • Grandeur of the vision
       • Taxonomically intelligent, names-based cyberinfrastructure
       • Aggregation of content
       • Participatory
       • Open source content and software
       • … not just a website
  6. 6. David “Paddy” Patterson Peter Mangiafico Patrick Leary David Shorthouse Kristen Lans Pam Fournier Alexey Shipunov Vitthal Kudal Jeremy Rice Dimitry Mozzherin Anne Thessen
  7. 7. Exemplar Pages Anne Pringle • Brian Farrell • Alta Buden • Margaret Thayer • Michael Ashburner • Christy Geraci • Lilibeth Miranda • Senjie Lin • Rick Wilkerson • Jonathan Losos • David Langor • David Shorthouse • Mary Hennen • Judy Stoffer • George Yatskievych • Kendra Buresch • Tonia Hsieh • David Patterson • Christian Thompson • Rod Eastwood • Jerry Louton • Seth Bordenstein • Rich Pyle • Roger Hanlon • Tamara Clark • Wendy Applequist • Grace Servat • Bob Magill • Sandy Knapp • Vicki Funk
  8. 8. Exemplar Process Current Rate ≈ 100 pages / year
  9. 9.
  10. 10. Biodiversity Heritage Library
       • Missouri Botanical Garden
       • New York Botanical Garden
       • Royal Botanic Gardens, Kew
       • Field Museum
       • Natural History Museum (London)
       • Smithsonian Institution
       • American Museum of Natural History
       • Botany Libraries, Harvard University
       • Ernst Mayr Library of the Museum of Comparative Zoology, Harvard University
       • Marine Biological Laboratory / Woods Hole Oceanographic Institution Library (MBL/WHOI)
  11. 11. WHAT? Digitize the core literature of biodiversity. Full works, not bits & pieces. Open Access: all content can be repurposed, reused, reformatted. Congruent: must fit into a dynamic knowledge ecology.
  12. 12. BHL Status
       • 9.2M pages
       • Challenges: metadata extraction & search trajectories
           • Penn State collaborations
           • Honing name-matching algorithms, natural language processing
  13. 13. Names are Messy Aa paleacea Limulus polyphemus Kiwa hirsuta Osedax frankpressi Kingia australis Pieris japonica Pieris rapae Trypanosoma brucei Homo sapiens
  14. 14. More than One Meaning (Polysemes)
       • Names: Aotus trivirgatus; Aotus Illiger 1811; Aotus; Aotus Smith 1805; Aotus ericoides
       • Resolve with intelligent disambiguation: authority, species, contextual data
       • Contextual data: primate, monkey, eyes, food, Panama, Aotus nancymaae
       • Contextual data: legume, plant, flower, Mirbelieae, Australia, Aotus mollis
       • Anorexia nervosa; Habeas corpus; Etcetera etcetera
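The contextual-data idea on this slide can be sketched as a simple keyword-overlap score. This is a minimal illustration only, not a description of EOL's actual disambiguation method: the `CANDIDATES` table and the `disambiguate` function are hypothetical, and the cue words are simply borrowed from the slide.

```python
import re

# Hypothetical cue words for the two genera named Aotus (taken from the slide).
CANDIDATES = {
    "Aotus Illiger 1811 (monkey)": {
        "primate", "monkey", "eyes", "food", "panama", "nancymaae", "trivirgatus",
    },
    "Aotus Smith 1805 (legume)": {
        "legume", "plant", "flower", "australia", "mollis", "ericoides",
    },
}


def disambiguate(text):
    """Return the candidate whose cue words overlap most with the text.

    Returns None when there is no evidence at all or when the best score is tied.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {name: len(cues & words) for name, cues in CANDIDATES.items()}
    best = max(scores.values())
    winners = [name for name, score in scores.items() if score == best]
    if best == 0 or len(winners) > 1:
        return None
    return winners[0]
```

A bare mention of "Aotus" with no surrounding context returns `None` under this sketch, which reflects the slide's point: the name alone is not enough, and authority or context is needed to resolve it.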
  15. 15. Many names for one species… Koko Горилла Guerilla Eastern Lowland Gorilla Gorilla graueri Gorilla berengei Gorilla beringei Matschie Gorilla beringei mikenensis King kong Gorilla gorilla Virunga Gorila Gorille Mountain gorilla 大猩猩 ゴリラ
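One way to picture reconciling many names to one species is a lookup table with a fuzzy fallback for misspellings. The sketch below is an assumption-laden illustration, not EOL's name-matching code: `KNOWN_NAMES`, `reconcile`, and the 0.85 similarity cutoff are all hypothetical, and the table covers only a few of the names on the slide.

```python
import difflib

# Hypothetical lookup table; grouping these names under one accepted
# species name is done here purely for illustration.
ACCEPTED = "Gorilla beringei"
KNOWN_NAMES = {
    "gorilla beringei": ACCEPTED,
    "gorilla beringei matschie": ACCEPTED,
    "gorilla graueri": ACCEPTED,
    "mountain gorilla": ACCEPTED,
    "eastern lowland gorilla": ACCEPTED,
}


def reconcile(name, cutoff=0.85):
    """Map a name string to the accepted name, or None if nothing is close.

    Exact matches (after case and whitespace normalization) are tried first;
    otherwise the closest known name above the similarity cutoff is used.
    """
    key = " ".join(name.lower().split())
    if key in KNOWN_NAMES:
        return KNOWN_NAMES[key]
    close = difflib.get_close_matches(key, list(KNOWN_NAMES), n=1, cutoff=cutoff)
    return KNOWN_NAMES[close[0]] if close else None
```

With this cutoff, the misspelling "Gorilla berengei" resolves to the accepted name, while look-alikes such as "Guerilla" are rejected rather than force-matched.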
  16. 16. EOL the Aggregator
       • “Content partner” schema
           • Media elements, species profile model
           • Attribution, licensing
       Pyle, R. L., J. L. Earle and B. D. Greene. 2008. Five new species of the damselfish genus Chromis (Perciformes: Labroidei: Pomacentridae) from deep coral reefs in the tropical western Pacific. Zootaxa 1671: 3–31.
  17. 17. Licensing Policy for Content Partners
       • All information currently in the public domain will remain in the public domain.
       • Content providers are required to adopt a Creative Commons license for the information that they serve through the EOL.
       • Except for public-domain content, the default and preferred license is CC-BY.
       • Content providers who request some restrictions on re-use of their information may select: CC-BY-SA, CC-BY-NC, CC-BY-NC-SA.
       • To the greatest extent possible, the Encyclopedia of Life promotes an open-source, open-access approach.
       • The EOL will provide attribution information for all content that it serves. EOL will also indicate the Creative Commons license attached to each object (text, structured data, graphics, multimedia, etc.).
       V5.0, 5 April 2008
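The licensing rules above are simple enough to express as a small check. This is a hedged sketch of how a content partner's license choice might be validated; `resolve_license` and the `"public-domain"` label are hypothetical names, and only the license identifiers come from the slide.

```python
# License identifiers as listed on the slide.
DEFAULT_LICENSE = "CC-BY"
ALLOWED_LICENSES = {"CC-BY", "CC-BY-SA", "CC-BY-NC", "CC-BY-NC-SA"}


def resolve_license(requested=None, public_domain=False):
    """Return the license to attach to a content object.

    Public-domain content stays public domain; otherwise the default is
    CC-BY, and any explicit request must be one of the allowed licenses.
    """
    if public_domain:
        return "public-domain"  # hypothetical label for public-domain content
    if requested is None:
        return DEFAULT_LICENSE
    normalized = requested.strip().upper()
    if normalized not in ALLOWED_LICENSES:
        raise ValueError(f"license not permitted for content partners: {requested!r}")
    return normalized
```

Under this sketch a request for a license outside the four listed options (for example a no-derivatives variant) raises an error instead of being silently accepted.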
  18. 18. EOL the Enabler
       • Curate species page
       • Partial funds for post-doctoral positions
  19. 19. Cybertaxonomy Drivers
       • Pool of active taxonomists is evaporating
       • Shift to online workflow must:
           • Be meaningful (foster engagement with organisms)
           • Attract funding
           • Provide personal and institutional visibility
           • Be scholarly (e.g. citation metrics)
           • Be simple and task-oriented
           • Federate workloads
       Hine, Christine. 2008. Systematics as Cyberscience: Computers, Change, and Continuity in Science. MIT Press. Cambridge, Massachusetts. 307 pp.
  20. 20. Nearctic Spider Database: Can This be a Template?
       • Meaningful
       • Provides personal & institutional visibility
       • Simple, task-oriented
       • Shares the workload
  21. 21. What’s in My Backyard?
       <?xml version="1.0" encoding="windows-1252"?>
       <!--Zoom Search Engine Version 5.0 (1002) PRO-->
       <rss version="2.0" xmlns:opensearch="" xmlns:zoom="http://www.wrensoft/zoom/response/5.0/schema/">
       <channel>
       <title>Nearctic Spider Database</title>
       <description>Search species pages in The Nearctic Spider Database</description>
       <link></link>
       <opensearch:link rel="search" href="./data/canada_spiders/search/search.xml" type="application/opensearchdescription+xml" />
       <zoom:searchquery>pardosa moesta</zoom:searchquery>
       <zoom:searchcategory>All</zoom:searchcategory>
       <opensearch:totalResults>27</opensearch:totalResults>
       <opensearch:startIndex>10</opensearch:startIndex>
       <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
       <item> ............
       • Can I Share or Get Help?
       • Can I Track My Searches?
       • OpenSearch
       • Can I Grab That Image?
       • HTML (JavaScript) & bbCode
       • Gadgets
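A client consuming a response like the one on this slide might read the paging fields as follows. This is a sketch under stated assumptions: the slide's `xmlns:opensearch` value was lost, so the standard OpenSearch 1.1 namespace URI is assumed here; `SAMPLE` is a shortened, well-formed stand-in for the truncated response; and `paging_info` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {
    # Assumed: the OpenSearch 1.1 namespace (the slide's value is missing).
    "opensearch": "http://a9.com/-/spec/opensearch/1.1/",
    # As printed on the slide.
    "zoom": "http://www.wrensoft/zoom/response/5.0/schema/",
}

# Shortened stand-in for the response shown on the slide.
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0"
     xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/"
     xmlns:zoom="http://www.wrensoft/zoom/response/5.0/schema/">
  <channel>
    <title>Nearctic Spider Database</title>
    <zoom:searchquery>pardosa moesta</zoom:searchquery>
    <opensearch:totalResults>27</opensearch:totalResults>
    <opensearch:startIndex>10</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
  </channel>
</rss>"""


def paging_info(rss_text):
    """Extract the search query and OpenSearch paging fields from an RSS response."""
    channel = ET.fromstring(rss_text).find("channel")

    def number(tag):
        # Look up a namespaced child element and convert its text to int.
        return int(channel.findtext(tag, namespaces=NS))

    return {
        "query": channel.findtext("zoom:searchquery", namespaces=NS),
        "total": number("opensearch:totalResults"),
        "start": number("opensearch:startIndex"),
        "per_page": number("opensearch:itemsPerPage"),
    }
```

Calling `paging_info(SAMPLE)` yields the query string plus the three counters, which is the information a client needs to request the next page of results.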
  22. 22. Taxa-Centric Approach
  23. 23.
  24. 24.
       • Mid-December alpha testing
       • Customization (skinning)
       • Image gallery
       • Facile names & classification management
       • Species Page creation:
           • Images, “chapters”
       • Licensing and attribution
       • Granular roles and permissions
  25. 25.
  26. 26. Why Data Centricity?
       • Observe some features
       • Of some individuals
       • On a few occasions
       • In a few places
       • Record them incompletely
       • Convert un-interpreted data into interpreted assertions
       • Construct a narrative
       • Loss of data
  27. 27. The Future Raw data Correlations Triple store Filters: Faceted searches: What was that tree with pink flowers that we saw in Washington last May? Visualizations
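The "tree with pink flowers in Washington last May" question can be pictured as a faceted search over subject-predicate-object triples. The sketch below is a toy in-memory illustration, not a real triple store: every identifier, predicate name, and observation value is invented for the example.

```python
# Hypothetical observations stored as (subject, predicate, object) triples.
TRIPLES = {
    ("obs1", "organism", "tree"),
    ("obs1", "flower_color", "pink"),
    ("obs1", "place", "Washington"),
    ("obs1", "month", "May"),
    ("obs1", "taxon", "Prunus serrulata"),
    ("obs2", "organism", "tree"),
    ("obs2", "flower_color", "white"),
    ("obs2", "place", "Washington"),
    ("obs2", "month", "May"),
    ("obs3", "organism", "shrub"),
    ("obs3", "flower_color", "pink"),
    ("obs3", "place", "Boston"),
    ("obs3", "month", "June"),
}


def match(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def faceted_search(**facets):
    """Return the subjects that satisfy every predicate=value facet."""
    subjects = None
    for predicate, value in facets.items():
        found = {s for s, _, _ in match(predicate=predicate, obj=value)}
        subjects = found if subjects is None else subjects & found
    return subjects or set()
```

Each facet narrows the candidate set by intersection, so `faceted_search(organism="tree", flower_color="pink", place="Washington", month="May")` leaves only the one observation that satisfies all four, and `match(subject="obs1", predicate="taxon")` then retrieves its recorded taxon.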
  28. 28. EOL is…
       • NOT the mother of all catalogues
       • A prelude to biocentric data management
           • Enable a shift from narrative taxonomy to datacentric taxonomy
       • A web hosting infrastructure for taxa-centric pursuits