20140317 pi b_nmbe_journal_club

Lecture presented at the Journals Club of the Naturhistorisches Museum Bern, March 17, 2014.
"Towards an (European) Open Biodiversity Knowledge Management System"

  1. 1. Towards an (European) Open Biodiversity Knowledge Management System Donat Agosti (Plazi, Bern) March 17, 2014 Berne, Journal Club @ NMBE
  2. 2. El Bulli: Cooking in Progress (2011) Ferran Adrià (Actor), Gereon Wetzel (Director)
  3. 3. The cook (Ferran Adrià) wants to know when he can expect what seafood for his kitchen. He assumes that phenological data is open and accessible to anyone. He has a question and needs to know: What seafood at what time? His goal is to provide a service based on the use of observation data, i.e. to treat you (and make some money).
  4. 4. The fishmonger knows when what seafood is available. He considers his knowledge of seafood phenology as his asset to make money. His goal is to make money with knowledge based on observation records and understanding the characteristics of seafood.
  5. 5. What do YOU want to know? How do YOU expect to get to your information?
  6. 6. • What are the main online resources you use? • Do you maintain your own digital library? • Do you participate in an online project, e.g. Scratchpads, a catalogue or a digital archive, and make your data accessible? • … ?
  7. 7. What does this mean? Meredith Lane, e-biosphere Conference, London 2009
  8. 8. Hardisty, Nature 502, 171 (2013) BUT: predictive ecology has substantial data needs Harfoot, BIH2013, Rome, 2013 The big question What is the future of the biological world? Imagine if we could: …Predict community level dynamics of ecosystems at scales from local to global, based on the ecology and biology of all individual organisms
  9. 9. Decentralized biodiversity infrastructure Plants 3,400 Herbaria worldwide 10,000 Associate curators and specialists 350,000,000 specimens in collections 180,000,000 specimens digitized 2,000,000,000 specimens including animals Source: gbif.org; http://sciweb.nybg.org/science2/IndexHerbariorum.asp
  10. 10. 200,000,000+ printed pages 1,900,000 species described 20,000,000+ species treatments 17,000 new species per year Biodiversity libraries BUT: The data are hidden Incomplete digitization Publications are unstructured Collections are incomplete Data are not linked Most data are not open
  11. 11. Nationaal Herbarium Nederland collection on GBIF Source: http://www.gbif.org/dataset/7b33b040-f762-11e1-a439-00145eb45e9a One collection’s view of the world
  12. 12. Another collection’s view of the world http://www.gbif.org/dataset/82b0f51c-f762-11e1-a439-00145eb45e9a
  13. 13. What does this mean? The Linking Open Data cloud diagram Linked Open Data Cloud
  14. 14. Names as information tags in life sciences Names Characteristics Publications Genes Collections Specimens Distribution
  15. 15. The enhanced and linked treatments, extracted, stored on Plazi.org, and served in a human readable form, are linked to the underlying data: Fisher & Smith, 2008, PLoS ONE.
  16. 16. Towards an (European) Open Biodiversity Knowledge Management System
  17. 17. Coordination and Policy Development in Preparation for a European Open Biodiversity Knowledge Management System Supported by the European Commission through its FP7 research funding programme pro-iBiosphere
  18. 18. pro-iBiosphere: Partners
  19. 19. Create digital objects + Identifiers and resolvers + Open Access + Adequate infrastructure + Sustainable and permanent infrastructure + Reliable services for partners in research projects and society Seamless Global Virtual Research Knowledge Management System (European Open Biodiversity Knowledge Management System) Biodiversity Knowledge Management System
  20. 20. Impact The envisaged European Open Biodiversity Management System will: • Support reliable and permanent open access to digital biodiversity records • Create identifiers and link biodiversity literature, collections, digital objects, genes, etc. • Ensure global interoperability and sharing of biodiversity data, information and knowledge • Provide new services in support of open science • Provide the ground for modelling the biosphere • Develop data policies to harness the potential of open access
  21. 21. Convert data into machine readable data
  22. 22. Literature as an example
  23. 23. Text <tax:treatment> <tax:nomenclature> <tax:name> <tax:xid source="HNS" identifier="193329"/> <tax:xmldata> <dc:Genus>Mystrium</dc:Genus> <dc:Species>leonie</dc:Species> </tax:xmldata> Mystrium leonie </tax:name> Bohn & Verhaagh <tax:status>n. sp.</tax:status> Fig 1 D - F </tax:nomenclature> <tax:div type="description"> <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0. 1.30, SI 137, PW 0.73, ML 0.38. Mandible oute to a sharp apical tooth, the apex parallel to (Holotype with material in mandibles, so mand $ described below from paratypes.) Median cly .... </treatment> Enhanced and linked text
  24. 24. Treatment A publication or section of a publication documenting the features or distribution of a related group of organisms (called a “taxon”, plural “taxa”) in ways adhering to highly formalized conventions. http://terms.tdwg.org/wiki/tp:taxon-treatment Catapano, 2010.
  25. 25. Treatment
  26. 26. X-us c-us (Treatment) Citation Description Mate X-us b-us (Treatment) Citation Description Material cit X-us b-us n.sp (Treatment) Citation Description Material cit X-us b-us (Treatment) Citation Description Material cit Treatments
  27. 27. X-us c-us (Treatment) Citation Description Mate X-us b-us (Treatment) Citation Description Material cit X-us b-us n.sp (Treatment) Citation Description Material cit X-us b-us (Treatment) Citation Description Material cit Title (Article) Bibliographic references Title (Article) Bibliographic references Title (Article) Bibliographic references Title (Article) Bibliographic references Systema naturae (Article) Bibliographic references Treatments References
  28. 28. Treatments can be cited, like publications, with stable identifiers.
  29. 29. http://treatment.plazi.org/id/31F96F41E3E002BD88985A4F3A20E45A Best practices for stable URIs: http://wiki.pro-ibiosphere.eu/wiki/Best_practices_for_stable_URIs
  30. 30. Jeremy Miller, Work in Progress
  31. 31. Jeremy Miller, Work in Progress
  32. 32. Names can be linked automatically
  33. 33. Automated registration MANUSCRIPT SUBMISSION MANUSCRIPT ACCEPTED XML Response ARTICLE PUBLISHED Taxon name available/valid (effectively published) XML article metadata XML Query Peer review
  34. 34. The enhanced and linked treatments, extracted, stored on Plazi.org, and served in a human readable form, are linked to the underlying data: Fisher & Smith, 2008, PLoS ONE.
  35. 35. Penestomus egazini Miller, Haddad & Griswold, 2010 Progress Treatments (% complete): 4/4 (100%) Data summary Specimen records:41 adult female adult male other 51% 2% 46% Specimen collections Institutions: 3 Distribution Muséum National d'Histoire Naturelle, Paris California Academy of Sciences, San Francisco Albany Museum, Grahamstown 2% 5% 76% 20% Countries Lesotho South Africa Georeferenced materials citations Export species materials citations (DwC) Export treatment materials citations (DwC)
  36. 36. [Chart] Materials Citations Records by Researcher: Other, Donat Agosti, David Grimaldi, Toby Schuh, James Carpenter, Norman Platnick American Museum of Natural History Data summary Materials citations 2004-2013: 111,364 Distribution Georeferenced materials citations Export species materials citations (DwC)
  37. 37. [Chart] Materials Citations Records by Institution: Other, Muséum National d'Histoire Naturelle, Paris, Natural History Museum, London, Museum of Comparative Zoology, Smithsonian Institution, American Museum of Natural History Zootaxa Data summary Materials citations 2004-2013: 11,476 Distribution Georeferenced materials citations Export species materials citations (DwC)
  38. 38. Better: Create data as machine readable data
  39. 39. Unified marked up final output Taxon treatments, keys, images, localities PROSPECTIVE PUBLISHING | HISTORICAL LITERATURE Legacy and new taxonomic literature Content management systems & repositories (e.g., Plazi, EOL, GBIF, SCRATCHPADS, EDIT) TaxPub XML schema PENSOFT MARK UP tool Marked up publications PDF, HTML and XML archiving WIKI Species-ID, Wikispecies Wikipedia Indexing (IPNI, ZooBank, MycoBank, GNA) Aggregators (EOL, GBIF) Electronic archives; Data Centers END USERS TaxonX schema PLAZI's GOLDEN GATE editor Automated submission; peer review
  40. 40. http://biodiversitydatajournal.com/articles.php?id=995
  41. 41. Access to ant taxonomic publications through antbase.org /Smithsonian Institution, including currently the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)
  42. 42. Open Access
  43. 43. Knowledge wants to be free
  44. 44. Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present.
  45. 45. Before antbase.org, Harvard's Museum of Comparative Zoology could claim to be the only location with a complete set of ant systematics publications from 1758 to the present. Through antbase.org's digital library, this body of literature is accessible worldwide, and it is actively used (>10,000 visits in a single month).
  46. 46. Knowledge has to be free
  47. 47. Bouchout Declaration, 2014 Implementation by the Swiss National Science Foundation, 2007 Berlin Declaration, 2003
  48. 48. • The free and open use of content, services and other digital resources about biodiversity; • Licenses that grant all users a free, irrevocable, world-wide right to copy, use, distribute, transmit and display the work publicly as well as to build on the work and to make derivative works, subject to proper attribution consistent with community practices; • Policy developments that will foster free and open access to biodiversity data; • Tracking the use of information to ensure that sources and suppliers of data are assigned credit for their contributions; • An agreed infrastructure, standards and protocols to improve access to and use of open data; Bouchout Declaration, 2014 (1)
  49. 49. • Registers for content and services to allow discovery, access and use of open data; • Persistent, dereferenceable identifiers for data objects and physical objects such as specimens, images and taxonomic treatments; • Linking data using agreed vocabularies, both within and beyond biodiversity, that enable participation in the Linked Open Data Cloud; • Dialogue coordinated by the leading signatories to refine the concept, priorities and technical requirements of Open Biodiversity Knowledge Management. • A sustainable Open Biodiversity Knowledge Management that is attentive to scientific, sociological, legal, and financial aspects. Bouchout Declaration, 2014 (2)
  50. 50. Knowledge has to be made free
  51. 51. You!
  52. 52. Reduce costs – future publishing
  53. 53. Don't waste money: Focus on Open Access enhanced linked publications, not PDF only
  54. 54. founded in 2008 Swiss-based NGO with members in Switzerland, Germany, Bulgaria, US and Iran research-based think tank with the mission to promote open access to scientific content five pillars: Legal advice, technical innovations and solutions, maintenance of a treatment repository and Biowikifarm, consultancy, advocacy
  55. 55. Modify copyright legislation to better serve scientific needs
  56. 56. TaxPub: DTD; prospective publications; constrained; derivative of JATS; self-contained. TaxonX: Schema; legacy publications; loose; independent; allows import of other schemas.
  57. 57. Plazi workflow: overview
  58. 58. Plazi Search and Retrieval Server: Access to data Darwin Core-Archive You You You human machine
  59. 59. Biowikifarm
  60. 60. founded in 2008 Swiss-based NGO with members in Switzerland, Germany, Bulgaria, US and Iran research-based think tank with the mission to promote open access to scientific content five pillars: Legal advice, technical innovations and solutions, maintenance of a treatment repository and Biowikifarm, consultancy, advocacy Plazi GmbH founded in 2012 as service SME owned by Plazi
  61. 61. research-based think tank with the mission to promote open access to scientific content five pillars: Legal advice, technical innovations and solutions, maintenance of a treatment repository and Biowikifarm, consultancy, advocacy Plazi GmbH founded in 2012 as service SME owned by Plazi Funding from public donors, e.g. EU, corporate and private
  62. 62. Funding: EU EU-BON Pro-iBiosphere Private sector In-kind Voluntary work
  63. 63. five pillars: Legal advice, technical innovations and solutions, maintenance of a treatment repository and Biowikifarm, consultancy, advocacy Plazi GmbH founded in 2012 as service SME owned by Plazi Funding from public donors, e.g. EU, corporate and private Clients are global
  64. 64. Consultancies and Services: Consulting publishers on how to produce XML semantically enhanced output (e.g. EJT, Zootaxa, Smithsonian Institution) Service to mark-up literature
  65. 65. http://plazi.org Thank you very much! Donat Agosti agosti@plazi.org This project is funded under the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement №312848.
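
Slide 23 shows a taxonomic treatment with the name, its external identifier and its status marked up in XML. As a rough illustration of what "machine readable" buys, the sketch below reads those fields with Python's standard library. The sample is a trimmed, well-formed version of the slide's fragment; the namespace URIs are placeholders (the slide shows only the prefixes `tax` and `dc`), and `read_name` is a hypothetical helper, not part of any Plazi tool.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption): the slide gives only the
# prefixes "tax" and "dc", not the URIs they are bound to.
NS = {"tax": "urn:example:taxonx", "dc": "urn:example:dc"}

# Trimmed, well-formed version of the fragment on slide 23.
SAMPLE = """<tax:treatment xmlns:tax="urn:example:taxonx" xmlns:dc="urn:example:dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>"""


def read_name(xml_text):
    """Return the tagged name parts of one treatment (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    name = root.find("tax:nomenclature/tax:name", NS)
    xid = name.find("tax:xid", NS)
    return {
        "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
        "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
        "status": root.findtext("tax:nomenclature/tax:status", namespaces=NS),
        "xid_source": xid.get("source"),
        "xid": xid.get("identifier"),
    }


if __name__ == "__main__":
    print(read_name(SAMPLE))
```

Because genus, species and identifier are separate elements rather than runs of text, a program can link the name to other resources without guessing where the name starts and ends.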

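
Slides 28 and 29 state that treatments can be cited through stable identifiers and give one example URI. A minimal sketch of how a client might recognise such a URI and pull out the identifier is shown below. The host and the `/id/` path come from the slide; the pattern of 32 uppercase hexadecimal characters is an assumption inferred from that single example, and `treatment_id` is a hypothetical helper.

```python
import re
from urllib.parse import urlparse

TREATMENT_HOST = "treatment.plazi.org"  # host taken from slide 29

# Assumption: identifiers are 32 uppercase hex characters, inferred from
# the single example on the slide rather than from a specification.
ID_PATTERN = re.compile(r"[0-9A-F]{32}")


def treatment_id(uri):
    """Return the identifier from a treatment URI, or None if it does not fit."""
    parts = urlparse(uri)
    if parts.netloc != TREATMENT_HOST:
        return None
    prefix, _, ident = parts.path.rpartition("/")
    if prefix != "/id" or not ID_PATTERN.fullmatch(ident):
        return None
    return ident


if __name__ == "__main__":
    example = "http://treatment.plazi.org/id/31F96F41E3E002BD88985A4F3A20E45A"
    print(treatment_id(example))
```

Keeping the identifier separate from the host in this way makes it easier to store citations that stay valid if the resolver address changes.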