20140623 swets agosti_final


Open Access and the Future of (Biodiversity-) Research
"SWETS Be Open" Event Bern, Switzerland, June 23, 2014;
Donat Agosti, Plazi

Published in: Science, Technology

  1. 1. SWETS – Be Open! Donat Agosti Plazi, Bern 23.6.2014, Universität Bern Open Access and the Future of (Biodiversity-) Research
  2. 2. Future of Biodiversity Research Data Mining
  3. 3. Background Rio Earth Summit 1992
  4. 4. Background Biodiversity Crisis
  5. 5. Background Indicators
  6. 6. Indicators as powerful widely understood tool Reed Elsevier, Annual Reports and Financial Statements 2013 39% profit
  7. 7. Biodiversity research and conservation planning Multi-Taxon Specimen Data for Setting Conservation Priorities Source: Kremen C, et al. 2008. Science 320: 222-226. Consensus conservation priority areas and actual and proposed protected areas 2003: Madagascar announces it will triple protected land to 10% coverage
  8. 8. Politics IPBES Intergovernmental Platform on Biodiversity & Ecosystem Services
  9. 9. EU-Political and Science Decision to support IPBES EU-BON European Biodiversity Observation Network EU-FP7 funded
  10. 10. A EU decision in support of environmental policy making EU-BON • To build a European Biodiversity Observation Network • Measure and predict change over space and time • Combine Remote Sensing data and on the ground observation data in predictive modeling • Tools to inform decision makers (EU-politicians)
  11. 11. The basic science question Hardisty, Nature 502, 171 (2013) BUT: predictive ecology has substantial data needs Harfoot, BIH2013, Rome, 2013 What is the future of the biological world? Imagine if we could: …Predict community level dynamics of ecosystems at scales from local to global, based on the ecology and biology of all individual organisms
  12. 12. Modeling life on earth Can we do it? A realistic goal?
  13. 13. Communication EU-BON a child of GEOSS Global Earth Observation System of Systems Open Access to remote sensing data from all over the world
  14. 14. The impact of remote sensing data on understanding biodiversity With sophisticated technologies we can identify different trees in the Amazon…
  15. 15. Access to data …we could create a link to the related data in our biodiversity literature
  16. 16. Names as information tags in life sciences Names Characteristics Publications Genes Collections Specimens Distribution
  17. 17. Treatments as bits of information Treatment: sections of publications documenting the features or distribution of a related group of organisms (called a “taxon”, plural “taxa”) in ways adhering to highly formalized conventions. (Catapano, 2010) Formica obsoleta, Linnaeus 1758: 580
  18. 18. Treatments as part of publications DNA Specimens Observations Institution Pharmacology/epidemiology Publication Treatment Treatment Treatment Table Appendix Biology/ecology Reference to other biota Publication Treatment Publication
  19. 19. Text (e.g. PDF) <tax:treatment> <tax:nomenclature> <tax:name> <tax:xid source="HNS" identifier="193329"/> <tax:xmldata> <dc:Genus>Mystrium</dc:Genus> <dc:Species>leonie</dc:Species> </tax:xmldata> Mystrium leonie </tax:name> Bohn & Verhaagh <tax:status>n. sp.</tax:status> Fig 1 D - F </tax:nomenclature> <tax:div type="description"> <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0. 1.30, SI 137, PW 0.73, ML 0.38. Mandible oute to a sharp apical tooth, the apex parallel to (Holotype with material in mandibles, so mand $ described below from paratypes.) Median cly .... </treatment> Enhanced and linked text (XML: TaxonX / TaxPub JATS) Plazi: Semantically enhanced treatments
  20. 20. Automatic extraction and visualization of treatment content Countries Madagascar Anochetus grandidieri Forel
  21. 21. Datamining of treatments Pseudomyrmex ants and Vachellia ant-acacias are a classic example of mutualism in biology. allenii melanoceras ruddiae chiapensis collinsii cookii cornigera globulifera hindsii janzenii mayana sphaerocephala boopis flavicornis hesperius ita janzeni kuenckeli mixtecus nigrocinctus nigropilosus opaciceps particeps peperi reconditus satanicus simulans spinicola subtilissimus veneficus ferrugineus gentlei gracilis Transbiotic link network Associated species linked through references in taxonomic treatments Acacia-ant species: Pseudomyrmex gracili Treatment: original description Treatment: redescription Associated ant-acacia: Acacia gentlei Ants Plants Photocredits: Alex Wild Treatment Treatments linked through citations
  22. 22. The Plazi approach From treatment to treatment repository The Plazi approach Agosti, D., W. Egloff. 2009. Taxonomic information exchange and copyright: the Plazi approach. BMC Research Notes 2009, 2:53. doi:10.1186/1756-0500-2-53
  23. 23. The Plazi approach Plazi workflow Plazi SRS find scan «OCR» markup store
  24. 24. Analyzing a large corpus of publications: Plazi repository 14,590 specimens 8900 plottable specimens from 1138 unique locations
  25. 25. Analyzing a journal: Journal of Hymenoptera Research 5170 specimens 4062 plottable specimens from 1138 unique locations
  26. 26. The biodiversity community Plants 3,400 Herbaria worldwide 10,000 Associate curators and specialists 350,000,000 specimens in collections 180,000,000 specimens digitized 2,000,000,000 specimens including animals
  27. 27. The biodiversity community 200,000,000+ printed pages 1,900,000 species described 20,000,000+ species treatments 17,000 new species per year
  28. 28. The taxonomy publishing world 12,000 Taxonomic Papers on 42,000 Spiders Since 1757 Publications widely scattered Source: Jeremy Miller
  29. 29. Why is the system broken? WHY does it NOT work?
  30. 30. Access to data limited …we cannot create a link to the related biodiversity data
  31. 31. Communication 200,000,000+ printed pages 1,900,000 species described 20,000,000+ species treatments 17,000 new species per year BUT: The data are hidden Incomplete digitization Publications are not semantically enhanced Collections are incomplete Data is not linked Most data are not open
  32. 32. Why is the system broken? Access to a corpus NOT single PDF, data point
  33. 33. Why is the system broken? Access to content NOT representations
  34. 34. Why is the system broken? Legal issues Technical issues Social issues
  35. 35. Legal issues: Copyright Access to ant taxonomic publications through /Smithsonian Institution, including currently the entire body of non-copyrighted publications since 1758 (>4,000 publications or 85,000 pages)
  36. 36. Legal issues: licences Legal licences for 1000+ journals cannot be tracked by scientists
  37. 37. Technical issues: Digital Object Identifiers DOI Missing CrossRef an exclusive club
  38. 38. Technical issues: Journal publishing workflow Journal publishing workflows: From structured data to unstructured text
  39. 39. Technical issues: Content extraction Conversion of legacy literature prohibitively expensive [Plot: costs for markup including materials citations; axes: Pages, Minutes] Source: Spider Pilot, Jeremy Miller Plazi SRS find scan «OCR» markup store Average: 6 min / page complete OCR: 0.80 EUR / page vendor
  40. 40. Social issues: data sharing The misunderstood attribution
  41. 41. Why is the system broken? WHY NOT make it work?
  42. 42. European Open Biodiversity Knowledge Management System European Open Biodiversity Knowledge Management System European Union FP7 funded project
  43. 43. European Open Biodiversity Knowledge Management System Prepare the ground for the creation of a system for intelligent management of biodiversity knowledge which will improve the present system of taxonomic literature.
  44. 44. Legal issues: Copyright: The Blue List The Blue List elements of taxonomic information that are not subject to copyright Patterson, D. J., Egloff, W., Agosti, D., Eades, D., Franz, N., Hagedorn, G., Rees, J. A. and Remsen, D. P. 2014. Scientific names of organisms: attribution, rights, and licensing BMC Research Notes 7:79 doi:10.1186/1756-0500-7-79.
  45. 45. Legal issues: Copyright: Legal exceptions for research Legal exceptions for research Egloff W, Patterson D, Agosti D, Hagedorn G 2014. Open exchange of scientific knowledge and European copyright: The case of biodiversity information. ZooKeys 414, 109-135. DOI: 10.3897/zookeys.414.7717
  46. 46. Legal issues: Copyright: Open Access Open Access
  47. 47. Legal issues: Copyright: Creative Commons Licence
  48. 48. Technical issues: DOI Persistent identifiers for data objects and physical objects Linking data using agreed vocabularies
  49. 49. Technical issues: DOI Biodiversity Literature Repository @ Zenodo public repository for legacy literature using DataCite DOI CrossRef to cite (Zenodo) DataCite DOI?!
  50. 50. Technical issues: semantic enhanced publishing Semantically enhanced publishing TaxPub JATS Use DOI as widely as possible
  51. 51. Technical issues: machine access (well documented) API
  52. 52. Technical issues: semantic publishing Advanced publishing and dissemination Form-based, semantically enhanced, TaxPub JATS-based publishing
  53. 53. Social issues: Bouchout Declaration launched June 12, 2014 10 Principles Free and open use of digital resources Use of persistent identifiers and linking of data Policy developments Developing sustainable business models
  54. 54. Social issues: Bouchout Declaration
  55. 55. Technical issues: business plan
  56. 56. Conclusions If we want to conserve the world’s biodiversity, we need one stop open shopping for biodiversity research results.
  57. 57. Conclusions We scientists are getting our act together.
  58. 58. Conclusions Will the publishers too?
  59. 59. Thank you! Donat Agosti
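
The marked-up treatment on slide 19 becomes machine-readable once it is well-formed XML. The following is a minimal sketch in Python of how genus, species and the external identifier could be pulled out of such a fragment. It rests on several assumptions: the slide does not declare the `tax:` and `dc:` namespaces, so the URIs below are placeholders; the sample is a trimmed, well-formed copy of the slide text; and `NS`, `SAMPLE` and `extract_names` are illustrative names, not part of Plazi's tooling.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption): the slide fragment does not declare them.
NS = {
    "tax": "http://example.org/tax",
    "dc": "http://example.org/dc",
}

# Trimmed, well-formed copy of the slide 19 fragment (description text omitted).
SAMPLE = """\
<tax:treatment xmlns:tax="http://example.org/tax" xmlns:dc="http://example.org/dc">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>
"""


def extract_names(xml_text):
    """Return one dict per tax:name element: genus, species, id source, identifier."""
    root = ET.fromstring(xml_text)
    records = []
    for name in root.iterfind(".//tax:name", NS):
        xid = name.find("tax:xid", NS)
        records.append({
            "genus": name.findtext("tax:xmldata/dc:Genus", default=None, namespaces=NS),
            "species": name.findtext("tax:xmldata/dc:Species", default=None, namespaces=NS),
            "source": xid.get("source") if xid is not None else None,
            "identifier": xid.get("identifier") if xid is not None else None,
        })
    return records


if __name__ == "__main__":
    for record in extract_names(SAMPLE):
        print(record)
```

Because the name parts sit in their own elements, a program can read them directly instead of guessing them from running text, which is the point of semantic enhancement made in the slides.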