Bibliographic References in BHL


Published in: Technology, Education

  1. Bibliographic references in BHL
     Coordination and routes for cooperation across organizations, projects and e-infrastructures
     23rd of May 2013
     William Ulate R., Missouri Botanical Garden
  2. Questions to Answer
     1. Type of content we discuss (e.g., occurrences, genes, behaviour, morphology, etc.)
     2. Sources of content (from where)
     3. Formats of content (formats, standards)
     4. Methods of gathering information (e.g., harvesting, FTP uploads, protocols)
     5. Methods of delivery of information (e.g., free searches, API, web services, automated exports, linking mechanisms, etc.; provide links to API and web services documentation)
     6. Identifiers used (type, persistence, dereferencing, resolvability)
     7. Present or forthcoming interoperability features with other platforms
     8. Constraints, needs and expectations for:
        a) Suppliers of content, and
        b) Users of content
     9. What is needed for Bibliographic References?
  3. A brief history…
  4. The Biodiversity Heritage
  5. Book Viewer
  6. Sharing
     BHL shares data through:
     • APIs
     • Data Export
     • OpenURL
     • OAI-PMH
  7. Open Data
     • Downloads
       – Simple tab-delimited exports of core data
     • Data model
       – DB schema as ERD
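As a minimal sketch of how a tab-delimited export like the one mentioned on the Open Data slide could be consumed, the Python below reads tab-separated rows into dictionaries. The column names and sample rows are hypothetical placeholders, not the real headers of the BHL export files.

```python
import csv
import io

# Hypothetical sample: these column names and rows are illustrative
# assumptions, not the actual layout of the BHL export files.
SAMPLE_EXPORT = (
    "TitleID\tFullTitle\tStartYear\n"
    "1\tExample title A\t1801\n"
    "2\tExample title B\t1902\n"
)


def read_export(handle):
    """Parse a tab-delimited export into a list of dicts keyed by header name."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(row) for row in reader]


rows = read_export(io.StringIO(SAMPLE_EXPORT))
titles_by_id = {row["TitleID"]: row["FullTitle"] for row in rows}
print(titles_by_id)
```

With a real export, the same function would take a file opened with `open(path, newline="", encoding="utf-8")` in place of the in-memory sample.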
  8. Services
     • Names Service
       – Return all occurrences of a name throughout BHL digitized corpus
         • Documentation:
       – Access to 100+ million name strings using TaxonFinder & NetiNeti
         • 1.5 million unique names
       – Algorithm to detect nomenclatural & taxonomic acts
     • OpenURL
       – Facilitate links to citations: protologues, articles, references
         • Documentation:
       – Useful to Nomenclators, Reference Systems
         • IPNI
         • Tropicos
  9. Services: OpenURL
  10. DOIs
  11. DOIs for Legacy Literature
      • BHL member of CrossRef through Smithsonian
      • Started assigning DOIs to BHL monographs
        – Low hanging fruit: Easy, non-controversial
        – 54,856 DOIs Approved to date
      • Next, other publication types / articles?
        – Process of automatically assigning CrossRef DOIs to articles has a higher potential for collisions.
  12. Article-level metadata
      • Disambiguating and locating structural components in the corpus
      • Done by automated and crowdsourced means
        – Thanks Rod Page! Welcome others!
      • Greatly increases semantic value of the dataset
      • Makes data addressable and thus linkable
      [Figure labels: Chapter-level metadata, Treatment-level metadata, Part-level metadata]
  13. Genesis: “BHL Article Repository”
      • Idea first introduced at TDWG 2008, Fremantle (by BHL, many have discussed for years)
      • YouTube for biodiversity articles
      • Needed (need) a way to access articles in BHL
        – “BHL has no articles.”
        – BHL has hundreds of thousands of articles, but you can’t search for them by author or article title
        – Can find via “article coordinates” using BHL’s UI & OpenURL resolver: Journal / Volume / Start Page / Year
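The "article coordinates" lookup (Journal / Volume / Start Page / Year) can be sketched as an OpenURL-style query string. This is an illustration under stated assumptions: the base URL, the choice of OpenURL 0.1 keys (`genre`, `title`, `volume`, `spage`, `date`) and the example journal are not taken from the slides and should be checked against the resolver's documentation.

```python
from urllib.parse import urlencode

# Assumption: resolver endpoint; verify against the BHL OpenURL documentation.
BASE_URL = "https://www.biodiversitylibrary.org/openurl"


def article_openurl(journal, volume, start_page, year, base_url=BASE_URL):
    """Build an OpenURL 0.1-style query from the four article coordinates."""
    params = {
        "genre": "article",
        "title": journal,          # journal title
        "volume": str(volume),
        "spage": str(start_page),  # start page
        "date": str(year),
    }
    return f"{base_url}?{urlencode(params)}"


# "Example Journal" is a hypothetical title used only for illustration.
url = article_openurl("Example Journal", 12, 345, 1901)
print(url)
```

Only the URL is constructed here; no request is sent, so the sketch makes no claim about what the resolver returns.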
  14. CiteBank
      • Objectives
        – Create a repository for community-vetted taxonomic bibliographies.
        – Ability to ingest, display, download, and index articles so that the BHL can operate as an article repository.
        – Provide links to content published online through other repositories.
      • Launched on December 6th 2010
      • 185,609 bibliographic records to date
  15. Citations today:
  16. Citations Providers
  17. [Diagram labels: Specimen Databases, Commercial Aggregators, Software Tools, Open Access Digital Libraries, Indices, Nomenclators, Open Access Publishers, International Collaborative Projects]
  18. Lessons Learned
      • Biblio/Drupal data model insufficient for mass of data envisioned for all biodiversity, too flat and difficult to expand in collaboration with Biblio development community
      • Data providers want their content findable and managed in the Biodiversity Heritage Library, not a system alongside BHL
      • Maintaining two platforms for biodiversity literature threatens sustainability of the literature resources over the longer term
  19. Global Names Architecture
  20. What have we done?
      • Articles
        – Extended BHL data model to store article metadata
        – Built process to harvest data from BioStor
      • Created user interfaces for adding article metadata and associated files
        – Defined functional requirements as improvements to Drupal-based CiteBank
        – Defined process flow for adding article metadata and associated files
        – Implemented UI changes
      • Changed BHL UI to accommodate article search
      • Changed BHL UI to accommodate article display (TOC)
  21. Articles in the BHL UI
  22. Articles
  23. Articles
  24. Articles
  25. Requirements for a citation repository?
      Admin. Interface
      – IMPORT AND MAPPING TOOL
        • Preview/Accept/Reject/Undo/Report on Import
        • No standard schema, MODS or BibTeX
        • Drag & drop GUI or mapped source and target field config.
      – USER MANAGEMENT
        • Self-Registration
        • Admin. Approval & Deletion
        • User Roles Assignment
      – GLOBAL UPDATES
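One way to read the "mapped source and target field config" requirement is as a lookup table applied to each incoming record, with a preview step that reports what would be accepted or rejected before anything is stored. The sketch below is an assumption about how such a tool might work, not a description of CiteBank or BHL code; the field names and the rule that a record needs a title are hypothetical.

```python
# Hypothetical mapping from a provider's field names to repository field names.
FIELD_MAP = {"ti": "title", "au": "author", "py": "year", "jo": "journal"}


def map_record(source, field_map):
    """Return (mapped_record, unmapped_fields) for one source record."""
    mapped, unmapped = {}, []
    for key, value in source.items():
        target = field_map.get(key)
        if target is None:
            unmapped.append(key)   # reported to the admin, not silently dropped
        else:
            mapped[target] = value
    return mapped, sorted(unmapped)


def preview_import(records, field_map, required=("title",)):
    """Build a preview report: which records would be accepted or rejected."""
    report = {"accepted": [], "rejected": []}
    for record in records:
        mapped, unmapped = map_record(record, field_map)
        missing = [name for name in required if not mapped.get(name)]
        entry = {"record": mapped, "unmapped": unmapped, "missing": missing}
        report["rejected" if missing else "accepted"].append(entry)
    return report


batch = [
    {"ti": "An example article", "au": "Doe, J.", "py": "1901", "xx": "ignored"},
    {"au": "Roe, R.", "py": "1902"},  # no title, so the preview rejects it
]
report = preview_import(batch, FIELD_MAP)
print(len(report["accepted"]), len(report["rejected"]))
```

Keeping the preview separate from the actual write is what would make the Accept/Reject/Undo steps on the slide possible: nothing is committed until the report has been reviewed.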
  26. Requirements for a citation repository?
      General User Interface
      – IMPORT
        • Upload/Preview/Accept/Reject/Undo/Report on Import
      – CREATE CITATION
        • By filling a Form, via BibTeX
      – BROWSE
        • Faceted: title, author, subject, year, contributor, my citations
  27. Requirements for a citation repository?
      • CITATION TYPES
        – Journal Article, Book Chapter, Conference Proceedings, Conference Paper, Thesis, Government Report, Note, etc.
      • OAI HARVESTING
        – Harvest and serve data through OAI-PMH
      • SPECIFICATIONS FOR DATA PROVIDERS PAGE
      • CONTRIBUTORS PAGE
        – Recognize ALL contributions
      • REPORTING
        – Statistics Page by Citation and Publication type
        – Recent/Latest Uploads
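The harvesting side of the OAI-PMH requirement can be sketched as building `ListRecords` requests and following the `resumptionToken` that the protocol uses for paging. The verb, arguments and XML namespace come from the OAI-PMH 2.0 specification; the endpoint `https://example.org/oai` and the sample response are placeholders, not a real BHL or CiteBank service.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for the first or a follow-up page."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is exclusive: no other arguments with it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def next_token(response_xml):
    """Return the resumptionToken in a response, or None on the last page."""
    root = ET.fromstring(response_xml)
    node = root.find(".//oai:resumptionToken", OAI_NS)
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()


# Placeholder response body, trimmed to the element this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://example.org/oai")
token = next_token(SAMPLE_RESPONSE)
follow_up_url = list_records_url("https://example.org/oai", resumption_token=token)
print(first_url)
print(follow_up_url)
```

A harvester would fetch each URL in turn and stop when `next_token` returns `None`; the network call is left out so the sketch stays self-contained.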
  28. What are we doing?
      • Integrate BHL’s Services with ZooBank, IPNI & IF
      • Authoritative list of titles in common use for nomenclatural acts (“TL3”)
      • Harvest relevant content from Mendeley
      • Integrate services and interfaces with the GNUB data model
      • Interoperate with citation parsing tools & services
  29. Support citation reconciliation...
      • L. Sp. Pl. 2: 971. 1753
      • Linneaus, C. Species Plantarum, vol. 2 p. 971. 1753
      • Linné, Carl von. Sp. Pl. Vol. 2 Page 971. 1753
      • Caroli Linnaei, Species Plantarum exhibentes plantas rite cognitas, ad genera relatas, cum Differentis Specificis, Nominibus Trivialibus, Synonymis Selectis, Locis Natalibus, secundum SYSTEMA SEXUALE digestas.. 2:971. 1753
      Zea mays
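The four citation strings on the reconciliation slide differ in author and title form but end in the same volume, page and year. As a deliberately narrow sketch, the Python below reduces each string to a `(volume, page, year)` key so that the variants collapse to a single entry. The regular expression is an assumption chosen for these examples; it is not the method used by BHL or by any citation parsing service.

```python
import re

# Matches a trailing "<volume> ... <page> ... <4-digit year>" pattern.
TAIL = re.compile(r"(\d+)\D+(\d+)\D+(\d{4})\s*$")

VARIANTS = [
    "L. Sp. Pl. 2: 971. 1753",
    "Linneaus, C. Species Plantarum, vol. 2 p. 971. 1753",
    "Linné, Carl von. Sp. Pl. Vol. 2 Page 971. 1753",
    "Caroli Linnaei, Species Plantarum exhibentes plantas rite cognitas, "
    "ad genera relatas, cum Differentis Specificis, Nominibus Trivialibus, "
    "Synonymis Selectis, Locis Natalibus, secundum SYSTEMA SEXUALE "
    "digestas.. 2:971. 1753",
]


def citation_key(citation):
    """Return (volume, page, year) from the tail of a citation, or None."""
    match = TAIL.search(citation)
    if match is None:
        return None
    volume, page, year = (int(group) for group in match.groups())
    return volume, page, year


keys = {citation_key(text) for text in VARIANTS}
print(keys)
```

Matching on the numeric tail alone would also merge unrelated works that share a volume, page and year, so a real reconciliation service would additionally need to compare authors and abbreviated titles.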
  30. Questions to Answer
      1. Type of content: Literature, Images, OCR Text and Bibliographic Citations
      2. Sources of content: BHL, CB & other Repositories
      3. Formats of content: BibTeX, MODS, DC
      4. Methods of gathering info: Harvesting, FTP Uploads
      5. Methods of delivery of info: Free Searches, API, web services, exports, linking mechanisms
      6. Identifiers used: CrossRef DOIs for Monographs
      7. Interoperability with other platforms: ZooBank, IPNI, IF
      8. Constraints, needs and expectations to suppliers of content and users of content
  31. Thank you
      pro-iBiosphere Meeting 3
      Coordination and routes for cooperation across organizations, projects and e-infrastructures
      Berlin, Germany
      May 23rd, 2013
      William.Ulate@mobot.org
      Global BHL Project Manager
      BHL Technical Director
      Senior Project Manager
      Missouri Botanical Garden