Biodiversity Heritage Library in Australia

Presentation given by Ely Wallis at Global BHL technical meeting, October 2010, Woods Hole, MA

  • Funding
    $8.2M for ALA from NCRIS in 2006-2011
    $26.5M in-kind from ALA partners
    $30M for ALA from EIF in 2009-2011
    Partnership
    Government: CSIRO, DAFF, DEWHA
    Museums and Herbaria: AM, MV, QM, TMAG
    Councils: CHAH, CHAFC, CHAEC, CHACM, CAMD
    Universities: SCU, UA
    Integrate biodiversity data for research and policy
    Specimens and Observations
    Names and Classifications
    Descriptions, Keys, Multimedia and Sequences
  • Illustrate the range of biodiversity information
  • ALA links users to disparate data sources
  • The ALA has already established an agreement to work with the Encyclopedia of Life and the University of Queensland Centre for Biological Information Technology. The aims of this project are as follows:
    Develop an open online catalogue of web identification tools – particularly diagnostic keys.
    Develop a reusable library of character state definitions.
    Explore atomisation of diagnostic keys into factual assertions concerning the relationships between taxa and character states.
    Explore innovative approaches to assisting users in identifying organisms online.
    The ALA has been in discussion with a number of international biodiversity data management projects about the possibility of adopting their solutions to establish regional hubs/mirrors. In each case the ALA would replicate their databases (with all existing and future data) and deploy an instance of their software. The ALA would then work with these international projects to enhance their (open-source) software to address national requirements and to integrate cleanly with processes and web infrastructure within Australian institutions.
    The projects in question are:
    MorphBank (http://www.morphbank.net/) – MorphBank is a library containing around 250,000 biological images. It is intended to serve as a platform for disparate projects to share their images and to manage a wide range of metadata for each image, including placement within a taxonomic hierarchy, geospatial data and morphological tags. MorphBank is interested in establishing a full mirror in Australia (images and software) and in working with the ALA to enhance and extend their open-source software.
    Barcode of Life Data Systems (http://www.barcodinglife.org/) – BOLD is a management system for sequence data, in particular COI barcode sequences but capable of supporting projects sharing any material-based sequence data sets. The system currently includes around 620,000 sequences for over 80,000 species. Many Australian projects are already contributing to thematic barcode networks (e.g. AllLeps, FishBOL, TreeBOL). An Australian node would give the opportunity to provide an integrated national view of all of these data and of data from other Australian sequencing projects. It would also provide a focus for integrating sequence data into the ALA’s GIS capabilities. BOLD also has LIMS software which may be of value to projects in managing their sequencing processes.
    Biodiversity Heritage Library (http://www.biodiversitylibrary.org/) – BHL has scanned and OCR-ed nearly 14,000,000 pages of historical biodiversity literature and is developing tools to mine these documents. The BHL platform allows all these publications to be accessed in a range of formats. New BHL projects are under way or starting in China, Europe and Japan. All of these will contribute to the global pool of accessible digital literature. BHL is willing to establish a replica node in Australia and to assist the ALA and its partners in planning and executing a scanning strategy here. The existence of such infrastructure could serve as a focus for project-based contributions of relevant literature and to explore collaborations with Australian libraries and publishers.
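The aim listed earlier in these notes, atomising diagnostic keys into factual assertions about taxa and character states, can be illustrated with a minimal sketch. Everything in it is hypothetical: the nested-dictionary key layout, the taxon and character names, and the `Assertion`, `atomise` and `identify` helpers are assumptions made for illustration, not part of any ALA or IdentifyLife software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One atomic fact: this taxon shows this state for this character."""
    taxon: str
    character: str
    state: str


# Hypothetical dichotomous key. A node is either a taxon name (a leaf) or a
# dict mapping (character, state) leads to child nodes.
KEY = {
    ("wing colour", "yellow"): {
        ("antennae", "feathery"): "Taxon A",
        ("antennae", "threadlike"): "Taxon B",
    },
    ("wing colour", "brown"): "Taxon C",
}


def atomise(node, path=()):
    """Yield one Assertion per lead on the path from the root to each taxon."""
    if isinstance(node, str):
        for character, state in path:
            yield Assertion(node, character, state)
        return
    for lead, child in node.items():
        yield from atomise(child, path + (lead,))


def identify(assertions, observed):
    """Return the taxa that no observed character state contradicts."""
    taxa = {a.taxon for a in assertions}
    for a in assertions:
        if a.character in observed and observed[a.character] != a.state:
            taxa.discard(a.taxon)
    return sorted(taxa)


ASSERTIONS = list(atomise(KEY))
```

Once the key is stored as assertions, a user can begin from whichever character they can actually observe rather than following the key's fixed order. For example, `identify(ASSERTIONS, {"antennae": "feathery"})` keeps Taxon A and Taxon C, because the key asserts nothing about the antennae of Taxon C.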
  • Data integration activities represent the core tasks funded under the original ALA NCRIS funding. The focus is on the development of tools and services to index primary information sources and provide integrated views to enable users to select and access those resources which are most relevant to their concerns.
    The following components are under development:
    Ontologies and vocabularies – data integration within the ALA and with other projects (e.g. AVH, OZCAM, GBIF, EOL, OBIS) depends on a shared understanding of the structure of biodiversity data and agreement about the data elements which can be shared. The ALA is working with TDWG and international projects to set up tools to engage the community in developing and maintaining the ontologies and vocabularies required for this purpose. These structures will be particularly important to the ALA Metadata Repository and will provide the models to be used within that tool for organising Australian biodiversity information.
    Metadata repository – The core component within the ALA’s data integration programme will be the Metadata Repository. This is being constructed using the Fedora open-source content management system. It will serve as a catalogue of biodiversity information resources (databases, documents, images, etc.) with provider-supplied metadata describing the origins and nature of each resource, but will be extended to link these resources to the species to which they relate, the geographic regions which they cover, etc. and to model the relationships between species, regions, habitats, descriptive characters, etc. (using information from tools such as the ALA Geospatial Data Cache). This will allow the ALA to produce web pages giving overviews of the available information relating to each species, region, habitat, etc. The Metadata Repository will therefore act as the engine serving information links to the proposed Data Dissemination components (especially Biodiversity Information Explorer and Biosecurity Portal).
    User authentication and identity management – The ALA will require the ability to authenticate users for many different purposes: to allow data providers to manage the metadata for their resources; to allow users to identify themselves to make annotations or provide additional data; for taxonomists to contribute to the Australian national checklists; etc. Building an integrated concept of the expertise of each individual will also allow the ALA to improve its use of the information supplied by each user.
    Annotation services – The ALA has received funding from the NCRIS Platforms for Collaboration capability’s NeAT programme to develop annotation services to enhance the quality of data and to enable end users to contribute new information to the network. This work is being carried out at the University of Queensland School of Information Technology and Electrical Engineering and early versions of some of the tools have been integrated into the GBIF Data Portal software at http://data.ala.org.au/. As the ALA proceeds, these tools will be used in many ways, including capturing user suggestions for corrected values within data records, free-text user comments, user tagging of species with descriptive terms, etc.
    Data quality and sensitive data tools – The ALA is currently exploring concerns around data sensitivity within state conservation agencies, natural history collections and biosecurity activities. The goal behind this is to develop best practice recommendations on the handling of occurrence records of conservation or biosecurity concern (e.g. reduction of precision of coordinates for records of species considered endangered in the state where they have been recorded), and then to provide easy-to-use services to scan sets of records (e.g. as a spreadsheet) to evaluate any possible issues and report back to the data provider. This is seen to be an important tool to assist data providers in becoming comfortable about sharing data and also to allow the broader community to develop consistent approaches to handling records for sensitive taxa. The tool can also be extended to include a wide range of additional data validation and other checks. In this form it will become a major component in the ALA’s approach to improving data quality. Records with issues can be reported to the data holders and can automatically be annotated with notes or suggested corrections. End users will also be able to use annotation tools to contribute to data quality. Ultimately all such annotations should be handled through workflows which capture responses from the data providers.
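The record-scanning service described above (reduce coordinate precision for sensitive species, run validation checks, report back to the provider) could look roughly like the following sketch. It is an illustration under stated assumptions, not the ALA tool: the column names, the `SENSITIVE` lookup, the species names and the one-decimal-place rule are all hypothetical.

```python
import csv
import io

# Hypothetical sensitivity list: (species, state) -> decimal places to keep.
SENSITIVE = {("Examplea rara", "VIC"): 1}

# Hypothetical spreadsheet export supplied by a data provider.
SAMPLE = """species,state,latitude,longitude
Examplea rara,VIC,-37.8136,144.9631
Commonus vulgaris,VIC,-37.8136,144.9631
Commonus vulgaris,ACT,-95.0,149.138
"""


def scan(rows):
    """Return (cleaned_rows, report); report holds (row_number, message) pairs."""
    cleaned, report = [], []
    for number, row in enumerate(rows, start=1):
        row = dict(row)  # work on a copy so the provider's input is untouched
        try:
            lat = float(row["latitude"])
            lon = float(row["longitude"])
        except (KeyError, ValueError):
            report.append((number, "missing or non-numeric coordinates"))
            cleaned.append(row)
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            report.append((number, "coordinates out of range"))
        places = SENSITIVE.get((row.get("species"), row.get("state")))
        if places is not None:
            # Reduce precision so the exact locality is not disclosed.
            row["latitude"] = str(round(lat, places))
            row["longitude"] = str(round(lon, places))
            report.append((number, "sensitive species: coordinates generalised"))
        cleaned.append(row)
    return cleaned, report


cleaned, report = scan(csv.DictReader(io.StringIO(SAMPLE)))
```

Returning the report separately from the cleaned rows mirrors the workflow in the notes: the issues go back to the data provider, while the generalised records are the ones that would be shared.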
  • Biodiversity Heritage Library in Australia

    1. The Atlas of Living Australia: BHL-Au
       The Atlas of Living Australia project is a collaboration between the Australian Government, the Commonwealth Scientific and Industrial Research Organisation (CSIRO), the Council of Heads of Australasian Herbaria, the Australian Museum, the Museum of Victoria, the Queensland Museum, Southern Cross University, the Tasmanian Museum and Art Gallery and the University of Adelaide. It is funded by the Australian Government under the National Collaborative Research Infrastructure Strategy and further supported by the Super Science Initiative of the Education Investment Fund.
    2. Atlas of Living Australia - sharing biodiversity knowledge
       The Atlas of Living Australia
       • Government-funded (NCRIS) project to June 2012
       • Mission:
         – To develop an authoritative, freely accessible, distributed and federated biodiversity data management system that links Australia’s biological knowledge with its scientific reference collections and other custodians of biological information
         – To share biodiversity knowledge to shape our future
    3. Biodiversity information
       Locality: Reid, ACT; GPS: 35.280S 149.138E; Date: 1 January 2008
       Uresiphita ornithopteralis (Guenée, 1854)
       Kingdom: Animalia; Phylum: Arthropoda; Class: Insecta; Order: Lepidoptera; Family: Crambidae; Subfamily: Pyraustinae; Tribe: Pyraustini; Genus: Uresiphita Hübner, 1825
       English: tree lucerne moth
       = Mecyna ornithopteralis Guenée, 1854
       Diagram labels: Identified as; Braconidae - ? Chaoilta sp. (Parasitises); Huntsman spider Holconia montana (Preys upon); Tagasaste (tree lucerne) Chamaecytisus palmensis (Feeds upon)
       Biology and ecology; Molecular biology; Distribution; Literature
    4. Implementation
       Metadata (source, methods, ownership, access, etc.)
       Data (collections, field observations, literature, molecular, images, expert knowledge, etc.)
       Metadata repository
       Names and Classification
       Distribution
       Species Pages
       Regional Atlas
       Annotation Tools
       Pest Information
       Uses (biosecurity, land-use, climate change, crop development, resource management, materials, forensics, taxonomy, etc.)
       Links to international projects
    5. ALA Components
       Data Dissemination: Conservation Portal; Pest Information Portal; Biodiversity Information Explorer; Citizen Science Portal
       Spatial Data Management: Spatial Toolkit; Biological Data Cache; Environmental Data Store
       Collection Data Management: Field Capture of Metadata; Accession Processing; Digitisation and Imaging Support; Database Integration Wrappers
       Integrated Data Sets: OZCAM; AMRiN; AVH; APPD; OBIS
       ALA Project Office
       Australian National Checklists: Web Services and User Interfaces; Completed National Checklists (AFD, APC, etc.); Community Editing and Workflow Tools; Directory of Taxonomic Expertise; Legislative and Thematic Lists
       Data Integration: Ontologies and Vocabularies; Quality Control and Sensitive Data Tools; Metadata Repository; Annotation Services; User Authentication and Identity Management
       Rich Data Stores: Species Interactions; Sequences (BOLD); Digital Literature (BHL); Descriptive Data (IdentifyLife); Images (MorphBank)
       Spatial Portal (Web GIS)
    6. Rich data stores
    7. Biodiversity Information Explorer http://www.ala.org.au
    8. Work to date
       • Mid 2010 – BHL-Au and BHL kickoff meeting.
       • Held in Melbourne at Museum Victoria and in Canberra at ALA HQ
    12. http://museumvictoria.com.au/about/mv-news/2010/bhl-visitors/
    14. Work to date
       • Mid 2010 – BHL-Au and BHL kickoff meeting.
       • Held in Melbourne at Museum Victoria and in Canberra at ALA HQ
       • Infrastructure setup – including development and test environments
       • Assessing workflows and mapping to Australian conditions
       • Developing new UI, without changes to functionality at this stage.
    15. http://bhl-test.ala.org.au/
    16. Human and other resources
       • A number of positions funded by ALA: team lead, developers, designer, project manager to be recruited.
       • Contractors
       • Existing contracts
       • Students and volunteers
    17. Plans
       • New UI for existing site
       • Mirroring content from other nodes
       • Ingestion and upload process
       • Metadata and links
       • New scanning
    18. Milestones
       • New UI for existing BHL site by December 2010
       • Ingestion workflow decided and implemented by March 2011
       • New scanning projects from 2011
       • ALA funding finishes in mid 2012
    19. Getting to know you summary
       • Motivation for participating in BHL
       • Funding
       • Drivers/criteria for success for funding organisations
       • Overview of your organisation/project
       • Planned or required integration with other local, regional or international efforts
       • Human and other resources available
       • Dates of major milestones and deliverables
    20. Annotation services
    21. http://trove.nla.gov.au/ndp/
