ASIH Fishnet2 Presentation
 

Brief overview of Fishnet2 for the 2009 ASIH meeting

    Presentation Transcript

    • Fishnet 2: A Network of Ichthyology Collections Offering Real-Time Analysis and Visualization
    • Overview
      • The original FishNet and related or subsequent projects (e.g. MaNIS, ORNIS, OBIS, GBIF) demonstrated the value of combining many collections into a global “virtual collection”.
      • Open issues: data availability, quality, and completeness (georeferencing, scientific names), and the technical performance and sustainability of networks.
      • NSF sponsored FishNet2 to evaluate enhancements and produce software tools that improve and further enable data sharing practices for natural history (and similar) collections.
      • Ichthyology collections were used for the prototypes.
      • Recent focus: performance and scalability of the data portal, enabling real-time search and analysis by many simultaneous users.
      • ASIH - 2009, Vieglais, University of Kansas, NSF-0415600
    • Sum of Sources
      • Collectively: specimens -> more data (taxonomic, spatial, temporal coverage).
      • Collections: combining data is beneficial to all participants. It leads to increased overall data availability (repatriation), reliability, and quality, and can improve the efficiency of future investigations (gap analysis).
      • Sharing combined data is beneficial to all stakeholders (investigators, funding agencies, scientific community, public) and provides a more effective, efficient use of resources by reducing duplication of effort.
    • Many Sources
      • [Diagram: Data Sources, Web Portal]
      • Network / infrastructure issues:
        • Query broadcast
        • Data service maintenance
        • Network vulnerabilities
        • Multi-user scalability
      • A fully distributed model enables up-to-date information, but suffers from severe reliability and scalability issues, since each query is broadcast to all data sources and the responses are collated.
      • Maintenance costs become significant as more data sources are added, especially when fundamental changes are made to the protocol or data model.
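The broadcast-and-collate behaviour described on this slide can be sketched roughly as follows. This is an illustration only, not FishNet2 code: the source names, the `fetch` callables, and the sample records are all made up, and a real network would speak DiGIR or TAPIR over HTTP rather than call local functions.

```python
# Hypothetical sketch of a fully distributed query: every request is
# sent to every data source and the responses are collated.

def broadcast_query(sources, predicate):
    """Query each source in turn; return (matching records, failed source names)."""
    records, failed = [], []
    for name, fetch in sources.items():
        try:
            records.extend(r for r in fetch() if predicate(r))
        except ConnectionError:
            # One unreachable source degrades every query that is broadcast.
            failed.append(name)
    return records, failed


def cached_query(cache, predicate):
    """Query a local record cache instead; no data source is contacted."""
    return [r for r in cache if predicate(r)]


# Made-up sources for illustration only.
def collection_a():
    return [{"genus": "Perca", "source": "a"}, {"genus": "Salmo", "source": "a"}]

def collection_b():
    return [{"genus": "Perca", "source": "b"}]

def collection_c():
    raise ConnectionError("source offline")

SOURCES = {
    "collection_a": collection_a,
    "collection_b": collection_b,
    "collection_c": collection_c,
}

def is_perca(record):
    return record["genus"] == "Perca"

records, failed = broadcast_query(SOURCES, is_perca)
print(len(records), failed)
```

With a cache, the same predicate runs against records harvested earlier, so query cost no longer grows with the number of live sources; the trade-off is that results are only as fresh as the last harvest.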
    • Addressing Social Sustainability
      • Operation Cost ≈ portal + n(source)
      • Relative costs (portal / each source):
        • Hardware: 6 / 1
        • Maintenance: 3 / 1
        • Setup / Change: 4 / 1
        • Bandwidth: 5 / 1
        • Total: 18 / 4
      • Operation Cost ≈ 18 + n*4
      • After 4-5 sources, the data sources are the most expensive element. Hence one goal should be to minimize the cost to data sources.
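The cost estimate on this slide can be checked with a few lines of arithmetic. The sketch below uses the relative cost units from the slide; the function and variable names are illustrative, and the figures are the presenter's rough weights, not measured costs.

```python
# Relative cost units as given on the slide (not currency).
PORTAL_COSTS = {"hardware": 6, "maintenance": 3, "setup_change": 4, "bandwidth": 5}
SOURCE_COSTS = {"hardware": 1, "maintenance": 1, "setup_change": 1, "bandwidth": 1}

PORTAL_TOTAL = sum(PORTAL_COSTS.values())      # 18
PER_SOURCE_TOTAL = sum(SOURCE_COSTS.values())  # 4


def operation_cost(n_sources):
    """Operation cost ~ portal + n * source."""
    return PORTAL_TOTAL + n_sources * PER_SOURCE_TOTAL


def sources_share(n_sources):
    """Fraction of the total operation cost carried by the data sources."""
    return n_sources * PER_SOURCE_TOTAL / operation_cost(n_sources)


def break_even_sources():
    """Smallest n at which the sources together cost at least as much as the portal."""
    n = 0
    while n * PER_SOURCE_TOTAL < PORTAL_TOTAL:
        n += 1
    return n


for n in (1, 4, 5, 10):
    print(n, operation_cost(n), round(sources_share(n), 2))
print("break-even:", break_even_sources())
```

At four sources the sources account for 16 of 34 units; at five they account for 20 of 38, which is where they overtake the portal.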
    • Portal Design Goals
      • High performance: query response < 1 sec
      • Scalable: > 10e7 records, multiple users
      • Arbitrary content (focus on Darwin Core)
      • Integration with related data
      • Simple but effective subset and export
      • Programmatic interfaces for extensibility
      • Reporting of record use
      • Provide value for participation
    • Components
      • [Diagram; labels in extraction order: Source Manage UI, Search UI, Portal, Darwin Core Data, Portal Services, Direct, Web Server, Upload, Index, DiGIR, CSV, TAPIR, Record Cache, Georef, Env Data, Morphology, ...]
    • Remote Administration
    • The Future
      • The current solution forms a good basis for ongoing infrastructure. It is extensible to new data and scalable to very large data sets.
      • Technical issues are not the major problem. Viable solutions are available, and issues of data translation and transfer are for the most part well defined.
      • There are real costs associated with data sharing, so the return on investment must be satisfying to participants.
      • The biggest problem is long-term sustainability of the technical infrastructure, especially the data sources, which comprise the bulk of the costs.
      • Attract ongoing support from the community, and direct fiscal and in-kind assistance to appropriate targets (e.g. maintenance of portal(s) versus deployment of new data sources).