Irida immemxi hsiao

  1. IRIDA: Canada’s federated platform for genomic epidemiology
     William Hsiao, Ph.D. (William.hsiao@bccdc.ca, @wlhsiao)
     BC Centre for Disease Control Public Health Laboratory and University of British Columbia
  2. IRIDA Platform Overview
     • IRIDA = Integrated Rapid Infectious Disease Analysis
     • A free, open source, standards compliant, high quality genomic epidemiology analysis platform to support real-time disease outbreak investigations
     Core Functions:
     • Management of strain and genomic sequence data
     • Rapid processing and analysis of genomic data
     • Informative display of genomic results
     • Sample, case, and aggregate data (“metadata”) management
     Target audience:
     • Public health agencies who need a platform to manage and process genomic data
     • Public health agencies who need a platform to use genomics for outbreak investigations
     [Diagram labels: IRIDA; Sequencing Instruments; Web Application; Data Management; Built-in Analytical Tools; External Galaxy; Command-line Tools]
  3. 10 simple rules (wish list) to build a better public health microbiology genomic epidemiology analysis system
     Download the latest version at https://github.com/phac-nml/irida
  4. Rule 1: Engage the Users Through the Entire Software Development Cycle
     [Diagram labels: National Public Health Agency; Provincial Public Health Agency; Academic/Public]
     • Project Team has direct access to state of the art research in academia
     • Project Team is directly embedded in the user organization
  5. Rule 2: Have A Simple User Interface
     [Screenshot labels: Line List View (under testing); Timeline View (conceptualization); Selectable fields: Travel, Symptoms and Onset, Exposure Types, Hospitalization; Launch a pipeline; Be Like]
  6. Rule 3: Build a Robust, Extensible Platform
     • IRIDA uses Galaxy to manage workflows
     • Adding additional pipelines is relatively easy
     • IRIDA uses a standard API to allow third-party tools to obtain data from IRIDA (e.g. IslandViewer and GenGIS)
     [Diagram labels: IRIDA; Servlet Container; REST API; Central File Storage; Web Interface; Application Logic; Compute Cluster; Galaxy]
     IslandViewer: http://www.pathogenomics.sfu.ca/islandviewer/
     GenGIS: http://kiwi.cs.dal.ca/GenGIS/Main_Page
  7. Rule 4: Have Extensive Documentation
     • Documentation should be available for:
       • Users – step by step tutorial with screenshots / FAQ
       • System Administrators – installation instructions / issue trackers
       • Developers – open source, collaborative development / IRC channel
     • Easily accessible at https://irida.corefacility.ca/documentation/
  8. Rule 5: Implement QC Throughout the Whole Application
     • Genomics is sensitive and sequence data are inherently noisy
     • Genomics is a rapidly advancing technology
     • Standardizing pipelines is difficult and can stifle innovation
     • It is better to standardize the performance and reporting metrics and ensure that any validated pipeline meets the testing criteria
     • Developing a general QC testing module (RCQC) that uses ontology to standardize QC metrics (https://github.com/Public-Health-Bioinformatics/rcqc)
     • Data provenance and version control (data + pipelines) are musts for diagnostic labs
  9. Rule 6: Build to Enable Collaboration
     • Be able to compare pipelines
       • Pipelines implemented using Galaxy are transparent and shareable
       • Define QC criteria using ontology to compare different pipelines that serve the same purpose
     • Be able to share data in standard formats to minimize data re-entry from one platform to another
     • Federation of platforms using a standard API to share data and analysis results
  10. Rule 7: Use Compatible Data Standards
     • Sequence data are more compatible / shareable, but metadata are currently siloed and incompatible
     • Collaboration and sharing are difficult when data are incompatible
     • Compatibility != Sameness
     • Use ontology to allow customization of term lists, but all terms with the same meaning (semantics) should have the same universal ID (e.g. a URL) to facilitate mapping of terms
  11. Rule 8: Implement Fine Grained Access Control
     [Screenshot labels: Detailed View; Restricted View]
     E.g. user role permissions control visibility and editing of content
     Authorization:
     • Industry-standard authentication and authorization mechanisms
     • Local authorization per instance
     • Method-level authorization
     • Object-level authorization
  12. Rule 9: Use Technology to Safeguard Patient Privacy
     • It’s easy to lose control of the Excel line list: someone can make a copy of the content and pass it around without your knowledge; typos are common and cumulative!
     • Technology can control who sees what and when
     • Separate out sensitive patient data from pathogen sequence data, but be able to bring them together when necessary without resorting to emailing line lists!
  13. Rule 10: Have Multiple, Flexible Access Options
     • No one size fits all solution; having many platforms to choose from is a good thing (but data should be portable across platforms!)
     • IRIDA is available in several different flavours:
       • Local Install
         Advantages: Full control of the system; your data never leave your centre
         Disadvantages: Computing infrastructure and IT support needed to maintain the resource
       • Virtual Machine
         Advantages: Full control of the system; easy to set up
         Disadvantages: Not really scalable if run on your own desktop; some performance loss
       • Cloud Instance
         Advantages: Full control of the system; does not require local computing infrastructure
         Disadvantages: Data go into a cloud environment; uploading to a cloud environment can be slow
       • Public Version
         Advantages: No setup required; upload your data and have it processed using Compute Canada resources
         Disadvantages: Data go into a public instance (data remain private to your account); upload can be slow
  14. Acknowledgements
     • Project Leaders: Fiona Brinkman – SFU; Will Hsiao – PHMRL; Gary Van Domselaar – NML
     • University of Lisbon: João Carriço
     • National Microbiology Laboratory (NML): Franklin Bristow, Aaron Petkau, Thomas Matthews, Josh Adam, Adam Olson, Tarah Lynch, Shaun Tyler, Philip Mabon, Philip Au, Celine Nadon, Matthew Stuart-Edwards, Morag Graham, Chrystal Berry, Lorelee Tschetter, Aleisha Reimer
     • Laboratory for Foodborne Zoonoses (LFZ): Eduardo Taboada, Peter Kruczkiewicz, Chad Laing, Vic Gannon, Matthew Whiteside, Ross Duncan, Steven Mutschall
     • Simon Fraser University (SFU): Melanie Courtot, Emma Griffiths, Geoff Winsor, Julie Shay, Matthew Laird, Bhav Dhillon, Raymond Lo
     • BC Public Health Microbiology & Reference Laboratory (PHMRL) and BC Centre for Disease Control (BCCDC): Judy Isaac-Renton, Patrick Tang, Natalie Prystajecky, Jennifer Gardy, Damion Dooley, Linda Hoang, Kim MacDonald, Yin Chang, Eleni Galanis, Marsha Taylor, Cletus D’Souza, Ana Paccagnella
     • University of Maryland: Lynn Schriml
     • Canadian Food Inspection Agency (CFIA): Burton Blais, Catherine Carrillo, Dominic Lambert
     • Dalhousie University: Rob Beiko, Alex Keddy
     • McMaster University: Andrew McArthur, Daim Sardar
     • European Nucleotide Archive: Guy Cochrane, Petra ten Hoopen, Clara Amid
     • European Food Safety Agency: Leibana Criado Ernesto, Vernazza Francesco, Rizzi Valentina
  15. IRIDA Annual General Meeting, Winnipeg, April 8-9, 2015
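
Rule 5 argues for standardizing the reporting metrics and acceptance criteria rather than the pipelines themselves. A minimal sketch of that idea follows, assuming every pipeline reports its results as a dictionary of named metrics. The metric names and thresholds are hypothetical and are not taken from IRIDA or RCQC.

```python
# Hypothetical acceptance criteria shared by every pipeline.
# Each entry maps a metric name to ("min" or "max", limit).
QC_CRITERIA = {
    "mean_coverage": ("min", 30.0),
    "contig_count": ("max", 500),
}


def evaluate_qc(metrics, criteria=QC_CRITERIA):
    """Return a list of failure messages; an empty list means the run passes."""
    failures = []
    for name, (kind, limit) in criteria.items():
        if name not in metrics:
            failures.append(f"{name}: missing")
            continue
        value = metrics[name]
        if kind == "min" and value < limit:
            failures.append(f"{name}: {value} < {limit}")
        elif kind == "max" and value > limit:
            failures.append(f"{name}: {value} > {limit}")
    return failures
```

Because the check only looks at reported metrics, two different assembly pipelines could be held to the same criteria without sharing any code.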
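
Rule 7's point that compatibility is not sameness can be illustrated with a small sketch. Two hypothetical labs keep their own term lists, but each local term maps to a shared identifier, so records can be compared without forcing a single vocabulary. The lab vocabularies and the example.org URLs are invented for illustration and do not come from any real ontology.

```python
# Hypothetical local vocabularies: each lab keeps its own labels,
# but labels with the same meaning point to the same universal ID.
# All terms and example.org URLs here are illustrative assumptions.
LAB_A_TERMS = {
    "stool": "http://example.org/onto/0001",
    "blood": "http://example.org/onto/0002",
}
LAB_B_TERMS = {
    "feces": "http://example.org/onto/0001",
    "whole blood": "http://example.org/onto/0002",
}


def to_universal_id(term, vocabulary):
    """Return the shared ID for a local term, or None if it is unmapped."""
    return vocabulary.get(term.strip().lower())


def same_meaning(term_a, vocab_a, term_b, vocab_b):
    """True when both local terms resolve to the same universal ID."""
    id_a = to_universal_id(term_a, vocab_a)
    id_b = to_universal_id(term_b, vocab_b)
    return id_a is not None and id_a == id_b
```

Under these assumptions, `same_meaning("Stool", LAB_A_TERMS, "feces", LAB_B_TERMS)` is true even though the two labs never agreed on a label.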
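
Rules 8 and 9 describe detailed and restricted views controlled by user role, with sensitive patient data kept apart from pathogen data. The sketch below shows one way such a filter might look, assuming records are plain dictionaries. The role names and field names are hypothetical and do not reflect IRIDA's actual schema or authorization mechanism.

```python
# Field names and roles are illustrative assumptions, not IRIDA's schema.
SENSITIVE_FIELDS = frozenset({"patient_name", "date_of_birth"})
ROLES_WITH_DETAILED_VIEW = frozenset({"epidemiologist"})


def view_for(role, record):
    """Return a full copy for privileged roles, else a copy without sensitive fields."""
    if role in ROLES_WITH_DETAILED_VIEW:
        return dict(record)
    return {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
```

The function always returns a copy, so a caller holding the restricted view cannot reach the sensitive fields through the returned object.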

Editor's Notes

  • What is IRIDA?
  • Inspired by Jenn’s keynote, I reworked my slides in the 10 simple rules format

    Many systems are and will be available for analyzing public health microbiology data and we have seen a few throughout this conference.

    So I thought I’d present what I think are some of the rules and my wishlist for building a better public health genomic epidemiology platform.
    I will highlight how some of this thinking applies to our implementation of a platform.

    Some of these rules have been implemented well in other applications.
  • A large group of people contributed to this work
  • We also have a wonderful group of advisors