Bringing Standards to Life:
 Software Development by the
 Genomic Standards Consortium




            Renzo Kottmann
             Microbial Genomics Group
     Max Planck Institute for Marine Microbiology

      M3 SIG Stockholm July 2009
Genomic Standards Consortium (GSC)

Goal
  • Promote mechanisms that standardize the description of
    genomes and the exchange and integration of genomic data

Open-membership, international working body
  • Established in Sept 2005
  • Participants include DDBJ, EMBL, GenBank, Sanger,
    JCVI, JGI, EBI and a range of US, UK and EU research
    institutions
  • Organized a series of workshops


            http://gensc.org and http://gensc.org/gc_wiki/index.php/GSC_Membership
Minimum Information about a Genome Sequence
              (MIGS) Specification

MIGS extends what DDBJ/EMBL/GenBank request
 upon submission of a genome sequence
  • Examples:
       Description of geographic location of a sample and
        habitat
       “Minimum Information about a Metagenomic Sequence”
        (MIMS)
         – Temperature
         – pH
       Description of sequence generation
         – Sequencing method
         – Assembly method

                         Field et al. Nat Biotechnol. 2008
MIGS Checklist 2.0

[Figure: MIGS checklist table. M = mandatory]

  Field et al. Nat Biotechnol. 2008
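
The "M = mandatory" marking in the checklist could be applied in software roughly as follows. This is only a minimal sketch: the field names are hypothetical, loosely taken from the examples on the MIGS slide, and which of them are treated as mandatory here is an assumption for illustration, not the content of the actual checklist.

```python
# Hypothetical field names based on the examples named on the MIGS slide.
# The True/False flags (mandatory or not) are assumptions for this sketch.
CHECKLIST = {
    "geographic_location": True,
    "habitat": True,
    "sequencing_method": True,
    "assembly_method": True,
    "temperature": False,
    "ph": False,
}


def missing_mandatory(report):
    """Return the mandatory checklist fields that are absent or empty."""
    missing = []
    for name, mandatory in CHECKLIST.items():
        value = report.get(name)
        if mandatory and (value is None or not str(value).strip()):
            missing.append(name)
    return sorted(missing)


report = {
    "geographic_location": "Kiel Fjord, Baltic Sea, Germany",
    "habitat": "Aquatic: Marine",
    "ph": 8.1,
}
print(missing_mandatory(report))
```

A report would count as compliant in this sketch once the returned list is empty.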
Software Development for MIGS/MIMS

Mechanisms for achieving compliance are needed:
  • Such mechanisms involve
       – an appropriate reporting structure for capturing
         and exchanging data,
       – software,
       – databases,
       – and controlled vocabularies and/or ontologies for
         defining the terms used in the annotations.

Supporting Projects:
  • Habitat-Lite (Ontology specification)
  • Genomic Rosetta Stone (Identifier Mapping)
  • GCDML (MIGS/MIMS specification in XML)
  • Genome Catalogue (Database and Web Server)

     Field et al. Nat Biotechnol. 2008
                                               Habitat-Lite (= EnvO-Lite)
        Easy-to-use (small) set of terms
                • Captures high-level information about habitat
                • Derived from the Environment Ontology (EnvO).

        Meets the needs of multiple users
                • Annotators, database providers, biologists, and
                  bioinformaticians alike, who need to search and
                  employ such data in comparative analyses.




                                                                  Hirschman et al. OMICS. 2008
Habitat-Lite

Level 1                              Level 2
Aquatic                              soil
 Aquatic: Freshwater                 sediment
 Aquatic: Marine                     sludge
Terrestrial                          waste water
Air                                  hot spring
Fossil                               hydrothermal vent
Food                                 biofilm
Organism-Associated                  microbial mat
Extreme Habitat
Other


                       < 20 terms

                       Hirschman et al. OMICS. 2008
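
Because the term set is this small, it can be held directly in code and used to normalize free-text habitat annotations. The sketch below copies the terms from the slide; the function name and the case- and whitespace-insensitive matching rule are assumptions made for illustration and are not part of Habitat-Lite itself.

```python
# Terms as listed on the Habitat-Lite slide, grouped by the two columns.
LEVEL_1 = (
    "Aquatic", "Aquatic: Freshwater", "Aquatic: Marine", "Terrestrial",
    "Air", "Fossil", "Food", "Organism-Associated", "Extreme Habitat", "Other",
)
LEVEL_2 = (
    "soil", "sediment", "sludge", "waste water", "hot spring",
    "hydrothermal vent", "biofilm", "microbial mat",
)


def normalize_habitat(term, level=1):
    """Return the canonical spelling of a term, or None if it is not listed.

    Matching ignores case and repeated whitespace (an assumption of this sketch).
    """
    vocabulary = LEVEL_1 if level == 1 else LEVEL_2
    lookup = {entry.casefold(): entry for entry in vocabulary}
    key = " ".join(term.split()).casefold()
    return lookup.get(key)


print(normalize_habitat("aquatic:  marine"))
print(normalize_habitat("Hot Spring", level=2))
print(normalize_habitat("desert"))
```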
Habitat-Lite applied




   http://www.megx.net/genomes
Genomic Rosetta Stone (GRS)
  • Create a unified mapping between different genomic
    resources
  • Improve navigation across these resources
  • Enable the integration of this information in the near
    future

                    Van Brabant et al. OMICS. 2008
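
The idea of a unified identifier mapping can be illustrated with a small sketch. This is not the GRS implementation: the class, the resource names and the identifiers below are all hypothetical placeholders. It only shows how pairwise links between resources let one identifier resolve to its equivalents everywhere else.

```python
from collections import defaultdict


class RosettaStone:
    """Toy mapping between (resource, identifier) pairs; illustrative only."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        """Record that two (resource, identifier) pairs denote the same genome."""
        self._links[a].add(b)
        self._links[b].add(a)

    def equivalents(self, start):
        """Return every pair reachable from start by following recorded links."""
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in self._links.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        seen.discard(start)
        return seen


# Placeholder resources and identifiers, invented for this example.
grs = RosettaStone()
grs.link(("resource_a", "A-001"), ("resource_b", "B-42"))
grs.link(("resource_b", "B-42"), ("resource_c", "C-7"))
print(sorted(grs.equivalents(("resource_a", "A-001"))))
```

Following links transitively means two resources need not reference each other directly, as long as both link to a common third resource.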
Genomic Contextual Data
             Markup Language (GCDML)


A language based on the Extensible Markup Language (XML)


Aim
  • Implement MIGS/MIMS
  • Provide even more descriptors
  • Facilitate exchange and integration of genomic data




                      Kottmann et al. OMICS. 2008
GCDML Example (excerpt)



<gcdml:originalSample>
  <gcdml:physicalMaterial>
    <gcdml:samplingTime><gcdml:notGiven>unknown</gcdml:notGiven></gcdml:samplingTime>

    <gcdml:samplePointLocation>
      <gml:LocationKeyWord>Baltic Sea</gml:LocationKeyWord>
      <gml:LocationString>Kiel Fjord, Baltic Sea, Germany</gml:LocationString>
      <gcdml:pos2D>54.329 10.149</gcdml:pos2D>
      <gcdml:determinationMethod>derived from literature</gcdml:determinationMethod>
    </gcdml:samplePointLocation>

    <gcdml:marineHabitat>
      <gcdml:waterBody>
         <gcdml:depth>
           <gcdml:measure min="0.00" max="0.05"><gcdml:values uom="m">0.00 0.05</gcdml:values></gcdml:measure>
         </gcdml:depth>
      </gcdml:waterBody>
    </gcdml:marineHabitat>

     <gcdml:materialType>seawater</gcdml:materialType>
     <gcdml:amount><gcdml:measure><gcdml:values uom="ml">100</gcdml:values></gcdml:measure></gcdml:amount>
  </gcdml:physicalMaterial>
</gcdml:originalSample>
                                             Kottmann et al. OMICS. 2008
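
A document like this excerpt can be read with any namespace-aware XML parser. The sketch below uses Python's standard library on a shortened copy of the excerpt. The namespace URIs are placeholders, since the excerpt does not show the real ones, and reading `pos2D` as "latitude longitude" is an assumption that happens to match the position of Kiel Fjord.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs; the real GCDML and GML URIs are not shown in the excerpt.
NS = {"gcdml": "urn:example:gcdml", "gml": "urn:example:gml"}

EXCERPT = """
<gcdml:originalSample xmlns:gcdml="urn:example:gcdml" xmlns:gml="urn:example:gml">
  <gcdml:physicalMaterial>
    <gcdml:samplePointLocation>
      <gml:LocationString>Kiel Fjord, Baltic Sea, Germany</gml:LocationString>
      <gcdml:pos2D>54.329 10.149</gcdml:pos2D>
    </gcdml:samplePointLocation>
    <gcdml:materialType>seawater</gcdml:materialType>
  </gcdml:physicalMaterial>
</gcdml:originalSample>
"""

root = ET.fromstring(EXCERPT.strip())

# Look up elements by prefixed name, resolved through the NS mapping.
location = root.findtext(".//gml:LocationString", namespaces=NS)
material = root.findtext(".//gcdml:materialType", namespaces=NS)

# Assumed order: latitude first, then longitude.
lat, lon = (float(x) for x in root.findtext(".//gcdml:pos2D", namespaces=NS).split())

print(location, lat, lon, material)
```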
Genome Catalogue
Online system for capturing MIGS/MIMS-compliant
 reports




                    Field et al. Nature 2008
Genome Catalogue
Requirements
  • A rich, user-friendly toolkit
  • Designed to give credit to all contributors
  • XML-based (GCDML)
        Able to maintain all versions of GCDML schemas
  • Web services-based
        Supporting the automated exchange of content
  • Serve as the international GCAT identifier authority
  • Comprehensive
        Containing reports for all taxa and metagenomes
  • Ontology-supportive
  • Shared by the GSC

Current Status
We have specifications:
  • MIGS/MIMS
  • Habitat-Lite
  • Genomic Rosetta Stone
Work on supporting software is ongoing:
  • Genome Catalogue is at the prototype stage
  • Funding
        This is a long-term endeavour that cannot be done on a
         voluntary basis




Discussion
Software is needed for:
  • Creation of MIGS/MIMS data
  • Storage
  • Analysis
Expand standardization efforts to
  • Software specification/development
  • Work on a standardized genomic data management
    architecture / cyberinfrastructure
Data-intensive science is successful if it works
 towards one community with one vision
  • World Wide Genomics project
Acknowledgements

All members of the GSC, including
       Dawn Field
       Peter Sterk
       Saul Kravitz
       Tanya Gray

Megx.net team
       Frank Oliver Glöckner
       Ivaylo Kostadinov
       Melissa Beth Duhaime
       Pier Luigi Buttigieg
       Wolfgang Hankeln
       Pelin Yilmaz


END



Looking forward to the discussion

          Join the GSC
         http://gensc.org



More Related Content

Similar to Software Development by the Genomics Standards Consortium

The MIBBI Foundry and its Modules
The MIBBI Foundry and its ModulesThe MIBBI Foundry and its Modules
The MIBBI Foundry and its ModulesMIBBI Checklists
 
Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...
Jian Qin
 
The OptIPlanet Collaboratory Supporting Microbial Metagenomics Researchers Wo...
The OptIPlanet Collaboratory Supporting Microbial Metagenomics Researchers Wo...The OptIPlanet Collaboratory Supporting Microbial Metagenomics Researchers Wo...
The OptIPlanet Collaboratory Supporting Microbial Metagenomics Researchers Wo...
Larry Smarr
 
Cambridge University Geospatial Metadata Workshop 20110524
Cambridge University Geospatial Metadata Workshop 20110524Cambridge University Geospatial Metadata Workshop 20110524
Cambridge University Geospatial Metadata Workshop 20110524EDINA, University of Edinburgh
 
2011Field talk at iEVOBIO 2011
2011Field talk at iEVOBIO 20112011Field talk at iEVOBIO 2011
2011Field talk at iEVOBIO 2011
MIBBI Checklists
 
Tim Malthus_Towards standards for the exchange of field spectral datasets
Tim Malthus_Towards standards for the exchange of field spectral datasetsTim Malthus_Towards standards for the exchange of field spectral datasets
Tim Malthus_Towards standards for the exchange of field spectral datasetsTERN Australia
 
Leeds University Geospatial Metadata Workshop 20110617
Leeds University Geospatial Metadata Workshop 20110617Leeds University Geospatial Metadata Workshop 20110617
Leeds University Geospatial Metadata Workshop 20110617EDINA, University of Edinburgh
 
AI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite PredictionAI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite Prediction
Yannick Djoumbou
 
Oxford University Geospatial Metadata Workshop 20110415
Oxford University Geospatial Metadata Workshop 20110415Oxford University Geospatial Metadata Workshop 20110415
Oxford University Geospatial Metadata Workshop 20110415EDINA, University of Edinburgh
 
Northumbria University Geospatial Metadata Workshop 20110505
Northumbria University Geospatial Metadata Workshop 20110505Northumbria University Geospatial Metadata Workshop 20110505
Northumbria University Geospatial Metadata Workshop 20110505
EDINA, University of Edinburgh
 
Geospatial Metadata and Spatial Data: It's all Greek to me!
Geospatial Metadata and Spatial Data: It's all Greek to me!Geospatial Metadata and Spatial Data: It's all Greek to me!
Geospatial Metadata and Spatial Data: It's all Greek to me!
EDINA, University of Edinburgh
 
Henry&Hobbs, 'Developing long-term agro-ecological trial datasets for C and N...
Henry&Hobbs, 'Developing long-term agro-ecological trial datasets for C and N...Henry&Hobbs, 'Developing long-term agro-ecological trial datasets for C and N...
Henry&Hobbs, 'Developing long-term agro-ecological trial datasets for C and N...TERN Australia
 
Data cycle microbes
Data cycle microbesData cycle microbes
Data cycle microbes
jyotikhadake
 
Cpascoe pimms or2012_
Cpascoe pimms or2012_Cpascoe pimms or2012_
Cpascoe pimms or2012_
Charlotte Pascoe
 
Dr. Ying Xiao: Radiation Therapy Oncology Group Bioinformatics
Dr. Ying Xiao: Radiation Therapy Oncology Group BioinformaticsDr. Ying Xiao: Radiation Therapy Oncology Group Bioinformatics
Dr. Ying Xiao: Radiation Therapy Oncology Group Bioinformatics
National Cancer Institute National Cancer Informatics Program
 
Human genome project the mitre corporation - jason program office
Human genome project   the mitre corporation - jason program officeHuman genome project   the mitre corporation - jason program office
Human genome project the mitre corporation - jason program officePublicLeaks
 
Human genome project the mitre corporation - jason program office
Human genome project   the mitre corporation - jason program officeHuman genome project   the mitre corporation - jason program office
Human genome project the mitre corporation - jason program officePublicLeaker
 
BioDec Srl Company Profile
BioDec Srl Company ProfileBioDec Srl Company Profile
BioDec Srl Company Profile
BioDec
 

Similar to Software Development by the Genomics Standards Consortium (20)

The MIBBI Foundry and its Modules
The MIBBI Foundry and its ModulesThe MIBBI Foundry and its Modules
The MIBBI Foundry and its Modules
 
Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...
 
The OptIPlanet Collaboratory Supporting Microbial Metagenomics Researchers Wo...
The OptIPlanet Collaboratory Supporting Microbial Metagenomics Researchers Wo...The OptIPlanet Collaboratory Supporting Microbial Metagenomics Researchers Wo...
The OptIPlanet Collaboratory Supporting Microbial Metagenomics Researchers Wo...
 
Cambridge University Geospatial Metadata Workshop 20110524
Cambridge University Geospatial Metadata Workshop 20110524Cambridge University Geospatial Metadata Workshop 20110524
Cambridge University Geospatial Metadata Workshop 20110524
 
2011Field talk at iEVOBIO 2011
2011Field talk at iEVOBIO 20112011Field talk at iEVOBIO 2011
2011Field talk at iEVOBIO 2011
 
iRODS
iRODSiRODS
iRODS
 
Tim Malthus_Towards standards for the exchange of field spectral datasets
Tim Malthus_Towards standards for the exchange of field spectral datasetsTim Malthus_Towards standards for the exchange of field spectral datasets
Tim Malthus_Towards standards for the exchange of field spectral datasets
 
Leeds University Geospatial Metadata Workshop 20110617
Leeds University Geospatial Metadata Workshop 20110617Leeds University Geospatial Metadata Workshop 20110617
Leeds University Geospatial Metadata Workshop 20110617
 
AI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite PredictionAI and Machine Learning for Secondary Metabolite Prediction
AI and Machine Learning for Secondary Metabolite Prediction
 
Oxford University Geospatial Metadata Workshop 20110415
Oxford University Geospatial Metadata Workshop 20110415Oxford University Geospatial Metadata Workshop 20110415
Oxford University Geospatial Metadata Workshop 20110415
 
Northumbria University Geospatial Metadata Workshop 20110505
Northumbria University Geospatial Metadata Workshop 20110505Northumbria University Geospatial Metadata Workshop 20110505
Northumbria University Geospatial Metadata Workshop 20110505
 
Geospatial Metadata and Spatial Data: It's all Greek to me!
Geospatial Metadata and Spatial Data: It's all Greek to me!Geospatial Metadata and Spatial Data: It's all Greek to me!
Geospatial Metadata and Spatial Data: It's all Greek to me!
 
Henry&Hobbs, 'Developing long-term agro-ecological trial datasets for C and N...
Henry&Hobbs, 'Developing long-term agro-ecological trial datasets for C and N...Henry&Hobbs, 'Developing long-term agro-ecological trial datasets for C and N...
Henry&Hobbs, 'Developing long-term agro-ecological trial datasets for C and N...
 
Data cycle microbes
Data cycle microbesData cycle microbes
Data cycle microbes
 
Cpascoe pimms or2012_
Cpascoe pimms or2012_Cpascoe pimms or2012_
Cpascoe pimms or2012_
 
Dr. Ying Xiao: Radiation Therapy Oncology Group Bioinformatics
Dr. Ying Xiao: Radiation Therapy Oncology Group BioinformaticsDr. Ying Xiao: Radiation Therapy Oncology Group Bioinformatics
Dr. Ying Xiao: Radiation Therapy Oncology Group Bioinformatics
 
Human genome project the mitre corporation - jason program office
Human genome project   the mitre corporation - jason program officeHuman genome project   the mitre corporation - jason program office
Human genome project the mitre corporation - jason program office
 
Human genome project the mitre corporation - jason program office
Human genome project   the mitre corporation - jason program officeHuman genome project   the mitre corporation - jason program office
Human genome project the mitre corporation - jason program office
 
BioDec Srl Company Profile
BioDec Srl Company ProfileBioDec Srl Company Profile
BioDec Srl Company Profile
 
Brizio rossibiodec
Brizio rossibiodecBrizio rossibiodec
Brizio rossibiodec
 

Recently uploaded

20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
Rohit Gautam
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 

Recently uploaded (20)

20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 

Software Development by the Genomics Standards Consortium

  • 1. Bringing Standards to Life: Software Development by the Genomics Standards Consortium Renzo Kottmann Microbial Genomics Group Max Planck Institute for Marine Microbiology M3 SIG Stockholm July 2009 1
  • 2. Genomic Standards Consortium (GSC). Goal: promote mechanisms that standardize the description of genomes and that exchange and integrate genomic data. Open-membership, international working body: established in Sept 2005; participants include DDBJ, EMBL, GenBank, Sanger, JCVI, JGI, EBI and a range of US, UK and EU research institutions; organized a series of workshops. http://gensc.org and http://gensc.org/gc_wiki/index.php/GSC_Membership
  • 3. Minimum Information about a Genome Sequence (MIGS) Specification. MIGS extends what DDBJ/EMBL/GenBank request upon submission of a genome sequence. Examples: description of geographic location of a sample and habitat; “Minimum Information about a Metagenomic Sequence” (MIMS): temperature, pH; description of sequence generation: sequencing method, assembly method. Field et al. Nat Biotechnol. 2008
  • 4. MIGS Checklist 2.0. Field et al. Nat Biotechnol. 2008
  • 5. MIGS Checklist 2.0. M = mandatory. Field et al. Nat Biotechnol. 2008
  • 6. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed. Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Field et al. Nat Biotechnol. 2008
  • 7. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed. Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Supporting Projects: Habitat-Lite (Ontology specification). Field et al. Nat Biotechnol. 2008
  • 8. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed. Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Supporting Projects: Habitat-Lite (Ontology specification); Genomic Rosetta Stone (Identifier Mapping). Field et al. Nat Biotechnol. 2008
  • 9. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed. Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Supporting Projects: Habitat-Lite (Ontology specification); Genomic Rosetta Stone (Identifier Mapping); GCDML (MIGS/MIMS specification in XML). Field et al. Nat Biotechnol. 2008
  • 10. Software Development for MIGS/MIMS. Mechanisms for achieving compliance are needed. Such mechanisms involve an appropriate reporting structure for capturing and exchanging data, software, databases, and controlled vocabularies and/or ontologies for defining the terms used in the annotations. Supporting Projects: Habitat-Lite (Ontology specification); Genomic Rosetta Stone (Identifier Mapping); GCDML (MIGS/MIMS specification in XML); Genomes Catalogue (Database and Web Server). Field et al. Nat Biotechnol. 2008
  • 11. Habitat-Lite (= EnvO-Lite). Terms: Aquatic; Aquatic: Freshwater; Aquatic: Marine; Terrestrial; Air; Fossil; Food; Organism-Associated; Extreme Habitat; Other. Easy-to-use (small) set of terms: captures high-level information about habitat; derived from the Environment Ontology (EnvO). Meets the needs of multiple users: annotators, database providers, biologists, and bioinformaticians alike who need to search and employ such data in comparative analyses. Hirschman et al. OMICS. 2008
  • 12. Habitat-Lite. First level: Aquatic; Aquatic: Freshwater; Aquatic: Marine; Terrestrial; Air; Fossil; Food; Organism-Associated; Extreme Habitat; Other. Second level: soil; sediment; sludge; waste water; hot spring; hydrothermal vent; biofilm; microbial mat. Fewer than 20 terms in total. Hirschman et al. OMICS. 2008
  • 13. Habitat-Lite applied: http://www.megx.net/genomes
  • 14. Genomic Rosetta Stone (GRS). Create a unified mapping between different genomic resources; improve navigation across these resources; enable the integration of this information in the near future. Van Brabant et al. OMICS. 2008
  • 15. Genomic Rosetta Stone (GRS). Van Brabant et al. OMICS. 2008
  • 16. Genomic Rosetta Stone (GRS). Enable the integration of this information in the near future. Van Brabant et al. OMICS. 2008
  • 17. Genomic Contextual Data Markup Language (GCDML). An Extensible Markup Language (XML). Aim: implement MIGS/MIMS; provide even more descriptors; facilitate exchange and integration of genomic data. Kottmann et al. OMICS. 2008
  • 18. GCDML Example (excerpt): <gcdml:originalSample> <gcdml:physicalMaterial> <gcdml:samplingTime><gcdml:notGiven>unknown</gcdml:notGiven></gcdml:samplingTime> <gcdml:samplePointLocation> <gml:LocationKeyWord>Baltic Sea</gml:LocationKeyWord> <gml:LocationString>Kiel Fjord, Baltic Sea, Germany</gml:LocationString> <gcdml:pos2D>54.329 10.149</gcdml:pos2D> <gcdml:determinationMethod>derived from literature</gcdml:determinationMethod> </gcdml:samplePointLocation> <gcdml:marineHabitat> <gcdml:waterBody> <gcdml:depth> <gcdml:measure min="0.00" max="0.05"><gcdml:values uom="m">0.00 0.05</gcdml:values></gcdml:measure> </gcdml:depth> </gcdml:waterBody> </gcdml:marineHabitat> <gcdml:materialType>seawater</gcdml:materialType> <gcdml:amount><gcdml:measure><gcdml:values uom="ml">100</gcdml:values></gcdml:measure></gcdml:amount> </gcdml:physicalMaterial> </gcdml:originalSample> Kottmann et al. OMICS. 2008
  • 21. Genome Catalogue. Online system for capturing MIGS/MIMS-compliant reports. Field et al. Nature 2008
  • 22. Genome Catalogue Requirements: rich toolkit, user-friendly; designed to give credit to all contributors; XML-based (GCDML), able to maintain all versions of GCDML schemas; web services-based, supporting the automated exchange of content; serves as the international GCAT identifier authority; comprehensive, containing reports for all taxa and metagenomes; ontology-supportive; shared by the GSC.
  • 23. Current Status. We have specifications: MIGS/MIMS; Habitat-Lite; Genomic Rosetta Stone. Work on supporting software is ongoing: Genomes Catalogue is in prototype status. Funding: this is a long-term endeavour that cannot be done on a voluntary basis.
  • 24. Discussion. Software is needed for: creation of MIGS/MIMS data; storage; analysis. Expand standardization efforts to: software specification/development; work on a standardized genomic data management architecture / cyberinfrastructure. Data-intensive science is successful if it works towards one community with one vision: World Wide Genomics project.
  • 25. Acknowledgements. All members of GSC, incl. Dawn Field, Peter Sterk, Saul Kravitz, Tanya Gray. Megx.net team: Frank Oliver Glöckner, Ivaylo Kostadinov, Melissa Beth Duhaime, Pier Luigi Buttigieg, Wolfgang Hankeln, Pelin Yilmaz.
  • 26. END. Looking forward to the discussion. Join the GSC: http://gensc.org
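
The GCDML excerpt on slide 18 can be read with any standard XML parser. The following is a minimal Python sketch, not part of the original talk, using a shortened copy of that excerpt. The slide shows no namespace declarations, so the `gcdml` namespace URI below is a placeholder and the `gml` URI is assumed; `read_sample` is a hypothetical helper name, and reading `pos2D` as latitude followed by longitude is also an assumption.

```python
import xml.etree.ElementTree as ET

# Assumption: these URIs are not given on the slide. The gcdml URI is a
# placeholder; the gml URI is assumed to be the OGC GML namespace.
NS = {
    "gcdml": "urn:example:gcdml",
    "gml": "http://www.opengis.net/gml",
}

# Shortened copy of the slide-18 excerpt with namespace declarations added.
EXCERPT = """
<gcdml:originalSample xmlns:gcdml="urn:example:gcdml"
                      xmlns:gml="http://www.opengis.net/gml">
  <gcdml:physicalMaterial>
    <gcdml:samplePointLocation>
      <gml:LocationString>Kiel Fjord, Baltic Sea, Germany</gml:LocationString>
      <gcdml:pos2D>54.329 10.149</gcdml:pos2D>
    </gcdml:samplePointLocation>
    <gcdml:marineHabitat>
      <gcdml:waterBody>
        <gcdml:depth>
          <gcdml:measure min="0.00" max="0.05"><gcdml:values uom="m">0.00 0.05</gcdml:values></gcdml:measure>
        </gcdml:depth>
      </gcdml:waterBody>
    </gcdml:marineHabitat>
    <gcdml:materialType>seawater</gcdml:materialType>
  </gcdml:physicalMaterial>
</gcdml:originalSample>
"""


def read_sample(xml_text):
    """Extract location, position, depth range and material type."""
    root = ET.fromstring(xml_text.strip())
    material = root.find("gcdml:physicalMaterial", NS)
    location = material.find("gcdml:samplePointLocation", NS)

    # Assumption: pos2D holds "latitude longitude" in decimal degrees.
    lat, lon = (float(v) for v in location.findtext("gcdml:pos2D", namespaces=NS).split())

    # The depth values element carries the unit in its "uom" attribute.
    depth = material.find(".//gcdml:depth/gcdml:measure/gcdml:values", NS)

    return {
        "location": location.findtext("gml:LocationString", namespaces=NS),
        "latitude": lat,
        "longitude": lon,
        "depth": tuple(float(v) for v in depth.text.split()),
        "depth_unit": depth.get("uom"),
        "material": material.findtext("gcdml:materialType", namespaces=NS),
    }


if __name__ == "__main__":
    print(read_sample(EXCERPT))
```

A real GCDML document would declare its own namespaces, so the placeholder URI in `NS` would need to be replaced with the one from the published schema.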