OSFair2017 Workshop | OmicsDI: Omics discovery index

Rafael C Jimenez presents the Omics Discovery Index | OSFair2017 Workshop

Workshop title: How FAIR friendly is your data catalogue?

Workshop overview:
This workshop will build upon the work planned by the EOSCpilot data interoperability task and the BlueBridge workshop held on April 3 at the RDA meeting. We will investigate common mechanisms for interoperation of data catalogues that preserve established community standards, norms and resources, while simplifying the process of being/becoming FAIR. Can we have a simple interoperability architecture based on a common set of metadata types? What are the minimum metadata requirements to expose FAIR data to EOSC services and EOSC users?

DAY 3 - PARALLEL SESSION 6 & 7

Published in: Science

  1. OmicsDI: Omics discovery index. Rafael C Jimenez. www.elixir-europe.org, @ELIXIREurope, /company/elixir-europe
  2. How can we find a publication?
  3. How can we find a dataset?
  4. Aims
       • Facilitate discovery of high-quality ‘omics’ datasets (genomics, proteomics, transcriptomics and metabolomics).
       • Provide added value, not just integration.
       • Provide the infrastructure to register and integrate data repositories and their datasets.
  5. http://www.omicsdi.org: 15 repositories, 90,880 datasets, 3,840 diseases, 2,781 tissues, 4,605 species
  6. Dataset search
  7. Similarity visualization: metadata similarity and biological similarity
  8. Dataset metadata format
       Mandatory fields:
       • Dataset Id
       • Dataset Title
       • Publication date
       • Submitter information (Name, Affiliation)
       • Original URL
       Recommended fields:
       • Description/Abstract
       • Sample and Data Protocols
       • PubMed Id
       • Organism, Tissue, Disease
       Additional fields:
       • Protein Id (Ensembl or UniProt)
       • Metabolite Id (ChEMBL)
       • More…
       https://github.com/BD2K-DDI/specifications/blob/master/docs/schema/fileds.md
  9. [Architecture diagram. Labels: Databases; OmicsDI XML; EBI Search Indexer (indexing engine, EBI cluster); REST WS (search engine); RESTful WS with endpoints for Statistics and Datasets; Database; Web app (search, statistics, tagging); DDIApp (enrichment, annotation)]
  10. Metadata processing
       • Input data validation: https://github.com/BD2K-DDI/xml-validator
       • Metadata enrichment: https://github.com/BD2K-DDI/ddi-annotation
       • Identifier mapping, e.g. DOI to PubMed IDs
       • Ontology mapping, e.g. text to NCBI Taxonomy
       • Ontology enrichment, e.g. disease synonyms and parent terms (annotator)
       • Analysis of dataset entries, e.g. overlap of biological entries between datasets
  11. Thanks for your attention
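
The mandatory fields on slide 8 and the dataset-overlap analysis on slide 10 can be illustrated with a minimal Python sketch. It is not the OmicsDI implementation: the field keys in `MANDATORY_FIELDS`, both function names, and the sample record are hypothetical, and the Jaccard index is only one assumed way to measure "overlap of biological entries between datasets".

```python
from typing import Dict, List, Set

# Hypothetical keys standing in for the mandatory fields on slide 8;
# the actual OmicsDI schema may name them differently.
MANDATORY_FIELDS = (
    "dataset_id",
    "title",
    "publication_date",
    "submitter_name",
    "submitter_affiliation",
    "original_url",
)


def missing_mandatory_fields(record: Dict[str, str]) -> List[str]:
    """Return the mandatory keys that are absent or blank in `record`."""
    return [
        key for key in MANDATORY_FIELDS
        if not str(record.get(key) or "").strip()
    ]


def entry_overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two identifier sets (assumed overlap measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Made-up record for illustration only.
    record = {
        "dataset_id": "EXAMPLE-1",
        "title": "Example dataset",
        "publication_date": "2017-09-08",
        "submitter_name": "A. Researcher",
        "original_url": "https://example.org/EXAMPLE-1",
    }
    print(missing_mandatory_fields(record))  # ['submitter_affiliation']
    # Two made-up protein identifier sets sharing one entry.
    print(entry_overlap({"ID_A", "ID_B"}, {"ID_B", "ID_C"}))
```

A record that passes the mandatory check could then go through the enrichment steps on slide 10, and the pairwise overlap scores could feed a view like the biological similarity one on slide 7.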
