Data integration in ENFIN using standards
The EnCore DAS service
7–9 April 2010
Rafael Jimenez
rafael@ebi.ac.uk
EnCORE
presentation
Genesis 11:1-9
1 And the whole earth was of one language, and of one speech. 2 And it
came to pass, as they journeyed from the east, that they found a plain in the
land of Shinar; and they dwelt there. 3 And they said one to another, Go to,
let us make brick, and burn them thoroughly. And they had brick for stone,
and slime had they for mortar. 4 And they said, Go to, let us build us a
city and a tower, whose top may reach unto heaven;
and let us make us a name, lest we be scattered abroad upon the face of the
whole earth. 5 And the Lord came down to see the city and the tower, which
the children built. 6 And the Lord said, Behold, the people is one, and they
have all one language; and this they begin to do; and now nothing will be
restrained from them, which they have imagined to do. 7 Go to, let us go
down, and there confound their language, that they
may not understand one another's speech. 8 So the Lord
scattered them abroad from thence upon the face of all the earth: and they
left off to build the city. 9 Therefore is the name of it called Babel; because
the Lord did there confound the language of all the earth: and from thence did
the Lord scatter them abroad upon the face of all the earth.
People
God
Tower of Babel
Pieter Bruegel the Elder
Diverse service world
External data sources
Access interfaces: SOAP, REST, Java API, Perl API, FTP, GUI, …
Different formats: XML, CSV, Plain Text, JSON, …
User: integration?
• Multiple manual connections
• Multiple technologies
• Multiple result files which have to be combined manually
• Much work to reproduce
Utility of bioinformatics
Scientific impact
Too little
bioinformatics
Too many databases
Too diverse interfaces
Tim Hubbard
Integration of
ENFIN Network of Excellence
• Brings together experimentalists and computational biologists to develop the next generation of informatics resources for systems biology
• Funded by the European Commission within its FP6 programme under the thematic area ‘Life sciences, genomics and biotechnology for health’
• 20 partners in 13 countries
• www.enfin.org
EnCore
EnCore
• ENFIN platform to enable mining data across various domains, sources, formats and types
• Integrates database resources and analysis tools across different disciplines
EnXML
EnCORE services
EnVISION pages
Standard EnXML format
User
input output
SOAP
Standardised EnCORE world
Heterogeneous external world
Standardised EnCORE world
EnXML
External data sources
EnCORE services
EnVISION pages
API, WS access
Standard EnXML format
User
input output
EnCORE services
From Inputs to Outputs
Positive Negative
Input/Query
Output/Results
Program/Service
EnCORE dataset
EnCORE
results
EnCORE webservice
• Enfin-IntAct
• Enfin-PRIDE
• Enfin-Affy2UniProt
• Enfin-PICR
• Enfin-Reactome
• Enfin-ArrayExpress
• Enfin-UniProt
• Enfin-BioModels
• Enfin-KEGG
• Enfin-G:GOSt
• Enfin-CellMINT
• Enfin-DOMAINATION
• Enfin-FuncNet
• Enfin-molecularInteractions
• Enfin-proteinAnnotations
• Database IDs
• Sequences
• Experiment: Identifies the result
• Sets: Contains the structure of the result
• Molecules: Includes the results
• Features: Describe details of the result
EnXML structure
• Experiment: Identifies the result
• Set: Contains the structure of the result
• Molecule: Includes the results
• Feature: Describes details of the result
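The four EnXML levels can be pictured as nested records. The following is a minimal Python sketch, not the actual EnXML schema: the class and field names (`EntitySet`, `set_id`, `accession`, `description`) and the nesting of features under molecules and molecules under sets are assumptions made only to mirror the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Feature:
    # Hypothetical field: a detail describing part of the result.
    description: str


@dataclass
class Molecule:
    # One result entry, for example a UniProt accession.
    accession: str
    features: List[Feature] = field(default_factory=list)


@dataclass
class EntitySet:
    # Groups molecules and so carries the structure of the result.
    set_id: str
    molecules: List[Molecule] = field(default_factory=list)


@dataclass
class Experiment:
    # Identifies one result as a whole.
    experiment_id: str
    sets: List[EntitySet] = field(default_factory=list)

    def accessions(self) -> List[str]:
        """Return every molecule accession in document order."""
        return [m.accession for s in self.sets for m in s.molecules]


# Illustrative values only.
demo = Experiment("ID4", [EntitySet("set-1", [Molecule("O35613")])])
```

Keeping the same four levels for every service is what would let a client read results from different sources with one piece of code.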
EnCORE services
Example
Positive Negative
Input/Query
Output/Results
Program/Service
EnCORE dataset
EnCORE
results
EnCORE webservice
• EnCORE webservice: Enfin-IntAct
• Database ID (UniProt ID): P37173
• Experiment: ID4
• Sets: (1) EBI-296235, (2) EBI-1033040, (3) EBI-902913, EBI-902937, (4) EBI-296166, EBI-296246, (5) EBI-902913
• Molecules: (1) O35613, (2) P10600, (3) P07200, (4) Q9UER7, (5) Q99K41
• Features: No features
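One way to read the example result is to pair each numbered set with the molecule carrying the same number. The Python sketch below does exactly that with the values from the slide; the pairing rule itself and the function name `partners_with_sets` are assumptions, not something the slide states.

```python
# Values copied from the slide; pairing by number is an assumption.
QUERY = "P37173"
SETS = {
    1: ["EBI-296235"],
    2: ["EBI-1033040"],
    3: ["EBI-902913", "EBI-902937"],
    4: ["EBI-296166", "EBI-296246"],
    5: ["EBI-902913"],
}
MOLECULES = {1: "O35613", 2: "P10600", 3: "P07200", 4: "Q9UER7", 5: "Q99K41"}


def partners_with_sets(sets, molecules):
    """Map each molecule accession to the set identifiers sharing its number."""
    return {molecules[number]: ids for number, ids in sorted(sets.items())}


evidence = partners_with_sets(SETS, MOLECULES)
for accession, ids in evidence.items():
    print(f"{QUERY} / {accession}: {', '.join(ids)}")
```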
EnCORE services
Building workflows
Legend: Input, Result, Positive result, Negative result, Webservice, Input selection
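The workflow idea can be sketched as a chain in which the results of one service are selected as the input of the next. This Python sketch is illustrative only: it assumes that a "positive" input is one for which a service returns data and a "negative" input is one for which it returns nothing, and the helper names `run_service` and `run_workflow` are hypothetical, not part of EnCORE.

```python
from typing import Callable, Dict, List, Tuple

# A service is modelled as a function from one identifier to a list of results.
Service = Callable[[str], List[str]]


def run_service(service: Service, inputs: List[str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split inputs into positive (with results) and negative (without)."""
    positive: Dict[str, List[str]] = {}
    negative: List[str] = []
    for item in inputs:
        results = service(item)
        if results:
            positive[item] = results
        else:
            negative.append(item)
    return positive, negative


def run_workflow(services: List[Service], inputs: List[str]) -> List[str]:
    """Feed the results of each step into the next step."""
    current = inputs
    for service in services:
        positive, _negative = run_service(service, current)
        # Input selection (assumed): take all results, de-duplicated, in order.
        current = list(dict.fromkeys(r for rs in positive.values() for r in rs))
    return current


# Toy services backed by dictionaries; the identifiers are placeholders.
step1: Service = lambda x: {"A": ["B", "C"]}.get(x, [])
step2: Service = lambda x: {"B": ["D"]}.get(x, [])
```

Here `run_workflow([step1, step2], ["A", "X"])` keeps only what survives both steps, while `run_service` reports which inputs gave no result at each step.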
Adapting EnCORE to Standards and Federation
Molecular Biology Database resources
~1440 resources
• Human Genes and Diseases: 14%
• Proteomics Resources (20): 0%
• Other Molecular Biology Databases: 3%
• Immunological databases: 2%
• Plant databases: 8%
• Organelle databases: 2%
• Human and other Vertebrate Genomes: 8%
• Nucleotide Sequence Databases: 9%
• RNA sequence databases
• Protein sequence databases
• Structure Databases: 9%
• Genomics Databases (non-vertebrate)
• Metabolic and Signaling Pathways: 9%
Source: MY Galperin, GR Cochrane, "Nucleic Acids Research annual Database Issue and the NAR online Molecular Biology Database Collection in 2009", Nucleic Acids Research, 2008
~1440 resources
Traditional EnCore approach
Domain 1, Domain 2, Domain 3, Domain 4, Domain 5, Domain …
New EnCore approach
Standards and Federation
Domain 1
External data sources
Federated systems / Standards
EnVISION pages
WS
WS
Web interface
EnCORE wrapper
New EnCore approach
Standards and Federation
Domain 1, Domain 2, Domain 3, Domain 4, Domain 5, Domain …
New EnCore approach
Standards and Federation
• Less development
• More sources
• Domain data integration
• Comparable results
• Automatic inclusion of new data sources
• Less maintenance
• More stable formats
• Easy to control changes
• Facilitates validation
• Extra value to the original data
New role for EnCore and EnVision
Extra value to the original data
• Integration of sources
• Filtering redundancy (whenever possible)
• Interconnecting results
• Data analysis
• More visualization
Domain 1, Domain 2, Domain 3, Domain 4, Domain 5, Domain …
Standards and Federation in EnCORE
EnCore DAS service
for protein sequence annotations
Diagram labels:
• Protein DAS annotation sources (several)
• Uniprot DAS reference source
• Uniprot DAS annotation source
• Protein information
• Protein feature information
• Experiment, Set, Molecule, Feature
• Service:
  • Name: uniprot2proteinannotations
  • URL: http://www.ebi.ac.uk/enfin-srv/encore/uniprot2proteinannotations/service
  • Input: List of UniProt accession numbers
  • Options: DAS sources to query
    • Direct input (DAS feature URL) [0,*]
    • Registry LABEL [0,1]
    • Registry source URI (DS_XXX) [0,*]
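For orientation, a DAS 1.x annotation source is queried with a `features` command that takes the sequence identifier as its `segment` parameter. The Python sketch below models the three option kinds with their cardinalities and builds such requests for the direct-input case. It does not call the EnCore service itself; the class `DasSourceOptions`, the helper names and the `example.org` source URL are hypothetical, and treating a direct input as a source URL that may or may not already end in `/features` is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlencode


@dataclass
class DasSourceOptions:
    # Cardinalities follow the slide: [0,*], [0,1], [0,*].
    direct_urls: List[str] = field(default_factory=list)
    registry_label: Optional[str] = None
    registry_uris: List[str] = field(default_factory=list)


def das_features_url(source_url: str, accession: str) -> str:
    """Build a DAS 'features' request for one protein accession."""
    base = source_url.rstrip("/")
    if not base.endswith("/features"):
        base += "/features"
    return f"{base}?{urlencode({'segment': accession})}"


def direct_requests(options: DasSourceOptions, accessions: List[str]) -> List[str]:
    """One request per (direct source, accession) pair."""
    return [das_features_url(u, a) for u in options.direct_urls for a in accessions]


# Hypothetical source URL, for illustration only.
opts = DasSourceOptions(direct_urls=["http://example.org/das/demo_source"])
urls = direct_requests(opts, ["P37173", "P07200"])
```

Resolving a registry label or a registry source URI to concrete source URLs would need a registry lookup, which is left out of this sketch.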
EnVision
EnVision interface
Input Form
Default
workflow
BioModels
CellMint
IntAct
Reactome
PICR
Pride
Query: P07200, Q99K41, P37173, P37023, Q13131, A3QNQ0, Q9Y6C2, P98170, A2AI38, Q8CGZ0, Q13287, Q8WTW2, P61812
Example
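Before a query like the one above is sent to the services, it can help to split it and check that each token looks like a UniProt accession. The sketch below is a possible pre-check in Python, not part of EnVision; the function name `parse_accessions` is hypothetical, and the regular expression follows the accession format published by UniProt.

```python
import re

# Accession pattern as documented by UniProt (6 or 10 characters).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

QUERY = (
    "P07200,Q99K41,P37173,P37023,Q13131,A3QNQ0,Q9Y6C2,"
    "P98170,A2AI38,Q8CGZ0,Q13287,Q8WTW2,P61812"
)


def parse_accessions(text: str):
    """Split on commas or whitespace, de-duplicate, and separate valid from invalid tokens."""
    tokens = [t.strip().upper() for t in re.split(r"[,\s]+", text) if t.strip()]
    unique = list(dict.fromkeys(tokens))
    valid = [t for t in unique if UNIPROT_ACCESSION.fullmatch(t)]
    invalid = [t for t in unique if not UNIPROT_ACCESSION.fullmatch(t)]
    return valid, invalid


valid, invalid = parse_accessions(QUERY)
```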
EnVision interface
• Results for PRIDE, UniProt, IntAct, Reactome, CellMINT, PICR, BioModels, …
http://www.ebi.ac.uk/~rafael/enfin/presentations/EnVISION2_01.ppt
http://www.enfin.org/dokuwiki/
EnCORE
tutorial
Results per service
Example
EnVISION Pathways result
Positive results
Negative results
Example
Integrating standards into EnCore/EnVision
Molecular interactions service
Preview
Thank you!
Questions?
ENFIN partners:
• Pascal Kahlem (project coordinator)
• Bernd Brandt (IBIVU)
• Christine Orengo (UCL)
• Andrew Clegg (UCL)
• Ioannis Xenarios (SIB)
• Heinz Stockinger (SIB)
• Jaak Vilo (QURETEC)
• Jüri Reimand (QURETEC)
• Gianni Cesareni (UNITOR)
• Arnaud Ceol (UNITOR)
• James Procter (UNIVDUN)
• Ana Rojas Mendoza (CNIO)

Data integration in ENFIN using standards. The EnCore DAS service.

  • 1. Data integration in ENFIN using standards The EnCore DAS service 7–9 April 2010 Rafael Jimenez rafael@ebi.ac.uk EnCORE presentation
  • 2. Genesis 11:1-9 1 And the whole earth was of one language, and of one speech. 2 And it came to pass, as they journeyed from the east, that they found a plain in the land of Shinar; and they dwelt there. 3 And they said one to another, Go to, let us make brick, and burn them thoroughly. And they had brick for stone, and slime had they for mortar. 4 And they said, Go to, let us build us a city and a tower, whose top may reach unto heaven; and let us make us a name, lest we be scattered abroad upon the face of the whole earth. 5 And the Lord came down to see the city and the tower, which the children built. 6 And the Lord said, Behold, the people is one, and they have all one language; and this they begin to do; and now nothing will be restrained from them, which they have imagined to do. 7 Go to, let us go down, and there confound their language, that they may not understand one another's speech. 8 So the Lord scattered them abroad from thence upon the face of all the earth: and they left off to build the city. 9 Therefore is the name of it called Babel; because the Lord did there confound the language of all the earth: and from thence did the Lord scatter them abroad upon the face of all the earth. People God
  • 3. Tower of Babel Pieter Bruegel the Elder
  • 4. Diverse service world SOAP, REST, Java API, Perl API, FTP, GUI, … External data sources Different formats Access interfaces User ?integration • Multiple manual connections • Multiple technologies • Multiple result files which have to be combined manually • Much work to reproduce XML, CSV, Plain Text, JSON, …
  • 5. 23.08.185 Utility of bioinformaticsScientificimpact Too little bioinformatics Too many databases Too diverse interfaces Tim Hubbard
  • 6. 23.08.18 6 Utility of bioinformatics Scientificimpact Too little bioinformatics Too many databases Too diverse interfaces Integration of
  • 7. ENFIN Network of Excellence • Brings together experimentalists and computational biologists to develop the next generation of informatics resources for systems biology • Funded by the European Commission within its FP6 programme under the thematic area ‘Life sciences, genomics and biotechnology for health’ • 20 partners in 13 countries • www.enfin.org EnCore
  • 8. • Brings together experimentalists and computational biologists to develop the next generation of informatics resources for systems biology • Funded by the European Commission within its FP6 programme under the thematic area ‘Life sciences, genomics and biotechnology for health’ • 20 partners in 13 countries • www.enfin.org ENFIN Network of Excellence
  • 9. EnCore • ENFIN Platform to enable mining data across various domains, sources, formats and types • Integrates database resources and analysis tools across different disciplines EnXML EnCORE services EnVISION pages Standard EnXML format User input output SOAP
  • 10. Standardized EnCORE world Heterogeneous external world Standardised EnCORE world EnXML External data sources EnCORE services EnVISION pages API, WS access Standard EnXML format User input output
  • 11. EnCORE services From Inputs to Outputs Positive Negative Input/Query Output/Results Program/Service EnCORE dataset EnCORE results EnCORE webservice • Enfin-IntAct • Enfin-PRIDE • Enfin-Affy2UniProt • Enfin-PICR • Enfin-Reactome • Enfin-ArrayExpress • Enfin-UniProt • Enfin-BioModels • Enfin-KEGG • Enfin-G:GOSt • Enfin-CellMINT • Enfin-DOMAINATION • Enfin-FuncNet • Enfin-molecularInteractions • Enfin-proteinAnnotations • Database IDs • Sequences • Experiment: Identifies the result • Sets: Contains the structure of the result • Molecules: Includes the results • Features: Describe details of the result
  • 12. Experiment Set Molecule Feature EnXML structure Identifies the result Contains the structure of the result Includes the results Describe details of the result
  • 13. EnCORE services Example Positive Negative Input/Query Output/Results Program/Service EnCORE dataset EnCORE results EnCORE webservice • Encore webservice Enfin-IntAct • Database ID (Uniprot ID) P37173 • Experiment: ID4 • Sets: (1)EBI-296235, (2)EBI-1033040, (3) EBI- 902913, EBI-902937, (4) EBI-296166, EBI-296246, (5)EBI-902913 • Molecules: (1)O35613, (2)P10600, (3)P07200, (4)Q9UER7, (5)Q99K41 • Features: No features
  • 14. EnCORE services Building workflows Input Result Positive result Negative resultWebservice Input selection
  • 15. ENFIN Network of Excellence • Brings together experimentalists and computational biologists to develop the next generation of informatics resources for systems biology • Funded by the European Commission within its FP6 programme under the thematic area ‘Life sciences, genomics and biotechnology for health’ • 20 partners in 13 countries • www.enfin.org EnCore Adapting EnCORE to Standards and Federation
  • 16. Molecular Biology Database resources Human Genes and Diseases 14% Proteomics Resources (20) 0% Other Molecular Biology Databases 3% Immunological databases 2% Plant databases 8% Organelle databases 2% Human and other Vertebrate Genomes 8% Nucleotide Sequence Databases 9% RNA sequence databases Protein sequence databases Structure Databases 9% Genomics -Databases (non (vertebrate Metabolic and Signaling Pathways 9% Nucleic Acids Research annual Database Issue and the NAR online Molecular Biology Database Collection in 2009MY Galperin, GR Cochrane - Nucleic Acids Research, 2008 ~1440 resources
  • 17. ~1440 resources Traditional EnCore approach Domain 5 Domain …Domain 4 Domain 2 Domain 3Domain 1
  • 18. New EnCore approach Standards and Federation Domain 1 External data sources Federated systems / Standards EnVISION pages WS WS Web interface EnCORE wrapper
  • 19. New EnCore approach Standards and Federation Domain 5 Domain …Domain 4 Domain 2 Domain 3Domain 1
  • 20. New EnCore approach Standards and Federation • Less development • More sources • Domain data integration • Comparable results • Automatic inclusion of new data sources • Less maintenance • More stable formats • Easy to control changes • Facilitates validation • Extra value to the original data
  • 21. New role for EnCore and EnVision Extra value to the original data • Integration of sources. • Filtering redundancy (whenever possible) • Interconnect results. • Data analysis • More visualization Domain 5 Domain …Domain 4 Domain 2 Domain 3Domain 1
  • 23. EnCore DAS service for protein sequence annotations Protein DAS annotation sources Protein DAS annotation sources Experiment Set Molecule Feature Uniprot DAS reference source Uniprot DAS annotation source Protein information Protein feature information Protein DAS annotation sources Protein DAS annotation sources Protein DAS annotation sources • Service: • Name: uniprot2proteinannotations • URL: http://www.ebi.ac.uk/enfin-srv/encore/uniprot2proteinannotations/service • Input: List of Uniprot Acc numbers • Options: DAS Sources to query • Direct input (DAS feature URL) [0,*] • Registry LABEL [0,1] • Registry source URI (DS_XXX) [0,*]
  • 24. ENFIN Network of Excellence • Brings together experimentalists and computational biologists to develop the next generation of informatics resources for systems biology • Funded by the European Commission within its FP6 programme under the thematic area ‘Life sciences, genomics and biotechnology for health’ • 20 partners in 13 countries • www.enfin.org EnVision
  • 26. EnVISION interface • Results for PRIDE, UniProt, IntAct, Reactome, CellMint, PICR, BioModels, … http://www.ebi.ac.uk/~rafael/enfin/presentations/EnVISION2_01.ppt http://www.enfin.org/dokuwiki/ EnCORE tutorial Results per service Example
  • 27. EnVISION Pathways result Positive results Negative results Example
  • 28. Integrating standards into EnCore/EnVision Molecular interactions service Preview
  • 29. Integrating standards into EnCore/EnVision Molecular interactions service Preview
  • 30. Integrating standards into EnCore/EnVision Molecular interactions service Preview
  • 31. Thank you! Questions? ENFIN partners: • Pascal Kahlem (project coordinator) • Bernd Brandt (IBIVU) • Christine Orengo (UCL) • Andrew Clegg (UCL) • Ioannis Xenarios (SIB) • Heinz Stockinger (SIB) • Jaak Vilo (QURETEC) • Jüri Reimand (QURETEC) • Gianni Cesareni (UNITOR) • Arnaud Ceol (UNITOR) • James Procter (UNIVDUN) • Ana Rojas Mendoza (CNIO)
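
Slide 23 describes a service that takes a list of UniProt accession numbers plus a set of DAS sources to query. As a rough illustration only, the sketch below shows how one DAS `features` request URL per source and accession could be built on the client side. It is not the EnCore implementation: the function name `das_feature_urls` and the `example.org` source URL are assumptions made for this example, and the only convention borrowed from DAS is the `features?segment=<id>` request form.

```python
from urllib.parse import quote


def das_feature_urls(source_urls, accessions):
    """Return one DAS 'features' request URL per (source, accession) pair.

    source_urls: base URLs of DAS annotation sources (hypothetical here).
    accessions:  protein accession numbers used as DAS segment ids.
    """
    urls = []
    for source in source_urls:
        base = source.rstrip("/")  # tolerate a trailing slash on the source URL
        for acc in accessions:
            # One segment per request keeps the sketch independent of how
            # a server expects multiple segments to be separated.
            urls.append(f"{base}/features?segment={quote(acc)}")
    return urls


# Hypothetical DAS source URL, used only for illustration.
example_sources = ["http://example.org/das/uniprot/"]
example_urls = das_feature_urls(example_sources, ["P12345", "Q9Y6K9"])
for url in example_urls:
    print(url)
```

Adding another annotation source then only means adding another base URL to the list, which is the point the slides make about federation through a shared standard.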

Editor's Notes

  1. The Tower of Babel - Story Summary: Up until this point in the Bible, the whole world had one language - one common speech for all people. The people of the earth became skilled in construction and decided to build a city with a tower that would reach to heaven. By building the tower they wanted to make a name for themselves and also prevent their city from being scattered. God came to see their city and the tower they were building. He perceived their intentions, and in His infinite wisdom, He knew this "stairway to heaven" would only lead the people away from God. He noted the powerful force within their unity of purpose. As a result, God confused their language, causing them to speak different languages so they would not understand each other. By doing this, God thwarted their plans. He also scattered the people of the city all over the face of the earth. God says in Genesis 11:6, "If as one people speaking the same language they have begun to do this, then nothing they plan to do will be impossible for them." (NIV) God realizes that when people are unified in purpose they can accomplish impossible feats, both noble and ignoble. This is why unity in the body of Christ is so important.
  2. We are exposed to a very diverse service world
  3. The idea behind EnCORE is simplified in this picture. The input (our query) is contained in an XML standard format called EnXML. We can run different services over this input, and we get results contained in the same EnXML format. The outputs can be used as inputs to other services.
  4. This is a generic example of how an EnCORE service works.
  5. A specific example: the query is a protein Acc. We run the IntAct service and get the interaction results, defined using the EnXML terminology.
  6. EnCORE facilitates building workflows
  7. EnVISION results are nice, but do not forget our initial integration problem. For one domain (protein interaction, pathways, protein sequence …) we might have several databases providing data.
  8. EnCORE provides a great solution; however, it is not complete if it cannot include more resources. It is not feasible for EnCORE to develop and maintain so many wrappers. Nonetheless, EnCORE can overcome this problem by using standards and federated systems.
  9. EnVISION is an EnCORE interface. With just one click the user can run different services and get a quick overview of a dataset. This example shows results for …
  10. Here is an example of the potential of EnVISION. In this example we used a dataset of more than 300 protein Acc. In this screenshot EnVISION was able to find more than 500 pathways for this dataset. EnVISION can link and display positive results in a pathway map.
  11. Integration of biological data of various types and development of adapted bioinformatics tools represent critical objectives to enable research at the systems level. The European Network of Excellence ENFIN is engaged in developing an adapted infrastructure to connect databases and platforms, to enable both the generation of new bioinformatics tools and the experimental validation of computational predictions. Beyond the use of common standards to format individual datasets, there is a need for sophisticated informatics platforms to enable mining data across various domains, sources, formats and types. The aim of the EnCORE project is to integrate an extensive list of database resources and analysis tools across different disciplines in a computationally accessible and extensible manner, facilitating automated data retrieval and processing with a special focus on systems biology. The EnCORE platform is available as a collection of web services with a common standard format that is easy to integrate into workflow management software such as Taverna. Additionally, EnCORE services are accessible through EnVISION, a web graphical user interface providing elaborated information such as molecular interactions, biological pathways and computational models of pathways.
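
Notes 3 to 6 describe services that read and write the same format, so the output of one can be fed to the next. The sketch below illustrates that idea in miniature. It is not EnCORE code: `EnDocument` is a hypothetical stand-in for an EnXML document, `make_service` and `run_workflow` are names invented for this example, and the lookup tables hold toy data rather than real interaction or pathway results.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EnDocument:
    """Hypothetical stand-in for an EnXML document: a query plus results so far."""
    accessions: List[str]
    results: Dict[str, List[str]] = field(default_factory=dict)


# Every service has the same shape: document in, document out.
Service = Callable[[EnDocument], EnDocument]


def make_service(name: str, lookup: Dict[str, List[str]]) -> Service:
    """Build a toy service that annotates each accession from a lookup table."""
    def service(doc: EnDocument) -> EnDocument:
        hits = [hit for acc in doc.accessions for hit in lookup.get(acc, [])]
        merged = dict(doc.results)  # keep results from earlier services
        merged[name] = hits
        return EnDocument(list(doc.accessions), merged)
    return service


def run_workflow(doc: EnDocument, services: List[Service]) -> EnDocument:
    """Chain services: the output of each one is the input of the next."""
    for service in services:
        doc = service(doc)
    return doc


# Toy data only; identifiers below are placeholders, not real results.
interactions = make_service("interactions", {"P12345": ["P12345-Q00001"]})
pathways = make_service("pathways", {"P12345": ["pathway-A"]})
final = run_workflow(EnDocument(["P12345"]), [interactions, pathways])
print(final.results)
```

Because every step shares one document type, services can be reordered or added without changing the others, which is what makes such services straightforward to combine in a workflow tool.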