Data Analysis and Knowledge Management using BioXM in MeDALL, AirPROM and Synergy-COPD

Presentation made during the EISBM workshop, 13-15 June 2012 by Stephane Ballereau (EISBM) and Dieter Maier (Biomax).

  1. Supported by • Prominent international speakers from • http://workshop.eisbm.eu
  2. WP 6: Epigenetics and targeted proteomics • Stephane Ballereau • EISBM Workshop
  3. Epidemiology of allergy
  4. MeDALL aims • identify causes for allergy, e.g. asthma and atopic dermatitis • in particular in childhood • to improve current diagnostic and prevention tools • using a systems biology approach • combining various types of biomarker profiles or fingerprints into handprints
  5. MeDALL cohorts
  6. Classical approach • Birth cohorts • Novel approach • IgE arrays and follow-up • Integration of all data for determination of biomarkers for early diagnosis, prevention and identification of targets for therapy of allergy
  7. Classical approach • Birth cohorts • Novel approach • IgE arrays and follow-up • Analysis of risk factors and GxE • Classical phenotypes defined by experts • Integration of all data for determination of biomarkers for early diagnosis, prevention and identification of targets for therapy of allergy
  8. Classical approach • Birth cohorts • Novel approach • IgE arrays and follow-up • Analysis of risk factors and GxE • Classical phenotypes defined by experts • Novel phenotypes defined using statistical methods • Integration of all data for determination of biomarkers for early diagnosis, prevention and identification of targets for therapy of allergy
  9. Classical approach • Birth cohorts • Novel approach • IgE arrays and follow-up • Analysis of risk factors and GxE • Classical phenotypes defined by experts • Novel phenotypes defined using statistical methods • Selection of extreme phenotypes • Genetics • Epigenetics • Transcriptomics • Targeted proteomics • Ig arrays • Integration of all data for determination of biomarkers for early diagnosis, prevention and identification of targets for therapy of allergy
  10. Classical approach • Birth cohorts • Novel approach • IgE arrays and follow-up • Analysis of risk factors and GxE • Classical phenotypes defined by experts • Novel phenotypes defined using statistical methods • Selection of extreme phenotypes • Genetics • Epigenetics • Transcriptomics • Targeted proteomics • Ig arrays • Karelia cross-sectional study, Finland and Russia • Integration of all data for determination of biomarkers for early diagnosis, prevention and identification of targets for therapy of allergy
  11. Classical approach • Birth cohorts • Novel approach • IgE arrays and follow-up • Analysis of risk factors and GxE • Classical phenotypes defined by experts • Novel phenotypes defined using statistical methods • Selection of extreme phenotypes • Genetics • Epigenetics • Transcriptomics • Targeted proteomics • Ig arrays • Karelia cross-sectional study, Finland and Russia • 1) Identification of fingerprints 2) Validation in birth cohorts samples 3) Replication in birth cohort follow-up • Analysis of risk factors and GxE • Integration of all data for determination of biomarkers for early diagnosis, prevention and identification of targets for therapy of allergy
  12. Classical approach • Birth cohorts • Novel approach • IgE arrays and follow-up • Analysis of risk factors and GxE • Classical phenotypes defined by experts • Novel phenotypes defined using statistical methods • Selection of extreme phenotypes • Genetics • Epigenetics • Transcriptomics • Targeted proteomics • Ig arrays • Karelia cross-sectional study, Finland and Russia • 1) Identification of fingerprints 2) Validation in birth cohorts samples 3) Replication in birth cohort follow-up • Confirmation: - in animal models - by in vitro immunology • Analysis of risk factors and GxE • Integration of all data for determination of biomarkers for early diagnosis, prevention and identification of targets for therapy of allergy
  13. Classical approach • Birth cohorts • Novel approach • IgE arrays and follow-up • Analysis of risk factors and GxE • Classical phenotypes defined by experts • Novel phenotypes defined using statistical methods • Selection of extreme phenotypes • Genetics • Epigenetics • Transcriptomics • Targeted proteomics • Ig arrays • Karelia cross-sectional study, Finland and Russia • 1) Identification of fingerprints 2) Validation in birth cohorts samples 3) Replication in birth cohort follow-up • Confirmation: - in animal models - by in vitro immunology • Analysis of risk factors and GxE • Mathematical modelling • Integration of all data for determination of biomarkers for early diagnosis, prevention and identification of targets for therapy of allergy
  14. Integrative knowledge management • Classical approach • Birth cohorts • Novel approach • IgE arrays and follow-up • Analysis of risk factors and GxE • Classical phenotypes defined by experts • Novel phenotypes defined using statistical methods • Selection of extreme phenotypes • Genetics • Epigenetics • Transcriptomics • Targeted proteomics • Ig arrays • Karelia cross-sectional study, Finland and Russia • 1) Identification of fingerprints 2) Validation in birth cohorts samples 3) Replication in birth cohort follow-up • Confirmation: - in animal models - by in vitro immunology • Analysis of risk factors and GxE • Mathematical modelling • Integration of all data for determination of biomarkers for early diagnosis, prevention and identification of targets for therapy of allergy
  15. From fingerprints to handprints • Develop classifiers and predictors using: – univariate and multivariate statistical analyses – clustering of omics data – network and pathway modelling – simulation and visualization with graphical interfaces • Identify most informative types of experiments and analyses • Refine handprints predictive of: – disease progression and exacerbation – response to treatment in allergic patients
  16. Now on to Dieter
  17. Data Analysis and Knowledge Management using BioXM • MeDALL - AirPROM - Synergy-COPD • EISBM Workshop 13-15.6.12 • Dr. Dieter Maier, Biomax Informatics AG • www.biomax.com
  18. Biomax – Connecting unrelated information for efficient decision support • Biomax Vision: Master scientific complexity; Ensure ease of use; Increase speed of development; Reduce cost and time; Enable centers of excellence for personalized medicine; Support for Systems Biology • Biomax Profile: Headquartered in Martinsried, Germany; In business for more than 12 years; Worldwide customer base • BioXM is a configurable knowledge management platform to flexibly interconnect isolated silos of information in biomedical research
  19. Why “Knowledge Management”? • Knowledge: “the realisation and understanding of patterns and their implications existing in information” • Need to mine information for patterns • A pattern often only emerges when information from different silos is combined, e.g. expression with gene function, SNPs with clinical history of patients, ... • Need semantically integrated information, e.g. information about identical or “equivalent” objects and “meaning” becomes integrated • Requires a framework for integration and methods to find “equivalent” “meaning”
  20. Knowledge Management aspects • Data integration • Knowledge representation • Knowledge extraction • Collaboration and project management • Multivariate data analysis
  21. Public knowledge integration
  22. Clinical data harmonisation • Multi-scale data: cross-sectional, longitudinal, intervention studies • 14 birth cohorts, 1 cross-sectional • 1 cross-sectional, 2 interventional
  23. Multi-Scale Modelling
  24. Traditional semantic mapping • KEGG • PubMed • Gene Ontology • UniProt
  25. Working with semantic networks • Connected data, meta-data and knowledge • Query, view, report • Integrate with analysis
  26. Knowledge Network Representation • Dynamic network representation in BioXM • Each node or edge of the network may serve as entry point for further exploration!
  27. Knowledge Network Expansion • Dynamic network representation in BioXM
  28. Concept - Agile Solution Building • Step 1: Specification • Designing the data model: define the domain-specific data model • Step 2: Implementation • Importing information: instantiate the knowledge network with data and information from external resources • Step 3: Use • Query building and information retrieval: query the knowledge network, explore the graph and report query results
  29. Building Blocks • Experiment repository • Text mining • Graphs • Public databases • R statistics • Network search
  30. Feature bundles • Clinical data access • Model integration support • Collaboration net
  31. Solution deployment • Step 1: Specification, designing the data model • Step 2: Implementation, importing information • Step 3: Use, query building and information retrieval • Step 4: WebApps for Information, Retrieval, Reporting and Annotation • Web applications framework fueled by BioXM for quick access
  32. - 48 months, 1.12.10 - 30.11.14, HEALTH-2010-2.4.5-1 - Partners: 21 - http://medall-fp7.eu/ • Supplementing classic epidemiology with molecular data integrated analysis
  33. MeDALL knowledge • Public resources: 80 sources + ontologies, 500k nodes, 3 M edges • Literature review “phenotypes” • Literature review “Allergy genes” • Cohort variable harmonisation: 14 birth cohorts + 1 cross-sectional study • Partners, tasks, documents • “Omics” data (primary analysis results)
  34. https://ssl.biomax.de/medall/ Registration requests: medall@biomax.com
  35. https://ssl.biomax.de/medall/
  36. Collaboration Network
  37. Researchers, Organisations, WPs, Tasks, Data
  38. Browse and search partners
  39. Edit your collaborations, responsibilities, Skype, …
  40. Mining public resources • 3 hierarchic levels • 3 main categories: trigger, organ, type • 50 sub-categories • 6522 manually validated terms
  41. Literature mining
  42. Literature mining, results
  43. MeDALL literature review
  44. … projected onto public knowledge
  45. Systematic literature review
  46. Knowledge model: document management specific adaptations
  47. Review results: Web input form
  48. Searches supporting the review flow
  49. Results Stage 3: "include"
  50. Stage 4: Data extraction form
  51. Cohort variable harmonisation
  52. Harmonisation participants
  53. Harmonisation knowledge sub-model
  54. Step 3: Suggest categories
  55. ENRIECO results
  56. MeDALL “Phenotype database”
  57. Phenotype database, details
  58. Connecting the 4yr and 8yr DB
  59. Document management
  60. Full-text search and folders
  61. Full-text search and folders
  62. New: Geographic visualisation • Ozone-level 2010 average: metropolitan areas - rural areas
  63. Airway Disease PRedicting Outcomes through Patient Specific Computational Modelling (AirPROM) - 50 month, 1.3.11-28.2.16 - Partners: 34 - call: ICT-2010.5.3 VPH call • Image analysis and omics based computational models of the airways to unravel the pathophysiological mechanisms in asthma and COPD
  64. Multi-Scale lung modelling
  65. Multi-scale model integration • patient-specific integrated multi-scale models to predict the natural history & response to therapy in airway disease • Genes & Proteins • Cell structure-function: Smooth muscle cells, airway epithelial cells… • Tissue structure-function: Airway remodelling in asthma & COPD • Organ structure-function: Mechanics, ventilation, perfusion • Clinical medicine: Asthma, COPD
  66. AirPROM ratio
  67. AirPROM partner • Consortium Membership • 11 EU countries • 22 Academic partners • 3 SMEs • 2 Large industry partners • European Respiratory Society • 2 patient organisations: ELF, EFA • WP Leads from 6 EU Countries • European Approach Essential • Breadth of expertise • Clinical validation (14 clinical centres) • Exploitation
  68. Clinical partners • Multi-scale Data • Cross-sectional, Longitudinal, Intervention Studies
  69. AirPROM automated data flows • WP1: clinical data • WP2: omics data • WP3: micro scale model • WP4: computational tools • WP5: macro scale large airway • WP6: macro scale small airway • WP7: KM • WP8: patient specific multi scale model • Flow labels: CT morphology; patient anatomy; model, simulation result; inform, constrain, validate
  70. AirPROM Knowledge • Collaboration network (partners, tasks, data/models) • 8 computational models, I/O parameter semantic descriptions: Cell model, Tissue model, Perfusion model … • AirPROM clinical data: 15 control, 57 asthma; Anthropometrics, Spirometry • Link to image data • Full text document search • Public knowledge: Gene function (EntrezGene, UniProt, MGI); Gene - disease association (OMIM, CTD, PubMed); Gene - compound association (CTD, PubChem, PubMed); Pathways (KEGG, Reactome); Protein-protein interactions (MINT, DIP, IntAct); ~100 data sources; Network of ~2 million connections • Omics data
  71. AirPROM KM tasks • Ensure data flow: provide a secure federated data retrieval, exchange, processing and warehousing infrastructure • Semantically integrate the clinical, biobanking, physiological, genetic, experimental and imaging data • Enable data analysis by providing data matrices and integration with algorithms and tools for network inference • Formats to support, e.g. ANSYS, ISA-TAB, CGNS, MAGE, SBML, CDISC • Ontologies to support: SNOMED, FMA anatomy ontology, Bio-Physical Ontology • MIBBI meta-information definitions • Expected data volume: lower Terabyte region
  72. AirPROM knowledge model
  73. Simple data mapping • Set rules for import • Data to be imported (e.g. from an Excel spreadsheet) • Example: Tabular Data Import • Define import script or select existing script
  74. Study cohort variable harmonisation • Aim: To provide a template to facilitate harmonization between pre-existing cohorts and support the design of emerging ones. • Diagram: Bank 1 to Bank 5 feeding a common data pool
  75. Data Schema Structure: a nested hierarchical structure • Modules • M1
  76. Data Schema • Factor analysis carried out on BTS severe asthma data set to determine underlying structure/characteristics of the dataset • The underlying structure/factors were then used to inform the domains and themes to order the data.
  77. Integrated computational models
  78. Clinical data
  79. Statistics
  80. Report customisation
  81. Quality control
  82. Linked, secure image data access
  83. High-performance storage system • Tera- to Petabyte data storage for image and image analysis data • AN1-PZ1.storage.pionier.net.pl • Certificate based • sFTP, SSHFS, GridFTP, WebDAV • access with e.g. • CT-image data for 35 subjects • Initial image analysis data
  84. Public Knowledge, disease centred
  85. Semantically integrated knowledge
  86. Immune response associated genes
  87. Immune response associated genes
  88. Pathway - compound centred knowledge
  89. Disease - pathway network
  90. Gene centred knowledge
  91. News and alerts
  92. R interactive view item
  93. R interactive view item
  94. Synergy-COPD “Modelling and simulation environment for systems medicine (Chronic obstructive pulmonary disease -COPD- as a use case)” - Start: 1.2.11 - Duration: 3 years - Partners: 9 - call: ICT-2010.5.3 VPH call - see: www.Synergy-COPD.org • Integration of models at metabolic (muscle TCA, respiratory chain, ATP diffusion), cellular (immune system) and organ (lung biophysics, gas diffusion, blood flow) level • Clinical decision support software with translation into clinical practice
  95. Structure
  96. SYNERGY, KM tasks • Clinical data from BioBridge, PAC-COPD + ECLIPSE • Experimental methods: – Phenotypes: (respiratory symptoms (wheezing, asthma), rhinitis, dermatitis, IgE? to common inhaled allergens, and their longitudinal changes) – transcriptome – proteomics (targeted) – metabolomics • Data matrices for and integration with algorithms and tools for network inference • Integration of models (SBML, CellML)
  97. Knowledge model for semantic mapping • Aims: Find model and experimental data parameters which are similarly described • Use experimental data to validate theoretical models • Connect models which share similar model parameters
  98. Model and data parameter concept • Model parameter: instantiates a certain parameter in a model; has ontological description • Data parameter: instantiates a certain parameter in the Life Science World; occurs as descriptor or measurable in experimental or anthropometric data; has ontological description
  99. Parameter semantic annotation • Generic: general parameter information, true in any context; assigned to parameter only; e.g. semantic description • Context specific: context specific parameter information, true for a given model/study only; shared assignment to parameter + model/study; e.g. unit, Input/Output
  100. Parameter semantic annotation • Molecular level model (SBML import): MIRIAM mapping based reference entity association; check for identical MIRIAM mapping before re-using existing model parameter by name; create model specific name “parameter_model” if non-identical • Supra-molecular level model (manual parameter generation): create semantic annotation based on search result for free text - ontology mapping; search for existing parameter with same semantics, i.e. on-the-fly network similarity search between search result + model parameter context
  101. Semantic annotation concept
  102. Mapping concept • Use experimental data to validate theoretical models: connect Element:Model Parameter:Instances with Element:Parameter:Instances • Connect models which share similar model parameters: connect Element:Model Parameter:Instances with Element:Model Parameter:Instances
  103. Mapping method • Diagram node labels: Context:Parameter Description:Instance_B • Ontology:A:54645 • Element:Parameter:Instance_B • Ontology:A:5461 • Ontology:B:987723 • Ontology:C:21365 • Ontology:A:54632 • Element:Compound:Oxygen • Element:Model Parameter:Instance_A • Context:Model Parameter Description:Instance_A
  104. Network Similarity Search
  105. Parameter semantic description to annotation mapping
  106. Parameter semantic description to annotation mapping
  107. Synergy-COPD Knowledge Portal • https://synergy.linkcare.es/
  108. Semantically integrated information - Types • semantically described deterministic models • probabilistic networks • existing knowledge (PPI, Pathways, ..) • experimental data • clinical data • primary data analysis results (differential expression, overrepresented pathways, ..)
  109. Semantically integrated information - Results • All clinical and experimental data from the BioBridge 8-week study • Differential expression analysis – 6 analyses, 5422 mapped proteins in total • Probabilistic network – 1 network, 4989 mapped proteins, 14 physiologic parameters => 1895 common proteins • 5 deterministic models – Electron chain 18/9/1 (proteins/common proteins to PN/Diff expression) – TCA 16/11/5 – Central metabolism 11/8 – Gas exchange 13/2 PaO2, VO2max – Spatial heterogeneities 13/3 PaO2, VO2max, Ventilation
  110. Public knowledge sources • 80 000 genes • 112 000 proteins • 2 826 Pathways (677 KEGG, 1 276 biochemical, 202 SBML, 55 COPD, 8 user defined, 608 Reactome) • 26 Ontologies (1.3 M concepts) • > 80 public databases (> 500 M objects including > 20 000 diseases, > 2 M compounds)
  111. Resulting knowledge network • 1.5 M protein-protein interactions (80k experimental) • 330k gene - compound associations • 225k gene - function associations • 120k gene - disease associations • 36k gene - pathway associations • 250 semantic model and clinical parameter descriptions
  112. Interpreting semantically integrated information • generate new probabilistic networks from the KB • explore the connection between probabilistic network(s) and deterministic models based on concepts (genes, physiology) with direct but especially indirect connections e.g. via Pathways, PPI, .. • explore the connections between data analysis results to nitroso-redox related knowledge
  113. Concept (for details see WP4 & 5 presentations) • Diagram: clinical/experimental data, probabilistic network (physiological measurements) and deterministic models (glycolysis, TCA cycle, electron chain, oxidative phosphorylation) • Selection of hubs • COPD knowledge base (COPD KB) network search • Resulting connecting network
  114. Selected candidates to explore connections • Diagram labels: TNFRSF25, HDAC7, IL11RA, PaO2, TCA cycle, TP63, IL17D, SIRT3, MEF2D, HDAC9, VO2maxkg, Glycolysis, SIRT5, Complex1, IL1R1, MEF2D, Complex3, TNFRSF21, Complex5, CXCR4, MYC, FGFR1, Complex 4, IGF1, Protein Carbonylation, ITGB11, IL1, SIRT4, VO2max, IL22, VE, IL17A, HDAC1, HDAC4, FOXO4, SMAD1, IL1RAPL1, SIRT2, HIF1A, SMAD4, FOXO1, Electron chain
  115. Searching connecting networks by PPI
  116. PPI based networks • Glycolysis candidates • Glycolysis model
  117. Searching connecting networks by PPI • Glycolysis: 11 model proteins, 27 candidates - PPI with good experimental evidence (78 456) - 428 protein net, 7 Glycolysis model - 10 candidates - PPI Two Hybrid (3 743), 55 protein net - all PPI (>1.5 Mio), 757 protein net • Electron Chain: 18 model proteins, 23 candidates, 193 protein net • TCA cycle: 16 model proteins, 21 candidates, 199 protein net
  118. Overrepresentation in connecting network • Central metabolism model - Glycolysis candidates: 167 pathways
  119. Summary • Flexible knowledge modelling • Different levels of access • Exchange within, between and outside of projects • Knowledge network “background” for data analysis and mining
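
The PPI-based search for connecting networks on slides 115-117 can be illustrated with a small sketch. The slides do not say how BioXM performs this search, so the breadth-first shortest-path approach below is an assumption, and all protein names, interactions and function names (`shortest_path`, `connecting_network`) are hypothetical placeholders rather than project data or BioXM API calls.

```python
from collections import deque


def shortest_path(graph, source, targets):
    """Breadth-first search from source to the nearest node in targets.

    Returns the path as a list of nodes, or None if no target is reachable.
    """
    if source in targets:
        return [source]
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(graph.get(node, ())):
            if neighbour in previous:
                continue
            previous[neighbour] = node
            if neighbour in targets:
                # Walk back from the target to the source.
                path = [neighbour]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


def connecting_network(interactions, model_proteins, candidates):
    """Collect all proteins on shortest PPI paths from candidates to the model."""
    graph = {}
    for a, b in interactions:  # PPIs are treated as undirected edges
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    network = set()
    for candidate in candidates:
        path = shortest_path(graph, candidate, set(model_proteins))
        if path:
            network.update(path)
    return network


# Hypothetical toy data: cand_3 has no route to the model and is left out.
toy_interactions = [
    ("cand_1", "link_1"),
    ("link_1", "model_1"),
    ("cand_2", "model_1"),
    ("cand_3", "link_2"),
]
toy_model = {"model_1"}
toy_candidates = ["cand_1", "cand_2", "cand_3"]

print(sorted(connecting_network(toy_interactions, toy_model, toy_candidates)))
```

Restricting `interactions` to a subset, such as only PPIs with good experimental evidence, would shrink the resulting network, which is the kind of effect the protein-net sizes on slide 117 report for different PPI sets.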

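The semantic mapping described on slides 97-104 connects model parameters and data parameters that are similarly described by ontology terms. As a rough sketch only: the slides do not specify the similarity measure used by the network similarity search, so the Jaccard overlap of annotation sets below is a stand-in, and the assignment of the slide 103 identifiers to each parameter, the threshold and the function names are all hypothetical.

```python
def jaccard(terms_a, terms_b):
    """Share of ontology terms common to both annotation sets (0.0 to 1.0)."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def map_parameters(model_params, data_params, threshold=0.3):
    """Pair model and data parameters whose annotations overlap enough.

    Returns (model parameter, data parameter, score) tuples, best match first.
    """
    matches = []
    for model_name, model_terms in model_params.items():
        for data_name, data_terms in data_params.items():
            score = jaccard(model_terms, data_terms)
            if score >= threshold:
                matches.append((model_name, data_name, score))
    return sorted(matches, key=lambda match: -match[2])


# Hypothetical annotations reusing the identifier style of slide 103.
model_params = {
    "Instance_A": {"Ontology:A:54632", "Ontology:C:21365", "Element:Compound:Oxygen"},
}
data_params = {
    "Instance_B": {
        "Ontology:A:54645",
        "Ontology:B:987723",
        "Ontology:C:21365",
        "Element:Compound:Oxygen",
    },
    "Instance_C": {"Ontology:A:5461"},
}

matches = map_parameters(model_params, data_params)
print(matches)
```

Here Instance_A and Instance_B share two of five distinct terms, so they are paired, while Instance_C shares none and is skipped. A graph-based search could also count indirect links between ontology terms, which this set comparison ignores.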