BioMANTA Network
The Modelling and Analysis of Biological
                    Activity
       Melissa Davis and Andrew New...
Background
• Systems biology holds promise to understand
  complexity of biological networks.
• Computational modelling an...
BioMANTA project overview
• Development of Semantic Interactome Model
  and Semantic Web infrastructure
  – Knowledge repr...
ACB Programs
• Phenotype-informed discovery of networks
  and systems
  – Biomanta network analysis to uncover
    interac...
Current vs Semantic Web
• Current web: Most
  information is natural
  language
   – Meaningful to human users
     who un...
Knowledge Representation
• Structured presentation of concepts, relationships
  and data
   – Relational Database (schema,...
BioMANTA ontology




•   OWL based ontology with imported modules from relevant
    ontologies
•   Limit creation of new ...
Semantic Interactome Model
Integrating Biological Data
• Huge variety in project sizes.
• Very large data sets (TBs and PBs).
• Computationally inten...
Pathways




 PPI Data                         Diseases



                  Data for
                Integration
        ...
Too Many Names
• Global IDs (such as LSID, BioPAX) have largely
  failed to gain acceptance for a variety of reasons.
• We...
Semantic Web Technologies
• Resource Description Framework (RDF) and RDF Schema
  (RDFS)
   – RDF for resources; RDFS for ...
Architecture
Sample Merged Data
Output and Visualisation
• Improved data sets:
  – Removing data redundancy,
  – Better query results,
  – Fewer false pos...
“Hairball” Problem
Genes Associated
               with Rheumatoid
                   Arthritis
                                          STA...
Acknowledgements
                  Chris Bouton
                                   Pfizer Research and Technology
        ...
Upcoming SlideShare
Loading in …5
×

Final Acb All Hands 26 11 07.Key

1,225 views

Published on

Brief rundown of the BioMANTA project.

Published in: Technology, Education
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,225
On SlideShare
0
From Embeds
0
Number of Embeds
27
Actions
Shares
0
Downloads
19
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Final Acb All Hands 26 11 07.Key

  1. 1. BioMANTA Network The Modelling and Analysis of Biological Activity Melissa Davis and Andrew Newman (ACB, IMB, ITEE, UQ)
  2. 2. Background • Systems biology holds promise to understand complexity of biological networks. • Computational modelling and analysis of large-scale protein-protein interaction (PPI) and compound activity networks. • Knowledge representation using SW standards Resource Description Framework (RDF) and Web Ontology Language (OWL) enables machine inference and facilitates knowledge discovery.
  3. 3. BioMANTA project overview • Development of Semantic Interactome Model and Semantic Web infrastructure – Knowledge representation using Ontology instanced with public interaction data – RDF triple storage, inferencing and querying • Network Inference and Knowledge Discovery – Network meta-analysis – Global network inference • Output, Visualisation of Results and Software – COBALT visualisation software – Data: High quality data sets and ontologies
  4. 4. ACB Programs • Phenotype-informed discovery of networks and systems – Biomanta network analysis to uncover interactions between (disease state) phenotypes and biological networks. • Modelling dynamic cellular processes – Integration of time course expression data. • Algorithms for graphs and networks – Avoiding sub-graph isomorphism. – Object co-identification based on attribute matching.
  5. 5. Current vs Semantic Web • Current web: Most information is natural language – Meaningful to human users who understand the meaning of natural language • Semantic Web: standards for publishing machine- readable information on the web – Standard formats for integration and exchange of data (RDF) – Formal language to express semantics (the meaning of concepts) - OWL – Unambiguous representation
  6. 6. Knowledge Representation • Structured presentation of concepts, relationships and data – Relational Database (schema, tables, identifiers, etc.) – Ontology (Classes, properties/attributes, relationships) – Machine interpretation of representation (XML) • Inference errors frequently caused by errors in the representation – Incorrectly modelled domain knowledge – Missing assumptions • Currently, the majority of PPI data are stored in DB available online
  7. 7. BioMANTA ontology • OWL based ontology with imported modules from relevant ontologies • Limit creation of new classes in BioMANTA ontology and use classes from existing ontologies wherever possible
  8. 8. Semantic Interactome Model
  9. 9. Integrating Biological Data • Huge variety in project sizes. • Very large data sets (TBs and PBs). • Computationally intensive. • Different specialities. • Different levels of semantics in the technologies used (XML, CSV, OWL, etc.) • Mostly suspect, duplicated, inapplicable, poorly and incorrectly modelled.
  10. 10. Pathways PPI Data Diseases Data for Integration in BioMANTA Drug Targets Gene Function Experimental Data
  11. 11. Too Many Names • Global IDs (such as LSID, BioPAX) have largely failed to gain acceptance for a variety of reasons. • We didn’t want to come up with another ID. • A huge number of local IDs including: – MPact, DIP, IntACT, MINT. • Properties include: – Database, – Sequence information, – Species, – Subcellular location, – Expression, cross references, etc.
  12. 12. Semantic Web Technologies • Resource Description Framework (RDF) and RDF Schema (RDFS) – RDF for resources; RDFS for vocabulary – Data presented as triples, graph or xml <subject> <predicate> <object> <Protein> <rdf:type> <rdfs:Class> <Emerin> <rdf:type> <Protein> • OWL – Web Ontology Language for description of ontology – OWL is RDF: can be expressed in triples, graph or xml – cf. Open Biomedical Ontology (OBO) format – familiar to users of GO • Inference engines – software to reason about information in a knowledge representation and infer new data from what is presented • SPARQL – query language for RDF – Returns results sets or RDF graphs
  13. 13. Architecture
  14. 14. Sample Merged Data
  15. 15. Output and Visualisation • Improved data sets: – Removing data redundancy, – Better query results, – Fewer false positives, – Matrix of protein-protein interactions. • Published and peer-reviewed methodologies. • Input to network visualisation tools.
  16. 16. “Hairball” Problem
  17. 17. Genes Associated with Rheumatoid Arthritis STAT3 Pathway Members Disease Modifying Anti- Rheumatoid Non-Steroid Drug Targets Anti- Inflammatory Drug Targets
  18. 18. Acknowledgements Chris Bouton Pfizer Research and Technology Victor Farutin Center, Cambridge, Mike Schaffer Massachusetts, US Fred Jerva Mark Ragan Jane Hunter Melissa Davis IMB & ARC Centre Andrew Newman School of of Excellence in ITEE, UQ Kevin Burrage Imran Khan Bioinformatics Shoaib Sehgal Yuan-Fang Li

×