X-omics Data Integration Challenges


From http://www.seqahead.it/cost-bcn-2013 - day 1, session 1

Usage Rights: CC Attribution-NonCommercial-ShareAlike License

X-omics Data Integration Challenges: Presentation Transcript

  • x-omics Data Integration Challenges. Dr. Michael Lappe, Ph.D., Senior Bioinformatics Scientist - Functional Genomics and Systems Biology, CLCbio, Denmark. Thursday, February 14, 13
  • Michael’s Social Network (partial)
  • No more cargo-cult: http://en.wikipedia.org/wiki/Cargo_cult_science http://en.wikipedia.org/wiki/Cargo_cult
  • Form follows function: http://www.youtube.com/watch?v=pQHX-SjgQvQ Do not follow empty ancient rituals that no longer serve a useful purpose! Do NOT confuse the container with its content. Database systems are NOT the DATA!
  • Data Integration involves combining data residing in different sources and providing users with a unified view [...] (combining research results from different bioinformatics repositories, for example). http://en.wikipedia.org/wiki/Data_integration
  • Different Levels of Resolution: Ecosystem, Population, Organism, Organ, Tissue, Cell, Organelle, Complexes, Assemblies, Molecule, Atoms (image: www.sciencephoto.com)
  • Different experimental sources. Kühner et al. “Proteome organization in a genome-reduced bacterium.” Science (2009) vol. 326 (5957) pp. 1235
  • What are the typical mechanisms at the structural level that cause the de/activation of cancer genes? (www.abcam.com/cancer) Henning Stehr*, Seon-Hi J. Jang*, Jose M. Duarte, Christoph Wierling, Hans Lehrach, Michael Lappe, Bodo M.H. Lange (2011) "The structural impact of cancer-associated mutations in oncogenes and tumor suppressors" Molecular Cancer
  • Mapping mutations to (modelled) structures: ERBB2, MLH1
  • Structural Analysis: surface vs. core - binding site - stability - clustering ... (ERBB2, MLH1)
  • A simple yet robust classification IN-
  • Oncogenes: activating gain-of-function mutations (surface, near functional/binding sites). Example: ERBB2. Tumor-suppressor genes: de-activating loss-of-function mutations (in the core, destabilising the structure). Example: MLH1.
  • Biological networks: getting to grips with COMPLEXITY. Complex (biological) systems as networks of interacting elements. Life is a graph! G = (V, E). A graph records nodes (vertices) and relationships (edges); relationships organize nodes; nodes and relationships both have properties.
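
The G = (V, E) picture on this slide, with nodes and relationships that both carry properties, can be sketched as a small data structure. This is only a minimal illustration and not code from the talk: the class names, the relationship types and the example records are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A vertex; 'properties' holds arbitrary key/value annotations."""
    node_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed edge between two node ids, with its own properties."""
    source: str
    target: str
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """G = (V, E): nodes (V) and relationships (E), both carrying properties."""

    def __init__(self):
        self.nodes = {}          # node_id -> Node
        self.relationships = []  # list of Relationship

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)

    def add_relationship(self, source, target, rel_type, **properties):
        # Refuse edges whose endpoints are not in V.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.relationships.append(
            Relationship(source, target, rel_type, properties)
        )

    def neighbours(self, node_id):
        """Ids of nodes reachable over one outgoing relationship."""
        return [r.target for r in self.relationships if r.source == node_id]


# Illustrative records only; ids, types and property values are made up.
g = PropertyGraph()
g.add_node("geneA", kind="gene")
g.add_node("proteinA", kind="protein")
g.add_node("complexX", kind="complex")
g.add_relationship("geneA", "proteinA", "ENCODES")
g.add_relationship("proteinA", "complexX", "PART_OF", evidence="hypothetical")
print(g.neighbours("proteinA"))
```

Keeping properties on the relationships as well as on the nodes is what lets provenance (which experiment, which evidence) travel with each link instead of living in a separate table.
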
  • The human disease network. Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabási AL. Proc Natl Acad Sci U S A. 2007 May 22;104(21):8685-90.
  • Graph Databases: think of graphs not as a visualization but as a DATA STRUCTURE. http://en.wikipedia.org/wiki/Graph_database http://nosql-database.org/ http://www.neo4j.org/learn#graphs
  • Proteins as Residue Interaction Graphs: 1a1m (Cα, 8 Å). Anisotropic Network Model, eigen-mode 3. Capturing dynamics: 1a1m (X-ray), 1jnj (NMR, 20 models). "oGNM: A protein dynamics online calculation engine using the Gaussian Network Model" Yang, L.-W., Rader, A.J., Liu, X., Jursa, C.J., Chen S.C., Karimi, H, Bahar, I. Nucleic Acids Res, 34, W24-31, 2006
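
A residue interaction graph of the kind named on this slide can be built from coordinates alone. The sketch below assumes that "(Ca 8 A)" means one C-alpha position per residue and a distance cutoff of 8 Å; the function name and the coordinates are hypothetical and are not taken from the 1a1m structure.

```python
import math
from itertools import combinations


def residue_interaction_graph(ca_coords, cutoff=8.0):
    """Return index pairs (i, j), i < j, whose C-alpha atoms lie within cutoff (Å)."""
    edges = set()
    for (i, a), (j, b) in combinations(enumerate(ca_coords), 2):
        if math.dist(a, b) <= cutoff:
            edges.add((i, j))
    return edges


# Hypothetical C-alpha coordinates (Å) for four residues on a straight line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
edges = residue_interaction_graph(coords)
print(sorted(edges))
```

The all-pairs loop is quadratic in the number of residues, which is fine for a single chain; the cutoff is a parameter because other values and other atom choices are also in use.
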
  • Geometry & Structure. PDB: 1KX5 http://vimeo.com/24047115 S. Daujat, T. Weiss, F. Mohn, U.C. Lange, C. Ziegler-Birling, U. Zeissler, M. Lappe, D. Schubeler, M.E. Torres-Padilla, R. Schneider (2009). "H3K64 trimethylation marks heterochromatin and is dynamically remodeled during developmental reprogramming" Nature Structural and Molecular Biology
  • x-omics = Proteomics, Metabolomics, Regulation [...] + x-Seq Data, where x-Seq = ChIP, RNA, BS ...
  • some challenges ... different experiments, protocols, samples, coverage ...; isolated information silos; different data formats; mapping & identifier chaos; error propagation / annotation bottleneck; statistical criteria for (dis-)similarity; knowledge lock-up, literature access; redundancy / implicit co-ordination; TMI & essential info?
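
The "mapping & identifier chaos" item can be made concrete with a small sketch of combining two sources into a unified view when each source uses its own identifiers. Everything here is an assumption for illustration: the function name, the identifier schemes ("A:...", "B:...") and the records are invented.

```python
def integrate(source_a, source_b, id_map):
    """Merge records from two sources into one view keyed by source A's ids.

    id_map maps source B identifiers to source A identifiers. Records from B
    that have no mapping are reported separately rather than silently dropped.
    """
    unified = {key: dict(rec) for key, rec in source_a.items()}
    unmapped = []
    for b_id, rec in source_b.items():
        a_id = id_map.get(b_id)
        if a_id is None:
            unmapped.append(b_id)
            continue
        unified.setdefault(a_id, {}).update(rec)
    return unified, unmapped


# Hypothetical records and identifiers, for illustration only.
expression = {"A:001": {"expression": 2.5}, "A:002": {"expression": 0.1}}
interactions = {"B:x1": {"partners": 3}, "B:x9": {"partners": 1}}
b_to_a = {"B:x1": "A:001"}

unified, unmapped = integrate(expression, interactions, b_to_a)
print(unified)
print(unmapped)
```

Returning the unmapped identifiers makes the coverage gap visible, so a missing mapping does not quietly turn into missing data further down the pipeline.
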
  • "Blind monks examining an elephant" by Itcho Hanabusa (Hanabusa Itchō, 1652 – 1724). Original title: 「衆瞽探象之圖」.
  • Let’s move on ...
  • http://5stardata.info/ 5★ Open Data: Tim Berners-Lee, the inventor of the Web and Linked Data initiator, suggested a 5 star deployment scheme for Open Data.
  • http://5stardata.info/ 5★ Open Data ★ make your stuff available on the Web (whatever format) under an Open License
  • http://5stardata.info/ 5★ Open Data ★ ★ make it available as structured data (machine-readable, e.g. Excel*) * http://dontuseexcel.wordpress.com/2013/02/07/dont-use-excel-for-biological-data/
  • http://5stardata.info/ 5★ Open Data ★ ★ ★ use non-proprietary Open Formats (e.g. CSV instead of Excel)
  • http://5stardata.info/ 5★ Open Data ★ ★ ★ ★ use URIs to denote things, so that people can point at your stuff
  • http://5stardata.info/ 5★ Open Data ★ ★ ★ ★ ★ Link your Data to other data to provide (networked)
  • http://5stardata.info/ 5★ Open Data
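
The five levels above build on each other, which a few lines of code can make explicit. This is a rough sketch, assuming that each star only counts when all lower levels are met; the function and parameter names are invented for the example and are not part of the 5★ scheme itself.

```python
def open_data_stars(on_web_open_license, structured, open_format,
                    uses_uris, linked_to_other_data):
    """Count stars cumulatively: stop at the first criterion that is not met."""
    criteria = (on_web_open_license, structured, open_format,
                uses_uris, linked_to_other_data)
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# An Excel sheet on the Web under an open license: structured, but proprietary.
excel_stars = open_data_stars(True, True, False, False, False)
# The same table published as CSV.
csv_stars = open_data_stars(True, True, True, False, False)
print(excel_stars, csv_stars)
```
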
  • Giant Global Graph (GGG): an important related concept that overlaps with GGG is that of the "Semantic Web", which relates to decentralized information. (≄ Web 3.0)
  • The next Web of open, linked data: Tim Berners-Lee on TED.com http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide.html
  • Web of biological Data: linked open scientific data, grass-roots movement
  • Protein Interaction Networks: scale-free, small-world. Park, J., M. Lappe, et al. (2001). "Mapping protein family interactions: intramolecular and intermolecular protein family interaction repertoires in the PDB and yeast." Journal of Molecular Biology 307(3): 929-38
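
"Scale-free" is a statement about a network's degree distribution, and "small-world" is usually judged from clustering together with short path lengths. The sketch below computes two of the basic quantities involved, the degree distribution and the local clustering coefficient, on a toy adjacency map. The graph is invented for illustration and says nothing about real interaction data; path lengths are left out to keep the example short.

```python
from collections import Counter
from itertools import combinations


def degree_distribution(adj):
    """Map each degree k to the number of nodes that have exactly k neighbours."""
    return Counter(len(nbrs) for nbrs in adj.values())


def local_clustering(adj, node):
    """Fraction of pairs of neighbours of 'node' that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Toy undirected graph: hub "h" with four neighbours, plus one a-b edge.
adj = {
    "h": {"a", "b", "c", "d"},
    "a": {"h", "b"},
    "b": {"h", "a"},
    "c": {"h"},
    "d": {"h"},
}
print(degree_distribution(adj))
print(local_clustering(adj, "a"), local_clustering(adj, "h"))
```
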
  • Modelling information gain: Tandem-Affinity Purifications in-silico. Michael Lappe and Liisa Holm "Unraveling protein interaction networks with near-optimal efficiency." (2004) Nature Biotechnology 22(1): 98-103
  • Toward interoperable bioscience data. Susanna-Assunta Sansone et al., Nature Genetics, Feb 2012: “to make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open ‘data commoning’ culture.” The open source ISA metadata tracking tools facilitate standards-compliant collection, curation, local management and reuse of datasets in an increasingly diverse set of life science domains. http://www.isa-tools.org/ http://www.nature.com/ng/journal/v44/n2/pdf/ng.1054.pdf
  • Free your data ... Biology and BioInformatics are data-driven sciences; think beyond your own hard drive and the current paper; evaluate and embrace new technologies (LOD, GraphDBs); rethink current incentive systems: no more cargo-cult; make it useful, re-usable and sustainable; Open Access, Open Source, Open Linked Data, Mash-Ups; focus on your science
  • Thank you! Wood engraving by an unknown artist, in “L’atmosphère: météorologie populaire” (1888), Camille Flammarion. Hubble Space Telescope / NASA