Data Integration: What I Haven't Yet Achieved

Data integration is a hot topic in bioinformatics, but the term means different things to different people. What do we think it means? Talk given at CSIRO Bioinformatics & Biostatistics group meeting, November 21 2012.

Transcript

  • 1. Data Integration: what I haven’t yet achieved. Neil Saunders. Mathematics, Informatics and Statistics. www.csiro.au
  • 2. My main project: Ludwig colorectal cancer study. Data integration 2 of 21
  • 3. Multiple “omics” platforms: exon expression, methylation, copy number
  • 4. We want to “integrate” these data, but what does that mean?
  • 5. Integration can mean “portals”
  • 6. Integration can mean “visualization”
  • 7. Integration can mean “correlation”
  • 8. What do we think integration means? A + B + C: more information when combined than when separate
  • 9. What’s already “out there”? PubMed. [Figure: PubMed search for "data integration"; articles per 100 000 by year, 2002 to 2010]
  • 10. What’s already “out there”? CiteULike: http://www.citeulike.org/user/neils/tag/integration
  • 11. Buzz-word compliant
  • 12. Quote from integIRTy paper: “These methods can be roughly grouped into four categories: stepwise, regression-based, correlation-based and latent variable models.” Source: integIRTy: a method to identify genes altered in cancer by accounting for multiple mechanisms of regulation using item response theory. Bioinformatics, Vol. 28, No. 22 (15 November 2012), pp. 2861-2869
  • 13. Regression: SIM. Source: Integrated analysis of DNA copy number and gene expression microarray data using gene sets. BMC Bioinformatics 2009, 10:203
  • 14. Correlation: DR-Integrator. [Figure: correlation plot; axis labels include Chr (1 to 22), a Correlation scale (0 to 1) and sample identifiers]
  • 15. Latent variable: iCluster (file under impractical)
  • 16. Basics that are never explained 1/2: integration across groups or description of samples?
  • 17. Basics that are never explained 2/2: Genes x Samples
  • 18. Conclusions 1/3: We’re not the first people doing this... ...but it’s becoming a “hot topic”
  • 19. Conclusions 2/3: Room for improvement in software, much of which is poorly written, poorly documented and difficult to implement
  • 20. Conclusions 3/3: Too much for one individual!
  • 21. CSIRO Mathematics, Informatics and Statistics. Neil Saunders. t +61 2 9325 3144, e Neil.Saunders@csiro.au, w Mathematics, Informatics and Statistics web, www.csiro.au
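
Slides 7 and 14 name correlation as one meaning of integration, and slide 17 refers to a genes x samples layout. As a rough illustration only, the sketch below computes a per-gene Pearson correlation between two platforms measured on the same samples. It is not the DR-Integrator method or anything taken from the talk; the function name `per_gene_correlation` and the toy matrices are assumptions made up for this example.

```python
import numpy as np


def per_gene_correlation(a, b):
    """Row-wise Pearson correlation between two genes x samples matrices.

    Hypothetical helper for illustration: rows are genes, columns are
    samples, and both matrices must list the same genes and samples
    in the same order.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("matrices must share the same genes x samples shape")

    # Centre each gene (row) on its own mean across samples.
    a_c = a - a.mean(axis=1, keepdims=True)
    b_c = b - b.mean(axis=1, keepdims=True)

    numerator = (a_c * b_c).sum(axis=1)
    denominator = np.sqrt((a_c ** 2).sum(axis=1) * (b_c ** 2).sum(axis=1))

    # A gene that is constant on either platform has no defined correlation,
    # so it is reported as NaN rather than raising an error.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(denominator > 0, numerator / denominator, np.nan)


# Toy data (invented): 3 genes x 4 samples on two platforms.
copy_number = np.array([[2, 3, 4, 5],
                        [2, 2, 2, 2],
                        [1, 2, 3, 4]])
expression = np.array([[1.0, 2.0, 3.0, 4.0],
                       [5.0, 6.0, 7.0, 8.0],
                       [4.0, 3.0, 2.0, 1.0]])

r = per_gene_correlation(copy_number, expression)
print(r)
```

In this toy input the first gene rises together on both platforms, the second has constant copy number and so gets NaN, and the third moves in opposite directions. Real data would also need matched sample identifiers and some handling of missing values, which this sketch leaves out.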
