Data Analytics for Personalized Medicine by Aryya Gangopadhyay, PhD

Presented at the 3rd International Conference on Personalized Medicine, June 26-29, 2014. Dr. Gangopadhyay is Chair of the Department of Information Systems at University of Maryland Baltimore County.

Transcript

  • 1. Data Analytics for Personalized Medicine
      Aryya Gangopadhyay, UMBC
      Presented at the 3rd International Conference on Personalized Medicine, June 26-29, 2014
  • 2. Scope
      • Big data promise (Pentland et al. 2013)
          – US healthcare industry can save $200 billion per year
      • Need complete picture
          – Reality mining (MIT Tech. Review 2008)
          – Socio-demographics
          – EMRs
          – Biological data
      • Interactions in the network
          – Topology-based analysis
          – Centrality-based analysis
          – Perturbations (diseases as network perturbations: del Sol et al. 2010)
      • Network partitioning
      • Visualization
  • 3. Big data in healthcare
      • “Within 10 years every healthcare consumer will be surrounded by a virtual cloud of billions of data points” [Hood et al. 2013]
  • 4. Interconnections
      – Biological processes are interconnected systems
      – Analyze interactions
      – Resilient against random perturbations
      – Vulnerable to targeted attacks
      CIDeR: large, multi-dimensional, multimodal, dynamic
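The contrast between random perturbations and targeted attacks can be illustrated on a toy graph. The sketch below is not taken from the talk: the graph, the node names, and the helper `largest_component_size` are hypothetical, and the size of the largest connected component is used here as an assumed stand-in for "how much of the network still hangs together".

```python
def largest_component_size(edges, removed=()):
    """Size of the largest connected component after deleting `removed` nodes."""
    removed = set(removed)
    adj = {}
    for u, v in edges:
        # Register every surviving endpoint, even if it ends up isolated.
        for node in (u, v):
            if node not in removed:
                adj.setdefault(node, set())
        if u not in removed and v not in removed:
            adj[u].add(v)
            adj[v].add(u)

    best, seen = 0, set()
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best


# Hypothetical hub-and-spoke network: one hub, five peripheral nodes.
attack_edges = [("hub", "p1"), ("hub", "p2"), ("hub", "p3"),
                ("hub", "p4"), ("hub", "p5"), ("p1", "p2")]

intact = largest_component_size(attack_edges)
after_leaf_loss = largest_component_size(attack_edges, removed=["p3"])
after_hub_attack = largest_component_size(attack_edges, removed=["hub"])
print(intact, after_leaf_loss, after_hub_attack)
```

Losing a peripheral node barely changes the connected core, while removing the hub fragments the toy network, which is the qualitative behavior the slide describes.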
  • 5. Extensions to our previous work
      – Updated the network
          • Nodes: 5168 to 9767
          • Edges: 14410 to 27744
      – Previous analysis
          • Network characteristics: CC, diameter, path lengths, etc.
          • Node-based analysis
              – Developed a new method for identifying effectors and receptors
          • Perturbation analysis
      – Extensions
          • How do we partition the network?
          • What criteria to use and why?
          • What are the effects of such partitioning?
  • 6. Network extracted from CIDeR: 2014
      • Nodes: 9767
      • Edges: 27744
      • Diameter: 15
      • # CC: 89
      • Avg. PL: 4.7
      • Avg. degree: 2.8
  • 7. Node centrality measures: correlations
      • x = Authority, y = Betweenness Centrality; correlation: 0.8
      • x = Clustering Coefficient, y = Betweenness Centrality; correlation: -0.02
      • x = Hub, y = Authority; correlation: 0.88
      • x = PageRank, y = Authority; correlation: 0.92
  • 8. Correlations of node centrality measures
      [Correlation matrix figure over: Clustering Coefficient, Hub, Authority, PageRank, Eigenvector Centrality, Betweenness Centrality, Eccentricity]
  • 9. Overall network characteristics
      • PageRank, hub and authority scores are strongly correlated
      • Clustering coefficient is negatively correlated with other node centrality measures
      • Implications:
          1. Nodes that are strong effectors are also strong receptors
          2. Less central nodes are not connected to each other but mainly with an influential node
          3. Influential nodes are mostly connected to each other
          4. Fully connected sub-graphs are small and rare
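How a clustering coefficient can end up negatively correlated with a centrality score is easy to see on a small example. The sketch below is illustrative only: it uses plain degree as a simplified stand-in for the centrality measures named in the slides (PageRank, hub, authority), and the toy graph and the function names are assumptions.

```python
from math import sqrt


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return links / (k * (k - 1) / 2)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical network: one influential node "h" tied to two small triangles.
corr_edges = [("h", 1), ("h", 2), ("h", 3), ("h", 4), (1, 2), (3, 4)]
corr_adj = build_adjacency(corr_edges)
nodes = sorted(corr_adj, key=str)
degrees = [len(corr_adj[n]) for n in nodes]
clustering = [clustering_coefficient(corr_adj, n) for n in nodes]
r = pearson(degrees, clustering)
print(r)
```

In this toy graph the influential node has many neighbors that are mostly not linked to each other, so its clustering coefficient is low while that of the peripheral nodes is high, giving a negative correlation.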
  • 10. Partitioning the graph
      • How can we capture the above characteristics?
      • Modularity:
          $$Q = \frac{1}{2m} \sum_{l=1}^{k} \sum_{i \in C_l,\, j \in C_l} \left( A_{ij} - \frac{d_i d_j}{2m} \right)$$
      • The objective is to maximize Q
      • Intuition:
          – Put influential nodes in separate clusters
          – Create dense sub-communities (common neighbors)
      • Algorithms (optimal solution is NP-hard: Brandes 2007):
          – Spectral clustering based (Newman 2006)
          – Greedy algorithm (Blondel et al. 2008)
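The modularity Q of a given partition can be evaluated directly from its definition. This is a minimal sketch, assuming an undirected, unweighted graph where A_ij is 1 for linked nodes and 0 otherwise, d_i is the degree of node i, and m is the number of edges; the function name, the toy graph, and the partitions are hypothetical, and the code only scores a partition rather than searching for the one that maximizes Q.

```python
def modularity(edges, community):
    """Q = (1/2m) * sum over same-community pairs (i, j) of A_ij - d_i*d_j/2m."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    two_m = sum(len(nb) for nb in adj.values())  # each edge counted twice
    q = 0.0
    for i in adj:
        for j in adj:
            if community[i] != community[j]:
                continue
            a_ij = 1.0 if j in adj[i] else 0.0
            q += a_ij - len(adj[i]) * len(adj[j]) / two_m
    return q / two_m


# Hypothetical toy graph: two triangles joined by a single bridge edge.
mod_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

by_triangle = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
all_in_one = {n: "A" for n in range(6)}

q_split = modularity(mod_edges, by_triangle)
q_single = modularity(mod_edges, all_in_one)
print(q_split, q_single)
```

Splitting the toy graph along the bridge gives Q = 5/14, while placing every node in one cluster gives Q = 0, so the split is the better partition under this criterion. The double loop is quadratic in the number of nodes and is meant for illustration, not for a network of the size reported in the slides.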
  • 11. Clusters formed by maximizing modularity
  • 12. Dendrogram of top 8 disease clusters
  • 13. Cluster 100
      Nodes: 1177
      Edges: 2122
  • 14. Cluster 82
      Nodes: 1200
      Edges: 2554
  • 15. K-core
      • Objective: restrict analysis to regions of increased centrality and connectedness
      • K-core: largest sub-graph where all nodes have a minimum degree of k (Batagelj 2002)
      • K=5 (mode=2 for the entire network)
      • Protein interaction networks (Wuchty et al. 2005, Hamelin et al. 2008)
      Taken from Hamelin et al. 2008
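A k-core can be obtained by repeatedly deleting nodes whose degree falls below k until none remain. The sketch below shows that pruning procedure on a hypothetical toy graph, assuming an undirected, unweighted network; it is a generic illustration and not necessarily the implementation used for the 5-core in the slides.

```python
def k_core(edges, k):
    """Return the node set of the k-core: the largest sub-graph in which
    every node has at least k neighbors inside the sub-graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    alive = set(adj)
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    stack = [node for node in alive if degree[node] < k]
    while stack:
        node = stack.pop()
        if node not in alive:
            continue  # already pruned via another path
        alive.discard(node)
        for nb in adj[node]:
            if nb in alive:
                degree[nb] -= 1
                if degree[nb] < k:
                    stack.append(nb)  # pruning cascades to weakened neighbors
    return alive


# Hypothetical toy graph: a 4-clique (a, b, c, d) with a dangling path d-e-f.
core_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
              ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")]

core_3 = k_core(core_edges, 3)
core_4 = k_core(core_edges, 4)
print(sorted(core_3), sorted(core_4))
```

In the toy graph the dangling path is pruned away for k = 3, leaving only the densely connected clique, and no node survives for k = 4.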
  • 16. 5-core graph: color code: Type
  • 17. 5-core graph: color code: Modularity class
  • 18. Disease clusters (top 5) dendrogram
  • 19. 5-core graph: Cluster 5 (26%)
  • 20. 5-core graph: Cluster 6 (22%)
  • 21. 5-core graph: Cluster 0 (16%)
  • 22. 5-core graph: Cluster 3 (13%)
  • 23. 5-core graph: Cluster 4 (12.5%)
  • 24. Comparison of clusters
  • 25. Conclusion
      • Contributing areas
          • Biology, bioinformatics, sociology, SNA, physics, applied mathematics, computer and information sciences
      • Summary
          • Holistic analysis of health data
          • Analysis based on node centrality
          • Network partitioning
          • Studying the effect of perturbation
      • Where do we go from here
          • Create a taxonomic structure of elements and interactions
          • Search tool
          • Biological and clinical implications