Complex Networks and Data Mining on genetic databases

This is my presentation at WaFiS last year, on the use of complex networks for data mining in genetic databases.

  1. Who’s this guy? Knowledge Discovery in Databases through Complex Networks: application to phylodynamics. Luiz Max F. de Carvalho, Scientific Computing Programme (PROCC), Fiocruz; Pan American Center for Foot-and-Mouth Disease (PAHO/WHO). WaFiS 2012, September 28, 2012.
  2. Outline: (1) Knowledge Discovery in Databases (KDD); (2) Complex Networks; (3) Example 1: Chitin pathway phylogeny; (4) Example 2: Foot-and-mouth disease virus in South America.
  3. Knowledge Discovery in Databases (KDD): Lots of data, but the human brain has very limited processing capacity. Information → Knowledge. Increasing amount of molecular data (sequences, 3D structures, antigenicity, ...). Is it possible to explore these databases to discover useful stuff?
  4. Well... Let’s see
  5. [We may use] Complex Networks: Graphs → $G = (V, E)$
  6. Yeah, but how? We can explore the “dynamic signature” of these complex networks, i.e., study and compare their structural properties. Some useful formulas: clustering coefficient $\langle c \rangle = \frac{3 \times \#\text{triangles}}{\#\text{triples}}$; degree distribution $P_K = \sum_{K'=K}^{\infty} p_{K'}$; diameter $\max_{i,j} d(i, j)$.
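The three quantities on this slide can be computed with nothing but the Python standard library. What follows is a minimal sketch, not the code behind the talk: the function names, the adjacency-dictionary representation, and the toy graph are assumptions made for illustration. The clustering coefficient is taken as 3 × #triangles / #triples, the degree distribution as the cumulative form $P_K = \sum_{K' \ge K} p_{K'}$, and the diameter as the longest shortest path.

```python
from collections import Counter, deque
from itertools import combinations


def transitivity(adj):
    """3 * #triangles / #connected triples for an undirected graph.

    adj maps each node to the set of its neighbors.
    """
    closed = 0   # neighbor pairs that are themselves linked (= 3 * #triangles)
    triples = 0  # all neighbor pairs around each center node
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / triples if triples else 0.0


def cumulative_degree_distribution(adj):
    """P[K] = fraction of nodes whose degree is at least K."""
    n = len(adj)
    counts = Counter(len(nbrs) for nbrs in adj.values())
    P, tail = {}, 0
    for k in range(max(counts), -1, -1):
        tail += counts.get(k, 0)
        P[k] = tail / n
    return P


def diameter(adj):
    """Longest shortest path, found by a breadth-first search from every node.

    Assumption: unreachable pairs are ignored, so on a disconnected graph
    this returns the largest finite distance.
    """
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best


# Hypothetical toy graph: a triangle (0, 1, 2) with a pendant node 3 on node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(transitivity(toy), cumulative_degree_distribution(toy), diameter(toy))
```

On the toy graph there is one triangle and five connected triples, so the clustering coefficient is 3/5.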
  7. Ok, Let’s work then: (1) Grab $n$ sequences; (2) Create an $n \times n$ matrix using some kind of (normalized) distance (say, $S$); (3) For each $\sigma \in [0, 1]$ build $M(\sigma)$ such that $m_{ij}(\sigma) = 1$ if $S_{ij} > \sigma$ and $m_{ij}(\sigma) = 0$ if $S_{ij} < \sigma$. In a sense, we are transforming a single network into a family of networks.
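Steps 2 and 3 of this slide can be sketched in a few lines of Python. This is an illustration only, not the author's implementation: the function names `threshold_network` and `network_family`, the evenly spaced grid of σ values, and the toy matrix are assumptions. Two further choices are made here because the slide leaves them open: the diagonal is ignored (no self-loops), and $S_{ij} = \sigma$ is treated as no edge.

```python
def threshold_network(S, sigma):
    """Binary adjacency matrix M(sigma): m_ij = 1 if S_ij > sigma, else 0."""
    n = len(S)
    # Assumption: the diagonal is skipped so that no node links to itself.
    return [[1 if i != j and S[i][j] > sigma else 0 for j in range(n)]
            for i in range(n)]


def network_family(S, steps=10):
    """List of (sigma, M(sigma)) pairs on an evenly spaced grid over [0, 1]."""
    return [(k / steps, threshold_network(S, k / steps))
            for k in range(steps + 1)]


# Hypothetical normalized matrix for n = 3 sequences.
S = [[0.0, 0.9, 0.2],
     [0.9, 0.0, 0.5],
     [0.2, 0.5, 0.0]]

for sigma, M in network_family(S, steps=5):
    n_edges = sum(map(sum, M)) // 2  # S is symmetric, so each edge appears twice
    print(f"sigma={sigma:.1f} edges={n_edges}")
```

Each value of σ yields one network, so sweeping σ over [0, 1] produces the family of networks that the later analysis compares.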
  8. Analysis: We shall explore the relationships between these networks. First, define a higher-order neighborhood indicator function, such that you binarize the adjacency matrix with regard to the path length $\ell$, obtaining a matrix $\hat{M} = \sum_{\ell=1}^{D} \ell\, M(\ell)$. Then $\delta(\alpha, \beta) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( \frac{\hat{m}_{ij}(\alpha)}{D(\alpha)} - \frac{\hat{m}_{ij}(\beta)}{D(\beta)} \right)$ (1). Evaluating $\delta(\sigma, \sigma + \Delta\sigma)$ can give some interesting insights.
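A rough Python sketch of this comparison is given below. It rests on one reading of the slide and should be taken as such: $\hat{m}_{ij}$ is assumed to be the shortest path length between nodes $i$ and $j$ (that is, $\hat{M} = \sum_{\ell} \ell\, M(\ell)$, where $M(\ell)$ marks pairs exactly $\ell$ steps apart), $D$ is the largest finite path length, and δ is the plain average of the differences between the two normalized matrices. Setting unreachable pairs to 0 and the function names are further assumptions.

```python
from collections import deque


def neighborhood_matrix(M):
    """Return (M_hat, D) for a binary adjacency matrix M.

    M_hat[i][j] is the shortest path length from i to j.
    Assumption: it is 0 when i == j or when j cannot be reached from i.
    D is the largest finite path length (the diameter).
    """
    n = len(M)
    M_hat = [[0] * n for _ in range(n)]
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in range(n):
                if M[u][w] and w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for w, d in dist.items():
            M_hat[source][w] = d
    D = max(max(row) for row in M_hat)
    return M_hat, D


def delta(M_alpha, M_beta):
    """Average difference between the two normalized neighborhood matrices."""
    n = len(M_alpha)
    H_a, D_a = neighborhood_matrix(M_alpha)
    H_b, D_b = neighborhood_matrix(M_beta)
    total = 0.0
    for i in range(n):
        for j in range(n):
            a = H_a[i][j] / D_a if D_a else 0.0  # a network without edges has D = 0
            b = H_b[i][j] / D_b if D_b else 0.0
            total += a - b
    return total / n ** 2


# Hypothetical networks on 3 nodes: a path 0-1-2 and a single edge 0-1.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
edge = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
print(delta(path, edge))
```

Applying `delta` to networks built at consecutive thresholds σ and σ + Δσ gives the curve the slide suggests inspecting.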
  9. Example 1: Chitin pathway phylogeny: Proteins related to the chitin metabolic pathway from 1605 complete genomes; BLAST distances (which are asymmetric); search for phylogenetic relationships.
  10. Example 1: Some results
  11. Example 1: Some more results
  12. Example 1: The expected Network(s)
  13. Example 2: Foot-and-mouth disease virus in South America: S was built with phylogenetic (TN93) distances for nucleotide (NT) sequences and JTT distances for amino acid (AA) sequences; try to make sense of a somewhat big data set (167 seqs); extract some nice patterns.
  14. Indexes × σ [figure with panels (a) and (b)]
  15. A nice network
  16. Some more developments
  17. Related Work: Identify transmission clusters (HIV, HCV) (Lewis et al., 2008, PLoS Medicine); explore scale-free behavior in phylodynamics (Shiino, 2012, Frontiers in Microbiology).
  18. Future Directions: Explore the spatial aspect in the construction of S, maybe $\hat{S} = \mu + S(G)\alpha$; power law analysis; implement assortativity; suggestions...
  19. Thank You!
