iEvoBio_SplitDB
iEvoBio 2011 Talk

  • Motivation: conflicting results. We were motivated by discrepancies between the results of running different methods or parameters on the same set of taxa. For example, the support values for this node differ greatly between analyses. The first step in dealing with this is to understand the source of the problem: what causes the conflict?
  • Sources of conflict between results: 1) Computational methods: the most common algorithms include the Markov chain Monte Carlo (MCMC) method and Bayesian inference. The trees obtained by these methods can be completely different. 2) Models: different substitution models, free parameters calculated during stochastic methods, and so on can also influence the result. 3) Biological problem: different gene trees can be obtained for a fixed set of species, so trees obtained from the same set of taxa can have different topologies.
  • To address this we developed a tool with two components: storage and manipulation. Storage: a database instead of large flat files, because of these properties of a database. Data manipulation: graphical components visualize the results so that conflicts are easier to identify, such as ….
  • The storage component is a MySQL database running on a Linux server. The core code is written in Python and uses several libraries (listed on the slide). User access is through the web, using a combination of the Joomla application framework and PHP files. The core Python code is executed from the PHP code. Joomla is a CMS (content management system).
  • The tool accepts tree files from different packages such as:
  • Dataset: consists of any arbitrary set of leaves. Analysis: consists of a tree topology and split frequency values. The first step is to name a new dataset. The second step is to import analyses for that set of leaves, generated by any of the software packages mentioned previously. Remember that the purpose of this program is to store the split frequencies for tree topologies. You can export the tree topology and split frequencies for import into another tree-drawing program, such as FigTree, to create publication-quality trees that contain split frequency values for a set of analyses. To highlight conflicting results, you can customize the query to display only nodes where the analyses differ by more than some percentage (for example, 10%). You can also customize the query to show only nodes with a split frequency greater than some arbitrary value. The last feature I am demonstrating is how you can visually compare all of the split frequencies for a particular dataset across all pairwise comparisons of the analyses. The plots include an idealized line as a reference.
  • Our approach was to;
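
The conflict query described in the notes (show only nodes where analyses differ by more than some percentage, optionally restricted to splits above a minimum frequency) can be sketched in plain Python. This is an illustrative sketch, not the tool's actual code: the function name `conflicting_splits`, the analysis names, and the representation of a split as a `frozenset` of the taxa on one side are all assumptions made for the example.

```python
from itertools import combinations


def conflicting_splits(analyses, min_diff=0.10, min_freq=0.0):
    """Compare every pair of analyses and report splits whose frequencies disagree.

    `analyses` maps an analysis name to a dict of {split: frequency}, where a
    split is (hypothetically) a frozenset of the taxon names on one side.
    A split missing from an analysis is treated as having frequency 0.0.
    """
    conflicts = []
    for name_a, name_b in combinations(sorted(analyses), 2):
        freqs_a = analyses[name_a]
        freqs_b = analyses[name_b]
        # Sort splits by their sorted taxon names so the output order is stable.
        for split in sorted(set(freqs_a) | set(freqs_b), key=sorted):
            fa = freqs_a.get(split, 0.0)
            fb = freqs_b.get(split, 0.0)
            # Keep the split only if the analyses disagree by more than
            # min_diff and at least one of them supports it at min_freq.
            if abs(fa - fb) > min_diff and max(fa, fb) >= min_freq:
                conflicts.append((split, name_a, name_b, fa, fb))
    return conflicts


# Made-up frequencies for two hypothetical analyses of the same four taxa.
AB = frozenset({"A", "B"})
CD = frozenset({"C", "D"})
example = {
    "bayesian_run": {AB: 0.98, CD: 0.60},
    "bootstrap_run": {AB: 0.95, CD: 0.35},
}

for split, a, b, fa, fb in conflicting_splits(example, min_diff=0.10):
    print(sorted(split), a, fa, b, fb)
```

With these made-up numbers only the C,D split is reported, because its frequencies differ by 0.25 while the A,B split differs by only 0.03. Raising `min_freq` to 0.7 would hide it as well, since neither analysis supports it that strongly.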
  • Transcript

    • 1. A Mechanism for Storing, Retrieving and Evaluating Statistical Properties of Phylogenetic Trees
      Haleh Ashki, James C. Wilgenbusch, and Paul van der Mark.
      Department of Scientific Computing, Florida State University, Tallahassee, FL
    • 2. Motivation
    • 3. Causes of conflicting results:
      • Computational methods: (MCMC, Bayesian)
      • 4. Models: (substitution models, free parameters calculated during stochastic methods)
      • 5. Biological problem: (Gene Tree vs Species Tree)
    • Solution
      Developing a tool to identify conflicting results.
      • Store Data: Database: Better data retention; Speed; Interoperability; Fault tolerance.
      • 6. Data Manipulation and Visualization: Tables, Plots, Tree views.
    • Application
      Core:
      Python:
      DendroPy (Sukumaran, J. and Mark T. Holder. 2010)
      MySQLdb (http://mysql-python.sourceforge.net/MySQLdb.html)
      Matplotlib (http://matplotlib.sourceforge.net/)
      Database:
      MySQL 5.0 (Running on Linux Server)
      Web User Interface:
      Joomla: CMS & Application Framework
      PHP, JavaScript, GoogleAPI
    • 7. Accept Nexus Tree files as Input From:
      MrBayes (Huelsenbeck, J. P. and F. Ronquist. 2001)
      PAUP* (Swofford, D. L.)
      RAxML (Stamatakis, A., Ludwig, T., and Meier, H.)
      GARLI (Zwickl, D. J. 2006)
    • 8. Software
      Available :
      https://bpd.sc.fsu.edu/
      The Sourceforge page:
      http://sourceforge.net/projects/splitdb/
    • 9. Thanks To
      Department of Scientific Computing, Florida State University.
      The National Science Foundation for funding to support some of this work (EF-0849861)
    • 10. Demo
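
The slides describe datasets (arbitrary sets of leaves) and analyses (tree topologies with split frequencies) held in a database rather than flat files. A minimal sketch of what such a layout might look like follows. Everything here is an assumption for illustration: the table and column names are hypothetical, splits are stored as comma-joined taxon names, and Python's built-in `sqlite3` stands in for the MySQL server so that the example runs without any setup.

```python
import sqlite3

# Hypothetical three-table layout: one dataset has many analyses, and each
# analysis has one frequency per split. This is not the tool's real schema.
SCHEMA = """
CREATE TABLE dataset (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE analysis (
    id         INTEGER PRIMARY KEY,
    dataset_id INTEGER NOT NULL REFERENCES dataset(id),
    name       TEXT NOT NULL
);
CREATE TABLE split_frequency (
    analysis_id INTEGER NOT NULL REFERENCES analysis(id),
    split       TEXT NOT NULL,
    frequency   REAL NOT NULL,
    PRIMARY KEY (analysis_id, split)
);
"""


def build_demo_db():
    """Create an in-memory database holding one dataset with two analyses."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO dataset (id, name) VALUES (1, 'demo_taxa')")
    conn.executemany(
        "INSERT INTO analysis (id, dataset_id, name) VALUES (?, 1, ?)",
        [(1, "bayesian_run"), (2, "bootstrap_run")],
    )
    # Made-up split frequencies, for illustration only.
    conn.executemany(
        "INSERT INTO split_frequency (analysis_id, split, frequency) VALUES (?, ?, ?)",
        [(1, "A,B", 0.98), (1, "C,D", 0.60), (2, "A,B", 0.95), (2, "C,D", 0.35)],
    )
    return conn


def split_table(conn, dataset_name):
    """Return (split, analysis name, frequency) rows for one dataset."""
    query = """
        SELECT sf.split, a.name, sf.frequency
        FROM split_frequency AS sf
        JOIN analysis AS a ON a.id = sf.analysis_id
        JOIN dataset AS d ON d.id = a.dataset_id
        WHERE d.name = ?
        ORDER BY sf.split, a.name
    """
    return conn.execute(query, (dataset_name,)).fetchall()


for row in split_table(build_demo_db(), "demo_taxa"):
    print(row)
```

Keeping one row per (analysis, split) pair means a table view, a threshold query, or a pairwise plot can each be produced by a single query over `split_frequency`, which is one plausible reason to prefer a database over large flat files.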