Motivation: Conflicting results. Our motivation was the discrepancy we encountered between the results of running different methods or parameters on the same set of taxa. For example, the support values for this node differ greatly between analyses. The first step in dealing with this is to understand the source of the problem: what causes the conflict?
Sources of conflict between results: 1) Computational methods: commonly used approaches include Markov chain Monte Carlo (MCMC) methods and Bayesian inference, and the trees obtained by different methods can be completely different. 2) Models: different substitution models, free parameters estimated during stochastic methods, and so on can also influence the result. 3) Biological problem: different gene trees can be obtained for a fixed set of species, so trees obtained from the same set of taxa can have different topologies.
To address this we developed a tool with two components: storage and manipulation. Storage: a database instead of large flat files, because of these properties of a database. Data manipulation: graphical components visualize the results so that conflicts are easier to identify, such as ….
The storage component is a MySQL database running on a Linux server. The core code is written in Python. Some libraries are used as: The web is used for user access, through a combination of the Joomla application framework and PHP files. The core Python code is executed from the PHP code. Joomla CMS: content management system.
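As an illustration of the storage idea, here is a minimal sketch of how datasets, analyses and split frequencies could be laid out in relational tables. It is not the tool's actual schema: the table names, column names, and the helper functions `add_analysis` and `frequencies_for` are all hypothetical, and Python's built-in `sqlite3` module stands in for MySQL so that the example runs without a server.

```python
import sqlite3

# Assumption: sqlite3 (standard library) stands in for the MySQL server,
# and this schema is hypothetical, not the tool's real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE analysis (
    id         INTEGER PRIMARY KEY,
    dataset_id INTEGER NOT NULL REFERENCES dataset(id),
    name       TEXT NOT NULL
);
CREATE TABLE split_frequency (
    analysis_id INTEGER NOT NULL REFERENCES analysis(id),
    split_key   TEXT NOT NULL,   -- leaf names on one side of the split, e.g. 'A,B'
    frequency   REAL NOT NULL,   -- proportion of trees containing the split
    PRIMARY KEY (analysis_id, split_key)
);
""")


def add_analysis(conn, dataset_name, analysis_name, frequencies):
    """Store one analysis (split_key -> frequency) under a named dataset."""
    conn.execute("INSERT OR IGNORE INTO dataset (name) VALUES (?)", (dataset_name,))
    dataset_id = conn.execute(
        "SELECT id FROM dataset WHERE name = ?", (dataset_name,)
    ).fetchone()[0]
    cur = conn.execute(
        "INSERT INTO analysis (dataset_id, name) VALUES (?, ?)",
        (dataset_id, analysis_name),
    )
    conn.executemany(
        "INSERT INTO split_frequency VALUES (?, ?, ?)",
        [(cur.lastrowid, key, freq) for key, freq in frequencies.items()],
    )
    conn.commit()


def frequencies_for(conn, dataset_name):
    """Return (analysis name, split_key, frequency) rows for one dataset."""
    return conn.execute(
        """
        SELECT a.name, s.split_key, s.frequency
        FROM split_frequency AS s
        JOIN analysis AS a ON a.id = s.analysis_id
        JOIN dataset  AS d ON d.id = a.dataset_id
        WHERE d.name = ?
        ORDER BY a.name, s.split_key
        """,
        (dataset_name,),
    ).fetchall()


# Made-up example values, for illustration only.
add_analysis(conn, "demo_dataset", "mrbayes_run", {"A,B": 0.98, "C,D": 0.55})
add_analysis(conn, "demo_dataset", "raxml_boot", {"A,B": 0.95, "C,D": 0.20})
for row in frequencies_for(conn, "demo_dataset"):
    print(row)
```

Keeping one row per analysis and split means that any number of analyses can be attached to the same dataset and compared later with ordinary queries.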
The tool accepts tree files from different packages, such as:
Dataset: consists of any arbitrary set of leaves.
Analysis: consists of a tree topology and split frequency values.
The first step is to name a new dataset; a dataset consists of any arbitrary set of leaves.
The second step is to import analyses for that set of leaves, generated by any of the software packages mentioned previously. Remember that the purpose of this program is to store the split frequencies for tree topologies.
You can export the tree topology and split frequencies for import into another tree-drawing program, such as FigTree, to create publication-quality trees that contain split frequency values for a set of analyses.
To highlight conflicting results, you can customize the query to display only nodes where the analyses differ by more than some percentage, for example 10%. You can also customize the query to show only nodes whose split frequency is greater than some arbitrary value.
The last feature I am demonstrating is a visual comparison of the split frequencies for a particular dataset across all pairwise comparisons of the analyses. The plots include an idealized line as a reference.
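The two queries and the pairwise comparison described above can be sketched in a few lines of Python. This is an illustration only, not the tool's code: the function names `conflicting_splits` and `pairwise_points`, the example frequencies, and the choice to treat a split that is missing from an analysis as having frequency 0.0 are all assumptions.

```python
from itertools import combinations

# Made-up example data: analysis name -> {split (set of leaf names) -> frequency}.
analyses = {
    "mrbayes_run": {frozenset({"A", "B"}): 0.98, frozenset({"C", "D"}): 0.55},
    "raxml_boot": {frozenset({"A", "B"}): 0.95, frozenset({"C", "D"}): 0.20},
}


def all_splits(analyses):
    """Collect every split seen in any analysis, in a stable order."""
    splits = set()
    for freqs in analyses.values():
        splits.update(freqs)
    return sorted(splits, key=sorted)


def conflicting_splits(analyses, min_diff=0.10, min_freq=0.0):
    """Return splits whose frequencies differ by more than min_diff.

    A split is kept only if its highest frequency is at least min_freq.
    Assumption: a split absent from an analysis counts as frequency 0.0.
    """
    result = {}
    for split in all_splits(analyses):
        values = {name: freqs.get(split, 0.0) for name, freqs in analyses.items()}
        highest, lowest = max(values.values()), min(values.values())
        if highest - lowest > min_diff and highest >= min_freq:
            result[split] = values
    return result


def pairwise_points(analyses):
    """For each pair of analyses, list (x, y) split-frequency pairs.

    Plotted as a scatter plot, points on the line y = x are splits on
    which the two analyses agree exactly (the idealized reference line).
    """
    splits = all_splits(analyses)
    points = {}
    for a, b in combinations(sorted(analyses), 2):
        points[(a, b)] = [
            (analyses[a].get(s, 0.0), analyses[b].get(s, 0.0)) for s in splits
        ]
    return points


for split, values in conflicting_splits(analyses, min_diff=0.10).items():
    print(sorted(split), values)
```

With the example data, only the split separating C and D is reported, because its frequency differs by 0.35 between the two analyses while the A and B split differs by only 0.03.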
Our approach was to:
A Mechanism for Storing, Retrieving and Evaluating Statistical Properties of Phylogenetic Trees. Haleh Ashki, James C. Wilgenbusch, and Paul van der Mark. Department of Scientific Computing, Florida State University, Tallahassee, FL
Biological problem: gene tree vs. species tree.
Solution: developing a tool to identify conflicting results.
Store data: a database gives better data retention, speed, interoperability, and fault tolerance.
Manipulate and visualize data: tables, plots, tree views.
Accepts Nexus tree files as input from: MrBayes (Huelsenbeck, J. P. and F. Ronquist. 2001), PAUP (Swofford, D. L.), RAxML (Stamatakis, A., Ludwig, T., and Meier, H.), GARLI (Zwickl, D. J., 2006).
Software available: https://bpd.sc.fsu.edu/ SourceForge page: http://sourceforge.net/projects/splitdb/
Thanks to the Department of Scientific Computing, Florida State University, and to the National Science Foundation for funding that supported some of this work (EF-0849861).