Aloha, Willi Hennig Society members. My name is Ross Mounce and I'm here representing the University of Bath (UK) to present the initial results from the first 6 months of my PhD research. My first project was a meta-analysis investigating the congruence of cranial and postcranial characters in vertebrate systematics.
I would hope that here [pause] at a Willi Hennig Society meeting, I would be preaching to the converted when talking about the utility of morphology in phylogenetic analyses, but just to reiterate: the inference of evolutionary relationships between taxa requires the assessment of many different independent or pseudo-independent sources of data. So in principle (if not in practice), the more relevant data (including morphological data) that are evaluated, the better the test of the hypothesis, under the philosophical principle of Total Evidence. So what are these different types of evidence I'm talking about? [CLICK]
A typical combined analysis might involve morphological and molecular data. However, this is rather an oversimplification of the matter, because the “molecular data” in the equation can actually be composed of many independent subsets of data... [CLICK] Nuclear genes, mitochondrial genes, non-coding genes... These data subsets or 'partitions' are logically independent of each other to a significant extent. Indeed, it is commonly observed that mitochondrial and nuclear genes can have very different signals relative to each other. Likewise, coding and non-coding regions differ, as do 1st, 2nd and 3rd codon positions. There are many logically and empirically justified ways of partitioning molecular data, and these have all been compared relative to each other to explore data conflict. However, the congruence (or otherwise) of subsets of morphological data remains largely unexplored. [CLICK]
Here are a few morphological data partitions that I think are worth investigating relative to each other. In this study I've chosen to divide datasets into cranial and postcranial characters, but there are actually a multitude of other unexplored ways in which the congruence of morphological data could be compared, and for a great variety of taxa: animal, insect, plant, extant or fossil. [CLICK]
So why did I choose cranial/postcranial to investigate? Many vertebrate systematists seem to hold unproven theories about the value of postcranial evidence; indeed, it is rather neglected in studies of some taxa, such as lower ray-finned fish. Following on from that last quote, one of the first things I assessed in the literature was cranial character bias... [CLICK]
And on average, from what I've looked at, I think there is a numerical bias towards cranial characters in phylogenetic analyses. On average, for a sample of 70 different datasets, more than 60% of the characters described in those analyses are cranial. When I say 'different', I should clarify that I stratified my sampling so that I didn't end up with 50 dinosaur datasets, 10 mammalian analyses and a few studies on fish and reptiles. Now, given that the majority of morphology-based phylogenetic analyses are equally weighted (giving each and every character the same impact), clearly if there tend to be more cranial characters then the overall signal in an analysis will derive disproportionately from the cranial region! But does an imbalance of contribution really matter? It would only matter if the signal from each partition were different. [CLICK]
Finally, instead of asking whether the partitions are congruent with each other, perhaps we should ask how congruent each is with the combined analysis result: the whole tree. The trouble with this is that, as I showed earlier, one partition (usually the cranial) is almost always larger than the other and thus, in an equal-weights analysis, will be more likely to resemble the overall result simply by virtue of having contributed more to it. So we overcame this bias by bootstrapping characters so that each partition contributes the same number of characters to the overall result, namely the number in the smaller partition.
[NOTES:] Tests how similar each partition is (topologically, by tree-to-tree distance measures) to the whole-analysis tree.
ENTIRE OF ONE, BOOTSTRAP OTHER: take the number of characters in the smaller partition and the same number from the larger, so that the partitions are equal in size, bootstrapped.
BOOTSTRAP BOTH: then take half the number of characters in the smaller partition and the same number from the larger partition (again so that the partitions are equal), both bootstrapped.
Get tree-to-tree distances (whole tree to bootstrap tree) for (1) majority-rule trees and (2) the mean minimum neighbour net (most similar to most similar) of all the MPTs. We also calculated (3) strict and (4) semistrict consensus distances, just because we could computationally, but they're gibberish, so ignore them.
Perform a Mann-Whitney U test comparing the sets of bootstrap distance-to-whole-tree results (100 cranial v 100 postcranial). If significant, it means that the median of one group differs significantly from the median of the other; in plain English, one of the partitions (cranial or postcranial) lies at a significantly different tree distance from the whole-analysis topology than the other partition does.
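The resampling and comparison described in these notes can be sketched roughly as follows. This is a minimal illustration, not the code used for the talk: the function names, the use of column indices to stand for characters, and the `fraction` parameter (1.0 for the first scheme, 0.5 for "bootstrap both") are all assumptions. Only the U statistic is computed here; a p-value would normally come from a statistics package.

```python
import random


def equalised_bootstrap(cranial, postcranial, fraction=1.0, rng=None):
    """Draw the same number of characters, with replacement, from each partition.

    `cranial` and `postcranial` are lists of character (column) indices.
    The draw size is `fraction` times the size of the smaller partition
    (hypothetical parameter; 0.5 mimics the "bootstrap both" scheme).
    """
    rng = rng or random.Random()
    n = max(1, int(min(len(cranial), len(postcranial)) * fraction))
    return ([rng.choice(cranial) for _ in range(n)],
            [rng.choice(postcranial) for _ in range(n)])


def mann_whitney_u(xs, ys):
    """Return the Mann-Whitney U statistics (U_x, U_y), using mid-ranks for ties."""
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mean_rank = (i + 1 + j) / 2  # average of the ranks i+1 .. j
        rank_sum_x += mean_rank * sum(1 for _, group in pooled[i:j] if group == 0)
        i = j
    u_x = rank_sum_x - len(xs) * (len(xs) + 1) / 2
    return u_x, len(xs) * len(ys) - u_x
```

In use, each bootstrap draw would be analysed to give a tree, the distance from that tree to the whole-analysis tree would be recorded, and the two sets of 100 distances would be passed to `mann_whitney_u`.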
Question: given that the tree-to-tree distance is a single number, could each partition be equally (and/or consistently equally, over the set of bootstrap replicates) far from the whole tree numerically, and yet lie in different directions (be different topologically) relative to each other rather than to the whole tree? How about a measure based on shared node identity? AgD1 - argh! Plea to fix AgD1 algorithm?
So my next step was to test the similarity or 'congruence' of signal between the partitions. To do this, I used three different methods, namely: the incongruence length difference (ILD), the topological incongruence length difference (TILD), and a new test my supervisor has developed called the Incongruence Relationship Difference (IRD).
The incongruence length difference is a randomisation-based test of the significance of the difference between the length of the most parsimonious solution from the analysis of the whole data matrix [the left-hand side of the slide] and the summed lengths of the most parsimonious solutions from analyses of each partition on its own [illustrated on the right-hand side of the slide].
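As a rough illustration of the quantities involved, the sketch below counts parsimony steps on one fixed tree with Fitch's algorithm and then forms the length difference. The nested-tuple tree representation, the function names and the toy matrix are assumptions made for the example; finding the most parsimonious trees themselves requires a tree search in dedicated software, which is not attempted here.

```python
def fitch_steps(tree, states):
    """Return (possible_states, steps) for one character on a rooted binary tree.

    `tree` is a taxon name or a pair of subtrees; `states` maps taxon -> state.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    # No shared state: one change is needed on this part of the tree.
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, matrix):
    """Sum the Fitch steps over every character (column) of the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_steps(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


def ild(combined_length, partition_lengths):
    """Length of the combined analysis minus the summed partition lengths."""
    return combined_length - sum(partition_lengths)


# Toy data (hypothetical): two binary characters for four taxa.
toy_matrix = {"A": "11", "B": "10", "C": "01", "D": "00"}
toy_tree = (("A", "B"), ("C", "D"))
print(tree_length(toy_tree, toy_matrix))
# Lengths quoted on the ILD slide: combined 25, partitions 11 and 12.
print(ild(25, [11, 12]))
```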
This slide shows a mock-up of the observed cranial/postcranial partition and 3 randomised partitions. In the bottom right-hand corner of the slide you can see a distribution of a thousand replications; the red arrow indicates the ILD for the specified partition, and clearly it is not significantly different from the randomly partitioned sets. On the next slide I'll show you some real ILD results... [CLICK] To determine the significance of the observed length difference, one compares it to the length differences from hundreds of other randomised partitions of the same size. In this example, the first randomized partition, in the top right, has a length difference of just 1, as does the randomized partition scheme in the bottom left, whilst random partition scheme 999 is less congruent than the original split. The bar chart in the bottom right shows the full incongruence length difference distribution of 999 randomized partitions. From this we can see that an ILD of 2 is less congruent than average but is NOT significantly incongruent relative to all possible partitions. Now I'll show you the results I obtained on my datasets using the ILD test. [CLICK]
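The randomisation step might be sketched like this. It is an illustration under stated assumptions rather than the procedure as implemented in any particular package: the function names are hypothetical, the observed partition is assumed to be the first `size_first` columns of the matrix, `statistic` is a caller-supplied function that would run the parsimony analyses and return the length difference, and the p-value uses one common convention (counting the observed value among the replicates).

```python
import random


def random_partition(n_chars, size_first, rng):
    """Randomly split column indices 0..n_chars-1 into two sets of fixed sizes."""
    columns = list(range(n_chars))
    rng.shuffle(columns)
    return sorted(columns[:size_first]), sorted(columns[size_first:])


def randomisation_p_value(observed, null_values):
    """Proportion of values, observed included, that are >= the observed value."""
    at_least = sum(1 for value in null_values if value >= observed)
    return (at_least + 1) / (len(null_values) + 1)


def randomisation_test(n_chars, size_first, statistic, n_reps=999, seed=0):
    """Compare the observed statistic with that of n_reps random partitions."""
    rng = random.Random(seed)
    observed = statistic(list(range(size_first)), list(range(size_first, n_chars)))
    null_values = [
        statistic(*random_partition(n_chars, size_first, rng))
        for _ in range(n_reps)
    ]
    return observed, randomisation_p_value(observed, null_values)
```

With the three randomised partitions drawn on the slide (length differences 1, 1 and 3) and an observed value of 2, `randomisation_p_value(2, [1, 1, 3])` gives 0.5 under this convention.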
With a sample size of 63 (carefully thinned from a much larger pool of possible studies, both to minimize terminal overlap and to maximize vertebrate tree coverage), we found some surprising results: a quarter of the datasets tested showed strongly significant incongruence between cranial and postcranial character partitions, and a few more were mildly incongruent, though not significantly so. [Pre-emptively defend heuristic search settings? Used over 5000 hrs of computing time in the end, one dataset took over 500 hours to complete!] The ILD is not without criticism: there are examples of false positives and of false negatives; interested readers should refer to papers by Paul Planet, Martin Ramirez and (my former supervisor) Donald Quicke, among many others. So, given that the ILD test has its detractors, what do the other tests say about the same sample data?
Undeterred by the apparent poor performance of TILD, my supervisor has devised a different ILD-like test using topological data. We call it the Incongruence Relationship Difference test. It is much like the ILD, except that instead of tree length, one compares the difference in topology using tree-to-tree distance measures such as the Robinson-Foulds distance (aka symmetric difference) and the agreement subtree distance (AgD1). For technical reasons, I shall only discuss the Robinson-Foulds results. Regrettably, I found out the hard way that the maximum agreement subtree distance algorithm (AgD1) in PAUP* seems to crash quite consistently if one attempts to calculate AgD1 distances for trees with more than some threshold number of taxa.
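For readers unfamiliar with the symmetric difference, here is a minimal sketch of the Robinson-Foulds distance for two trees on the same taxa. The nested-tuple tree representation and the function names are assumptions made for illustration; this is not the implementation used for the results in the talk. Trees are treated as unrooted, so each clade is converted to the side of its split that excludes a fixed reference taxon.

```python
def leaves(tree):
    """Return the set of taxon names below a node (a name or a tuple of subtrees)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of an unrooted tree.

    Each split is stored as the side that does not contain the reference
    taxon, so the same split always has the same representation.
    """
    all_taxa = leaves(tree)
    reference = min(all_taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_taxa - clade if reference in clade else clade
        # Skip trivial splits (single taxa, or everything but one taxon).
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Count the splits found in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must contain the same taxa")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-taxon trees differing in the placement of B and C.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(tree_1, tree_2))
```

In an IRD-style test, a distance like this between the trees from the two partitions would take the place of the length difference, and would be compared with the same distance for randomly generated partitions of the same size.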
The results are much like those of the ILD test in terms of the proportion significant; however, in some cases the ILD and IRD identify different datasets as having incongruent partitions. This is a good thing: if the ILD and IRD were exactly the same, there'd be no point. It could be that they are detecting different types of incongruence; further work is needed to determine exactly what the IRD is seeing when the ILD doesn't agree.
Phylogenetic Congruence between Cranial and Postcranial Characters in Archosaur Systematics
Ross Mounce and Matthew Wills [email_address] @rmounce #SVP2011
Introduction: Total Evidence sensu Carnap (1950). The strongest test of a phylogenetic hypothesis is provided by the comparison of multiple lines of independent evidence.
Example Data Partitions Molecular data + Morphological data
Example Data Partitions Molecular data [nuclear genes] [mitochondrial genes] [coding and non-coding] + Morphological data
Morphological Data Partitions: a few examples <ul><ul><li>Cranial | Postcranial </li></ul></ul><ul><ul><li>(for Archosaurs, this talk) </li></ul></ul><ul><ul><li>Genital | Non-genital </li></ul></ul><ul><ul><li>(for insects, Song & Bucheli, Cladistics, 2010) </li></ul></ul><ul><ul><li>'Hard parts' | 'Soft parts' </li></ul></ul><ul><ul><li>(in prep.) </li></ul></ul>Brusatte | MorphoBank m56265 Leidy | MorphBank 143314, 143309
Motivation
“It is commonly believed that there are differences in the evolutionary lability of the crania, dentition, and postcrania of mammals” (Sanchez-Villagra & Williams, 1998)
“...postcranial characters either from the vertebral column or fins are considerably less used in phylogenetic analyses of lower actinopterygians” (Arratia, Acta Zoologica 2009)
Generally, there are more cranial characters in real data matrices: > 60% of vertebrate characters are cranial*. But some groups differ significantly from this overall trend, e.g. studies of Aves and Sauropods.
* Based upon a stratified sample of 120 datasets across all Vertebrata, published between 2001 and 2011, excluding matrices that were either 100% cranial or 100% postcranial.
Which set is most homoplasious? Cranial: mean CI = 0.587; Postcranial: mean CI = 0.563 (N = 50) <ul><li>There is a difference, but it is NOT statistically significant </li></ul><ul><li>Furthermore, this 'test' is unsound, as number of characters (as a variable) has a known negative effect on CI [Klassen et al, 1991] </li></ul><ul><li>CI is not an appropriate statistic to compare between datasets </li></ul><ul><li>Archie's (1989b) Homoplasy Excess Ratio (HER) is better, BUT it has problems associated with high levels of missing data </li></ul>
The Consistency Index is not a good statistic for use in comparative cladistic studies
[Figure: Consistency Index (CI), from 0 to 1.0, plotted against number of characters in the (whole) dataset, from 0 to 450; N (datasets) = 163]
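For reference, the ensemble CI discussed on these slides is, as usually defined, the minimum number of steps the characters could possibly require divided by the number of steps observed on the tree. The sketch below is a minimal illustration under stated assumptions: the function names and the toy matrix are hypothetical, `?` and `-` are assumed to code missing or inapplicable data, and the observed tree length is supplied by the caller rather than computed.

```python
def minimum_steps(matrix):
    """Minimum conceivable steps: (number of observed states - 1), summed over characters."""
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_chars):
        # Missing/inapplicable codes are ignored (assumed to be '?' and '-').
        states = {row[i] for row in matrix.values()} - {"?", "-"}
        total += max(len(states) - 1, 0)
    return total


def consistency_index(matrix, observed_length):
    """Ensemble CI: minimum possible steps divided by the observed tree length."""
    return minimum_steps(matrix) / observed_length


# Toy matrix (hypothetical): three binary characters; suppose the best tree has 4 steps.
toy = {"Out": "000", "A": "011", "B": "101", "C": "110"}
print(consistency_index(toy, 4))
```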
Incongruence Length Difference (Mickevich & Farris, 1981)
Combined matrix, L = 25:
Out 000000000 000000000
A   001110011 000000011
B   001110000 000001100
C   001100011 000111111
D   110000000 001111100
E   110001101 111111101
F   110001100 111111100
First partition alone, L = 11:
Out 000000000
A   001110011
B   001110000
C   001100011
D   110000000
E   110001101
F   110001100
Second partition alone, L = 12:
Out 000000000
A   000000011
B   000001100
C   000111111
D   001111100
E   111111101
F   111111100
ILD value = Combined - Separate = 25 - (11 + 12) = 2
Determining the significance of ILD (Farris et al, 1995a,b)
Observed partition, Cranial | Postcranial (ILD = 2):
Out 000000000 000000000
A   001110011 000000011
B   001110000 000001100
C   001100011 000111111
D   110000000 001111100
E   110001101 111111101
F   110001100 111111100
[Figure: the same matrix with its characters randomly reassigned to two partitions of the same size. Randomized partition 1: ILD = 1; Randomized partition 2: ILD = 1; … Randomized partition 999: ILD = 3]
Significant ILD p-values (Cranial | Postcranial) *
Key: ILD p-values less than 0.050 are generally considered to indicate SIGNIFICANT incongruence between the data partitions being tested.
* 999 random reps, heuristic search, TBR-swapping, maxtrees 10000, hold 1000, RAS 10

Dataset            Taxon              ILD p-value
Nesbitt 2011       Archosaurs         0.003
Ezcurra 2007       Coelophysoids      0.005
Martinez 2011      Dinosauria         0.014
Makovicky 2010     Ornithomimosaurs   0.026
Chure 2010         Sauropods          0.001
Allain 2008        Sauropods          0.005
Wilson 2002        Sauropods          0.007
Wilson 2009        Sauropods          0.014
Mannion 2011       Sauropods          0.016
Zaher 2011         Sauropods          0.023
Carballido 2010    Sauropods          0.024
Taylor 2011        Sauropods          0.034
Suteethorn 2010    Sauropods          0.035
Taylor 2009        Sauropods          0.036
Mo 2010            Sauropods          0.067
Incongruence Relationship Difference (a new randomization-based method, Mounce & Wills, in prep.)
To cut what would otherwise be a long story short: it's like the ILD test, but instead of measuring difference in length (steps), topological difference (between cladograms) is measured quantitatively using tree-to-tree distances and compared to the distance between cladograms from randomly generated partitions (of the same size).
Example distances: Symmetric Difference (Robinson-Foulds distance); Agreement Subtrees (AgD1, Goddard, 1994); Subtree Prune and Regraft distance (e.g. Wu, 2008)
The IRD test results: complicated
[Figure: IRD(RF) significance p-values alongside ILD significance p-values]
The IRD(RF, majrule) results appear to show even more significant incongruence.
An example, using data from Chure et al, Naturwissenschaften, 2010:
[Figure: tanglegram comparing 14 MPTs with 104 MPTs; visualization produced using Dendroscope 3, Huson & Scornavacca (2011)]
Recap & Conclusions <ul><li>'Apparent' significant incongruence of phylogenetic signal between cranial and postcranial data partitions is not uncommon, however one chooses to measure it (ILD or IRD) </li></ul><ul><li>Particularly incongruent in Sauropod datasets </li></ul><ul><li>There can only be 3 possible explanations: </li></ul><ul><li>1.) Coding error (impossible: many expert authors agree on the characters and how they are coded; differences are relatively few) </li></ul><ul><li>2.) An error of the method (unlikely: the ILD test is resilient to missing data) </li></ul><ul><li>3.) Modularity of evolution and/or a difference in the rate of evolution (possible?) </li></ul>
Acknowledgements Funding Computational Resources Help and Guidance Matthew Wills Sylvain Gerber Biodiversity Lab 1.07 Ward Wheeler Data … and all authors who kindly provided me their data, knowingly or otherwise www.graemetlloyd.com/data
A brighter future for digitalised paleontology?
“To promote the preservation of and future access to data, the Journal of Vertebrate Paleontology is considering following other key journals in instituting a policy requiring that data supporting the results presented in a publication be archived in a public repository.”
Fairbairn, D. J. (2011) The advent of mandatory data archiving. Evolution 65, 1-2. http://dx.doi.org/10.1111/j.1558-5646.2010.01182.x
Berta, A. & Barrett, P. M. (2011) Editorial. Journal of Vertebrate Paleontology 31, 1