Phylogenetic models and MCMC methods for the reconstruction of language history

Presentation Transcript

  • Phylogenetic models and MCMC methods for the reconstruction of language history
    • Robin J. Ryder, CEREMADE – Paris Dauphine / CREST – INSEE
    • Joint work with Geoff K. Nicholls at the Department of Statistics, University of Oxford
    • www.slideshare.net/robinryder
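  • Carles li reis, nostre emper[er]e magnes / Set anz tuz pleins ad estet en Espaigne : / Tresqu’en la mer cunquist la tere altaigne. / N’i ad castel ki devant lui remaigne ; / Mur ne citet n’i est remes a fraindre, / Fors Sarraguce, ki est en une muntaigne. Chanson de Roland, 1r (11th century)
    • English gloss: Charles the king, our great emperor, / has been in Spain for seven full years: / he conquered the high land as far as the sea. / No castle remains standing before him; / no wall or city is left to be broken, / except Saragossa, which is on a mountain.
  • La plus commune façon d'amollir les coeurs de ceux qu'on a offensez, lors qu'ayant la vengeance en main, ils nous tiennent à leur mercy, c'est de les esmouvoir par submission à commiseration et à pitié. Montaigne, Essais, I, 1 (1580)
    • English gloss: The most common way of softening the hearts of those we have offended, when, with vengeance in hand, they hold us at their mercy, is to move them by submission to commiseration and pity.
  • Tes yeux sont si profonds qu'en me penchant pour boire / J'ai vu tous les soleils y venir se mirer / S'y jeter à mourir tous les désespérés / Tes yeux sont si profonds que j'y perds la mémoire. Aragon, Les Yeux d'Elsa (1942)
    • English gloss: Your eyes are so deep that, leaning down to drink, / I saw all the suns come to mirror themselves there, / all the desperate throw themselves in to die. / Your eyes are so deep that I lose my memory in them.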
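  • Et la piaule swingue au son du ghetto, on tape à la porte / Chill c'est trop fort ! baisse le son merde ! j'connais / A chaque fois c'est pareil tant pis il faut qu'ça pète / Et profite en traître des nouveaux albums qu'Rod m'achète. Akhénaton, Juste une pression (2005)
    • Approximate English gloss: And the room swings to the sound of the ghetto, someone knocks at the door / "Chill, it's too loud! Turn it down, damn it!" I know / It's the same every time, too bad, it has to blast / And sneakily enjoy the new albums Rod buys me.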
  • What to expect
    • Description of the data
    • Model of language diversification
    • MCMC for phylogenetic trees
    • Synthetic studies
    • Analysis of two data sets
  • Indo-European languages
  • Language diversification
    • Languages change in a way comparable to biological species.
    • Similarities between languages indicate that they may be cousins.
    • Most common model: phylogenetic tree
  • Questions
    • Topology
    • Internal ages
    • Age of the root: 6000-6500 BP or 8000-9500 BP?
    • (BP=Before Present)
  • Core vocabulary
    • 100 or 200 meanings, present in almost all languages: bird, hand, to eat, red...
    • Borrowing is possible (non-tree-like change), but:
    • “Easy” to detect
    • Uncommon
    • Does not introduce systematic bias
  • Data coding
    • Old English: stierfþ
    • Old High German: stirbit, touwit
    • Avestan: miriiete
    • Old Church Slavonic: umĭretŭ
    • Latin: moritur
    • Oscan: ?
    • Cognacy classes: 1. {stierfþ, stirbit}; 2. {touwit}; 3. {miriiete, umĭretŭ, moritur}
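The coding step can be illustrated with a minimal Python sketch. It turns the word forms and cognacy classes listed on this slide into one binary trait per class, with the unknown Oscan form coded as missing. The function name `code_language` and the use of `"?"` for missing entries are assumptions made for illustration, not the authors' actual data format.

```python
# Word forms per language, copied from the slide; None marks the unknown Oscan form.
words = {
    "Old English": ["stierfþ"],
    "Old High German": ["stirbit", "touwit"],
    "Avestan": ["miriiete"],
    "Old Church Slavonic": ["umĭretŭ"],
    "Latin": ["moritur"],
    "Oscan": None,
}

# Cognacy classes 1 to 3, copied from the slide.
cognacy_classes = [
    {"stierfþ", "stirbit"},
    {"touwit"},
    {"miriiete", "umĭretŭ", "moritur"},
]


def code_language(forms, classes):
    """Return one entry per cognacy class: 1 (present), 0 (absent) or '?' (missing)."""
    if forms is None:  # hypothetical convention: no attested form means all traits are missing
        return ["?"] * len(classes)
    return [1 if any(form in cls for form in forms) else 0 for cls in classes]


# One row per language, one column per cognacy class.
matrix = {lang: code_language(forms, cognacy_classes) for lang, forms in words.items()}

for lang, row in matrix.items():
    print(f"{lang:20s} {row}")
```

Each column of `matrix` is then one binary trait; Old High German has a 1 in two columns because it has two words for the same meaning.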
  • Constraints
    • Constraints on parts of the topology
    • Constraints on some internal ages
    • We use these constraints to infer rates and other ages
  • Description of the model (1)
    • Traits are born at rate λ
    • Trait instances die at rate μ
    • λ and μ are constants
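The birth and death rules above can be simulated along a single branch. The Python sketch below relies on two standard facts about this kind of birth-death process: an existing trait instance survives a branch of length t with probability exp(-μt), and the number of traits born on the branch that survive to its end is Poisson with mean (λ/μ)(1 - exp(-μt)). The function names and the parameter values are hypothetical, and this is not TraitLab code.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson(mean) variate with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def evolve_branch(n_start, duration, lam, mu, rng):
    """Return (survivors, newborn): trait counts at the end of a branch of length `duration`.

    lam is the trait birth rate (λ) and mu the per-instance death rate (μ).
    """
    p_survive = math.exp(-mu * duration)
    # Each of the n_start traits present at the top of the branch survives independently.
    survivors = sum(rng.random() < p_survive for _ in range(n_start))
    # Traits born on the branch and still alive at its end.
    newborn = sample_poisson(lam / mu * (1.0 - p_survive), rng)
    return survivors, newborn


# Hypothetical values: 40 traits at the start, branch of length 1.0, λ = 5, μ = 1.
rng = random.Random(2024)
print(evolve_branch(40, 1.0, 5.0, 1.0, rng))
```

On a long branch the total count settles around λ/μ, which is the stationary mean number of traits per language.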
  • Description of the model (2)
    • Catastrophes occur at rate ρ
    • At a catastrophe, each trait dies with probability κ and Poiss(ν) traits are born.
    • λ/μ=ν/κ: the number of traits is constant on average.
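The balance condition can be checked with a short expectation argument. This sketch assumes the trait count sits at its stationary mean λ/μ just before the catastrophe; the symbols N⁻ and N⁺ (trait counts just before and just after a catastrophe) are notation introduced here, not taken from the slides.

```latex
\begin{align*}
\mathbb{E}[N^-] &= \frac{\lambda}{\mu}, \\
\mathbb{E}[N^+] &= (1-\kappa)\,\mathbb{E}[N^-] + \nu
                 = \frac{\lambda}{\mu} - \kappa\,\frac{\lambda}{\mu} + \nu, \\
\mathbb{E}[N^+] = \mathbb{E}[N^-]
  &\iff \nu = \kappa\,\frac{\lambda}{\mu}
   \iff \frac{\lambda}{\mu} = \frac{\nu}{\kappa}.
\end{align*}
```

The first line follows from the birth-death dynamics: the expected count satisfies dE[N]/dt = λ - μE[N], which is zero at E[N] = λ/μ.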
  • Description of the model (3)
    • Observation model: each data point (0s and 1s) is missing with probability ξ
    • Some traits are not observed and are therefore deleted from the data
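A minimal Python sketch of this observation step follows. It masks each 0/1 entry independently with probability ξ, then deletes unobserved traits. The slides do not spell out the rule for when a trait counts as observed; the rule used here (keep a trait only if at least one language shows an observed 1) is an assumption, as are the function name `observe` and the `"?"` marker.

```python
import random


def observe(matrix, xi, rng):
    """Mask entries with probability xi, then drop traits with no observed 1.

    matrix: list of rows (one per language) of 0/1 entries, one column per trait.
    Returns the masked matrix restricted to the traits that are kept.
    """
    # Each data point is independently replaced by '?' with probability xi.
    masked = [["?" if rng.random() < xi else cell for cell in row] for row in matrix]
    n_traits = len(matrix[0]) if matrix else 0
    # Assumed registration rule: a trait is kept if some language displays an observed 1.
    kept = [j for j in range(n_traits) if any(row[j] == 1 for row in masked)]
    return [[row[j] for j in kept] for row in masked]


# Hypothetical example: 3 languages, 4 traits, 20% of entries missing.
rng = random.Random(7)
full = [[1, 0, 0, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
print(observe(full, 0.2, rng))
```

Because deleted traits never reach the data set, the likelihood has to condition on a trait being registered, which is presumably what the registration slides address.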
  • Registration process
  • Posterior distribution
  • Likelihood calculations
  • Prior distribution on trees
    • Our main focus is on the root age
    • We would like the marginal prior on the root age to be (approximately) uniform over (say) 5000-15000 BP
  • MCMC moves
    • Random walk on the parameters
    • Various moves on the tree (Drummond et al., 2002)
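The parameter update can be sketched as a generic random-walk Metropolis-Hastings step. The version below walks on the log scale, a common choice for positive rates such as μ; whether the authors use this exact proposal is not stated in the slides, so treat it as an assumption. The function name, the step size and the toy Gamma target are hypothetical, and the tree moves are not covered.

```python
import math
import random


def log_scale_random_walk(log_target, start, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings on the log scale for a positive parameter.

    log_target: function returning the unnormalised log density at x > 0.
    Returns the list of sampled values (one per iteration).
    """
    samples = []
    current = start
    current_lp = log_target(current)
    for _ in range(n_steps):
        # Multiplicative proposal: a Gaussian step on log(x).
        proposal = current * math.exp(step_size * rng.gauss(0.0, 1.0))
        proposal_lp = log_target(proposal)
        # The Hastings correction for the multiplicative proposal is proposal / current.
        log_ratio = proposal_lp - current_lp + math.log(proposal / current)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


# Toy target standing in for a posterior on a rate: Gamma(shape=3, rate=1), mean 3.
rng = random.Random(42)
draws = log_scale_random_walk(lambda x: 2.0 * math.log(x) - x, 1.0, 5000, 0.8, rng)
print(sum(draws[500:]) / len(draws[500:]))
```

In a full sampler, this kind of update would alternate with the tree moves, each accepted or rejected against the same posterior.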
  • Checking mixing and convergence
    • Auto-correlations
    • Need statistics on the tree
    • Length of the tree
    • Root age
    • Presence/Absence of a few subtrees
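These checks can be sketched for any scalar statistic of the tree, such as the root age or the tree length recorded at each iteration. The Python sketch below computes the sample autocorrelation at a given lag and a simple effective sample size that truncates the sum at the first non-positive autocorrelation. The truncation rule and the function names are assumptions chosen for brevity, not the diagnostics used by the authors.

```python
def autocorrelation(trace, lag):
    """Sample autocorrelation of a scalar trace at the given lag (trace must not be constant)."""
    n = len(trace)
    mean = sum(trace) / n
    centred = [x - mean for x in trace]
    denom = sum(c * c for c in centred)
    num = sum(centred[i] * centred[i + lag] for i in range(n - lag))
    return num / denom


def effective_sample_size(trace):
    """Estimate ESS as n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(trace)
    tau = 1.0
    for lag in range(1, n // 2 + 1):
        rho = autocorrelation(trace, lag)
        if rho <= 0.0:  # simple truncation rule: stop at the first non-positive lag
            break
        tau += 2.0 * rho
    return n / tau


# Hypothetical root-age trace (years BP) that moves slowly, so it mixes poorly.
root_age_trace = [8000 + 10 * (i // 25) for i in range(200)]
print(autocorrelation(root_age_trace, 1), effective_sample_size(root_age_trace))
```

For the presence or absence of a subtree, the same functions apply to the 0/1 indicator trace, provided the indicator is not constant over the run.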
  • Synthetic data
    • True tree, ~40 words/language
    • Consensus tree
  • Synthetic data (2)
    • Death rate (μ)
  • Influence of borrowing
    • True tree, ~40 words/language
    • Borrowing: 10%
    • Consensus tree
  • Influence of borrowing (2)
    • True tree, ~40 words/language
    • Borrowing: 50%
    • Consensus tree
  • Influence of borrowing (3)
    • Topology is reconstructed correctly
    • Dates are underestimated for high levels of borrowing
    • Figure panels: root age and death rate (μ), with borrowing at 50%
  • Detecting borrowing
    • Confirmed: hardly any borrowing!
  • Data used
    • Indo-European languages
    • Core vocabulary (Swadesh 100 or 200)
    • Two independent data sets
    • Dyen et al. (1997): 87 languages, mostly modern
    • Ringe et al. (2002): 24 languages, mostly ancient
  • Constraints
  • Cross-validation
  • Root age
  • Conclusions
    • Strong support for the Anatolian hypothesis: root age around 8000 BP. No support for the Kurgan hypothesis.
    • Applicable to a variety of linguistic and cultural data sets
    • TraitLab: it's free!
  • Questions otázky spørgsmåler vragen questions Fragen domande pytania questões întrebări вопросы vprašanja preguntes preguntas frågor vrae spurningar quaestiones ερωτήσεις въпроси kesses spørsmåler kláusimai запитанні سوال प्रश्न cwestiwnau