Phylogenetics in R

Talk given on 18 Nov, 2011 on doing phylogenetics in R.

    • Phylogenetics in R, Scott Chamberlain, November 18, 2011
    • What sorts of phylogenetics things can I do in R?
    • The run down
      • Get sequence data
      • Align sequence data
      • Phylogenetic inference
        • NJ, maxlik, parsimony, Bayesian, UPGMA
      • Visualize phylogenies
      • Traits on trees
        • Phylogenetic signal
        • Trait evolution
        • Ancestral character state reconstruction
      • Tree simulations
      • Get trees
      • Phylogenetic community structure
      • Bonus stuff: polytomy resolver
    • Basic trees in R
      • Example
      • require(ape)
      • tr1 <- read.tree(text = "(((B:0.05,C:0.05):0.01,D:0.06):0.04,A:0.1);")
      • tr1 # print tree summary
      • write.tree(tr1) # print tree in Newick format: "(((B:0.05,C:0.05):0.01,D:0.06):0.04,A:0.1);"
      • tr1$tip.label # tip labels: "B" "C" "D" "A"
      • tr1$edge.length # edge lengths: 0.04 0.01 0.05 0.05 0.06 0.10
      • tr1$node.label # node labels: NULL (meaning no node labels)
      • # Assign properties to trees
      • tr1$tip.label <- c('sleepy','happy','grumpy','frumpy') # label tips
      • tr1$tip.label # did it work? "sleepy" "happy" "grumpy" "frumpy"
      • And so on for other tree properties
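The other tree properties mentioned above can be inspected the same way. A minimal sketch, assuming only that ape is installed; it rebuilds the tree from the slide's Newick string so it runs on its own:

```r
library(ape)

# The same Newick string used on the slide
tr1 <- read.tree(text = "(((B:0.05,C:0.05):0.01,D:0.06):0.04,A:0.1);")

Ntip(tr1)            # number of tips
Nnode(tr1)           # number of internal nodes
is.rooted(tr1)       # is the tree rooted?
is.ultrametric(tr1)  # are all tips the same distance from the root?
tr1$edge             # edge matrix: one row per branch (parent node, child node)
cophenetic(tr1)      # pairwise tip-to-tip distances along the tree
```

The rows of tr1$edge are in the same order as tr1$edge.length, which is what makes it possible to attach colors or labels to individual branches later.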
    • Get sequence data
      • # install and load ape
      • install.packages("ape"); require(ape)
      • # get data from GenBank
      • # make vector of accession numbers, for ITS 1 and 2 region for Gossypium (cotton) species
      • cotton_acc <- c("U56806", "U12712", "U56810",
      • "U12732", "U12725", "U56786", "U12715",
      • "AF057758", "U56790", "U12716", "U12729",
      • "U56798", "U12727", "U12713", "U12719",
      • "U56811", "U12728", "U12730", "U12731",
      • "U12722", "U56796", "U12714", "U56789",
      • "U56797", "U56801", "U56802", "U12718",
      • "U12710", "U56804", "U12734", "U56809",
      • "U56812", "AF057753", "U12711", "U12717",
      • "U12723", "U12726")
      • # get data from GenBank
      • require(ape)
      • cotton <- read.GenBank(cotton_acc, species.names = TRUE)
      • # name the sequences with species names instead of accession numbers
      • names_accs <- data.frame(species = attr(cotton, "species"), accs = names(cotton))
      • names(cotton) <- attr(cotton, "species")
    • Align sequence data: run external tools (Clustal, MAFFT)
      • # multiple sequence alignment
      • ### Get clustalw here, and install: http://www.clustal.org/
      • # set to your working directory
      • setwd("/path on your computer to/ClustalW2")
      • # write fasta file to directory
      • write.dna(cotton, "cotton.fas", format = "fasta")
      • # run clustal multiple alignment, prints clustal output to console
      • system('"./clustalw2" cotton.fas') # should work on OSX or Windows
      • # read the alignment back in to R
      • cotton_clustalaligned <- read.dna("cotton.aln", format = "clustal")
      • Manual alignment may have to be done, dare I say it, not in R
    • Get and align sequences DIY
      • Get together with a few other people…or not
        • Choose some species to investigate
        • Get their accession numbers on GenBank
        • Download sequence data from Genbank
        • If you are really adventurous, also align sequences
    • Phylogenetic inference Tools
      • R Packages: ape, phangorn, phyclust, phytools, scaleboot
      • ape has the most functionality for phylogenetic inference
      • You should be able to call MrBayes from R, but I don’t know how; perhaps with the package phyloch?
    • Phylogenetic inference
      • Fitting evolutionary models: see function modelTest in package phangorn
      • NJ
        • install.packages("ape"); require(ape)
        • data(woodmouse)
        • trw <- nj(dist.dna(woodmouse))
        • plot(trw)
      • Maximum likelihood
        • install.packages("phangorn"); require(phangorn)
        • data(Laurasiatherian)
        • dm <- dist.logDet(Laurasiatherian)
        • njtree <- NJ(dm)
        • MLfit <- pml(njtree, Laurasiatherian) # likelihood of the data on the NJ starting tree
        • MLfit_ <- optim.pml(MLfit, model = "GTR") # optimize edge lengths under a GTR model
        • MLfit_$tree
        • plot(MLfit_$tree)
      • Parsimony
        • install.packages("phangorn"); require(phangorn)
        • data(Laurasiatherian)
        • dm <- dist.logDet(Laurasiatherian)
        • tree <- NJ(dm)
        • treepars <- optim.parsimony(tree, Laurasiatherian)
    • Phylogenetic inference---Continued
      • Bayesian
        • You can do this (maybe) with the package phyloch (get it here: http://www.christophheibl.de/Rpackages.html) by calling MrBayes from R…
        • … however, MrBayes is giving way to RevBayes (here: http://sourceforge.net/projects/revbayes/), FYI
    • Phylogenetic inference DIY
      • With your partners…or not
        • Use the sequence data from GenBank you got earlier
        • (if you didn’t align the sequences, don’t worry about it – OR use data set provided with ape or other package)
        • Do some phylogenetic inference a couple of different ways (e.g., NJ and parsimony)
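One way to approach this exercise, sketched with the woodmouse data set that ships with ape rather than your own GenBank sequences. It assumes ape and phangorn are installed; the choice of UPGMA, the parsimony scores, and the Robinson-Foulds distance as a comparison are illustrative additions, not part of the original slides:

```r
library(ape)
library(phangorn)

data(woodmouse)                      # aligned DNA sequences shipped with ape
d <- dist.dna(woodmouse)             # pairwise genetic distances

tr_nj    <- nj(d)                    # neighbor joining (unrooted)
tr_upgma <- upgma(d)                 # UPGMA (rooted, ultrametric)

wm <- as.phyDat(woodmouse)           # convert to phangorn's data format
tr_pars <- optim.parsimony(tr_nj, wm, trace = 0)  # parsimony search starting from the NJ tree

parsimony(tr_nj, wm)                 # parsimony score of the NJ tree
parsimony(tr_pars, wm)               # parsimony score after optimization
RF.dist(tr_nj, tr_upgma)             # Robinson-Foulds distance between the two topologies
```

A Robinson-Foulds distance of 0 would mean the two methods recovered the same topology; larger values mean more splits differ.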
    • Visualize phylogenies
      • R Packages: ape, ade4, phytools, phylobase, ouch, paleoPhylo
      • # visualize phylogenies
      • install.packages("ape")
      • require(ape)
      • tree <- rcoal(10)
      • tree
      • plot(tree)
      • plot(tree, type = "cladogram")
      • plot(tree, type = "unrooted")
      • plot(tree, type = "radial")
      • plot(tree, type = "fan")
    • Visualize phylogenies DIY
      • Get together with a few other people…or not
        • Use the tree you made, or use one provided with ape, or other packages
        • Do basic plotting, e.g.: plot(mytree)
        • Then see if you can
          • color the branches,
          • label the branches with the edge lengths
          • change the tip labels
          • etc.
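A possible starting point for the plotting exercise, using only ape. The tip names ("sp1", "sp2", ...) and the red/black color scheme are made up for illustration:

```r
library(ape)

set.seed(1)
tree <- rcoal(6)                                      # random 6-tip tree

# change the tip labels (hypothetical names)
tree$tip.label <- paste0("sp", seq_len(Ntip(tree)))

# color the branches: terminal branches red, internal branches black.
# Each row of tree$edge is one branch; column 2 holds the child node,
# and child nodes numbered 1..Ntip are tips.
edge_cols <- ifelse(tree$edge[, 2] <= Ntip(tree), "red", "black")

plot(tree, edge.color = edge_cols, edge.width = 2)

# label the branches with their (rounded) edge lengths
edgelabels(round(tree$edge.length, 2), frame = "none",
           adj = c(0.5, -0.3), cex = 0.7)
```

Because edge.color and edgelabels both follow the row order of tree$edge, any per-branch vector built from that matrix lines up with the right branches.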
    • Traits on trees phylogenetic signal
      • R Packages: ape, picante, caper, phytools
      • Examples from picante and phytools:
      • # phylogenetic signal
      • install.packages("picante")
      • require(picante)
      • randtree <- rcoal(20)
      • randtraits <- rTraitCont(randtree)
      • Kcalc(randtraits[randtree$tip.label], randtree)
      • install.packages("phytools")
      • require(phytools)
      • tree <- rbdtree(1, 0, Tmax = 4) # make a tree
      • x <- fastBM(tree) # simulate traits
      • phylosig(tree, x, method = "lambda", test = TRUE) # calculate phylogenetic signal, lambda
      • phylosig(tree, x, method = "K", test = TRUE) # calculate phylogenetic signal, K
    • Traits on trees modeling trait evolution
      • R Packages: ape, picante, caper, geiger, PHYLOGR, phytools, ade4, motmot
      • The packages above can model the evolution of discrete and continuous traits, under Brownian motion or Ornstein-Uhlenbeck (OU) models
      • See also:
      • Rbrownie
      • Various evolutionary modeling frameworks in development, to be included in geiger soon: auteur, mecca, medusa, and fossilmedusa
      • here: http://www.webpages.uidaho.edu/~lukeh/software/index.html
    • Ancestral state reconstruction
      • R Packages: ape, ouch, phytools
      • Function ‘ace’ in the ape package works nicely
      • But very sensitive to parameters
      • Example
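      • require(ape)
      • data(bird.orders)
      • x <- rnorm(23) # one random trait value per tip (bird.orders has 23 tips)
      • out <- ace(x, bird.orders) # values in x are assumed to be in tip-label order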
      • out$ace will have the ancestral character values (which you’ll have to match to nodes of your tree)
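Matching the estimates to nodes can be sketched as follows. This relies on ape's convention that internal nodes are numbered from Ntip + 1 to Ntip + Nnode and on the assumption that out$ace is returned in that node order; the data frame anc is just an illustrative helper:

```r
library(ape)

data(bird.orders)
set.seed(42)
x <- rnorm(Ntip(bird.orders))            # random trait, one value per tip
names(x) <- bird.orders$tip.label        # name the values so they match the tips explicitly

out <- ace(x, bird.orders)               # ancestral state estimates for a continuous trait

# internal nodes are numbered Ntip + 1, ..., Ntip + Nnode
node_ids <- Ntip(bird.orders) + seq_len(Nnode(bird.orders))
anc <- data.frame(node = node_ids, estimate = unname(out$ace))
head(anc)

# show the estimates on the tree
plot(bird.orders, cex = 0.7)
nodelabels(round(out$ace, 2), node = node_ids, frame = "none", cex = 0.6)
```

Calling nodelabels() with no arguments on a plotted tree prints the node numbers themselves, which is a quick way to check the correspondence.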
    • Tree simulations
      • R Packages: TreeSim, geiger, ape, phybase
      • Example
      • require(ape)
      • tree <- rcoal(10) # Make a random tree
      • trait <- rTraitCont(tree, model = "BM") # Simulate a trait on that tree
      • # Write a function to make a tree, simulate a BM trait, and take the mean of that trait
      • myfunc <- function(n) {
      • tree <- rcoal(n)
      • trait <- rTraitCont(tree, model = "BM")
      • mean(trait)
      • }
      • # do it 100 times and make a data.frame required for ggplot2 plotting
      • dat <- replicate(100, myfunc(10))
      • dat2 <- data.frame(dat)
      • # plot results
      • require(ggplot2)
      • ggplot(dat2, aes(dat)) + geom_histogram()
    • Get trees
      • rOpenSci’s treebase package
      • on CRAN: http://cran.r-project.org/web/packages/treebase/
      • install.packages("treebase") # install
      • require(treebase) # load
      • tree <- search_treebase("Derryberry", "author")[[1]] # search
      • metadata(tree$S.id) # metadata for tree
      • plot(tree) # plot the tree
    • Phylogenetic community structure
      • R Packages: picante (includes phylocom functionality)
      • Although not bladj, for some reason; talk to me if you want to run bladj from R
      • Example
      • Function ‘comdistnt’ calculates inter-community mean nearest taxon distance
      • data(phylocom)
      • comdistnt(phylocom$sample, cophenetic(phylocom$phylo), abundance.weighted=FALSE)
      • Also, a new approach to phylogenetic community structure in R from Matt Helmus, code here:
      • http://r-ecology.blogspot.com/2011/10/phylogenetic-community-structure-pglmms.html
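Other picante metrics follow the same pattern as comdistnt above: a community matrix plus either the tree or a distance matrix derived from it. A minimal sketch, assuming picante is installed; the choice of pd, mpd, and mntd is illustrative:

```r
library(picante)

data(phylocom)
comm <- phylocom$sample                 # community matrix: rows = communities, columns = taxa
phy  <- phylocom$phylo                  # phylogeny containing those taxa
dis  <- cophenetic(phy)                 # pairwise phylogenetic distances between tips

pd_res   <- pd(comm, phy)               # Faith's phylogenetic diversity and species richness
mpd_res  <- mpd(comm, dis)              # mean pairwise distance within each community
mntd_res <- mntd(comm, dis)             # mean nearest taxon distance within each community

cbind(pd_res, MPD = mpd_res, MNTD = mntd_res)
```

These are within-community measures, whereas comdistnt compares pairs of communities.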
    • Bonus: Polytomy resolver
      • MEE paper: “A simple polytomy resolver for dated phylogenies”
      • by Kuhn, Mooers, and Thomas
        • Paper
        • http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00103.x/abstract
        • Supp info has R scripts: http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00103.x/suppinfo
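For comparison, ape has a much simpler built-in way to resolve polytomies. The sketch below uses ape's multi2di, which resolves them at random by inserting zero-length branches; this is not the method from the Kuhn, Mooers, and Thomas paper, whose scripts are in the supplementary info linked above. The four-taxon tree is a made-up example:

```r
library(ape)

# a small tree with a three-way polytomy (A, B, C)
tr <- read.tree(text = "((A:1,B:1,C:1):1,D:2);")
is.binary(tr)          # FALSE: the tree contains a polytomy

set.seed(1)
tr2 <- multi2di(tr)    # randomly resolve the polytomy with zero-length branches
is.binary(tr2)         # TRUE: the tree is now fully bifurcating

tr2$edge.length        # note the zero-length branch that was added
```

Because the added branches have zero length, the resolved tree carries no new information about divergence times.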
    • Resources
      • Bodega Phylogenetics Wiki:
        • Home: http://bodegaphylo.wikispot.org/Front_Page
        • BROWNIE tutorial: http://bodegaphylo.wikispot.org/Morphological_Diversification_and_Rates_of_Evolution
        • Phylogenetic signal tutorial: http://bodegaphylo.wikispot.org/IV._Testing_Phylogenetic_Signal_in_R
      • R phylo-wiki (from NESCent):
      • http://www.r-phylo.org/wiki/HowTo/Table_of_Contents
      • CRAN task view, Phylogenetics:
      • http://cran.r-project.org/web/views/Phylogenetics.html
      • rmesquite: https://r-forge.r-project.org/R/?group_id=213
      • R-sig-phylo mailing list:
        • https://stat.ethz.ch/mailman/options/r-sig-phylo/