Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Generating a bootstrap dendrogram with hierfstat's genet.dist
title author date
Generating a bootstrap dendrogram with hierfstat's genet.dist
Zhian N. Kamvar
2018-08-21

This short tutorial is an answer to the question posed on the poppr forum: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/poppr/vEQ8vb2oObQ/RjTsndaCEQAJ

My ultimate goal when asking the previous question was to employ aboot in order to create a bootstraped dendrogram based on distances offered by the genet.dist function in hierfstat. That's why I was trying to create a tree based on Nei's distance this way - to verify that it worked! Unfortunately even after reading the aboot documentation, I still fail to understand how to do this. I would be grateful if someone could provide an example of using aboot in combination with genet.dist.

Unfortunately, aboot is not the best tool for this job. Below, I show how to create a dendrogram with boot.phylo and some weird R magic.

library("poppr")
#> Loading required package: adegenet
#> Loading required package: ade4
#> 
#>    /// adegenet 2.1.1 is loaded ////////////
#> 
#>    > overview: '?adegenet'
#>    > tutorials/doc/questions: 'adegenetWeb()' 
#>    > bug reports/feature requests: adegenetIssues()
#> This is poppr version 2.8.0. To get started, type package?poppr
#> OMP parallel support: available
library("hierfstat")
#> 
#> Attaching package: 'hierfstat'
#> The following object is masked from 'package:adegenet':
#> 
#>     read.fstat
library("ape")
#> 
#> Attaching package: 'ape'
#> The following objects are masked from 'package:hierfstat':
#> 
#>     pcoa, varcomp
data("nancycats", package = "adegenet")
n <- genind2hierfstat(nancycats)
head(n)
#>   pop   fca8  fca23  fca43  fca45  fca77  fca78  fca90  fca96  fca37
#> 1 P01     NA 136146 139139 116120 156156 142148 199199 113113 208208
#> 2 P01     NA 146146 139145 120126 156156 142148 185199 113113 208208
#> 3 P01 135143 136146 141141 116116 152156 142142 197197 113113 210210
#> 4 P01 133135 138138 139141 116126 150150 142148 199199  91105 208208
#> 5 P01 133135 140146 141145 126126 152152 142148 193199 113113 208208
#> 6 P01 135143 136146 145149 120126 150156 148148 193195  91113 208208

We can create an initial tree by passing the distance to the Neighbor-Joining function:

plot(nj(genet.dist(n)))

We can also create a function that does this for us and adds tip labels from the population factor

genet.tree <- function(x) {
  res <- nj(genet.dist(x))
  res$tip.label <- levels(x[[1]])
  res
}
plot(genet.tree(n))

boot.phylo will randomly sample the columns of our data frame with replacement and use a tree generating function to generate a new tree from the resampled data. The problem we encounter is that we need to fix the first column in place because genet.dist() needs this in order to calculate the population distance.

To address this, we can create a function-generating function that will hold our population factor and combine it with the incoming data (for details on function-generating functions, see https://adv-r.hadley.nz/function-factories.html)

fungen <- function(pop_column) {
  force(pop_column)
  function(x) {
    dat <- cbind(pop_column, x)
    res <- nj(genet.dist(dat))
    res$tip.label <- levels(dat[[1]])
    res
  }
}

If we compare these functions, they are identical

genet.tree2 <- fungen(n[1])
identical(genet.tree(n), genet.tree2(n[-1]))
#> [1] TRUE

From here, we can pass this function to the boot.phylo function to generate our node labels

catTree <- genet.tree2(n[-1])
catTree$node.label <- boot.phylo(phy = catTree, x = n[-1], FUN = genet.tree2)
#> 
Running bootstraps:       100 / 100
#> Calculating bootstrap values... done.
plot(catTree, show.node.lab = TRUE)
add.scale.bar()

Created on 2018-08-21 by the reprex package (v0.2.0).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment