Bootstrapped tree values differ from PAST

Updated: 2024-10-25 06:32:41
Problem description

When I compute a bootstrapped tree in R, I get different values from those I get with PAST (folk.uio.no/ohammer/past/). How can I get the output of the two programs to match?

Here's what I'm doing in R (data below):

    library("ape")
    library("phytools")
    library("phangorn")
    library("cluster")

    # compute neighbour-joined tree
    f <- function(xx) nj(daisy(xx))
    nj_tree <- f(tab)
    nj_tree_root <- root(nj_tree, 1, r = TRUE)

    ## bootstrap
    # bootstrap values do not match PAST output - why is that?
    nj_tree_root_boot <- boot.phylo(nj_tree, FUN = f, tab, rooted = TRUE)

    # Are bootstrap values stable?
    for (i in 1:10) {
      print(boot.phylo(nj_tree, FUN = f, tab, rooted = TRUE, quiet = TRUE))
    }
    # yes, they seem ok

    # plot tree with bootstrap values
    plot(nj_tree_root, use.edge.length = FALSE)
    nodelabels(nj_tree_root_boot, adj = c(1.2, 1.2), frame = "none")

Typical output for the bootstrap is [1] 100 6 39 27 23 57 53 75 71, and here's the plot (the value at the far left should be 100; it was cropped somehow):

I transform the data to send it to PAST like so:

    tab1 <- t(apply(tab, 1, as.numeric))
    write.table(tab1, "tab.txt")
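The conversion above works because apply() first coerces the data frame to a character matrix, so as.numeric() sees the strings "0" and "1" rather than the internal factor codes. A minimal sketch of that behaviour, assuming a small made-up data frame (`toy` is a hypothetical stand-in, not the `tab` data):

```r
# Toy data frame with factor columns coded "0"/"1" (hypothetical stand-in for tab)
toy <- data.frame(
  X1 = factor(c("1", "0", "1"), levels = c("0", "1")),
  X2 = factor(c("0", "0", "1"), levels = c("0", "1")),
  row.names = c("a", "b", "c")
)

# Calling as.numeric() directly on a factor returns the level codes (1 or 2)
codes <- as.numeric(toy$X1)

# apply() goes through as.matrix(), which yields a character matrix here,
# so as.numeric() recovers the original 0/1 values row by row
toy_num <- t(apply(toy, 1, as.numeric))

print(codes)
print(toy_num)
```

Printing both results before exporting is a cheap way to confirm that the file sent to PAST holds 0/1 values and not 1/2 codes.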

In PAST I open the tab.txt file, then run Multivariate -> Cluster -> Neighbour Joining with Euclidean distance and 100 bootstrap replications, using an outgroup. From PAST I get this plot:

And the values are very different. What do I need to do in R to make the output match that from PAST? Is PAST wrong?
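One difference between the two workflows that may be worth ruling out: daisy() applied to factor columns falls back to Gower's coefficient (for two-level factors, the proportion of mismatching columns), whereas PAST was asked for Euclidean distance. The sketch below, in base R on a hypothetical 0/1 matrix `m`, shows how the two measures relate. Whether this contributes to the discrepancy described here is an assumption, not something the thread establishes.

```r
# Hypothetical 0/1 matrix: 3 taxa (rows), 4 binary characters (columns)
m <- rbind(a = c(1, 0, 1, 0),
           b = c(1, 1, 0, 0),
           c = c(0, 1, 0, 1))

# Euclidean distance, the measure selected in PAST
d_euc <- as.matrix(dist(m, method = "euclidean"))

# Simple-matching dissimilarity: proportion of characters that differ.
# This is what Gower's coefficient reduces to for two-level nominal columns.
prop_mismatch <- function(i, j) sum(m[i, ] != m[j, ]) / ncol(m)
idx <- seq_len(nrow(m))
d_match <- outer(idx, idx, Vectorize(prop_mismatch))
dimnames(d_match) <- dimnames(d_euc)

# For 0/1 data the two are linked by d_euc = sqrt(ncol(m) * d_match)
print(d_euc)
print(d_match)
print(all.equal(d_euc, sqrt(ncol(m) * d_match)))
```

Because the link is a square root rather than a rescaling, neighbour joining on the two matrices can, in general, produce different trees, so comparing like with like means using the same distance in both programs.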

Data:

tab <- structure(list(X1 = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor"), X3 = structure(c(1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L), .Label = c("0", "1"), class = "factor"), X4 = structure(c(2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 2L), .Label = c("0", "1"), class = "factor"), X5 = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor"), X6 = structure(c(1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X7 = structure(c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor"), X8 = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X9 = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L), .Label = c("0", "1"), class = "factor"), X10 = structure(c(1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X11 = structure(c(1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L), .Label = c("0", "1"), class = "factor"), X12 = structure(c(2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), X13 = structure(c(2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), X14 = structure(c(2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), X15 = structure(c(2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L), .Label = c("0", "1"), class = "factor"), X16 = structure(c(2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L), .Label = c("0", "1"), class = "factor"), X17 = structure(c(2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L), .Label = c("0", "1"), class = "factor"), X18 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L), .Label = c("0", "1"), class = "factor"), X19 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X20 = 
structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X21 = structure(c(1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), X22 = structure(c(2L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X23 = structure(c(1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X24 = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 2L, 2L), .Label = c("0", "1"), class = "factor"), X25 = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor"), X26 = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor")), .Names = c("X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8", "X9", "X10", "X11", "X12", "X13", "X14", "X15", "X16", "X17", "X18", "X19", "X20", "X21", "X22", "X23", "X24", "X25", "X26"), row.names = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k"), class = "data.frame")

Recommended answer

After much searching around, it turns out the answer is in the ape package FAQ, Q14:

I have done a bootstrap analysis with boot.phylo but some bootstrap values seem to be in the wrong place after rooting the tree. This is because the bootstrap values are counted as the frequencies of clades, and not as actual bipartitions. So these values are really associated with the nodes, not with the edges. A consequence is that some of the bootstrap values are likely to lose their meaning after (re)rooting the tree, since this will affect the definition of the clades in the tree. A simple solution is to include the rooting process in the definition of the function FUN that is given as an argument to boot.phylo. Obviously, the estimated tree must also be rooted in the same way before doing the bootstrap. In this situation, it is more convenient to define FUN beforehand. An example code would be:

    outgroup <- 1 # may be several tips, numeric or tip labels
    foo <- function(xx) root(nj(dist.dna(xx)), outgroup)
    tr <- foo(X) # X is the matrix of DNA sequences
    bp <- boot.phylo(tr, X, foo)
    plot(tr)
    nodelabels(bp) # will have "100" at the root
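To make the clade-versus-bipartition point in the FAQ concrete, here is a small sketch using ape's read.tree(), root() and prop.part() on a made-up five-tip tree. The helper name `clade_labels` and the tree itself are assumptions for illustration only.

```r
library(ape)

# Helper (hypothetical name): list the clades of a tree as sorted label strings
clade_labels <- function(phy) {
  pp <- prop.part(phy)            # one integer vector of tip indices per clade
  labs <- attr(pp, "labels")      # tip labels those indices refer to
  sapply(unclass(pp), function(idx) paste(sort(labs[idx]), collapse = ","))
}

# Made-up unrooted tree, then the same tree rerooted on tip "c"
tr <- read.tree(text = "((a,b),(c,d),e);")
tr_rerooted <- root(tr, outgroup = "c", resolve.root = TRUE)

clades_before <- clade_labels(tr)
clades_after  <- clade_labels(tr_rerooted)

print(clades_before)
print(clades_after)
```

The split of {c,d} from {a,b,e} is present in both trees, but as a clade it shows up as "c,d" before rerooting and as "a,b,e" afterwards. Bootstrap counts collected for one set of clades therefore do not line up with the nodes of the other tree, which is why the rooting has to happen inside FUN.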

In the specific case of my question:

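    # root inside FUN, as the FAQ suggests, and root the estimated tree the same way
    f <- function(xx) root(nj(daisy(xx)), 1, resolve.root = TRUE)
    nj_tree_root <- f(tab)
    nj_tree_root_boot <- boot.phylo(nj_tree_root, FUN = f, tab, rooted = TRUE)
    plot(nj_tree_root, use.edge.length = FALSE)
    nodelabels(nj_tree_root_boot, adj = c(1.2, 1.2), frame = "none")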

Which matches the PAST output quite well.

Published: 2023-11-24 05:29:30
Link: https://www.elefans.com/category/jswz/34/1624114.html