Extract slope of multiple trend lines from geom_smooth()

Updated: 2024-10-28 12:27:38

I am trying to plot multiple trend lines (every ten years) in a time series using ggplot.

Here's the data:

dat <- structure(list(YY = 1961:2010, a = c(98L, 76L, 83L, 89L, 120L, 107L, 83L, 83L, 92L, 104L, 98L, 91L, 81L, 69L, 86L, 76L, 85L, 86L, 70L, 81L, 77L, 89L, 60L, 80L, 94L, 66L, 77L, 85L, 77L, 80L, 79L, 79L, 65L, 70L, 80L, 87L, 84L, 67L, 106L, 129L, 95L, 79L, 67L, 105L, 118L, 85L, 86L, 103L, 97L, 106L)), .Names = c("YY", "a"), row.names = c(NA, -50L), class = "data.frame")

Here's the script:

p <- ggplot(dat, aes(x = YY))
p <- p + geom_line(aes(y=a),colour="blue",lwd=1)
p <- p + geom_point(aes(y=a),colour="blue",size=2)
p <- p + theme(panel.background=element_rect(fill="white"),
               plot.margin = unit(c(0.5,0.5,0.5,0.5),"cm"),
               panel.border=element_rect(colour="black",fill=NA,size=1),
               axis.line.x=element_line(colour="black"),
               axis.line.y=element_line(colour="black"),
               axis.text=element_text(size=15,colour="black",family="serif"),
               axis.title=element_text(size=15,colour="black",family="serif"),
               legend.position = "top")
p <- p + scale_x_discrete(limits = c(seq(1961,2010,5)),expand=c(0,0))
p <- p + geom_smooth(data=dat[1:10,],aes(x=YY,y=a),method="lm",se=FALSE,color="black",formula=y~x,linetype="dashed")
p <- p + geom_smooth(data=dat[11:20,],aes(x=YY,y=a),method="lm",se=FALSE,color="black",formula=y~x,linetype="dashed")
p <- p + geom_smooth(data=dat[21:30,],aes(x=YY,y=a),method="lm",se=FALSE,color="black",formula=y~x,linetype="dashed")
p <- p + geom_smooth(data=dat[31:40,],aes(x=YY,y=a),method="lm",se=FALSE,color="black",formula=y~x,linetype="dashed")
p <- p + geom_smooth(data=dat[41:50,],aes(x=YY,y=a),method="lm",se=FALSE,color="black",formula=y~x,linetype="dashed")
p <- p + labs(x="Year",y="Number of Days")
outImg <- paste0("test",".png")
ggsave(outImg,p,width=8,height=5)

This is the resulting image:

WHAT I WANT/PROBLEMS

I want to extract the slopes and add them to the trend lines in the figure. How can I extract the slope of each line from geom_smooth()?

Currently, I am plotting the trend lines one by one. I want to know if there is an efficient way of doing this with an adjustable time window. Suppose, for example, I want to plot the trend lines for every 5 years. In the figure above, the time window is 10.

Suppose I only want to plot the significant trend lines (i.e., p-value < 0.05; null hypothesis: no trend, or slope equals 0). Is it possible to implement this with geom_smooth()?

I'll appreciate any help.

Best answer

So, each of these tasks is best handled before you pipe your data into ggplot2, but they are all made fairly easy using some of the other packages from the tidyverse.

Beginning with questions 1 and 2:

While ggplot2 can plot the regression line, to extract the estimated slope coefficients you need to work with the lm() object explicitly. Using group_by() and mutate(), you can add a grouping variable (my code below does this for 5-year groups, just as an example) and then calculate the slope estimate for each group and store it in a column of your existing data frame. Those slope estimates can then be plotted in ggplot using a geom_text() call. I've done this below in a quick and dirty way (placing each label at the mean of the x and y values used in its regression), but you can specify the exact placement in your data frame.

Grouping variables and data prep make question 2 a breeze too: now that you have the grouping variable explicitly in your data frame, there is no need to plot the lines one by one, because geom_smooth() accepts the group aesthetic.
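For the adjustable time window asked about in the question, one possible approach is to compute the grouping variable from the year instead of hard-coding it. The sketch below uses base R only; add_window_group() and its arguments are hypothetical names introduced for illustration, and it assumes the years are consecutive integers.

```r
# Hypothetical helper: assign each row to a consecutive window of `window` years.
# Assumes `year_col` holds integer years; windows are counted from the first year.
add_window_group <- function(df, year_col = "YY", window = 5) {
  years <- df[[year_col]]
  df$group <- (years - min(years)) %/% window + 1
  df
}

# Stand-in data frame with the same years as the question's data
years_df <- data.frame(YY = 1961:2010)

g5  <- add_window_group(years_df, window = 5)
g10 <- add_window_group(years_df, window = 10)

table(g5$group)   # number of years that fall in each 5-year window
table(g10$group)  # number of years that fall in each 10-year window
```

The resulting group column can then be mapped to the group aesthetic of geom_smooth(), so changing the window only means changing one argument.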

Additionally, to answer question 3, you can extract the p-value from the summary of your lm objects and keep only the groups that are significant at the level you care about. If you pass this now complete data frame to geom_smooth() and geom_text(), you will get the plot you're looking for!
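As a rough illustration of the slope and p-value extraction described above, here is a base R sketch that builds one row per window. It uses the data from the question; fit_one() and the column names slope and p_value are hypothetical, the 10-year window matches the question's figure, and the 0.05 threshold is the one the question mentions.

```r
# Data from the question (a stored as plain numerics)
dat <- data.frame(
  YY = 1961:2010,
  a = c(98, 76, 83, 89, 120, 107, 83, 83, 92, 104,
        98, 91, 81, 69, 86, 76, 85, 86, 70, 81,
        77, 89, 60, 80, 94, 66, 77, 85, 77, 80,
        79, 79, 65, 70, 80, 87, 84, 67, 106, 129,
        95, 79, 67, 105, 118, 85, 86, 103, 97, 106)
)

# 10-year windows counted from the first year
window <- 10
dat$group <- (dat$YY - min(dat$YY)) %/% window + 1

# Hypothetical helper: fit a ~ YY for one window and return slope and p-value
fit_one <- function(d) {
  coefs <- summary(lm(a ~ YY, data = d))$coefficients
  data.frame(
    group   = d$group[1],
    slope   = coefs["YY", "Estimate"],
    p_value = coefs["YY", "Pr(>|t|)"]
  )
}

# One row per window
fits <- do.call(rbind, lapply(split(dat, dat$group), fit_one))

# Keep only the windows whose slope differs from 0 at the 5% level
significant <- fits[fits$p_value < 0.05, ]
fits
```

Note that the model is a ~ YY (value regressed on year), which is the same relationship geom_smooth() fits with formula = y ~ x when x is YY and y is a.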

library(tidyverse)

# set up our base plot
p <- ggplot(dat, aes(x = YY, y = a)) +
  geom_line(colour = "blue", lwd = 1) +
  geom_point(colour = "blue", size = 2) +
  theme(
    panel.background = element_rect(fill = "white"),
    plot.margin = unit(c(0.5, 0.5, 0.5, 0.5), "cm"),
    panel.border = element_rect(colour = "black", fill = NA, size = 1),
    axis.line.x = element_line(colour = "black"),
    axis.line.y = element_line(colour = "black"),
    axis.text = element_text(size = 15, colour = "black", family = "serif"),
    axis.title = element_text(size = 15, colour = "black", family = "serif"),
    legend.position = "top"
  ) +
  # YY is numeric, so use a continuous scale with a break every 5 years
  scale_x_continuous(breaks = seq(1961, 2010, 5), expand = c(0, 0))

# add a grouping variable (or many!)
prep5 <- dat %>%
  mutate(group5 = rep(1:10, each = 5)) %>%
  group_by(group5) %>%
  mutate(
    # regress a on YY so the slope matches the line geom_smooth() draws (y ~ x)
    slope = round(coef(lm(a ~ YY))[["YY"]], 2),
    significance = summary(lm(a ~ YY))$coefficients["YY", "Pr(>|t|)"],
    x = mean(YY),   # x coordinate for slope label
    y = mean(a)     # y coordinate for slope label
  ) %>%
  filter(significance < .2)   # only keep groups with a p-value < .2 (use .05 for the question's threshold)

p + geom_smooth(
  data = prep5, aes(x = YY, y = a, group = group5),  # grouping variable does the plots for us!
  method = "lm", se = FALSE, color = "black",
  formula = y ~ x, linetype = "dashed"
) +
  geom_text(
    # one row per group, so each label is drawn only once
    data = distinct(prep5, group5, slope, x, y),
    aes(x = x, y = y, label = slope),
    nudge_y = 12, nudge_x = -1
  )
 

Now, you may want to be a little more careful about specifying the location of your text labels than I have been here. I used the group means and the nudge_* arguments of geom_text() for a quick example, but keep in mind that since these values are mapped explicitly to x and y coordinates, you have complete control!
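One alternative placement, sketched here under assumptions, is to put each label at the right-hand end of its fitted line by predicting from the model at the last year of the window. The example uses only the last ten rows of the question's data; label_df and its column names are hypothetical.

```r
# Last ten rows of the question's data (2001-2010), used as a small example
decade <- data.frame(
  YY = 2001:2010,
  a  = c(95, 79, 67, 105, 118, 85, 86, 103, 97, 106)
)

fit <- lm(a ~ YY, data = decade)

# Label position: the fitted value at the last year of the window
label_x <- max(decade$YY)
label_y <- unname(predict(fit, newdata = data.frame(YY = label_x)))

label_df <- data.frame(
  x = label_x,
  y = label_y,
  label = sprintf("slope = %.2f", coef(fit)[["YY"]])
)
label_df
```

A data frame like this could be passed to geom_text(data = label_df, aes(x = x, y = y, label = label), hjust = 1, vjust = -0.5) so that the text sits just above the end of the dashed line; the hjust and vjust values are only a starting point.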

Created on 2018-07-16 by the reprex package (v0.2.0).

Published: 2023-07-26 00:08:00
Link: https://www.elefans.com/category/jswz/34/1268279.html