Format in R for point prediction of survival analysis

Updated: 2024-10-28 05:27:54

I am befuddled by the format needed to perform a simple prediction using R's survival package.

library(survival)
lung.surv <- survfit(Surv(time,status) ~ 1, data = lung)

So fitting a simple exponential regression (for example purposes only) is:

lung.reg <- survreg(Surv(time,status) ~ 1, data = lung, dist="exponential")

How would I predict the percent survival at time=400?

When I use the following:

myPredict400 <- predict(lung.reg, newdata=data.frame(time=400), type="response")

I get the following:

myPredict400
       1
421.7758

I was expecting something like 37%, so I must be missing something pretty obvious.

Best answer

The point of this kind of survival model is to fit a distribution to the survival times. Essentially, you are associating each survival time with a probability. Once you have that distribution, you can read off the survival rate for a given time.

Try this:

library(survival)
# dist is omitted, so survreg() fits its default distribution (Weibull)
lung.reg <- survreg(Surv(time,status) ~ 1, data = lung)
pct <- 1:99/100 # grid of cumulative probabilities at which to evaluate quantiles
# predicted survival-time quantile for each probability in pct
myPredict400 <- predict(lung.reg, newdata=data.frame(time=400), type='quantile', p=pct)
indx <- which.min(abs(myPredict400 - 400)) # index of the quantile closest to time 400
print(1 - pct[indx]) # 0.39 (survival = 1 - cumulative probability)
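The grid search can be cross-checked directly. The following is only a sketch: it relies on the parameter mapping described in ?survreg (Weibull shape = 1/scale, Weibull scale = exp(intercept)), and the object names are illustrative rather than part of the original answer.

```r
library(survival)

# Same model as above: dist is omitted, so the default Weibull is used
fit_weibull <- survreg(Surv(time, status) ~ 1, data = lung)

# Map survreg's parameterization onto base R's Weibull functions
wb_shape <- 1 / fit_weibull$scale
wb_scale <- exp(unname(coef(fit_weibull)))

# P(T > 400) under the fitted Weibull distribution
surv400_weibull <- pweibull(400, shape = wb_shape, scale = wb_scale,
                            lower.tail = FALSE)
print(surv400_weibull)
```

The result should land close to the 0.39 found by the grid search, without depending on how fine the grid is.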

Adapted from the help docs, here's a plot of the predicted quantiles:

matplot(myPredict400, 1-pct, xlab="Days", ylab="Survival", type='l', lty=1, col=1) # time in lung is recorded in days

Edited

The model fits a probability distribution to the survival times, and pct is a grid of cumulative probabilities at which that distribution is evaluated (hence 1 to 99 out of 100). If you extend the grid to 100, the last predicted value is Inf, because the 100th percentile of the survival-time distribution is infinite. This is what type='quantile' and the p argument control.
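To make the role of p concrete, here is a small sketch that compares a few predicted quantiles with qweibull(), using the parameter mapping described in ?survreg. The object names are illustrative, and lung[1, ] is passed only because newdata needs some row; the model has no covariates.

```r
library(survival)

fit_weibull <- survreg(Surv(time, status) ~ 1, data = lung)

# Cumulative probabilities: the fraction of subjects expected to have died
p_grid <- c(0.25, 0.50, 0.75)

# Survival times by which those fractions are predicted to have died
q_pred <- as.numeric(predict(fit_weibull, newdata = lung[1, ],
                             type = "quantile", p = p_grid))

# The same quantiles from base R's Weibull quantile function
q_base <- qweibull(p_grid, shape = 1 / fit_weibull$scale,
                   scale = exp(unname(coef(fit_weibull))))

print(cbind(p = p_grid, survival = 1 - p_grid, q_pred, q_base))
```

Each row pairs a time with a survival probability of 1 - p, which is the relationship the grid search inverts to find the survival rate at a given time.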

For example, setting pct = 1:999/1000 gives a finer grid and therefore a more precise prediction (myPredict400). Also, if you set pct to a value that is not a proper probability (i.e. less than 0 or more than 1), you'll get an error. I suggest you play with these values and see how they affect the estimated survival rate.
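For the exponential model in the question, a similar sketch applies. With type="response", predict() returns a predicted survival time rather than a probability, which is why 421.7758 came back. For an intercept-only exponential fit, that value is exp(intercept), the fitted mean survival time. The survival probability at 400 then follows from pexp(). Object names here are illustrative.

```r
library(survival)

fit_exp <- survreg(Surv(time, status) ~ 1, data = lung, dist = "exponential")

# Fitted mean survival time; this is the 421.7758 reported in the question
mean_time <- exp(unname(coef(fit_exp)))

# P(T > 400) for an exponential distribution with that mean
surv400_exp <- pexp(400, rate = 1 / mean_time, lower.tail = FALSE)
print(c(mean_time = mean_time, surv400 = surv400_exp))
```

Equivalently, exp(-400 / 421.7758) is about 0.387, which is in the neighborhood of the 37% the question expected.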

Published: 2023-07-05 08:50:00
Link: https://www.elefans.com/category/jswz/34/1035433.html