(This question is related to "How can I generate a data set of correlated variables with different distributions?")
In Stata, say that I create a random variable following a Uniform[0,1] distribution:
    set seed 100
    gen random1 = runiform()
I now want to create a second random variable that is correlated with the first (the correlation should be .75, say), but is bounded by 0 and 1. I would like this second variable to also be more-or-less Uniform[0,1]. How can I do this?
Recommended answer:
This won't be exact, but the NORTA/copula method should be pretty close and easy to implement.
The relevant citation is:
Cario, Marne C., and Barry L. Nelson. Modeling and generating random vectors with arbitrary marginal distributions and correlation matrix. Technical Report, Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois, 1997.
The paper can be found here.
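The general recipe to generate correlated random variables from any distribution is:
1. Draw two (or more) correlated variables from a joint standard normal distribution (corr2data in the code below).
2. Apply the univariate standard normal CDF to each variable (normal()), which gives Uniform[0,1] marginals.
3. Apply the inverse CDF of the target distribution to each result.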
The third step is pretty easy with the [0,1] uniform: you don't even need it. Typically, the magnitude of the correlations you get will be less than the magnitudes of the original (normal) correlations, so it might be useful to bump those up a bit.
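Step 3 only matters when the target is not uniform. As a rough sketch of what it looks like, assume an exponential target with mean 2 (my choice for illustration, not from the original answer), and note that drawnorm is swapped in for corr2data here: drawnorm draws a random sample, whereas corr2data builds data whose sample correlations match C exactly. The variable names are also made up.

```stata
clear
set seed 100

// Step 1: correlated standard normals
// (drawnorm is an assumption here; the original answer uses corr2data)
matrix C = (1, .75 \ .75, 1)
drawnorm z1 z2, n(10000) corr(C) double

// Step 2: the normal CDF gives Uniform[0,1] marginals
gen double u1 = normal(z1)
gen double u2 = normal(z2)

// Step 3: inverse CDF of the target distribution
// (exponential with mean 2 is an illustrative assumption)
gen double e2 = -2*ln(1 - u2)

// The Pearson correlation shifts under the nonlinear step;
// the rank correlation is unchanged by a monotone transform
corr u1 u2 e2
spearman u1 u2
spearman u1 e2
```

Because the inverse CDF is monotone, the two spearman results should agree, while the Pearson correlation between u1 and e2 will generally differ from the one between u1 and u2.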
Stata code for 2 uniform variables with a correlation of 0.75:

    clear

    // Step 1
    matrix C = (1, .75 \ .75, 1)
    corr2data x y, n(10000) corr(C) double
    corr x y, means

    // Steps 2-3
    replace x = normal(x)
    replace y = normal(y)

    // Make sure things worked
    corr x y, means
    stack x y, into(z) clear
    lab define vars 1 "x" 2 "y"
    lab val _stack vars
    capture ssc install bihist
    bihist z, by(_stack) density tw1(yline(-1 0 1))
If you want to improve the approximation for the uniform case, you can transform the correlations like this (see section 5 of the linked paper):
    matrix C = (1, 2*sin(.75*_pi/6) \ 2*sin(.75*_pi/6), 1)
This sets the off-diagonal entry to 0.76536686 instead of 0.75.
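The adjustment follows from how the correlation r of two standard normals relates to the correlation of their normal() transforms, which is (6/_pi)*asin(r/2). Solving that for r gives 2*sin(rho*_pi/6), where rho is the target. Without the adjustment, r = .75 would give a uniform correlation of roughly 0.73. Below is a rough simulation check, assuming drawnorm (random draws, not used in the original answer) in place of corr2data; the local macro and variable names are my own.

```stata
clear
set seed 100

// Target correlation between the uniforms, and the adjusted normal correlation
local target = .75
local adj = 2*sin(`target'*_pi/6)
display "adjusted normal correlation: " %10.8f `adj'

// Random draws from a bivariate standard normal with the adjusted correlation
matrix C = (1, `adj' \ `adj', 1)
drawnorm z1 z2, n(100000) corr(C) double

// The normal CDF maps each variable to Uniform[0,1]
gen double u1 = normal(z1)
gen double u2 = normal(z2)

// The sample correlation should land near the target, up to sampling noise
corr u1 u2
summarize u1 u2
```

With corr2data instead of drawnorm, the normal-scale correlation is matched exactly, so only the transformation step affects the result.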
Code for the question in the comments
Here the correlation matrix C is written more compactly (lower triangle only, read with cstorage(lower)), with the transformation applied to each off-diagonal entry:
    clear
    matrix C = ( 1, ///
        2*sin(-.46*_pi/6), 1, ///
        2*sin(.53*_pi/6), 2*sin(-.80*_pi/6), 1, ///
        2*sin(0*_pi/6), 2*sin(-.41*_pi/6), 2*sin(.48*_pi/6), 1 )
    corr2data v1 v2 v3 v4, n(10000) corr(C) cstorage(lower)
    forvalues i=1/4 {
        replace v`i' = normal(v`i')
    }