我有以下数据集,结构如下:
DATE PERCENTAGE FLAG VALUE1 01JAN2017 0.21 1 1.50 04JAN2017 0.05 0 2.43 09JAN2017 0.06 1 2.21 24JAN2017 0.15 1 1.13我必须向数据集添加新变量,以便这些变量满足以下条件:
在FLAG等于1的情况下,它是第一行:NEW_VAR_1 is equal to 500 * PERCENTAGE; NEW_VAR_2 is equal to NEWVAR_1 * (VALUE1 - 1); NEW_VAR_3 is equal to 500 + NEWVAR_2;在FLAG等于1的情况下,它不是第一行,那么:
NEW_VAR_1 is equal to LAG(NEWVAR_3) * PERCENTAGE; NEW_VAR_2 is equal to NEWVAR_1 * (VALUE1 - 1); NEW_VAR_3 is equal to LAG(NEWVAR_3) + NEWVAR_2;在FLAG等于0的情况下,必须将所有NEWVAR_值设置为丢失。
我需要在SAS上运行此脚本,然后编写以下脚本来执行此操作:
DATA BACKTESTING; SET BACKTESTING; IF _N_ EQ 1 AND FLAG EQ 1 THEN DO; K = 500; NEWVAR_1 = PERCENTAGE * K; NEWVAR_2 = NEWVAR_1 * (VALUE_1 - 1); NEWVAR_3 = K + NEWVAR_2; END; ELSE IF _N_ GT 1 AND FLAG EQ 1 THEN DO; NEWVAR_1 = PERCENTAGE * LAG(NEWVAR_3); NEWVAR_2 = NEWVAR_1 * (VALUE_1 - 1); NEWVAR_3 = K + NEWVAR_2; END; END; RUN;该脚本正常工作,在某种意义上我没有在日志窗口中看到错误或警告消息,但是,正如您可以通过阅读脚本注意到的那样,它在滞后变量中找到缺失值时返回缺失值。
有没有办法克服这样的问题,以便只有当FLAG等于1时才让SAS采用NEWVAR_3的滞后?
希望我在这个问题上已经足够清楚了,感谢所有人提前帮助!
I have the following dataset, structured as follows:
DATE PERCENTAGE FLAG VALUE1 01JAN2017 0.21 1 1.50 04JAN2017 0.05 0 2.43 09JAN2017 0.06 1 2.21 24JAN2017 0.15 1 1.13I have to add new variables to the dataset such that those variables will fulfill the following condition:
in the case FLAG is equal to 1 and it's the first row then:NEW_VAR_1 is equal to 500 * PERCENTAGE; NEW_VAR_2 is equal to NEWVAR_1 * (VALUE1 - 1); NEW_VAR_3 is equal to 500 + NEWVAR_2;In the case FLAG is equal to 1 and it's not the first row then:
NEW_VAR_1 is equal to LAG(NEWVAR_3) * PERCENTAGE; NEW_VAR_2 is equal to NEWVAR_1 * (VALUE1 - 1); NEW_VAR_3 is equal to LAG(NEWVAR_3) + NEWVAR_2;In the case FLAG is equal to 0 then all the NEWVAR_ values have to be set on missing.
I need to run this script on SAS and I write the following script down to do that:
DATA BACKTESTING; SET BACKTESTING; IF _N_ EQ 1 AND FLAG EQ 1 THEN DO; K = 500; NEWVAR_1 = PERCENTAGE * K; NEWVAR_2 = NEWVAR_1 * (VALUE_1 - 1); NEWVAR_3 = K + NEWVAR_2; END; ELSE IF _N_ GT 1 AND FLAG EQ 1 THEN DO; NEWVAR_1 = PERCENTAGE * LAG(NEWVAR_3); NEWVAR_2 = NEWVAR_1 * (VALUE_1 - 1); NEWVAR_3 = K + NEWVAR_2; END; END; RUN;The script works correctly, in the sense I did not see error or warning message in the log window, but, as you can note by reading the script, it returns missing values when it finds a missing values in the lagged variables.
Is there a way to overcome such problem, in order to be able letting SAS takes the lag of NEWVAR_3 only when FLAG is equal to 1?
In the hope I've been clear enough in the question, thanks all in advance for the help!
最满意答案
lag的问题是它实际上并没有读取先前的值。 相反,它会在每次调用时将当前值添加到隐藏数组中 - 然后在后续调用中检索该数组。
所以 - 如果你没有在每次迭代时调用lag (或调用它两次),你将得到意想不到的结果。
避免这种情况的一种方法是使用简单的retain ,例如如下:
DATA BACKTESTING; SET BACKTESTING; IF _N_ EQ 1 AND FLAG EQ 1 THEN DO; K = 500; NEWVAR_1 = PERCENTAGE * K; NEWVAR_2 = NEWVAR_1 * (VALUE_1 - 1); NEWVAR_3 = K + NEWVAR_2; END; ELSE IF _N_ GT 1 AND FLAG EQ 1 THEN DO; NEWVAR_1 = PERCENTAGE * LAG_NEWVAR_3; NEWVAR_2 = NEWVAR_1 * (VALUE_1 - 1); NEWVAR_3 = K + NEWVAR_2; END; END; /* create temp retain variable */ retain LAG_NEWVAR_3 0; drop LAG_NEWVAR_3; LAG_NEWVAR_3=NEWVAR_3; RUN;滞后函数的文档: http : //support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm
The problem with lag is that it does not actually read the previous value. Instead, it adds the current value to a hidden array every time it is called - which is then retrieved in subsequent calls.
So - if you do not call lag on every iteration (or call it twice) you will get unexpected results.
One way to avoid this is to use a simple retain, eg as follows:
DATA BACKTESTING; SET BACKTESTING; IF _N_ EQ 1 AND FLAG EQ 1 THEN DO; K = 500; NEWVAR_1 = PERCENTAGE * K; NEWVAR_2 = NEWVAR_1 * (VALUE_1 - 1); NEWVAR_3 = K + NEWVAR_2; END; ELSE IF _N_ GT 1 AND FLAG EQ 1 THEN DO; NEWVAR_1 = PERCENTAGE * LAG_NEWVAR_3; NEWVAR_2 = NEWVAR_1 * (VALUE_1 - 1); NEWVAR_3 = K + NEWVAR_2; END; END; /* create temp retain variable */ retain LAG_NEWVAR_3 0; drop LAG_NEWVAR_3; LAG_NEWVAR_3=NEWVAR_3; RUN;Documentation for lag function: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212547.htm
更多推荐
发布评论