About the input format of vowpal_wabbit

I am new to this. I want to train on a set of data and then predict. I have been trying for a long time; could you tell me what I am doing wrong?

My training data looks like this (the first few lines):

-1 '13731#276 |f gender:0 age_range:2 action0:1 action1:0 action2:1 action3:0
-1 '70175#4214 |f gender:0 age_range:4 action0:0 action1:0 action2:1 action3:0
-1 '89370#2598 |f gender:1 age_range:2 action0:8 action1:0 action2:1 action3:0
1 '89371#1250 |f gender:0 age_range:2 action0:0 action1:0 action2:1 action3:0
-1 '89372#2792 |f gender:1 age_range:5 action0:0 action1:0 action2:1 action3:0
1 '89372#962 |f gender:1 age_range:5 action0:0 action1:0 action2:1 action3:0
-1 '89373#4472 |f gender:0 age_range:7 action0:5 action1:0 action2:1 action3:0
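
Each example above follows the pattern label, then a tag introduced by an apostrophe, then a pipe followed directly by the namespace name (f), then name:value features. As a rough illustration of how such a line breaks down, here is a minimal Python sketch. The function name parse_vw_line is made up for this post, and it only handles the simple single-namespace layout shown here, not the full vw input grammar.

```python
def parse_vw_line(line):
    """Split one simple VW-style example into label, tag, namespace, features.

    Assumption: exactly one namespace and no importance weight, as in the
    examples from the question. This is not a full parser for the vw format.
    """
    head, _, body = line.partition("|")

    label, tag = None, None
    for token in head.split():
        if token.startswith("'"):
            tag = token[1:]        # the tag is marked by a leading apostrophe
        elif label is None:
            label = float(token)   # the first bare token is the label

    tokens = body.split()
    namespace = None
    # A namespace name must touch the pipe; "| a b" would have no namespace.
    if body and not body[0].isspace() and tokens:
        namespace, tokens = tokens[0], tokens[1:]

    features = {}
    for token in tokens:
        name, _, value = token.partition(":")
        # A feature written without ":value" has an implicit value of 1.
        features[name] = float(value) if value else 1.0
    return label, tag, namespace, features


example = "-1 '13731#276 |f gender:0 age_range:2 action0:1 action1:0 action2:1 action3:0"
label, tag, namespace, features = parse_vw_line(example)
print(label, tag, namespace, features)
```

Read this way, the training and test lines in the question are laid out consistently: a label of -1 or 1, a tag that is echoed back in the predictions file, and six numeric features in namespace f.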

The test data looks like this:

1 '177796#1807 |f gender:0 age_range:5 action0:5 action1:0 action2:1 action3:0
1 '155638#2445 |f gender:0 age_range:7 action0:3 action1:0 action2:1 action3:0
1 '155639#658 |f gender:1 age_range:2 action0:5 action1:0 action2:1 action3:0
1 '127479#2480 |f gender:0 age_range:7 action0:0 action1:0 action2:1 action3:0
1 '127478#1245 |f gender:0 age_range:4 action0:1 action1:0 action2:1 action3:0
1 '127473#4995 |f gender:1 age_range:4 action0:13 action1:0 action2:1 action3:0
1 '127472#45 |f gender:0 age_range:7 action0:4 action1:0 action2:1 action3:0

Yes, the two files look no different. I don't know whether that is right, but I have seen many people on GitHub write them this way.

My vw commands are as follows:

vw -d train.vw --loss_function=logistic -f model.vw
vw -d test.vw -t -i model.vw --loss_function=logistic -r shop.preds.txt

The result is:

-2.816693 177796#1807
-2.817430 155638#2445
-2.981194 155639#658
-2.821442 127479#2480
-2.823012 127478#1245
-2.968556 127473#4995
-2.816092 127472#45
-2.820939 127471#4010
-2.975476 127470#593
-2.820105 155634#4103
-2.799539 155635#2980
-3.139279 127475#1469

I don't know why the numbers all come out below -2. My ideal result would look like this:

202178#1665,0.67
156148#4730,0.50
132360#2459,0.24
132360#144,0.99
180387#1534,0.48
187963#1360,0.19
158187#2534,0.54
188206#4890,0.70

At the very least I want the numbers to be correct, but they are all 1. Could you tell me how to fix this? Thanks!

Best answer

If you want to predict probabilities, then instead of

vw -d test.vw -t -i model.vw --loss_function=logistic -r shop.preds.txt

you should use

vw -d test.vw -t -i model.vw --loss_function=logistic --link=logistic -p shop.preds.txt
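
To see why the numbers in the question are negative: -r writes raw scores, and a logistic link maps a raw score x to 1 / (1 + exp(-x)). The Python sketch below applies that formula by hand to two of the raw lines from the question and rearranges them into the tag,probability layout the asker wanted. The helper names logistic and to_csv_rows are made up for illustration, and the two-decimal rounding is only an assumption based on the desired output shown in the question.

```python
import math


def logistic(raw_score):
    """Standard sigmoid: maps a raw score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-raw_score))


def to_csv_rows(prediction_lines):
    """Turn 'raw_score tag' lines into 'tag,probability' rows."""
    rows = []
    for line in prediction_lines:
        raw, tag = line.split()
        rows.append(f"{tag},{logistic(float(raw)):.2f}")
    return rows


# Two raw predictions copied from the question's output.
raw_predictions = [
    "-2.816693 177796#1807",
    "-2.981194 155639#658",
]
rows = to_csv_rows(raw_predictions)
for row in rows:
    print(row)
```

So raw scores around -2.8 are not broken output; they correspond to small predicted probabilities of the +1 class, which is at least consistent with the training sample shown being mostly -1.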

If you want to get the most probable label (-1 or +1), use

vw -d test.vw -t -i model.vw --loss_function=logistic --binary -p shop.preds.txt
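
Picking the most probable label amounts to thresholding: a probability above 0.5 is the same thing as a raw score above 0. A minimal Python sketch of that rule follows. The function name binary_label is hypothetical, and the treatment of a score of exactly 0 is my assumption rather than something stated in the vw documentation.

```python
def binary_label(raw_score):
    """Map a raw score to -1 or +1 by thresholding at zero.

    Assumption: a score of exactly 0 is mapped to -1 here.
    """
    return 1 if raw_score > 0 else -1


# Raw scores copied from the question's output; all are negative.
raw_scores = [-2.816693, -2.817430, -2.981194]
labels = [binary_label(score) for score in raw_scores]
print(labels)
```

With the model from the question, every test example shown would therefore be labelled -1, so the probabilities are usually the more informative output.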

See https://github.com/JohnLangford/vowpal_wabbit/wiki/Predicting-probabilities
