Serialization, classification in pyBrain, machine learning, prediction

Updated: 2024-10-28 17:24:46


Here is an example of my training data (I have 1000 films for training). I need to predict the 'budget' of each film:

    film_1 = {
        'title': 'The Hobbit: An Unexpected Journey',
        'article_size': 25000,
        'producer': ['Peter Jackson', 'Fran Walsh', 'Zane Weiner'],
        'release_date': some_date(2013, 11, 28),
        'running_time': 169,
        'country': ['New Zealand', 'UK', 'USA'],
        'budget': dec('200000000')
    }

Keys such as 'title', 'producer', and 'country' can be viewed as features in machine learning, while values such as 'The Hobbit: An Unexpected Journey', 25000, etc. can be viewed as the values used in the learning process. However, training input is mostly expected to be real numbers rather than strings. Do I need to convert string-valued fields like 'title', 'producer', and 'country' to int (through something like classification or serialization), or apply some other manipulation, so that I can use this data as a training set for my network?

Best answer


I was wondering whether this is what you need:

    film_list = ['title', 'article_size', 'producer', 'release_date',
                 'running_time', 'country', 'budget']
    flist = list(enumerate(film_list))   # (index, name) pairs
    label = [seq[0] for seq in flist]    # numeric label of each field
    name = [seq[1] for seq in flist]     # field name of each label
    print(label)
    print(name)
    # >> [0, 1, 2, 3, 4, 5, 6]
    # >> ['title', 'article_size', 'producer', 'release_date', 'running_time', 'country', 'budget']
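One way the index/name pairs above might be put to use is to fix a column order, so that every film yields its values in the same positions. The following is only a minimal sketch: the helper `to_row` is a hypothetical name, and the film dictionary is trimmed to scalar fields for illustration.

```python
# Fixed column order for every row (a trimmed, illustrative field list).
film_list = ['title', 'article_size', 'running_time', 'budget']

# Lookups in both directions: index -> name and name -> index.
index_to_name = dict(enumerate(film_list))
name_to_index = {name: i for i, name in enumerate(film_list)}


def to_row(film):
    """Hypothetical helper: return the film's values in film_list order."""
    return [film[name] for name in film_list]


film = {
    'title': 'The Hobbit: An Unexpected Journey',
    'article_size': 25000,
    'running_time': 169,
    'budget': 200000000,
}
row = to_row(film)
print(row[name_to_index['running_time']])
```

This only orders the values; string values such as the title are still strings and would need a separate numeric encoding before training.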

Or you can use your dictionary directly,

    labels = list(film_1.keys())
    print(labels)
    # Before Python 3.7, dict key order is not guaranteed to match the order
    # in which the keys were written, so labels[0] may give you 'producer'
    # instead of 'title':
    # >> ['producer', 'title', 'country', 'release_date', 'budget', 'article_size', 'running_time']
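The snippets above label the field names; the string values themselves still have to become numbers before they can be fed to a network. A common approach, swapped in here as an illustration and not something the answer itself proposes, is multi-hot encoding: one 0/1 column per distinct producer or country. The sketch below is plain Python with no pyBrain calls. It assumes `release_date` is a `datetime.date` (the question's `some_date` is not defined), it skips 'title', and `build_vocab`, `multi_hot`, and `encode` are hypothetical helper names.

```python
from datetime import date

# Two illustrative films; only the input fields are shown.
films = [
    {'producer': ['Peter Jackson', 'Fran Walsh'],
     'country': ['New Zealand', 'USA'],
     'release_date': date(2013, 11, 28),
     'running_time': 169, 'article_size': 25000},
    {'producer': ['Fran Walsh'],
     'country': ['UK'],
     'release_date': date(2010, 5, 1),
     'running_time': 120, 'article_size': 9000},
]


def build_vocab(films, field):
    """Map each distinct string in a list-valued field to a column index."""
    values = sorted({v for film in films for v in film[field]})
    return {v: i for i, v in enumerate(values)}


def multi_hot(items, vocab):
    """Return a 0/1 vector with a 1.0 for every item present in vocab."""
    vec = [0.0] * len(vocab)
    for item in items:
        if item in vocab:  # strings unseen during training are ignored
            vec[vocab[item]] = 1.0
    return vec


def encode(film, producer_vocab, country_vocab):
    """Turn one film dict into a flat list of floats."""
    numeric = [
        float(film['article_size']),
        float(film['running_time']),
        float(film['release_date'].toordinal()),  # date as a day count
    ]
    return (numeric
            + multi_hot(film['producer'], producer_vocab)
            + multi_hot(film['country'], country_vocab))


producer_vocab = build_vocab(films, 'producer')
country_vocab = build_vocab(films, 'country')
rows = [encode(f, producer_vocab, country_vocab) for f in films]
print(len(rows[0]))
```

Each row now has the same length and contains only floats. In practice the numeric columns would probably also need rescaling to a similar range, and a free-text field such as the title may be better dropped or handled separately.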

Published: 2023-08-05 20:25:00
Article link: https://www.elefans.com/category/jswz/34/1438595.html