Python中的标准化/标准化测试数据(Standardization/Normalization test data in Python)

编程入门 行业动态 更新时间:2024-10-27 00:30:49
Python中的标准化/标准化测试数据(Standardization/Normalization test data in Python)

我正在做一个sklearn作业,我不明白为什么要用训练均值和sd来标准化和标准化测试数据。 我如何在Python中实现它? 这里是我的火车数据实现:

digits = sklearn.datasets.load_digits() X= digits.data Y= digits.target X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3,train_size=0.7) std_scale = preprocessing.StandardScaler().fit(X_train) X_train_std = std_scale.transform(X_train) #X_test_std=??

对于火车我认为这是正确的,但对于测试?

I'm working in a sklearn homework and I don't understand why one should standardize and normalize the test data with the training mean and sd. How can I implement this in Python? Here is my implementation for the train data:

digits = sklearn.datasets.load_digits() X= digits.data Y= digits.target X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3,train_size=0.7) std_scale = preprocessing.StandardScaler().fit(X_train) X_train_std = std_scale.transform(X_train) #X_test_std=??

For the train i think it's correct, but for the test?

最满意答案

为什么?

因为您的分类器/回归器将接受这些标准化值的培训。 你不想用你的训练分类器来预测其他统计数据。

怎么样:

std_scale = preprocessing.StandardScaler().fit(X_train) X_train_std = std_scale.transform(X_train) X_test_std = std_scale.transform(X_test)

适合一次,改变你需要改变的任何东西。 这是基于类的StandardScaler (您已经选择的)的优势,与之后没有适用转换所需的信息(基于适合期间获得的统计数据)所需的信息的比例相比。

Why?

Because your classifier/regressor will be trained on those standardizes values. You don't want to use your trained-classifier to predict data which has other statistics.

How:

std_scale = preprocessing.StandardScaler().fit(X_train) X_train_std = std_scale.transform(X_train) X_test_std = std_scale.transform(X_test)

Fitting once, transforming whatever you need to transform. That's the advantage of the class-based StandardScaler (which you already had chosen) compared to scale which does not hold the needed information needed for applying transformations (based on these statistics obtained during fit) at a later time.

更多推荐

本文发布于:2023-08-06 06:00:00,感谢您对本站的认可!
本文链接:https://www.elefans.com/category/jswz/34/1445039.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
本文标签:测试数据   Standardization   Python   data   test

发布评论

评论列表 (有 0 条评论)
草根站长

>www.elefans.com

编程频道|电子爱好者 - 技术资讯及电子产品介绍!