I have successfully implemented a kernel perceptron classifier that uses an RBF kernel. I understand that the kernel trick maps features to a higher dimension so that a linear hyperplane can be constructed to separate the points. For example, if you have features (x1, x2) and map them to a 3-dimensional feature space, you might get the feature map: phi(x1, x2) = (x1^2, sqrt(2)*x1*x2, x2^2), whose inner product reproduces the polynomial kernel K(x, z) = (x . z)^2.
If you plug that into the perceptron decision function w'x + b = 0, you end up with: w1*x1^2 + w2*sqrt(2)*x1*x2 + w3*x2^2 + b = 0, which gives you a quadratic decision boundary such as a circle.
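As a quick sanity check on that correspondence, here is a small sketch using the standard degree-2 polynomial feature map phi(x) = (x1^2, sqrt(2) x1 x2, x2^2) (the sqrt(2) factor is what makes the dot product in 3-D equal the kernel value computed directly in 2-D):

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D point x = (x1, x2)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

# Dot product in the explicit 3-D feature space ...
explicit = phi(x).dot(phi(z))
# ... equals the kernel evaluated directly on the 2-D inputs.
kernel = x.dot(z) ** 2

print(explicit, kernel)  # both 16.0
```

The kernel side never touches the 3-D space: it is one 2-D dot product and one squaring.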
While the kernel trick itself is very intuitive, I am not able to understand the linear algebra aspect of this. Can someone help me understand how we are able to map all of these additional features without explicitly specifying them, using just the inner product?
Thanks!
Best answer
Simple.
Give me the numeric result of (x+y)^10 for some values of x and y.
What would you rather do: "cheat" by summing x + y and then raising that value to the 10th power, or expand out the exact result,
x^10 + 10 x^9 y + 45 x^8 y^2 + 120 x^7 y^3 + 210 x^6 y^4 + 252 x^5 y^5 + 210 x^4 y^6 + 120 x^3 y^7 + 45 x^2 y^8 + 10 x y^9 + y^10,
and then compute each term and add them all together? Clearly, we can evaluate the dot product between degree-10 feature expansions without explicitly forming them.
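The same comparison in code (a toy illustration, not part of any classifier):

```python
from math import comb

x, y = 2, 3

# "Cheat": one addition and one exponentiation.
cheap = (x + y) ** 10

# Expand the binomial and sum all 11 terms explicitly.
expanded = sum(comb(10, k) * x**(10 - k) * y**k for k in range(11))

print(cheap, expanded)  # both 9765625
```

Both routes give the identical number; the kernel route just skips building the 11-term expansion.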
Valid kernels are dot products where we can "cheat" and compute the numeric result between two points without having to form their explicit feature vectors. There are many such possible kernels, though only a few are used heavily in papers and in practice.
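To connect this back to the question: in a kernel perceptron, the weight vector w = sum_j alpha_j y_j phi(x_j) is never formed. Training and prediction only ever need kernel evaluations K(x_i, x_j), so the feature map (infinite-dimensional for RBF) stays implicit. Here is a minimal sketch, with function and parameter names (rbf, train, predict, gamma) chosen just for illustration:

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """RBF kernel: implicitly a dot product in an infinite-dimensional space."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def train(X, y, epochs=10, gamma=1.0):
    """Kernel perceptron: track per-example mistake counts alpha instead of w."""
    n = len(X)
    alpha = np.zeros(n)
    K = np.array([[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)])
    for _ in range(epochs):
        for i in range(n):
            # The prediction uses only kernel values; w = sum_j alpha_j y_j phi(x_j)
            # is never constructed explicitly.
            pred = np.sign(np.sum(alpha * y * K[:, i]))
            if pred != y[i]:
                alpha[i] += 1
    return alpha

def predict(X, y, alpha, x_new, gamma=1.0):
    k = np.array([rbf(xi, x_new, gamma) for xi in X])
    return np.sign(np.sum(alpha * y * k))

# XOR-like data: not linearly separable in the original 2-D space.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1., 1., 1., -1.])
alpha = train(X, y)
print([float(predict(X, y, alpha, xi) ) for xi in X])  # [-1.0, 1.0, 1.0, -1.0]
```

Note that all the "linear algebra" happens through the Gram matrix K; the hyperplane exists only in the implicit feature space.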