I know Spark SQL is almost the same as Hive.
I have created a table, and when I run a Spark SQL query to create an index on it, I always get this error:
Error in SQL statement: AnalysisException: mismatched input '' expecting AS near ')' in create index statement
The Spark SQL query I am using is:
CREATE INDEX word_idx ON TABLE t (id)
The data type of id is bigint. Before this, I also tried to create an index on the "word" column of this table, and it gave me the same error.
So, is there any way to create an index through a Spark SQL query?
Best answer
There's no way to do this through a Spark SQL query: Spark SQL does not support the CREATE INDEX statement. There is, however, an RDD function called zipWithIndex. You can convert the DataFrame to an RDD, call zipWithIndex, and convert the resulting RDD back to a DataFrame.
See this community Wiki article for a full-blown solution.
Another approach could be to use Spark MLlib's StringIndexer.