Question
How do I copy a ColumnFamily from one Cassandra cluster to another?
Scenario:
I have only the host IPs (for both the source and target clusters), the port, the keyspace name, and the column family name.
I have already created the metadata in the target cluster (only the data needs to be copied).
Most preferably, I want to do this in one or more Spark jobs (creating an intermediate DataFrame and then saving it) using the spark-cassandra-connector Java API.
As a second choice, I would use the Cassandra Java driver from DataStax.
As a last resort, I would use the cassandra-jdbc driver together with the spark-cassandra-connector Java API.
Any help will be appreciated. Thanks in advance.
Recommended answer
After a lot of effort, we found a solution. It is very simple, if surprising. This can be done with Spark; here is what we did.
What we were doing (which did not work):
    // Reading from the first Cassandra cluster.
    // otherOptionsMap and saveMode are placeholders for your own options and save mode.
    dataframe = cassandraSQLContext.read()
        .format("org.apache.spark.sql.cassandra")
        .options(otherOptionsMap)
        .option("spark.cassandra.connection.host", "firstClusterIP")
        .load();
    // Writing to the second Cassandra cluster.
    dataframe.write()
        .format("org.apache.spark.sql.cassandra")
        .mode(saveMode)
        .options(otherOptionsMap)
        .option("spark.cassandra.connection.host", "secondClusterIP")
        .save();
What worked:
    // Reading from the first Cassandra cluster.
    // otherOptionsMap and saveMode are placeholders for your own options and save mode.
    dataframe = cassandraSQLContext.read()
        .format("org.apache.spark.sql.cassandra")
        .options(otherOptionsMap)
        .option("spark_cassandra_connection_host", "firstClusterIP")
        .load();
    // Writing to the second Cassandra cluster.
    dataframe.write()
        .format("org.apache.spark.sql.cassandra")
        .mode(saveMode)
        .options(otherOptionsMap)
        .option("spark_cassandra_connection_host", "secondClusterIP")
        .save();
Yes, that's right: you just have to change the periods (.) to underscores (_) in the name of the spark-cassandra-connector host property. I don't know whether this is a bug in the spark-cassandra-connector.
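To keep the two clusters' settings apart, one option is to build a separate options map per cluster and pass each to the reader or writer. The following is only a minimal sketch using the plain Java standard library: the class name ClusterOptions, the helper names, and the sample IPs, keyspace, and table are hypothetical, and it assumes the connector version in use accepts the underscore form of the host key as described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Spark or the connector) that builds
// per-cluster option maps for the Cassandra data source.
final class ClusterOptions {

    // Converts a dotted connector property name to the underscore form
    // that the answer above reports as working in DataFrame options.
    static String toOptionKey(String property) {
        return property.replace('.', '_');
    }

    // Builds the options for one cluster: which host to contact and
    // which keyspace/table (column family) to read from or write to.
    static Map<String, String> forCluster(String host, String keyspace, String table) {
        Map<String, String> options = new HashMap<>();
        options.put("keyspace", keyspace);
        options.put("table", table);
        options.put(toOptionKey("spark.cassandra.connection.host"), host);
        return options;
    }

    public static void main(String[] args) {
        // Sample values; replace with the real IPs and names.
        Map<String, String> source = forCluster("10.0.0.1", "my_keyspace", "my_column_family");
        Map<String, String> target = forCluster("10.0.0.2", "my_keyspace", "my_column_family");
        System.out.println("source options: " + source);
        System.out.println("target options: " + target);

        // With Spark and the connector on the classpath, the maps would be used
        // roughly like this (not compiled here):
        //   dataframe = cassandraSQLContext.read()
        //       .format("org.apache.spark.sql.cassandra").options(source).load();
        //   dataframe.write()
        //       .format("org.apache.spark.sql.cassandra").mode(saveMode).options(target).save();
    }
}
```

Because each map carries its own host entry, the read and the write can point at different clusters within the same Spark job, which is the behavior the answer relies on.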