Clickhouse的clickhouse

Updated: 2024-10-16 16:16:41


Contents

  • 1. clickhouse-client
    • 1.1 Interactive mode
    • 1.2 Non-interactive mode
  • 2. Connecting from Java/Scala via JDBC
  • 3. Connecting from PySpark

1. clickhouse-client

The client talks to the server over ClickHouse's native TCP protocol (port 9000 by default).

1.1 Interactive mode

[root@clickhouse1 ~]#
[root@clickhouse1 ~]# clickhouse-client -h clickhouse1 --port 9000 -u default --password default123 -m -n
ClickHouse client version 21.6.5.37 (official build).
Connecting to clickhouse1:9000 as user default.
Connected to ClickHouse server version 21.6.5 revision 54448.

clickhouse1 :) select 1;

SELECT 1

Query id: a58a5e62-75cb-4b3c-a8e8-5ed8d2917b85

┌─1─┐
│ 1 │
└───┘

1 rows in set. Elapsed: 0.016 sec.

clickhouse1 :) select 2;

SELECT 2

Query id: 35771efe-e306-403d-b39a-7a58abba4412

┌─2─┐
│ 2 │
└───┘

1 rows in set. Elapsed: 0.006 sec.

clickhouse1 :) select hostName();

SELECT hostName()

Query id: 7d08f34a-489e-4e05-91be-a2e72f4c08fa

┌─hostName()──┐
│ clickhouse1 │
└─────────────┘

1 rows in set. Elapsed: 0.010 sec.

clickhouse1 :)

Executed statements are recorded in ~/.clickhouse-client-history:

[root@clickhouse1 ~]# 
[root@clickhouse1 ~]# pwd
/root
[root@clickhouse1 ~]# 
[root@clickhouse1 ~]# cat .clickhouse-client-history
...... (output omitted) ......
### 2021-07-09 04:37:28.414
select 1;
### 2021-07-09 04:39:48.674
select 2;
### 2021-07-09 04:39:51.683
select hostName();
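Each history entry is a `### <timestamp>` line followed by the statement executed at that time, which makes the file easy to post-process. As an illustration only, here is a minimal Python sketch that parses entries shaped like the ones above (`parse_history` is a made-up helper, not part of ClickHouse):

```python
import re

def parse_history(text):
    """Parse .clickhouse-client-history entries: a '### <timestamp>' line
    followed by the statement executed at that time."""
    entries = []
    timestamp = None
    for line in text.splitlines():
        m = re.match(r"^### (.+)$", line)
        if m:
            timestamp = m.group(1)
        elif line.strip() and timestamp is not None:
            entries.append((timestamp, line.strip()))
    return entries

# Sample mirroring the history file shown above.
sample = """### 2021-07-09 04:37:28.414
select 1;
### 2021-07-09 04:39:48.674
select 2;
### 2021-07-09 04:39:51.683
select hostName();
"""

for ts, stmt in parse_history(sample):
    print(ts, stmt)
```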

1.2 Non-interactive mode

Importing data

[root@clickhouse1 ~]# 
[root@clickhouse1 ~]# pwd
/root
[root@clickhouse1 ~]# 
[root@clickhouse1 ~]# cat insert.csv 
3,san
4,si
[root@clickhouse1 ~]#
[root@clickhouse1 ~]# clickhouse-client -h clickhouse1 --port 9000 -u default --password default123 -m -n --format_csv_delimiter="," --query "insert into distribute_test_all format CSV" < /root/insert.csv 
[root@clickhouse1 ~]#

Exporting data

[root@clickhouse1 ~]#
[root@clickhouse1 ~]# clickhouse-client -h clickhouse1 --port 9000 -u default --password default123 -m -n --query "select * from distribute_test_all" > /root/select.csv 
[root@clickhouse1 ~]# 
[root@clickhouse1 ~]# pwd
/root
[root@clickhouse1 ~]# 
[root@clickhouse1 ~]# cat select.csv 
1	一
4	si
1	yi
2	er
3	san
[root@clickhouse1 ~]#
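Note that despite the .csv name, the export above is tab-delimited: when no `--format` is given, clickhouse-client writes query results in ClickHouse's default TabSeparated format. A minimal Python sketch of reading such a file (the sample string mirrors the output above):

```python
import csv
import io

# The exported file is tab-delimited (ClickHouse's default TabSeparated
# output), so parse it with delimiter="\t" rather than ",".
exported = "1\t一\n4\tsi\n1\tyi\n2\ter\n3\tsan\n"
rows = list(csv.reader(io.StringIO(exported), delimiter="\t"))
print(rows)
```

Passing an explicit `--format CSV` to clickhouse-client would produce genuinely comma-separated output instead.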

2. Connecting from Java/Scala via JDBC

This path goes through ClickHouse's HTTP interface (port 8123) under the hood. A Java example:

pom.xml

        <dependency>
            <groupId>com.clickhouse</groupId>
            <artifactId>clickhouse-jdbc</artifactId>
            <version>0.3.2-patch8</version>
        </dependency>

Clickhouse_JDBC_test.java

import ru.yandex.clickhouse.BalancedClickhouseDataSource;
import ru.yandex.clickhouse.settings.ClickHouseProperties;

import java.sql.Connection;
import java.sql.PreparedStatement;

public class Clickhouse_JDBC_test {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:clickhouse://192.168.23.41:8123,192.168.23.42:8123,192.168.23.43:8123,192.168.23.44:8123/default";
        ClickHouseProperties clickHouseProperties = new ClickHouseProperties();
        clickHouseProperties.setUser("default");
        clickHouseProperties.setPassword("default123");
        BalancedClickhouseDataSource balanced = new BalancedClickhouseDataSource(url, clickHouseProperties);
        // ping each host and drop unreachable connections
        balanced.actualize();
        Connection conn = balanced.getConnection();

        String insertSql = "insert into test.java_table_test_all(id, name) values(?, ?)";
        PreparedStatement pst = conn.prepareStatement(insertSql);
        for (int i = 1; i < 9999; i++) {
            pst.setInt(1, i);
            pst.setString(2, "name" + i);
            pst.addBatch();
            if (i % 1000 == 0) {
                // conn.setAutoCommit(false) throws a "Transactions are not
                // supported" exception, so conn.commit() cannot be called;
                // just call pst.executeBatch() and let ClickHouse commit in
                // the background. The inserted row count was correct in testing.
                pst.executeBatch();
                pst.clearBatch();
            }
        }
        // flush the final partial batch
        pst.executeBatch();
        pst.clearBatch();
        pst.close();
        conn.close();
    }
}
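The flush pattern above runs executeBatch() every 1000 rows, plus one final flush for the remainder. Purely as an illustration of that counting logic, a small Python sketch (`batch_flushes` is a made-up helper):

```python
def batch_flushes(n_rows, batch_size=1000):
    """Count how many executeBatch() calls the loop performs:
    one per full batch, plus a final flush for the remainder."""
    flushes = 0
    for i in range(1, n_rows + 1):
        if i % batch_size == 0:
            flushes += 1
    # final flush (may be empty, mirroring the Java code above)
    flushes += 1
    return flushes

# The Java loop runs i from 1 to 9998 (i < 9999): 9 full batches of 1000
# rows, then a final flush for the remaining 998 rows.
print(batch_flushes(9998))
```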

3. Connecting from PySpark

Adding the driver jars

  1. Download clickhouse-jdbc-0.3.2.jar from the clickhouse-jdbc artifact in the Maven repository
  2. Download clickhouse-client-0.3.2.jar from the clickhouse-client artifact in the Maven repository
  3. Download clickhouse-http-client-0.3.2.jar from the clickhouse-http-client artifact in the Maven repository
  4. Copy the downloaded jars into the spark-3.1.2-bin-hadoop3.2/jars directory on every Spark server
  5. Restart the Spark services

An example PySpark program:


from pyspark.sql import SparkSession
from pyspark import SparkConf

if __name__ == '__main__':
    conf = SparkConf() \
        .setAppName("clickhouseTest")
    spark = SparkSession.builder.config(conf=conf).getOrCreate()
    spark.sparkContext.setLogLevel("WARN")

    df = spark.read.format("jdbc") \
        .option("driver", "com.clickhouse.jdbc.ClickHouseDriver") \
        .option("url", "jdbc:clickhouse://bigdata001:8123,bigdata002:8123,bigdata003:8123/clickhouse_test_db") \
        .option("user", "default") \
        .option("password", "default123") \
        .option("dbtable", "(select user_id, name, age from user_tb where age between 10 and 30) tempTable") \
        .option("partitionColumn", "age") \
        .option("fetchsize", 25000) \
        .option("lowerBound", 10) \
        .option("upperBound", 30) \
        .option("numPartitions", 15) \
        .option("queryTimeout", 600) \
        .load()

    df.show(n=3, truncate=False)

    spark.stop()
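With partitionColumn, lowerBound, upperBound, and numPartitions, Spark splits the read into parallel range queries on age, and the first and last ranges are left unbounded so rows outside the bounds are not dropped. A simplified Python sketch of the kind of predicates generated (this approximates, and is not, Spark's exact internal stride computation):

```python
def partition_predicates(column, lower, upper, num_partitions):
    """Simplified model of Spark's JDBC range partitioning: evenly spaced
    strides over [lower, upper), with the first partition unbounded below
    and the last unbounded above so no rows are lost."""
    if num_partitions <= 1:
        return ["1=1"]  # a single partition reads everything
    stride = (upper - lower) / num_partitions
    bounds = [round(lower + i * stride) for i in range(1, num_partitions)]
    preds = [f"{column} < {bounds[0]}"]
    for prev, b in zip(bounds, bounds[1:]):
        preds.append(f"{column} >= {prev} AND {column} < {b}")
    preds.append(f"{column} >= {bounds[-1]}")
    return preds

# e.g. 5 partitions over age in [10, 30)
for p in partition_predicates("age", 10, 30, 5):
    print(p)
```

Each predicate becomes the WHERE clause of one JDBC query, so one task per partition reads its slice of the table in parallel.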

Published: 2024-02-26 23:40:53
Original link: https://www.elefans.com/category/jswz/34/1704417.html