Copying data from one HBase table to another

Updated: 2024-10-25 12:22:20

Problem description


I created a Hive table, hivetest, which also created an HBase table named 'hbasetest'. Now I want to copy the data in 'hbasetest' into another HBase table (say, logdata) with the same schema. Can anyone tell me how to copy the data from 'hbasetest' to 'logdata' without using Hive?

CREATE TABLE hivetest(cookie string, timespent string, pageviews string, visit string, logdate string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = "m:timespent, m:pageviews, m:visit, m:logdate") TBLPROPERTIES ("hbase.table.name" = "hbasetest");

Updated question:

I created the table logdata as shown below, but I get the following error.

create 'logdata', {NAME => ' m', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}

13/09/23 12:57:19 INFO mapred.JobClient: Task Id : attempt_201309231115_0025_m_000000_0, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 755 actions: org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column family m does not exist in region logdata,,1379920697845.30fce8bcc99bf9ed321720496a3ec498. in table 'logdata', {NAME => 'm', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3773)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
: 755 times, servers with issues: master:60020,
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1674)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1450)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:916)
    at org.apache.hadoop.hbase.client.HTable.close(HTable.java:953)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.close(TableOutputFormat.java:109)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

13/09/23 12:57:29 INFO mapred.JobClient: Task Id : attempt_201309231115_0025_m_000000_1, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 755 actions: org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column family m does not exist in region logdata,,1379920697845.30fce8bcc99bf9ed321720496a3ec498. in table 'logdata', {NAME => 'm', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3773)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
: 755 times, servers with issues: master:60020,
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1674)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1450)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:916)
    at org.apache.hadoop.hbase.client.HTable.close(HTable.java:953)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.close(TableOutputFormat.java:109)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

13/09/23 12:57:38 INFO mapred.JobClient: Task Id : attempt_201309231115_0025_m_000000_2, Status : FAILED
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 755 actions: org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column family m does not exist in region logdata,,1379920697845.30fce8bcc99bf9ed321720496a3ec498. in table 'logdata', {NAME => 'm', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3773)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
: 755 times, servers with issues: master:60020,
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1674)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1450)
    at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:916)
    at org.apache.hadoop.hbase.client.HTable.close(HTable.java:953)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.close(TableOutputFormat.java:109)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)

13/09/23 12:57:53 INFO mapred.JobClient: Job complete: job_201309231115_0025
13/09/23 12:57:53 INFO mapred.JobClient: Counters: 7
13/09/23 12:57:53 INFO mapred.JobClient:   Job Counters
13/09/23 12:57:53 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=34605
13/09/23 12:57:53 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/09/23 12:57:53 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/09/23 12:57:53 INFO mapred.JobClient:     Rack-local map tasks=4
13/09/23 12:57:53 INFO mapred.JobClient:     Launched map tasks=4
13/09/23 12:57:53 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/09/23 12:57:53 INFO mapred.JobClient:     Failed map tasks=1

Solution

Actually, I am using hive-0.9.0, which has a bug:

issues.apache/jira/browse/HIVE-3243.

When the table is created, the SerDe of HBaseStorageHandler does not ignore the white space between a comma and the following column family in hbase.columns.mapping, so the space apparently ends up in the column family name (' m' rather than 'm'). Remove the white space from the mapping and it will work fine.
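The effect of that white space can be illustrated with a small Python sketch. This is not Hive's actual parsing code: `parse_mapping` and `normalize_mapping` are hypothetical helpers written only for this illustration, and the assumption is that the affected SerDe splits the mapping on commas without trimming each entry. The mapping string is the one from the question.

```python
def parse_mapping(mapping: str, strip: bool) -> list:
    """Split an hbase.columns.mapping string into (family, qualifier) pairs.

    With strip=False the entries are used exactly as split on ',', which is
    the behaviour assumed here for the affected Hive version.
    """
    pairs = []
    for entry in mapping.split(","):
        if strip:
            entry = entry.strip()
        family, _, qualifier = entry.partition(":")
        pairs.append((family, qualifier))
    return pairs


def normalize_mapping(mapping: str) -> str:
    """Return the mapping with white space around each entry removed."""
    return ",".join(entry.strip() for entry in mapping.split(","))


# Mapping string as written in the question's CREATE TABLE statement.
mapping = "m:timespent, m:pageviews, m:visit, m:logdate"

# Distinct column family names seen with and without trimming.
families_unstripped = sorted({f for f, _ in parse_mapping(mapping, strip=False)})
families_stripped = sorted({f for f, _ in parse_mapping(mapping, strip=True)})

print(families_unstripped)         # family names when entries are not trimmed
print(families_stripped)           # family names when entries are trimmed
print(normalize_mapping(mapping))  # mapping string with the white space removed
```

Under that assumption, the untrimmed mapping refers to two different families, 'm' and ' m', which would explain a NoSuchColumnFamilyException when the target table defines only one of them. The normalized string is what the workaround in the answer amounts to: the same mapping with no space after each comma.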

Published: 2023-10-07 13:04:48