4. Using ImportTsv to import Hive data into HBase
The overall flow: stage the Hive data as a tab-delimited text table, generate HFiles from it with ImportTsv, then bulk-load those HFiles into HBase with LoadIncrementalHFiles.
First, create the staging table in Hive. Its DDL is as follows:
CREATE TABLE `student`(
  `s_id` string,
  `s_name` string,
  `s_birth` string,
  `s_sex` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'='\t',
  'serialization.null.format'='')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://hbc-cluster/user/hive/warehouse/tmp.db/student';
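Two details in this DDL matter for ImportTsv: field.delim is the tab character that ImportTsv expects by default, and the empty serialization.null.format makes Hive write NULLs as empty fields rather than the literal \N. The post does not show how the staging table gets its data; a minimal HiveQL sketch, assuming a hypothetical source table default.student_src with the same four columns (substitute your real table):

-- default.student_src is a placeholder; run this in the database where student was created.
INSERT OVERWRITE TABLE student
SELECT s_id, s_name, s_birth, s_sex
FROM default.student_src;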
Next, run ImportTsv over the staging table's HDFS files to generate HFiles. The syntax is as follows:
$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c -Dimporttsv.bulk.output=hdfs://storefile-outputdir <tablename> <hdfs-data-inputdir>
importtsv.columns must include HBASE_ROW_KEY to mark which input field becomes the row key; here it is listed first because s_id, the first field of each line, serves as the row key.
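Before running the concrete command below, make sure the target table and its namespace exist. Depending on the HBase version, ImportTsv can create a missing table itself, but pre-creating it lets you choose the column family (cf in the mapping below) and region splits, which control how the bulk-output HFiles are partitioned. A minimal hbase shell sketch:

create_namespace 'stream_data_warehouse'
create 'stream_data_warehouse:student', 'cf'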
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dmapreduce.job.queuename=queue -Dimporttsv.bulk.output=hdfs://hbc-cluster/tmp/hbase -Dimporttsv.columns="HBASE_ROW_KEY,cf:s_name,cf:s_birth,cf:s_sex" stream_data_warehouse:student hdfs://hbc-cluster/user/hive/warehouse/tmp.db/student
Finally, bulk-load the generated HFiles into the table with LoadIncrementalHFiles. The syntax is as follows (in HBase 2.x the class lives under org.apache.hadoop.hbase.tool; the org.apache.hadoop.hbase.mapreduce name used in the concrete command below is the HBase 1.x location of the same tool):
$ bin/hbase org.apache.hadoop.hbase.tool.LoadIncrementalHFiles <hdfs://storefileoutput> <tablename>
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://hbc-cluster/tmp/hbase/ stream_data_warehouse:student
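As a quick sanity check (not part of the original walkthrough), scan a few rows from the hbase shell once the load finishes:

scan 'stream_data_warehouse:student', {LIMIT => 5}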
Source: https://www.cnblogs.com/xiexiandong/p/13139846.html