
Scheduled Import from HDFS into a Hive Partitioned Table


Procedure:

The job is written as a shell script and scheduled with crontab (a sample crontab entry is sketched after the script).

1. First load each day's data into a staging table, wfbmal.wfbwall_log_url_tmp; this can be a managed (internal) table.

2. Then insert the data from the staging table into the target table wfbmal.wfbwall_log_url.

 

#!/bin/sh
# Load yesterday's logs from HDFS into the Hive partitioned table
yesterday=$(date -d "yesterday" +%Y%m%d)      # yesterday's date, e.g. 20210526
yestermonth=$(date -d "yesterday" +%Y%m)      # month of that same date, so month boundaries are handled correctly
/opt/soft/hive/bin/hive -e "
use wfbmal;
load data inpath '/flume/data/logs/${yestermonth}/${yesterday}/' overwrite into table wfbmal.wfbwall_log_url_tmp;
insert into table wfbmal.wfbwall_log_url partition (dt='${yesterday}') select * from wfbmal.wfbwall_log_url_tmp;
"

 

The CREATE TABLE statements for the two tables are attached below.

Staging table

create   table  wfbmal.wfbwall_log_url_tmp (
log_time       string,
log_key        string,
url_detail     string,
url_briefly    string,
url_action     string,
time_situation string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '#';

Target table

create  external table  wfbmal.wfbwall_log_url (
log_time       string,
log_key        string,
url_detail     string,
url_briefly    string,
url_action     string,
time_situation string
)
PARTITIONED BY (`dt` string)       -- partition column
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','           -- field delimiter for the data stored in this table
NULL DEFINED AS ''
STORED AS TEXTFILE
LOCATION '/hive/warehouse';
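
Once the job has run, the result can be checked from the Hive CLI, for example (a sketch; 20210526 is just an illustrative value in the yyyymmdd format the script produces):

show partitions wfbmal.wfbwall_log_url;                          -- one partition per loaded day
select count(*) from wfbmal.wfbwall_log_url where dt='20210526'; -- row count for one day's partition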

 


Source: https://www.cnblogs.com/cstark/p/14817281.html
