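Impala is a SQL query engine that processes data in memory; its official site claims query performance 100 times faster than Hive. Data can be inserted into an Impala table in the following ways: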
1. insert into
[slave12:21000] > insert into parquet_snappy select * from raw_text_data;
Inserted 1000000000 rows in 181.98s
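The session above assumes the target table already exists. As a rough sketch of how such a table might be prepared, the statements below create a Parquet table with the same columns as the source and then run the insert. The table names come from the example above; the use of CREATE TABLE ... LIKE and the explicit codec setting are assumptions about how the original table was set up, not something the source states.

```sql
-- Assumption: parquet_snappy should have the same columns as raw_text_data.
CREATE TABLE IF NOT EXISTS parquet_snappy LIKE raw_text_data STORED AS PARQUET;

-- Snappy is already Impala's default Parquet codec; set here only to be explicit.
SET COMPRESSION_CODEC=snappy;

-- Append the rows; INSERT OVERWRITE would replace the existing data instead.
INSERT INTO parquet_snappy SELECT * FROM raw_text_data;
```

INSERT ... SELECT is the usual way to convert a text table to Parquet in bulk; inserting rows one at a time with INSERT ... VALUES produces many small files and is better avoided for large loads.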
2. CTAS
[slave12:21000] > create table test_table STORED AS PARQUET as select * from table;
Query: create table test_table STORED AS PARQUET as select * from table
+-------------------------+
| summary                 |
+-------------------------+
| Inserted 80000 row(s)   |
+-------------------------+
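CTAS is not limited to copying a whole table: the SELECT part can project or filter, and the new table takes its column names and types from the query result. The sketch below illustrates this with hypothetical names (source_table, id, name, test_table_subset) that do not appear in the original post.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE test_table_subset STORED AS PARQUET AS
SELECT id, name
FROM source_table
WHERE id IS NOT NULL;

-- Inspect the schema that was derived from the SELECT list.
DESCRIBE test_table_subset;
```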
3. load data
[slave12:21000] > load data inpath '/user/hive/warehouse/test.db/table' into table test_table;
Query: load data inpath '/user/hive/warehouse/test.db/table' into table test_table
+----------------------------------------------------------+
| summary                                                  |
+----------------------------------------------------------+
| Loaded 1 file(s). Total files in destination location: 1 |
+----------------------------------------------------------+
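Note that this method can only load files that are already on HDFS. It cannot load local files: unlike Hive, Impala does not accept a LOCAL keyword for this. Also, because LOAD DATA moves the files rather than copying them, the original table (the one the files were taken from) must be refreshed after the load; otherwise queries against it will report an error.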
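A minimal sketch of the full sequence follows. The HDFS path and test_table come from the session above; the name source_table stands in for the original table whose files are moved and is hypothetical, as is the local file path mentioned in the comment.

```sql
-- A local file must be copied to HDFS first, outside impala-shell, e.g.:
--   hdfs dfs -put /tmp/data.txt /user/hive/warehouse/test.db/table/
-- (/tmp/data.txt is a hypothetical local path.)

-- Move the files from the HDFS path into test_table's data directory.
LOAD DATA INPATH '/user/hive/warehouse/test.db/table' INTO TABLE test_table;

-- The files no longer exist under the original table's directory,
-- so reload its file metadata before querying it again.
-- source_table is a hypothetical name for that original table.
REFRESH source_table;
```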
Original article: http://daizj.iteye.com/blog/2257814