set hive.optimize.sampling.orderby=true;
set hive.optimize.sampling.orderby.number=10000;
set hive.optimize.sampling.orderby.percent=0.1f;
A quick note on the parameters for parallel sorting (ORDER BY) in Hive:
hive.optimize.sampling.orderby
    Default Value: false
    Added In: Hive 0.12.0 with HIVE-1402
Uses sampling on order-by clause for parallel execution.
hive.optimize.sampling.orderby.number
    Default Value: 1000
    Added In: Hive 0.12.0 with HIVE-1402
With hive.optimize.sampling.orderby=true, total number of samples to be obtained to calculate partition keys.
hive.optimize.sampling.orderby.percent
    Default Value: 0.1
    Added In: Hive 0.12.0 with HIVE-1402
With hive.optimize.sampling.orderby=true, probability with which a row will be chosen.
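To make the roles of the two sampling parameters concrete, here is a rough Python sketch of the general idea: sample the sort keys, derive range boundaries ("partition keys") from the samples, and route each row to a reducer by range. This is only a simplified illustration, not Hive's actual implementation. The function names (`sample_rows`, `partition_keys`, `reducer_for`) are hypothetical, and the assumption that sampling simply stops once `number` samples are collected is mine.

```python
import random
from bisect import bisect_right


def sample_rows(keys, percent, number, seed=0):
    """Keep each key with probability `percent`, stopping once `number` samples exist.

    `percent` and `number` play the roles of hive.optimize.sampling.orderby.percent
    and hive.optimize.sampling.orderby.number (assumed semantics, for illustration).
    """
    rng = random.Random(seed)
    samples = []
    for key in keys:
        if len(samples) >= number:
            break
        if rng.random() < percent:
            samples.append(key)
    return samples


def partition_keys(samples, num_reducers):
    """Pick num_reducers - 1 split points at evenly spaced positions in the sorted samples."""
    ordered = sorted(samples)
    if not ordered or num_reducers <= 1:
        return []
    step = len(ordered) / num_reducers
    return [ordered[int(step * i)] for i in range(1, num_reducers)]


def reducer_for(key, split_points):
    """Return the index of the key range (reducer) that `key` falls into."""
    return bisect_right(split_points, key)


if __name__ == "__main__":
    keys = list(range(1000))
    random.Random(1).shuffle(keys)

    splits = partition_keys(sample_rows(keys, percent=0.1, number=50), num_reducers=4)
    buckets = [[] for _ in range(4)]
    for k in keys:
        buckets[reducer_for(k, splits)].append(k)

    # Each reducer sorts its own range; concatenating the ranges in order
    # yields a globally sorted result.
    merged = [k for bucket in buckets for k in sorted(bucket)]
    print(merged == sorted(keys))
```

Because every key in one range is less than or equal to every key in the next, the reducers can sort in parallel and their outputs only need to be concatenated. In this sketch, more samples give more evenly sized ranges at the cost of extra sampling work.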
Original: http://superlxw1234.iteye.com/blog/2155436