在 DB 系统中,有一些分区表使用DateKey(int 数据类型)作为Partition Column,每个月作为一个Partition,由于在Fore-End呈现的报表大多数是基于Month的查询,按照Month分区的设计能够提高查询性能,但是,前任DBA没有创建Schedule来维护Partition Function,导致不能增加新的Partition,末端Partition的数据量增加。以何种方式来快速增加分区,才能减少对系统的负面影响?
CREATE PARTITION FUNCTION [funcPartition_DateKey](int) AS RANGE RIGHT FOR VALUES (20100701, 20100801, <.....> , 20150501, 20150601) GO CREATE PARTITION SCHEME [schePartition_DataKey] AS PARTITION [funcPartition_DateKey] TO ([PRIMARY], <....>, [PRIMARY]) GO
Best Practices
Always keep empty partitions at both ends of the partition range to guarantee that the partition split (before loading new data) and partition merge (after unloading old data) do not incur any data movement. Avoid splitting or merging populated partitions. This can be extremely inefficient, as this may cause as much as four times more log generation, and may also cause severe locking.
一,直接修改Partition Function,通过拆分末端Partition来增加分区
由于很多 Big Table 使用相同的Partition Schema进行分区,简单地从末端Partition为起点,逐个增加分区,在短时间内产生海量的IO操作,对系统的影响,很明显。例如
declare @DateKey int set @DateKey=20150701 declare @at varchar(1000) while @DateKey<20200101 begin select @at= ‘DateKey:‘+cast(@DateKey as varchar(8)) +‘ at:‘+convert(varchar(100),getdate(),121); print @at; alter partition scheme [schePartition_DataKey] Next Used [primary]; alter partition function [funcPartition_DateKey]() split range(@DateKey); set @DateKey=cast(convert(nvarchar(8),dateadd(month,1,cast(cast(@DateKey as nvarchar(8)) as date)),112) as int); end
二,更改分区表的Partition Schema
创建新的Partition function 和 新的 Partition Schema,逐个Table修改其Partition Schema,这个Workaround,虽然实现过程比较繁琐,但是对系统运行的副作用最小。
Script1,创建新的Partition design
--create Partition function declare @DateKeyList varchar(max) declare @DateKey int --set initial DateKey set @DateKey=20140701; while @DateKey<20200101 begin set @DateKeyList=coalesce(@DateKeyList+‘,‘+cast(@DateKey as varchar(8)),cast(@DateKey as varchar(8))) --Increase iterator set @DateKey=cast(convert(nvarchar(8),dateadd(month,1,cast(cast(@DateKey as nvarchar(8)) as date)),112) as int); end --print DateKey List --select @DateKeyList declare @sql nvarchar(max) set @sql=N‘ CREATE PARTITION FUNCTION [funcPartition_new_DateKey](int) AS RANGE RIGHT FOR VALUES (‘+@DateKeyList+N‘);‘ EXEC sys.sp_executesql @sql GO --create partition schema CREATE PARTITION SCHEME [schePartition_new_DataKey] AS PARTITION [funcPartition_new_DateKey] all TO ([PRIMARY]); GO
Partition scheme ‘schePartition_new_DataKey‘ has been created successfully. ‘PRIMARY‘ is marked as the next used filegroup in partition scheme ‘schePartition_new_DataKey‘.
Script2,逐个更新Table的Patition Schema
由于Patition Table中,可能存在不止一个Index,在rebuild table时,使用 drop_existing=on 能够减少分区表上nonclustered indexes的重建时间。
--drop columnstore index drop index [idx_ColumnStore_SchemaName_TableName] on SchemaName.TableName; --rebuild table create unique clustered index [PK__SchemaName_TableName_KeyColumn] on SchemaName.TableName([KeyColumn],[CreatedDateKey]) with(data_compression=page,drop_existing=on) on [schePartition_new_DataKey]([CreatedDateKey]); --rebuild columnstore index CREATE NONCLUSTERED COLUMNSTORE INDEX [idx_ColumnStore_SchemaName_TableName] ON [SchemaName].[TableName] ( column list.... )
Appendix
1,创建Patition Schema时,使用 ALL 关键字指定所有的Partition 都创建在同一个FileGroup上,在Patition Schema创建成功之后,默认会将该FileGroup标记为Next Used。在Patition schema中,只能Mark一个FileGroup作为Next Used。
ALL Specifies that all partitions map to the filegroup provided in file_group_name, or to the primary filegroup if [PRIMARY] is specified. If ALL is specified, only one file_group_name can be specified.
If [PRIMARY] is specified, the partition is stored on the primary filegroup. If ALL is specified, only one file_group_name can be specified. Partitions are assigned to filegroups, starting with partition 1, in the order in which the filegroups are listed in [,...n]. The same file_group_name can be specified more than one time in [,...n]. If n is not sufficient to hold the number of partitions specified in partition_function_name, CREATE PARTITION SCHEME fails with an error.
If partition_function_name generates less partitions than filegroups, the first unassigned filegroup is marked NEXT USED, and an information message displays naming the NEXT USED filegroup. If ALL is specified, the sole file_group_name maintains its NEXT USED property for this partition_function_name. The NEXT USED filegroup will receive an additional partition if one is created in an ALTER PARTITION FUNCTION statement. To create additional unassigned filegroups to hold new partitions, use ALTER PARTITION SCHEME.
In a partition scheme, only one filegroup can be designated NEXT USED.
2,在创建Patition Schema时 指定 ALL 关键字只会标记一个FileGroup作为Next Used,在完成一次 Split Patition之后,必须显式标记一个FileGroup作为Next Used,否则,Split 操作失败。
A filegroup must exist online and be marked by the partition scheme that uses the partition function as NEXT USED to hold the new partition. Filegroups are allocated to partitions in a CREATE PARTITION SCHEME statement. If a CREATE PARTITION SCHEME statement allocates more filegroups than necessary (fewer partitions are created in the CREATE PARTITION FUNCTION statement than filegroups to hold them), then there are unassigned filegroups, and one of them is marked NEXT USED by the partition scheme. This filegroup will hold the new partition. If there are no filegroups marked NEXT USED by the partition scheme, you must use ALTER PARTITION SCHEME to either add a filegroup, or designate an existing one, to hold the new partition. A filegroup that already holds partitions can be designated to hold additional partitions. Because a partition function can participate in more than one partition scheme, all the partition schemes that use the partition function to which you are adding partitions must have a NEXT USED filegroup. Otherwise, ALTER PARTITION FUNCTION fails with an error that displays the partition scheme or schemes that lack a NEXT USED filegroup.
If you create all the partitions in the same filegroup, that filegroup is initially assigned to be the NEXT USED filegroup automatically. However, after a split operation is performed, there is no longer a designated NEXT USED filegroup. You must explicitly assign the filegroup to be the NEXT USED filegroup by using ALTER PARITION SCHEME or a subsequent split operation will fail.
显式标记一个FileGroup 作为 Next Used
ALTER PARTITION SCHEME partition_scheme_name
NEXT USED [ filegroup_name ] [ ; ]
3,拆分分区
ALTER PARTITION FUNCTION partition_function_name()
{
SPLIT RANGE ( boundary_value )
| MERGE RANGE ( boundary_value )
} [ ; ]
ALTER PARTITION FUNCTION repartitions any tables and indexes that use the function in a single atomic operation. However, this operation occurs offline, and depending on the extent of repartitioning, may be resource-intensive.
ALTER PARTITION FUNCTION can only be used for splitting one partition into two, or merging two partitions into one. To change the way a table is otherwise partitioned (for example, from 10 partitions to 5 partitions), you can exercise any of the following options. Depending on the configuration of your system, these options can vary in resource consumption:
Create a new partitioned table with the desired partition function, and then insert the data from the old table into the new table by using an INSERT INTO...SELECT FROM statement.
Create a partitioned clustered index on a heap.
Perform a sequence of ALTER PARTITION FUNCTION statements.
参考doc:
Rebuilding Existing Partitioned Tables to a New Partition Scheme
ALTER PARTITION FUNCTION (Transact-SQL)
ALTER PARTITION SCHEME (Transact-SQL)
原文:http://www.cnblogs.com/ljhdo/p/5700615.html