Friday, March 23, 2012

Partitioning Strategery

Hi all,

I've got several large fact tables that I want to partition by time. What's the best way to strategy to manage this? I have a C# app that uses AMO to manage the re-processing currently, but right now I'm fairly naive and just re-process every measure group.

What I'd like is a way to automatically partition by month / day [depending on measure group], and then enhance my re-process app to only process the latest partition.

My current idea is to establish a naming structure for the existing MGs/partitions, [i.e. "Foo 12-2006"], parse the date, adjust the named query, create a new partition if needed, and re-process if appropriate. Though it's quite doable, it seems a bit complex for something that should be a more of a mainstream scenario.

How have you guys implemented this functionality? I saw something about specifying the "slice" dimension on partitions, but I'm not sure if that can be used to help this.

Cheers,

Arjun

The follwoing is my experience for different cases when refreshing data in a partitioned cube.

C1. New data for a new time period

Incrementally processing the time dimension, creating the new partition for the new time period and fully process it.

C2. New data for existing time period

Create meta data in RDBMS such as batch etc, merge the new data into existing partition

C3. Updated or deleted data for existing period

Fully process the partition

In SSAS2005, slice is automatically set for msolap partition, and it is only used for querying, not for processing, so you maybe need not to consider it at all.

|||

CJ - thanks for your insight.

Typically my cases fall under (C1) - new data for a new time period. What's the best way to manage new partition creation? Does SSAS provide any functionality to make this easier, or do I need to build my own infrastructure?

|||

If a partition strategery works for you, it is a good one.

I always build the partitions logic by myself.

No comments:

Post a Comment