When people think of the structure of an OLAP cube, they usually think of a group of hierarchical dimensions and a set of measures that use summation to determine the measures' values. Often, cubes include measures that are aggregated by using a summation. A variety of other aggregation types exist, but Analysis Services doesn't natively support all of them. A classic example of nontraditional aggregation is inventory analysis. When you analyze inventory, you don't want to determine the quarterly inventory by summing the inventory numbers from the 3 months; you want to use only the inventory quantity from the last month in the quarter. You can easily implement this type of aggregation in Analysis Services, but it's not one of the native aggregation types you can choose from. Let's look at the native types, then learn how to implement a couple of non-native aggregation types.
Native Aggregations
First, let's establish some terminology. A cube cell is a location in the cube where the dimensions intersect. Just as you can use numeric values on X, Y, and Z axes to identify a point in three-dimensional space, you can use a member from each dimension to identify a cell in a multidimensional cube. Because each dimension has a default member, you don't have to explicitly list a member from each dimension for a cube cell to be uniquely identified. For example, the tuple (USA, Drinks) is enough to identify a unique cell in the FoodMart 2000 Sales cube, even though the Sales cube has 12 dimensions, because the other 10 dimensions are positioned on their default members (i.e., they're assumed, so you don't need to explicitly name them).
Every cube cell has a value for each of the cube's measures. Some cube cells' measure values are loaded (i.e., come directly from the source data), and some cube cells' measure values are determined by an aggregation. An aggregation is a value that Analysis Services determines by performing some math function on other cube cells. To put it most simply, any leaf-level cell's values are loaded values, meaning that the values are calculated based on the source data, not based on the values in other cells. The nonleaf-level cell values are determined by aggregations. Leaf-level cells are cells that are identified entirely by the bottom-most dimension membersdimension members that have no children.
Table 1 shows Analysis Services' native aggregation types (math functions) and their descriptions. Distinct count is a special type and warrants an explanation. Analysis Services allows only one distinct-count measure in a cube, and the values of a distinct-count measure are never aggregated (determined from other cell values). Analysis Services derives distinct-count measure values directly from the fact table when the cube is processed. A distinct-count value is determined by the number of distinct types of fact-table records that a particular cube cell identifies. For example, if the cube cell is (1997, Beverages), then the distinct count is the number of different types of fact-table records that occurred during 1997 in the Beverages product category. When you create the distinct-count measure, you specify a column in the fact table that determines the type of fact-table record. Distinct-count measures are different from typical count measures because a distinct count isn't the count of fact-table records; it's the count of distinct values of a particular column in the fact table. DBAs typically use a distinct count to determine the number of unique transactions that exist when more than one fact-table record comes from the same transaction.
One characteristic of the native aggregation types is that Analysis Services treats them the same across all cube dimensions. The non-native inventory aggregation doesn't have this characteristic. When aggregating inventory quantities, you use summation for all dimensions except the Time dimension. You aggregate the Time dimension by using the last child (i.e., the most recent value). Figure 1 shows how the Product dimension aggregates inventory totals as a sum, whereas the Time dimension uses the last child.
Going Non-Native
Because Analysis Services doesn't offer native support for a last-child aggregation, you have to create your own aggregation support. You have several choices for how to do this. You can use a calculated measure, a calculated cell, or a custom rollup formula in the Time dimension. I prefer to use the method that limits performance effects to the smallest percentage of the cube. For example, the FoodMart Sales cube contains other measures besides the inventory quantity measure, so I suggest picking a method that doesn't add extra processing to the other measures. Analysis Services evaluates a custom rollup formula for every measure, so a custom rollup isn't a good choice for this problem. You can use either the calculated cell or calculated measure technique to isolate the aggregation processing so that it affects only the inventory quantity measure. You need to consider a couple of facts when you're choosing between a calculated cell and a calculated measure. Note that calculated cells are available only with the Enterprise Edition of SQL Server. But calculated measures can be problematic in cubes that support drillthrough because the drillthrough operation doesn't work when you specify a calculated measure as part of the tuple (i.e., the list of members) that identifies the cube's cell.
For inventory analysis, I suggest using a calculated measure simply because most OLAP front-end tools offer the ability to create calculated measures. For example, when you use the following calculated measure formula in the FoodMart Sales cube, it returns [Unit Sales] as if it were an inventory quantity:
IIF( Time.CURRENTMEMBER.LEVEL
.NAME <> "Month",
Time.CURRENTMEMBER.LASTCHILD,
(Time.CURRENTMEMBER, [Unit Sales]) )
This recursive formula determines whether the current Time member's level name is the lowest level in the Time dimension, and if not, it returns the last child of the current member. The LASTCHILD portion of the formula causes the formula to reevaluate recursively until the level's name is Month; then, the formula returns the Unit Sales value for the current Month.