But the process types available are limited to those in Table 1, page 50. So where is the Incremental flag? There is none. The OLAP Manager made it look as if there was one by offering Incremental update as one of the processing choices, along with Full Process and Refresh. But OLAP Services actually uses only the DSO methods available in Table 1. OLAP Services creates a new temporary partition that contains the filter you entered and adds the new data to this partition. Then, OLAP Services merges the temporary partition with the original one, which is possible only as long as the structures of the two partitions are the same.
To create a temporary partition, you use the cube object, as in:
Dim dsoTmpPartition as dsoCube.MDStores
Set dsoTmpPartition = dsoCube.MDStores.AddNew('~tmpPartition')
Note that I placed a tilde character before the temporary partition name. DSO handles any partition that begins with a tilde as a temporary partition, and destroys the partition after the connection to it is released. This is an important point because OLAP Services will add any remaining partitions to the parent cube measures whether they were intended to be temporary or not. Using the tilde character ensures that the partition will disappear even after an interruption of the merge.
To be merged, the temporary and original partitions need to have the same dimension and level structure. Instead of programmatically creating the structures one by one, you can use a method designed for that purpose, as in:
dsoPartition.Clone(dsoTmpPartition)
This method ensures that the temporary partition is structurally the same as the original. To limit the amount of data that goes into this partition, you need to apply a filter, as in the manual process. To apply the filter, you need to find the highest ID value in the cube. You need to use ADO multidimensional (ADO MD) to extract the information from the cube, as Listing 1 shows.
After recovering the value of the highest ID in the cube, you can create the filter for the partition. DSO uses the filter as a WHERE clause for the fact table, so you need to structure the filter as such, as Listing 2 shows.
After you process the partition, you need to remove the existing filter before merging the temporary and original partitions. Otherwise, OLAP Services will add the temporary filter to the original partition's filter, which usually isn't the result you want. Also, keep in mind that attempting to merge an empty partition will cause a run-time error, so you need to test the temporary partition to see whether any rows were added before you attempt the merge, as in Listing 3. And with that step, you've automatically processed the cube.
Using DTS to Incrementally Update Cubes
Recently Microsoft introduced a new DTS add-in task called the OLAP Services Processing Task, which lets you apply incremental updates to cube partitions. Because it lets you control task flow in a GUI environment, the OLAP Services Processing Task provides a convenient way to schedule these updates and integrate them with scheduled fact table tasks. You can download the add-in for free from http://www.microsoft.com/sql/bizsol/DTSKit.htm.
To create a package without adding any custom DTS code, I created a SQL Server table that serves only to store the last highest ID in the cube. This method lets you create a generic filter on the partition. A SQL query in the data pump will gather the information relating to the highest ID present in the cube. This SQL query gets the maximum value of the ID field from the fact table immediately before OLAP Services processes the cube. This information will later help filter the fact table without your having to modify the WHERE clause in the partition before each update. In this case, I called the table SALESID and created two fields, as Listing 4 shows.
As the name suggests, the ID field holds the highest ID found in the cube and the BATCHDATE field helps determine which is the last record entered in the table. You can also use the BATCHDATE field as an internal audit to see when OLAP Services attempted to gather that value.
You need to create one connection for the cube and another for the database where the SALESID table was created. To create the OLAP connection, open the OLE DB connection, select the Microsoft OLE DB Provider for OLAP Services data source, and enter the name of the OLAP server in the Server field, as Screen 4 shows. The connection icon will automatically change to a cube-like object, as Screen 5 shows.
Then, you need to add the connection to the SQL Server database and add a transformation task that adds a record to the SALESID table containing the highest ID value in the cube at that time. Again, at processing time, the filter in the partition will use this value to determine which records are needed to add measures to the cube. You need to use an MDX query as the source of the transformation to get the highest ID value, as Screen 6 shows. Be sure to link the time-related dimension information to the ID field, as Screen 7 shows. In this case, the TIMEDATA column contains the cell that needs to be sent to the ID field.
If the cube update process enters a measure into the cube that depends on dimensions that don't yet exist, you will get an error that prevents the cube from processing. To avoid this error, create a task that updates the dimension data before updating the cube partitions. Successful partition processing depends on the success of this step. To set up a dimension processing task, choose the OLAP Services Processing Task icon represented by a solid yellow cube, as Screen 8, page 52, shows.
After OLAP Services successfully updates the dimension, the partition processing can begin. In this case, you need to add to the partition portion of the task a filter that brings only records from the fact table that have a higher ID value than the highest one in the cube. After you choose the partition and then choose incremental update as the processing option, you'll see fields that let you change the fact table and the filter. In this case, change only the filter, as in Screen 9, page 52. Unlike the filter that's applied directly to the partition through the OLAP Manager, this filter exists only while the task is running and no longer exists in the partition after the task is run. Screen 10, page 52, shows the executing DTS package.
You can run the task on a scheduled basis, but remember that if no new records need processing, the partition processing portion of the task will fail with an error. In the DSO code from the previous section, I avoided processing the partition if no new records existed in the temporary partition. Because DTS merges a temporary partition with the original one to accomplish incremental processing, you might get an error when you use DTS because it doesn't check for cases in which the temporary partition is empty. If you want to avoid this error, simply include some checks in the DTS package that prevent that portion of the task from beginning if no new records are in the fact table.
Automating Data Integration
The ability of cubes to provide meaning to data becomes even more valuable when the data is up to date. Understanding how OLAP Services treats incoming data and how you can take advantage of that knowledge to automate your data integration process is crucial to managing servers containing large, multiple cube structures that are refreshed on a regular basis.