• subscribe
February 17, 2004 12:00 AM

13 Bonus BI Tips

SQL Server Pro
InstantDoc ID #41617
Downloads
41617.zip

Using Dynamic Properties to Stabilize DTS
To cut down on coding and thereby minimize errors, Microsoft added the Dynamic Properties task to Data Transformation Services (DTS) in SQL Server 2000. With the assistance of this task, you don't have to create bulky ActiveX Script tasks to dynamically set a DTS property, such as a username that you use to establish a connection. This task lets you change the value of any nonidentifying property that’s accessible through the DTS object model (e.g., non-name/ID properties of a step, connection, task, package, or global variable). What once took 3 weeks to stabilize, you can now write and stabilize in less than a day. Using the Dynamic Properties task gives you faster performance than writing the same process with an ActiveX Script task because DTS doesn't resolve the ActiveX Script task until runtime.

Leave Snowflakes Alone
Analysis Services lets you source dimensions from either a normalized snowflake schema or a flattened star schema. Microsoft recommends flattening snowflake dimensions into stars for performance reasons, a practice that most Analysis Services developers follow. However, unless the relational data mart is consumed by something other than Analysis Services, this practice has few benefits and considerable drawbacks. For these reasons, resist the urge to flatten:

  • A snowflake schema provides the benefits of a normalized design. With a star schema, managing attributes for the repeating non-leaf members is awkward at best.
  • A snowflake gives you unique keys at each level. This lets you import data into a cube at any level of granularity, a critical ability in financial-planning applications, for example.
  • Because dimension tables aren't queried at runtime (except for in the notoriously slow relational OLAP—ROLAP—mode), snowflake dimensions have no impact on query performance. The only downside to a snowflake dimension is that it (the dimension, not the cube) is slower to process than a star because of the joins that are necessary. However, the time it takes to process dimensions is a minor factor compared to the time necessary for cube processing. Unless the dimension is huge and the time window in which processing must occur is tight, snowflakes are the way to go.

Create Grouping Levels Manually
No dimension member can have more than 64,000 children, including the All member. This limit isn't as onerous as it sounds; usability is apt to nail you before the hard limit does. A member with even 10,000 children usually presents a usability problem—that's a lot of rows to dump on a user drilling down into the dimension.

Whether you're fighting the limit or simply working to design your dimension so that it provides bite-size drilldowns, the solution is to build deep, meaningful hierarchies. But when there's no raw material from which to build a meaningful hierarchy, you must resort to a grouping level, aka a Rolodex level, such as the first letter of the last name for a customer dimension. Analysis Services has a feature (create member groups in the Dimension Wizard) that can create a grouping level for you automatically. Don't use it! You won't have control over the grouping boundaries. Instead, construct the level manually. This entails adding a new level to the dimension, then modifying the Member Name Column and Member Key Column properties. For instance, you might define the member key column and member name column for the grouping level as follows:

LEFT("CustomerDimTable"."
CustomerName", 1)

This expression bases the level on the first letter of the customer name, providing Rolodex-style navigation. Bear in mind, however, that this is a SQL pass-through; the expression is passed to the RDBMS, so the RDBMS dictates the syntax. That is, T-SQL has a LEFT() function, but another RDBMS might not.

Understand the Role of MDX
Did you ever try to swim without getting wet? For all but the simplest of databases, that's what it's like when you try to design an OLAP solution without using MDX. Because shrink-wrap client software often negates the need to write MDX SELECT statements, many developers think they can successfully avoid MDX. This is folly. Sure, not every project requires MDX SELECT statements; commercial software is adequate for many situations. But MDX calculations should play an important role in most Analysis Services solutions, even those that aren't calculation-intensive on the surface.

Perhaps the most common example is a virtual cube that's based on two or more source cubes. Calculated members are usually required to "glue" the virtual cube together into a seamless whole. Although the MDX isn't necessarily complex, developers unaware of the role of MDX wind up making costly mistakes. Either they avoid virtual cubes entirely, or they shift logic that's easily implemented in MDX to the extraction-transformation-load (ETL) process, where it's more complicated and rigidly set.

Using NON EMPTY to Include Empty Cells

Many multidimensional cubes have empty cells, which occur because a user didn't load data into the cube for these members. For example, if you inspect the Sales cube in the FoodMart sample, its creators didn't load any data for 1998. You must use the NON EMPTY modifier to write an MDX query that includes 1998, as Listing A shows.



ARTICLE TOOLS

Comments
    There are no comments to display. Be the first one!
You must log on before posting a comment.

Are you a new visitor? Register Here