The MDX language is powerful but not easy to use. On the surface, MDX looks like SQL, but it can quickly become more complex because of the multidimensional nature of the underlying cube data. After more than 3 years of using MDX, I've found I'm more productive when I apply design and debugging techniques that help me better understand MDX and create more accurate MDX statements. The techniques I use for developing MDX are similar to those I use for developing software in other languages: for complex problems, I use pseudo coding and debug the code by displaying intermediate results. Let's look at an example of how I use these techniques, and along the way, I'll show you how to use a .NET language to develop MDX user-defined functions (UDFs).
If you have any formal software-development education, you know that to solve a complex problem, you first break the problem into parts and solve each part independently. Then, it's always a good idea to step through each line of your code, using a debugger to verify that the code works as intended. Most software developers know that these practices are good habits, but not enough programmers apply them. These good programming habits can help you effectively deal with MDX's complexity.
For example, say you need to answer a typical business question such as, "Based on unit sales, what are the top three brand names for each product department?" The MDX query that Listing 1 shows answers the question; Figure 1 shows the results. I used the familiar FoodMart 2000 Sales cube that comes with Analysis Services as the basis for my example. I have enough experience with MDX that when I wrote this query, it ran the first time (thus I skipped the good habit of breaking the code into parts). But the query is complicated because it performs ranking (TOPCOUNT) inside an iterative loop (GENERATE), and I wasn't sure I was getting the answer I really wanted. Let's see how I work through the problem in a way that emphasizes modularity (i.e., addressing each part of the problem separately) and accuracy. First, I use a design methodology called pseudo coding. Pseudo coding is a process of writing in plain language the steps for how you plan to implement your solution. For this problem, I want my code to follow the process that the pseudo code below describes.
For each product department,
- find the set of all brand names for this product department
- return the product department name
- return the three brand names that have the most unit sales
When I start to translate this pseudo code into MDX, I get the following:
<<ANSWER>> = GENERATE( [Product
Department].MEMBERS, <<Dept
and Top Brands>> )
Here, the GENERATE() function steps through a set of items and evaluates an MDX expression for each item in the set. This statement shows that to get the answer, I need to determine the product department name and the top brand names within it for each product department. Next, I expand the <<Dept and Top Brands>> item in the previous statement to call out the current product department. The following expression shows that I need another expression to determine the top brands within this department:
<<Dept and Top Brands>> =
{ [Product].CURRENTMEMBER, <<Top
Brands Within Dept>> }
To determine the top brands within the product department, I use the TOPCOUNT() function and specify that I want the top three brands based on unit sales:
<<Top Brands Within Dept>>
= TOPCOUNT( <<Brands Within
Dept>>, 3, [Unit Sales] )