Data warehousing is a business analyst's dreamall the information about the organization's activities gathered in one place, open to a single set of analytical tools. But how do you make the dream a reality? First, you have to plan your data warehouse system. You must understand what questions users will ask it (e.g., how many registrations did the company receive in each quarter, or what industries are purchasing custom software development in the Northeast) because the purpose of a data warehouse system is to provide decision-makers the accurate, timely information they need to make the right choices.
To illustrate the process, we'll use a data warehouse we designed for a custom software development, consulting, staffing, and training company. The company's market is rapidly changing, and its leaders need to know what adjustments in their business model and sales practices will help the company continue to grow. To assist the company, we worked with the senior management staff to design a solution. First, we determined the business objectives for the system. Then we collected and analyzed information about the enterprise. We identified the core business processes that the company needed to track, and constructed a conceptual model of the data. Then we located the data sources and planned data transformations. Finally, we set the tracking duration.
Step 1: Determine Business Objectives
The company is in a phase of rapid growth and will need the proper mix of administrative, sales, production, and support personnel. Key decision-makers want to know whether increasing overhead staffing is returning value to the organization. As the company enhances the sales force and employs different sales modes, the leaders need to know whether these modes are effective. External market forces are changing the balance between a national and regional focus, and the leaders need to understand this change's effects on the business.
To answer the decision-makers' questions, we needed to understand what defines success for this business. The owner, the president, and four key managers oversee the company. These managers oversee profit centers and are responsible for making their areas successful. They also share resources, contacts, sales opportunities, and personnel. The managers examine different factors to measure the health and growth of their segments. Gross profit interests everyone in the group, but to make decisions about what generates that profit, the system must correlate more details. For instance, a small contract requires almost the same amount of administrative overhead as a large contract. Thus, many smaller contracts generate revenue at less profit than a few large contracts. Tracking contract size becomes important for identifying the factors that lead to larger contracts.
As we worked with the management team, we learned the quantitative measurements of business activity that decision-makers use to guide the organization. These measurements are the key performance indicators, a numeric measure of the company's activities, such as units sold, gross profit, net profit, hours spent, students taught, and repeat student registrations. We collected the key performance indicators into a table called a fact table.
Step 2: Collect and Analyze Information
The only way to gather this performance information is to ask questions. The leaders have sources of information they use to make decisions. Start with these data sources. Many are simple. You can get reports from the accounting package, the customer relationship management (CRM) application, the time reporting system, etc. You'll need copies of all these reports and you'll need to know where they come from.
Often, analysts, supervisors, administrative assistants, and others create analytical and summary reports. These reports can be simple correlations of existing reports, or they can include information that people overlook with the existing software or information stored in spreadsheets and memos. Such overlooked information can include logs of telephone calls someone keeps by hand, a small desktop database that tracks shipping dates, or a daily report a supervisor emails to a manager. A big challenge for data warehouse designers is finding ways to collect this information. People often write off this type of serendipitous information as unimportant or inaccurate. But remember that nothing develops without a reason. Before you disregard any source of information, you need to understand why it exists.
Another part of this collection and analysis phase is understanding how people gather and process the information. A data warehouse can automate many reporting tasks, but you can't automate what you haven't identified and don't understand. The process requires extensive interaction with the individuals involved. Listen carefully and repeat back what you think you heard. You need to clearly understand the process and its reason for existence. Then you're ready to begin designing the warehouse.
Step 3: Identify Core Business Processes
By this point, you must have a clear idea of what business processes you need to correlate. You've identified the key performance indicators, such as unit sales, units produced, and gross revenue. Now you need to identify the entities that interrelate to create the key performance indicators. For instance, at our example company, creating a training sale involves many people and business factors. The customer might not have a relationship with the company. The client might have to travel to attend classes or might need a trainer for an on-site class. New product releases such as Windows 2000 (Win2K) might be released often, prompting the need for training. The company might run a promotion or might hire a new salesperson.
The data warehouse is a collection of interrelated data structures. Each structure stores key performance indicators for a specific business process and correlates those indicators to the factors that generated them. To design a structure to track a business process, you need to identify the entities that work together to create the key performance indicator. Each key performance indicator is related to the entities that generated it. This relationship forms a dimensional model. If a salesperson sells 60 units, the dimensional structure relates that fact to the salesperson, the customer, the product, the sale date, etc.
Then you need to gather the key performance indicators into fact tables. You gather the entities that generate the facts into dimension tables. To include a set of facts, you must relate them to the dimensions (customers, salespeople, products, promotions, time, etc.) that created them. For the fact table to work, the attributes in a row in the fact table must be different expressions of the same event or condition. You can express training sales by number of seats, gross revenue, and hours of instruction because these are different expressions of the same sale. An instructor taught one class in a certain room on a certain date. If you need to break the fact down into individual students and individual salespeople, however, you'd need to create another table because the detail level of the fact table in this example doesn't support individual students or salespeople. A data warehouse consists of groups of fact tables, with each fact table concentrating on a specific subject. Fact tables can share dimension tables (e.g., the same customer can buy products, generate shipping costs, and return times). This sharing lets you relate the facts of one fact table to another fact table. After the data structures are processed as OLAP cubes, you can combine facts with related dimensions into virtual cubes.
Prev. page  
[1]
2
next page