Building a data warehouse isn’t a simple task, and it shouldn’t be done by one person working alone. Because a data warehouse combines the best of business practices and information systems technology, it requires the cooperation of both business and IT, continuously coordinating to align all the needs, requirements, tasks, and deliverables of a successful data warehouse implementation. I’d like to share the approach I use when planning and managing any database project, including transactional databases, data warehouses, and hybrid databases. I live in the world of relational databases and data warehouses and the extraction, transformation, and loading (ETL) processes that support them, so I’ll focus on that realm. However, you can extend this approach to the entire stack, including OLAP cubes and information delivery applications such as reports, ad hoc analysis, scorecards, and dashboards.
I’m not presuming to tell a bona fide project manager (PM) how to do his or her job. Rather, I’m writing this for DBAs and developers who don’t have the good fortune to be working with an experienced PM, or for those IT pros who have been summarily ordered to “build a data warehouse” and act as their own PM. My discussion won’t be complete, but I hope that it’ll give you enough information to start the project ball rolling.
A data warehouse project has three tracks: the data track, the technology track, and the application layer track, as illustrated in Figure 1. When you’re putting together any database project plan, I recommend using these three tracks as a template to manage and synchronize your activities. You can also use Figure 1 as a high-level overview when explaining the plan to technical decision makers (TDMs), business decision makers (BDMs), and all other participants in the data warehouse project.
Using a Lifecycle Management Method
I encourage you to take advantage of the resources that your organization may offer, such as techniques and methodologies for designing, developing, and deploying systems and software. If your company hasn’t adopted any formalized methodology for doing this work, then go ahead and use the technique that I’ve developed for my own database projects, namely, the 7D Database Lifecycle Management Method™, familiarly known as the “7D Method™.”
My 7D Database Lifecycle Management Method™ addresses lifecycle management of the database, not the lifecycle of hardware or software (applications) that touch the database. I’ve included both hardware and software tracks in Figure 1, but I won’t be expanding on the management of either. It’s necessary to align and synchronize milestones in the database lifecycle with both hardware and applications in order to successfully implement the database lifecycle methodology.
You never truly finish building a data warehouse. Unlike a traditional database, which often remains relatively static for an extended period after deployment, a data warehouse is constantly in a state of flux, responding to changes in the business conditions for which it was created. Today’s business environment is complex and changes quickly, and managing nearly constant change is one of the greatest challenges of the enterprise. That’s why it’s so important that everyone on the data warehouse team, BDMs and TDMs alike, is on the same page, using the same type of lifecycle approach, so that they’re aligned in their thinking. Only by doing this is there any hope of aligning the implemented data warehouse with the vision and purpose of the enterprise. In Figure 1 I’ve laid out the seven steps of my 7D Method™, and I’ll walk you through each step in this article.
Step 1: Discover
I guarantee that any database project of any appreciable size and scope will fail if you don’t start with a Discover stage. Also known as “requirements analysis and definition,” the Discover step requires a business-centric approach, especially in data warehousing projects, since the output from a data warehouse needs to support the organizational goals. The Discover step is essentially an investigation: you should be constantly asking six basic questions (what, how, where, who, when, and why), recording the answers, and incorporating those answers into the solutions you craft.
In the first three steps (Discover, Design, and Develop) there must be concentrated coordination between the business owners and the technology specialists; the PM should be enabling this process. As an independent professional who’s primarily concerned that the project stays on time, stays within budget, and works as promised, the PM is in charge of establishing critical paths, milestones, and success metrics after getting feedback from all parties. If there’s no PM on the project, these will be your tasks.
In the Discover stage the PM has to collect information for all three tracks shown in Figure 1: the technology track, the data track, and the application layer track. Among other tasks, the PM has to identify stakeholders and users, and must understand each of their roles and their data and visualization needs. The PM also has to be aware of the organization’s performance management strategy: What are the objectives, initiatives, and supporting metrics/KPIs used to track the health of the business and the project? If any of these portions of the strategy are missing, then there’s a high probability that the project will miss its mark with end users, which would result in low adoption rates and no future funding. In other words, the project would fail, no matter how flawlessly the project tasks were executed.
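If you’re acting as your own PM, it can help to keep the Discover findings in a structured form so that gaps are easy to spot. The following is only a minimal sketch in Python, not part of the 7D Method™ itself: the class name `DiscoverRecord`, its fields, and the sample entries are all hypothetical and made up for illustration. It records stakeholders alongside the three portions of the performance management strategy and reports which portions still have no entries.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical field names for the three strategy portions named in the text.
STRATEGY_PARTS = ("objectives", "initiatives", "kpis")


@dataclass
class DiscoverRecord:
    """Illustrative container for Discover-stage findings (assumed structure)."""
    stakeholders: Dict[str, str] = field(default_factory=dict)  # name -> role
    objectives: List[str] = field(default_factory=list)
    initiatives: List[str] = field(default_factory=list)
    kpis: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Return the strategy portions that have no entries yet."""
        return [part for part in STRATEGY_PARTS if not getattr(self, part)]


# Made-up sample data: objectives and KPIs are recorded, initiatives are not.
record = DiscoverRecord(
    stakeholders={"VP Sales": "business decision maker"},
    objectives=["Grow repeat-customer revenue"],
    kpis=["Repeat purchase rate"],
)
print(record.missing_parts())
```

An empty result from `missing_parts()` only means that something was written down for each portion; it says nothing about whether the business owners agree with what was recorded.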