The next release of SQL Server, SQL Server 2005, will contain many features and extensions to make you more productive, especially if you're doing database-driven Web-application development. (For a preview of SQL Server 2005, check out the May 2004 issue of SQL Server Magazine.) However, Microsoft has delayed the final release of SQL Server 2005 until the first half of next yearand many production shops won't migrate to the new database system for at least a year after that. In the meantime, you can do plenty to optimize your SQL Server 2000 databases.
I'm a data modeler and design architect, so my instinct is to look at and optimize the table schema. If you're a regular reader of my column, Solutions by Design, you know I advocate strong table normalization, which is essential to maintaining data integrity when you're adding and managing data. The benefits of normalization are worth the extra overhead of the multitable joins required for retrieving data from the tables. But you can cut the overhead of joining four or five tables to extract data from a well-normalized database or of aggregating large amounts of data in a summary report. To accomplish this, you can create a view, and instead of having the SQL Server engine dynamically reconstruct the view each time it's used in a query, you can "materialize" it. Instead of a virtual table, your view becomes a physical table.
A view is really a derived virtual table. Its purpose is to enhance visibility of the data by presenting it in a more informational context and to control access to the data by hiding critical or sensitive data from users who don't need to see it. SQL Server "materializes"or makes reala view by creating a unique clustered index on it, so these views are sometimes called indexed views. Like a clustered index, a materialized view has user data associated with it.
Materialized views aren't new to the database world. Although indexed views were new in SQL Server 2000, they've existed for years in other database management systems (DBMSs) such as Oracle and IBM's DB2. Large-platform DBMS vendors developed the materialized view to enhance their data-warehousing systems. A data warehouse is technically data that you've integrated from several different sources into one large data store. This data store might house detail data from operational applications, summarized data from decision-support systems, or a combination of both. In a data warehouse, the data is usually summarized along several dimensions (e.g., time, location, product), then stored for aggregate query processing by OLAP and decision-support applications. Let's look at how useful a materialized view can be in environments other than the data warehouse.
Why Use Materialized Views?
If materialized views were intended to enhance performance in a data warehouse, why should you consider them for a transactional database? After all, when you design a transactional database, you're aiming for speed and agility, like a sports-car designer, so you want to minimize the number of structures, such as indexes, that degrade performance. In a transactional database, you use indexes to enhance performance during data retrievals, but standard practice says to index sparingly and only when necessary because of the overhead involved in inserting or updating data. In addition, because a materialized view is a copy of one or more tables, your data-storage requirements could easily double.
Materialized views provide very quick access to data. The performance increase generally compensates for the extra disk storage and processor overhead involved in keeping table data synchronized with data in the materialized view. I can't tell you definitely whether the performance enhancements a materialized view brings are worth the extra space and processing cycles to keep your data synchronized; so much depends on your situation. But I can suggest the following typical scenarios in which you might want to test whether a materialized view would work for you. Note that because indexed views add a lot of overhead during data modification, the best candidates are tables that are fairly static or that receive additions and updates during off-peak hours.
Summarizing data in an operational database. You know that aggregating data in an operational database requires a lot of system resources. The process involves more than calculating the sums, counts, or averages; SQL Server's lock manager has to balance requests for the data and might even have to delay update requests while the aggregation is in progress. In addition, each time an aggregating query executes, if the records aren't already in the data cache, SQL Server has to physically scan the data from the hard disk into memory; the more data you're aggregating, the more physical I/O you need. SQL Server can use many techniques to minimize the delay, such as releasing data pages immediately after processing the records, but the bottom line is that aggregating queries can cause slowdowns in performance, especially if your system is already carrying a heavy load.
Listing 1 shows an example of a materialized summary view of the Northwind database's Orders table, which contains data about where every order was shipped. A summary report of orders by postal codewhich postal codes Northwind is delivering towould be useful for analyzing sales trends. Figure 1 shows a partial result set from running Listing 1. From this report, Northwind Traders can easily tell which postal codes are receiving the most products.
Prev. page  
[1]
2
3
next page