Applying principles of an ancient art can help you achieve balance in your database design
The premise behind the ancient practice of feng shui is that the positioning of objects in your home or office can either impede or encourage the flow of energy in and around you. Flowing energy, called chi, can improve your creativity, your energy level, and your overall outlook on life. Impediments to the flow of energy or objects that send that energy in the wrong direction can have negative effects on the world around you. Similarly, after 20 years of programming, building, and designing databases, I've noticed that data has its own energy. When a database design is appropriate for the demands placed on it, data flows easily. When a database design is missing important components such as indexes, the action of one user can disrupt the flow of data for many users. Let's examine some of the design choices that can help you improve the flow of data in your databases.
Find the Proper Orientation
The orientation of the user to the data is a fundamental concern in database design. Online transaction processing (OLTP) applications typically perform many insertions, updates, and deletions, whereas analysis applications (including OLAP applications) perform mostly SELECT queries. In other words, OLTP applications change the data; analysis applications read the data. These two operations have always been fundamentally opposed to each other.
The main way that opposition manifests itself in SQL Server is in lock contention. OLTP applications must use exclusive locks to maintain data integrity; however, those locks disrupt the flow of data to analysis programs. Many of the improvements to SQL Server's engine over the past 5 years have focused on decreasing the time that exclusive locks stay in place, the speed with which they're imposed and removed, and the granularity of what they lock. The idea behind changing from page-level locking to row-level locking is that you can reduce overall contention for data by spreading the locks across more discrete units. Whether this idea is valid depends not only on the database's size relative to the number of concurrent users but also on how applications manipulate the data.
As an example, let's look at what might happen if you used the Northwind database to support Northwind Traders' e-commerce Web site. As you might expect, most of the records will be in the Order Details table. Given the relationship between Orders and Order Details, you'll have several rows in Order Details for each row in Orders. The rows in Order Details are 22 bytes each, so if you use page-level locking, the system could support as many as six (2155 rows ÷ 368 rows per page) users simultaneously making changes that require an exclusive lock. Similarly, while many hundreds of users might be able to read rows out of the table, a single shared lock could prevent just as many people from being able to change their orders. As the number of rows in Order Details increases, the potential number of concurrent users increases proportionately because the probability that two users will want the same page decreases. In this way, growth in a table can actually improve data flow.
Row-level locking is a significant improvement over page-level locking because locking affects smaller sections of the table (i.e., one lock affects only one row instead of all 368 on a page). However, the level of contention will still increase as the number of users grows. Row-level locking just increases the number of concurrent users because the locks are more granular and more people can hold locks on individual rows with less chance of interference. However, as a table's row size grows toward the page size, row-level locking loses some of its benefits because the locking granularity approaches that of page-level locking. Therefore, in systems with row-level locking, heavily used tables should have the most possible rows per page.
Additionally, the server itself can be a potential impediment to data flow. As the number of users increases, so does the demand for space in the data cache. The load on the hard disk subsystem also increases as SQL Server searches for specific records. If Northwind Traders doesn't regularly archive rows out of the Order Details table, the time to search for rows and load them into memory will increase linearly until the hard disk subsystem reaches capacity. When the disk subsystem exceeds capacity, the search time tends to increase logarithmically because of delays in transferring data from the disk to memory. In this case, the constraints of the server's hardware can limit even the best database design.
The fact that the Products table contains a column for storing the current inventory levels is a different sort of problem. The concern is that Products will be a part of nearly every operation the database supports. The SELECT query issued by someone browsing for items to buy will prevent an update of the inventory levels by someone placing an order. Orders being finalized also cause a problem for everyone on the system, and someone running a long sales report could prevent a large number of customers from being able to complete their orders.
Realtime inventory management is an age-old problem with many different solutions. Consider what happens when a customer on the Web site places 100 bottles of rare wine in a shopping cart. If the application reserves those bottles by decreasing the amount in the Products table, other customers who want that wine might find that the inventory on hand is less than they need and go somewhere else to shop. If the application adds the item to the cart but doesn't reserve the inventory, customers might buy more bottles than Northwind Traders has in stock. This business decision is quite difficult.
Prev. page  
[1]
2
3
next page